Levels Taught:
Elementary, Middle School, High School, College, University, PhD
| Teaching Since: | Apr 2017 |
| Last Sign In: | 103 weeks, 3 days ago |
| Questions Answered: | 4870 |
| Tutorials Posted: | 4863 |
MBA IT, Master in Science and Technology
Devry
Jul-1996 - Jul-2000
Professor
Devry University
Mar-2010 - Oct-2016
INSY 5339
HW #5: Predict Treatment Outcome

Use the same genes-leukemia.csv dataset used in assignment #2.
As a predictor use field TREATMENT_RESPONSE, which has values Success, Failure or "?" (missing).

Step 1. Examine the records where TREATMENT_RESPONSE is non-missing.
Q1: How many such records are there?
Q2: Can you describe these records using other sample fields (e.g. Year from XXXX to YYYY, or Gender = X, etc.)?
Q3: Why is it not correct to build predictive models for TREATMENT_RESPONSE using records where it is missing?

Step 2. Select only the records with non-missing TREATMENT_RESPONSE. Keep SNUM (sample number) but remove sample fields that are all the same or missing (i.e., they have the same value for 10 or more samples). Call the reduced dataset genesreduced.csv
Q4: How many sample fields should you keep?

Step 3. Build a J48 model using leave-one-out cross-validation (which is the same as n-fold cross-validation where n is the number of instances in the data set).
Q5: What tree do you get? What is the incorrectly classified error rate?
Q6: What are the important variables and their relative importance, according to J48?
Q7: Remove the top predictor and re-run J48 using leave-one-out cross-validation. What do you get?
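
The filtering in Steps 1 and 2 can be sketched in plain Python. This is only an illustration under stated assumptions: the rows below are made-up toy data, not the genes-leukemia.csv records; the field names `Year` and `Source` and the helper names are hypothetical; and the rule "same value for 10 or more samples" is read here as "the most common value of the field occurs in at least `max_same` rows".

```python
from collections import Counter

MISSING = "?"  # marker the assignment uses for a missing value


def keep_known_response(rows, target="TREATMENT_RESPONSE"):
    """Step 1: keep only rows whose target value is not missing."""
    return [r for r in rows if r[target] != MISSING]


def drop_uninformative_fields(rows, fields, always_keep=("SNUM",), max_same=10):
    """Step 2 (one possible reading): drop a field when its most common
    value, including the missing marker, occurs in max_same or more rows."""
    kept = []
    for f in fields:
        if f in always_keep:
            kept.append(f)
            continue
        top_count = Counter(r[f] for r in rows).most_common(1)[0][1]
        if top_count < max_same:
            kept.append(f)
    return kept


# Toy rows (hypothetical values, NOT taken from the real dataset).
rows = [
    {"SNUM": "1", "Year": "1990", "Source": "A", "TREATMENT_RESPONSE": "Success"},
    {"SNUM": "2", "Year": "1991", "Source": "A", "TREATMENT_RESPONSE": "Failure"},
    {"SNUM": "3", "Year": "1992", "Source": "A", "TREATMENT_RESPONSE": "?"},
    {"SNUM": "4", "Year": "1993", "Source": "A", "TREATMENT_RESPONSE": "Success"},
    {"SNUM": "5", "Year": "1990", "Source": "?", "TREATMENT_RESPONSE": "?"},
]

known = keep_known_response(rows)
# The toy table is tiny, so the threshold is lowered from 10 to 3 for the demo.
kept = drop_uninformative_fields(known, ["SNUM", "Year", "Source"], max_same=3)
print(len(known), kept)
```

On the toy rows, three records have a known response, and `Source` is dropped because all three share the same value, while `SNUM` is kept regardless.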
-----------
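
Leave-one-out cross-validation, as described in Step 3, can be sketched as follows. This is a minimal illustration, not the Weka J48 procedure: the learner is a deliberately trivial stand-in that always predicts the majority class, and the feature values and labels are invented for the example. The point is only the evaluation loop, in which each instance serves as the test set exactly once.

```python
from collections import Counter


def majority_learner(train_X, train_y):
    """Stand-in learner (NOT J48): predicts the most frequent training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return lambda x: label


def leave_one_out_error(X, y, learner):
    """n-fold cross-validation with n = len(X): hold out one instance,
    train on the remaining n - 1, and count the misclassified holdouts."""
    errors = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        predict = learner(train_X, train_y)
        if predict(X[i]) != y[i]:
            errors += 1
    return errors / len(X)


# Invented toy data: one numeric feature per instance.
X = [[0.1], [0.4], [0.35], [0.2], [0.8], [0.9]]
y = ["Success", "Success", "Success", "Success", "Failure", "Failure"]

error_rate = leave_one_out_error(X, y, majority_learner)
print(f"incorrectly classified: {error_rate:.3f}")
```

With this stand-in, both held-out Failure instances are misclassified, so the error rate is 2/6. Swapping in a real decision-tree learner only requires passing a different `learner` function with the same shape.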