Question

A dataset contains 800 patients tested for Chronic Kidney Disease (CKD): 400 tested positive and 400 tested negative (400 +ve; 400 -ve).

Now, suppose you have two possible splits based on a decision tree classifier.

Possible split #1:

(300 +ve; 100 -ve) and (100 +ve; 300 -ve)

Possible split #2:

(200 +ve; 400 -ve) and (200 +ve; 0 -ve)

1. Calculate the misclassification index (misclassification error) for split #1 and split #2.

2. Calculate the Gini index for split #1 and split #2.

3. Which of the two splits is better, and why?
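
The arithmetic behind parts 1 to 3 can be sketched in a few lines of Python. This is a minimal sketch, not the site's solution. It assumes the "index" of a split is the size-weighted average impurity of its two child nodes, with node impurity taken as 1 - max(p) for misclassification and 1 - sum(p^2) for Gini. The function names (`misclassification`, `gini`, `weighted_impurity`) are hypothetical helpers introduced here for illustration.

```python
def misclassification(counts):
    """Misclassification error of one node: 1 - (share of the majority class)."""
    total = sum(counts)
    return 1 - max(counts) / total


def gini(counts):
    """Gini impurity of one node: 1 - sum of squared class proportions."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)


def weighted_impurity(children, impurity):
    """Assumed definition of a split's index: child impurities weighted by node size."""
    n = sum(sum(child) for child in children)
    return sum(sum(child) / n * impurity(child) for child in children)


# Each child node is (+ve count, -ve count), taken from the question.
split_1 = [(300, 100), (100, 300)]
split_2 = [(200, 400), (200, 0)]

for name, split in [("split #1", split_1), ("split #2", split_2)]:
    mis = weighted_impurity(split, misclassification)
    gin = weighted_impurity(split, gini)
    print(f"{name}: misclassification = {mis:.4f}, Gini = {gin:.4f}")
```

Under these assumptions, both splits have a weighted misclassification error of 0.25, so that measure cannot separate them. The weighted Gini index is 0.375 for split #1 and about 0.333 for split #2, which favours split #2: it produces one pure node (200 +ve; 0 -ve), and the Gini index rewards that purity where misclassification error does not.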
