Exercise 17.5.6. Which of the two classifiers used above produces the better
AUC for the two Reuters datasets? Compare this to the outcome for percent
correct. What do the different outcomes mean?
The ROC curves discussed in Section 5.7 (page 172) can be generated by
right-clicking on an entry in the result list and selecting Visualize threshold
curve. This gives a plot with FP Rate on the x-axis and TP Rate on the y-axis.
Depending on the classifier used, this plot can be quite smooth or it can be
fairly irregular.
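Weka draws the threshold curve for you, but the underlying computation is simple: sweep a decision threshold over the predicted class-1 probabilities and record, at each threshold, the fraction of positive instances caught (TP Rate) and the fraction of negative instances wrongly flagged (FP Rate). The sketch below illustrates this with made-up labels and scores; the function name and data are purely illustrative, not part of Weka.

```python
# A minimal sketch of tracing a threshold (ROC) curve for a binary
# problem, assuming each instance has a predicted class-1 probability.
# Labels and scores below are invented for illustration.

def roc_points(labels, scores):
    """Return (fp_rate, tp_rate) pairs, one per candidate threshold."""
    pos = sum(labels)            # number of positive instances
    neg = len(labels) - pos      # number of negative instances
    points = [(0.0, 0.0)]        # strictest threshold: predict nothing positive
    # Sweep the threshold downward over every distinct score.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        points.append((fp / neg, tp / pos))
    return points

labels = [1, 1, 0, 1, 0, 0]               # made-up ground truth
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # made-up class-1 probabilities
print(roc_points(labels, scores))
```

Each distinct predicted probability yields one point, which is why a classifier that outputs many distinct probabilities (such as a probabilistic learner) gives a smoother-looking curve than one that outputs only a few.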
Exercise 17.5.7. For the Reuters dataset that produced the most extreme
difference in Exercise 17.5.6, look at the ROC curves for class 1. Make a
very rough estimate of the area under each curve, and explain it in words.
Exercise 17.5.8. What does the ideal ROC curve corresponding to perfect
performance look like?
Other types of threshold curves can be plotted, such as a precision–recall curve
with Recall on the x-axis and Precision on the y-axis.
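The precision–recall points come from the same threshold sweep: at each threshold, recall is the fraction of positive instances retrieved, and precision is the fraction of retrieved instances that are actually positive. A minimal sketch, again with invented labels and scores (the function name is illustrative, not a Weka API):

```python
# A minimal sketch of tracing a precision-recall curve for a binary
# problem; labels and scores are invented for illustration.

def pr_points(labels, scores):
    """Return (recall, precision) pairs, one per candidate threshold."""
    pos = sum(labels)            # number of positive instances
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        points.append((tp / pos, tp / (tp + fp)))
    return points

labels = [1, 1, 0, 1, 0, 0]               # made-up ground truth
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # made-up class-1 probabilities
print(pr_points(labels, scores))
```

Note that, unlike the ROC curve, precision need not change monotonically as the threshold is lowered: each false positive admitted pulls precision down, while each true positive pushes it back up.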
Exercise 17.5.9. Change the axes to obtain a precision–recall curve. What is
the shape of the ideal precision–recall curve, corresponding to perfect
performance?