Classifying Actual Documents
A standard collection of newswire articles is widely used for evaluating document
classifiers. ReutersCorn-train.arff and ReutersGrain-train.arff are training sets
derived from this collection; ReutersCorn-test.arff and ReutersGrain-test.arff are
corresponding test sets. The actual documents in the corn and grain data are the
same; only the labels differ. In the first dataset, articles concerning corn-related
issues have a class value of 1 and the others have 0; the aim is to build a classifier
that identifies “corny” articles. In the second, the labeling is performed with respect
to grain-related issues; the aim is to identify “grainy” articles.
Exercise 17.5.4. Build classifiers for the two training sets by applying
FilteredClassifier with StringToWordVector using (1) J48 and (2)
NaiveBayesMultinomial, evaluating them on the corresponding test set in
each case. What percentage of correct classifications is obtained in the four
scenarios? Based on the results, which classifier would you choose?
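As a rough sketch of the mechanics, the following Java program runs one of the four scenarios through the Weka API; the Explorer's Classify panel gives the same result interactively. The file names follow the datasets above, and the code assumes the class attribute is the last attribute in the ARFF files.

```java
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayesMultinomial;
import weka.classifiers.meta.FilteredClassifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.unsupervised.attribute.StringToWordVector;

public class ReutersClassification {
  public static void main(String[] args) throws Exception {
    // One of the four scenarios: the corn data with J48.
    // Swap the file names and the base classifier for the other three.
    Instances train = DataSource.read("ReutersCorn-train.arff");
    Instances test  = DataSource.read("ReutersCorn-test.arff");
    train.setClassIndex(train.numAttributes() - 1);   // assumes the class is the last attribute
    test.setClassIndex(test.numAttributes() - 1);

    FilteredClassifier fc = new FilteredClassifier();
    fc.setFilter(new StringToWordVector());           // turn the text attribute into word features
    fc.setClassifier(new J48());                      // or: new NaiveBayesMultinomial()
    fc.buildClassifier(train);

    Evaluation eval = new Evaluation(train);
    eval.evaluateModel(fc, test);
    System.out.printf("Correct: %.2f%%%n", eval.pctCorrect());
    System.out.println(eval.toClassDetailsString());  // TP rate, FP rate, precision, recall, F-measure, ROC area
  }
}
```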
Other evaluation metrics besides the percentage of correct classifications are used for document classification. They are tabulated under Detailed Accuracy By Class in the Classifier Output area and are derived from the numbers of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). The statistics output by Weka are computed as specified in Table 5.7; the F-measure is discussed in Section 5.7 (page 175).
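For reference, these per-class statistics are usually defined from the TP, FP, TN, and FN counts as follows; these are the standard textbook definitions and should agree with Table 5.7.

```latex
\mathrm{TP\ rate} = \mathrm{recall} = \frac{TP}{TP+FN}
\qquad
\mathrm{FP\ rate} = \frac{FP}{FP+TN}
\qquad
\mathrm{precision} = \frac{TP}{TP+FP}
\qquad
F\text{-measure} = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}
                        {\mathrm{precision} + \mathrm{recall}}
```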
Exercise 17.5.5. Based on the formulas in Table 5.7, what are the best possible
values for each of the output statistics? Describe when these values are
attained.
The Classifier Output also gives the ROC area (also known as AUC), which, as
explained in Section 5.7 (page 177), is the probability that a randomly chosen positive
instance in the test data is ranked above a randomly chosen negative instance, based
on the ranking produced by the classifier. The best outcome is that all positive
examples are ranked above all negative examples, in which case the AUC is 1. In the
worst case it is 0. In the case where the ranking is essentially random, the AUC is 0.5,
and if it is significantly less than this the classifier has performed anti-learning!
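To make the probabilistic interpretation concrete, here is a small self-contained sketch (illustrative only; the scores, labels, and method names are invented for this example, not taken from Weka) that estimates the AUC from classifier scores by counting, over all positive/negative pairs, how often the positive instance is ranked higher, with ties counted as one half.

```java
public class AucSketch {
  // AUC as the fraction of positive/negative pairs in which the positive
  // instance receives the higher score; ties contribute 0.5.
  static double auc(double[] scores, boolean[] positive) {
    double pairs = 0, wins = 0;
    for (int i = 0; i < scores.length; i++) {
      if (!positive[i]) continue;                  // i ranges over positive instances
      for (int j = 0; j < scores.length; j++) {
        if (positive[j]) continue;                 // j ranges over negative instances
        pairs++;
        if (scores[i] > scores[j]) wins += 1.0;
        else if (scores[i] == scores[j]) wins += 0.5;
      }
    }
    return pairs == 0 ? 0.5 : wins / pairs;        // 0.5 if one class is absent
  }

  public static void main(String[] args) {
    // Toy example: all positives ranked above all negatives gives AUC = 1.
    double[] scores = {0.9, 0.8, 0.3, 0.1};
    boolean[] positive = {true, true, false, false};
    System.out.println(auc(scores, positive));     // prints 1.0
  }
}
```

In Weka itself the value is reported per class in the ROC Area column, and can be obtained programmatically from the Evaluation object via its areaUnderROC(classIndex) method.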