Other common measures of model performance, particularly in Information Retrieval, are Precision and Recall. Precision, defined as P = TP/(TP + FP), is a measure of how many errors we make when classifying samples as being of class A. On the other hand, recall, R = TP/(TP + FN), measures how good we are at not leaving out samples that should have been classified as belonging to the class. Note
that, in most cases, these two measures are misleading when used in isolation. We
could build a classifier of perfect precision by not classifying any sample as being
of class A (therefore obtaining 0 TP but also 0 FP). Conversely, we could build a
classifier of perfect recall by classifying all samples as belonging to class A. As a
matter of fact, there is a measure, called the F1-measure, that combines both Precision and Recall into a single measure:

F1 = 2PR/(P + R) = 2TP/(2TP + FP + FN)
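As an illustration (not part of the original text), the following short Python sketch computes the three measures from raw confusion-matrix counts for one class; the counts used in the example are hypothetical.

def precision_recall_f1(tp, fp, fn):
    """Precision, Recall and F1 for one class, from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return precision, recall, f1

# Hypothetical counts: 40 true positives, 10 false positives, 20 false negatives
p, r, f1 = precision_recall_f1(tp=40, fp=10, fn=20)
print(p, r, f1)  # 0.8, 0.666..., 0.727...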
Sometimes we would like to compare several competing models rather than estimate their performance independently. In order to do so we use a technique developed in the 1950s for analysis of noisy signals: the Receiver Operating Characteristic (ROC) curve. A ROC curve characterizes the relation between positive hits and false alarms. The performance of each classifier is represented as a point on the curve (see Fig. 2.7).
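As a rough sketch of how such a curve can be traced (again, not part of the original text), the Python function below sweeps the decision threshold over a set of classifier scores and collects the resulting (false-positive rate, true-positive rate) points; the labels and scores shown are hypothetical.

def roc_points(y_true, scores):
    """Return the (FPR, TPR) points of an ROC curve, one per threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    # Visit samples in decreasing score order; each prefix corresponds
    # to classifying everything above one threshold as positive.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if y_true[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

# Hypothetical binary labels and classifier scores
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]
for fpr, tpr in roc_points(labels, scores):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")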