25.5.1 Evaluation Metrics
To compare different detection algorithms, we are interested primarily in measures
of classification performance. Taking a ‘positive’ classification to mean the labeling
of a profile as Attack, a confusion matrix of the classified data contains four sets,
two of which – the true positives and true negatives – consist of profiles that were
correctly classified as Attack or Authentic, respectively; and two of which – the false
positives and false negatives – consist of profiles that were incorrectly classified
as Attack or Authentic, respectively. Various measures are used in the literature
to compute performance based on the relative sizes of these sets. Unfortunately,
different researchers have used different measures, which sometimes makes direct
comparison of results difficult.
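As an illustration of the four sets described above, the following minimal sketch counts the cells of a confusion matrix from paired ground-truth and predicted labels, treating Attack as the positive class. The function name `confusion_counts`, the label strings, and the sample data are assumptions made for this example only; they are not taken from any particular detection system.

```python
from collections import Counter

# Label strings are assumptions for this sketch; Attack is the positive class.
ATTACK, AUTHENTIC = "Attack", "Authentic"


def confusion_counts(actual, predicted):
    """Count the four confusion-matrix cells for profile classification."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Each (actual, predicted) pair falls into exactly one of the four cells.
    cells = Counter(zip(actual, predicted))
    return {
        "tp": cells[(ATTACK, ATTACK)],        # attack profiles labeled Attack
        "tn": cells[(AUTHENTIC, AUTHENTIC)],  # authentic profiles labeled Authentic
        "fp": cells[(AUTHENTIC, ATTACK)],     # authentic profiles labeled Attack
        "fn": cells[(ATTACK, AUTHENTIC)],     # attack profiles labeled Authentic
    }


# Hypothetical data: five profiles, two of which are real attacks.
actual = ["Attack", "Attack", "Authentic", "Authentic", "Authentic"]
predicted = ["Attack", "Authentic", "Authentic", "Attack", "Authentic"]
counts = confusion_counts(actual, predicted)
```

The four counts always sum to the number of classified profiles, and the measures used in the literature are different ratios built from them.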
Precision and recall are commonly used performance measures in information
retrieval. In this context, they measure the classifier’s performance in identifying