25.5.2.2 Supervised Detection
Supervised detection algorithms have focussed on the selection of attributes of attack profiles from which to build a feature vector for input to a classifier. Generally,
such features have been selected by observing generic attributes that are common across attack profiles of a number of different attack strategies, as well as model-specific
attributes that are common across profiles generated for a
specific type of attack.
In [5], profile attributes based on those proposed in [7], and others along similar
lines, were developed into features for inclusion in a feature vector input to a supervised classifier. Moreover, other features based on the statistics of the filler and target
items in the user profile, rather than the entire profile, were proposed. For example,
the filler mean variance feature is defined as the variance of the ratings in the filler
partition of the profile and is used to detect average attacks; the filler mean target
difference feature, defined as the difference between the mean rating of the target items
and the mean rating of the filler items, is used to detect bandwagon attacks.
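The two features described above can be sketched in a few lines. This is a minimal illustration that follows the wording of the definitions literally, not the implementation used in [5]. It assumes that a profile is a mapping from item identifiers to ratings and that the set of target items is already known or hypothesized; the names `profile`, `target_items`, `filler_mean_variance`, and `filler_mean_target_difference` are hypothetical.

```python
from statistics import mean, pvariance


def split_profile(profile, target_items):
    """Split a profile's ratings into (target ratings, filler ratings).

    Assumption: every rated item that is not a target item belongs to
    the filler partition.
    """
    target = [r for item, r in profile.items() if item in target_items]
    filler = [r for item, r in profile.items() if item not in target_items]
    return target, filler


def filler_mean_variance(profile, target_items):
    """Variance of the ratings in the filler partition of the profile."""
    _, filler = split_profile(profile, target_items)
    # An empty filler partition has no spread; return 0.0 by convention.
    return float(pvariance(filler)) if filler else 0.0


def filler_mean_target_difference(profile, target_items):
    """Mean target rating minus mean filler rating."""
    target, filler = split_profile(profile, target_items)
    if not target or not filler:
        return 0.0  # undefined without both partitions; 0.0 by convention
    return float(mean(target) - mean(filler))


# Toy profile: one pushed target item and four filler items.
profile = {"i1": 5, "i2": 3, "i3": 3, "i4": 4, "i5": 2}
target_items = {"i1"}
print(filler_mean_variance(profile, target_items))
print(filler_mean_target_difference(profile, target_items))
```

Both values would then be placed in the profile's feature vector alongside the generic attributes.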
The authors looked at three supervised classifiers: kNN, C4.5, and SVM. The
kNN classifier uses the detection attributes of a profile to find its k = 9 nearest
neighbors in the training set, with Pearson correlation as the similarity measure, and
assigns the class from those neighbors. The C4.5 and SVM classifiers are built in a similar manner such that they
classify profiles based on the detection attributes only. The results for the detection
of a 1% average attack over various filler sizes are reproduced in Figure 25.8. SVM
and C4.5 have near-perfect performance in identifying attack profiles correctly, but
they also misclassify more authentic profiles than kNN. SVM has
the best combination of recall and specificity across the entire range of filler sizes
for a 1% attack.
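To make the kNN step and the two evaluation measures concrete, the following is a minimal sketch rather than the classifier used by the authors. It assumes that each profile has already been reduced to a vector of detection attributes, that the class is decided by an unweighted majority vote among the k most similar training profiles (the text does not specify the voting rule), and that attack profiles are the positive class. All function names and the label strings are hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / (sx * sy)


def knn_classify(query, training, k=9):
    """Label `query` by majority vote of its k most similar neighbors.

    `training` is a list of (feature_vector, label) pairs. Majority
    voting is an assumption made for this sketch.
    """
    ranked = sorted(training, key=lambda ex: pearson(query, ex[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def recall_and_specificity(y_true, y_pred, positive="attack"):
    """Recall: fraction of attack profiles correctly detected.
    Specificity: fraction of authentic profiles correctly passed."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return recall, specificity


# Toy training set of detection-attribute vectors (values are made up).
training = [
    ([0.10, 2.0, 0.90], "attack"),
    ([0.20, 1.8, 0.80], "attack"),
    ([0.15, 2.1, 0.95], "attack"),
    ([1.20, 0.1, 0.30], "authentic"),
    ([1.00, 0.0, 0.20], "authentic"),
    ([1.10, 0.2, 0.25], "authentic"),
]
print(knn_classify([0.12, 1.9, 0.85], training, k=3))
```

In these terms, the comparison above says that SVM and C4.5 achieve recall close to 1 but lower specificity than kNN, since misclassified authentic profiles count as false positives.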