Up until this point, our discussion has focused solely on how support vector machines
can be used for binary classification tasks. We will now describe two of the
most popular ways to turn a binary classifier, such as a support vector machine,
into a multi-class classifier. These approaches are relatively simple to implement
and have been shown to work effectively.
The first technique is called the one versus all (OVA) approach. Suppose that
we have a K-class classification problem with K ≥ 2. The OVA approach works by training
K classifiers. When training the kth classifier, the kth class is treated as the
positive class and the instances of all the other classes are pooled into a single
negative class. To classify a test instance x, we evaluate all K classifiers on it and
assign x to the (positive) class of the classifier that yields the largest value of
w · x. That is, if w_c is the weight vector of the “class c versus not class c”
classifier, then items are classified according to: