In this section we will discuss the current formulation as given
in Szedmak and Shawe-Taylor (2005) and then show that the
formulation is essentially a variation of proximal support vector
machines. To discuss the problem, we first reproduce the problem
formulation given in Szedmak and Shawe-Taylor (2005). First of all,
we will see how the multiclass formulation and its interpretation
differ from the classical binary SVM. Firstly, the class labels are
vectors instead of the +1s and -1s of the binary SVM. Thus the class
labels in the binary SVM belong to a one-dimensional subspace, whereas
for the multiclass SVM the class labels belong to a multi-dimensional subspace.
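For concreteness, a minimal illustration (the one-of-$k$ encoding here is our assumption for exposition; the formulation itself allows general vector labels): for a $k$-class problem the labels may be taken as the standard basis vectors,
$$y_i \in \{e_1, e_2, \ldots, e_k\} \subset \mathbb{R}^k,$$
so the labels span a $k$-dimensional label space, rather than the one-dimensional subspace spanned by the binary labels $\pm 1$.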
Secondly, W, which defines the separating hyperplane in the
binary SVM, is a vector; in the multiclass case, W is a matrix. We can
think of the job of W in the two-class SVM as mapping the data/feature
vector into a one-dimensional subspace. In the multiclass SVM, the natural
extension is then to map the data/feature space into the vector label
space, whose defining bases are vectors. In other words, multiclass
learning may be viewed as vector-labeled learning or vector-valued learning.
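To make this mapping explicit, consider a brief sketch (the feature map $\phi$, the dimensions $n$ and $k$, and the argmax decoding rule are our illustrative assumptions, not reproduced from the original): in the binary SVM the decision function maps a feature vector into one dimension,
$$f(x) = \operatorname{sign}\big(w^{\top}\phi(x) + b\big), \qquad w \in \mathbb{R}^{n},$$
whereas in the multiclass SVM the matrix W maps the feature vector into the $k$-dimensional label space,
$$f(x) = W\phi(x) + b, \qquad W \in \mathbb{R}^{k \times n},$$
with a predicted class read off as, e.g., $\hat{c} = \arg\max_{c}\, e_c^{\top}\big(W\phi(x) + b\big)$.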
Now we present the formulation given in Szedmak and Shawe-Taylor
(2005) and then the modification. Assume we have a sample S
of pairs $\{(y_i, x_i) : y_i \in \mathcal{H}_y,\; x_i \in \mathcal{H}_x,\; i = 1, \ldots, m\}$ independently and
identically generated by an unknown multivariate distribution P.
The support vector machine with vector output is realized on this
sample by the following optimization problem: