where $\mathbf{0}$ denotes the vector whose components are all 0. The real values $q_i$
and $p_i$ denote normalization constraints that can be chosen from
the set of values $\{1, \|y_i\|, \|\phi(x_i)\|, \|y_i\|\,\|\phi(x_i)\|\}$, depending on the particular
task. The bias term $b$ can be set to zero, because Kecman et al. (2005)
showed that polynomial and RBF kernels do not require a bias
term.
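
As a minimal sketch of the four normalization choices listed above, the following Python snippet computes the candidate values for a single training pair. It assumes that the output $y_i$ and the feature vector $\phi(x_i)$ are available as explicit finite-dimensional arrays; the function name `normalization_candidates` and the example vectors are hypothetical and serve only as illustration. When only a kernel $k$ is available, $\|\phi(x_i)\|$ can be obtained as $\sqrt{k(x_i, x_i)}$ instead.

```python
import numpy as np


def normalization_candidates(y, phi_x):
    """Return the four candidate values for q_i (or p_i) for one training pair.

    Assumption: y and phi_x are explicit, finite-dimensional vectors.
    """
    norm_y = float(np.linalg.norm(y))        # ||y_i||
    norm_phi = float(np.linalg.norm(phi_x))  # ||phi(x_i)||
    return {
        "one": 1.0,
        "norm_y": norm_y,
        "norm_phi": norm_phi,
        "norm_y_times_norm_phi": norm_y * norm_phi,
    }


# Hypothetical example pair, chosen so the norms are easy to verify by hand.
y_i = np.array([3.0, 4.0])            # ||y_i|| = 5
phi_x_i = np.array([0.0, 2.0, 0.0])   # ||phi(x_i)|| = 2
candidates = normalization_candidates(y_i, phi_x_i)
```

Which of the four values is appropriate for $q_i$ and $p_i$ depends on the task, as stated above; the sketch only enumerates the options.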
To understand the geometry of the problem better, we first set
$q_i$ and $p_i$ to 1; the magnitude of the error measured by the
slack variables is then the same regardless of the norm of
the feature vectors. Introducing dual variables $\{\alpha_i \mid i = 1, \dots, m\}$ for
the margin constraints and applying the Karush–Kuhn–Tucker conditions,
we can express the linear operator $W$ using the tensor products
of the output and the feature vectors, that is