Even within the context of classifiers defined in
terms of simple linear combinations of the predictor
variables, it has often been observed that the major
gains are made by (for example) weighting the
variables equally, with only small further gains to
be had by careful optimization of the weights. This
phenomenon has been termed the flat maximum effect
[13, 43]: quite large deviations from the optimal
set of weights will often yield predictive
performance not substantially worse than that of the
optimal weights. An informal argument for why this
is often the case runs as follows.
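The effect is easy to reproduce on synthetic data. The sketch below is an illustration only, not taken from the source: the noise level, weight range, and least-squares fit are all assumptions chosen for the demonstration. It compares the classification accuracy of fitted weights against a naive equal weighting of the predictors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary task: d predictors, each individually informative
# about the class, but with unequal "true" importance weights.
n, d = 5000, 10
y = rng.integers(0, 2, size=n)            # class labels in {0, 1}
true_w = rng.uniform(0.5, 2.0, size=d)    # unequal true weights
# Each predictor = class signal scaled by its true weight + noise.
X = np.outer(2 * y - 1, true_w) + rng.normal(scale=3.0, size=(n, d))

def accuracy(w):
    """Accuracy of the linear score X @ w thresholded at zero."""
    return float(np.mean((X @ w > 0) == (y == 1)))

# "Optimal" weights here: least-squares fit of the +/-1 targets.
w_opt, *_ = np.linalg.lstsq(X, 2.0 * y - 1.0, rcond=None)
w_equal = np.ones(d)                      # naive equal weighting

print(f"optimized weights: {accuracy(w_opt):.3f}")
print(f"equal weights:     {accuracy(w_equal):.3f}")
```

On runs like this the two accuracies typically differ by only a point or two, even though the equal weights ignore the variation in the true weights entirely, which is the flat maximum effect in miniature.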