Although the literature contains examples of artificial
data which simple models cannot separate
(e.g., intertwined spirals or checkerboard patterns),
such data sets are exceedingly rare in real life. Conversely,
in the two-class case, although few real data
sets have exactly linear decision surfaces, it is common
to find that the class-conditional distributions of the
predictor variables have different centroids, so that
a simple linear surface can do surprisingly well as an
estimate of the true decision surface. This may not
be the same as “can do surprisingly well in classifying
the points,” since in many problems the Bayes
error rate is high, meaning that no decision surface
can separate the class distributions in such problems
very well. However, it means that the dramatic steps