Crossvalidation randomly splits the dataset into a fixed number of equal-sized parts, or
folds. All folds but one are used for training, and the remaining fold is used for testing each classifier.
This procedure is repeated so that each fold serves as the test fold exactly once. The average
accuracy over all test folds is the crossvalidation estimate of the classifier’s accuracy.
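The procedure can be sketched in a few lines of Python. This is a minimal illustration, not tied to any particular library: `train_and_test` is a hypothetical interface standing in for whatever classifier is being evaluated, and the toy majority-class classifier below exists only to make the example runnable.

```python
import random

def crossvalidation_accuracy(data, labels, train_and_test, k=5, seed=0):
    """Estimate a classifier's accuracy by k-fold crossvalidation.

    `train_and_test(train_X, train_y, test_X)` is any function that fits
    a classifier on the training split and returns predictions for
    `test_X` (a hypothetical interface, not a specific library's API).
    """
    # Randomly split the dataset into k roughly equal-sized folds.
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]

    accuracies = []
    for test_fold in folds:
        # All folds but this one form the training set.
        test_set = set(test_fold)
        train_idx = [i for i in indices if i not in test_set]
        preds = train_and_test([data[i] for i in train_idx],
                               [labels[i] for i in train_idx],
                               [data[i] for i in test_fold])
        correct = sum(p == labels[i] for p, i in zip(preds, test_fold))
        accuracies.append(correct / len(test_fold))

    # The estimate is the average accuracy over all test folds.
    return sum(accuracies) / k

# Toy stand-in classifier: always predicts the majority training label.
def majority_classifier(train_X, train_y, test_X):
    majority = max(set(train_y), key=train_y.count)
    return [majority] * len(test_X)

data = list(range(20))
labels = [0] * 15 + [1] * 5
print(crossvalidation_accuracy(data, labels, majority_classifier, k=5))
```

Because each example appears in exactly one test fold, every data point contributes to the estimate exactly once, regardless of how the random shuffle assigns it.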
2 In rank comparisons (see e.g. Table 3.1) we have found that selection by crossvalidation
is usually the worst ensemble learning scheme, even with just four classifiers.