Recall that two factors affect the performance measure:
the training set and the test set. The training set
affects the measurement indirectly through the learning
algorithm, whereas the composition of the test set has a
direct impact on the performance measure. A reasonable
experimental compromise may be to allow for
overlapping training sets while keeping the test sets
independent. K-fold cross-validation does just that: the
K test folds are disjoint, whereas any two training sets
overlap whenever K > 2.
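
To make the overlap structure concrete, the following is a minimal sketch of a K-fold split over example indices, written in plain Python. The helper name `k_fold_indices` and the values n = 10 and K = 5 are illustrative assumptions, not part of the original text; a real experiment would typically shuffle the indices first.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k (train, test) pairs (hypothetical helper)."""
    # Fold sizes differ by at most one when k does not divide n.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        # Training set: every index outside the current test fold.
        train = [i for i in range(n) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


# Illustrative values only: 10 examples, 5 folds.
folds = k_fold_indices(n=10, k=5)
tests = [set(test) for _, test in folds]
trains = [set(train) for train, _ in folds]

# Test folds never share an example; training sets do.
print(tests[0] & tests[1])        # empty set
print(len(trains[0] & trains[1])) # number of shared training examples
```

In this sketch each example lands in exactly one test fold, while any two training sets differ only in the two folds that were held out.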