process is repeated J times so that each fold is used for testing exactly once, thus
generating one prediction for every example.4 One classifier’s output is therefore
a class probability distribution for every example.5 Figure 1.1(b) shows how
such a reasonable class probability distribution may look. The rows correspond
to the examples from the original training set, see Figure 1.1(a).