Classification Problems
Skin
This is a dataset produced by the author and colleagues; the problem is to predict whether
a particular pixel in a real-world image is human skin or not. The data was generated
by asking volunteers to select pixels corresponding to skin and not skin, from a variety of
real-world images, and recording the image information at those pixels. The dataset contains
4500 examples (each corresponding to one pixel in an image), with 6 continuous-valued inputs
and 1 binary output. For a given pixel, the first three input variables are the red, green and
blue values at that point, rescaled to [0, 1]. The last three input variables were generated by
calculating the sample variance over 3x3, 5x5 and 7x7 windows around the pixel. Networks
were trained for 500 iterations. We divided the dataset into 5 equal-sized portions; we
trained on one fifth of the data (900 examples), then tested on the remaining four fifths
(3600 examples). This was repeated 5 times so each portion was used in turn as training
data. Each time, the ensemble was evaluated over 40 trials with different random initial
weights, giving a total of 200 trials for each experiment.
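
The evaluation protocol above inverts the usual cross-validation split: one fifth of the data is used for training and the other four fifths for testing. A minimal sketch of that splitting and trial counting follows. The function name inverted_kfold_splits, the seed, and the shuffling step are illustrative assumptions, not details given in the text, and the training step is left as a placeholder.

```python
import random

def inverted_kfold_splits(n_examples, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs where ONE fold trains and the rest test.

    Hypothetical helper: shuffling before splitting is an assumption; the
    text only says the dataset was divided into equal-sized portions.
    """
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    fold_size = n_examples // n_folds
    folds = [indices[i * fold_size:(i + 1) * fold_size] for i in range(n_folds)]
    for k in range(n_folds):
        train_idx = folds[k]  # a single fold is the training set
        test_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, test_idx

N_EXAMPLES = 4500       # from the text
N_FOLDS = 5             # from the text
TRIALS_PER_FOLD = 40    # from the text

splits = list(inverted_kfold_splits(N_EXAMPLES, N_FOLDS))
total_trials = 0
for train_idx, test_idx in splits:
    for trial in range(TRIALS_PER_FOLD):
        # Placeholder: train an ensemble from fresh random weights on
        # train_idx and evaluate it on test_idx.
        total_trials += 1

print(len(splits[0][0]), len(splits[0][1]), total_trials)
```

Each split pairs 900 training indices with 3600 test indices, and 5 portions times 40 trials accounts for the 200 trials per experiment.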