A special case of model selection arises when the
object of selection is not itself a classifier, for
example when trying to pick a feature selection [7]
algorithm that will maximize a classifier’s performance
on a particular dataset. Refaeilzadeh et al. [10] explore
this issue in detail and explain that there are in fact two
variants of cross-validation in this case: performing
feature selection once before splitting the data into
folds (OUT), or performing feature selection k times inside
the cross-validation loop (IN). The paper explains that
there is potential for bias in both cases: with OUT, the
feature selection algorithm has already seen the test set,
so the accuracy estimate is likely inflated; with IN, on
the other hand, the feature selection algorithm sees less
data than would be available in a real experimental
setting, leading to underestimated accuracy.
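To make the distinction concrete, the sketch below contrasts the two protocols using scikit-learn; the synthetic dataset, the SelectKBest univariate selector, and the logistic-regression classifier are illustrative assumptions, not the algorithms evaluated in [10].

# A minimal sketch contrasting the OUT and IN protocols, assuming
# scikit-learn; the data and algorithm choices are illustrative only.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=200, n_features=50,
                           n_informative=5, random_state=0)

# OUT: the selector is fit once on all the data, so it has already
# seen every future test fold; the accuracy estimate tends to be inflated.
X_out = SelectKBest(f_classif, k=5).fit_transform(X, y)
acc_out = cross_val_score(LogisticRegression(max_iter=1000),
                          X_out, y, cv=10).mean()

# IN: the selector sits inside a pipeline and is refit on the training
# portion of each fold, so it sees only (k-1)/k of the data.
pipe = make_pipeline(SelectKBest(f_classif, k=5),
                     LogisticRegression(max_iter=1000))
acc_in = cross_val_score(pipe, X, y, cv=10).mean()

print(f"OUT accuracy estimate: {acc_out:.3f}")
print(f"IN  accuracy estimate: {acc_in:.3f}")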
Experimental results confirm these hypotheses and
further show that: