Conclusion
This paper has presented a correlation-based approach
to feature selection for machine learning and compared
it with the wrapper, a well-known feature selection
technique that uses the target learning algorithm to
guide its search for good features. The experiments
have shown that, in many cases, CFS gives results that
are comparable to or better than those of the wrapper. Because
CFS makes use of all the training data at once, it can
give better results than the wrapper on small datasets.
CFS is much faster than the wrapper (by more than an
order of magnitude), which allows it to be applied to
larger datasets than the wrapper.
1 The number of features selected by the wrapper using
C4.5 is very similar. Note that because CFS is a filter, the
feature sets it selects are the same regardless of the final
learning algorithm.
Many applications of machine learning involve predicting
a "class" that takes on a continuous numeric
value. Future work will aim at extending CFS to handle
problems where the class is numeric.