In experimental and applied machine learning work, it is
difficult to overstate the influence of top-down heuristics
for building a decision tree from labeled sample data. These
algorithms grow a tree from the root to the leaves by repeatedly replacing an existing leaf with an internal node,
thereby ``splitting'' the data reaching that leaf between two new
leaves and reducing the empirical error on the given sample.
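
To make the greedy loop concrete, the following is a minimal Python sketch of this process for binary features, under the simplifying assumption that each split is chosen to maximize the reduction in raw empirical error; the names (grow_tree, best_split, majority_error) are illustrative only, and production systems such as C4.5 and CART instead split on smoothed criteria like information gain or the Gini index.

from collections import namedtuple

# The sample reaching a leaf: 0/1 feature vectors X and 0/1 labels y.
Leaf = namedtuple("Leaf", ["X", "y"])

def majority_error(y):
    # Errors made by predicting the majority label on this label list.
    ones = sum(y)
    return min(ones, len(y) - ones)

def best_split(leaf, n_features):
    # Best single-feature split of this leaf: (error after split, feature),
    # or (current error, None) if no split strictly reduces the error.
    best_err, best_f = majority_error(leaf.y), None
    for f in range(n_features):
        left  = [lab for x, lab in zip(leaf.X, leaf.y) if x[f] == 0]
        right = [lab for x, lab in zip(leaf.X, leaf.y) if x[f] == 1]
        err = majority_error(left) + majority_error(right)
        if err < best_err:
            best_err, best_f = err, f
    return best_err, best_f

def grow_tree(X, y, n_splits):
    # Greedy top-down growth: repeatedly replace the leaf whose best
    # split gives the largest drop in empirical error.  The tree is kept
    # here as a flat list of leaves; a full implementation would also
    # record the internal split nodes.
    leaves = [Leaf(X, y)]
    for _ in range(n_splits):
        best_gain, best_i, best_f = 0, None, None
        for i, leaf in enumerate(leaves):
            err, f = best_split(leaf, len(X[0]))
            gain = majority_error(leaf.y) - err
            if f is not None and gain > best_gain:
                best_gain, best_i, best_f = gain, i, f
        if best_i is None:          # no split reduces the error further
            break
        leaf, f = leaves.pop(best_i), best_f
        zeros = [x[f] == 0 for x in leaf.X]
        leaves.append(Leaf([x for x, z in zip(leaf.X, zeros) if z],
                           [lab for lab, z in zip(leaf.y, zeros) if z]))
        leaves.append(Leaf([x for x, z in zip(leaf.X, zeros) if not z],
                           [lab for lab, z in zip(leaf.y, zeros) if not z]))
    return leaves

# Example: the label equals the first feature, so a single split on
# feature 0 already drives the empirical error to zero.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 1, 1]
print([leaf.y for leaf in grow_tree(X, y, n_splits=3)])  # [[0, 0], [1, 1]]

Note that the loop is purely greedy: each split is chosen myopically, with no lookahead, which is precisely what makes these heuristics so efficient in practice.
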
The tremendous popularity of such programs (which include
the C4.5 and CART software packages [3, 16]) is due to
their efficiency and simplicity, the advantages of using
decision trees (such as their potential interpretability to humans),
and, of course, their success in generating trees with good
generalization ability (that is, performance on new data).
Dozens of papers describing experiments and applications
involving top-down decision tree learning algorithms appear
in the machine learning literature each year.