Naive-Bayes induction algorithms were previously shown to be surprisingly accurate on many classification tasks even when the conditional independence assumption on which they are based is violated. However, most studies were done on small databases. We show that on some larger databases, the accuracy of Naive-Bayes does not scale up as well as that of decision trees.
We then propose a new algorithm, NBTree, which induces a hybrid of decision-tree classifiers and Naive-Bayes classifiers: the decision-tree nodes contain univariate splits, as in regular decision trees, but the leaves contain Naive-Bayesian classifiers. The approach retains the interpretability of Naive-Bayes and decision trees, while resulting in classifiers that frequently outperform both constituents, especially on the larger databases tested.
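
To make the hybrid concrete, the following is a minimal Python sketch of the NBTree idea, not the paper's algorithm: it grows a shallow decision tree and then trains a Naive-Bayes model on the training examples that reach each leaf. The scikit-learn classes DecisionTreeClassifier and GaussianNB, the max_depth setting, and the SimpleNBTree name are illustrative assumptions; the actual NBTree algorithm chooses its splits by its own utility criterion rather than the standard tree-growing procedure used here.

    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from sklearn.tree import DecisionTreeClassifier

    class SimpleNBTree:
        """Shallow decision tree whose leaves hold Naive-Bayes classifiers."""

        def __init__(self, max_depth=3):
            self.tree = DecisionTreeClassifier(max_depth=max_depth)
            self.leaf_models = {}

        def fit(self, X, y):
            y = np.asarray(y)
            self.tree.fit(X, y)
            leaves = self.tree.apply(X)  # id of the leaf each training sample reaches
            for leaf in np.unique(leaves):
                mask = leaves == leaf
                # a pure leaf holds one class; GaussianNB still fits and predicts it
                self.leaf_models[leaf] = GaussianNB().fit(X[mask], y[mask])
            self._y_dtype = y.dtype
            return self

        def predict(self, X):
            leaves = self.tree.apply(X)
            preds = np.empty(len(X), dtype=self._y_dtype)
            for leaf in np.unique(leaves):
                mask = leaves == leaf
                # route each sample to the Naive-Bayes model of its leaf
                preds[mask] = self.leaf_models[leaf].predict(X[mask])
            return preds

The design intuition matches the abstract: the tree's univariate splits capture the strongest feature interactions near the root, and within each leaf's region the remaining features are closer to conditionally independent, so a Naive-Bayes leaf can outperform a plain majority-vote leaf.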