For continuous attributes, a sufficiently large PET can estimate any class probability
function to arbitrary precision. Consider the simple univariate, two-class problem depicted
in figure 1: each class is distributed normally about a different mean. These overlapping
probability densities define a continuous class-membership probability function over the
domain of the variable (call it x). This may be one of the worst possible problems to which to
apply a PET, because piecewise-uniform representations are obviously a poor inductive bias,
and moreover because the problem is easy for other sorts of density estimators. However,
for this and for any such problem a PET can estimate the probability of class membership
to arbitrary precision. For this problem, each split in the tree partitions the x-axis, and each
leaf is a segment of the x-axis. A PET would estimate the probability by looking at the class
distribution for its segment (which in the figure can be seen by cutting a vertical slice and
looking at the relative heights of the curves of the two classes in the slice). The key is to
note that as the number of leaves increases, the slices become narrower, and the probability
estimates can become more and more precise. In the limit, the tree predicts class probability
perfectly.
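This limiting argument can be sketched numerically. The following is a minimal illustration, not taken from the text: the class means (0 and 2), unit variances, equal priors, the interval, and all function names are assumptions chosen for concreteness. Each "leaf" is an equal-width segment of the x-axis, and its estimate is the class-1 probability mass in that slice relative to the total mass there, i.e., the relative areas under the two curves over the slice.

```python
import math

def pdf(x, mu, sigma=1.0):
    # Gaussian density with mean mu and standard deviation sigma
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def true_posterior(x, mu0=0.0, mu1=2.0):
    # P(class 1 | x) under equal priors: relative heights of the two curves at x
    p0, p1 = pdf(x, mu0), pdf(x, mu1)
    return p1 / (p0 + p1)

def piecewise_estimate(n_leaves, lo=-4.0, hi=6.0):
    # Partition [lo, hi] into n_leaves equal segments; each leaf's estimate is
    # the class-1 mass in its slice divided by the total mass in the slice.
    width = (hi - lo) / n_leaves
    edges = [lo + i * width for i in range(n_leaves + 1)]

    def mass(mu, a, b, steps=50):
        # crude midpoint-rule integration of the density over [a, b]
        h = (b - a) / steps
        return sum(pdf(a + (j + 0.5) * h, mu) for j in range(steps)) * h

    estimates = []
    for a, b in zip(edges[:-1], edges[1:]):
        m0, m1 = mass(0.0, a, b), mass(2.0, a, b)
        estimates.append(m1 / (m0 + m1))
    return edges, estimates

def max_error(n_leaves, grid=400):
    # Worst-case gap between the leaf estimate and the true posterior,
    # evaluated on a fine grid over the partitioned interval
    edges, est = piecewise_estimate(n_leaves)
    lo, hi = edges[0], edges[-1]
    worst = 0.0
    for i in range(grid):
        x = lo + (i + 0.5) * (hi - lo) / grid
        leaf = min(int((x - lo) / (hi - lo) * n_leaves), n_leaves - 1)
        worst = max(worst, abs(est[leaf] - true_posterior(x)))
    return worst

# As the number of leaves grows, the slices narrow and the
# worst-case estimation error shrinks toward zero.
for k in (2, 8, 32, 128):
    print(k, round(max_error(k), 4))
```

Running the loop shows the maximum error decreasing as the partition is refined, which is exactly the limiting behavior the argument describes: narrower slices give probability estimates closer to the true class-membership function.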