simple: first, select a window at random from the training set and build a decision tree for the current window (a sample subset containing both positive and negative examples) with the tree-building algorithm; second, classify the samples of the training set outside the window with the obtained decision tree so as to find misclassified examples; if misclassified examples exist, insert them into the window and return to the tree-building step; otherwise, stop.
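A minimal Python sketch of this windowing loop is given below. The helper functions build_tree and classify, the window_size parameter, and the assumption that each sample carries a label field are hypothetical stand-ins for illustration, not part of the original algorithm description.

```python
import random

def id3_with_window(training_set, build_tree, classify, window_size):
    """Grow a tree on a window, enlarging the window with misclassified samples.

    build_tree(samples) and classify(tree, sample) stand in for the ID3
    tree-building and classification procedures; each sample is assumed
    to carry a .label attribute holding its true class.
    """
    window = random.sample(training_set, window_size)
    while True:
        tree = build_tree(window)                  # build a tree from the current window
        rest = [s for s in training_set if s not in window]
        misjudged = [s for s in rest if classify(tree, s) != s.label]
        if not misjudged:                          # tree classifies all remaining samples correctly
            return tree
        window.extend(misjudged)                   # add misclassified samples and rebuild
```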
The ID3 algorithm has the following problems:
When searching the space of decision trees, the ID3 algorithm maintains only a single current hypothesis and therefore loses the advantages of representing all consistent hypotheses; for example, it cannot determine how many other decision trees are consistent with the current training data, nor pose new example queries that would optimally discriminate among these competing hypotheses.
The ID3 algorithm does not backtrack during its search. Once an attribute has been chosen for testing at some level of the tree, the choice is never reconsidered, so the algorithm can easily converge to a locally optimal solution rather than a globally optimal one.
The attribute-selection measure used by ID3, which is based on mutual information, is biased toward attributes with many distinct values, yet such an attribute is not necessarily the one with the best classification power.
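As an illustration of this bias (not taken from the paper), the following Python snippet computes the information gain ID3 relies on and shows how a many-valued attribute such as a unique identifier can receive the maximal gain even though it has no real predictive value; the toy data and function names are assumptions made for the example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute_index):
    """Reduction in entropy obtained by splitting on one attribute."""
    total = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute_index], []).append(label)
    remainder = sum(len(sub) / total * entropy(sub) for sub in subsets.values())
    return entropy(labels) - remainder

# Toy data: attribute 0 is a unique ID, attribute 1 is a genuinely informative feature.
rows = [(1, 'a'), (2, 'a'), (3, 'b'), (4, 'b')]
labels = ['yes', 'yes', 'no', 'yes']
print(information_gain(rows, labels, 0))  # ~0.81: the many-valued ID attribute gets maximal gain
print(information_gain(rows, labels, 1))  # ~0.31: the informative attribute scores lower
```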
ID3 is a greedy algorithm and cannot accept training samples incrementally: every addition of new examples requires discarding the existing decision tree and constructing a new one from scratch, which incurs considerable overhead. The ID3 algorithm is therefore unsuitable for incremental learning.
The ID3 algorithm is sensitive to noise. Quinlan defines noise as erroneous attribute values and erroneous class labels in the training sample data.
The ID3 algorithm concentrates its effort on attribute selection, an emphasis that some scholars have questioned; whether the choice of attribute-selection measure greatly affects the accuracy of the resulting decision tree has still not been settled.
Generally speaking, the ID3 algorithm is well suited to large-scale learning problems because of its clear theory, simple method, and strong learning capability. It is a classic example in the data mining and machine learning fields, as well as a useful tool for knowledge acquisition.