The SCENT model contains several sources of
stochasticity: the random order of vector
presentation, the noisy process of growth, and the nondeterministic
pruning of unsuccessful growth. Whilst
this allows for exploration of the architectural and
classification spaces available to the model, it also
implies that each run produces a unique structure.
Stability and repeatability therefore become important
aspects of SCENT’s performance. To address
this issue, the two zoo data sets (with and without the
type label) and the picture data set were presented
to the model in four separate runs. The structural features of the
resulting trees are presented in Table 3.
It is apparent, first of all, that there is variation in
the overall structure of trees produced in different runs;
however, this variation is not excessive.
Overall the trees produced for the picture data set were
larger than those produced for either of the zoo sets.
This is accounted for exclusively by these trees having a
greater branching factor; indeed the zoo data tended to
produce slightly deeper trees. Given the model's
growth criterion for new clusters (downwards for dense
data points, sideways for spatially separated data),
this implies that the picture data has greater spatial
separation.
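
The structural features compared above can be made concrete with a short sketch. The nested-dictionary tree representation, the function names and the two toy trees are illustrative assumptions; they are not taken from SCENT or from Table 3. The sketch only shows how depth and mean branching factor distinguish a wide, shallow tree from a narrow, deep one of the same size.

```python
from statistics import mean

# Hypothetical representation: each node is a dict holding a list of children.
def node(*children):
    return {"children": list(children)}

def depth(tree):
    """Number of levels on the longest root-to-leaf path (a lone root has depth 1)."""
    if not tree["children"]:
        return 1
    return 1 + max(depth(child) for child in tree["children"])

def size(tree):
    """Total number of nodes in the tree."""
    return 1 + sum(size(child) for child in tree["children"])

def mean_branching(tree):
    """Mean number of children over internal (non-leaf) nodes; 0.0 if there are none."""
    counts = []
    stack = [tree]
    while stack:
        current = stack.pop()
        if current["children"]:
            counts.append(len(current["children"]))
            stack.extend(current["children"])
    return mean(counts) if counts else 0.0

# Two toy trees with the same number of nodes but different shapes.
wide = node(node(), node(), node())   # sideways growth: a root with three leaves
deep = node(node(node(node())))       # downwards growth: a chain of four nodes

for name, tree in (("wide", wide), ("deep", deep)):
    print(name, size(tree), depth(tree), mean_branching(tree))
```

Under these assumptions, both toy trees contain four nodes, but the wide tree has the higher branching factor and the deep tree the greater depth, which is the kind of contrast drawn between the picture and zoo trees.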
The two zoo data sets produced trees of slightly different
shape. Removing the type label made the trees
shallower but more branched, which, by the same
argument, indicates that the untyped data are the more
spatially separated. The reason is that two similar
vectors with identical type fields become slightly less
similar when the type field is removed. The full
theoretical relationship between training data, in
general, and the resulting structures produced by
SCENT is an issue currently being investigated.
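
The effect of removing the type field, noted above, can be illustrated with a small worked example. The similarity measure used here (the fraction of matching fields) and the two attribute vectors are assumptions chosen purely for illustration; the measure SCENT actually uses may differ.

```python
def match_similarity(a, b):
    """Fraction of positions at which two equal-length vectors agree."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Hypothetical vectors: four binary attributes followed by a type label.
a = [1, 0, 1, 1, "mammal"]
b = [1, 0, 1, 0, "mammal"]

# With the shared type field, 4 of 5 fields agree.
with_type = match_similarity(a, b)

# With the type field dropped, only 3 of 4 fields agree.
without_type = match_similarity(a[:-1], b[:-1])

print(with_type, without_type)
```

Because the shared type field contributes a guaranteed match, dropping it lowers the similarity of the pair (here from 0.8 to 0.75), which is consistent with the untyped data appearing more spatially separated.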