Table IV shows the prediction performance of the six
classification algorithms using the proposed features. The classification
performance ranges from 89% to 91.4% in terms of
specificity and from 90.2% to 95.9% in terms of sensitivity.
Therefore, we conclude that the proposed features are suitable
for classifying UML diagrams from the input images.
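For reference, the two metrics reported above can be computed from a classifier's predictions as in the following minimal sketch. It assumes binary labels in which 1 marks the positive class and 0 the negative class; the function name and the toy labels are hypothetical and are not taken from the experiments.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Sensitivity = TP / (TP + FN): share of positive samples recognised.
    Specificity = TN / (TN + FP): share of negative samples rejected.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against a class that is absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Under this convention, specificity measures how well the negative class is filtered out, and sensitivity how many positive samples are retained.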
The four features whose InfoGain values equal 0 can be
considered to have negligible influence and can be excluded
from the feature set. We check this exclusion by comparing
the performance of the classification algorithms on two
feature sets: the full feature set (called FS23) and the
reduced feature set (called FS19). The result is
shown in Table VI, which indicates that the exclusion helps
the classification algorithms improve their performance in
eliminating non-UML CDs. Compared with FS23, sensitivity
scores on FS19 decrease slightly, by at most 0.2% across
all algorithms, while specificity values increase by 0.2% to
0.7% for 3 of the 6 algorithms.
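The exclusion step can be illustrated with a minimal sketch of information-gain filtering. It assumes that each feature has already been discretized into a small number of values (continuous image features would need binning first) and that information gain is defined as H(Y) - H(Y | X). The function names, the feature names, and the toy data are hypothetical, and the sketch is not the tool used to produce the reported InfoGain values.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain(feature_values, labels):
    """Information gain of one discrete feature: H(Y) - H(Y | X)."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def drop_zero_gain(columns, labels, tol=1e-12):
    """Keep only the feature names whose information gain exceeds tol."""
    return [name for name, values in columns.items()
            if info_gain(values, labels) > tol]


# Hypothetical toy data: one informative and two uninformative features.
labels = [1, 1, 0, 0]
columns = {
    "f_informative": [1, 1, 0, 0],  # separates the classes perfectly
    "f_constant": [5, 5, 5, 5],     # never varies, so gain is 0
    "f_unrelated": [0, 1, 0, 1],    # independent of the label, gain is 0
}
print(drop_zero_gain(columns, labels))
```

In the same spirit, a reduced set such as FS19 would be obtained from FS23 by keeping only the features that pass the filter; the classifiers are then retrained and evaluated on both sets for comparison.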