Table 4 shows how the error of the canopy clustering varies
when the final number of clusters is changed. Note that
having the correct number of clusters (121), or slightly more,
provides the best accuracy. Detailed error analysis shows
that most of our error (85%) comes from citations that