We used an experimental methodology similar to the one used
to demonstrate the effectiveness of the SBK model [33]. For each
dataset, we first ran k-means clustering, using the additive inverse
of the corresponding Bregman divergence as the similarity measure and
setting the number of clusters to the number of underlying categories
in the dataset. The resulting clustering was used to initialize our
overlapping clustering algorithm.
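
The initialization step can be sketched as follows. This is only a minimal illustration, not the implementation used in the experiments: the function names (`bregman_kmeans`, `to_membership`), the random seeding of centroids, and the two example divergences are assumptions. Maximizing the additive inverse of the divergence is written here as minimizing the divergence itself, and centroids are updated as cluster means, which is the optimal representative under any Bregman divergence.

```python
import numpy as np

def squared_euclidean(x, mu):
    """Bregman divergence generated by phi(x) = ||x||^2."""
    return np.sum((x - mu) ** 2, axis=-1)

def i_divergence(x, mu, eps=1e-12):
    """Generalized I-divergence, for non-negative data (eps avoids log(0))."""
    x = np.maximum(x, eps)
    mu = np.maximum(mu, eps)
    return np.sum(x * np.log(x / mu) - x + mu, axis=-1)

def bregman_kmeans(X, k, divergence=squared_euclidean, n_iter=100, seed=0):
    """Hard k-means where similarity is the negated Bregman divergence."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Assumption: centroids are seeded with k distinct random data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # d[i, j] = divergence from point i to centroid j
        d = divergence(X[:, None, :], centroids[None, :, :])
        new_labels = np.argmin(d, axis=1)  # same as argmax of -d
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return labels, centroids

def to_membership(labels, k):
    """Turn hard labels into an n-by-k binary membership matrix."""
    M = np.zeros((len(labels), k), dtype=int)
    M[np.arange(len(labels)), labels] = 1
    return M
```

Here `k` would be set to the number of underlying categories, and the binary matrix returned by `to_membership` is one plausible form in which the hard clustering could be handed to an overlapping clustering procedure as its starting point.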
To evaluate the clustering results, precision, recall, and F-measure
were calculated over pairs of points. For each pair of points that
share at least one cluster in the overlapping clustering results, these
measures try to estimate whether the prediction of this pair as being
in the same cluster was correct with respect to the underlying
true categories in the data. Precision is the fraction of pairs placed
in the same cluster that truly belong together, recall is the fraction
of pairs that truly belong together that were placed in the same
cluster, and F-measure is the harmonic mean of precision and recall.
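
The pairwise evaluation can be illustrated with a short sketch. The function name `pairwise_prf` and the representation of each point as a set of cluster ids and a set of category labels are assumptions made for illustration. The sketch also assumes that a pair counts as a true pair when the two points share at least one true category.

```python
from itertools import combinations

def pairwise_prf(pred_clusters, true_categories):
    """Pairwise precision, recall, and F-measure for overlapping clusterings.

    pred_clusters[i]   : set of cluster ids assigned to point i
    true_categories[i] : set of true category labels of point i
    """
    tp = pred_pairs = true_pairs = 0
    for i, j in combinations(range(len(pred_clusters)), 2):
        # A pair is predicted together if the points share at least one cluster.
        same_pred = bool(pred_clusters[i] & pred_clusters[j])
        # Assumption: a pair is truly together if it shares at least one category.
        same_true = bool(true_categories[i] & true_categories[j])
        pred_pairs += same_pred
        true_pairs += same_true
        tp += same_pred and same_true
    precision = tp / pred_pairs if pred_pairs else 0.0
    recall = tp / true_pairs if true_pairs else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure

# Toy example: point 1 belongs to both predicted clusters.
pred = [{0}, {0, 1}, {1}, {1}]
true = [{"a"}, {"a"}, {"b"}, {"b"}]
print(pairwise_prf(pred, true))
```

In the toy example, four pairs share a predicted cluster and two of them share a true category, so precision is 0.5. Both true pairs are recovered, so recall is 1.0, and the F-measure is 2/3.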