Commonality (i.e., high similarity) here
identifies the document sets (clusters) that are
relevant to the domain of interest. The assumption
is that individual clusters with high similarity
across ontology entities have a high probability of
belonging to the same domain. This hypothesis is
supported by observed patterns of collocated terms
within the same domain; different domains,
conversely, exhibit different term collocation
patterns (for more details, see [Tomassen &
Strasunskas, 2009]). However, the similarity of
clusters depends heavily on the quality of the
ontology, especially on the semantic distance
between entities.
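
The commonality idea can be illustrated with a minimal sketch. It is not the method of [Tomassen & Strasunskas, 2009]. It assumes that each cluster is summarized as a term-frequency vector, that similarity is measured by cosine, and that a cluster's commonality is the mean of its best match against the clusters of every other entity. The function names, the entity names and the toy data are all hypothetical, and the semantic distance between entities is not modeled here.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def commonality(cluster, entity, clusters_by_entity):
    """Mean of the best similarity between `cluster` and the clusters
    of every other entity (an assumed scoring rule, for illustration)."""
    best = [
        max(cosine(cluster, other) for other in others)
        for name, others in clusters_by_entity.items()
        if name != entity and others
    ]
    return sum(best) / len(best) if best else 0.0


def select_domain_clusters(clusters_by_entity):
    """For each entity, pick the index of its cluster with the highest commonality."""
    return {
        entity: max(
            range(len(clusters)),
            key=lambda i: commonality(clusters[i], entity, clusters_by_entity),
        )
        for entity, clusters in clusters_by_entity.items()
        if clusters
    }


# Hypothetical toy data: two ontology entities, each with two document clusters.
clusters_by_entity = {
    "Jaguar": [Counter("car engine speed car".split()), Counter("cat jungle prey".split())],
    "Engine": [Counter("car engine fuel".split()), Counter("search engine web".split())],
}

selected = select_domain_clusters(clusters_by_entity)
print(selected)
```

In this toy example the automotive clusters of both entities share collocated terms ("car", "engine"), so they score highest and are selected, while the clusters about animals and web search are left out.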