Overlapping clustering. In 1979 Shepard and Arabie introduced
the ADCLUS algorithm [20] for additive clustering,
which perhaps can be considered the first overlapping-clustering
method. The method, which was later applied
in the marketing domain [21], subsumes hierarchical clustering
as a special case and can be regarded as a discrete analog of
principal components analysis.
Despite these early roots, in recent decades overlapping
clustering has not attracted as much attention as non-overlapping
clustering. One close sibling is fuzzy clustering
[22], where each data point has a membership value in all
the clusters. In this context cluster membership is “soft”, as
opposed to our paper, in which we are interested in “hard” cluster
assignments. A hard (and overlapping) cluster assignment
can be obtained by thresholding the membership values.
The prototypical fuzzy-clustering method is fuzzy c-means,
which is essentially a soft version of k-means.
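
The thresholding step mentioned above can be illustrated with a minimal
sketch. It assumes a soft membership matrix is already available (for
example, produced by fuzzy c-means); the function name
threshold_memberships, the threshold tau, and the rule that every point
keeps at least its highest-membership cluster are illustrative
assumptions, not part of any of the cited methods.

```python
import numpy as np

def threshold_memberships(U, tau=0.3):
    """Turn soft memberships into hard, possibly overlapping assignments.

    U   : array of shape (n_points, n_clusters) with soft membership values.
    tau : hypothetical cut-off; a point joins every cluster whose
          membership is at least tau.
    Returns a list with one set of cluster indices per point.
    """
    U = np.asarray(U, dtype=float)
    hard = U >= tau
    # Assumption: no point is left unassigned, so each point always keeps
    # the cluster in which its membership is largest.
    hard[np.arange(U.shape[0]), U.argmax(axis=1)] = True
    return [set(np.flatnonzero(row).tolist()) for row in hard]

# Toy membership matrix: three points, three clusters.
U = np.array([[0.90, 0.10, 0.00],
              [0.45, 0.45, 0.10],
              [0.20, 0.20, 0.60]])
assignments = threshold_memberships(U, tau=0.3)
```

With this toy matrix the second point lands in two clusters at once, which
is the kind of hard overlap referred to above; the result depends heavily
on the choice of tau.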
More recently, mixture models have been generalized to allow
overlapping clusters. Banerjee et al. [7] generalize the work
of Segal et al. [23] to work with any regular exponential
family distribution and the corresponding Bregman divergence.
The work of Banerjee et al. [7] was later extended to co-clustering
by Shafiei and Milios [24]. Multiplicative mixture
models have been proposed as a framework for overlapping
clustering by Fu and Banerjee [25].
Our work differs from this body of research in that it is
developed within the correlation clustering framework, and thus
it has a different input and different objectives. One of the main
differences is that the methods discussed above are not easily
applicable when feature vectors are not available, as in our
applications to trajectories and proteins.
