Abstract: We present a new iterative method for probabilistic clustering of data.
Given clusters, their centers and the distances of data points from these centers, the
probability of cluster membership at any point is assumed inversely proportional to
the distance from (the center of) the cluster in question. This assumption is our working
principle.
The method is a generalization, to several centers, of the Weiszfeld method for
solving the Fermat–Weber location problem. At each iteration, the distances (Euclidean,
Mahalanobis, etc.) from the cluster centers are computed for all data points, and
the centers are updated as convex combinations of these points, with weights determined
by the above principle. Computations stop when the centers stop moving.
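
As a rough illustration of the iteration just described, here is a minimal sketch in Python with NumPy. It is not taken from the paper: the function name `pd_cluster`, the parameters `tol`, `max_iter`, and `eps`, the use of Euclidean distance, and in particular the weight formula p^2/d are assumptions made for this example. The abstract states only that the weights follow from the working principle.

```python
import numpy as np

def pd_cluster(X, centers, tol=1e-6, max_iter=100, eps=1e-12):
    """Illustrative sketch: membership probabilities inversely proportional
    to distance, centers updated as convex combinations of the data points.
    Assumes max_iter >= 1."""
    X = np.asarray(X, dtype=float)
    C = np.asarray(centers, dtype=float).copy()
    for _ in range(max_iter):
        # Euclidean distances from every point to every center, shape (n, K)
        D = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
        D = np.maximum(D, eps)  # guard against division by zero at a center
        # Working principle: p_k(x) proportional to 1/d_k(x); each row sums to 1
        P = 1.0 / D
        P /= P.sum(axis=1, keepdims=True)
        # ASSUMPTION: unnormalized weights u_k(x) = p_k(x)^2 / d_k(x)
        U = P**2 / D
        # Normalize per cluster so each new center is a convex combination
        W = U / U.sum(axis=0, keepdims=True)
        C_new = W.T @ X
        shift = np.linalg.norm(C_new - C)
        C = C_new
        if shift < tol:  # stop when the centers stop moving
            break
    return C, P
```

For example, `pd_cluster(X, initial_centers)` returns the final centers together with the membership probabilities computed in the last iteration. A Mahalanobis distance could be substituted for the Euclidean norm in the line that computes `D`.
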
Progress is monitored by the joint distance function, a measure of distance from
all cluster centers that evolves during the iterations and captures the data in its low
contours.
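
The abstract does not give a formula for the joint distance function. One plausible form, suggested by the keyword "Harmonic mean", is the reciprocal of the sum of reciprocal distances (the harmonic mean of the distances up to a constant factor). The sketch below assumes that form and Euclidean distance; the function name `joint_distance` and the parameter `eps` are hypothetical.

```python
import numpy as np

def joint_distance(X, centers, eps=1e-12):
    """ASSUMED form of the joint distance: 1 / sum_k (1 / d_k(x)).
    It is small wherever a point is close to at least one center."""
    X = np.asarray(X, dtype=float)
    C = np.asarray(centers, dtype=float)
    # Euclidean distances from every point to every center, shape (n, K)
    D = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
    D = np.maximum(D, eps)  # guard against division by zero at a center
    return 1.0 / (1.0 / D).sum(axis=1)
```

Under this assumption the function is nearly zero at each center and grows with distance from all of them, so its low contours surround the centers.
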
The method is simple, fast (requiring a small number of cheap iterations) and
insensitive to outliers.
Keywords: Clustering; Probabilistic clustering; Mahalanobis distance; Harmonic
mean; Joint distance function; Weiszfeld method; Similarity matrix