Clustering (or cluster analysis) aims to organize a collection of data items
into clusters, such that items within a cluster are more “similar” to each other
than they are to items in the other clusters. This notion of similarity can be
expressed in very different ways, according to the purpose of the study, to
domain-specific assumptions and to prior knowledge of the problem.
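As a minimal illustration of this definition, the sketch below compares the average distance between items that share a cluster with the average distance between items in different clusters. It assumes Euclidean distance as the (dis)similarity measure, which is only one of the many possible choices mentioned above; the function name `mean_pairwise_distances` and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def mean_pairwise_distances(points, labels):
    """Return (mean within-cluster distance, mean between-cluster distance).

    Assumes at least one pair of items shares a cluster and at least one
    pair does not; otherwise one of the averages is undefined.
    """
    within, between = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        # Pairs with the same label contribute to the within-cluster average.
        (within if lp == lq else between).append(dist(p, q))
    return sum(within) / len(within), sum(between) / len(between)


# Toy data: two well-separated groups of 2-D points (illustrative only).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels = [0, 0, 0, 1, 1, 1]

within, between = mean_pairwise_distances(points, labels)
print(f"within={within:.3f} between={between:.3f}")
```

For a partition that matches the definition, the within-cluster average should be clearly smaller than the between-cluster average under the chosen measure.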
Clustering is usually performed when no information is available concerning
the membership of data items in predefined classes. For this reason,
clustering is traditionally seen as part of unsupervised learning. We
nevertheless use the term unsupervised clustering here to distinguish it from
a more recent and less common approach that makes use of a small amount of
supervision to “guide” or “adjust” clustering (see section 2).
To support the extensive use of clustering in computer vision, pattern
recognition, information retrieval, data mining, etc., a great many
methods have been developed across several communities. Detailed surveys of
this domain can be found in [25], [27] or [26]. In the following, we attempt to
briefly review a few core concepts of cluster analysis and describe categories
of clustering methods that are best represented in the literature. We also take
this opportunity to provide some pointers to more recent work on clustering.
