Within each subspace divided by COI, an optimized k-means clustering algorithm is developed to
organize the features into clusters, as illustrated in Fig. 8. The main reason for clustering is to
facilitate the queries discussed in the next section. k-means, which has been studied extensively, is
one of the most popular clustering algorithms in pattern recognition, owing to its guaranteed
convergence to a local minimum and its simplicity of implementation. However, k-means suffers
from several intrinsic deficiencies that impede its application to data mining. First, it is very
sensitive to the initial starting conditions: since the algorithm is fully deterministic once the initial
centers are chosen, a poor random or arbitrary initialization can trap it in an inferior local
minimum. Second, the number of clusters must be supplied as a parameter, which presumes that
a priori knowledge about the data is available. Third, the computation is expensive, as multiple
scans of the data are required to achieve convergence. Although a universal solution does not exist, various approaches
have been proposed as partial remedies. Speed is greatly improved by embedding the data set in a
multiresolution kd-tree and storing sufficient statistics at its nodes [40]. X-means [41] can quickly
estimate the number of clusters and scales better than k-means. The number of clusters can also be
determined automatically via a cluster validity measure [42] based on intra-cluster and inter-cluster
distances. Bradley and Fayyad [43] propose refining the selection of starting
centers through repeated sub-sampling and smoothing. An empirical comparison of several
initialization methods for the k-means algorithm can be found in [44].
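As a concrete illustration (not the paper's implementation), the sketch below runs plain Lloyd's k-means, softens the initialization sensitivity with random restarts, and picks the number of clusters with a simple intra-/inter-cluster distance ratio in the spirit of the validity measure cited as [42]. The function names, restart count, and toy data are all our own assumptions.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(cluster):
    """Component-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(c) / n for c in zip(*cluster))

def kmeans(points, k, seed=0, max_iter=100):
    """Plain Lloyd's k-means; the outcome depends entirely on the
    seed-chosen initial centers (the sensitivity discussed above)."""
    rng = random.Random(seed)
    centers = [tuple(map(float, p)) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: dist2(p, centers[i]))].append(p)
        # Update step: move each center to its cluster mean
        # (an empty cluster keeps its stale center).
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:   # converged; note each iteration is a full data scan
            break
        centers = new_centers
    return centers, clusters

def validity(centers, clusters):
    """Mean intra-cluster distance over minimum inter-center distance;
    lower is better. A simple stand-in for a validity measure like [42]."""
    n = sum(len(c) for c in clusters)
    intra = sum(dist2(p, centers[i]) ** 0.5
                for i, c in enumerate(clusters) for p in c) / n
    inter = min(dist2(centers[i], centers[j]) ** 0.5
                for i in range(len(centers))
                for j in range(i + 1, len(centers)))
    return intra / inter

# Two well-separated toy blobs; the natural answer is k = 2.
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 10), (11, 10), (10, 11), (11, 11)]
scores = {}
for k in (2, 3, 4):
    best = float("inf")
    for seed in range(5):            # restarts soften initialization sensitivity
        centers, clusters = kmeans(points, k, seed=seed)
        if any(not c for c in clusters):
            continue                 # this restart collapsed a cluster; skip it
        best = min(best, validity(centers, clusters))
    scores[k] = best
best_k = min(scores, key=scores.get)  # smallest validity ratio wins
```

On this toy data the validity ratio is smallest for k = 2, since splitting either blob shrinks the minimum inter-center distance far faster than it shrinks the intra-cluster spread.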