1. Introduction
The application of data mining techniques is becoming increasingly important in modern organizations that seek to use
the knowledge embedded in their large volumes of organizational data to improve efficiency, effectiveness and competitiveness.
In recent years, data mining practitioners and researchers have become aware of the need for formal data mining
process models that prescribe the journey from data to discovered knowledge. Thus a multi-industry collective of practitioners
(e.g. www.crisp-dm.org; [55]) came together to develop the Cross-Industry Standard Process for Data Mining
(CRISP-DM), which was further extended by researchers (e.g. [17]). The data mining process (see Fig. 1) has been described
in various ways (e.g. [17,55]) but essentially consists of the following steps [40]: Application Domain or Business Understanding
(which includes definition of data mining goals), Data Understanding, Data Preparation, Data Mining, Evaluation
(e.g. evaluation of results based on DM goals), and Deployment.
In this paper we will use the term segmentation to refer to the set of clusters that result from a given partitioning of the
dataset by a clustering algorithm. A common presumption underlying proposed solution
approaches for clustering is that, for each dataset, there is a single optimal segmentation (i.e. partitioning) that is independent
of the objectives of the end-user. For example, Halkidi et al. [32] speak of the “‘optimal’ clustering scheme as the outcome of
running a clustering algorithm (i.e., a partitioning) that best fits the inherent partitions of the data set”. How, then, would this ‘optimal’
partitioning (i.e. segmentation) be identified if two or more segmentations appear to have approximately the ‘best’ fit?
Further, would this choice be based on the goals of the clustering exercise (e.g. fraud detection vs. marketing profiling vs.
theory building)? It appears to us that, when clustering is used for knowledge discovery, what constitutes the ‘best fit’ may not
be independent of the goals of the clustering exercise. Thus Kim [38] suggests that for some situations “comparison between