In this paper we present a fast algorithm for clustering categorical data. The algorithm, called k-modes, is an extension of the well-known k-means algorithm (MacQueen 1967). Compared to other clustering methods, the k-means algorithm and its variants (Anderberg 1973) are efficient in clustering large data sets and are thus well suited to data mining. However, their use is often limited to numeric data because these algorithms minimize a cost function by calculating the means of clusters. Data mining applications frequently involve categorical data. The traditional approach to converting
categorical data into numeric values does not necessarily produce meaningful results when the categorical
domains are not ordered. The k-modes algorithm in this paper removes this limitation and extends the k-means
paradigm to categorical domains whilst preserving the efficiency of the k-means algorithm.
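
The limitation described above can be illustrated with a small sketch. The toy attribute, the integer coding, and the names `coding`, `nearest`, and `matching_dissimilarity` are hypothetical and chosen only for illustration. Counting mismatched attributes is one common dissimilarity for categorical records and is assumed here, since this section does not define the measure used by k-modes.

```python
from collections import Counter

# Hypothetical toy attribute with an unordered domain.
colours = ["red", "blue", "red", "blue", "blue"]

# Arbitrary integer coding (an assumption for illustration only).
coding = {"red": 0, "green": 1, "blue": 2}
codes = [coding[c] for c in colours]

# k-means-style centre: the arithmetic mean of the coded values.
mean_code = sum(codes) / len(codes)

# The category whose code lies closest to the mean. Its identity depends
# entirely on the arbitrary coding, not on the data.
nearest = min(coding, key=lambda c: abs(coding[c] - mean_code))

# Mode-style centre: the most frequent category, independent of any coding.
mode = Counter(colours).most_common(1)[0][0]


def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


print(mean_code, nearest, mode)
```

With this coding the mean falls nearest to "green", a value that never occurs in the data, whereas the mode is a category that is actually observed. Reordering the codes would change the mean-based centre but not the mode.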