k-means is a simple algorithm that rests on the firm foundation of analysis of variance. It partitions a set of data into a predefined number of clusters, k. The algorithm starts with randomly initialized cluster centroids and repeatedly reassigns each data object in the dataset to a cluster centroid based on the similarity between the data object and the cluster centroids. The reassignment procedure stops when a convergence criterion is met (e.g. a fixed number of iterations is reached, or the cluster assignments do not change after a certain number of iterations). The k-means clustering process is described by the following four steps: