ABSTRACT
Outliers are points that differ from, or are
inconsistent with, the rest of the data. They may
represent novel, abnormal, unusual or noisy information.
Outliers are sometimes more interesting than the
majority of the data. As datasets grow in complexity,
size and variety, the main challenges of outlier
detection are how to catch similar outliers as a group
and how to evaluate those outliers. This paper
describes an approach that uses univariate outlier
detection as a pre-processing step to detect outliers
and then applies the K-means algorithm in order to analyse
the effects of those outliers on the cluster analysis of
the dataset.
Keywords: Outlier, Univariate outlier detection,
K-means algorithm.
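
The two-stage approach summarised above can be illustrated with a minimal
sketch. The abstract does not say which univariate rule is used, so the
sketch assumes Tukey's 1.5 x IQR fences applied to each feature separately.
The function names (iqr_outlier_mask, univariate_outliers, kmeans), the
synthetic data and the choice of two clusters are all hypothetical and serve
only as an illustration, not as the implementation used in this paper.

```python
import numpy as np


def iqr_outlier_mask(x, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule: Tukey's fences)."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)


def univariate_outliers(data, k=1.5):
    """Mark a row as an outlier if any single feature is flagged."""
    masks = [iqr_outlier_mask(data[:, j], k) for j in range(data.shape[1])]
    return np.any(masks, axis=0)


def assign(data, centroids):
    """Return the index of the nearest centroid for every row."""
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(data, n_clusters, n_iter=100, seed=0):
    """Plain Lloyd's K-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from randomly chosen data points.
    centroids = data[rng.choice(len(data), n_clusters, replace=False)]
    for _ in range(n_iter):
        labels = assign(data, centroids)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new = np.array([
            data[labels == c].mean(axis=0) if np.any(labels == c) else centroids[c]
            for c in range(n_clusters)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, assign(data, centroids)


# Synthetic illustration: two compact groups plus three injected extreme points.
rng = np.random.default_rng(42)
blob_a = rng.normal(0.0, 1.0, size=(50, 2))
blob_b = rng.normal(10.0, 1.0, size=(50, 2))
injected = np.array([[60.0, 60.0], [65.0, 58.0], [62.0, 63.0]])
data = np.vstack([blob_a, blob_b, injected])

mask = univariate_outliers(data)             # pre-processing step
cent_raw, _ = kmeans(data, 2)                # clustering with outliers present
cent_clean, _ = kmeans(data[~mask], 2)       # clustering after removing them

print("flagged rows:", int(mask.sum()))
print("centroids with outliers:\n", cent_raw)
print("centroids without outliers:\n", cent_clean)
```

Comparing the two sets of centroids gives a simple view of how far the
flagged points pull the cluster centres. A different univariate rule, such
as a z-score threshold, could be substituted in iqr_outlier_mask without
changing the rest of the sketch.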