Clustering algorithms are attractive for the task of class identification
in spatial databases. However, the application to
large spatial databases raises the following requirements for
clustering algorithms: minimal requirements of domain
knowledge to determine the input parameters, discovery of
clusters with arbitrary shape and good efficiency on large databases.
The well-known clustering algorithms offer no solution
to the combination of these requirements. In this paper,
we present the new clustering algorithm DBSCAN, which relies on
a density-based notion of clusters and is designed to discover
clusters of arbitrary shape. DBSCAN requires only one
input parameter and supports the user in determining an appropriate
value for it. We performed an experimental evaluation
of the effectiveness and efficiency of DBSCAN using
synthetic data and real data of the SEQUOIA 2000 benchmark.
The results of our experiments demonstrate that (1)
DBSCAN is significantly more effective in discovering clusters
of arbitrary shape than the well-known algorithm CLARANS,
and that (2) DBSCAN outperforms CLARANS by
a factor of more than 100 in terms of efficiency.
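
To make the density-based notion of clusters concrete, a minimal sketch in Python follows. It is an illustration under stated assumptions, not the implementation evaluated in this paper. The names region_query, dbscan, eps and min_pts are hypothetical. The neighborhood search is a brute-force scan rather than a query on a spatial index, so it does not reflect the efficiency results. The density threshold min_pts is assumed to be held fixed, so that the neighborhood radius eps is the single value left to choose.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return the indices of all points within distance eps of points[i]."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts=4):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    Assumption: min_pts is fixed, eps is the only tuned parameter.
    """
    labels = [None] * len(points)  # None means "not yet visited"
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Not dense enough to start a cluster; may later become a border point.
            labels[i] = NOISE
            continue
        # points[i] is a core point: grow a new cluster from it.
        labels[i] = cluster_id
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point, reachable but not core
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also a core point: keep expanding
        cluster_id += 1
    return labels


# Two dense 3x3 grids far apart, plus one isolated point.
grid_a = [(x, y) for x in range(3) for y in range(3)]
grid_b = [(x + 10, y + 10) for x in range(3) for y in range(3)]
sample = grid_a + grid_b + [(50, 50)]
sample_labels = dbscan(sample, eps=1.5, min_pts=4)
```

In this sketch, clusters grow by following chains of dense neighborhoods, which is why no cluster shape is assumed in advance, and points in sparse regions are labeled as noise instead of being forced into a cluster.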
Keywords: Clustering Algorithms, Arbitrary Shape of Clusters,
Efficiency on Large Spatial Databases, Handling Noise