The most distinct characteristic of data mining is that
it deals with very large data sets (gigabytes or even
terabytes). This requires the algorithms used in data
mining to be scalable. However, most algorithms currently
used in data mining do not scale well when applied to very
large data sets because they were initially developed for
other applications than data mining which involve small
data sets. The study of scalable data mining algorithms has
recently become a data mining research focus (Shafer et al.
1996).