Abstract Recent developments in Graphics Processing Units (GPUs) have enabled inexpensive
high-performance computing for general-purpose applications. The Compute
Unified Device Architecture (CUDA) programming model provides programmers
with C-like APIs to better exploit the parallel power of the GPU. Data
mining is widely used and has significant applications in various domains. However,
current data mining toolkits cannot meet the speed requirements of applications with
large-scale databases. In this paper, we propose three techniques
to speed up the solution of fundamental problems in data mining algorithms on the CUDA platform:
a scalable thread scheduling scheme for irregular patterns, a parallel distributed top-k
scheme, and a parallel high-dimension reduction scheme. They play a key role in our
CUDA-based implementations of three representative data mining algorithms: CUApriori,
CU-KNN, and CU-K-means. These parallel implementations significantly outperform
other state-of-the-art implementations on an HP xw8600 workstation with
a Tesla C1060 GPU and a quad-core Intel Xeon CPU. Our results show that
the GPU + CUDA parallel architecture is feasible and promising for data mining applications.