Through the C-based API CUDA (Compute Unified Device Architecture), NVIDIA
recently brought the power of parallel computing on Graphics Processing Units (GPUs) to
general-purpose algorithms [4, 5]. This opportunity represents a promising alternative for solving the kNN problem in reasonable time. In this paper, we propose a CUDA implementation
for solving the brute-force kNN search problem. We compared its performance with that of several CPU-based implementations. The proposed CUDA implementation is faster by up to two orders of magnitude. Moreover, we noticed that the dimension of the sample points has only a small impact on its computation time, contrary to the C-based implementations.