processor cores. CUDA C is a language designed specifically for general-purpose computing on the
GPU. CUDA C adds the __global__ qualifier to standard C. This mechanism alerts the compiler that a function
should be compiled to run on the device (the GPU) instead of the host (the CPU). In this simple example, nvcc gives the function
kernel() to the compiler that handles device code, and it feeds main() to the host compiler. CUDA C provides this
language integration so that device function calls look very much like host function calls [5].
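The split between kernel() and main() described above can be sketched as follows. This is a minimal illustration, not the exact listing from [5]: the empty kernel body, the helper name launch_kernel(), the <<<1, 1>>> launch configuration, and the printed messages are all assumptions made for the example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// __global__ marks kernel() as device code: nvcc hands it to the device
// compiler. The body is intentionally empty (an assumption for this sketch).
__global__ void kernel(void) {
}

// Hypothetical helper: launches the kernel and reports any CUDA error.
cudaError_t launch_kernel(void) {
    // The <<<blocks, threads>>> syntax is the only visible difference from
    // an ordinary host function call; here: 1 block of 1 thread.
    kernel<<<1, 1>>>();

    // Launch errors are reported by cudaGetLastError(); execution errors
    // surface once the host waits for the device to finish.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        return err;
    }
    return cudaDeviceSynchronize();
}

// main() is ordinary host code and is passed to the host compiler.
int main(void) {
    cudaError_t err = launch_kernel();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("kernel completed\n");
    return 0;
}
```

Assuming the file is saved as example.cu (a hypothetical name), it could be built with nvcc example.cu -o example on a machine with the CUDA toolkit and a CUDA-capable GPU.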
Fig. 2 shows the architecture of CUDA, which is organized into grids and blocks. Within a grid, multiple
threads execute the same kernel in parallel. Each grid comprises a number of blocks, and each block contains
its own threads and a shared memory that those threads can access [5]. The NVIDIA CUDA technology (developer.download.nvidia.com) is a
fundamentally new computing architecture that enables the GPU to solve complex computational problems.
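The grid, block, and shared-memory hierarchy described above can be illustrated with a small sketch in which each block sums its own slice of an array through block-local shared memory. The names block_sum and run_block_sum, the sizes (2 blocks of 4 threads), and the input values 1 to 8 are hypothetical choices for illustration and do not come from the source.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define NUM_BLOCKS 2          // blocks in the grid (illustrative value)
#define THREADS_PER_BLOCK 4   // threads in each block (illustrative value)

// Every thread of every block runs this same kernel.
__global__ void block_sum(const int *in, int *out) {
    // Shared memory: one copy per block, visible only to that block's threads.
    __shared__ int cache[THREADS_PER_BLOCK];

    // Position of this thread within the whole grid.
    unsigned int global_id = blockIdx.x * blockDim.x + threadIdx.x;
    cache[threadIdx.x] = in[global_id];
    __syncthreads();  // wait until all threads of the block have written

    // Thread 0 of each block adds up that block's values.
    if (threadIdx.x == 0) {
        int sum = 0;
        for (unsigned int i = 0; i < blockDim.x; ++i) {
            sum += cache[i];
        }
        out[blockIdx.x] = sum;
    }
}

// Hypothetical host helper: writes one sum per block into result.
// Returns 0 on success and 1 on any CUDA error.
int run_block_sum(int *result) {
    const int n = NUM_BLOCKS * THREADS_PER_BLOCK;
    int host_in[NUM_BLOCKS * THREADS_PER_BLOCK];
    for (int i = 0; i < n; ++i) {
        host_in[i] = i + 1;  // input values 1..8
    }

    int *dev_in = NULL;
    int *dev_out = NULL;
    if (cudaMalloc((void **)&dev_in, n * sizeof(int)) != cudaSuccess) {
        return 1;
    }
    if (cudaMalloc((void **)&dev_out, NUM_BLOCKS * sizeof(int)) != cudaSuccess) {
        cudaFree(dev_in);
        return 1;
    }

    cudaError_t err = cudaMemcpy(dev_in, host_in, n * sizeof(int),
                                 cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        // Launch a grid of NUM_BLOCKS blocks with THREADS_PER_BLOCK threads each.
        block_sum<<<NUM_BLOCKS, THREADS_PER_BLOCK>>>(dev_in, dev_out);
        // This blocking copy also waits for the kernel to finish.
        err = cudaMemcpy(result, dev_out, NUM_BLOCKS * sizeof(int),
                         cudaMemcpyDeviceToHost);
    }

    cudaFree(dev_in);
    cudaFree(dev_out);
    return err == cudaSuccess ? 0 : 1;
}

int main(void) {
    int sums[NUM_BLOCKS];
    if (run_block_sum(sums) != 0) {
        printf("CUDA error\n");
        return 1;
    }
    for (int b = 0; b < NUM_BLOCKS; ++b) {
        printf("block %d sum = %d\n", b, sums[b]);
    }
    return 0;
}
```

With the inputs 1 to 8 used here, block 0 handles the values 1 to 4 and block 1 handles 5 to 8, so on a working CUDA device the two sums are 10 and 26. A single thread per block performs the addition to keep the sketch short; a parallel tree reduction would be the usual choice for larger blocks.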