The growing picture size and resolution of multimedia devices demand better overall image processing performance. At the same time, the availability of multicore CPUs and manycore GPUs motivates massively parallel programming approaches. We therefore investigate CUDA technology as a means of improving image processing and filtering performance. In this work, we develop a CPU/GPU-based image processing technique, implemented in CUDA/C, for fast processing of large images.