Data-parallel accelerators such as the Graphic Processor Units (GPU) are the most popular among the accelerator cores used in such designs. With easy to adopt programming models, such as Nvidia CUDA and OpenCL, these data-parallel cores are now being employed to accelerate diverse workloads.