To the best of our knowledge, the only prior work on
OpenCL GPU optimization consists of a short paper [24] and a case
study discussing an auto-tuning framework for designing
kernels [25]. Hence, the work presented here is the first
to propose OpenCL optimization strategies for
AMD GPUs. We believe that extracting peak performance requires
exploiting the causal relationship between programming techniques and the
underlying GPU architecture, which in turn calls for
architecture-specific optimizations.