In this work, we extend the knowledge of the optimization
space applicable to GPU architectures. We accomplish this
by studying the optimizations that improve performance
on a 1600-core GPU, the AMD Radeon HD 5870, and
comparing them with those known to achieve
similar results on NVIDIA’s CUDA GPU platform.