II. RELATED WORK
Previous studies on GPU energy consumption mostly focus on desktop GPUs, with comparatively little attention to embedded GPUs. Collange et al. [6] used an oscilloscope to measure GPU energy consumption in a CUDA environment and to identify the bottleneck of a GPGPU program. Leng et al. [9] built an energy model for GPGPUs using measurement data and performance counters from GPGPU-Sim. The model can also estimate the energy consumption of individual GPU components. Shaikh et al. [7] profiled the power consumption of two desktop GPGPU architectures and showed that a data transfer instruction dissipates less than half the power of a kernel instruction. Ma et al. [5] chose five main GPU workload parameters to build their energy model, where the workload parameters represent the runtime utilization of the major pipeline stages of the GPU. Hong and Kim [8] designed a set of micro-benchmarks to stress different micro-architectural components of a GPU, and built both power and performance models of the GPU. The above studies consider desktop GPUs, whose design objective is performance, whereas embedded GPUs emphasize power. Moreover, most of the above energy models rely on low-level hardware performance counters to estimate energy, and these counters may be difficult for application programmers to interpret.