For this, we adopt a large buffer (usually between 4 and 6 frames) to store the intermediate data generated by the CPU. One problem with coprocessing between a GPU and a CPU is data dependencies: the GPU and the CPU do not share a memory space, so intermediate data produced by one must be transferred before the other can use it. Another very helpful technique is to use DMA (direct memory access) [8] as an intermediate buffer between the ARM core and the GPU core.
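
The frame buffer described above can be illustrated with a minimal sketch of a bounded producer/consumer queue. This is not the implementation used here: the stage functions `cpu_stage` and `gpu_stage`, the buffer depth of 4, and the frame count are hypothetical stand-ins, and ordinary Python threads take the place of the ARM core and the GPU core. The sketch only shows how a fixed number of frame slots lets the producer run ahead of the consumer while still respecting the data dependency.

```python
import queue
import threading

BUFFER_FRAMES = 4  # assumed depth; the text suggests 4 to 6 frames
NUM_FRAMES = 10    # hypothetical number of frames for this demo


def cpu_stage(frame_id):
    """Stand-in for the CPU work: produce intermediate data for one frame."""
    return [frame_id * 10 + i for i in range(4)]


def gpu_stage(intermediate):
    """Stand-in for the GPU work: consume the intermediate data."""
    return sum(intermediate)


def run_pipeline(num_frames=NUM_FRAMES, depth=BUFFER_FRAMES):
    # Bounded FIFO: put() blocks when all frame slots are full,
    # get() blocks when no intermediate data is ready yet.
    buffer = queue.Queue(maxsize=depth)
    results = []

    def producer():
        for frame_id in range(num_frames):
            buffer.put((frame_id, cpu_stage(frame_id)))
        buffer.put(None)  # sentinel: no more frames

    def consumer():
        while True:
            item = buffer.get()
            if item is None:
                break
            frame_id, intermediate = item
            results.append((frame_id, gpu_stage(intermediate)))

    threads = [
        threading.Thread(target=producer),
        threading.Thread(target=consumer),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for frame_id, value in run_pipeline():
        print(frame_id, value)
```

With a deeper buffer the producer can run further ahead before it blocks, at the cost of more memory and higher latency per frame; in this sketch that trade-off is controlled by the single `depth` parameter.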