GPUs are throughput-oriented processors that depend on massive
multithreading to tolerate long latency memory accesses. The
latest GPUs all are equipped with on-chip data caches to reduce
the latency of memory accesses and save the bandwidth of NOC
and off-chip memory modules.