DIVERGENCE-AWARE GPU CACHE MANAGEMENT
As described in the previous section, divergent load instructions lead to many cache misses in L1D, especially inter-warp conflict misses. With more data blocks not being found in L1D, the number of warps that can be actively scheduled is significantly reduced. To address this problem, we propose Divergence-Aware Cache (DaCache) management for GPUs. Based on the observation that the re-reference intervals of cache blocks are shaped by the warp scheduler, DaCache exploits the prioritization information of warp scheduling to protect the cache blocks of highly prioritized warps from conflict-triggered eviction, thereby maximizing their chance of staying in L1D. In doing so, DaCache alleviates inter-warp conflict misses so that more warps can find all the data blocks of their load instructions in L1D. We refer to such warps as Fully Cached Warps (FCWs).
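To make the protection idea concrete, the following is a minimal, hypothetical sketch (not the paper's exact algorithm) of a replacement policy for one cache set in which the victim on a conflict miss is chosen from the block owned by the warp with the worst scheduling priority, so blocks of highly prioritized warps are protected and those warps can become Fully Cached Warps. The names `pick_victim`, `access`, and the `warp_priority` mapping are illustrative assumptions.

```python
# Hypothetical sketch of divergence-aware victim selection for one L1D set.
# A cache set is a list of (tag, warp_id) pairs; warp_priority maps a warp id
# to its scheduler rank, where a smaller rank means higher priority.

def pick_victim(cache_set, warp_priority):
    """Return the index of the block whose owner has the worst (largest)
    scheduling-priority rank, protecting highly prioritized warps."""
    return max(range(len(cache_set)),
               key=lambda i: warp_priority[cache_set[i][1]])

def access(cache_set, tag, warp_id, warp_priority, assoc=4):
    """Return True on a hit; on a miss, insert the block, evicting the
    lowest-priority warp's block if the set is full."""
    for i, (t, _) in enumerate(cache_set):
        if t == tag:
            cache_set[i] = (tag, warp_id)  # refresh ownership on a hit
            return True
    if len(cache_set) >= assoc:
        cache_set.pop(pick_victim(cache_set, warp_priority))
    cache_set.append((tag, warp_id))
    return False
```

Under this sketch, a miss from any warp never displaces a block belonging to a warp with a better scheduling rank than the current worst-ranked owner in the set, which is the protection behavior the text describes.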