Handling Coherence: Contemporary GPUs do not support a coherent memory hierarchy. However, if future GPUs do support coherence, bypassing can be accommodated easily.
Whether bypassing requires additional coherence support depends on the inclusion property of the GPU cache hierarchy. Inclusion ensures that cache blocks present in the higher-level caches are also present in the LLC, while non-inclusion or exclusion relaxes this constraint. Bypassing essentially turns an inclusive LLC into a non-inclusive one, because a bypassed block resides in a higher-level cache without a copy in the LLC. Therefore, the support needed to maintain coherence in a non-inclusive LLC can also be used to support bypassing in an inclusive LLC. Coherence in non-inclusive caches is maintained by mechanisms such as a snoop filter [22], which is essentially a replica of the higher-level cache tags kept at the LLC. Consequently, bypassing in a non-inclusive LLC requires no modifications for handling coherence, while an inclusive LLC requires support similar to a snoop filter.
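The role of the snoop filter under bypassing can be illustrated with a small toy model. The sketch below is only an illustration under simplifying assumptions (no capacity limits, no coherence states beyond presence, and a write that invalidates all other copies). The names `ToyLLC`, `fill`, `sharers`, and `write` are hypothetical and do not correspond to the design in [22] or to any real hardware interface.

```python
class ToyLLC:
    """Toy LLC with a snoop filter that replicates higher-level cache tags."""

    def __init__(self, num_l1):
        # Block addresses whose data is allocated in the LLC.
        self.llc_blocks = set()
        # Snoop filter: one tag set per higher-level cache (assumed L1s).
        self.snoop_filter = [set() for _ in range(num_l1)]

    def fill(self, l1_id, addr, bypass):
        """Bring addr into L1 number l1_id; skip LLC allocation if bypass."""
        if not bypass:
            self.llc_blocks.add(addr)
        # The tag is tracked even when the LLC itself is bypassed.
        self.snoop_filter[l1_id].add(addr)

    def evict_l1(self, l1_id, addr):
        """Remove the tag when the higher-level cache evicts the block."""
        self.snoop_filter[l1_id].discard(addr)

    def sharers(self, addr):
        """Return the ids of the L1s that currently hold addr."""
        return [i for i, tags in enumerate(self.snoop_filter) if addr in tags]

    def write(self, writer_id, addr):
        """Invalidate all other copies of addr; return the invalidated L1 ids."""
        victims = [i for i in self.sharers(addr) if i != writer_id]
        for i in victims:
            self.snoop_filter[i].discard(addr)
        return victims


llc = ToyLLC(num_l1=2)
llc.fill(0, 0x40, bypass=True)   # block lives in L1 0 only
llc.fill(1, 0x40, bypass=True)   # block lives in L1 1 only
invalidated = llc.write(1, 0x40) # L1 0 is found via the filter, not LLC data
```

In this model the bypassed block never appears in `llc_blocks`, yet the write from L1 1 still locates the copy in L1 0 through the replicated tags, which is the property that inclusion would otherwise have provided.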