3.1 Assigning tags to data
When the GPU memory functions as a cache of the CPU memory,
conflict misses can occur. To detect such misses, the GPU memory
in ScaleGPU must store a tag for each cached memory address.
ScaleGPU derives tags from the most significant bits of the virtual
address instead of those of the physical address.
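As a concrete sketch of this split, the C fragment below assumes a 4 GiB GPU memory, so the low 32 bits of a virtual address locate the data within GPU memory and the remaining upper bits form the tag stored with it; the names GPU_MEM_BITS, tag_of, and gpu_offset_of are illustrative, not part of ScaleGPU.

#include <stdint.h>

/* Illustrative only: assumes a 4 GiB GPU memory acting as a cache
 * of the CPU memory, so bits [63:32] of the virtual address serve
 * as the tag and bits [31:0] select a location in GPU memory.     */
#define GPU_MEM_BITS 32  /* log2 of the assumed 4 GiB GPU memory */

static inline uint64_t tag_of(uint64_t vaddr)
{
    return vaddr >> GPU_MEM_BITS;                 /* most significant bits = tag */
}

static inline uint64_t gpu_offset_of(uint64_t vaddr)
{
    return vaddr & ((1ULL << GPU_MEM_BITS) - 1);  /* location within GPU memory */
}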
ScaleGPU associates a single tag with each DRAM row and increases
the memory partition interleaving granularity to the DRAM row size
to exploit the high spatial locality inherent in GPU memory
accesses. The overhead of the coarser interleaving granularity is
offset by the increased row buffer hits and by overlapping data
transfers with computation. The size of the GPU memory determines
the number of bits used for the tag. The remaining bits of the
virtual address identify the physical location, such as the target
memory chip, bank, row, and column. ScaleGPU places tags and data in
separate banks so that tag comparisons and DRAM row accesses can
proceed in parallel.
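The sketch below illustrates this address decomposition under assumed parameters (2 KiB DRAM rows, 8 memory partitions, 16 banks per partition); the field widths and ordering are hypothetical, not ScaleGPU's actual layout. Because the interleaving granularity equals the DRAM row size, consecutive rows of the address space map to successive memory partitions, and a single tag check decides a hit for an entire row.

#include <stdint.h>
#include <stdbool.h>

/* Assumed, illustrative DRAM geometry; the real ScaleGPU
 * parameters and bit ordering may differ.                  */
#define ROW_BYTES      2048u   /* assumed DRAM row size              */
#define NUM_PARTITIONS 8u      /* assumed number of memory partitions */
#define NUM_BANKS      16u     /* assumed banks per partition         */

struct dram_loc {
    unsigned partition;   /* target memory chip / partition */
    unsigned bank;
    unsigned row;
    unsigned column;      /* byte offset within the row */
};

/* Decompose the GPU-memory offset (low bits of the virtual address)
 * into a DRAM location, interleaving across partitions at row
 * granularity.                                                      */
static struct dram_loc decompose(uint64_t gpu_offset)
{
    struct dram_loc loc;
    uint64_t row_index = gpu_offset / ROW_BYTES;             /* global row number  */
    loc.column    = (unsigned)(gpu_offset % ROW_BYTES);      /* within-row offset  */
    loc.partition = (unsigned)(row_index % NUM_PARTITIONS);  /* row-sized interleave */
    row_index    /= NUM_PARTITIONS;
    loc.bank      = (unsigned)(row_index % NUM_BANKS);
    loc.row       = (unsigned)(row_index / NUM_BANKS);
    return loc;
}

/* One tag per DRAM row: a hit requires the tag stored for that row
 * to match the tag bits of the requested virtual address.           */
static bool row_hit(uint64_t stored_tag, uint64_t vaddr_tag)
{
    return stored_tag == vaddr_tag;
}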