3 SCALEGPU ARCHITECTURE Figure 1 shows the architecture of ScaleGPU, where the grey components are newly introduced on top of the existing GPU architecture. ScaleGPU implements the GPU memory as a DRAM cache [13] of the CPU memory by storing tags and data in GPU DRAM chips and adding a hardware module to handle tag misses and message queues to buffer requests and replies. This section describes each hardware component of ScaleGPU in detail.