Together, these circumstances motivate the resurgence of processing in memory (PIM) [3], [4], which offloads certain computations to processing units near the memory. One way to implement PIM is to add fully functional cores atop DRAM dies using 3D stacking. However, integrating cores with DRAM incurs substantial overhead and requires design changes.