In this paper, we propose a novel buffered compare scheme, a processing-in-memory (PIM) technique that performs compare-n-op operations inside DRAM banks to speed up many workloads and amplify effective memory bandwidth.
In contrast to existing PIM techniques, buffered compare operations have deterministic latency, so they can be treated as simple extensions of ordinary DRAM commands. The DRAM therefore remains a ‘passive’ device (a device that does not invoke any event by itself). In addition, because it requires none of the caches or complicated pipelines of ordinary cores, the buffered compare approach adds minimal overhead to existing DRAM dies.