the capacity was only about 40% of commodity DRAM. As
mentioned earlier in the paper, Cooper-Balis et al. [1] and
Udipi et al. [2] proposed to leverage fine-grained access in the
commodity DRAM to reduce power. However, both proposals fail
to comply with n-bit prefetching, and the implementation
overhead to sustain full data bandwidth is significant [28].
Sub-rank level parallelism has been studied as another level
of fine-grained structure. Zheng et al. introduced a bridge chip,
MRB, to split the original rank into mini-ranks [39]. The mini-rank
design not only significantly reduces power consumption,
but also improves memory parallelism to compensate for
the potential performance degradation from narrowing
the data bus. Despite the power savings, the extra MRB
increases DRAM cost, and the design still suffers from bandwidth
loss. Leveraging the mini-rank structure, Yoon et al.
implemented a memory system that has adaptive access granularity
at the rank level [40]. The adaptive granularity memory
system, however, requires the co-design of a corresponding
fine-grained cache architecture, which significantly limits its
flexibility. To summarize, without reasonable optimizations
in the DRAM core, it is hard to achieve a good trade-off between
performance (bandwidth) and power. All of the above approaches
improve one aspect by sacrificing the other. In contrast
to the previous work, Half-DRAM takes a holistic
view of both performance and power and thus can
achieve a better compromise between the two.