The other two schemes modify the data layout or the internal DRAM chip architecture. In a sub-rank system, a rank is divided into smaller sub-ranks. A register/demux reinterprets the addresses and commands from the memory controller to generate DRAM accesses at a granularity that fits the sub-rank. A sub-rank can consist of one or more chips (e.g., x8, x16, and x32 sub-ranks when one, two, and four x8 DRAM chips are used, respectively). For example, with DDR3 SDRAM, an x8 sub-rank transfers a 64-byte cache line with 8 burst read/write commands (4 for an x16 sub-rank and 2 for an x32 sub-rank). Because a single sub-rank serves each access, the sub-ranks not involved in the data transfer can remain in a low-power state, which reduces power consumption. Mini-rank [28] and Multi-Core DIMM (MC-DIMM) [2] belong to the sub-rank organization. The Single Subarray Access (SBA) scheme [25] can also be categorized as a sub-rank organization, but it additionally modifies the DRAM chip architecture for further energy reduction. The sub-rank organization has several unavoidable drawbacks. Because the entire cache line is transferred to or from a single small sub-rank over a narrower data bus, delivering the data takes more cycles. For example, delivering a 64-byte cache line requires 32 cycles in an x8 sub-rank organization, whereas the conventional DDR3 SDRAM organization needs only 4. In addition, the sub-rank organization incurs a large overhead for error protection, since each sub-rank requires a dedicated DRAM chip for ECC storage. These trade-offs in the sub-rank memory system are studied in detail in [26].
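The burst and cycle counts quoted above follow from simple arithmetic, which the following minimal sketch reproduces. It assumes a DDR3 burst length of 8 beats, two beats per clock cycle (double data rate), and sub-ranks built from x8 chips with one extra ECC chip each. The function names `transfer_cost` and `ecc_chip_overhead` are hypothetical and used only for illustration. The ECC figure is a rough chip-count ratio under these assumptions, not a result taken from the cited studies.

```python
CACHE_LINE_BYTES = 64   # cache line size used in the text
BURST_LENGTH = 8        # assumed DDR3 burst length, in beats
BEATS_PER_CLOCK = 2     # double data rate: two beats per clock cycle
CHIP_WIDTH_BITS = 8     # assumed x8 DRAM chips


def transfer_cost(bus_width_bits, cache_line_bytes=CACHE_LINE_BYTES):
    """Return (bursts, clock_cycles) needed to move one cache line."""
    bytes_per_burst = (bus_width_bits // 8) * BURST_LENGTH
    bursts = -(-cache_line_bytes // bytes_per_burst)  # ceiling division
    cycles = bursts * BURST_LENGTH // BEATS_PER_CLOCK
    return bursts, cycles


def ecc_chip_overhead(bus_width_bits):
    """Chip-count overhead, assuming one dedicated ECC chip per (sub-)rank."""
    data_chips = bus_width_bits // CHIP_WIDTH_BITS
    return 1 / data_chips


if __name__ == "__main__":
    organizations = [
        ("x8 sub-rank", 8),
        ("x16 sub-rank", 16),
        ("x32 sub-rank", 32),
        ("conventional 64-bit rank", 64),
    ]
    for label, width in organizations:
        bursts, cycles = transfer_cost(width)
        overhead = ecc_chip_overhead(width)
        print(f"{label}: {bursts} bursts, {cycles} cycles, "
              f"ECC chip overhead {overhead:.1%}")
```

Under these assumptions, the x8 sub-rank needs 8 bursts and 32 cycles, against 1 burst and 4 cycles for the conventional 64-bit rank. One ECC chip per single-chip sub-rank doubles the chip count, while a conventional rank of eight x8 data chips adds one ECC chip for eight.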