With the emergence of big data applications [1], the focus of the computing paradigm is shifting from computation toward data. Combined with the growing number of cores per system, this shift has steadily increased memory bandwidth requirements. However, in contrast to the rapid growth in computing power and bandwidth demand, the actual bandwidth and energy efficiency of off-chip channels are not improving at the same pace, a gap known as the memory wall problem [2].