5. Evaluation Results
In this work, we used gem5 [31] as our simulation platform.
We integrated NVMain [32] into gem5 as the DRAM
model. Table 3 shows the simulation setup. The DRAM timing
and power parameters are excerpted from Micron’s data
sheet [11]. Based on the power analysis in Section 2.2, the
IDD0 of an 8Kb row (42mA) is used as the half-row activation
current. The FR-FCFS memory scheduling policy [33] is
deployed in the memory controller with separate read and write
queues. The selected SPEC2006 CPU benchmarks with reference
input size [34] and STREAM with all functions [29]
are evaluated as multi-programmed tests. Eight benchmarks
that have high MPKI (misses per kilo-instruction) are selected,
and each benchmark is either duplicated or mixed for the four-core
simulation. The four-core benchmarks are listed at the
bottom of Table 3, where each of them is given a test number.
We run all benchmarks for 500 million instructions to warm up the
caches and then collect statistics over the following 100 million
instructions. The weighted IPC (instructions per cycle) defined in
Equation 5 is used as the performance metric for the four-core
simulation. The two aforementioned Half-DRAM models,
Half-DRAM-1Row and Half-DRAM-2Row, are evaluated and
compared to the baseline. They represent the lower
and upper bounds of the performance improvement, respectively.
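
The two metrics used in this setup can be sketched in a few lines of Python. Equation 5 is not reproduced in this section, so the sketch assumes the commonly used weighted-speedup form, in which each core's IPC in the multi-programmed run is normalized to the IPC of the same benchmark running alone. The function names and all numeric values below are hypothetical and serve only as an illustration; they are not measured results.

```python
from typing import Sequence


def mpki(misses: int, instructions: int) -> float:
    """Misses per kilo-instruction for one benchmark."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return misses * 1000.0 / instructions


def weighted_ipc(ipc_shared: Sequence[float], ipc_alone: Sequence[float]) -> float:
    """Sum of per-core IPC ratios (shared run over stand-alone run).

    ASSUMPTION: this is the common weighted-speedup form; the paper's
    Equation 5 may differ in its normalization.
    """
    if len(ipc_shared) != len(ipc_alone):
        raise ValueError("both sequences must have one entry per core")
    if any(a <= 0 for a in ipc_alone):
        raise ValueError("stand-alone IPC values must be positive")
    return sum(s / a for s, a in zip(ipc_shared, ipc_alone))


if __name__ == "__main__":
    # Hypothetical numbers for a four-core mix, measured over the
    # 100-million-instruction statistics window.
    print(mpki(misses=2_000_000, instructions=100_000_000))
    print(weighted_ipc([0.5, 0.5, 0.5, 0.5], [1.0, 1.0, 1.0, 1.0]))
```

Under this assumed definition, a four-core run with no interference between cores would score 4.0, and any contention in the shared memory system lowers the value.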