5.1. Performance Analysis
The performance results of the four-core simulation are shown
in Figure 10a. To show the advantage of Half-DRAM, the
prior work with 1/2 bank activation is also evaluated and
denoted as FGA-1/2Bank. The burst length in FGA-1/2Bank is
set to 16 (8×2) to compensate for the bandwidth loss
discussed in Section 2.2. All results are normalized to the
baseline, which applies full bank activation and the
Relaxed-ClosePage policy. First, all tests suffer severe
performance degradation under FGA-1/2Bank, from 20% (test8) to
36% (test3). Since test3 has a high row buffer hit rate and
intensive memory accesses, the reduced bandwidth leads to much
more contention on the data bus, which explains the
performance drop.
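
The data-bus contention described above can be illustrated with a small calculation. This is only a sketch: it assumes a double-data-rate interface (two transfers per clock), and the function name data_bus_cycles is hypothetical, not taken from the evaluated simulator.

```python
def data_bus_cycles(burst_length: int, transfers_per_clock: int = 2) -> int:
    """Clock cycles the data bus is occupied by one burst.

    Assumption: a DDR interface, i.e. two data transfers per clock cycle.
    """
    if burst_length <= 0 or burst_length % transfers_per_clock != 0:
        raise ValueError("burst length must be a positive multiple of transfers_per_clock")
    return burst_length // transfers_per_clock


# Baseline: burst length 8 per access.
baseline_cycles = data_bus_cycles(8)
# FGA-1/2Bank: burst length 16 (8x2) per access.
fga_half_cycles = data_bus_cycles(16)

# Each access holds the data bus twice as long in FGA-1/2Bank.
occupancy_ratio = fga_half_cycles / baseline_cycles
print(baseline_cycles, fga_half_cycles, occupancy_ratio)
```

Under these assumptions each access in FGA-1/2Bank occupies the data bus for twice as many cycles, so workloads with intensive memory accesses, such as test3, are the most likely to queue behind one another on the bus.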
In contrast, thanks to the extremely low CHR ratio, no test
suffers a performance drop in Half-DRAM-1Row. Moreover,
test6, which has the highest CHR ratio, even shows a 3%
performance improvement. The reason for the improvement is that
test6 has a low row buffer hit rate, so Half-DRAM-1Row can
take advantage of the relaxed four-activation-window (tFAW)
constraint to outweigh the slight increase in the number of
activations. On average, Half-DRAM-1Row improves performance
by 1.3%. Although this improvement is small, Half-DRAM-1Row
does not induce performance degradation. In addition,
Half-DRAM-2Row shows a promising
performance improvement over the baseline by leveraging
sub-array level parallelism. In particular, test1 achieves as
much as a 19% performance improvement. The performance
gain comes from its relatively low data locality, which lets it
utilize the half-bank parallelism well. Again, the relaxation of
tFAW also contributes to the performance gain. The average
performance improvement in Half-DRAM-2Row is 10.7%.
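
The effect of a relaxed tFAW on activation-heavy workloads can be sketched with a toy scheduler. Everything in it is an assumption made for illustration: the function schedule_activations, the per-activation weight of 0.5 for a half-row activation, the window budget of 4, and the tFAW value of 20 cycles are hypothetical and are not taken from the evaluated configuration. Other timing constraints (for example tRRD) and the extra activations that Half-DRAM may issue are not modeled.

```python
def schedule_activations(arrivals, t_faw, act_weight=1.0, window_budget=4.0):
    """Return issue times for row activations under a tFAW-style constraint.

    Toy model (assumption): every activation costs `act_weight`, and the total
    cost of activations inside any window of `t_faw` cycles may not exceed
    `window_budget`. With a uniform weight this means activation i must wait
    until t_faw cycles after activation i - n, where n = budget / weight.
    """
    max_in_window = int(window_budget / act_weight)
    issued = []
    for i, arrival in enumerate(arrivals):
        issue = arrival
        if i >= max_in_window:
            # Must not start earlier than t_faw after the activation n slots back.
            issue = max(issue, issued[i - max_in_window] + t_faw)
        if issued:
            # Keep activations in order.
            issue = max(issue, issued[-1])
        issued.append(issue)
    return issued


arrivals = list(range(8))  # eight back-to-back activation requests
T_FAW = 20                 # hypothetical window length in cycles

full_row = schedule_activations(arrivals, T_FAW, act_weight=1.0)
half_row = schedule_activations(arrivals, T_FAW, act_weight=0.5)
print(full_row)  # the fifth activation stalls until the window reopens
print(half_row)  # no stall: eight half-weight activations fit in one window
```

In this toy model the full-row schedule stalls the fifth activation until cycle 20, while the half-weight schedule issues all eight requests as they arrive. This mirrors the qualitative argument in the text: workloads with low row buffer hit rates issue many activations and therefore benefit most when the activation window is relaxed.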