1.3.3 Limitations of Way-Prediction Schemes
Way-prediction designs have been proposed for fast L1 caches.
However, for several reasons the original way-prediction
idea cannot be applied directly to large L2 caches.
First, in way-prediction designs, the predicted way number must
be made available before the actual data address is generated. We
call this an out-cache feature of way-prediction designs.
Because large L2 caches are typically physically indexed, a
virtual-to-physical address translation must be performed before the
address can be presented to the way-prediction hardware. A way-prediction
mechanism sitting between the TLB and the L2 cache
would therefore add extra delay to the critical path. Second, L2 caches are
unified caches, where most of the references come from L1 data
cache misses. MRU-based prediction does not always work well
with data references [13, 14]. Third, the cache line size of the L2
cache is large. In Intel P4 processors, the L2 cache line size is
128 bytes. This means exchanging the locations of cache lines is
prohibitively expensive. Finally, way-prediction introduces non-uniform
cache access latency: a correctly predicted access completes sooner than
a mispredicted one. The processor must be redesigned to
take advantage of the non-uniform L2 cache latency.
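
The last point, an access latency that depends on whether the prediction
was correct, can be illustrated with a minimal sketch. It models one set of
a hypothetical 4-way cache with an MRU way predictor. The class name, the
1-cycle and 2-cycle latencies, and the victim-selection rule are all
assumptions made for illustration; none of them come from the designs
discussed above.

```python
FAST_PROBE_CYCLES = 1  # assumed latency when the predicted way holds the line
SLOW_PROBE_CYCLES = 2  # assumed latency when the other ways must also be probed


class MRUWayPredictedSet:
    """One set of an N-way cache with a most-recently-used way predictor.

    Illustrative model only: tags are arbitrary non-None values.
    """

    def __init__(self, num_ways):
        self.num_ways = num_ways
        self.tags = [None] * num_ways  # None marks an empty way
        self.mru_way = 0               # way the predictor will probe first

    def access(self, tag):
        """Return (hit, lookup_latency_in_cycles) for one reference."""
        # First probe: only the predicted (MRU) way is read.
        if self.tags[self.mru_way] == tag:
            return True, FAST_PROBE_CYCLES

        # Misprediction: the remaining ways are probed, costing extra time.
        for way, stored in enumerate(self.tags):
            if stored == tag:
                self.mru_way = way
                return True, SLOW_PROBE_CYCLES

        # Miss: fill the way after the MRU one (a simplistic, assumed
        # replacement rule) and predict it next time.
        victim = (self.mru_way + 1) % self.num_ways
        self.tags[victim] = tag
        self.mru_way = victim
        return False, SLOW_PROBE_CYCLES


cache_set = MRUWayPredictedSet(num_ways=4)
for tag in ["A", "A", "B", "A", "A"]:
    hit, cycles = cache_set.access(tag)
    print(tag, "hit" if hit else "miss", cycles)
```

In this model two hits to the same line "A" can take different numbers of
cycles, depending only on which line was touched in between. A processor
that schedules around a single fixed L2 hit latency cannot benefit from the
fast case without changes.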
