Niagara2’s primary (L1) and L2 caches are relatively small
compared to those of some other processors. Although this may cause
higher cache miss rates, the miss latency is well hidden by
other threads whose operands are available and
which can therefore use the “compute” time slots, thus
minimizing the waste of “compute” resources. This factor explains
why the optimum design point moved towards
higher thread counts and smaller caches. In effect, this can
be thought of as devoting more on-chip transistors to the intelligent
“processing” function as opposed to the nonintelligent
“data-storing” function.
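
The latency-hiding argument can be illustrated with a simple first-order model. The sketch below is not taken from the Niagara2 design; it assumes that each thread alternates between a fixed number of compute cycles and a fixed miss latency, and that the core switches between ready threads at no cost. The function names and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def pipeline_utilization(threads: int, compute_cycles: float, miss_latency: float) -> float:
    """Fraction of issue slots doing useful work in an idealized multithreaded core.

    Assumed model (not from the source): every thread computes for
    `compute_cycles`, then stalls for `miss_latency` on a cache miss,
    and switching to another ready thread costs nothing.
    """
    if threads < 1 or compute_cycles <= 0 or miss_latency < 0:
        raise ValueError("need threads >= 1, compute_cycles > 0, miss_latency >= 0")
    # Share of time a single thread can keep the pipeline busy on its own.
    single_thread = compute_cycles / (compute_cycles + miss_latency)
    # Additional threads fill the stall slots until the pipeline saturates.
    return min(1.0, threads * single_thread)


def threads_to_saturate(compute_cycles: float, miss_latency: float) -> int:
    """Smallest thread count that fully hides the miss latency in this model."""
    if compute_cycles <= 0 or miss_latency < 0:
        raise ValueError("need compute_cycles > 0, miss_latency >= 0")
    return math.ceil((compute_cycles + miss_latency) / compute_cycles)


if __name__ == "__main__":
    # Hypothetical numbers: a smaller cache shortens the compute run between misses.
    for compute in (20, 10):
        for n in (1, 4, 8):
            u = pipeline_utilization(n, compute, miss_latency=100)
            print(f"compute={compute} threads={n} utilization={u:.2f}")
```

In this model, shrinking the caches shortens the compute run between misses, which lowers single-thread utilization but can be offset by adding threads. That is the trade-off described above, subject to the model's simplifying assumptions (uniform miss behavior, no contention for memory bandwidth).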