In many systems, more than one level of the memory hierarchy is implemented as a cache, as shown in Figure 8. When this is done, it is most common for the first-level cache (the one closest to the processor) to be implemented as separate instruction and data caches, while the other levels are implemented as unified caches. This gives the processor the additional bandwidth provided by a Harvard architecture at the top level of the memory system, while simplifying the design of the lower levels. For a multilevel cache to significantly improve the average memory access time of a system, each level must have a significantly larger capacity than the level above it in the hierarchy, because the locality of reference seen by each level decreases as one goes deeper in the hierarchy. (Requests to recently referenced data are handled by the upper levels of the memory system, so requests that make it to the lower levels tend to be more widely distributed across the address space.) Caches with larger capacities tend to be slower, so the speed benefit of separate instruction and data caches is not as significant in the lower levels of the memory hierarchy, which is another argument in favor of using unified caches for these levels.
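The effect of adding cache levels on average memory access time can be sketched numerically. The following is a minimal illustration (not from the text): each level's miss penalty is the average access time of the level below it, so the average is computed from the lowest cache level upward. All latencies and miss rates here are assumed, round numbers chosen only for illustration.

```python
def amat(hit_times, miss_rates, memory_time):
    """Average memory access time for a multilevel cache.

    hit_times[i] and miss_rates[i] describe cache level i
    (index 0 is the level closest to the processor); a miss
    at the last cache level is serviced by main memory.
    """
    penalty = memory_time
    # Fold from the deepest level upward: each level contributes its
    # hit time plus, on a miss, the average time of the level below.
    for hit, miss in zip(reversed(hit_times), reversed(miss_rates)):
        penalty = hit + miss * penalty
    return penalty

# Assumed figures: 1-cycle L1, 10-cycle L2, 100-cycle main memory,
# with 5% of references missing in L1 and 20% of those missing in L2.
two_level = amat([1, 10], [0.05, 0.20], 100)   # about 2.5 cycles
one_level = amat([1], [0.05], 100)             # about 6 cycles
print(two_level, one_level)
```

With these assumed numbers, inserting the larger, slower L2 between the L1 and memory cuts the average access time substantially, even though the L2 is an order of magnitude slower than the L1; this is the benefit the paragraph above describes.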