Caches

The main goal of caches is to improve the performance of a computer by speeding up the memory system, in order to:

Give the user the illusion of using a memory that is simultaneously large and fast
Supply data to the processor at a high rate

To achieve this, two principles are exploited:

Temporal Locality: when a memory element is referenced, the same element tends to be referenced again soon (e.g., instructions and data reused in loop bodies)
Spatial Locality: when a memory element is referenced, other elements at nearby addresses tend to be referenced soon (e.g., sequences of instructions, or accesses to data organized as arrays or matrices)
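
To make the two principles concrete, the following is a minimal sketch that classifies the accesses of an address trace. The function name `locality_profile`, the 16-byte block size, and the two example traces are assumptions made for illustration. The model also remembers every earlier access forever, so it ignores the "soon" part of both definitions.

```python
def locality_profile(trace, block_size=16):
    """Count temporal reuse, spatial reuse, and first touches in a trace.

    Illustrative model only: history is unbounded, so "soon" is not modelled.
    """
    seen_addresses = set()
    seen_blocks = set()
    temporal = spatial = first_touch = 0
    for addr in trace:
        block = addr // block_size
        if addr in seen_addresses:
            temporal += 1      # the same address is referenced again
        elif block in seen_blocks:
            spatial += 1       # a neighbour of an earlier reference
        else:
            first_touch += 1   # first reference to this block
        seen_addresses.add(addr)
        seen_blocks.add(block)
    return temporal, spatial, first_touch


# Sequential walk over 16 array elements of 4 bytes each (assumed layout).
array_walk = [0x1000 + 4 * i for i in range(16)]
# A loop body that re-reads the same two variables 8 times.
loop_vars = [0x2000, 0x2004] * 8

print(locality_profile(array_walk))  # (0, 12, 4): mostly spatial reuse
print(locality_profile(loop_vars))   # (14, 1, 1): mostly temporal reuse
```

The array walk touches each address once but stays inside four blocks, while the loop keeps returning to the same two addresses.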

The resulting memory hierarchy is composed of several levels (main memory + levels of cache).

If the requested data is found in one of the cache blocks (upper level), the access is a cache hit. If the requested data is not found in any of the cache blocks, the access is a cache miss. In case of a data miss, we need to:

Stall the CPU
Request the block from the main memory
Copy (write) the block into the cache
Repeat the cache access (which is now a hit)
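
The four steps above can be sketched with a toy cache model. Everything specific here is an assumption for illustration and not taken from the text: the class name `TinyCache`, the direct-mapped placement (block number modulo number of lines), the sizes, and the cycle costs `hit_time` and `miss_penalty`. The stall is modelled simply by adding cycles.

```python
class TinyCache:
    """Toy read-only cache in front of a list acting as main memory."""

    def __init__(self, memory, num_lines=4, block_size=4,
                 hit_time=1, miss_penalty=10):
        self.memory = memory
        self.num_lines = num_lines
        self.block_size = block_size
        self.hit_time = hit_time          # assumed cost of a cache access
        self.miss_penalty = miss_penalty  # assumed cost of a memory fetch
        self.lines = {}                   # line index -> (block number, data)
        self.hits = 0
        self.misses = 0
        self.cycles = 0

    def read(self, addr):
        block_no = addr // self.block_size
        index = block_no % self.num_lines  # assumed direct-mapped placement
        offset = addr % self.block_size
        line = self.lines.get(index)
        if line is not None and line[0] == block_no:
            self.hits += 1
        else:
            self.misses += 1
            # 1. Stall the CPU for the duration of the fetch.
            self.cycles += self.miss_penalty
            # 2. Request the block from main memory.
            start = block_no * self.block_size
            block = self.memory[start:start + self.block_size]
            # 3. Copy the block into the cache.
            self.lines[index] = (block_no, block)
            # 4. Repeat the access, which now finds the block.
            line = self.lines[index]
        self.cycles += self.hit_time
        return line[1][offset]


memory = list(range(100, 164))  # 64 words holding the values 100..163
cache = TinyCache(memory)
cache.read(5)   # miss: block 1 is fetched and stored
cache.read(6)   # hit: same block as address 5
cache.read(21)  # miss: block 5 maps to the same line and replaces block 1
print(cache.hits, cache.misses, cache.cycles)  # 1 2 23
```

In this sketch the repeated access in step 4 is not counted as a separate hit: the original access is recorded once, as a miss.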

Performance metrics

Hit Rate: fraction of memory accesses that find the data in the upper level, out of the total number of memory accesses.

$$\textrm{Hit Rate} = \frac{\#\,\textrm{hits}}{\#\,\textrm{memory accesses}}$$

Hit Time: time to access the data in the upper level of the hierarchy, including the time needed to determine whether the access is a hit or a miss.

Miss Rate: fraction of memory accesses that do not find the data in the upper level, out of the total number of memory accesses.

$$\textrm{Miss Rate} = \frac{\#\,\textrm{misses}}{\#\,\textrm{memory accesses}}$$
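
As a small worked example of the two ratios, the sketch below computes them from raw counts. The function names and the counts (1000 accesses, 950 hits) are made up for illustration. Since every access is either a hit or a miss, the two rates add up to 1.

```python
def hit_rate(hits, accesses):
    """Fraction of memory accesses that hit in the upper level."""
    if accesses == 0:
        raise ValueError("no memory accesses recorded")
    return hits / accesses


def miss_rate(misses, accesses):
    """Fraction of memory accesses that miss in the upper level."""
    if accesses == 0:
        raise ValueError("no memory accesses recorded")
    return misses / accesses


accesses = 1000           # hypothetical total number of memory accesses
hits = 950                # hypothetical number of hits
misses = accesses - hits  # every access is either a hit or a miss

print(hit_rate(hits, accesses))     # 0.95
print(miss_rate(misses, accesses))  # 0.05
```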