
Computer Architecture and Organization

Lecture on Cache Memory

55:035 Computer Architecture and Organization

Introduction

Memory access time is important to performance! Users want large memories with fast access times: ideally, unlimited amounts of fast memory. To use an analogy, think of a bookshelf containing many books:

Suppose you are writing a paper on birds. You go to the bookshelf, pull out some of the books on birds and place them on the desk. As you start to look through them you realize that you need more references. So you go back to the bookshelf and get more books on birds and put them on the desk. Now as you begin to write your paper, you have many of the references you need on the desk in front of you.

This is an example of the principle of locality:


This principle states that programs access a relatively small portion of their address space at any instant of time.

Levels of the Memory Hierarchy


From closest to the CPU to farthest:

  Registers: part of the on-chip CPU datapath; the ISA exposes 16-128 registers.
  Cache, one or more levels (static RAM):
    Level 1: on-chip, 16-64 KB
    Level 2: on-chip, 256 KB-2 MB
    Level 3: on-chip or off-chip, 1-16 MB
  Main memory: dynamic RAM (DRAM), 256 MB-16 GB
  Magnetic disk: 80-300 GB; interfaces include SCSI, RAID, IDE, 1394
  Optical disk or magnetic tape

Farther away from the CPU: lower cost per bit, higher capacity, increased access time/latency, lower throughput/bandwidth.

Memory Hierarchy Comparisons


Level         Capacity     Access Time            Cost                       Staging/Transfer Unit
-----------   ----------   --------------------   ------------------------   ------------------------------------------
Registers     100s bytes   < 10 ns                -                          prog./compiler: 1-8 bytes (instr. operands)
Cache         K bytes      10-100 ns              1-0.1 cents/bit            cache controller: 8-128 bytes (blocks)
Main Memory   M bytes      200-500 ns             0.0001-0.00001 cents/bit   OS: 4K-16K bytes (pages)
Disk          G bytes      10 ms (10,000,000 ns)  10^-5 - 10^-6 cents/bit    user/operator: Mbytes (files)
Tape          infinite     sec-min                10^-8 cents/bit            -

Levels toward the top of the table are faster; levels toward the bottom are larger.

Memory Hierarchy

We can exploit the natural locality in programs by implementing the memory of a computer as a memory hierarchy.

Multiple levels of memory with different speeds and sizes are used: the fastest memories are more expensive and usually much smaller (see figure). This is accomplished through efficient methods of memory structure and organization.

The user has the illusion of a memory that is both large and fast.


Inventor of Cache

M. V. Wilkes, "Slave Memories and Dynamic Storage Allocation," IEEE Transactions on Electronic Computers, vol. EC-14, no. 2, pp. 270-271, April 1965.

Cache

Figure: the processor exchanges individual words with the cache (a small, fast memory); the cache exchanges blocks with main memory (large, inexpensive, and slow).

The processor does all memory operations through the cache.

Hit: if the requested word is in the cache, the read or write operation is performed directly in the cache, without accessing main memory.

Miss: if the requested word is not in the cache, a block of words containing the requested word is brought into the cache, and then the processor's request is completed.

Block: the minimum amount of data transferred between cache and main memory.
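The hit/miss protocol above can be sketched as a tiny simulator. This is a minimal sketch, not a hardware description: the block size of 4 words, the cache size of 8 blocks, and the address stream are arbitrary illustrative choices.

```python
# Minimal sketch of the hit/miss protocol for a direct-mapped cache.
# Block size (4 words) and number of cache blocks (8) are illustrative.
BLOCK_SIZE = 4
NUM_BLOCKS = 8

cache = {}  # index -> tag of the block currently stored there

def access(word_addr):
    """Return 'hit' or 'miss' for one word access, updating the cache."""
    block_addr = word_addr // BLOCK_SIZE
    index = block_addr % NUM_BLOCKS     # placement: low-order bits of block address
    tag = block_addr // NUM_BLOCKS
    if cache.get(index) == tag:
        return "hit"
    cache[index] = tag                  # miss: fetch the whole block into the cache
    return "miss"

# Sequential accesses: the first word of each 4-word block misses, the rest hit.
results = [access(a) for a in range(8)]
print(results)
```

Note how bringing in a whole block on a miss turns the subsequent accesses to neighboring words into hits.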


The Locality Principle


A program tends to access data that form a physical cluster in memory, so multiple accesses may be made within the same block.

Physical localities are temporal and may shift over longer periods of time: data not used for some time is less likely to be used in the future.

Upon a miss, the least recently used (LRU) block can be overwritten by a new block.

P. J. Denning, "The Locality Principle," Communications of the ACM, vol. 48, no. 7, pp. 19-24, July 2005.
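The LRU replacement idea can be sketched with an ordered map that tracks recency of use. This is a software sketch of the policy only; the capacity of 2 blocks and the access trace are arbitrary illustrative choices.

```python
from collections import OrderedDict

# Minimal sketch of LRU block replacement; a 2-block capacity is illustrative.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block address -> data; order tracks recency

    def access(self, block_addr):
        """Touch a block; on a miss, evict the least recently used block."""
        if block_addr in self.blocks:
            self.blocks.move_to_end(block_addr)  # mark as most recently used
            return "hit"
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)      # evict the LRU block
        self.blocks[block_addr] = None
        return "miss"

c = LRUCache(capacity=2)
# Block 2 is evicted when 3 arrives (1 was touched more recently), so the
# final access to 2 misses again.
trace = [c.access(b) for b in (1, 2, 1, 3, 2)]
print(trace)
```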


Temporal & Spatial Locality

There are two types of locality:


TEMPORAL LOCALITY (locality in time): if an item is referenced, it will likely be referenced again soon; data is reused.

SPATIAL LOCALITY (locality in space): if an item is referenced, items at neighboring addresses will likely be referenced soon.

Most programs contain natural locality in structure. For example, most programs contain loops in which the instructions and data need to be accessed repeatedly. This is an example of temporal locality. Instructions are usually accessed sequentially, so they contain a high amount of spatial locality. Also, data access to elements in an array is another example of spatial locality.
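Both kinds of locality can be seen in a simple array-summing loop. This is a sketch: the array size of 16 words and the block size of 4 words are illustrative assumptions.

```python
# A simple loop over an array: the loop index and accumulator are reused on
# every iteration (temporal locality), while array elements are read at
# consecutive addresses (spatial locality). A 4-word block is illustrative.
BLOCK_SIZE = 4
data = list(range(16))

blocks_touched = []
total = 0
for i in range(len(data)):        # i and total: temporal locality
    total += data[i]              # data[i]: spatial locality
    blocks_touched.append(i // BLOCK_SIZE)

# 16 sequential word accesses fall into only 4 distinct blocks.
print(total, len(set(blocks_touched)))
```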

Data Locality, Cache, Blocks


Figure: blocks of memory (Block 1, Block 2) holding the data needed by a program are mapped into the cache. Increase the block size to match the size of a locality; increase the cache size to include most blocks.


Basic Caching Concepts

The memory system is organized as a hierarchy in which each level closer to the processor holds a subset of any level further away, and all of the data is stored at the lowest level (see figure). Data is copied between only two adjacent levels at any given time. The minimum unit of information contained in a two-level hierarchy is called a block or line (see the highlighted square in the figure). If data requested by the processor appears in some block in the upper level, the access is a hit; if it is not found in the upper level, the access is a miss.

Basic Cache Organization


Figure: the full byte address is divided into tag, index (Idx), and offset (Off) fields; the tag and index together form the block address. The index is decoded to select a row of the tag and data arrays; the stored tag is compared with the address tag to determine a hit; on a hit, a multiplexer selects the requested data word using the offset.
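The tag/index/offset split described above can be illustrated with bit arithmetic. This is a sketch: the 64-byte blocks (6 offset bits), 256 sets (8 index bits), and the example address are illustrative assumptions, not parameters of any particular machine.

```python
# Sketch: splitting a byte address into tag, index, and offset fields.
# 64-byte blocks (6 offset bits) and 256 sets (8 index bits) are illustrative.
OFFSET_BITS = 6
INDEX_BITS = 8

def split_address(addr):
    offset = addr & ((1 << OFFSET_BITS) - 1)                 # low 6 bits
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)  # next 8 bits
    tag = addr >> (OFFSET_BITS + INDEX_BITS)                 # remaining bits
    return tag, index, offset

tag, index, offset = split_address(0x12345678)
print(hex(tag), hex(index), hex(offset))  # 0x48d1 0x59 0x38
```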



Direct-Mapped Cache
Figure: a memory block containing data needed by the program (Block 1 or Block 2) is swapped into the one cache location it maps to, replacing whatever block currently occupies that location.

Set-Associative Cache
Figure: a memory block containing data needed by the program (Block 1 or Block 2) is swapped into any one of the cache locations in its set; the least recently used (LRU) block in the set is replaced.

Three Major Placement Schemes

Direct-mapped: each block can go in exactly one cache location. Set-associative: each block can go in any of a small set of locations. Fully associative: a block can go anywhere in the cache.

Direct-Mapped Placement

A block can go into only one place in the cache,

determined by the block's address (in the memory space).

The index for block placement is usually given by some low-order bits of the block's address.

This can also be expressed as:


(Index) = (Block address) mod (Number of blocks in cache)

Note that in a direct-mapped cache,

block placement and replacement choices are both completely determined by the address of the new block that is to be accessed.
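The placement formula above can be checked directly. This is a sketch: the 8-block cache is an illustrative size, chosen as a power of two so the modulo is exactly the low-order bits of the block address.

```python
# Sketch of direct-mapped placement: index = (block address) mod (number of
# blocks in cache). An 8-block cache is illustrative; for a power-of-two
# size, the modulo equals the low-order bits of the block address.
NUM_BLOCKS = 8  # must be a power of two for the bit-mask equivalence below

def index_of(block_addr):
    return block_addr % NUM_BLOCKS

# Blocks 3, 11, and 19 all map to index 3: they compete for one cache slot.
print([index_of(b) for b in (3, 11, 19)])
assert index_of(11) == 11 & (NUM_BLOCKS - 1)  # modulo == low-order bits
```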

