
Week 10: Chapter 4 Instruction Set Architectures

Learn the components common to every modern computer system.

Be able to explain how each component contributes to program execution.

Understand an ISA (Instruction Set Architecture) and how it relates to a computer architecture.


CPU Basics
The computer's CPU fetches, decodes, and executes program instructions.
The two principal parts of the CPU are the datapath and the control unit.
Datapath: consists of an arithmetic-logic unit (ALU) and storage units (registers) that are interconnected by a data bus that is also connected to main memory.
Control unit: sequences the operations performed by the CPU and makes sure the correct data is where it needs to be at the correct time. It determines which actions to carry out according to the values in a program counter register and a status register.

The Bus
Parallel bus: the CPU shares data with other system components by way of a data bus.
A bus is a set of wires that simultaneously convey a single bit along each line.
Two types of buses are commonly found in computer systems: point-to-point buses and multipoint buses.

The Bus
Buses have data lines, control lines, and address lines.
Data lines transmit bits from one device to another (they hold the information to be transferred).
Control lines determine the direction of data flow (read/write) and which device has permission to access the bus. They also carry interrupts and clock synchronization signals.
Address lines determine the location of the source or destination of the data.

The Bus
Typical bus transactions include:
  Sending an address (for a read or write).
  Transferring data from memory to a register (a memory read).
  Transferring data to memory from a register (a memory write).
  Reads and writes from peripheral devices.
Each type of transfer occurs within a bus cycle, the time between two ticks of the bus clock.
Because buses transport different types of information and serve a variety of devices, they are divided into different types: processor-memory buses, I/O buses, and backplane buses (where all devices share one bus).

Types of Bus Devices

Bus master(s): Devices that can issue commands that initiate the transfer of data across the bus. For example, the CPU is a bus master and can use the control bus to issue a command to main memory to read the value of a variable. Allowing multiple bus masters can improve system performance.
Slave devices: Devices that can only respond to commands from other devices. For example, main memory is a slave device. It can only respond to commands from other devices asking it to read or write particular memory locations.

The Bus Arbitration

In a master-slave configuration where more than one device can be the bus master, concurrent bus master requests must be arbitrated. Arbitration assigns priorities and makes sure lower-priority devices are not left out (never allowed to use the bus).
The four categories of bus arbitration are:
  Daisy chain (not fair): Permissions are passed from the highest-priority device to the lowest.
  Centralized parallel (employing a central arbiter): Each device is directly connected to the central arbiter.
  Distributed using self-detection: Devices decide among themselves which one gets the bus.
  Distributed using collision-detection: Any device can try to use the bus. If its data collides with the data of another device, it tries again.
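As a rough illustration of the daisy-chain scheme, the sketch below passes a grant from the highest-priority device down the chain until it reaches a device that is requesting the bus. The function name and the list-of-booleans representation are assumptions made for this example only.

```python
def daisy_chain_grant(requests):
    """Return the index of the device that wins the bus, or None.

    `requests` is ordered from highest priority (index 0) to lowest.
    The grant travels down the chain and stops at the first device
    that is requesting the bus.
    """
    for position, wants_bus in enumerate(requests):
        if wants_bus:
            return position  # the grant is absorbed here
    return None  # no device asked for the bus


# Devices 1 and 3 both request; device 1 sits earlier in the chain and wins.
winner = daisy_chain_grant([False, True, False, True])
print(winner)  # 1
```

Device 3 only wins when no device ahead of it is requesting, which is why the scheme is described as not fair.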

Bus Cycles
Bus cycles control the flow of information (data/commands/addresses) across the bus. One piece of information moves across the bus during each cycle.
Types of bus cycles:
  Synchronous: The length of the bus cycle is controlled by a clock, and thus every bus cycle is the same length.
  Asynchronous: The length of a bus cycle is determined dynamically by the devices that are communicating. Control lines coordinate the operations using a complex protocol to enforce timing (Request / Ready / Acknowledge).

Every computer contains at least one clock that synchronizes the activities of its components.
A fixed number of clock cycles are required to execute each instruction.
The clock frequency, measured in megahertz or gigahertz, determines the speed with which all operations are carried out.
Clock cycle time is the reciprocal of clock frequency.
  A 2 GHz clock has a cycle time of 0.5 nanoseconds.
  An 8 MHz clock has a cycle time of 0.125 microseconds (125 nanoseconds).
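The reciprocal relationship can be checked with a few lines of Python. This is only a sketch; the function name is chosen for illustration.

```python
def cycle_time_ns(frequency_hz):
    """Clock cycle time in nanoseconds: the reciprocal of the frequency."""
    return 1e9 / frequency_hz


print(cycle_time_ns(2e9))  # 0.5   (2 GHz)
print(cycle_time_ns(8e6))  # 125.0 (8 MHz, i.e. 0.125 microseconds)
```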


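Clock speed should not be confused with CPU performance. The CPU time required to run a program is given by the general performance equation:

CPU time = seconds / program = (instructions / program) x (average cycles / instruction) x (seconds / cycle)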

We can improve CPU performance when we reduce the number of instructions in a program, reduce the number of cycles per instruction, or reduce the number of nanoseconds per clock cycle.
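A minimal sketch of the three factors named above, assuming made-up workload numbers (the instruction count and cycles per instruction below are hypothetical, chosen only to make the arithmetic easy to follow):

```python
def cpu_time_seconds(instruction_count, avg_cycles_per_instruction, clock_hz):
    """CPU time = instructions x cycles per instruction x seconds per cycle."""
    seconds_per_cycle = 1.0 / clock_hz
    return instruction_count * avg_cycles_per_instruction * seconds_per_cycle


# Hypothetical program: one billion instructions, 2 cycles each, on a 2 GHz clock.
baseline = cpu_time_seconds(1_000_000_000, 2, 2e9)
# Halving the cycles per instruction halves the CPU time at the same clock speed.
improved = cpu_time_seconds(1_000_000_000, 1, 2e9)
print(baseline, improved)
```

The second call shows why clock speed alone does not determine performance: the clock is unchanged, yet the program finishes in half the time.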


Role of Clock in Processors
Single-cycle machine: does everything in one clock cycle.
Instruction execution takes up to 5 steps, and the 5th step must complete before the cycle ends.
[Figure: clock signal with rising and falling clock edges; instruction execution steps 1 to 5 within one cycle; datapath stable; register(s) written.]


The Input / Output Subsystem

A computer communicates with the outside world through its input/output (I/O) subsystem.
I/O devices connect to the CPU through various interfaces:
  I/O can be memory-mapped, where the I/O device behaves like main memory from the CPU's point of view.
  Or I/O can be instruction-based, where the CPU has a specialized I/O instruction set (involving interrupts).


Memory Organization
Computer memory is a linear array of addressable storage cells that are similar to registers.
Memory can be byte-addressable (most common) or word-addressable, where a word consists of two or more bytes.
Memory is constructed of RAM chips and is often referred to using the notation L x W (length x width).
For example: 4M x 16 means the memory is 4M long (it has 4M = 2^2 x 2^20 = 2^22 words) and 16 bits wide (each word is 16 bits).


Memory Organization
Example: How many bits would you need to address a 2M x 32 memory if (a) the memory is byte-addressable, (b) the memory is word-addressable?
Solution:
(a) There are 2M x 4 bytes, which equals 2 x 2^20 x 2^2 = 2^23 total bytes, so 23 bits are needed for an address.
(b) There are 2M words, which equals 2 x 2^20 = 2^21, so 21 bits are required for an address.
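The same calculation can be sketched in Python. The function name and parameters are assumptions for illustration; the sketch assumes the word width is a whole number of bytes.

```python
def address_bits(num_words, word_bits, byte_addressable):
    """Number of address bits needed for an L x W memory."""
    if byte_addressable:
        units = num_words * (word_bits // 8)  # every byte gets its own address
    else:
        units = num_words                     # every word gets its own address
    # Addresses run from 0 to units - 1; count the bits of the largest one.
    return (units - 1).bit_length()


M = 2 ** 20
print(address_bits(2 * M, 32, byte_addressable=True))   # 23
print(address_bits(2 * M, 32, byte_addressable=False))  # 21
```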


Memory Organization
How does the computer access the memory location that corresponds to a particular address?

We see that 4 megabytes = 2^22 bytes. The memory locations for this memory are numbered 0 through 2^22 - 1.

Thus, the memory bus of this system requires at least 22 address lines. The address lines count from 0 to 2^22 - 1 in binary. Each line is either on or off, indicating the location of the desired memory element.

Powers of two:

Power   2 ^ Power
0       1
1       2
2       4
3       8
4       16
5       32
6       64
7       128
8       256
9       512
10      1,024
11      2,048
12      4,096
13      8,192
14      16,384
15      32,768
16      65,536
17      131,072
18      262,144
19      524,288
20      1,048,576
21      2,097,152




Memory Organization
Physical memory usually consists of more than one RAM chip. Consequently, these chips are combined into a single memory module to give the desired memory size.
For example, suppose you need to build a 32K x 16 memory and all you have are 2K x 8 RAM chips. You could connect 16 rows and 2 columns of chips together.
Addresses for the total memory must have 15 bits (there are 32K = 2^5 x 2^10 words to access).
But each chip pair (each row) requires only 11 address lines (each chip pair holds only 2K = 2^1 x 2^10 words).
Address format: 4 bits for the chip (row), 11 bits for the word within that row.
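The split of a 15-bit address into a chip-row field and a word field can be sketched as follows. This assumes the 4 chip-select bits are the high-order bits of the address, which matches the field order given on the slide; the constant and function names are illustrative.

```python
OFFSET_BITS = 11  # each row of 2K x 8 chips holds 2K = 2^11 words
ROW_BITS = 4      # 16 rows = 2^4


def split_address(address):
    """Split a 15-bit address into (chip row, word offset within the row)."""
    if not 0 <= address < 2 ** (ROW_BITS + OFFSET_BITS):
        raise ValueError("address does not fit in 15 bits")
    row = address >> OFFSET_BITS                  # high-order 4 bits
    offset = address & ((1 << OFFSET_BITS) - 1)   # low-order 11 bits
    return row, offset


print(split_address(0))      # (0, 0)
print(split_address(2048))   # (1, 0)
print(split_address(32767))  # (15, 2047)
```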

Memory Interleaving
Access is more efficient when memory is organized into banks of chips with the addresses interleaved across the chips. (Memory interleaving is a way to distribute individual addresses over memory modules.)
High-order interleaving, the more intuitive organization, distributes the addresses so that each module contains consecutive addresses.
Low-order interleaving places consecutive words of memory in different memory modules.

[Figure: Low-Order Interleaving (byte addresses)]

Important tips on Memory Organization

With low-order interleaving, the low-order k bits of the memory address select a module, and the high-order m bits name a location within that module.
A request to access consecutive memory locations can keep several modules busy at the same time. That is, with the appropriate buses, a read or write using one module can be started before a read or write using another module actually completes.
Recall: (1) Memory addresses are unsigned binary values, and (2) the number of items to be addressed determines the number of bits that occur in the address.
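To make the two interleaving schemes concrete, here is a small sketch that maps an address to a (module, location) pair under each scheme. The module count and module size are arbitrary values chosen for illustration.

```python
def low_order(address, num_modules):
    """Low-order bits pick the module; the remaining bits pick the location."""
    return address % num_modules, address // num_modules


def high_order(address, words_per_module):
    """High-order bits pick the module; the remaining bits pick the location."""
    return address // words_per_module, address % words_per_module


# Hypothetical memory: 4 modules of 8 words each.
low_modules = [low_order(a, 4)[0] for a in range(6)]
high_modules = [high_order(a, 8)[0] for a in range(6)]
print(low_modules)   # [0, 1, 2, 3, 0, 1]
print(high_modules)  # [0, 0, 0, 0, 0, 0]
```

Consecutive addresses rotate through the modules under low-order interleaving, so several modules can work at once; under high-order interleaving they all land in the same module.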

This concept deals with how computer components interact with the processor.
Interrupts are events that alter (or interrupt) the normal flow of execution in the system.
Reasons for triggering an interrupt:
  I/O requests
  Arithmetic errors (e.g., division by zero)
  Arithmetic underflow or overflow
  Hardware malfunction
  User-defined break points (such as when debugging a program)
  Invalid instructions (usually resulting from pointer issues)

Types of Interrupts
An interrupt can be:
  Maskable (can be disabled or ignored) or nonmaskable (a high-priority interrupt that cannot be disabled and must be acknowledged).
  Raised within or between instructions.
  Synchronous (occurs at the same place every time a program is executed) or asynchronous (occurs unexpectedly).
  Followed by the program either terminating or continuing execution once the interrupt is handled.
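One common way to picture masking is with bit masks, one bit per interrupt line. The sketch below is an assumption-laden illustration, not a description of any particular processor: the function name, the bit layout, and the idea of a separate nonmaskable bit set are all hypothetical.

```python
def interrupts_to_service(pending, enabled_mask, nonmaskable):
    """Return the bit mask of interrupts the CPU must respond to.

    All three arguments are bit masks with one bit per interrupt line.
    A maskable interrupt is serviced only if its enable bit is set;
    a nonmaskable interrupt is serviced regardless of the enable mask.
    """
    return (pending & enabled_mask) | (pending & nonmaskable)


# Lines 0, 1 and 3 are pending; only line 1 is enabled; line 3 is nonmaskable.
result = interrupts_to_service(0b1011, 0b0010, 0b1000)
print(bin(result))  # 0b1010
```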

Functional Architecture
MARIE: Machine Architecture that is Really Intuitive and Easy
MARIE has the following characteristics:
  Binary coded addresses
  Fixed word length
  Word (but not byte) addressable
  4K words of main memory (this implies 12 bits per address)
  16-bit data (words have 16 bits)
  16-bit instructions, 4 for the opcode and 12 for the address
  A 16-bit accumulator (AC)
  A 16-bit instruction register (IR)
  A 16-bit memory buffer register (MBR)
  A 12-bit program counter (PC)
  A 12-bit memory address register (MAR)
  An 8-bit input register
  An 8-bit output register

Registers and Buses

REGISTERS
  AC: The accumulator, which holds data values. This is a general-purpose register and holds data that the CPU needs to process. Most computers today have multiple general-purpose registers.
  MAR: The memory address register, which holds the memory address of the data being referenced.
  MBR: The memory buffer register, which holds either the data just read from memory or the data ready to be written to memory.
  PC: The program counter, which holds the address of the next instruction to be executed in the program.
  IR: The instruction register, which holds the next instruction to be executed.
  InREG: The input register, which holds data from the input device.
  OutREG: The output register, which holds data for the output device.

BUSES
MARIE assumes a common bus scheme. Each device connected to the bus has a number, and before the device can use the bus, the bus must be set to that identifying number.
We also have some additional pathways to speed up execution: between the MBR and the ALU, between the AC and the ALU, and between the AC and the MBR.

The Instruction Set Architecture

The instruction set architecture (ISA) of a machine specifies the instructions that the computer can perform and the format for each instruction. The ISA is essentially an interface between the software and the hardware.

MARIE instruction format (16 bits):
  Opcode: 4 bits (bits 15 to 12) specify the instruction to be executed.
  Address: 12 bits (bits 11 to 0) for accessing the 4K words of memory.

[Table: MARIE instruction set, in machine language and assembly language]
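The 4-bit opcode and 12-bit address fields can be pulled out of a 16-bit instruction word with a shift and a mask. This is a minimal sketch; the function name and the sample instruction word are chosen only for illustration.

```python
def decode(instruction):
    """Split a 16-bit MARIE instruction into (opcode, address)."""
    opcode = (instruction >> 12) & 0xF  # bits 15 to 12
    address = instruction & 0xFFF       # bits 11 to 0
    return opcode, address


opcode, address = decode(0x1104)
print(opcode, hex(address))  # 1 0x104
```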


Register Transfer Notation

Each instruction appears to be very simple; however, if you examine what actually happens at the component level, each instruction involves multiple operations: several mini-instructions are being executed. These mini-instructions are called micro-operations and specify the elementary operations that can be performed on data stored in registers. The symbolic notation used to describe the behavior of micro-operations is called register transfer notation (RTN) or register transfer language (RTL).

Register Transfer Notation

Load X
Recall that this instruction loads the contents of memory location X into the AC.
  MAR ← X
  MBR ← M[MAR]
  AC ← MBR

Store X
This instruction stores the contents of the AC in memory location X:
  MAR ← X, MBR ← AC
  M[MAR] ← MBR

Add X
The data value stored at address X is added to the AC.
  MAR ← X
  MBR ← M[MAR]
  AC ← AC + MBR

Jump X
This instruction causes an unconditional branch to the given address, X. Therefore, to execute this instruction, X must be loaded into the PC.
  PC ← X

NOTE
Register transfer notation is a symbolic means of expressing what is happening in the system when a given instruction is executing. RTN is sensitive to the data path, in that if multiple micro-operations must share the bus, they must be executed in a sequential fashion, one following the other.
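The micro-operations for Load, Store, Add, and Jump can be mimicked in Python, one assignment per register transfer. This is a sketch under stated assumptions: the `Registers` class and function names are illustrative, memory is modeled as a plain list of 4K words, and the 16-bit wraparound on Add is a simplification of how the AC would behave.

```python
class Registers:
    """The MARIE registers touched by these four instructions."""

    def __init__(self):
        self.AC = 0
        self.MAR = 0
        self.MBR = 0
        self.PC = 0


def load(r, memory, x):
    r.MAR = x                       # MAR <- X
    r.MBR = memory[r.MAR]           # MBR <- M[MAR]
    r.AC = r.MBR                    # AC <- MBR


def store(r, memory, x):
    r.MAR = x                       # MAR <- X
    r.MBR = r.AC                    # MBR <- AC
    memory[r.MAR] = r.MBR           # M[MAR] <- MBR


def add(r, memory, x):
    r.MAR = x                       # MAR <- X
    r.MBR = memory[r.MAR]           # MBR <- M[MAR]
    r.AC = (r.AC + r.MBR) & 0xFFFF  # AC <- AC + MBR (kept to 16 bits)


def jump(r, x):
    r.PC = x                        # PC <- X


memory = [0] * 4096
memory[0x100], memory[0x101] = 35, 7
regs = Registers()
load(regs, memory, 0x100)
add(regs, memory, 0x101)
store(regs, memory, 0x102)
jump(regs, 0x010)
print(memory[0x102], hex(regs.PC))  # 42 0x10
```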

Fetch-Decode-Execute Cycle

The fetch-decode-execute cycle represents the steps that a computer follows to run a program. The CPU:
  Fetches an instruction (transfers it from main memory to the instruction register).
  Decodes it (determines the opcode and fetches any data necessary to carry out the instruction).
  Gets the operand.
  Executes the instruction (performs the operation(s) indicated by the instruction).

Fetch-Decode-Execute Cycle

  MAR ← PC
  IR ← M[MAR], PC ← PC + 1
  MAR ← IR[11-0], decode IR[15-12]
  MBR ← M[MAR]
  Execute the instruction
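The cycle can be sketched as a small loop. This is a simplified illustration, not a full MARIE simulator: it handles only five instructions, and the opcode numbers (Load = 1, Store = 2, Add = 3, Halt = 7, Jump = 9) are taken from the commonly published MARIE instruction table, which is not reproduced in these slides, so treat them as assumptions.

```python
LOAD, STORE, ADD, HALT, JUMP = 0x1, 0x2, 0x3, 0x7, 0x9  # assumed opcode values


def run(memory, start=0, max_steps=1000):
    """Run a tiny MARIE-like program and return the final AC value."""
    pc, ac = start, 0
    for _ in range(max_steps):
        mar = pc                   # MAR <- PC
        ir = memory[mar]           # IR <- M[MAR]
        pc = (pc + 1) & 0xFFF      # PC <- PC + 1
        opcode = (ir >> 12) & 0xF  # decode IR[15-12]
        mar = ir & 0xFFF           # MAR <- IR[11-0]
        if opcode == HALT:
            return ac
        if opcode == JUMP:
            pc = mar
        elif opcode == LOAD:
            ac = memory[mar]       # MBR <- M[MAR], then AC <- MBR
        elif opcode == ADD:
            ac = (ac + memory[mar]) & 0xFFFF
        elif opcode == STORE:
            memory[mar] = ac
        else:
            raise ValueError(f"opcode {opcode:#x} is not handled in this sketch")
    raise RuntimeError("step limit reached without Halt")


memory = [0] * 4096
memory[0:4] = [0x1004, 0x3005, 0x2006, 0x7000]  # Load 004, Add 005, Store 006, Halt
memory[4], memory[5] = 35, 7
final_ac = run(memory)
print(final_ac, memory[6])  # 42 42
```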


Interrupts and I/O

MARIE has two registers to accommodate input and output. The input register holds data being transferred from an input device into the computer; the output register holds information ready to be sent to an output device. The timing used by these two registers is very important. MARIE addresses this issue by using interrupt-driven I/O. When the CPU executes an input or output instruction, the appropriate I/O device is notified. This process requires the following:
1- A signal (interrupt) from the I/O device to the CPU indicating that input or output is complete, by using a special register, the status or flag register.
2- Some means of allowing the CPU to turn from the usual fetch-decode-execute cycle to recognize this interrupt.
The CPU then processes the interrupt, after which it continues with the normal fetch-decode-execute cycle (the original program).
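The interplay between the normal cycle and interrupt processing can be sketched as a loop that checks for a pending interrupt between instructions. Everything here is a simplification for illustration: instructions are plain strings, and `interrupt_after` is a hypothetical stand-in for the status (flag) register being set by a device.

```python
def run_with_interrupts(program, interrupt_after):
    """Trace execution, servicing an interrupt between instructions.

    `program` is a list of instruction names. `interrupt_after` is the set
    of step indices after which a device has signaled an interrupt.
    """
    trace = []
    for step, instruction in enumerate(program):
        trace.append(f"execute {instruction}")  # normal fetch-decode-execute
        if step in interrupt_after:             # flag checked between instructions
            trace.append("process interrupt")   # then the original program resumes
    return trace


trace = run_with_interrupts(["Load X", "Add Y", "Store Z"], interrupt_after={1})
for line in trace:
    print(line)
```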


Process the Interrupt







The control unit causes the CPU to execute a sequence of steps correctly. In reality, there must be control signals to assert lines on various digital components to make things happen. For example, when we perform an Add instruction in MARIE in assembly language, we assume the addition takes place because the control signals for the ALU are set to add and the result is put into the AC. The ALU has various control lines that determine which operation to perform. The question we need to answer is: how do these control lines actually become asserted?

Hardwired Control
The first approach is to physically connect all of the control lines to the actual machine instructions. The instructions are divided up into fields, and different bits in the instruction are combined through various digital logic components to drive control lines. This is called hardwired control.
The control unit is implemented using hardware (with simple NAND gates, flip-flops, and counters, for example). We need a special digital circuit that uses, as inputs, the bits from the opcode field in our instructions, bits from the flag (or status) register, signals from the bus, and signals from the clock. It should produce, as outputs, the control signals to drive the various components in the computer.

Example instruction fields (decimal value, binary encoding):
  0     000000
  9     01001
  10    01010
  8     01000
  0     00000
  32    100000
The hardware reads each of those fields.

[Figure labels: Out, MUX, From ALU]
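Hardwired control can be pictured as fixed combinational logic from opcode bits to control lines. The sketch below models that logic with Python's bitwise operators standing in for AND, OR, and NOT gates. The control signal names are hypothetical, and the opcode encodings (Load = 0001, Store = 0010, Add = 0011) are assumed from the commonly published MARIE instruction table.

```python
def invert(bit):
    """NOT gate on a single 0/1 value."""
    return 1 - bit


def hardwired_control(b3, b2, b1, b0):
    """Map the four opcode bits to control signals using fixed gate logic."""
    is_load = invert(b3) & invert(b2) & invert(b1) & b0   # 0001
    is_store = invert(b3) & invert(b2) & b1 & invert(b0)  # 0010
    is_add = invert(b3) & invert(b2) & b1 & b0            # 0011
    return {
        "mem_read": is_load | is_add,
        "mem_write": is_store,
        "alu_add": is_add,
        "ac_write": is_load | is_add,
    }


signals = hardwired_control(0, 0, 1, 1)  # opcode bits for Add
print(signals)
```

Supporting a new instruction means adding gates and rewiring the outputs, which is the design cost of this approach.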


Microprogrammed Control

The other approach, called microprogramming, uses software for control. All machine instructions are input into a special program, the microprogram, which converts each instruction into the appropriate control signals. The microprogram is essentially an interpreter, written in microcode, that is stored in firmware (ROM, PROM, or EPROM), which is often referred to as the control store. This program converts machine instructions of zeros and ones into control signals.
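A control store can be pictured as a lookup table from opcode to a sequence of micro-operations. The sketch below uses a Python dictionary for that table. The opcode numbers are assumed from the commonly published MARIE instruction table, and the micro-operation strings are descriptive labels rather than real control words.

```python
# Hypothetical control store: one microroutine per machine instruction.
CONTROL_STORE = {
    0x1: ["MAR <- X", "MBR <- M[MAR]", "AC <- MBR"],       # Load X
    0x2: ["MAR <- X, MBR <- AC", "M[MAR] <- MBR"],         # Store X
    0x3: ["MAR <- X", "MBR <- M[MAR]", "AC <- AC + MBR"],  # Add X
    0x9: ["PC <- X"],                                      # Jump X
}


def microroutine(instruction):
    """Look up the micro-operations for a 16-bit instruction word."""
    opcode = (instruction >> 12) & 0xF
    return CONTROL_STORE[opcode]


for step in microroutine(0x3005):  # Add 005
    print(step)
```

Changing the instruction set in this model means editing the table, with no change to the lookup code, which mirrors the flexibility that microprogramming offers.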

Pros and Cons

Hardwired
  The advantage of hardwired control is that it is very fast.
  The disadvantage is that the instruction set and the control logic are directly tied together by special circuits that are complex and difficult to design or modify.
Microprogrammed
  Essentially there is one subroutine in the microprogram for each machine instruction.
  The advantage of this approach is that if the instruction set requires modification, the microprogram is simply updated to match; no change is required in the actual hardware.
  The disadvantage is that all instructions must go through an additional level of interpretation, slowing down program execution.

Exercises 5, 6, 7, 8, 9, 10

Read the rest of MARIE's instructions: Halt, Skipcond, Jump X.

Exercises 25, 26