
Chapter 5

The Processor: Datapath and Control


Part - I

Introduction - Performance
The performance of a machine is determined by
three key factors:
Instruction count
Clock cycle time
Clock cycles per instruction (CPI)
The compiler and the instruction set architecture
determine the instruction count required for a given
program.
Both the clock cycle time and the CPI are determined
by the implementation of the processor.
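
The three factors combine multiplicatively: execution time is the instruction count times the CPI times the clock cycle time. A minimal Python sketch of that relation follows; the function name and the sample numbers are illustrative assumptions, not values from these slides.

```python
def cpu_time(instruction_count: int, cpi: float, clock_cycle_time_s: float) -> float:
    """Execution time in seconds = instruction count * CPI * clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s


# Hypothetical program: 1 million instructions, CPI of 2, 1 ns clock cycle.
example_time_s = cpu_time(1_000_000, 2.0, 1e-9)
```

Halving any one factor while holding the other two fixed halves the execution time, which is why the compiler, the instruction set, and the processor implementation all matter.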

The Processor: Datapath & Control


We are ready to look at an implementation of MIPS,
simplified to contain only:
memory-reference instructions: lw, sw
arithmetic-logical instructions: add, sub, and, or, slt
control flow instructions: branch-on-equal (beq), jump (j)
Generic implementation:
use the program counter (PC) to supply the instruction address
get the instruction from memory
read registers
use the instruction to decide exactly what to do
All instructions use the ALU after reading the registers.
Why? Memory-reference? Arithmetic? Control flow?

Overview of the Implementation


For every instruction, the first two steps are identical:
1. Send the PC to the memory that contains the code and
fetch the instruction from that memory.
2. Read one or two registers, using fields of the instruction to
select the registers to read. For a load word instruction we
need to read only one register, but most other instructions
require that we read two registers.
After these two steps, the actions required to complete the
instruction depend on the instruction class (memory-reference,
arithmetic-logical, and branches).
All instruction classes use the ALU after reading the registers:
Memory-reference instructions use the ALU for an address
calculation.
Arithmetic-logical instructions use the ALU for operation
execution.
Branch instructions use the ALU for comparison.

Overview of the Implementation


After using the ALU, the actions required to complete the
instruction differ by instruction class:
A memory-reference instruction will need to
access the memory either to write data for a store
or read data for a load.
An arithmetic-logical instruction must write the
data from the ALU back into a register.
A branch instruction may need to change the next
instruction address based on the comparison.
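
The per-class behaviour described above can be summarized as a small lookup. This is only a sketch: the table and function names are hypothetical, and the jump instruction (j) is left out because the three classes discussed here do not cover it.

```python
# Instruction class of each mnemonic in the subset (j is omitted on purpose).
INSTRUCTION_CLASS = {
    "lw": "memory-reference", "sw": "memory-reference",
    "add": "arithmetic-logical", "sub": "arithmetic-logical",
    "and": "arithmetic-logical", "or": "arithmetic-logical",
    "slt": "arithmetic-logical",
    "beq": "branch",
}

# What each class uses the ALU for after reading the registers.
ALU_USE = {
    "memory-reference": "address calculation",
    "arithmetic-logical": "operation execution",
    "branch": "comparison",
}


def alu_use(mnemonic: str) -> str:
    """Return the role the ALU plays for the given instruction."""
    return ALU_USE[INSTRUCTION_CLASS[mnemonic]]


def completion_step(mnemonic: str) -> str:
    """Return the action that completes the instruction after the ALU."""
    if mnemonic == "lw":
        return "read data from memory"
    if mnemonic == "sw":
        return "write data to memory"
    if INSTRUCTION_CLASS[mnemonic] == "arithmetic-logical":
        return "write ALU result to a register"
    return "possibly change the next instruction address"
```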

An Abstract / Simplified View of MIPS

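[Figure: abstract view of the MIPS implementation, showing the PC, the instruction memory (Address in, Instruction out), the register file (Register # and Data ports), the ALU, and the data memory (Address and Data ports).]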

Implementation Details of MIPS


The functional units in the MIPS implementation consist of
two different types of logic elements:
1. Elements that operate on data values:
These elements are all combinational, which means
that their outputs depend only on the current
inputs.
Given the same input, a combinational element
always produces the same output, because it has
no internal storage.
The ALU is a combinational element.

Implementation Details of MIPS


2. Elements that contain state:
An element contains state if it has some internal storage.
We call these elements state elements because, if we pulled the
plug on the machine, we could restart it by loading the state
elements with the values they contained before we pulled the plug.
Logic components that contain state are called sequential because
their outputs depend on both their inputs and the contents of the
internal state.
The instruction and data memories as well as the registers are all
examples of state elements.
A state element has at least two inputs and one output.
The required inputs are the data value to be written into the
element, and the clock, which determines when the data value is
written.
The output from a state element provides the value that was written
in an earlier clock cycle.

State Elements
Unclocked vs. Clocked
Clocks used in synchronous logic
When should an element that contains state be
updated?
[Figure: clock waveform showing the rising edge, the falling edge, and the cycle time.]

An unclocked state element


The set-reset latch: its output depends on present inputs
and also on past inputs.

Example of State elements: D Flip-flops


A D-type flip-flop has exactly two inputs (a value and
a clock) and one output.
The clock is used to determine when the state
element should be written; a state element can be
read at any time.


Clocking Methodology
A clocking methodology defines when signals can be read
and when they can be written.
If a signal is written at the same time it is read, the value of
the read could correspond to the old value, the newly written
value, or even some mix of the two!
Assume an edge-triggered clocking methodology, which
means that any values stored in the machine are updated
only on a clock edge.
The state elements all update their internal storage on the
clock edge.
Because only state elements can store a data value, any
collection of combinational logic must have its inputs coming
from a set of state elements and its outputs written into a set
of state elements.
The inputs are values that were written in a previous clock
cycle, while the outputs are values that can be used in the
following clock cycle.
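
The edge-triggered methodology can be modeled in a few lines of Python. This is a behavioural sketch only; the class and function names are hypothetical and no timing is simulated.

```python
class EdgeTriggeredRegister:
    """State element whose stored value changes only on a clock edge."""

    def __init__(self, initial=0):
        self.value = initial   # output: the value written in an earlier cycle
        self._next = initial   # value currently present at the data input

    def set_input(self, data):
        # Driving the input does not change the visible output.
        self._next = data

    def clock_edge(self):
        # The internal storage is updated only here.
        self.value = self._next


def run_cycle(source, destination, logic):
    """One clock cycle: read source, apply combinational logic, write destination."""
    destination.set_input(logic(source.value))
    destination.clock_edge()
```

Because the output changes only in clock_edge, a read during the cycle always sees the old value, which avoids the read/write ambiguity described above.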


Our Implementation
An edge triggered methodology
Typical execution:
read contents of some state elements,
send values through some combinational logic
write results to one or more state elements
[Figure: state element 1 feeds a block of combinational logic, whose output is written to state element 2, all within one clock cycle.]

Combinational logic, state elements, and the clock are closely related

In the figure on the previous slide, two state elements
surround a block of combinational logic, which operates
in a single clock cycle:
All signals must propagate from state element 1, through
the combinational logic, and to state element 2 in the time
of one clock cycle.
The time for signals to reach state element 2 defines the
length of the clock cycle.
Both the clock signal and the write control signal are inputs,
and the state element is changed only when the write control
signal is asserted and a clock edge occurs.
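
A state element with an explicit write control signal can be sketched as follows. The class name and method signature are assumptions made for illustration.

```python
class WriteEnabledRegister:
    """State element that changes only when write is asserted at a clock edge."""

    def __init__(self, initial=0):
        self.value = initial

    def clock_edge(self, data, write):
        # Called once per clock edge; the data input is ignored unless write is set.
        if write:
            self.value = data
        return self.value
```

The PC in this design is written on every cycle, so it needs no write control, whereas the register file and data memory do.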


The MIPS Subset Implementation


We start with a simple implementation that uses a
single long clock cycle for every instruction and
follows the general form (figure) in slide 3.
In this design, every instruction begins execution
on one clock edge and completes execution on the
next clock edge.
This approach is not practical, since it would be
slower than an implementation that allows different
instruction classes to take different numbers of
clock cycles, each of which could be much shorter.
An implementation that uses multiple clock cycles
for each instruction is more realistic and requires
more complex control.

Building a Datapath

Include the functional units we need for each instruction


[Figure: the functional units needed for the datapath: instruction memory, program counter, and adder; the register file (three 5-bit register number inputs, Write data, RegWrite, Read data 1, Read data 2) and the ALU (ALU control, Zero, ALU result); the data memory unit (Address, Write data, Read data, MemWrite, MemRead) and the sign-extension unit (16 bits to 32 bits).]

Why do we need this stuff?

Instruction Memory, Program Counter, and Adder


The first element we need is a place to store the
instructions of a program (a memory unit).
To execute any instruction, we must start by fetching
the instruction from memory.
To prepare for executing the next instruction, we
must increment the program counter (PC) so that it
points at the next instruction, 4 bytes later.
Therefore, two state elements (the instruction memory
and the PC) are needed to store and access instructions,
and an adder is needed to compute the next instruction
address.
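
The fetch step can be sketched in Python as below. The instruction memory is modeled as a dictionary from byte address to 32-bit word, which is an assumption for illustration, and the words used in any example are placeholders rather than real encodings.

```python
def fetch(instruction_memory: dict, pc: int):
    """Return (instruction, next_pc) for the instruction at byte address pc."""
    instruction = instruction_memory[pc]
    next_pc = (pc + 4) & 0xFFFFFFFF  # the adder: next instruction is 4 bytes later
    return instruction, next_pc


# Placeholder instruction words at addresses 0 and 4.
program = {0: 0x11111111, 4: 0x22222222}
first_word, pc_after_first = fetch(program, 0)
```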

A portion of the datapath used for fetching instructions and incrementing the PC

The PC is a 32-bit register that will be written at the end of every
clock cycle.
The adder is an ALU wired to always perform an add of its two
32-bit inputs and place the result on its output.

[Figure: the PC supplies the instruction address to the instruction memory, which outputs the instruction, and to an adder (Add, Sum) that increments the PC.]

Register File
R-format instructions (add, sub, slt, and, or) all
read two registers, perform an ALU operation on
the contents of the registers, and write the result.
The processor's 32 registers are stored in a
structure called a register file.
A register file is a collection of registers in which
any register can be read or written by specifying
the number of the register in the file.
Because R-format instructions have three register
operands, we need to read two data words from the
register file and write one data word into the register
file for each instruction.

Register File
For each data word to be read from the registers:
We need an input to the register file that
specifies the register number to be read
We need an output from the register file that will
carry the value that has been read from the
registers.
To write a data word, we need two inputs:
One to specify the register number to be written
One to supply the data to be written into the
register.
We need a total of 4 inputs (3 for register numbers
and 1 for data) and 2 outputs (both for data) as
shown in the next slide.
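
The four inputs and two outputs listed above map onto a small Python model. This is a behavioural sketch with hypothetical names; the RegWrite control appears as the reg_write argument.

```python
class RegisterFile:
    """32 registers with two read ports and one write port."""

    def __init__(self, count=32):
        self.regs = [0] * count

    def read(self, read_register_1, read_register_2):
        # Two register-number inputs, two data outputs.
        return self.regs[read_register_1], self.regs[read_register_2]

    def write(self, write_register, write_data, reg_write):
        # One register-number input and one data input; the write takes
        # effect only when reg_write is asserted.
        if reg_write:
            self.regs[write_register] = write_data & 0xFFFFFFFF
```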

Register File

Built using D flip-flops

[Figure: register file read ports. Read register number 1 and Read register number 2 each control a multiplexor that selects among Register 0 through Register n - 1 to produce Read data 1 and Read data 2.]

Register File

Note: we still use the real clock to determine when to write


[Figure: register file write port, showing the Write signal, the register number feeding a 5-to-32 decoder, Register 0 through Register 31 with clock (C) and data (D) inputs, and the register data input.]

Register File and ALU


The register number inputs are 5 bits wide to
specify one of the 32 registers (32 = 2^5),
whereas the data input and the two output
buses are each 32 bits wide.
The ALU is controlled by a 3-bit signal; it takes
two 32-bit inputs and produces a 32-bit result,
as shown in the next slide.
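
A 3-bit-controlled ALU for the subset's operations might be sketched as follows. The control encoding used here (000 AND, 001 OR, 010 add, 110 subtract, 111 set on less than) is an assumption taken from common textbook presentations of this design; these slides do not list the encoding.

```python
MASK = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def alu(control: int, a: int, b: int):
    """Return (result, zero) for two 32-bit inputs under a 3-bit control code."""
    if control == 0b000:
        result = a & b
    elif control == 0b001:
        result = a | b
    elif control == 0b010:
        result = a + b
    elif control == 0b110:
        result = a - b
    elif control == 0b111:
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError(f"unsupported ALU control: {control:03b}")
    result &= MASK          # keep the result to 32 bits
    return result, result == 0
```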


Register File and ALU

[Figure: a. the register file (three 5-bit register number inputs, Write data, RegWrite, Read data 1, Read data 2); b. the ALU (ALU control input, Zero and ALU result outputs).]

The datapath for R-type instructions

The two elements needed to implement R-format ALU operations are the
register file and the ALU.
[Figure: datapath for R-type instructions. The instruction supplies the register numbers to the register file; Read data 1 and Read data 2 feed the ALU; the ALU result returns to the register file as Write data. Control signals: ALU control and RegWrite.]

MIPS load word and store word instructions


Load word instruction: lw $t1, offset_value($t2)
Store word instruction: sw $t1, offset_value($t2)

These instructions compute a memory address by
adding the base register, which is $t2, to the 16-bit
signed offset field contained in the instruction.
If the instruction is a store, the value to be stored
must be read from the register file, where it resides
in $t1.
If the instruction is a load, the value read from
memory must be written into the register file in the
specified register, which is $t1.

To implement a datapath for MIPS load and store instructions

We will therefore need the following:

Both the register file and the ALU.
A unit to sign-extend the 16-bit offset field in the
instruction to a 32-bit signed value.
A data memory unit to read from or write to.
The data memory must be written on store
instructions; hence it has both read and write control
signals, an address input, and an input for the data
to be written into memory (see the next slide).
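
Sign extension and the address calculation can be sketched in Python as below. The function names are hypothetical, and values are treated as 32-bit patterns.

```python
def sign_extend16(value: int) -> int:
    """Extend a 16-bit field to 32 bits by copying its sign bit (bit 15)."""
    value &= 0xFFFF
    return (value | 0xFFFF0000) if value & 0x8000 else value


def effective_address(base: int, offset16: int) -> int:
    """Memory address for lw/sw: base register plus sign-extended offset."""
    return (base + sign_extend16(offset16)) & 0xFFFFFFFF
```

For example, an offset field of 0xFFFC represents -4, so with a base of 0x1000 the address is 0x0FFC.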


Data Memory and Sign-extension Units


[Figure: a. data memory unit (Address and Write data inputs, Read data output, MemWrite and MemRead controls); b. sign-extension unit (16-bit input, 32-bit output).]

Datapath for MIPS load and store instructions


Assume the instruction has already been fetched.
The register number inputs for the register file
come from fields of the instruction, as does the
offset value, which after sign extension becomes
the second ALU input.
The figure in the next slide shows the datapath for
a load or a store, which does a register access, followed
by a memory address calculation, then a read or
write from memory, and a write into the register file
if the instruction is a load.


Datapath for MIPS load and store instructions


[Figure: datapath for loads and stores. The instruction supplies the register numbers to the register file and a 16-bit offset to the sign-extension unit; the ALU combines Read data 1 with the 32-bit extended offset to form the data memory Address; Read data 2 is the memory Write data; the memory Read data returns to the register file as Write data. Control signals: ALU control, RegWrite, MemWrite, MemRead.]

Branch if equal (beq) instruction


The beq instruction has three operands: two registers that
are compared for equality, and a 16-bit offset used to
compute the branch target address relative to the
branch instruction address.
Form:
beq $t1, $t2, offset
To implement this instruction, we must compute
the branch target address by adding the sign-extended
offset field of the instruction to the PC.


beq Instruction
There are two details in the definition of the branch
instruction (for details see chapter 3):
1. The instruction set architecture specifies that the base
for the branch address calculation is the address of the
instruction following the branch.
Since we compute PC+4 (address of next instruction), it
is easy to use this value as the base for computing the
branch target address.
2. The architecture states that the offset field is shifted
left 2 bits so that it is a word offset; this shift increases
the effective range of the offset field by a factor of four.

beq Instruction (Cont.)


If the two operands are equal (condition is true)
then the branch target address becomes the new
PC, and we say the branch is taken.
If the two operands are not equal (condition is not
true) then the incremented PC should replace the
current PC, and we say the branch is not taken.
The branch datapath must do two operations:
compute the branch target address and compare
the register contents.
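
The two branch details and the taken / not-taken choice can be sketched together in Python. The function names are hypothetical, and the comparison result is passed in as a boolean rather than computed by an ALU model.

```python
MASK = 0xFFFFFFFF


def sign_extend16(value: int) -> int:
    """Extend a 16-bit field to 32 bits by copying its sign bit."""
    value &= 0xFFFF
    return (value | 0xFFFF0000) if value & 0x8000 else value


def branch_target(pc: int, offset16: int) -> int:
    """Target = (PC + 4) + (sign-extended offset shifted left 2 bits)."""
    pc_plus_4 = (pc + 4) & MASK
    word_offset = (sign_extend16(offset16) << 2) & MASK
    return (pc_plus_4 + word_offset) & MASK


def next_pc(pc: int, offset16: int, operands_equal: bool) -> int:
    """Branch taken: target becomes the new PC; not taken: PC + 4."""
    return branch_target(pc, offset16) if operands_equal else (pc + 4) & MASK
```

With a branch at 0x1000 and an offset of 3, the target is 0x1004 + 12 = 0x1010; an offset of -1 (0xFFFF) branches back to 0x1000 itself.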


A Datapath for MIPS beq Instruction


[Figure: datapath for beq. The register file supplies Read data 1 and Read data 2 to the ALU, whose Zero output goes to the branch control logic; the 16-bit offset is sign-extended to 32 bits, shifted left 2, and added to PC + 4 (from the instruction datapath) to form the branch target.]

A Datapath for MIPS beq Instruction


To compute the branch target address, the branch
datapath includes a sign extension unit and an adder.
To perform the compare, we need to use the register file to
supply the two register operands.
Because the ALU provides an output signal that indicates
whether the result was 0, we can send the two register
operands to the ALU with the control set to do a subtract.
If the Zero signal out of the ALU is asserted, then
the two values are equal.
Since the offset was sign-extended from 16 bits, the shift
will throw away only sign bits.
Control logic is used to decide whether the incremented
PC or branch target should replace the PC, based on the
Zero output of the ALU.
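
The equality test by subtraction and the effect of the 2-bit shift can both be checked with a short Python sketch. The function names are hypothetical.

```python
MASK = 0xFFFFFFFF


def zero_output(read_data_1: int, read_data_2: int) -> bool:
    """ALU set to subtract: Zero is asserted when the 32-bit difference is 0."""
    difference = (read_data_1 - read_data_2) & MASK
    return difference == 0


def shift_left_2(sign_extended_offset: int) -> int:
    """Shift a 32-bit value left by 2, discarding the two bits shifted out."""
    return (sign_extended_offset << 2) & MASK
```

After sign extension from 16 bits, the upper 17 bits of the offset are all copies of the sign bit, so the two bits discarded by the shift carry no information.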


Combining datapaths for memory and R-type instructions using multiplexors

[Figure: combined datapath for memory and R-type instructions. An ALUSrc multiplexor selects the second ALU input from Read data 2 or the sign-extended 16-bit offset; a MemtoReg multiplexor selects the register Write data from the ALU result or the data memory Read data. Annotations mark the paths used by R-type, load, and store instructions. Control signals: ALU operation (3 bits), RegWrite, MemWrite, MemRead.]

Combining datapaths for memory instructions, R-type instructions, and instruction fetch

[Figure: the combined memory and R-type datapath with instruction fetch added. The PC addresses the instruction memory and feeds an adder that adds 4; the fetched instruction drives the register file, the sign-extension unit, the ALU, and the data memory through the ALUSrc and MemtoReg multiplexors. Control signals: ALU operation, RegWrite, MemWrite, MemRead.]

Combining datapaths for memory instructions, R-type instructions, and instruction fetch

The combined datapath in the previous slide
requires both an adder and an ALU, since the adder
is used to increment the PC while the ALU is
used for executing the instruction in the same
clock cycle.
In the next slide we build a simple datapath for the
MIPS architecture, which can execute the basic
instructions (load / store word, ALU operations,
and branches) in a single clock cycle.
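
The multiplexors that let one datapath serve several instruction classes can be modeled as a two-input selector. In this sketch the polarity of the select signals (0 for the register or ALU path, 1 for the offset or memory path) is an assumption, as are the function names.

```python
def mux(select: int, input0: int, input1: int) -> int:
    """2-to-1 multiplexor: select 0 passes input0, select 1 passes input1."""
    return input1 if select else input0


def second_alu_input(alu_src: int, read_data_2: int, sign_extended_offset: int) -> int:
    """ALUSrc: register operand for R-type, extended offset for loads and stores."""
    return mux(alu_src, read_data_2, sign_extended_offset)


def register_write_data(mem_to_reg: int, alu_result: int, memory_read_data: int) -> int:
    """MemtoReg: ALU result for R-type, data memory output for loads."""
    return mux(mem_to_reg, alu_result, memory_read_data)
```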


Building the Datapath for MIPS Architecture


Use multiplexors to stitch them together
[Figure: simple datapath for the MIPS subset. A PCSrc multiplexor selects between PC + 4 and the branch target (PC + 4 plus the sign-extended offset shifted left 2) as the next PC; the ALUSrc multiplexor selects the second ALU input and the MemtoReg multiplexor selects the register Write data. Control signals: RegWrite, ALU operation, MemWrite, MemRead.]

Building the Datapath for MIPS Architecture


As the previous slide shows, the branch instruction
uses the main ALU to compare the register
operands, so a separate adder is used for computing the
branch target address.
Also, an additional multiplexor is required to select
either the sequentially following instruction
address (PC + 4) or the branch target address to be
written into the PC.
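
The whole single-cycle datapath can be approximated by one Python function that executes one instruction per call. This is a behavioural sketch under stated assumptions: instructions are passed as already-decoded tuples (a hypothetical format, not the MIPS binary encoding), only add, sub, lw, sw, and beq are handled, and memory is a dictionary keyed by byte address.

```python
MASK = 0xFFFFFFFF


def sign_extend16(value):
    """Extend a 16-bit field to 32 bits by copying its sign bit."""
    value &= 0xFFFF
    return (value | 0xFFFF0000) if value & 0x8000 else value


def step(pc, instr, regs, mem):
    """Execute one decoded instruction and return the next PC.

    Hypothetical tuple formats:
      ("add" | "sub", rd, rs, rt)
      ("lw" | "sw", rt, offset16, rs)
      ("beq", rs, rt, offset16)
    """
    op = instr[0]
    pc_plus_4 = (pc + 4) & MASK                      # dedicated PC adder
    if op in ("add", "sub"):
        _, rd, rs, rt = instr
        result = regs[rs] + regs[rt] if op == "add" else regs[rs] - regs[rt]
        regs[rd] = result & MASK                     # ALU result written back
        return pc_plus_4
    if op in ("lw", "sw"):
        _, rt, offset16, rs = instr
        address = (regs[rs] + sign_extend16(offset16)) & MASK  # main ALU
        if op == "lw":
            regs[rt] = mem[address]                  # memory data to register
        else:
            mem[address] = regs[rt]                  # register data to memory
        return pc_plus_4
    if op == "beq":
        _, rs, rt, offset16 = instr
        zero = ((regs[rs] - regs[rt]) & MASK) == 0   # main ALU subtract
        target = (pc_plus_4 + (sign_extend16(offset16) << 2)) & MASK  # second adder
        return target if zero else pc_plus_4         # PCSrc multiplexor
    raise ValueError(f"unsupported instruction: {op}")
```

The beq case shows why both units are needed in one cycle: the main ALU is busy with the comparison while the separate adder produces the branch target.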
