
Pipeline

Paulo C. Centoducatte 1998 Morgan Kaufmann Publishers



Laundry: an analogy for pipelining

[Figure: two laundry timelines, task order versus time, from 6 PM to 2 AM.]

MIPS instruction pipeline

• Instruction fetch
• Register read and instruction decode
• Execution of the operation or address calculation
• Access to the operand in memory
• Write of the result into a register

Example

• Compare the average time between instructions of a single-cycle implementation (one instruction per clock cycle) with that of a pipelined implementation. Assume the following (longest) operation times: memory access = 2 ns, ALU operation = 2 ns, and register file access = 1 ns. (Instructions: lw, sw, add, sub, and, or, slt, and beq.)

• Initially, consider the execution of 3 lw instructions (time between the start of the 1st instruction and the start of the 4th instruction).

Total time for the eight instructions, computed from the time of each component
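
As a rough sketch of how such totals can be derived, the Python snippet below sums the component times for each instruction class. The latencies come from the example (memory access 2 ns, ALU 2 ns, register file 1 ns); the mapping from instruction class to the functional units it uses is an assumption based on the usual single-cycle MIPS datapath, and the names `UNIT_NS`, `USES`, and `instruction_time_ns` are illustrative only.

```python
# Component latencies in ns, taken from the example above.
UNIT_NS = {"fetch": 2, "reg_read": 1, "alu": 2, "data_access": 2, "reg_write": 1}

# Assumed functional units used by each instruction class (hypothetical mapping).
USES = {
    "lw":       ["fetch", "reg_read", "alu", "data_access", "reg_write"],
    "sw":       ["fetch", "reg_read", "alu", "data_access"],
    "R-format": ["fetch", "reg_read", "alu", "reg_write"],  # add, sub, and, or, slt
    "beq":      ["fetch", "reg_read", "alu"],
}


def instruction_time_ns(kind):
    """Sum the latencies of the units that one instruction class passes through."""
    return sum(UNIT_NS[unit] for unit in USES[kind])


times = {kind: instruction_time_ns(kind) for kind in USES}

# A single-cycle design must fit the slowest instruction in one clock cycle;
# a pipelined design must fit the slowest stage in one clock cycle.
single_cycle_clock_ns = max(times.values())
pipeline_clock_ns = max(UNIT_NS.values())

for kind, t in times.items():
    print(f"{kind:8s} {t} ns")
print("single-cycle clock:", single_cycle_clock_ns, "ns")
print("pipelined clock:", pipeline_clock_ns, "ns")
```

Under these assumptions the slowest class (lw) sets the single-cycle clock, while the slowest stage sets the pipelined clock.
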

Non-pipelined versus pipelined execution

[Figure: execution of lw $1, 100($0); lw $2, 200($0); lw $3, 300($0), program execution order versus time in ns. Each instruction passes through instruction fetch, Reg, ALU, data access, and Reg. Without pipelining, consecutive instructions start 8 ns apart; with pipelining, they start 2 ns apart.]

Notes:

• Under ideal conditions, with balanced stages, the pipeline speedup equals the number of pipeline stages (5 stages, 5 times faster).
• In practice, the execution time of an instruction is slightly longer (overheads), so the speedup is less than the number of pipeline stages.
• The performance gain of the pipeline comes from the increase in throughput.

Suppose 1003 instructions are executed (the 3 lw instructions plus 1000 more):

with pipeline: 1000 × 2 ns + 14 ns = 2014 ns (each additional instruction adds 2 ns)

without pipeline: 1000 × 8 ns + 24 ns = 8024 ns

speedup = 8024 / 2014 ≈ 3.98 ≈ 8 / 2
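
The arithmetic above can be checked with a short Python sketch. It assumes an ideal 5-stage pipeline with a 2 ns clock and no stalls, and an 8 ns single-cycle clock; the function names are illustrative.

```python
STAGES = 5
PIPELINE_CLOCK_NS = 2      # slowest stage
SINGLE_CYCLE_CLOCK_NS = 8  # slowest instruction (lw)


def pipelined_time_ns(n):
    """The first instruction needs STAGES cycles; each further one finishes a cycle later."""
    return (STAGES + n - 1) * PIPELINE_CLOCK_NS


def single_cycle_time_ns(n):
    """Every instruction takes one full (long) clock cycle."""
    return n * SINGLE_CYCLE_CLOCK_NS


def speedup(n):
    return single_cycle_time_ns(n) / pipelined_time_ns(n)


for n in (3, 1003):
    print(n, pipelined_time_ns(n), single_cycle_time_ns(n), round(speedup(n), 2))
```

For small instruction counts the pipeline fill time dominates, which is why the speedup only approaches 8 / 2 = 4 as the count grows.
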

Designing an instruction set for pipelining

• What makes the implementation easier
– Instructions all of the same length
– Few formats, with the register fields always in the same place (symmetry: in the 2nd stage the registers can be read while the instruction is being decoded)
– Memory is accessed only by the lw and sw instructions
– Operands aligned in memory: data can be transferred from memory to the CPU, or from the CPU to memory, in a single pipeline stage

Designing an instruction set for pipelining

• What makes the implementation harder
– Hazards
• Structural hazards
• Control hazards
• Data hazards

Pipeline Hazards

• Structural hazard
– The hardware cannot support the combination of instructions that we want to execute in the same clock cycle
• E.g., writing to and reading from memory in the same cycle

Pipeline Hazards

• Control hazard
– Problems caused by the execution of branch instructions
• E.g., when a branch is taken, what should be done with the instruction(s) that physically follow the branch in the program and are already in the pipeline?

Pipeline stalling for branch instructions

[Figure: add $4, $5, $6; beq $1, $2, 40; lw $3, 300($0), program execution order versus time in ns. The beq starts 2 ns after the add; the lw starts 4 ns after the beq, that is, after one extra 2 ns stall.]

Branch prediction: try to "guess" which path the branch will take

[Figure, top ("the branch will not be taken"): add $4, $5, $6; beq $1, $2, 40; lw $3, 300($0) start 2 ns apart.]
[Figure, bottom: add $4, $5, $6; beq $1, $2, 40; a bubble; then or $7, $8, $9, which starts 4 ns after the beq.]

Pipeline delayed branch

[Figure: beq $1, $2, 40; add $4, $5, $6 (delayed branch slot); lw $3, 300($0), program execution order versus time in ns. Consecutive instructions start 2 ns apart.]

Data hazards

• Occur when an instruction needs a value that has not been computed yet
– E.g.: add $s0, $t0, $t1
        sub $t2, $s0, $t3

Solutions:
• The compiler (or programmer) generates code free of data hazards (for example, by inserting nop instructions or by reordering instructions)
• Stall
• Forwarding (bypassing)

Data hazards

[Figure: add $s0, $t0, $t1 followed by sub $t2, $s0, $t3, program execution order versus time in ns (2 to 10); each instruction passes through IF, ID, EX, MEM, WB.]

Data hazards

[Figure: lw $s0, 20($t1); a bubble (stall); sub $t2, $s0, $t3, program execution order versus time in ns (2 to 14); each instruction passes through IF, ID, EX, MEM, WB.]

Example

• Find the hazard in the code below and resolve it:

# $t1 holds the address of v[k]
lw $t0, 0($t1)   # $t0 = v[k]
lw $t2, 4($t1)   # $t2 = v[k+1]
sw $t2, 0($t1)   # v[k] = $t2
sw $t0, 4($t1)   # v[k+1] = $t0

Solution: the sw that stores $t2 immediately follows the lw that loads $t2; swapping the two sw instructions removes the hazard.

# $t1 holds the address of v[k]
lw $t0, 0($t1)   # $t0 = v[k]
lw $t2, 4($t1)   # $t2 = v[k+1]
sw $t0, 4($t1)   # v[k+1] = $t0
sw $t2, 0($t1)   # v[k] = $t2

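
The check performed by hand in this example can be sketched in Python. The sketch assumes forwarding is available, so the only case that still stalls is a lw whose result is read by the very next instruction; the tuple encoding `(opcode, destination, sources)` and the function name `load_use_hazards` are illustrative choices.

```python
def load_use_hazards(program):
    """Return the indices i where program[i] is a lw whose result program[i + 1] reads."""
    hazards = []
    for i in range(len(program) - 1):
        op, dest, _ = program[i]
        _, _, next_sources = program[i + 1]
        if op == "lw" and dest in next_sources:
            hazards.append(i)
    return hazards


# (opcode, destination register, source registers)
original = [
    ("lw", "$t0", ["$t1"]),         # lw $t0, 0($t1)
    ("lw", "$t2", ["$t1"]),         # lw $t2, 4($t1)
    ("sw", None, ["$t2", "$t1"]),   # sw $t2, 0($t1)
    ("sw", None, ["$t0", "$t1"]),   # sw $t0, 4($t1)
]

# Same program with the two sw instructions swapped.
reordered = [original[0], original[1], original[3], original[2]]

print(load_use_hazards(original))
print(load_use_hazards(reordered))
```

The reordering is safe here because the two stores write different addresses and neither depends on the other.
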
Pipeline: basic idea

• 5 stages: fetch; decode and register read; execution or address calculation; memory access; write to the destination register

What is needed to actually split the datapath into stages?

[Figure: single-cycle datapath divided into IF: instruction fetch; ID: instruction decode/register file read; EX: execute/address calculation; MEM: memory access; WB: write back.]

Instructions executing in the datapath

Time (in clock cycles); program execution order (in instructions)

                 CC 1  CC 2  CC 3  CC 4  CC 5  CC 6  CC 7
lw $1, 100($0)   IM    Reg   ALU   DM    Reg
lw $2, 200($0)         IM    Reg   ALU   DM    Reg
lw $3, 300($0)               IM    Reg   ALU   DM    Reg

Pipelined Datapath

[Figure: pipelined datapath with the pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB between the stages.]

Can a problem occur in this solution even when there is no data dependence?
Which instruction causes the problem when it executes?

Pipelined Datapath: lw

[Figures: pipelined datapath (IF/ID, ID/EX, EX/MEM, MEM/WB) with the active portion shown for lw in each stage, in order: instruction fetch, instruction decode, execution, memory, write back.]

Pipelined Datapath: sw

[Figures: pipelined datapath (IF/ID, ID/EX, EX/MEM, MEM/WB) with the active portion shown for sw in each stage, in order: execution, memory, write back.]

Corrected datapath

Problem in the execution of the load instruction. What is the error?

[Figure: corrected pipelined datapath with the pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB.]

Datapath with the stages used by a lw instruction

[Figure: pipelined datapath (IF/ID, ID/EX, EX/MEM, MEM/WB) highlighting the portions used by lw.]

Graphical representations of the pipeline

Time (in clock cycles); program execution order (in instructions)

                  CC 1  CC 2  CC 3  CC 4  CC 5  CC 6
lw $10, 20($1)    IM    Reg   ALU   DM    Reg
sub $11, $2, $3         IM    Reg   ALU   DM    Reg

They help answer questions such as:
• How many cycles does it take to execute this code?
• What is the ALU doing during cycle 10?

They help in understanding the datapaths.

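
Questions of this kind can be answered mechanically from the diagram. The Python sketch below assumes an ideal pipeline with no stalls, in which instruction i (0-based) enters the first stage in clock cycle i + 1; the function names are illustrative.

```python
STAGES = ["IM", "Reg", "ALU", "DM", "Reg"]
ALU_STAGE = 2  # position of the ALU stage in STAGES


def stage_in_cycle(instr_index, cycle):
    """Stage occupied by instruction instr_index (0-based) in clock cycle `cycle` (1-based)."""
    position = cycle - 1 - instr_index
    return STAGES[position] if 0 <= position < len(STAGES) else None


def total_cycles(n_instructions):
    """Cycles needed to run n instructions to completion without stalls."""
    return n_instructions + len(STAGES) - 1


def instruction_in_alu(program, cycle):
    """Instruction using the ALU in the given cycle, or None if the ALU is idle."""
    for i, text in enumerate(program):
        if cycle - 1 - i == ALU_STAGE:
            return text
    return None


program = ["lw $10, 20($1)", "sub $11, $2, $3"]
print(total_cycles(len(program)))
print(instruction_in_alu(program, 3))
print(instruction_in_alu(program, 4))
```

A `None` result from `instruction_in_alu` means no instruction of this program is in the ALU stage during that cycle.
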
Graphical representations of the pipeline

Time (in clock cycles); program execution order (in instructions)

                  CC 1               CC 2                CC 3                CC 4         CC 5         CC 6
lw $10, 20($1)    Instruction fetch  Instruction decode  Execution           Data access  Write back
sub $11, $2, $3                      Instruction fetch   Instruction decode  Execution    Data access  Write back

[Figures: pipelined datapath in clock cycles 1 to 6 for lw $10, 20($1) followed by sub $11, $2, $3.]

Clock 1: lw in instruction fetch
Clock 2: sub in instruction fetch, lw in instruction decode
Clock 3: sub in instruction decode, lw in execution
Clock 4: sub in execution, lw in memory
Clock 5: sub in memory, lw in write back
Clock 6: sub in write back

Pipeline control

[Figure: pipelined datapath with the control signals PCSrc, Branch, RegWrite, MemWrite, MemRead, ALUSrc, ALUOp, RegDst, and MemtoReg, the ALU control block, and the instruction fields [15–0], [20–16], and [15–11].]

Pipeline control

• 5 stages. What has to be controlled in each stage?
– 1st: instruction fetch and PC increment
– 2nd: instruction decode and register fetch
– 3rd: execution
– 4th: memory access
– 5th: write back

Control signals

              Execution/address calculation    Memory access stage        Write-back stage
              stage control lines              control lines              control lines
Instruction   RegDst  ALUOp1  ALUOp0  ALUSrc   Branch  MemRead  MemWrite  RegWrite  MemtoReg
R-format      1       1       0       0        0       0        0         1         0
lw            0       0       0       1        0       1        0         1         1
sw            X       0       0       1        0       0        1         0         X
beq           X       0       1       0        1       0        0         0         X

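
One way to see how the nine control lines are grouped by stage is to store each row of the table as a bit string and slice it. This is only a sketch: the string encoding, the `CONTROL` dictionary, and `split_control` are illustrative, with `X` standing for "don't care" as in the table.

```python
# Column order, as in the table:
# RegDst ALUOp1 ALUOp0 ALUSrc | Branch MemRead MemWrite | RegWrite MemtoReg
CONTROL = {
    "R-format": "110000010",
    "lw":       "000101011",
    "sw":       "X0010010X",
    "beq":      "X0101000X",
}


def split_control(kind):
    """Split the nine control lines into the groups used by the EX, MEM, and WB stages."""
    bits = CONTROL[kind]
    return {"EX": bits[0:4], "M": bits[4:7], "WB": bits[7:9]}


for kind in CONTROL:
    print(kind, split_control(kind))
```

Each group only has to travel as far as the stage that consumes it: the EX group stops after ID/EX, the M group after EX/MEM, and the WB group after MEM/WB.
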
[Figure: the control lines generated from the instruction are grouped into EX, M, and WB fields; ID/EX carries EX, M, and WB, EX/MEM carries M and WB, and MEM/WB carries WB.]

Datapath with control

[Figure: pipelined datapath with the control unit; the EX, M, and WB control fields travel through the ID/EX, EX/MEM, and MEM/WB pipeline registers and drive PCSrc, Branch, RegWrite, MemWrite, MemRead, ALUSrc, ALUOp, RegDst, and MemtoReg.]

Clock-by-clock execution of the sequence

lw $10, 20($1)
sub $11, $2, $3
and $12, $4, $5
or $13, $6, $7
add $14, $8, $9

[Figures: datapath with control in clock cycles 1 to 9; the instruction in each stage is listed below.]

Clock  IF               ID               EX            MEM           WB
1      lw $10, 20($1)   before<1>        before<2>     before<3>     before<4>
2      sub $11, $2, $3  lw $10, 20($1)   before<1>     before<2>     before<3>
3      and $12, $4, $5  sub $11, $2, $3  lw $10, ...   before<1>     before<2>
4      or $13, $6, $7   and $12, $4, $5  sub $11, ...  lw $10, ...   before<1>
5      add $14, $8, $9  or $13, $6, $7   and $12, ...  sub $11, ...  lw $10, ...
6      after<1>         add $14, $8, $9  or $13, ...   and $12, ...  sub $11, ...
7      after<2>         after<1>         add $14, ...  or $13, ...   and $12, ...
8      after<3>         after<2>         after<1>      add $14, ...  or $13, ...
9      after<4>         after<3>         after<2>      after<1>      add $14, ...

Dependences

• Problem with starting the execution of an instruction before the previous one has finished
– Example: instructions with dependences

sub $2, $1, $3      # register $2 is modified
and $12, $2, $5     # the value of $2 (1st operand) depends on the sub
or $13, $6, $2      # same (2nd operand)
add $14, $2, $2     # same (1st and 2nd operands)
sw $15, 100($2)     # same (base register of the address)

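
The dependences annotated in the comments can be listed automatically. The Python sketch below uses an illustrative tuple encoding `(opcode, destination, sources)` and a hypothetical helper `raw_dependences`; it pairs each source register with the most recent earlier instruction that writes it.

```python
# (opcode, destination register, source registers)
program = [
    ("sub", "$2", ["$1", "$3"]),
    ("and", "$12", ["$2", "$5"]),
    ("or", "$13", ["$6", "$2"]),
    ("add", "$14", ["$2", "$2"]),
    ("sw", None, ["$15", "$2"]),   # sw reads both the stored value and the base register
]


def raw_dependences(program):
    """Return (producer_index, consumer_index, register) for each read-after-write dependence."""
    deps = []
    for j, (_, _, sources) in enumerate(program):
        for reg in dict.fromkeys(sources):      # drop duplicate sources, keep their order
            for i in range(j - 1, -1, -1):      # search backwards for the latest writer
                if program[i][1] == reg:
                    deps.append((i, j, reg))
                    break
    return deps


for producer, consumer, reg in raw_dependences(program):
    print(f"{program[consumer][0]} depends on {program[producer][0]} through {reg}")
```

Whether a dependence actually becomes a hazard depends on how far apart the two instructions are in the pipeline.
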
Time (in clock cycles)   CC 1  CC 2  CC 3  CC 4  CC 5    CC 6  CC 7  CC 8  CC 9
Value of register $2:    10    10    10    10    10/-20  -20   -20   -20   -20

Program execution order (in instructions)
sub $2, $1, $3           IM    Reg   ALU   DM    Reg
and $12, $2, $5                IM    Reg   ALU   DM      Reg
or $13, $6, $2                       IM    Reg   ALU     DM    Reg
add $14, $2, $2                            IM    Reg     ALU   DM    Reg
sw $15, 100($2)                                  IM      Reg   ALU   DM    Reg

If the register file is written in the first half of the clock cycle and read in the second half, there is no hazard in CC 5 (register write, then register read, within the same cycle).

Software solution

• The compiler guarantees hazard-free code by inserting nops
– Where should the nops be inserted?

Original code        Code with nops
sub $2, $1, $3       sub $2, $1, $3
and $12, $2, $5      nop
or $13, $6, $2       nop
add $14, $2, $2      and $12, $2, $5
sw $15, 100($2)      or $13, $6, $2
                     add $14, $2, $2
                     sw $15, 100($2)

Problem: it reduces performance

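
A compiler pass of this kind can be sketched in Python. The sketch assumes a 5-stage pipeline without forwarding in which the register file is written in the first half of a cycle and read in the second half, so a consumer must sit at least 3 instruction slots after its producer; `MIN_DISTANCE`, `insert_nops`, and the tuple encoding are illustrative.

```python
NOP = ("nop", None, [])
MIN_DISTANCE = 3  # producer in WB during the same cycle in which the consumer is in ID


def insert_nops(program):
    """Pad the program with nops so that no instruction reads a register too early."""
    out = []
    last_write = {}  # register -> position in `out` of its most recent writer
    for instr in program:
        _, dest, sources = instr
        # Earliest position at which all pending source registers are readable.
        earliest = max(
            (last_write[r] + MIN_DISTANCE for r in sources if r in last_write),
            default=0,
        )
        while len(out) < earliest:
            out.append(NOP)
        if dest is not None and dest != "$0":
            last_write[dest] = len(out)  # position this instruction is about to take
        out.append(instr)
    return out


program = [
    ("sub", "$2", ["$1", "$3"]),
    ("and", "$12", ["$2", "$5"]),
    ("or", "$13", ["$6", "$2"]),
    ("add", "$14", ["$2", "$2"]),
    ("sw", None, ["$15", "$2"]),
]

print([op for op, _, _ in insert_nops(program)])
```

Only the and needs padding here; once it has been pushed back, the or, add, and sw are already far enough from the sub.
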
Hardware solution

• Use the result as soon as it is computed; do not wait for it to be written.

• This requires a mechanism to detect the hazard.

What is the condition for a hazard to exist?

An instruction tries to read a register (in the EX stage) that an earlier instruction has yet to write in its WB stage.

Forwarding

Time (in clock cycles)   CC 1  CC 2  CC 3  CC 4  CC 5    CC 6  CC 7  CC 8  CC 9
Value of register $2:    10    10    10    10    10/-20  -20   -20   -20   -20
Value of EX/MEM:         X     X     X     -20   X       X     X     X     X
Value of MEM/WB:         X     X     X     X     -20     X     X     X     X

Program execution order (in instructions)
sub $2, $1, $3           IM    Reg   ALU   DM    Reg
and $12, $2, $5                IM    Reg   ALU   DM      Reg
or $13, $6, $2                       IM    Reg   ALU     DM    Reg
add $14, $2, $2                            IM    Reg     ALU   DM    Reg
sw $15, 100($2)                                  IM      Reg   ALU   DM    Reg

What happens if $13 is used in place of $2?

Conditions that determine data hazards

• EX/MEM
– EX/MEM.RegisterRd = ID/EX.RegisterRs
– EX/MEM.RegisterRd = ID/EX.RegisterRt

• MEM/WB
– MEM/WB.RegisterRd = ID/EX.RegisterRs
– MEM/WB.RegisterRd = ID/EX.RegisterRt

Conditions that determine data hazards

• Example:
sub $2, $1, $3      # register $2 is modified
and $12, $2, $5     # the value of $2 depends on the sub
or $13, $6, $2      # same (2nd operand)
add $14, $2, $2     # same (1st and 2nd operands)
sw $15, 100($2)     # same (base register of the address)

– sub-and: EX/MEM.RegisterRd = ID/EX.RegisterRs = $2
– sub-or: MEM/WB.RegisterRd = ID/EX.RegisterRt = $2
– sub-add: no hazard
– sub-sw: no hazard

Conditions that determine data hazards

• This policy is not precise, because some instructions do not write to the register file.
– The solution is to check the RegWrite signal.
• MIPS uses $0 for operands with value 0. What about instructions whose destination is $0?
– sll $0, $1, 2 leaves a nonzero value in the pipeline registers
– add $3, $0, $2: with forwarding, $3 would hold a wrong value at the end of the add instruction
• To handle this, the following conditions must be added:
– EX/MEM.RegisterRd <> 0 (1st hazard)
– MEM/WB.RegisterRd <> 0 (2nd hazard)

Datapath without forwarding

[Figure (a. No forwarding): the ID/EX, EX/MEM, and MEM/WB pipeline registers around the register file, the ALU, the data memory, and the write-back multiplexer.]

Datapath with forwarding (for add, sub, and, and or)

[Figure (b. With forwarding): multiplexers controlled by ForwardA and ForwardB select each ALU input; the forwarding unit receives Rs and Rt from ID/EX together with EX/MEM.RegisterRd and MEM/WB.RegisterRd.]

Values of the ForwardA and ForwardB signals

Mux control     Source   Explanation
ForwardA = 00   ID/EX    The first ALU operand comes from the register file
ForwardA = 10   EX/MEM   The first ALU operand is forwarded from the previous ALU result
ForwardA = 01   MEM/WB   The first ALU operand is forwarded from data memory or from an earlier ALU result
ForwardB = 00   ID/EX    The second ALU operand comes from the register file
ForwardB = 10   EX/MEM   The second ALU operand is forwarded from the previous ALU result
ForwardB = 01   MEM/WB   The second ALU operand is forwarded from data memory or from an earlier ALU result

Hazard detection and control

• Forwarding control is placed in the EX stage, because that is where the ALU forwarding multiplexers are located. The operand register numbers from the ID stage must therefore be passed along through the ID/EX pipeline register, which means adding the rs field (bits 25-21) to it.

Conditions for hazard detection

EX hazard
if (EX/MEM.RegWrite and (EX/MEM.RegisterRd <> 0)
    and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
        ForwardA = 10
if (EX/MEM.RegWrite and (EX/MEM.RegisterRd <> 0)
    and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
        ForwardB = 10
MEM hazard
if (MEM/WB.RegWrite and (MEM/WB.RegisterRd <> 0)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
        ForwardA = 01
if (MEM/WB.RegWrite and (MEM/WB.RegisterRd <> 0)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
        ForwardB = 01

Notes:

• There is no data hazard in the WB stage, because we assume that the register file supplies the correct result when the instruction in the ID stage reads the same register that the instruction in the WB stage writes; in effect, the register file itself performs the forwarding.

– In the 1st edition of the P&H book, this case is a hazard.

Notes:

• A potential data hazard can occur between the result of the instruction in WB, the result of the instruction in MEM, and the source operand of the instruction in the ALU stage:

add $1, $1, $2
add $1, $1, $3
add $1, $1, $4

• In this case the result is forwarded from the MEM stage, because the result in that stage is the more recent one.

if (MEM/WB.RegWrite and (MEM/WB.RegisterRd <> 0)
    and (EX/MEM.RegisterRd <> ID/EX.RegisterRs)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01

if (MEM/WB.RegWrite and (MEM/WB.RegisterRd <> 0)
    and (EX/MEM.RegisterRd <> ID/EX.RegisterRt)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01

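
The forwarding conditions can be put together in a small Python model. It is a sketch, not the slide's exact logic: the `Latch` class and the function names are illustrative, and the priority of EX/MEM over MEM/WB is expressed by the order of the tests, which is assumed here to play the role of the extra `EX/MEM.RegisterRd <> ID/EX.RegisterRs` term.

```python
from dataclasses import dataclass


@dataclass
class Latch:
    """The fields of a pipeline register that the forwarding unit looks at."""
    reg_write: bool = False
    rd: int = 0


def forward_select(ex_mem, mem_wb, src):
    """Return the 2-bit mux select for one ALU operand read from register number src."""
    if ex_mem.reg_write and ex_mem.rd != 0 and ex_mem.rd == src:
        return 0b10  # EX hazard: take the ALU result held in EX/MEM (most recent)
    if mem_wb.reg_write and mem_wb.rd != 0 and mem_wb.rd == src:
        return 0b01  # MEM hazard: take the value held in MEM/WB
    return 0b00      # no hazard: use the register file value from ID/EX


def forwarding_unit(ex_mem, mem_wb, rs, rt):
    """Return (ForwardA, ForwardB) for the instruction currently in EX."""
    return forward_select(ex_mem, mem_wb, rs), forward_select(ex_mem, mem_wb, rt)


# and $12, $2, $5 in EX while sub $2, $1, $3 is in MEM
print(forwarding_unit(Latch(True, 2), Latch(), rs=2, rt=5))
# third add of the add $1 chain in EX: both older adds write $1
print(forwarding_unit(Latch(True, 1), Latch(True, 1), rs=1, rt=4))
```

In the add chain the operand comes from EX/MEM, which matches the note above that the more recent result must win.
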
Datapath modified for forwarding

[Figure: pipelined datapath with control and the forwarding unit; IF/ID.RegisterRs, IF/ID.RegisterRt, and IF/ID.RegisterRd are carried into ID/EX as Rs, Rt, and Rd, and the forwarding unit compares Rs and Rt with EX/MEM.RegisterRd and MEM/WB.RegisterRd.]

Forwarding (example)

    sub $2, $1, $3
    and $4, $2, $5
    or  $4, $4, $2
    add $9, $4, $2

[Figures: four snapshots of the forwarding datapath, clocks 3 to 6. Instruction in each stage:]

Clock   IF               ID               EX               MEM            WB
3       or $4, $4, $2    and $4, $2, $5   sub $2, $1, $3   before<1>      before<2>
4       add $9, $4, $2   or $4, $4, $2    and $4, $2, $5   sub $2, ...    before<1>
5       after<1>         add $9, $4, $2   or $4, $4, $2    and $4, ...    sub $2, ...
6       after<2>         after<1>         add $9, $4, $2   or $4, ...     and $4, ...

Forwarding: datapath with the sign-extended immediate input
needed by lw and sw

[Figure: close-up of the datapath from ID/EX to MEM/WB, showing the ALUSrc multiplexer at the ALU input together with the forwarding multiplexers and the forwarding unit.]
Data hazards and stalls

Program execution order (in instructions) against time (in clock cycles):

                  CC 1  CC 2  CC 3  CC 4  CC 5  CC 6  CC 7  CC 8  CC 9
lw  $2, 20($1)    IM    Reg   ALU   DM    Reg
and $4, $2, $5          IM    Reg   ALU   DM    Reg
or  $8, $2, $6                IM    Reg   ALU   DM    Reg
add $9, $4, $2                      IM    Reg   ALU   DM    Reg
slt $1, $6, $7                            IM    Reg   ALU   DM    Reg

Data hazards and stalls


• When an instruction tries to read a register and the instruction
just before it is a load that writes the same register → the data
is still being read from memory (cycle 4) while the ALU is executing
the operation → stall the pipeline so that the instruction reads the
correct value.

• Hazard detection condition for stalling the pipeline:


# test whether the instruction in EX is a load:
if (ID/EX.MemRead and
    # check whether the destination register of the load in EX is a
    # source register of the instruction in ID:
    ((ID/EX.RegisterRt = IF/ID.RegisterRs) or
     (ID/EX.RegisterRt = IF/ID.RegisterRt)))
        stall pipeline
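
The stall condition translates almost directly into a predicate. This is a minimal Python sketch; the function name and argument names are illustrative, and register fields are passed as plain register numbers.

```python
def load_use_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID must wait one cycle for a load in EX."""
    return bool(id_ex_mem_read
                and (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt))


# lw $2, 20($1) is in EX and and $4, $2, $5 is in ID: the load writes $2, and reads $2.
must_stall = load_use_stall(True, 2, 2, 5)
```

When the instruction in EX is not a load (`id_ex_mem_read` is false), the predicate is false and ordinary forwarding handles the dependence.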
Data hazards and stalls

Program execution order (in instructions) against time (in clock cycles), with the stall inserted:

                  CC 1  CC 2  CC 3  CC 4  CC 5  CC 6  CC 7  CC 8  CC 9  CC 10
lw  $2, 20($1)    IM    Reg   ALU   DM    Reg
and $4, $2, $5          IM    Reg   Reg   ALU   DM    Reg
or  $8, $2, $6                IM    IM    Reg   ALU   DM    Reg
add $9, $4, $2                            IM    Reg   ALU   DM    Reg
slt $1, $6, $7                                  IM    Reg   ALU   DM    Reg

The bubble enters the pipeline in CC 4, where and repeats its Reg stage and or repeats its IM stage.

Data hazards and stalls

• If the instruction in the ID stage is stalled, the instruction in
the IF stage must be stalled too

• How?
Prevent the PC and the IF/ID register from changing. The
instruction in IF is fetched again, and in ID the same instruction
fields are read again.

• Stalling the pipeline has the same effect as a nop instruction
starting at the EX stage → deassert the 9 control signals of the
EX, MEM and WB stages
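
The stall mechanism described above can be sketched as follows. PCWrite and IF/IDWrite are the signal names used in the datapath figure; `zero_controls` is a hypothetical name for the select of the multiplexer that feeds zeros into the ID/EX control fields, and the state is reduced to a PC value and the text of the instruction held in IF/ID.

```python
def hazard_unit_outputs(stall):
    """Outputs of the hazard detection unit for one cycle."""
    return {
        "PCWrite": not stall,      # 0 keeps the PC, so the same instruction is fetched again
        "IF/IDWrite": not stall,   # 0 keeps IF/ID, so ID decodes the same instruction again
        "zero_controls": stall,    # 1 sends all-zero control signals into ID/EX (a bubble)
    }


def next_pc_and_if_id(pc, if_id, fetched, stall):
    """Return the PC and IF/ID contents at the next clock edge."""
    out = hazard_unit_outputs(stall)
    new_pc = pc + 4 if out["PCWrite"] else pc
    new_if_id = fetched if out["IF/IDWrite"] else if_id
    return new_pc, new_if_id


held = next_pc_and_if_id(8, "and $4, $2, $5", "or $8, $2, $6", stall=True)
advanced = next_pc_and_if_id(8, "and $4, $2, $5", "or $8, $2, $6", stall=False)
```

With `stall=True` both the PC and IF/ID keep their values, which is why and repeats ID and or repeats IF in the stalled diagram.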

Datapath with forwarding and data hazard detection

[Figure: the forwarding datapath with a hazard detection unit added. The unit reads ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs and IF/ID.RegisterRt, and drives PCWrite, IF/IDWrite and the multiplexer that selects between the control signals and 0 for ID/EX.]

Execution sequence of the previous example

[Figures: six snapshots of the datapath with forwarding and hazard detection, clocks 2 to 7. Instruction in each stage:]

Clock   IF               ID               EX               MEM            WB
2       and $4, $2, $5   lw $2, 20($1)    before<1>        before<2>      before<3>
3       or $4, $4, $2    and $4, $2, $5   lw $2, 20($1)    before<1>      before<2>
4       or $4, $4, $2    and $4, $2, $5   bubble           lw $2, ...     before<1>
5       add $9, $4, $2   or $4, $4, $2    and $4, $2, $5   bubble         lw $2, ...
6       after<1>         add $9, $4, $2   or $4, $4, $2    and $4, ...    bubble
7       after<2>         after<1>         add $9, $4, $2   or $4, ...     and $4, ...

Branch hazards

• We must fetch one instruction per clock cycle, but the decision about
which path a branch takes is not made until the MEM stage.

Program execution order (in instructions) against time (in clock cycles):

                      CC 1  CC 2  CC 3  CC 4  CC 5  CC 6  CC 7  CC 8  CC 9
40 beq $1, $3, 7      IM    Reg   ALU   DM    Reg
44 and $12, $2, $5          IM    Reg   ALU   DM    Reg
48 or  $13, $6, $2                IM    Reg   ALU   DM    Reg
52 add $14, $2, $2                      IM    Reg   ALU   DM    Reg
72 lw  $4, 50($7)                             IM    Reg   ALU   DM    Reg
Branch hazards

• Three schemes for resolving control hazards:

    • Branch delay slots

    • Assume branch not taken

    • Dynamic branch prediction

• Assume branch not taken

    • execution continues sequentially; if the branch is taken, the
    instructions fetched between the branch and the instruction at
    the target address are discarded by setting their control
    signals to zero


Branch hazards
• Reducing the branch delay

    • reduce the cost when the branch is taken

    • move the execution of the branch instruction earlier.

• The next PC for a branch instruction is selected in the MEM stage
    • execute the branch in the ID stage
    (only one instruction is discarded)

    • move the branch address calculation (branch adder) from
    MEM to ID, and compare there the registers read from the
    register file.
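
The saving from resolving the branch earlier can be counted directly. The sketch below assumes one instruction is fetched per cycle and that every instruction fetched after a taken branch, up to the cycle in which the branch is resolved, has to be discarded; the names are illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def flushed_on_taken_branch(resolve_stage):
    """Instructions fetched after the branch before its outcome can change the PC."""
    return STAGES.index(resolve_stage)


penalty_in_mem = flushed_on_taken_branch("MEM")  # the three instructions at 44, 48 and 52
penalty_in_id = flushed_on_taken_branch("ID")    # only the instruction at 44
```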

Branch hazards

[Figure: datapath with the branch decided in the ID stage. The branch adder (with shift left 2 and sign extend) and an equality comparator (=) on the register file outputs sit in ID; the IF.Flush signal clears the instruction in IF/ID.]

Branch hazards (example)

36 sub $10, $4, $8
40 beq $1, $3, 7      # PC ← 40 + 4 + 7*4 = 72
44 and $12, $2, $5
48 or  $13, $2, $6
52 add $14, $4, $2
56 slt $15, $6, $7
.... ............
.... ............
72 lw  $4, 50($7)
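
The target address in the comment on the beq line comes from the standard MIPS branch computation: the 16-bit immediate is sign-extended, shifted left by 2, and added to PC + 4. A small Python sketch, with illustrative function names:

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(pc, imm16):
    """Target of a taken MIPS branch located at address pc."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


target = branch_target(40, 7)  # 40 + 4 + 7*4
```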

Branch hazards (example)

[Figures: two snapshots of the datapath, clocks 3 and 4, with the branch taken. Instruction in each stage:]

Clock   IF                ID               EX                MEM            WB
3       and $12, $2, $5   beq $1, $3, 7    sub $10, $4, $8   before<1>      before<2>
4       lw $4, 50($7)     bubble (nop)     beq $1, $3, 7     sub $10, ...   before<1>
Dynamic branch prediction

• Branch not taken

    • a form of branch prediction that assumes the branch will
    not be taken.

• Dynamic branch prediction

    • looks up whether the branch was taken the last time it was
    executed and fetches instructions from the same place as
    last time.


Dynamic branch prediction

• Implementation
    • branch prediction buffer
    • branch history table

• Branch prediction buffer: a small memory indexed by the
low-order bits of the branch instruction's address. It holds one bit
that says whether the branch was recently taken or not. (With this
scheme we do not know whether the prediction is the right one, because
the entry may have been set by another branch instruction that has
the same low-order address bits.)
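
A one-bit branch prediction buffer of this kind can be sketched in a few lines of Python. The buffer size (16 entries), the initial prediction (not taken) and the choice to drop the two byte-offset bits of the address before indexing are assumptions made for the illustration, as are all names.

```python
class BranchPredictionBuffer:
    """One prediction bit per entry, indexed by low-order branch address bits."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        self.taken = [False] * (1 << index_bits)

    def _index(self, pc):
        # Instructions are word aligned, so the two lowest bits carry no information.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        return self.taken[self._index(pc)]

    def update(self, pc, was_taken):
        self.taken[self._index(pc)] = was_taken


bpb = BranchPredictionBuffer()
bpb.update(40, True)           # the branch at address 40 was taken
same_branch = bpb.predict(40)
aliased = bpb.predict(40 + 64)  # a different branch that maps to the same entry
```

The last line shows the aliasing mentioned above: the branch at address 104 inherits the prediction recorded by the branch at address 40.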

Exception handling

[Figure: datapath with exception support. The signals IF.Flush, ID.Flush and EX.Flush are added, along with the Cause and Except PC registers and a PC multiplexer input with the handler address 40000040.]

Exception handling

Example:
Given the sequence below:
40hex sub $11, $2, $4
44hex and $12, $2, $5
48hex or  $13, $2, $6
4Chex add $1, $2, $1
50hex slt $15, $6, $7
54hex lw  $16, 50($7)
Assume that the instructions invoked to handle an exception begin with:
40000040hex sw $25, 1000($0)
40000044hex sw $26, 1004($0)

Show what happens in the pipeline if an overflow exception occurs in the
add instruction.
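
The pipeline contents one cycle after the overflow is detected can be worked out with a short sketch. It assumes, following the P&H text, that the value saved in Except PC is the address of the faulting instruction plus 4, and it models each stage only by the text of the instruction it holds; the function and variable names are hypothetical.

```python
HANDLER_ADDRESS = 0x40000040  # first handler instruction, from the example


def overflow_exception_step(stages, faulting_pc, handler_first_instr):
    """Pipeline contents one cycle after overflow is detected in EX.

    stages maps "IF", "ID", "EX", "MEM", "WB" to instruction text.
    """
    next_stages = {
        "IF": handler_first_instr,  # the PC was redirected to HANDLER_ADDRESS
        "ID": "bubble",             # IF.Flush squashed the instruction that was in IF
        "EX": "bubble",             # ID.Flush squashed the instruction that was in ID
        "MEM": "bubble",            # EX.Flush squashed the faulting instruction
        "WB": stages["MEM"],        # older instructions complete normally
    }
    except_pc = faulting_pc + 4     # assumption: saved value is address + 4
    next_fetch = HANDLER_ADDRESS + 4
    return next_stages, except_pc, next_fetch


clock5 = {"IF": "lw $16, 50($7)", "ID": "slt $15, $6, $7", "EX": "add $1, $2, $1",
          "MEM": "or $13, $2, $6", "WB": "and $12, $2, $5"}
clock6, except_pc, next_fetch = overflow_exception_step(clock5, 0x4C, "sw $25, 1000($0)")
```

In this sketch the add, slt and lw never write their results, while the or that was already in MEM still reaches WB.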

Exception handling: the figures show what happens from the
point where the add instruction is in EX (clock 5)

[Figures: two snapshots of the datapath with exception support, clocks 5 and 6. Instruction in each stage:]

Clock   IF                  ID                 EX               MEM            WB
5       lw $16, 50($7)      slt $15, $6, $7    add $1, $2, $1   or $13, ...    and $12, ...
6       sw $25, 1000($0)    bubble (nop)       bubble           bubble         or $13, ...

Superscalar pipeline

[Figure: superscalar datapath with two ALUs and two sign-extend units, one ALU computing the data memory address.]
Superscalar pipeline

Example:

How would the loop below be scheduled on a superscalar MIPS?

Loop: lw   $t0, 0($s1)        # $t0 = array element
      addu $t0, $t0, $s2      # add the scalar in $s2
      sw   $t0, 0($s1)        # store the result
      addi $s1, $s1, -4       # decrement the pointer
      bne  $s1, $zero, Loop   # branch if $s1 != 0

Reorder the instructions to avoid as many pipeline stalls as possible


Superscalar pipeline

Solution:
The first three instructions have data dependences among them,
and so do the last two.
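
The dependences named in the solution can be found mechanically by tracking the last writer of each register. The sketch below assumes the register usage given by the loop's comments (pointer in $s1, scalar in $s2); the tuple layout and function name are illustrative.

```python
# (text, destination register or None, source registers)
LOOP = [
    ("lw   $t0, 0($s1)",      "$t0", ["$s1"]),
    ("addu $t0, $t0, $s2",    "$t0", ["$t0", "$s2"]),
    ("sw   $t0, 0($s1)",      None,  ["$t0", "$s1"]),
    ("addi $s1, $s1, -4",     "$s1", ["$s1"]),
    ("bne  $s1, $zero, Loop", None,  ["$s1", "$zero"]),
]


def raw_dependences(instrs):
    """Pairs (producer, consumer) of read-after-write dependences in one iteration."""
    last_writer = {}
    pairs = []
    for i, (_, dest, sources) in enumerate(instrs):
        for reg in sources:
            pair = (last_writer.get(reg), i)
            if pair[0] is not None and pair not in pairs:
                pairs.append(pair)
        if dest is not None:
            last_writer[dest] = i
    return pairs


deps = raw_dependences(LOOP)
```

The result links lw to addu, addu to sw, and addi to bne, which is the chain a scheduler has to respect when pairing instructions.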

Superscalar pipeline
Loop unrolling for superscalar pipelines → a technique to increase
performance of loops that access arrays → multiple copies of the loop
body are made and instructions from different iterations are scheduled together

Example → take the previous example

Solution → 4 copies of the loop body
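
At the source level, unrolling the array loop four times looks roughly like the sketch below. It is a high-level Python analogue of the MIPS loop rather than the scheduled machine code, and it assumes the array length is a multiple of 4; the function names are illustrative.

```python
def add_scalar(array, scalar):
    """Original loop: one element per iteration, walking the index downwards."""
    i = len(array) - 1
    while i >= 0:
        array[i] += scalar
        i -= 1
    return array


def add_scalar_unrolled4(array, scalar):
    """Same loop with four copies of the body per iteration."""
    assert len(array) % 4 == 0, "sketch assumes a multiple of 4 elements"
    i = len(array) - 4
    while i >= 0:
        # Four independent updates that a scheduler can interleave.
        array[i + 3] += scalar
        array[i + 2] += scalar
        array[i + 1] += scalar
        array[i] += scalar
        i -= 4
    return array


plain = add_scalar(list(range(8)), 10)
unrolled = add_scalar_unrolled4(list(range(8)), 10)
```

The unrolled version executes one index update and one loop test per four elements, and its four body copies touch different elements, which leaves more independent instructions to issue together.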


Final datapath

[Figure: complete pipelined datapath with control, showing the forwarding unit, the hazard detection unit, branch resolution in ID (IF.Flush), and exception support (ID.Flush, EX.Flush, Cause, Except PC, handler address 40000040). Legible signal labels include Branch, ALUSrc, ALUOp, RegDst and MemRead; the register numbers come from Instruction [25-21], [20-16] and [15-11].]
