Outline
Multi-cycle operations
Floating-point operations
Structural and data hazards
Interrupts, Faults and Exceptions
Precise exceptions
Complications in pipelines
READING: Appendix A and E2
[Figure: pipeline diagram tracking instructions 1 through 5 over cycles 1-13]
Out-of-order completion
3 finishes before 2, and 5 finishes before 4
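Out-of-order completion follows directly from unequal execute latencies. The sketch below is a simplified illustration, not the schedule from the figure: it assumes one instruction is fetched per cycle with no stalls, and takes the execute-stage counts from the stage names used in this deck (EX, A1-A4, M1-M7). The five-instruction `program` is hypothetical.

```python
# Execute-stage counts taken from the stage names in this deck:
# integer EX = 1, FP add A1-A4 = 4, FP multiply M1-M7 = 7.
EX_STAGES = {"int": 1, "fp_add": 4, "fp_mul": 7}


def writeback_cycle(position, kind):
    """WB cycle of the instruction at 0-based program position `position`.

    Assumes IF happens in cycle position+1, followed by ID, the execute
    stages, MEM, and WB, with no stalls anywhere (a simplification).
    """
    fetch = position + 1
    return fetch + 1 + EX_STAGES[kind] + 1 + 1


def completion_order(kinds):
    """Program positions sorted by WB cycle (ties keep program order)."""
    wb = [writeback_cycle(i, k) for i, k in enumerate(kinds)]
    return sorted(range(len(kinds)), key=lambda i: wb[i])


# Hypothetical instruction mix: a multiply, two integer ops, an add, an integer op.
program = ["fp_mul", "int", "int", "fp_add", "int"]
print([writeback_cycle(i, k) for i, k in enumerate(program)])  # [11, 6, 7, 11, 9]
print(completion_order(program))                               # [1, 2, 4, 0, 3]
```

The multiply is first in program order but among the last to write back, and the multiply and the add both want WB in cycle 11.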
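[Figure: pipeline with multi-cycle functional units. After ID, an instruction enters one of EX (integer), M1-M7 (multiply), A1-A4 (add), or DIV (25 cycles), then proceeds to MEM and WB.]
Annotations on the figure:
WAW hazards possible; WAR hazards not possible
Out-of-order completion has ramifications for exceptions
Longer operation latency implies more frequent stalls for RAW hazards
Structural hazard: DIV is not fully pipelined
Structural hazard: instructions have varying running times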
[Figure: pipeline timing diagram, cycles 1-11. A divide (stage D), a multiply (M1-M7), an add (A1-A4), and integer instructions (EX) overlap, and several of them reach MEM and WB in the same cycles.]
Late resolution
Stall instructions at entry to MEM or WB stage
Complicates pipeline control (two stall locations)
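A minimal sketch of late resolution at the WB stage, under simplifying assumptions: a single register write port, each instruction's desired WB cycle already known, and earlier instructions in program order keeping their slot while a later one waits for the next free cycle. The priority rule and the function names are assumptions for illustration; the knock-on stalls for instructions behind the stalled one are not modeled.

```python
from collections import Counter


def wb_conflicts(wb_cycles):
    """Cycles in which more than one instruction wants the write port."""
    counts = Counter(wb_cycles)
    return sorted(cycle for cycle, n in counts.items() if n > 1)


def stall_at_wb(wb_cycles):
    """Resolve conflicts late: delay an instruction at the entry to WB.

    `wb_cycles` lists desired WB cycles in program order. An instruction
    whose slot is already taken waits until the next free cycle.
    """
    taken = set()
    actual = []
    for cycle in wb_cycles:
        while cycle in taken:
            cycle += 1
        taken.add(cycle)
        actual.append(cycle)
    return actual


# Hypothetical desired WB cycles for five instructions in program order.
desired = [11, 6, 7, 11, 9]
print(wb_conflicts(desired))  # [11]
print(stall_at_wb(desired))   # [11, 6, 7, 12, 9]
```

Detecting the conflict this late is simple, but the stall now has to be applied at MEM or WB in addition to the usual stall point in ID, which is the control complication the slide refers to.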
WAW Hazards
DIV.D (issued at t = -16)
MULT.D F0, F4, F6
integer instruction
integer instruction
ADD.D F2, F4, F6
L.D F2, 0(R2)
[Figure: pipeline timing diagram for the sequence above, cycles 1-13, with stall cycles marked s]
WAW hazard arises only when no instruction between ADD.D and L.D uses F2
Adding an instruction like ADD.D F8,F2,F4 before L.D would stall the pipeline
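The WAW condition can be checked mechanically: a later instruction that writes the same register must not write back at or before an earlier one. The sketch below is illustrative only. The WB cycles in the example are assumed values (a 4-stage add followed immediately by a load, with no stalls), not numbers read from the figure.

```python
def waw_hazards(instrs):
    """Find WAW hazards in a list of (name, dest, wb_cycle) in program order.

    A hazard exists when a later instruction writes the same destination
    register no later than an earlier instruction does.
    """
    hazards = []
    for i, (name1, dest1, wb1) in enumerate(instrs):
        for name2, dest2, wb2 in instrs[i + 1:]:
            if dest1 is not None and dest1 == dest2 and wb2 <= wb1:
                hazards.append((name1, name2, dest1))
    return hazards


# Assumed WB cycles: ADD.D fetched in cycle 1 (WB in 8), L.D fetched in cycle 2 (WB in 6).
sequence = [("ADD.D", "F2", 8), ("L.D", "F2", 6)]
print(waw_hazards(sequence))  # [('ADD.D', 'L.D', 'F2')]
```

If the load's write is delayed until after cycle 8, or the add's write is suppressed, the check returns an empty list.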
RAW Hazards
L: L.D F4, 0(R2)
M: MUL.D F0, F4, F6
A: ADD.D F2, F0, F8
S: S.D 0(R2), F2
D: DIV.D F12, F4, F8
[Figure: stage-occupancy chart for cycles 1-19, one row per stage (IF, ID, EX, Mult, Add, Div, MEM, WB), showing which of the instructions L, M, A, S, and D occupies each stage in each cycle]
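The RAW stalls in this sequence can be estimated with a small in-order model. This is a sketch under stated assumptions, not a reproduction of the chart: full forwarding, every source operand needed at the start of the first execute stage (a real pipeline needs store data only at MEM), load data available after MEM, and execute lengths of 1 (integer, load, store), 7 (multiply), 4 (add), and 25 (divide) taken from the stage names in this deck. The `Instr` record and function name are hypothetical.

```python
from collections import namedtuple

# Hypothetical record: destination register, source registers, execute length.
Instr = namedtuple("Instr", "name dest srcs ex_stages is_load")


def ex_start_cycles(program):
    """Cycle in which each instruction enters its first execute stage."""
    ready = {}        # register -> cycle at whose end its value can be forwarded
    starts = []
    prev_start = 0
    for i, ins in enumerate(program):
        earliest = i + 3                          # IF in cycle i+1, ID in cycle i+2
        earliest = max(earliest, prev_start + 1)  # in-order issue from ID
        for reg in ins.srcs:
            earliest = max(earliest, ready.get(reg, 0) + 1)
        last_ex = earliest + ins.ex_stages - 1
        if ins.dest is not None:
            # A load delivers its value at the end of MEM, one cycle after EX.
            ready[ins.dest] = last_ex + 1 if ins.is_load else last_ex
        starts.append(earliest)
        prev_start = earliest
    return starts


program = [
    Instr("L.D", "F4", ("R2",), 1, True),
    Instr("MUL.D", "F0", ("F4", "F6"), 7, False),
    Instr("ADD.D", "F2", ("F0", "F8"), 4, False),
    Instr("S.D", None, ("R2", "F2"), 1, False),
    Instr("DIV.D", "F12", ("F4", "F8"), 25, False),
]
print(ex_start_cycles(program))  # [3, 5, 12, 16, 17]
```

Without hazards the start cycles would be 3, 4, 5, 6, 7, so the multiply loses one cycle waiting for the load and the add loses seven more waiting for the multiply.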
RAW hazards
Two methods for reducing stalls
Stage   Functional Unit   Description
A       FP adder          Mantissa ADD stage
D       FP divider        Divide pipeline stage
E       FP multiplier     Exception test stage
M       FP multiplier     First stage of multiplier
N       FP multiplier     Second stage of multiplier
R       FP adder          Rounding stage
S       FP adder          Operand shift stage
U                         Unpack FP numbers
[Figure: Add/Subtract reservation table, stages A, D, E, M, N, R, S, U versus cycle]
[Figure: Multiply reservation table, stages A, D, E, M, N, R, S, U versus cycle]
[Figure: Divide reservation table, stages A, D, E, M, N, R, S, U versus cycle, extending to cycle 36]
[Figure: Add/Subtract reservation table, stages A, D, E, M, N, R, S, U versus cycle]
Can't initiate another add on cycle 3: conflict here
[Figure: reservation tables with two overlapped operations, x and y, over cycles 1-19]
Collision vector:
1 indicates forbidden latency
0 indicates allowed latency
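A collision vector can be derived from a reservation table: a latency k is forbidden when some stage is used in two cycles that are k apart. The sketch below shows the computation on a small hypothetical three-stage table; it is not one of the tables from these slides, and the function names are assumptions.

```python
def forbidden_latencies(table):
    """Latencies at which two initiations would collide in some stage.

    `table` maps a stage name to the set of cycles (1-based) in which a
    single operation occupies that stage.
    """
    forbidden = set()
    for cycles in table.values():
        for first in cycles:
            for second in cycles:
                if second > first:
                    forbidden.add(second - first)
    return forbidden


def collision_vector(table):
    """List indexed by latency 1..max: 1 = forbidden, 0 = allowed."""
    forbidden = forbidden_latencies(table)
    longest = max(forbidden, default=0)
    return [1 if k in forbidden else 0 for k in range(1, longest + 1)]


# Hypothetical reservation table: S1 is used in cycles 1 and 4, S2 in 2 and 3.
table = {"S1": {1, 4}, "S2": {2, 3}, "S3": {5}}
print(sorted(forbidden_latencies(table)))  # [1, 3]
print(collision_vector(table))             # [1, 0, 1]
```

Here a second operation may start 2 cycles after the first, but not 1 or 3 cycles after it.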
Steady-state utilization (cycles 5-24)
= (5*10)/(8*20) = 50/160 = 31.25%
Total utilization (cycles 1-28)
= (5+5*10+5)/(8*28) = 60/224 = 26.79%
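The utilization figures are busy stage-cycles divided by available stage-cycles (8 stages times the number of cycles in the window). A short check of the arithmetic above, using exact fractions; the helper name is an assumption:

```python
from fractions import Fraction


def utilization(busy_cells, stages, cycles):
    """Fraction of stage-cycle slots that are occupied."""
    return Fraction(busy_cells, stages * cycles)


steady = utilization(5 * 10, 8, 20)          # cycles 5-24
total = utilization(5 + 5 * 10 + 5, 8, 28)   # cycles 1-28
print(f"{float(steady):.2%}")  # 31.25%
print(f"{float(total):.2%}")   # 26.79%
```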
[Figure: Multiply reservation table with three overlapped operations, x, y, and z, over cycles 1-28]
Note out-of-order completion
Steady-state utilization (cycles 6-21)
= (4*17)/(8*16) = 68/128 = 53.13%
Total utilization
= (12+4*17+5)/(8*28) = 85/224 = 37.95%
[Figure: reservation tables for interleaved Add/Subtract operations (a, b) and Multiply operations (m, n), stages A, D, E, M, N, R, S, U over cycles 1-28]
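Exception type   Sync/Async   User request/Coerced   Within/Between instr.   Resume/Terminate
[name missing]   Async        Coerced                Between instr.          Resume
OS call          Sync         User request           Between instr.          Resume
Breakpoint       Sync         User request           Between instr.          Resume
Power fail       Async        Coerced                Within instr.           Terminate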
internal
Imperative for some interrupts (VM page faults, IEEE FP standard)
undone
Example: The first operand of a VAX instruction uses autodecrement addressing mode, which writes a register. Trying to access the second operand causes a page fault. Since instruction execution cannot be completed, we must restore the register written by autodecrement to its original value.
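One way to picture the restore step is an undo log: each partial register update records the old value first, and a fault rolls the log back in reverse. This is a minimal sketch of the idea, not a description of how the VAX implements it; `PageFault`, `execute_restartable`, and the micro-operation helpers are hypothetical names.

```python
class PageFault(Exception):
    """Raised by a micro-operation whose memory access cannot complete."""


def autodecrement(reg, size):
    """Micro-op: log the old register value, then decrement it."""
    def op(regs, undo_log):
        undo_log.append((reg, regs[reg]))
        regs[reg] -= size
    return op


def faulting_access(regs, undo_log):
    """Micro-op standing in for the second-operand access that page-faults."""
    raise PageFault("second operand not resident")


def execute_restartable(regs, micro_ops):
    """Run all micro-ops; on a page fault, undo every partial update."""
    undo_log = []
    try:
        for op in micro_ops:
            op(regs, undo_log)
    except PageFault:
        for reg, old_value in reversed(undo_log):
            regs[reg] = old_value
        return False
    return True


regs = {"R1": 100}
completed = execute_restartable(regs, [autodecrement("R1", 4), faulting_access])
print(completed, regs)  # False {'R1': 100}
```

After the rollback the instruction can be restarted from scratch once the page has been brought in.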
Long-running instructions
Not enough to be able to
Pipeline stage   Problem exceptions
IF               Page fault on instruction fetch; misaligned memory access; memory-protection violation
ID               Undefined or illegal opcode
EX               Arithmetic exception
MEM              Page fault on data fetch; misaligned memory access; memory-protection violation
WB               None
it precise?
What problems do delayed branches cause?
What happens if multiple exceptions occur in the pipeline?
Can exceptions occur out-of-order?
What problems do multi-cycle instructions cause?
[Figure: timing diagram of successive instructions in the five-stage pipeline (F, D, X, M, W)]
instructions
After the exception-handling routine in the OS receives control, save the PC of the faulting instruction
When the exception has been handled, the RFE instruction reloads the PC and restarts sequential instruction execution
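A toy model of this sequence: older instructions finish, the faulting instruction and everything younger are prevented from changing state, and the faulting PC is saved so that RFE can resume there. Disabling writes for the faulting and younger instructions is the standard scheme and is assumed here; the record layout and function name are hypothetical.

```python
def take_exception(in_flight, faulting_index):
    """Squash the faulting instruction and all younger ones.

    `in_flight` lists the instructions in the pipeline, oldest first, as
    dicts with 'pc' and 'writes_enabled'. Older instructions are left
    alone so they can complete. Returns the PC to save for RFE.
    """
    for instr in in_flight[faulting_index:]:
        instr["writes_enabled"] = False
    return in_flight[faulting_index]["pc"]


# Hypothetical pipeline contents: four instructions at consecutive PCs.
in_flight = [{"pc": 0x100 + 4 * i, "writes_enabled": True} for i in range(4)]
saved_pc = take_exception(in_flight, 1)
print(hex(saved_pc))                              # 0x104
print([i["writes_enabled"] for i in in_flight])   # [True, False, False, False]
```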
[Figure: five instructions flowing through the F, D, X, M, W stages over cycles 1-9, one instruction entering per cycle]
Cycle   1  2  3  4  5  6
LW      F  D  X  M  W
ADD        F  D  X  M  W
fault
Relative timing differs between unpipelined and pipelined machines
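Because a younger instruction can fault in an early stage before an older one faults in a late stage, exceptions must be sorted by program order rather than by the cycle in which they were raised. The sketch below assumes a scenario consistent with the LW/ADD timing shown: ADD takes an instruction page fault in F (cycle 2) and LW takes a data page fault in M (cycle 4). The scenario, the tuple layout, and the function name are assumptions for illustration.

```python
def exception_to_handle(pending):
    """Pick the exception belonging to the earliest instruction in program order.

    `pending` holds (cycle_raised, program_position, cause) tuples collected
    while the instructions move down the pipeline.
    """
    return min(pending, key=lambda event: event[1])


pending = [
    (2, 1, "instruction page fault (ADD, in F)"),
    (4, 0, "data page fault (LW, in M)"),
]
first_raised = min(pending, key=lambda event: event[0])
print(first_raised[2])                   # instruction page fault (ADD, in F)
print(exception_to_handle(pending)[2])   # data page fault (LW, in M)
```

An unpipelined machine would see the LW fault first, so the pipelined machine has to defer the ADD fault until LW's status is known.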
[Figure: timing diagram, cycles 1-28, for three instructions in program order (presumably DIVF, ADDF, and SUBF). First: F 1, D 2, X 3-26, M 27, W 28. Second: F 2, D 3, X 4-7, M 8, W 9. Third: F 3, D 4, X 5-8, M 9, W 10.]
immediately
Differences in running times cause out-of-order termination
DIVF throws arithmetic exception late in its execution
At that point, ADDF and SUBF have both completed execution
and destroyed one of their operands
Can we maintain precise interrupts under these conditions?
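One common answer is to buffer results and commit them to the register file in program order, so that nothing younger than a faulting instruction ever changes architectural state. The sketch below illustrates that idea only; it is an assumption that this matches any of the approaches the slides go on to list, and the register names and values are hypothetical.

```python
def commit_in_order(results, faulting):
    """Commit buffered results in program order, stopping at a fault.

    `results` lists (name, dest, value) in program order for instructions
    that have finished executing. `faulting` names the instruction that
    raised an exception, or is None. Returns the register updates that
    were applied and the names of the instructions that were discarded.
    """
    committed = {}
    for i, (name, dest, value) in enumerate(results):
        if name == faulting:
            return committed, [n for n, _, _ in results[i:]]
        committed[dest] = value
    return committed, []


# Hypothetical destinations and values for the three instructions.
results = [("DIVF", "F0", None), ("ADDF", "F2", 1.0), ("SUBF", "F4", 2.0)]
print(commit_in_order(results, "DIVF"))  # ({}, ['DIVF', 'ADDF', 'SUBF'])
print(commit_in_order(results, None))
```

With the DIVF fault, neither ADDF nor SUBF has overwritten an operand, so all three can be re-executed after the handler returns. The cost is that short instructions hold their results until the long one resolves.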
checkpointing)
[Figure: pipeline timing diagram (F, D, X, M, W) over cycles 1-19 for instructions with multi-cycle X stages]