Você está na página 1de 88

Digital Design

Chapter 5:
Register-Transfer Level
(RTL) Design
Slides to accompany the textbook Digital Design, with RTL Design, VHDL,
and Verilog, 2nd Edition,
by Frank Vahid, John Wiley and Sons Publishers, 2010.
http://www.ddvahid.com

Copyright 2010 Frank Vahid


Instructors of courses requiring Vahid's Digital Design textbook (published by John Wiley and Sons) have permission to modify and use these slides for customary course-related activities,
subject to keeping this copyright notice in place and unmodified. These slides may be posted as unanimated pdf versions on publicly-accessible course websites.. PowerPoint source (or pdf
Digital
with animations) may Design 2e
not be posted to publicly-accessible websites, but may be posted for students on internal protected sites or distributed directly to students by other electronic means.
Copyright 2010 1
Instructors may make printouts of the slides available to students for a reasonable photocopying charge, without incurring royalties. Any other use requires explicit permission. Instructors
Frank Vahid
may obtain PowerPoint source or obtain special use permissions from Wiley see http://www.ddvahid.com for information.
5.1

Introduction
Chpt 2

Higher levels
Register-
Capture Comb. behavior: Equations, truth tables transfer
Convert to circuit: AND + OR + NOT Comb. logic level (RTL)

Chpt 3 Logic level


Capture sequential behavior: FSMs Transistor level
Convert to circuit: Register + Comb. logic Controller
Chpt 4 Levels of digital
design abstraction
Datapath components, simple datapaths
Chpt 5
Capture behavior: High-level state machine Processors:
Convert to circuit: Controller + Datapath Processor Programmable
Known as RTL (register-transfer level) design (microprocessor)
Custom
Digital Design 2e
Copyright 2010 2
Frank Vahid
Note: Slides with animation are denoted with a small red "a" near the animated items
5.2

High-Level State Machines (HLSMs)


Some behaviors too complex for
equations, truth tables, or FSMs s a
Ex: Soda dispenser
c: bit input, 1 when coin deposited
a: 8-bit input having value of c Soda
deposited coin d dispenser
s: 8-bit input having cost of a soda processor
d: bit output, processor sets to 1
when total value of deposited coins
equals or exceeds cost of a soda s a 25
FSM cant represent 25
8-bit input/output 0 1 0 1 0 50
Storage of current total c Soda tot:
Addition (e.g., 25 + 10) d
tot:
dispenser a

0 1 0 processor 50
25

Digital Design 2e
Copyright 2010 3
Frank Vahid
HLSMs s
8
a
8

High-level state machine c Soda


(HLSM) extends FSM with: d dispenser
processor
Multi-bit input/output
Local storage
Arithmetic operations
Inputs: c (bit), a (8 bits), s (8 bits)
Conventions Outputs: d (bit) // '1' dispenses soda
Numbers: Local storage: tot (8 bits)
Single-bit: '0' (single quotes) c
Integer: 0 (no quotes)
Multi-bit: 0000 (double quotes) Init Wait
tot:=tot+a
a == for equal, := for assignment d:='0' c'*(tot<s)
c*(tot<s)
Multi-bit outputs must be tot:=0
registered via local storage Disp

// precedes a comment SodaDispenser d:='1'


Digital Design 2e
Copyright 2010 4
Frank Vahid
Ex: Cycles-High Counter
P = total number (in binary) of cycles that m is 1
CountHigh
Capture behavior as HLSM m
clk Preg

Preg required (multibit outputs must be registered) 32


Use to hold count P
CountHigh Inputs: m (bit) CountHigh Inputs: m (bit) CountHigh Inputs: m (bit)
Outputs: P (32 bits) Outputs: P (32 bits) Outputs: P (32 bits)
Local storage: Preg Local storage: Preg Local storage: Preg
S_Clr // Clear Preg to 0s S_Clr // Clear Preg to 0s S_Clr
// Clear Preg to 0s
Preg := 0 Preg := 0 Preg := 0
a

? m' // Wait for m == '1' m' // Wait for m == '1'


S_Wt S_Wt

m m' m

? // Increment Preg
m S_Inc Preg := Preg + 1
(a) (b) (c)

Digital Design 2e Note: Could have designed directly using an up-counter. But, that methodology
Copyright 2010
is ad hoc, and won't work for more complex examples, like the next one. a 5
Frank Vahid
Example: Laser-Based Distance Measurer
T (in seconds)
laser
D
Object of
a
interest
sensor
2D = T sec * 3*108 m/sec

Laser-based distance measurement pulse laser,


measure time T to sense reflection
Laser light travels at speed of light, 3*108 m/sec
Distance is thus D = (T sec * 3*108 m/sec) / 2

Digital Design 2e
Copyright 2010 6
Frank Vahid
Example: Laser-Based Distance Measurer
T (in seconds)
B L
laser from button to laser
Laser-based
distance
sensor D 16 measurer S
to display from sensor

Inputs/outputs
B: bit input, from button, to begin measurement
L: bit output, activates laser
S: bit input, senses laser reflection
D: 16-bit output, to display computed distance

Digital Design 2e
Copyright 2010 7
Frank Vahid
Example: Laser-Based Distance Measurer
DistanceMeasurer from button B Laser-
L
to laser
Inputs: B (bit), S (bit) based
Outputs: L (bit), D (16 bits) distance
D 16 measurer S
Local storage: Dreg(16) to display from sensor
(required)
a
S0 ?
(first state usually
L := '0' // laser off initializes the system)
Dreg := 0 // distance is 0

Declare inputs, outputs, and local storage


Dreg required for multi-bit output
Create initial state, name it S0
Initialize laser to off (L:='0') Recall: '0' means single bit,
Initialize displayed distance to 0 (Dreg:=0) 0 means integer

Digital Design 2e
Copyright 2010 8
Frank Vahid
Example: Laser-Based Distance Measurer
from button B Laser-
L
to laser
DistanceMeasurer based
... B' // button not pressed distance
D 16 measurer S
to display from sensor

S0 S1 ?
B
L := '0' // button
Dreg := 0 pressed

Add another state, S1, that waits for a button press


B' stay in S1, keep waiting
B go to a new state S2

Q: What should S2 do? A: Turn on the laser


a
Digital Design 2e
Copyright 2010 9
Frank Vahid
Example: Laser-Based Distance Measurer
from button B Laser-
L
to laser
based
DistanceMeasurer distance
... B' D 16 S
to display measurer
from sensor

S0 S1 S2 S3
B
L := '0' L := '1' L := '0'
Dreg := 0 // laser on // laser off

Add a state S2 that turns on the laser (L:='1')


Then turn off laser (L:='0') in a state S3

Q: What do next? A: Start timer, wait to sense reflection


a

Digital Design 2e
Copyright 2010 10
Frank Vahid
Example: Laser-Based Distance Measurer
B L
from button to laser
DistanceMeasurer Inputs: B (bit), S (bit) Outputs: L (bit), D (16 bits) Laser-based
Local storage: Dreg, Dctr (16 bits) 16
distance
D measurer S
B' to display from sensor
S' // no reflection

S // reflection
S0 S1 S2 S3 ?
B
L := '0' Dctr := 0 L := '1' L := '0'
Dreg := 0 // reset cycle Dctr := Dctr + 1
count // count cycles
a

Stay in S3 until sense reflection (S)


To measure time, count cycles while in S3
To count, declare local storage Dctr
Initialize Dctr to 0 in S1. In S2 would have been O.K. too.
Don't forget to initialize local storagecommon mistake
Increment Dctr each cycle in S3
Digital Design 2e
Copyright 2010 11
Frank Vahid
Example: Laser-Based Distance Measurer
B L
from button Laser- to laser
DistanceMeasurer Inputs: B (bit), S (bit) Outputs: L (bit), D (16 bits) based
Local storage: Dreg, Dctr (16 bits) distance
D 16 S
to display measurer
from sensor
B' S'

S0 S1 S2 S3 S4
B S
L := '0' Dctr := 0 L := '1' L := '0' Dreg := Dctr/2
Dreg := 0 Dctr := Dctr+1 // calculate D

Once reflection detected (S), go to new state S4


Calculate distance
Assuming clock frequency is 3x108, Dctr holds number of meters, so
Dreg:=Dctr/2
After S4, go back to S1 to wait for button again
Digital Design 2e
Copyright 2010 12
Frank Vahid
HLSM Actions: Updates Occur Next Clock Cycle
S'

Local storage updated on clock edges only


S3
Enter state on clock edge S
Dctr := Dctr+1
Storage writes in that state occur on next clock edge
Can think of as occurring on outgoing transitions S' / Dctr := Dctr+1

Thus, transition conditions use the OLD value,


S3
not the newly-written value S/
Dctr := Dctr+1
Example:
Inputs: B (bit)
Outputs: P (bit) // if B, 2 cycles high S0 S1 S1 S0
Local storage: Jreg (8 bits) clk

B' B
!(Jreg<2) Jreg<2
1 2 3
S0 S1 Jreg ? 1 2 3
B
P := '0' P := '1'
Jreg := 1 Jreg := Jreg + 1 P
Digital Design 2e
Copyright 2010
(a) (b) 13
Frank Vahid
5.3

RTL Design Process


External data
Capture behavior DP
inputs
...
Convert to circuit External control
inputs
control
Need target architecture inputs
... ...
Controller Datapath
Datapath capable of HLSM's External ...
...
control
data operations outputs DP
Controller to control datapath control ...
outputs
External data
outputs

Digital Design 2e
Copyright 2010 14
Frank Vahid
Ctrl/DP Example for Earlier CountHigh a

Cycles-High Counter m 000...00001

A B
Create DP
add1
CountHigh
S
Connect
First clear Preg to 0s
m
Preg ? 32 with
Then increment Preg for each Preg_clr controller
clr I
clock cycle that m is 1 ld Preg
(a) Preg_ld Q
P
(c) DP
32 Derive
CountHigh Inputs: m (bit) P
controller
Outputs: P (32 bits)
LocStr: Preg (32 bits) CountHigh
//Clear Preg to 0s
S_Clr Preg := 0 m 000...00001
We //Preg := 0
S_Clr Preg_clr = 1 A B
created
Preg_ld = 0 add1
this S
m' //Wait for m=='1'
HLSM S_Wt 32
//Wait for m=1 Preg_clr
earlier m' S_Wt Preg_clr = 0 clr I
m' m ld Preg
Preg_ld = 0
Preg_ld Q
//Increment Preg m' m
m S_Inc Preg := Preg + 1 DP
//Preg:=Preg+1
a
m S_Inc Preg_clr = 0
(b) Preg_ld = 1 a

Digital Design 2e
Controller
Copyright 2010 15
Frank Vahid (d) 32
P
RTL Design Process

Digital Design 2e
Copyright 2010 16
Frank Vahid
Example: Soda Dispenser from Earlier
s a
Quick overview example.
More details of each step to come.
tot_ld ld
tot
tot_clr clr
a
8
Inputs: c (bit), a (8 bits), s (8 bits) 8 8
Outputs: d (bit) // '1' dispenses soda
Local storage: tot (8 bits) 8-bit
tot_lt_s 8-bit
c < adder

Datapath 8
Init Wait
tot:=tot+a
Step 2A
d:='0' c'*(tot<s) s a
c*(tot<s)
tot:=0 8 8
Disp
c
SodaDispenser d:='1' tot_ld
d a
tot_clr
Step 1 Controller Datapath
tot_lt_s
Digital Design 2e
Copyright 2010
Step 2B 17
Frank Vahid
Example: Soda Dispenser
Quick overview example. s
8
a
8
More details of each step to come.
c
tot_ld
d
tot_clr
Inputs: c (bit), a (8 bits), s (8 bits)
Controller Datapath
Outputs: d (bit) // '1' dispenses soda tot_lt_s
Local storage: tot (8 bits)
Step 2B
c
Inputs: c, tot_lt_s (bit)
Init Wait Outputs: d, tot_ld, tot_clr (bit)
tot:=tot+a c tot_ld
c
d:='0' c'*(tot<s) d
Add tot_clr
c*(tot<s)
tot:=0 Init Wait
tot_ld=1 tot_lt_s
Disp
d=0 c' *
tot_lt_s
c*tot_lt_s
tot_clr=1
SodaDispenser d:='1'
Disp

Step 1 Controller d=1

Digital Design 2e
Copyright 2010 Step 2C 18
Frank Vahid
Example: Soda Dispenser
Quick overview example.
Inputs: c, tot_lt_s (bit)
More details of each step to come. Outputs: d, tot_ld , tot_clr (bit)
c tot_ld
c
Add tot_clr
d
tot_lt_s

tot_clr
tot_ld
Init Wait
tot_ld=1 tot_lt_s
c*tot_lt_s
s1 s0 c n1 n0 d c *
d=0 tot_lt_s
0 0 0 0 0 1 0 0 1 tot_clr=1
0 0 0 1 0 1 0 0 1 Disp
Init

0 0 1 0 0 1 0 0 1
Controller d=1
0 0 1 1 0 1 0 0 1
0 1 0 0 1 1 0 0 0
0 1 0 1 0 1 0 0 0 Step 2C
Wait

0 1 1 0 1 0 0 0 0
0 1 1 1 1 0 0 0 0 Use controller design process
1 0 0 0 0 1 0 1 0
Add

(Ch3) to complete the design


1 1 0 0 0 0 1 0 0
Disp

Digital Design 2e
Copyright 2010 19
Frank Vahid
RTL Design ProcessStep 2A: Create a datapath
Sub-steps
HLSM data inputs/outputs Datapath inputs/outputs.
HLSM local storage item Instantiated register
"Instantiate": Add new component ("instance") to design
Each HLSM state action and transition condition data computation
Datapath components and connections
Also instantiate multiplexors as needed
Need component library from which to choose

clr I A B A B I I1 I0
ld reg add cmp shift<L/R> mux2x1
Q S lt eq gt Q s0 Q

clk^ and clr=1: Q=0 S = A+B (unsigned) shiftL1: <<1 s0=0: Q=I0
clk^ and ld=1: Q=I A<B: lt=1 shiftL2: <<2 s0=1: Q=I1
else Q stays same A=B: eq=1 shiftR1: >>1
A>B: gt=1 ...
Digital Design 2e
Copyright 2010 20
Frank Vahid
Step 2A: Create a DatapathSimple Examples
X Y Z X X Y Z X Y Z

k k=0: Preg = Y + Z
Preg = X + Y + Z Preg = Preg + X Preg=X+Y; regQ=Y+Z
k=1: Preg = X + Y
Preg Preg Preg regQ Preg

P P P Q P
(a) (b) (c) (d)
X Y Z X X Y Z X Y Z

DP DP
A B
add1 A B A B A B
S add1 A B A B add1 add2
S add1 add2 S S
X+Y S S

A B 0 clr I I1 I0
add2 1 ld Preg 0 clr I 0 clr I mux2x1
S Q 1 ld Preg 1 ld regQ k s0 Q
X+Y+Z Q Q a

0 clr I 0 clr I
1 ld Preg P P Q 1 ld Preg
Q Q
DP DP

P P
Digital Design 2e
Copyright 2010 21
Frank Vahid
Laser-Based Distance MeasurerStep 2A: Create a
Datapath
DistanceMeasurer Inputs: B (bit), S (bit) Outputs: L (bit), D (16 bits)
Local storage: Dreg, Dctr (16 bits)

B' S'

S0 S1 S2 S3 S4
B S
L := '0' Dctr := 0 L := '1' L := '0' Dreg := Dctr/2
Dreg := 0 Dctr := Dctr+1 // calculate D

1 Datapath
16
a
A B
HLSM data I/O DP I/O Add1: add(16) 16 I
S Shr1: shiftR1(16)
HLSM local storage reg Dreg_clr 16 Q
HLSM state action and Dreg_ld 16

transition condition data Dctr_clr clr I clr I


computation Datapath Dctr_ld ld Dctr: reg(16) ld Dreg: reg(16)
components and connections Q Q
16
D
Digital Design 2e
Copyright 2010 22
Frank Vahid
Laser-Based Distance MeasurerStep 2B: Connecting
the Datapath to a Controller
L
B to laser
from button
Controller
from sensor
Dreg_clr S

Dreg_ld

Dctr_clr Datapath

Dctr_ld
D
to display
16
a
300 MHz Clock

Digital Design 2e
Copyright 2010 23
Frank Vahid
Laser-Based Distance MeasurerStep 2C: Derive the
HLSM
Controller FSM 1
16
Datapath

A B
DistanceMeasurer Inputs: B (bit), S (bit) Outputs: L (bit), D (16 bits) Add1: add(16) 16 I
S Shr1: shiftR1(16)
Local storage: Dreg, Dctr (16 bits) Q
Dreg_clr 16
Dreg_ld 16
B' S'
Dctr_clr clr I clr I
Dctr_ld ld Dctr: reg(16) ld Dreg: reg(16)
S0 S1 S2 S3 S4 Q Q
B S
16
L := '0' Dctr := 0 L := '1' L := '0' Dreg := Dctr/2 D
Dreg := 0 Dctr := Dctr+1 // calculate D

Controller Inputs: B, S Outputs: L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_ld


FSM has same
states, B S
transitions, and
control I/O a
B S
Achieve each S0 S1 S2 S3 S4

HLSM data L=0 L=0 L=1 L=0 L=0


Dreg_clr = 1 Dreg_clr = 0 Dreg_clr = 0 Dreg_clr = 0 Dreg_clr = 0
operation using Dreg_ld = 0 Dreg_ld = 0 Dreg_ld = 0 Dreg_ld = 0 Dreg_ld = 1
datapath control Dctr_clr = 0 Dctr_clr = 1 Dctr_clr = 0 Dctr_clr = 0 Dctr_clr = 0
Dctr_ld = 0 Dctr_ld = 0 Dctr_ld = 0 Dctr_ld = 1 Dctr_ld = 0
signals in FSM (laser off) (clear count) (laser on) (laser off) (load Dreg with Dctr/2)
(clear Dreg) (count up) (stop counting)
Digital Design 2e
Copyright 2010 24
Frank Vahid
Laser-Based Distance MeasurerStep 2C: Derive the
Controller FSM
Controller Inputs: B, S Outputs: L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_ld

B S

B S
S0 S1 S2 S3 S4

L=0 Dctr_clr = 1 L=1 L=0 Dreg_ld = 1


Dreg_clr = 1 (clear count) (laser on) Dctr_ld = 1 Dctr_ld = 0
(laser off) (laser off) (load Dreg with Dctr/2)
(clear Dreg) (count up) (stop counting)

Same FSM, using


convention of Some assignments to 0 still shown, due to
unassigned their importance in understanding
outputs implicitly desired controller behavior
assigned 0
Digital Design 2e
Copyright 2010 25
Frank Vahid
5.4

More RTL Design


Additional datapath components

(signed)
W_d
A B A B A clr W_a
sub mul abs inc upcnt W_e
clk^ and W_e=1:
S P Q Q RF[W_a]= W_d
RF R_e=1:
R_a
R_e R_d = RF[R_a]
S = A-B P = A*B Q = |A| clk^ and clr=1: Q=0 R_d
(signed) (unsigned) (unsigned) clk^ and inc=1: Q=Q+1
else Q stays same

Digital Design 2e
Copyright 2010 26
Frank Vahid
RTL Design Involving Register File or Memory
HLSM array: Ordered list of items
Ex: Local storage: A[4](8-bit) 4 8-bit items
Accessed using notation "A[i]", i is index
A[0] := 9; A[1] := 8; A[2] := 7; A[3] := 22
Array contents now: <9, 8, 7, 22>
X := A[1] will set X to 8
Note: First element's index is 0
Array can be mapped to instantiated register file or memory

Digital Design 2e
Copyright 2010 27
Frank Vahid
ArrayEx Inputs: (none)

Simple Array Example


Outputs: P (11 bits)
Local storage: A[4](11 bits)
Preg (11 bits)
Init1 Preg := 0
A[0] := 9

(A[0] == 8)' a

Init2 A[1] := 12
12 9
11 11
A[0] == 8
I1 I0
A_s Amux
Out1 Preg := A[1] s0 Q

8
(a) A_Wa0 W_d
A_Wa1 W_a
ArrayEx Inputs: A_eq_8 A_We W_e A B
A_Ra0 A
Outputs: A_s, A_Wa0, ... Acmp
A_Ra1 R_a RF[4](11)
lt eq gt
Preg_clr = 1 A_Re R_e
A_s = 0 R_d
Init1 A_Wa1=0, A_Wa1=0
A_We = 1 A_eq_8
(A_eq_8)'
Preg_clr
A_s = 1 clr I
Init2 ld Preg
A_Wa1=0, A_Wa0=1 Preg_ld Q
A_We = 1 DP
A_Ra1=0, A_Ra0=0
A_eq_8 A_Re = 1 (b) 11
P

Out1 Preg_ld = 1
Digital Design 2e Controller
Copyright 2010 28
Frank Vahid (c)
a
RTL Example: Video Compression Sum of Absolute
Only difference: ball moving
Differences
Frame 1 Frame 2 Frame 1 Frame 2

Digitized Digitized Digitized Difference of a


frame 1 frame 2 frame 1 2 from 1

1 Mbyte 1 Mbyte 1 Mbyte 0.01 Mbyte


(a) (b)
Just send
Video is a series of frames (e.g., 30 per second) difference
Most frames similar to previous frame
Compression idea: just send difference from previous frame
Digital Design 2e
Copyright 2010 29
Frank Vahid
RTL Example: Video Compression Sum of Absolute
Differences
compare Each is a pixel, assume
Frame 1 Frame 2
represented as 1 byte
(actually, a color picture
might have 3 bytes per
pixel, for intensity of
red, green, and blue
components of pixel)
Need to quickly determine whether two frames are similar
enough to just send difference for second frame
Compare corresponding 16x16 blocks
Treat 16x16 block as 256-byte array
Compute the absolute value of the difference of each array item
Sum those differences if above a threshold, send complete frame
for second frame; if below, can use difference method (using
another technique, not described)
Digital Design 2e
Copyright 2010 30
Frank Vahid
Array Example: Video CompressionSum-of-Absolute
Differences
A SAD Inputs: A, B [256](8 bits); go (bit)
RF[256](8) Outputs: sad (32 bits)
Local storage: sum, sadreg (32 bits); i (9 bits)
B sad
RF[256](8) S0 !go
go go
sum := 0
S1
i := 0

a
S2
S0: wait for go

(i<256)
i<256
S1: initialize sum and index sum:=sum+abs(A[i]-B[i])
S3
i := i + 1
S2: check if done ( (i<256) )
S3: add difference to sum, S4 sadreg := sum

increment index
(b)
S4: done, write to output sad_reg
Digital Design 2e
Copyright 2010 31
Frank Vahid
Array Example: Video
Inputs: A, B [256](8 bits); go (bit)
Outputs: sad (32 bits)
Local storage: sum, sadreg (32 bits); i (9 bits)

S0 !go CompressionSum-of-
Absolute Differences
go
sum := 0
S1
i := 0

S2
!(i<256) i<256
sum:=sum+abs(A[i]-B[i])
S3
i := i + 1

S4 sadreg := sum

go AB_rd AB_addr A_data B_data

i_lt_256 A
lt
S0 go cmp B 8 8
256 9 a
go i_inc
A B
S1
sum=0 sum_clr=1
i_clr
i
i=0 i_clr=1
8
S2 sum_ld
(i<256) (i_lt_256)

i<256 i_lt_256 sum 32 abs


sum_clr
S3 sum=sum+abs(A[i]-B[i])
sum_ld=1; AB_rd=1 32 32 8
i=i+1 i_inc=1 sadreg_ld
S4 sad_reg = sum
sadreg_clr
sadreg +
sadreg_ld=1
Controller Datapath 32
Digital Design 2e
Copyright 2010 sad 32
Frank Vahid
Circuit vs. Microprocessor
Circuit: Two states (S2 & S3) for each i, 256 is 512 clock cycles
Microprocessor: Loop (for i = 1 to 256), but for each i, must move
memory to local registers, subtract, compute absolute value, add to
sum, increment i say 6 cycles per array item 256*6 = 1536 cycles
Circuit is about 3 times (300%) faster (assuming equal cycle lengths)
Later, well see how to build SAD circuit that is much faster

(i<256)
S2
i<256
sum:=sum+abs(A[i]-B[i])
S3
i:=i+1

Digital Design 2e
Copyright 2010 33
Frank Vahid
Common RTL Design Pitfall Involving Storage Updates
Questions Local storage: R, Q (8 bits)
Value of Q after state A?
R<100 C
Final state is C or D?
A B
Answers (R<100)'

Q is NOT 99 after state A R:=99 R:=R+1 D


Q:=R
Q is 99 in state B, so final state is C
Storage update actions in state R<100
occur simultaneously on next clock clk A B C
edge 99 100
Thus, order actions are written is R ? 99 100 a

irrelevant
A's actions same if: Q ? ? ?
Q:=R R:=99 or
R:=99 Q:=R
Digital Design 2e
Copyright 2010 34
Frank Vahid
Common RTL Design Pitfall Involving Storage Updates

New HLSM Local storage : R, Q (8 bits)


using extra R<100 C
state so read of
A B B2 (R<100)'
R occurs after
write of R R:=99 R:=R+1 D
Q:=R Q:=R

R<100 (R<100)'

clk A B B2 D
99 100
R ? 99 100 100

Q ? ? 99 99

Digital Design 2e
Copyright 2010 35
Frank Vahid
RTL Design Involving a Timer
Commonly need explicit time intervals L
Ex: Repeatedly blink LED on 1 second, off 1 second
Pre-instantiate timer that HLSM can then use

BlinkLed Timer: T Outputs: L (bit)


BlinkLed
T_M 32
T_ld M T_Q' T_Q'
load
T_Q T_Q
enable 32-bit
T_en 1-microsec Init Off On
Q timer T L:='0' L:='0' L:='1' a
T:=1000000 T_en:='1' T_en:='1'
T_Q T_en:='0'
L
a (a) (b)
Pre-instantiated timer HLSM making use of timer
Digital Design 2e
Copyright 2010 36
Frank Vahid
Button Debouncing
Press button
Ideally, output changes to 1 button

Actually, output bounces B B


Due to mechanical reasons 0 1
Like ball bouncing when dropped to
Ideal: B
floor
Digital circuit can convert actual Actual: B
bounce
signal closer to ideal signal
ButtonDebouncer Inputs: Bin (bit) Outputs: Bout (bit)
Timer: T
Bin' T_Q' Bin
Bin T_Q Bin'
Init WaitBin Wait20 WhileBin
Bout :='0' Bout:='0' Bout:='1' Bout:='1'
T:=20000 T_en:='0' T_en:='1' T_en:='0'
Digital Design 2e T_en:='0'
Copyright 2010 37
Frank Vahid
a
Data Dominated RTL Design Example
Data dominated design: Extensive DP,
simple controller
Control dominated design: Complex
controller, simple DP
Example: Filter
Converts digital input stream to new X Y
digital output stream 12 digital filter 12
clk
Ex: Remove noise
180, 180, 181, 180, 240, 180, 181
240 is probably noise, filter might replace
by 181
Simple filter: Output average of last N
values
Small N: less filtering
Large N: more filtering, but less sharp
output

Digital Design 2e
Copyright 2010 38
Frank Vahid
Data Dominated RTL Design Example: FIR Filter
FIR filter X Y

Finite Impulse Response 12 digital filter 12

Simply a configurable weighted clk

sum of past input values


y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2) y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2)
Above known as 3 tap
Inputs: X (12 bits) Outputs: Y (12 bits)
Tens of taps more common Local storage: xt0, xt1, xt2, c0, c1, c2 (12 bits);
Very general filter User sets Yreg (12 bits)
the constants (c0, c1, c2) to
define specific filter Init FC

RTL design Yreg := 0


xt0 := 0
Yreg :=
c0*xt0 +
Step 1: Create HLSM xt1 := 0
xt2 := 0
c1*xt1 +
c2*xt2
Very simple states/transitions c0 := 3 xt0 := X
c1 := 2 xt1 := xt0
FIR filter c2 := 2 xt2 := xt1
Digital Design 2e
Copyright 2010 Assume constants set to 3, 2, and 2 39
Frank Vahid
FIR Inputs: X (12 bits) Outputs: Y (12 bits)
Step 2A: Create datapath
Filter
Local storage: xt0, xt1, xt2, c0, c1, c2 (12 bits);
Yreg (12 bits)

Init FC
Step 2B: Connect Ctrlr/DP (as
Yreg := 0 Yreg :=
earlier examples)
xt0 := 0 c0*xt0 +
xt1 := 0 c1*xt1 + Step 2C: Derive FSM
xt2 := 0 c2*xt2
c0 := 3
c1 := 2
xt0 := X
xt1 := xt0
Set clr and ld lines appropriately
FIR filter c2 := 2 xt2 := xt1

3 2 2

c0_ld c1_ld c2_ld


xt0_clr
c0 c1 c2

...
...

xt0_ld
xt0 xt1 xt2
X
12
clk
x(t)
* x(t-1)
* x(t-2)
*

Yreg_clr
+ + Yreg_ld
Y
Yreg
12
Datapath for 3-tap FIR filter

Digital Design 2e
Copyright 2010 40
Frank Vahid
Circuit vs. Microprocessor y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2)

Comparing the FIR circuit to microprocessor instructions


Microprocessor
100-tap filter: 100 multiplications, 100 additions. Say 2 instructions
per multiplication, 2 per addition. Say 10 ns per instruction.
(100*2 + 100*2)*10 = 4000 ns
Circuit
Assume adder has 2 ns delay, multiplier has 20 ns delay
Longest path goes through one multiplier and two adders
20 + 2 + 2 = 24 ns delay
100-tap filter, following design on previous slide, would have about a
34 ns delay: 1 multiplier and 7 adders on longest path
Circuit is more than 100 times faster (4000/34). Wow.

Digital Design 2e
Copyright 2010 41
Frank Vahid
5.5

Determining Clock Frequency


Designers of digital circuits
often want fastest
performance clk a b
Means want high clock
frequency
Frequency limited by longest
register-to-register delay
2 ns +
delay
Known as critical path
If clock is any faster, incorrect
data may be stored into register c
Longest path on right is 2 ns
Ignoring wire delays, and
register setup and hold times,
for simplicity

Digital Design 2e
Copyright 2010 42
Frank Vahid
Critical Path
Example shows four paths
a to c through +: 2 ns
a to d through + and *: 7 ns a b
b to d through + and *: 7 ns
b to d through *: 5 ns
Longest path is thus 7 ns
2 ns
delay
+ * 5 ns
delay
Fastest frequency

2 ns
2 ns

7 ns
7 ns
Max
1 / 7 ns = 142 MHz (2,7,7,5)
= 7 ns c d
a

Digital Design 2e
Copyright 2010 43
Frank Vahid
Critical Path Considering Wire Delays
Real wires have delay too
Must include in critical path
Example shows two paths clk a b
Each is 0.5 + 2 + 0.5 = 3 ns
0.5 ns
Trend 0.5 ns
1980s/1990s: Wire delays were tiny
compared to logic delays + 2 ns
But wire delays not shrinking as fast as a
logic delays 0.5 ns

3 ns

3 ns
Wire delays may even be greater than
logic delays! c
Must also consider register setup and
hold times, also add to path
Then add some time to the computed
path, just to be safe
e.g., if path is 3 ns, say 4 ns instead
Digital Design 2e
Copyright 2010 44
Frank Vahid
A Circuit May Have Numerous Paths
Paths can exist s a

In the datapath Combinational logic 8 8


d
In the controller
Between the tot_ld
ld
controller and t ot_clr tot
c clr
datapath
(c) 8
May be tot_lt_s
n1
hundreds or
thousands of n0
8-bit 8-bit
< adder
paths tot_lt_s 8

Timing analysis Datapath a


s1 s0
tools that evaluate (b) (a)
all possible paths clk State register

automatically very
helpful
Digital Design 2e
Copyright 2010 45
Frank Vahid
5.6

Behavioral Level Design: C to Gates


Inputs : A, B [256](8 bits); go (bit)
Outputs : sad (32 bits) C code
Local storage : sum, sadreg (32 bits); i (9 bits)
int SAD (byte A[256], byte B[256]) // not quite C syntax
S0 !go {
uint sum; short uint I;
go
sum := 0 sum = 0;
S1 i := 0 i = 0;
while (i < 256) {
S2 sum = sum + abs(A[i] B[i]);
(i<256)

i<256 i = i + 1;
S3 sum:=sum+abs(A[i]-B[i]) }
i := i + 1 return sum;
}
a
S4 sadreg := sum

Earlier sum-of-absolute-differences example


Started with high-level state machine
C code is an even better starting point -- easier to understand
Digital Design 2e
Copyright 2010 46
Frank Vahid
Converting from C to High-Level State Machine
Convert each C construct to
equivalent states and
transitions
Assignment statement
target := a
Becomes one state with target = expression;
expression
assignment
If-then statement
Becomes state with condition cond
check, transitioning to then cond
if (cond) {
statements if condition true, // then stmts (then stmts) a

otherwise to ending state }

then statements would also (end)


be converted to states

Digital Design 2e
Copyright 2010 47
Frank Vahid
Converting from C to High-Level State Machine
If-then-else
cond
Becomes state with condition if (cond) { cond
check, transitioning to then // then stmts
(then stmts) (else stmts)
statements if condition true, or }
a
else {
to else statements if condition // else stmts (end)
false }

While loop statement cond

Becomes state with condition while (cond) {


cond
// while stmts (while stmts)
check, transitioning to while a
}
loops statements if true, then
transitioning back to condition
(end)
check
Digital Design 2e
Copyright 2010 48
Frank Vahid
Simple Example of Converting from C to High-
Level State Machine
Inputs: uint X, Y
Outputs: uint Max (X>Y) (X>Y)

X>Y X>Y
if (X > Y) {
Max = X; (then stmts) (else stmts) Max:=X Max:=Y
}
else {
Max = Y;
(end) (end)
}
a a

(a) (b) (c)

Simple example: Computing the maximum of two numbers


Convert if-then-else statement to states (b)
Then convert assignment statements to states (c)
Digital Design 2e
Copyright 2010 49
Frank Vahid
Example: SAD C Inputs: byte A[256],B[256]

code to HLSM
(go')'
bit go;
Output: int sad
go' go' go go' go
main()
{
uint sum; short uint i; sum:=0 sum:=0
Convert each construct while (1) { i:=0

to states while (!go); i=0


(d)
sum = 0;
Simplify, e.g., merge i = 0;
(b) (c)
states while (i < 256) {
sum = sum + abs(A[i] B[i]);
RTL design process to i = i + 1;
}
convert to circuit } sad = sum;
go' go go' go
Can thus convert C to }
(a)

circuit using sum:=0


i:=0
sum:=0
i:=0
straightforward process go' go
(i<256)' (i<256)'
Actually, subset of C sum:=0
i:=0 i<256 i<256
(not all C constructs
sum:=sum sum:=sum
easily convertible) (i<256)' + abs...
i := i + 1
+ abs...
i := i + 1
Can use language i<256
while stmts sadreg :=
other than C sum

(g)

Digital Design 2e a
sadreg :=
Copyright 2010 sum 50
Frank Vahid (e)
(f)
5.7

Memory Components
RTL design instantiates
datapath components to
create datapath, controlled
by a controller
Some components are used

M words
outside the controller and DP
MxN memory
M words, N bits wide each
Several varieties of memory,
N-bits
which we now introduce wide each
MN memory

Digital Design 2e
Copyright 2010 51
Frank Vahid
Random Access Memory (RAM)
RAM Readable and writable memory 32 32
W_data R_data
Random access memory 4 4
Strange nameCreated several decades ago to W_addr R_addr
contrast with sequentially-accessed storage like W_en R_en
tape drives 1632
register file
Logically same as register fileMemory with
address inputs, data inputs/outputs, and control Register file from Chpt. 4
RAM usually one port; RF usually two or more
RAM vs. RF
32
RAM typically larger than about 512 or 1024 words data
10
RAM typically stores bits using a bit storage addr
approach that is more efficient than a flip-flop 1024 32
rw RAM
RAM typically implemented on a chip in a square
rather than rectangular shapekeeps longest en
wires (hence delay) short
RAM block symbol

Digital Design 2e
Copyright 2010 52
Frank Vahid
RAM Internal Structure
Let A = log2M wdata(N-1) wdata(N-2) wdata0
32
data
10 word bit storage
addr block
1024x32 enable
rw RAM d0 (aka cell)
en
addr0 a0 word
addr1 a1 AxM
d1
decoder
addr(A-1) a(A-1) data cell
word word
e d(M-1) enable enable
clk
Combining rd and wr rw data
en
data lines rw to all cells
wdata

rdata0

wdata0
(N-1)
rdata

(N-1)

rdata(N-1) rdata(N-2) rdata0 RAM cell


rw

data(N-1) data0 Similar internal structure as register file


Decoder enables appropriate word based on address inputs
rw controls whether cell is written or read
Digital Design 2e
rd and wr data lines typically combined
Copyright 2010
Frank Vahid
Lets see whats inside each RAM cell 53
Static RAM (SRAM) SRAM cell
data data
cell
32 d d
data
10
addr
1024x32 a
rw RAM
en
word 0
enable

SRAM cell
data data
Static RAM cell 1 0
6 transistors (recall inverter is 2 transistors) d
a
Writing this cell 1 0
word enable input comes from decoder
When 0, value d loops around inverters word 1
That loop is where a bit stays stored enable
When 1, the data bit value enters the loop
data data
data is the bit to be stored in this cell cell
data enters on other side d d
1 0 a
Example shows a 1 being written into cell

word 0
Digital Design 2e
enable
Copyright 2010 54
Frank Vahid
Static RAM (SRAM)
32
data
10
addr
1024x32
rw RAM
en

SRAM cell
Static RAM cell
data data
Reading this cell 1 1
Somewhat trickier d
When rw set to read, the RAM logic sets
both data and data to 1 1 0

The stored bit d will pull either the left line or a

the right bit down slightly below 1 1 1 <1


word
Sense amplifiers detect which side is enable
slightly pulled down To sense amplifiers

The electrical description of SRAM is really


beyond our scope just general idea here,
mainly to contrast with DRAM...
Digital Design 2e
Copyright 2010 55
Frank Vahid
Dynamic RAM (DRAM)
32
data
10
addr
1024x32
rw RAM
en

DRAM cell
Dynamic RAM cell data

cell
1 transistor (rather than 6)
word
Relies on large capacitor to store bit enable
d
capacitor
Write: Transistor conducts, data voltage slowly
level gets stored on top plate of capacitor discharging

Read: Just look at value of d (a)


Problem: Capacitor discharges over time
data
Must refresh regularly, by reading d and
then writing it right back enable
discharges
d
(b)
Digital Design 2e
Copyright 2010 56
Frank Vahid
Comparing Memory Types
Register file MxN Memory
Fastest implemented as a:

But biggest size register


file
SRAM
Fast SRAM
More compact than register file DRAM
DRAM
Slowest
And refreshing takes time
Size comparison for same
But very compact
number of bits (not to scale)
Use register file for small items,
SRAM for large items, and DRAM
for huge items
Note: DRAMs big capacitor requires
a special chip design process, so
DRAM is often a separate chip
Digital Design 2e
Copyright 2010 57
Frank Vahid
Reading and Writing a RAM
clk clk
1 2 3
addr 9 13 9 addr valid setup
time
data 500 999 Z 500 data valid hold Z 500
time
rw 1 means write setup
rw
time
en access
RAM[9] RAM[13] time
now equals 500 now equals 999
Writing (b)
Put address on addr lines, data on data lines, set rw=1, en=1
Reading
Set addr and en lines, but put nothing (Z) on data lines, set rw=0
Data will appear on data lines
Dont forget to obey setup and hold times
In short keep inputs stable before and after a clock edge
Digital Design 2e
Copyright 2010 58
Frank Vahid
RAM Example: Digital Sound Recorder
4096x16
RAM

addr
data

en
rw
wire 16
analog-to- digital-to-
digital 12 analog
ad_buf Ra RrwRen wire
microphone converter converter
ad_ld processor da_ld

speaker
Behavior
Record: Digitize sound, store as series of 4096 12-bit digital values in RAM
Well use a 4096x16 RAM (12-bit wide RAM not common)
Play back later
Common behavior in telephone answering machine, toys, voice recorders
To record, processor should read a-to-d, store read values into
successive RAM words
To play, processor should read successive RAM words and enable d-to-a
Digital Design 2e
Copyright 2010 59
Frank Vahid
RAM Example: Digital Sound Recorder
4096x16
RTL design of processor RAM

Create HLSM
16
Begin with the record behavior analog-to-
12
digital-to-
digital ad_buf Ra Rw Ren analog
Create local storage a converter converter
ad_ld processor da_ld
Stores current address,
ranges from 0 to 4095 (thus
need 12 bits) Record behavior
Local register: a, Rareg (12 bits)
Create state machine that
a<4095
counts from 0 to 4095 using a S T
For each a a:=0 ad_ld:=1 a
ad_buf:=1
Read analog-to-digital conv. Rareg:=a U
ad_ld:=1, ad_buf:=1 Rrw:=1 a:=a+1
Ren:=1
Write to RAM at address a
(a<4095)
Rareg:=a, Rrw:=1,
Ren:=1
Digital Design 2e
Copyright 2010 60
Frank Vahid
RAM Example: Digital Sound Recorder
4096x16
Now create play behavior RAM data bus
Use local register a again,
create state machine that 16
counts from 0 to 4095 again analog-to-
digital 12
digital-to-
analog
ad_buf Ra Rw Ren
For each a converter converter
ad_ld processor da_ld
Read RAM
Write to digital-to-analog conv.
Note: Must write d-to-a one Play behavior
cycle after reading RAM, when
Local register: a,Rareg (12 bits)
the read data is available on
the data bus a<4095
V W
The record and play state a:=0
a
ad_buf:=0
machines would be parts of a Rareg:=a
larger state machine controlled Rrw=0 X
Ren=1
by signals that determine when da_ld:=1
a:=a+1
to record or play
(a<4095)

Digital Design 2e
Copyright 2010 61
Frank Vahid
Read-Only Memory ROM
Memory that can only be read from, not 32
data
written to 10
addr
Data lines are output only rw
1024 32
RAM
No need for rw input en

Advantages over RAM


Compact: May be smaller RAM block symbol

Nonvolatile: Saves bits even if power supply


is turned off 32
Speed: May be faster (especially than data
DRAM) 10 1024x32
addr
ROM
Low power: Doesnt need power supply to
save bits, so can extend battery life en

Choose ROM over RAM if stored data wont ROM block symbol
change (or wont change often)
For example, a table of Celsius to Fahrenheit
conversions in a digital thermometer

Digital Design 2e
Copyright 2010 62
Frank Vahid
Read-Only Memory ROM
32
data
10 1024x32
addr Let A = log2M
ROM
en bit storage
word
enable block
ROM block symbol d0 (aka cell)
addr0 a0 word
addr1 a1 AxM
d1
decoder
data
addr
addr(A-1) a(A-1)
word word
e d(M-1) enable enable
clk
en data

rdata(N-1) rdata(N-2) rdata0 ROM cell

Internal logical structure similar to RAM, without the data


input lines

Digital Design 2e
Copyright 2010 63
Frank Vahid
ROM Types
If a ROM can only be read, how
are the stored bits stored in the
first place?
Storing bits in a ROM known as
programming
Several methods
Mask-programmed ROM 1 data line 0 data line

Bits are hardwired as 0s or 1s cell cell


during chip manufacturing word
2-bit word on right stores 10 enable
word enable (from decoder) simply
passes the hardwired value
through transistor
Notice how compact, and fast, this
memory would be
Digital Design 2e
Copyright 2010 64
Frank Vahid
ROM Types
Fuse-Based Programmable
ROM
Each cell has a fuse
A special device, known as a
programmer, blows certain fuses
(using higher-than-normal voltage)
1 data line 1 data line
Those cells will be read as 0s
(involving some special electronics) cell cell
Cells with unblown fuses will be read word
a

as 1s enable

2-bit word on right stores 10


fuse blown fuse
Also known as One-Time
Programmable (OTP) ROM

Digital Design 2e
Copyright 2010 65
Frank Vahid
ROM Types
Erasable Programmable ROM
(EPROM)
Uses floating-gate transistor in each cell
Special programmer device uses higher-
than-normal voltage to cause electrons to
tunnel into the gate

floating-gate
Electrons become trapped in the gate data line data line

transistor
Only done for cells that should store 0 cell cell
Other cells (without electrons trapped in 1 1
0 a

gate) will be 1 word e- e-


2-bit word on right stores 10 enable
Details beyond our scope just general trapped electrons
idea is necessary here
To erase, shine ultraviolet light onto chip
Gives trapped electrons energy to escape
Requires chip package to have window

Digital Design 2e
Copyright 2010 66
Frank Vahid
ROM Types
Electronically-Erasable Programmable ROM
(EEPROM)
Similar to EPROM
Uses floating-gate transistor, electronic programming to
trap electrons in certain cells
But erasing done electronically, not using UV light
Erasing done one word at a time
Flash memory
Like EEPROM, but all words (or large blocks of
words) can be erased simultaneously 32
data
Became very common starting in late 1990s 10
Both types are in-system programmable addr
en 1024x32
Can be programmed with new stored bits while in the EEPROM
system in which the ROM operates write
Requires bi-directional data lines, and write control input busy
Also need busy output to indicate that erasing is in
progress erasing takes some time
Digital Design 2e
Copyright 2010 67
Frank Vahid
ROM Example: Talking Doll
Hello there! audio
divided into 4096 speaker

Hello there!
4096x16 ROM
samples, stored
in ROM Hello there!

16 a
digital-to-
analog vibration
Ra Ren converter sensor

da_ld
processor

v
Doll plays prerecorded message, triggered by vibration
Message must be stored without power supply Use a ROM, not a RAM,
because ROM is nonvolatile
And because message will never change, may use a mask-programmed ROM or
OTP ROM
Processor should wait for vibration (v=1), then read words 0 to 4095 from
the ROM, writing each to the d-to-a
Digital Design 2e
Copyright 2010 68
Frank Vahid
ROM Example: Talking Doll
Local register: a, Rareg (12 bits)
4096x16 ROM
v a<4095
a:=0 S T

16 Rareg:=a
digital-to- Ren:=1
analog U a

Ra Ren converter v
da_ld:=1
da_ld (a<4095) a:=a+1
processor
v

HLSM
Create state machine that waits for v=1, and then counts from 0 to
4095 using a local storage a
For each a, read ROM, write to digital-to-analog converter

Digital Design 2e
Copyright 2010 69
Frank Vahid
ROM Example: Digital Telephone Answering Machine
Using a Flash Memory
Want to record the outgoing
announcement 4096x16 Flash
Were not home.

erase
When rec=1, record digitized

busy
addr
data

en
rw
sound in locations 0 to 4095
When play=1, play those
stored sounds to digital-to- 16
analog converter analog-to-
digital 12 digital-to-
ad_buf Ra Rrw Ren er bu
What type of memory? converter analog
Should store without power ad_ld processor converter
da_ld
supply ROM, not RAM
Should be in-system rec
programmable EEPROM record play
or Flash, not EPROM, OTP
microphone speaker
ROM, or mask-programmed
ROM
Will always erase entire
memory when
reprogramming Flash
better than EEPROM

Digital Design 2e
Copyright 2010 70
Frank Vahid
ROM Example: Digital Telephone Answering Machine
Using a Flash Memory
HLSM 4096x16 Flash

Once rec=1, begin


erasing flash by setting
16
er=1 analog-to-
digital 12 digital-to-
ad_buf Ra Rrw Ren er bu
Wait for flash to finish converter
ad_ld processor
analog
converter
da_ld
erasing by waiting for
rec
bu=0 record play

Execute loop that sets microphone speaker

local register a from 0 to


4095, reading analog-to- Local register: a, Rareg (13 bits)
bu
digital converter and a<4096 a
writing to flash for each a S T bu U
a:=0 er:=0 ad_ld:=1
er:=1 ad_buf:=1
Rareg:=a V
rec
Rrw:=1
Ren:=1
a:=a+1 (a<4096)

Digital Design 2e
Copyright 2010 71
Frank Vahid
Blurring of Distinction Between ROM and RAM
We said that
RAM is readable and writable ROM Flash RAM
a
EEPROM NVRAM
ROM is read-only
But some ROMs act almost like RAMs
EEPROM and Flash are in-system programmable
Essentially means that writes are slow
Also, number of writes may be limited (perhaps a few million times)
And, some RAMs act almost like ROMs
Non-volatile RAMs: Can save their data without the power supply
One type: Built-in battery, may work for up to 10 years
Another type: Includes ROM backup for RAM controller writes RAM contents to
ROM before turning off
New memory technologies evolving that merge RAM and ROM benefits
e.g., MRAM
Bottom line
Lot of choices available to designer, must find best fit with design goals
Digital Design 2e
Copyright 2010 72
Frank Vahid
5.8

Queues (FIFOs)
A queue is another component
sometimes used during RTL
back front
design
Queue: A list written to at the
back, from read from the front
Like a list of waiting restaurant
customers write items read (and
Writing called a push, reading to the back
of the queue
remove) items
from front of
called a pop the queue

Because first item written into a


queue will be the first item read
out, also called a FIFO (first-in-
first-out)

Digital Design 2e
Copyright 2010 73
Frank Vahid
Queues
7 6 5 4 3 2 1 0
Queue has addresses, and two
pointers: rear and front
Initially both point to 0
Push (write) rf
7 6 5 4 3 2 1 0
Item written to address pointed to
by rear A
A a
rear incremented
Pop (read) r f
Item read from address pointed 7 6 5 4 3 2 1 0
to by front
front incremented B B A a

If front or rear reaches 7, next


r f
(incremented) value should be 0 7 6 5 4 3 2 1 0
(for a queue with addresses 0 to a

7) B A

r f
Digital Design 2e
Copyright 2010 74
Frank Vahid
Queues
Treat memory as a circle 7 6 5 4 3 2 1 0
If front or rear reaches 7, next (incremented)
value should be 0 rather than 8 (for a queue B A
with addresses 0 to 7)
Two conditions of interest r f

Full queue no room for more items 0


In 8-entry queue, means 8 items present
1 7
No further pushes allowed until a pop occurs a

Causes front=rear B
Empty queue no items
f
No pops allowed until a push occurs
2 6
Causes front=rear r
r

Both conditions have front=rear


To detect whether front=rear means full or
empty, need state machine that detects if
previous operation was push or pop, sets full 3 5
or empty output signal (respectively)
4

Digital Design 2e
Copyright 2010 75
Frank Vahid
Queue Implementation
Can use register file for 8x16 register file
item storage wdata 16
wdata rdata
16 rdata

Implement rear and front 3 3


waddr raddr
using up counters wr rd
rear used as register files
write address, front as read wr clr
clr
address inc
inc
Simple controller would rd 3-bit 3-bit

Controller
up counter up counter
set control lines for rear front
reset
pushes and pops, and
also detect full and empty eq
=
full
situations
FSM for controller not empty
8-word 16-bit queue
shown

Digital Design 2e
Copyright 2010 76
Frank Vahid
Common Uses of a Queue
Computer keyboard
Pushes pressed keys onto queue, meanwhile pops and sends to
computer
Digital video recorder
Pushes captured frames, meanwhile pops frames, compresses
them, and stores them
Computer network routers
Pushes incoming packets onto queue, meanwhile pops packets,
processes destination information, and forwards each packet out
over appropriate port

Digital Design 2e
Copyright 2010 77
Frank Vahid
Queue Usage Example
7 6 5 4 3 2 1 0

Example series of pushes Initially empty


and pops queue

rf
Note how rear and front 7 6 5 4 3 2 1 0
pointers move 1. After pushing 3 2 7 5 8 5 9
Note that popping doesnt 9, 5, 8, 5, 7, 2, 3

really remove the data from the r f


7 6 5 4 3 2 1 0
queue, but that data is no
longer accessible 2. After popping 3 2 7 5 8 5 9 data:
9 a

Note how rear (and front) r f


wraps around from address 7 7 6 5 4 3 2 1 0
to 0
3. After pushing 6 6 3 2 7 5 8 5 9
Note: pushing a full queue is
f r
an error 7 6 5 4 3 2 1 0

So is popping an empty queue 4. After pushing 3 6 3 2 7 5 8 5 3 full

rf
Digital Design 2e
5. After pushing 4 ERROR! Pushing a full queue
Copyright 2010 78
Frank Vahid results in unknown state.
5.9

Multiple Processors
Using multiple processors
can ease design from ButtonDebouncer
Keeps distinct behaviors button
Bin Bout
B
L
separate to laser
Laser-based
Ex: Laser-based distance distance
D 16 measurer S
measurer with button
debounce to display from sensor

Use two processors


Ex: Code detector with
button press synchronizers
(BPS) Start
si BPS s
BPS processor for each u
input, plus CodeDetector ri BPS r Door
Red Code
gi g lock
processor Green BPS detector
bi b
Blue BPS
ai a
BPS

Digital Design 2e
Copyright 2010 79
Frank Vahid
Interfacing Multiple Processors
Use signal, register, or other component outside processors
Known as global
Common methods use global...
control signal, data signal, register, register file, queue
Typically all multiple processors and clocked globals use
same clock
Synchronized

Digital Design 2e
Copyright 2010 80
Frank Vahid
Ex: Temperature Statistics with Multiple Processors
16-bit unsigned input T from temperature sensor, 16-bit output A. Sample T
every 1 second. Compute output A every minute, should equal average of most
recent 64 samples.
Single HLSM: Complicated
Instead, two HLSMs (and hence two processors) and shared register file
Tsample HLSM: Store T into successive RF address, once per sec.
Avg HLSM: Compute and output average of all 64 RF words, once per min.
Note that each uses distinct timer

T W_d TempStats
Keeping the T
W_d
sampling and W_a W_a R_a R_a A
a

A
averaging W_e W_e R_e R_e
behaviors Tsample TRF
RF[64](16)
separate leads to R_d
Avg
simple design R_d

Digital Design 2e
Copyright 2010 81
Frank Vahid
Ex: Digital Camera with Mult. Processors and Queue
Read and Compress processors (Ch 1)
Compress may take longer, depends on picture
Use queue, read can push additional pics (up to 8)
Likewise, use queue between Compress and Store

Image sensor
8 8
wdata rdata
Read wr Compress Queue Store
circuit rd Memory
full Queue empty circuit [8](8) circuit
[8](8)
a

Digital Design 2e
Copyright 2010 82
Frank Vahid
5.10

Hierarchy A Key Design Concept


CityA CityD
Hierarchy

Province 2

Province 3
Province 1
CityF
a
Organization with few items at the top, with CityB
each item decomposed into other items CityE
Common example: Country CityC
CityG
1 item at top (the country) Country A

Country item decomposed into


state/province items
Each state/province item decomposed into
city items

Province 2

Province 3
Province 1
Hierarchy helps us manage complexity
To go from transistors to gates, muxes,
decoders, registers, ALUs, controllers,
datapaths, memories, queues, etc. Country A
Imagine trying to comprehend a controller
and datapath at the level of gates Map showing just top two levels
of hierarchy

Digital Design 2e
Copyright 2010 83
Frank Vahid
Hierarchy and Abstraction

Abstraction
Hierarchy often involves not just
grouping items into a new item, but also
associating higher-level behavior with
the new item, known as abstraction a7.. a0 b7.. b0
Ex: 8-bit adder has understandable high-
level behavioradds two 8-bit binary 8-bit adder ci
numbers
co s7.. s0
Frees designer from having to
remember, or even understand, the
lower-level details

Digital Design 2e
Copyright 2010 84
Frank Vahid
Hierarchy and Composing Larger Components
from Smaller Versions
A common task is to compose smaller components 4x1
into a larger one i0 i0 a
Gates: Suppose you have plenty of 3-input AND gates, i1 i1
but need a 9-input AND gate i2 i2 d
Can simple compose the 9-input gate from several 3-input
gates i3 i3
Muxes: Suppose you have 4x1 and 2x1 muxes, but 2x1
s1 s0 i0
need an 8x1 mux
d
s2 selects either top or bottom 4x1
4x1 i1
s1s0 select particular 4x1 input
i4 i0 s0
Implements 8x1 mux 8 data inputs, 3 selects, one output
i5 i1
i6 i2 d
i3
P
ro
vin
c s1 s0
e1
0
s1 s0 s2 1

Digital Design 2e
Copyright 2010 85
Frank Vahid
Hierarchy and Composing Larger Components
from Smaller Versions
Composing memory very common
Making memory words wider
Easy just place memories side-by-side until desired width obtained
Share address/control lines, concatenate data lines
Example: Compose 1024x8 ROMs into 1024x32 ROM
10
addr

addr addr addr addr


1024x8 1024x8 1024x8 1024x8
ROM ROM ROM ROM
en

en en en en
data data data data
8 8 8 8

data(31..0)
10
1024x32
ROM
data
Digital Design 2e 32
Copyright 2010 86
Frank Vahid
Hierarchy and Composing Larger Components
from Smaller Versions
11
a9..a0

addr
Creating memory with more words addr
Put memories on top of one another until the a10 1x2 d0 1024x8
number of desired words is achieved i0 dcd ROM
Use decoder to select among the memories e d1 en data
Can use highest order address input(s) as 8

en
decoder input
Although actually, any address line could be addr
used 1024x8
11
Example: Compose 1024x8 memories into ROM

addr
2048x8 memory 2048x8 en data
ROM

en
a10a9a8 a0 8
data
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 addr 8 a

a 0 0 0 0 0 0 0 0 0 1 0 1024x8
P P
r1 1
ROM
0 1 ro1 1 1 1 1ovin 1 0 en data
a10 just chooses vin a
0 1 1 1 1 1 1 c1 1 1 1
which memory To create memory with more
to access 1 0 0 0 0 0 0 0 0 0 0 words and wider words, can first
1 0 0 0 0 0 0 0 0 0 1 addr
compose to enough words, then
1 0 0 0 0 0 0 0 0 1 0 1024x8
ROM
widen.
Digital Design 2e
Copyright 2010 1 1 1 1 1 1 1 1 1 1 0 en data 87
Frank Vahid 1 1 1 1 1 1 1 1 1 1 1
Chapter Summary
Modern digital design involves creating processor-level components
High-level state machines
RTL design process
1. Capture behavior: Use HLSM
2. Convert to circuit
A. Create datapath B. Connect DP to controller C. Derive controller FSM
More RTL design
More components, arrays, timers, control vs. data dominated
Determining fastest clock frequency
By finding critical path
Behavioral-level design C to gates
By using method to convert C (subset) to high-level state machine
Memory components (RAM, ROM)
Queues
Multiple processors
Hierarchy: A key concept used throughout Chapters 2-5

Digital Design 2e
Copyright 2010 88
Frank Vahid