Digital Design

To my family, Amy, Eric, Kelsi, and Maya;

and to those engineers who apply their skills
to build things that improve the human condition.
VP AND EXECUTIVE PUBLISHER                 BRUCE SPATZ
ASSOCIATE PUBLISHER                        DAN SAYRE
ACQUISITIONS EDITOR AND PRODUCT MANAGER    CATHERINE FIELDS SHULTZ
PROJECT EDITOR                             GLADYS SOTO
SENIOR EDITORIAL ASSISTANT                 DANA KELLOGG
MEDIA EDITOR                               STEVEN CHASEY
SENIOR PRODUCTION EDITOR                   VALERIE A. VARGAS
MARKETING MANAGER                          PHYLLIS CERYS
COVER                                      [...]EL JUNG
COVER DESIGNER                             [...] LESURE
PRODUCTION SERVICES                        INGRAO ASSOCIATES
This book is printed on acid free paper.
Copyright 2007 John Wiley & Sons, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-4470 or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
To order books or for customer service please call 1-800-CALL WILEY (225-5945).
ISBN 978-0-470-04437-7
Printed in the United States of America
10 9 8 7 6 5 4 3 2
Preface
TO STUDENTS ABOUT TO STUDY DIGITAL DESIGN
Digital circuits, which form the basis of general-purpose computers as well as special-purpose devices like cell phones or video game consoles, are dramatically changing the world. Studying digital design not only gives you the confidence that comes with fundamentally understanding how digital circuits work, but also introduces you to an exciting and useful possible career direction. This statement applies regardless of whether your major is Electrical Engineering, Computer Engineering, or even Computer Science (in fact, the need for digital designers with strong computer science skills continues to increase). I hope you find this subject to be as interesting, exciting, and useful as I do.
Throughout this book, I have tried not only to introduce concepts in the most intuitive and easy to learn manner, but I have also tried to show how those concepts can be applied to real-world systems, such as pacemakers, ultrasound machines, printers, automobiles, and cell phones. Young and capable engineering students (including computer science students) sometimes leave their major, claiming they want a job that is more "people oriented." Yet we need those people-oriented students more than ever, as engineering jobs are increasingly people-oriented, in several ways. First, engineers work in tightly-integrated groups involving numerous other engineers (rather than "sitting alone in front of a computer all day" as many students believe). Second, engineers often work directly with customers (such as business people, doctors, lawyers, government officials, etc.), and must therefore be able to connect with those non-engineer customers. Third, and in my opinion most importantly, engineers build things that dramatically impact people's lives. Needed are engineers who combine their enthusiasm, creativity, and innovation with their solid engineering skills to conceive and build new products that improve people's quality of life.
I have included "Designer Profiles" at the end of most chapters. The designers, whose experience levels vary from just a year to several decades, and whose companies range from small to huge, share with you their insights and advice. You will notice how commonly they discuss the people aspects of their jobs. You may also notice their enthusiasm and passion for their jobs.
TO INSTRUCTORS OF DIGITAL DESIGN
This book breaks from the 1970s/1980s view of digital design as size-limited design, instead emphasizing the 2000s situation of register-transfer-level (RTL) design. By clearly distinguishing the topic of basic design from optimization, two topics previously inseparably intertwined, the book allows a first course on digital design to reach and emphasize the topic of RTL design. A student exposed to RTL design in a first course will have a more relevant view of the modern digital design field, leading not only to a better appreciation of modern computers and other digital devices, but a more accurate understanding of careers involving digital design. Such an accurate understanding is critical to attract computing majors to careers involving some amount of digital design, and to create a cadre of engineers with the comfort in both "software" and "hardware" necessary in modern embedded computing system design.
The distinguishing of basic design from optimization should not be interpreted as avoiding a bottom-up approach or glossing over important steps; the book takes a concrete bottom-up approach, starting from transistors, and building incrementally up through gates, flip-flops, registers, controllers, datapath components, etc. Rather, the distinguishing enables the student to initially develop a solid understanding of basic design, before considering the more advanced topic of optimization, akin to how a physics book introduces Newton's laws of motion initially assuming frictionless surfaces and no wind resistance. Furthermore, optimization today involves more than just size minimization, instead requiring a broader understanding of tradeoffs among size, performance, and power, and even of tradeoffs among custom digital circuits and microprocessor software. Again, coverage is kept concrete and appropriate to an introductory digital design course.
Nevertheless, the book distinguishes basic design from optimization in a way that cleanly provides an instructor maximum flexibility to introduce optimization at the times and to the extent desired by the instructor. In particular, the optimization chapter's subsections (Chapter 6) each correspond directly to one earlier chapter, such that Section 6.2 can directly follow Chapter 2, Section 6.3 can follow Chapter 3, 6.4 can follow 4, and 6.5 can follow 5.
Several additional features of the book include:
Extensive use of applied examples and figures. After describing a new concept and providing basic examples, the book provides examples that apply the concept to applications recognizable to a student, like a seat belt unfastened warning system, a computerized checkerboard game, a color printer, or a digital video camera. Furthermore, the end of most chapters includes a product profile, intended to give students an even broader view of the applicability of the concepts, and to introduce clever application-specific concepts the students may find very interesting, like the idea of beamforming in an ultrasound machine or of filtering in a cellular phone. The book extensively uses figures to illustrate concepts, containing over 600 figures.
Learning through discovery. The book emphasizes understanding the need for new concepts, which not only helps students learn and remember the concepts, but develops reasoning skills that can apply the concepts to other domains. For example, rather than just defining a carry-lookahead adder, the book shows intuitive but inefficient approaches to building a faster adder, eventually solving the inefficiencies and leading to ("discovering") the carry-lookahead design.
Introduction to FPGAs. The book includes a fully bottom-up introduction to FPGAs, showing students concretely how a circuit can be converted into a bitstream that programs the individual lookup tables, switch matrices, and other programmable components in an FPGA. This concrete introduction eliminates the mystery of the increasingly-common FPGA devices.
HDL coverage flexibility. The book's organization cleanly allows instructors to cover HDLs (hardware description languages) intermixed with the introduction of design concepts, to cover HDLs later, or to not cover HDLs at all. The HDL chapter's subsections (Chapter 9) each correspond to an earlier chapter, such that Section 9.2 can directly follow Chapter 2, 9.3 can follow 3, 9.4 can follow 4, and 9.5 can follow 5. Furthermore, rather than the book choosing just one of the popular languages (VHDL, Verilog, and the relatively new SystemC), the book provides equal coverage of all three of those HDLs. And we use our extensive experience in synthesis with commercial tools to create HDL descriptions well-suited for synthesis, in addition to being suitable for simulation.
Accompanying HDL-introduction books. Instructors wishing to cover HDLs to an even greater extent can utilize one of our HDL-introduction books specifically designed to accompany this textbook, written by the same author as this textbook. Our HDL-introduction books follow the same chapter structure as, and use examples from, this textbook, eliminating the common situation of students struggling to correlate their distinct, and sometimes contradicting, HDL book and digital design book subjects. Our HDL-introduction books discuss language, simulation, and testing concepts in more depth, providing numerous HDL examples, and are also designed to be usable by themselves for HDL learning or reference. The books emphasize use of the language for real design, clearly distinguishing HDL use for synthesis from HDL use for testing, and include extensive examples and figures throughout to illustrate concepts. Our HDL-introduction books come with complete PowerPoint slides that use graphics and animations to serve as an easy-to-use tutorial on the HDL.
Author-created graphical animated PowerPoint slides. A rich set of PowerPoint slides is available to instructors. The slides were created by the textbook's author, resulting in consistency of perspective and emphasis between the slides and this book. The slides are designed to be a truly effective teaching tool for the instructor. Most slides are graphics based (avoiding slides consisting of just bulleted lists of text). The slides make use of animation where appropriate to gradually unveil concepts or build up circuits, yet even animated slides can be printed out and understood. Nearly every figure, concept, and example from this book is included in the set of almost 500 slides, from which instructors can choose.
Complete solutions manual. Instructors may obtain a complete solutions manual (about 200 pages) containing solutions to every end-of-chapter exercise in this book. The manual extensively utilizes figures to illustrate solutions.
WileyPLUS website. Digital Design is supported by WileyPLUS, a powerful and highly integrated suite of teaching and learning resources designed to bridge the gap between what happens in the classroom and what happens at home. WileyPLUS includes a complete online version of the text, algorithmically generated problems and guided online exercises, video solutions of selected examples, animations of pertinent concepts (both by Professor Ed Doering of Rose-Hulman Institute), a complete solutions manual and author-created animated PowerPoint slides, plus [...] and homework management tools, in one website.
To learn how to access these resources, go to the Book Companion Site at www.wiley.com/college/vahid, or www.ddvahid.com.
HOW TO USE THIS BOOK
This book was designed to allow flexibility to choose among the most common approaches to material coverage. We describe several approaches below.
RTL-focused approach
An RTL-focused approach would simply cover the first 6 chapters in order:
1. Introduction (Chapter 1)
2. Combinational logic design (Chapter 2)
3. Sequential logic design (Chapter 3)
4. Combinational and sequential component design (Chapter 4)
5. RTL design (Chapter 5)
6. Optimizations and Tradeoffs (Chapter 6), to the extent desired
7. Physical implementation (Chapter 7) and/or Processor design (Chapter 8), to the extent desired.
We think this is a great way to order the material, resulting in students doing interesting RTL designs in about 7 weeks. HDLs can be introduced at the end if time permits, or left for a second course on digital design (as done at UCR), or covered immediately after each chapter; all three approaches appear to be quite common.
Traditional approach with some reordering
This book can be readily used in a traditional approach that introduces optimization along with basic design, with a slight difference from the traditional approach being the swapping of coverage of combinational components and sequential logic.
1. Introduction (Chapter 1)
2. Combinational logic design (Chapter 2) followed by combinational logic optimization (Section 6.2)
3. Sequential logic design (Chapter 3) followed by sequential logic optimization (Section 6.3)
4. Combinational and sequential component design (Chapter 4) followed by component tradeoffs (Section 6.4)
5. RTL design (Chapter 5) to the extent desired, followed by RTL optimization/tradeoffs (Section 6.5)
6. Physical implementation (Chapter 7) and/or Processor design (Chapter 8), to the extent desired.
This is a very reasonable and effective approach, completing all discussion of one topic (e.g., FSM design as well as optimization) before moving on to the next topic. The reordering from a traditional approach introduces basic sequential design (FSMs and controllers) before combinational components (e.g., adders, comparators, etc.). Such reordering may lead into RTL design more naturally than a traditional approach, following instead an approach of increasing abstraction rather than the traditional approach that separates combinational and sequential design. HDLs can again be introduced at the end, left for another course, or integrated after each chapter. This approach could also be used as an intermediary step when migrating from a fully-traditional approach to an RTL approach. Migrating might involve gradually postponing the Chapter 6 sections; for example, covering Chapters 2 and 3, and then Sections 6.2 and 6.3, before moving on to Chapter 4.
Completely traditional approach
This book could also be used in a completely traditional approach, as follows:
1. Introduction (Chapter 1)
2. Combinational logic design (Chapter 2) followed by combinational logic optimization (Section 6.2)
3. Combinational component design (Sections 4.1, 4.3, 4.4, 4.5, 4.7, 4.8, 4.9) followed by combinational component tradeoffs (Section 6.4 - Adders)
4. Sequential logic design (Chapter 3) followed by sequential logic optimization (Section 6.3)
5. Sequential component design (Chapter 4, Sections 4.2, 4.6, 4.10) followed by sequential component tradeoffs (Section 6.4 - [...])
6. RTL design (Chapter 5) to the extent desired, followed by RTL optimization/tradeoffs (Section 6.5)
7. Physical implementation (Chapter 7) and/or Processor design (Chapter 8), to the extent desired.
This is the most widespread approach during the past two decades, with the addition of RTL design towards the end. Although the emphasized distinction between combinational and sequential design may no longer be relevant in the era of RTL design (where both types of design are intermixed), some people believe that such distinction makes for an easier learning path, which may be true. HDLs can be included at the end, left for another course, or integrated throughout.
ACKNOWLEDGEMENTS
Many people and organizations contributed to this edition of the book.
Staff members at John Wiley and Sons Publishers have supported the book's development, including Catherine Shultz, Gladys Soto, Dana Kellogg, and Kelly Boyle. Bill Zobrist supported my earlier "Embedded System Design" book, motivated me to write the present book, and provided advice throughout development.
Ryan [...] contributed many items, including the appendices, examples and several subsections, the complete exercise solutions manual, fact-checking, extensive proofreading, tremendous assistance during production, help with the slides, plenty of ideas during discussions, and much more!
Roman Lysecky developed numerous examples and exercises, contributed most of the content of the HDL chapter, and co-authored our accompanying HDL-introduction books. Roman and Susan Lysecky provided much proofreading assistance.
Numerous reviewers provided outstanding feedback on various versions of the book. Special thanks go to early adopters, such as Nikil Dutt, Shannon Tauro, J. David Gillanders, Sheldon Tan, Travis Doom, Roman Lysecky, and others, who provided excellent feedback from themselves and from their students.
The importance of the support provided to my research and teaching career by the National Science Foundation cannot be overstated. Additional support from the Semiconductor Research Corporation catalyzed industry collaborations that in turn influenced many of the perspectives in this book.
ABOUT THE COVER
The cover's image of shrinking squares graphically depicts the amazing real-life phenomenon of digital circuits ("computer chips") shrinking in size by one half roughly every 18 months, for several decades now, a phenomenon often referred to as Moore's Law. Such shrinking has enabled incredibly powerful computing circuits to fit in tiny devices, like modern cell phones, medical devices, and portable video games. See pages 34 and 35 for a discussion of Moore's Law.
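To get a feel for how quickly repeated halving adds up, the short Python sketch below computes the size of a circuit relative to its starting size. It is only an illustration: the function name relative_size and the assumption that size halves exactly once every 18 months (1.5 years) are ours, a simplification of the "roughly every 18 months" trend described above.

```python
def relative_size(years, halving_period_years=1.5):
    """Fraction of the original circuit size remaining after `years`.

    Simplifying assumption: size halves exactly once per
    `halving_period_years` (1.5 years = the 18-month figure in the text).
    """
    return 0.5 ** (years / halving_period_years)


# Print how much smaller the same circuit would be after a few time spans.
for years in (1.5, 3, 15):
    shrink_factor = 1 / relative_size(years)
    print(f"after {years} years: about 1/{shrink_factor:,.0f} of the original size")
```

Under this idealized assumption, ten halvings (15 years) already shrink a circuit to about one thousandth of its original size, which is why a steady trend of this kind has such a dramatic cumulative effect.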
ABOUT THE AUTHOR
Frank Vahid is a Professor of Computer Science & Engineering at the University of California, Riverside. He holds Electrical Engineering and Computer Science degrees; has worked/consulted for Hewlett Packard, AMCC, NEC, Motorola, and medical equipment makers; holds 3 U.S. patents; has received several teaching awards; helped set up UCR's Computer Engineering program; has authored two previous textbooks; and has published over 120 papers on digital design topics (automation, architecture, and low-power).
Reviewers and Evaluators
Rehab Abdel-Kader         Georgia Southern University
Otmane Ait Mohamed        Concordia University
Hussain Al-Asaad          University of California, Davis
Rocio Alba-Flores         University of Minnesota, Duluth
Bassem Alhalabi           Florida Atlantic University
Zekeriya Aliyazicioglu    California Polytechnic State University, Pomona
Vishal Anand              SUNY Brockport
Bevan Baas                University of California, Davis
Noni Bohonak              University of South Carolina, Lancaster
Don Bouldin               University of Tennessee
David Bourner             University of Maryland Baltimore County
Elaheh Bozorgzadeh        University of California, Irvine
Frank Candocia            Florida International University
Ralph Carestia            Oregon Institute of Technology
Rajan M. Chandra          California Polytechnic State University, Pomona
Ghulam Chaudhry           University of Missouri, Kansas City
Michael Chelian           California State University, Long Beach
Russell Clark             Saginaw Valley State University
James Conrad              University of North Carolina, Charlotte
Kevan Croteau             Francis Marion University
Sanjoy Das                Kansas State University
James Davis               University of South Carolina
Edward Doering            Rose-Hulman Institute of Technology
Travis Doom               Wright State University
Jim Duckworth             Worcester Polytechnic Institute
Nikil Dutt                University of California, Irvine
Dennis Fairclough         Utah Valley State College
Paul D. Franzon           North Carolina State University
Subra Ganesan             Oakland University
Zane Gastineau            Harding University
J. David Gillanders       Arkansas State University
Clay Gloster              Howard University
Ardian Greca              Georgia Southern University
Eric Hansen               Dartmouth College
Bruce A. Harvey           FAMU-FSU College of Engineering
John P. Hayes             University of [...]
Michael Helm              Texas Tech
William Hoff              Colorado School of Mines
Erh-Wen Hu                William Paterson University of New Jersey
Baback Izadi              SUNY New Paltz
Jeff [...]                University of Alabama
Anura [...]               Colorado State University
Bruce Johnson             University of Nevada, Reno
Richard [...]             Lawrence Technological University
Rajiv Kapadia             Minnesota State University, Mankato
Bahadir Karuv             Fairleigh Dickinson University
Robert Klenke             Virginia Commonwealth University
Clint Kohl                Cedarville University
Hermann Krompholz         Texas Tech University
Timothy Kurzweg           Drexel University
Jumoke [...]              Morgan State University
Jeffrey Lillie            Rochester Institute of Technology
David Livingston          Virginia Military Institute
Hong Man                  Stevens Institute of Technology
Gihan Mandour             Christopher Newport University
Diana Marculescu          Carnegie Mellon University
Miguel Marin              McGill University
Maryam Moussavi           California State University, Long Beach
Ol[...]                   University of Memphis
[...] Nava                University of Texas, El Paso
John Nestor               Lafayette College
Rogelio Palomera Garcia   University of Puerto Rico, Mayaguez
James Peckol              University of Washington
Witold Pedrycz            University of Alberta
Andrew Perry              Springfield College
Denis Popel               Baker University
Tariq Qayyum              California Polytechnic State University, Pomona
Gang Qu                   University of Maryland
Mihaela Radu              Rose-Hulman Institute of Technology
Suresh Rai                Louisiana State University, Baton Rouge
William Reid              Clemson University
Musoke Sendaula           Temple University
Scott Smith               Boise State University
Gary Spivey               George Fox University
Larry Stephens            University of South Carolina
James Stine               Illinois Institute of Technology
Philip Swain              Purdue University
Shannon Tauro             University of California, Irvine
Carlos Tavora             Gonzaga University
Marc Timmerman            Oregon Institute of Technology
Hariharan Vijayaraghavan  University of Kansas
Bin Wang                  Wright State
M. Chris Wernicki         New York Institute of Technology
Shanchieh Yang            Rochester Institute of Technology
Henry Yeh                 California State University, Long Beach
Naeem Zaman               San Joaquin Delta College
Contents
Preface
CHAPTER 1
Introduction
1.1 Digital Systems in the World Around Us
1.2 The World of Digital Systems
1.3 Implementing Digital Systems: Programming Microprocessors versus Designing Digital Circuits
1.4 About this Book
1.5 Exercises
CHAPTER 2
Combinational Logic Design
2.1 Introduction
2.2 Switches
2.3 The CMOS Transistor
2.4 Boolean Logic Gates-Building Blocks for Digital Circuits
2.5 Boolean Algebra
2.6 Representations of Boolean Functions
2.7 Combinational Logic Design Process
2.8 More Gates
2.9 Decoders and Muxes
2.10 Additional Considerations
2.11 Combinational Logic Optimizations and Tradeoffs (See Section 6.2)
2.12 Combinational Logic Description using Hardware Description Languages (See Section 9.2)
2.13 Chapter Summary
2.14 Exercises
CHAPTER 3
Sequential Logic Design-Controllers
3.1 Introduction
3.2 Storing One Bit-Flip-Flops
3.3 Finite-State Machines (FSMs) and Controllers
3.4 Controller Design
3.5 More on Flip-Flops and Controllers
3.6 Sequential Logic Optimizations and Tradeoffs (See Section 6.3)
3.7 Sequential Logic Description using Hardware Description Languages (See Section 9.3)
3.8 Product Profile-Pacemaker
3.9 Chapter Summary
3.10 Exercises
CHAPTER 4
Datapath Components
4.1 Introduction
4.2 Registers
4.3 Adders
4.4 Shifters
4.5 Comparators
4.6 Counters
4.7 Multiplier-Array Style
4.8 Subtractors
4.9 Arithmetic-Logic Units-ALUs
4.10 Register Files
4.11 Datapath Component Tradeoffs (See Section 6.4)
4.12 Datapath Component Description using Hardware Description Languages (See Section 9.4)
4.13 Chapter Summary
4.14 Exercises
CHAPTER 5
Register-Transfer Level (RTL) Design
5.1 Introduction
5.2 RTL Design Method
5.3 RTL Design Examples and Issues
5.4 Determining Clock Frequency
5.5 Behavioral-Level Design: C to Gates (Optional)
5.6 Memory Components
5.7 Queues (FIFOs)
5.8 Hierarchy-A Key Design Concept
5.9 RTL Design Optimizations and Tradeoffs (See Section 6.5)
5.10 RTL Design using Hardware Description Languages (See Section 9.5)
5.11 Product Profile: Cell Phone
5.12 Chapter Summary
5.13 Exercises
CHAPTER 6
Optimizations and Tradeoffs
6.1 Introduction
6.2 Combinational Logic Optimizations and Tradeoffs
6.3 Sequential Logic Optimizations and Tradeoffs
6.4 Datapath Component Tradeoffs
6.5 RTL Design Optimizations and Tradeoffs
6.6 More on Optimizations and Tradeoffs
6.7 Product Profile: Digital Video Player/Recorder
6.8 Chapter Summary
6.9 Exercises
CHAPTER 7
Physical Implementation
7.1 Introduction
7.2 Manufactured IC Technologies
7.3 Programmable IC Technology-FPGA
7.4 Other Technologies
7.5 IC Technology Comparisons
7.6 Product Profile: Giant Video Display
7.7 Chapter Summary
7.8 Exercises
CHAPTER 8
Programmable Processors
8.1 Introduction
8.2 Basic Architecture
8.3 A Three-Instruction Programmable Processor
8.4 A Six-Instruction Programmable Processor
8.5 Example Assembly and Machine Programs
8.6 Further Extensions to the Programmable Processor
8.7 Chapter Summary
8.8 Exercises
CHAPTER 9
Hardware Description Languages
9.1 Introduction
9.2 Combinational Logic Description Using Hardware Description Languages
9.3 Sequential Logic Description Using Hardware Description Languages
9.4 Datapath Component Description Using Hardware Description Languages
9.5 RTL Design Using Hardware Description Languages
9.6 Chapter Summary
9.7 Exercises
APPENDIX A
Boolean Algebras
A.1 Boolean Algebra
A.2 Switching Algebra
A.3 Important Theorems in Boolean Algebra
A.4 Other Examples of Boolean Algebras
A.5 Further Readings
APPENDIX B
Additional Topics in Binary Number Systems
B.1 Introduction
B.2 Real Number Representation
B.3 Fixed Point Arithmetic
B.4 Floating Point Representation
B.5 Exercises
APPENDIX C
Extended RTL Design Example
C.1 Introduction
C.2 Designing the Soda Dispenser Controller
C.3 Understanding the Behavior of the Soda Dispenser Controller and Datapath
Index
1
Introduction
1.1 DIGITAL SYSTEMS IN THE WORLD AROUND US
Meet Arianna. Arianna is a five-year-old girl who lives in California. She's a cheerful, outgoing kid who loves to read, play soccer, dance, and tell jokes that she makes up herself.
One day, Arianna's family was driving home from a soccer game. She was in the middle of excitedly talking about the game when suddenly the van in which she was riding was clipped by a car that had crossed over to the wrong side of the highway. Although the accident wasn't particularly bad, the impact caused a loose item from the rear of the van to project forward inside the van, striking Arianna in the back of the head. She went unconscious.
Arianna was rushed to a hospital. Doctors immediately noticed that her breathing was very weak (a common situation after a severe blow to the head), so they put her onto a ventilator, which is a medical device that assists with breathing. She had sustained brain trauma during the blow to the head, and she remained unconscious for several weeks. All her vital signs were stable, except she continued to require breathing assistance from the ventilator. Patients in such a situation sometimes recover, and sometimes they don't. When they do recover, sometimes that recovery takes many months.
Thanks to the advent of modern portable ventilators, Arianna's parents were given the option of taking her home while they hoped for her recovery, an option they chose. In addition to the remote monitoring of vital signs and the daily at-home visits by a nurse and respiratory therapist, Arianna was surrounded by her parents, brother, sister, cousins, other family, and friends. For the majority of the day, someone was holding her hand, singing to her, whispering in her ear, and encouraging her to recover. Her sister slept nearby. Some studies show that such human interaction can indeed increase the chances of recovery.
And recover she did. One day, several months later, with Arianna's mom sitting at her side, Arianna opened her eyes. Later that day, she was transported back to the hospital. After some time, she was weaned from the ventilator. Then, after a lengthy time of recovery and rehabilitation, Arianna finally went home. Today, six-year-old Arianna shows few signs of the accident that nearly took her life.
What does this story have to do with digital design? Arianna's recovery was aided by a portable ventilator device, which in turn is possible thanks to digital circuits. Over the past three decades, the amount of digital circuitry that can be stored on a single computer chip has increased dramatically, by nearly 100,000 times, believe it or not. Thus, ventilators, along with almost everything else that runs on electricity, can take advantage of incredibly powerful and fast yet inexpensive digital circuits. The ventilator in Arianna's case was the Pulmonetics LTV 1000 ventilator. Whereas a ventilator of the early 1990s might have been the size of a large copy machine and cost perhaps $100,000, the LTV 1000 is not much bigger or heavier than this textbook and costs only a few thousand dollars: small enough, and inexpensive enough, to be carried in medical rescue helicopters and ambulances for life-saving situations, and even to be sent home with a patient. The digital circuits inside continually monitor the patient's breathing, and provide just the right amount of air pressure and volume to the patient. Every breath that the device delivers requires millions of computations for proper delivery, computations carried out by the digital circuits inside.
Figure: Portable ventilator.
Margin note: One indicator of the rate that new inventions are developed is the number of new patents granted: 170,000 per year in the U.S. alone!
Portable ventilators help not only trauma victims, but even more commonly help patients with debilitating diseases, like multiple sclerosis, to gain mobility. Such people can today move about in a wheelchair, and hence do things like attend school, visit museums, and take part in a family picnic, experiencing a far better quality of life than was feasible just a decade ago when those people would have been confined to a bed connected to a large, heavy, expensive ventilator. For example, the young girl pictured on the left will likely require a ventilator for the rest of her life, but she will be able to move about in her wheelchair quite freely, rather than being mostly confined to her home.
The LTV 1000 ventilator described above was conceived and designed by a small group of people, pictured on the left, who sought to build a portable and reliable ventilator in order to help people like Arianna and thousands of others like her (as well as to make some good money doing so!). Those designers probably started off like you, reading textbooks and taking courses on digital design, programming, electronics, and/or other subjects.
Photo courtesy of Pulmonetics
The ventilalor is just one of lit erally thol/sands of useful device that have Come
about and continue to be created thanks to the era of digital circuits. If you top and think
about how many devices in the 1V0rid around you rely on or are made po sible becau e of
di git al cirCuits, you may be quite surpri sed. A few such devices include:
Antilock brakes, airbags, autofocus cameras, automatic teller machines, aircraft
controllers and navigators, camcorders, cash registers, cell phones, computer networks,
credit card readers, cruise controllers, defibrillators, digital cameras, DVD players,
electric card readers, electronic games, electronic pianos, fax machines, fingerprint
identifiers, hearing aids, home security systems, modems, pacemakers, pagers, personal
computers, personal digital assistants, photocopiers, portable music players, robotic
arms, scanners, thermostat controllers, TV set-top boxes, ventilators, video game
consoles: the list goes on.
Those devices were created by tens of thousands of designers, including computer
scientists, computer engineers, electrical engineers, mechanical engineers, and others,
working together with scientists, doctors, business people, teachers, etc. One thing that
seems clear is that new devices will continue to be invented for the foreseeable future:
devices that in another decade will be hundreds of times smaller, cheaper, and more
powerful than today's devices, enabling new applications that we don't even dream of.
Already, we are seeing amazing new applications that seem futuristic even though they
exist today, like tiny digital-circuit-controlled medicine dispensers under the skin,
voice-controlled cell phones and appliances, robotic self-guiding household vacuum
cleaners, laser-guided automobile cruise control, and more. What's not clear is what new
and exciting applications will be developed in the future, or who those devices will
benefit. Future designers, like yourself perhaps, will help determine that.
1.2 THE WORLD OF DIGITAL SYSTEMS
Digital versus Analog
A digital signal is a signal that at any time can have one of a finite set of possible
values, and is also known as a discrete signal. In contrast, an analog signal can have
one of an infinite number of possible values, and is also known as a continuous signal.
A signal is just some physical phenomenon that has a unique value at every instant of
time. An everyday example of an analog signal is the temperature outside, because
physical temperature is a continuous value: the temperature may be 92.356666...
degrees. An everyday example of a digital signal is the number of fingers you hold up,
because the value must be either 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, a finite set of
values. In fact, the term "digital" comes from the Latin word for "digit" (digitus),
meaning finger.
In computing systems, the most common digital signals are those that can have one of
only two possible values, like on or off (often represented as 1 or 0). Such a
two-valued representation is known as a binary representation. A digital system is a
system that takes digital inputs and generates digital outputs. A digital circuit is a
connection of digital components that together comprise a digital system. In this
textbook, the term digital will refer to systems with binary-valued signals. A single
binary signal is known as a binary digit, or bit for short. Digital electronics became
extremely popular in the mid-1900s after the invention of the transistor, an electric
switch that can be turned on or off using another electric signal. We'll describe
transistors further in the next chapter.
Digital circuits are the basis for computers
The most well-known use of digital circuits in the world around us is probably to
build the microprocessors that serve as the brain of general-purpose computers, like
the personal computer or laptop computer that you might have at home. General-purpose
computers are also used as servers, which operate behind the scenes to implement
banking, airline reservation, web search, payroll, and similar such systems.
General-purpose computers take digital input data, such as letters and numbers
received from files or keyboards, and output new digital data, such as letters and
numbers stored in files or displayed on a monitor. Learning about digital design is
therefore useful in understanding how computers work under the hood, and hence has
been required learning for most computing and electrical engineering majors for
decades. Based on material in upcoming chapters, we'll design a simple computer in
Chapter 8.
A general-purpose computer
Digital circuits are the basis for much more
Increasingly, digital circuits are being used for much more than implementing
general-purpose computers. More and more new applications convert analog signals
to digital ones, and run those digital signals through customized digital circuits, to
achieve numerous benefits. Such applications include cell phones, automobile
engine controllers, TV set-top boxes, musical instruments, digital cameras and
camcorders, video game consoles, and so on. Digital circuits found inside applications
other than general-purpose computers are often called embedded systems, because
those digital systems are embedded inside another electronic device.
Embedded systems
The world is mostly analog, and therefore many applications were previously
implemented with analog circuits. But many implementations have changed or are
changing over to digital implementations. To understand why, we might first notice
that although the world is mostly analog, humans often obtain advantages when
converting analog signals to digital signals before "processing" that information.
For example, a car horn is actually an analog signal: the volume can take on infinite
possible values, and the volume varies over time due to variations in the battery
strength, temperature, etc. But we humans neglect those variations, and we "digitize"
the sound we hear into one of two values: the car horn is "off," or the car horn is
"on" (get out of the way!).
Converting analog phenomena to digital, for use with digital circuits, can also yield
many advantages. Let's illustrate this point by considering one example, audio
recording, in some detail. Audio is clearly an analog signal, with infinite possible
frequencies and volumes. Consider recording an audio signal, like music, through a
microphone, so that the music can later be played over speakers in an electronic
stereo system. One type of microphone, a dynamic microphone, works based on a
principle of electromagnetism: moving a magnet near a wire causes changing current
(and hence voltage) in the wire. The more the magnet moves, the higher the voltage on
the wire. So a microphone has a small membrane attached to a magnet near a wire; when
sound hits the membrane, the magnet moves, causing current in the wire. Likewise, a
speaker works on the same principle in reverse: a changing current in a wire will
cause a nearby magnet to move, which if attached to a membrane will create sound. (If
you get a chance, open up an old speaker; you'll find a strong magnet inside.) If the
microphone is attached directly to the speaker (through an amplifier that strengthens
the microphone's output current), then no digitization is required. But what if we
want to save the sound on some sort of media, so we can record a song now and play the
song back later? We can record sound using analog methods or digital methods, but
digital methods have many advantages.
(Figure: sound waves move the membrane, which moves the magnet, which creates current
in the nearby wire.)
One advantage of digital methods is lack of deterioration in quality over time. When
I was growing up, the audio cassette tape, an analog method, was the most common
method for recording songs. Audio tapes contain huge numbers of magnetic particles that
can be moved to particular orientations using a magnet and that hold that orientation
even after the magnet is removed. Thus, using magnetism, we could change the tape's
surface, some parts up, some higher, some down, etc. This is similar to how you can
spike your hair, some up, some sideways, some down, using hair gel. The possible
orientations of the tape's particles, and your hair, are infinite, so the tape is
definitely analog. To record onto a tape, we pass the tape under a "head" that
generates a magnetic field based on the electric current over the wire coming from a
microphone. The tape's particles would thus be moved to particular orientations. To
play a recorded song back, we would pass the tape under the head again, but this time
the head operates in reverse, generating current over a wire based on the changing
magnetic field of the tape, and that current then gets amplified and sent to the
speakers.
Figure 1.1 Converting an analog signal to a digital signal (top) and vice versa (bottom).
Notice some quality loss in the reproduced signal.
(Figure labels: analog signal on wire; samples 00 01 10 10 11 11 11 01 10 10 00 over
time; digitized signal 0001101011111101101000; read from tape, CD, etc.; analog signal
reproduced from digitized signal; speaker.)
A problem with audio tape is that the orientations of the particles on the tape's
surface change over time, just like a spiked hairdo in the morning eventually flattens
throughout the day. Thus, audio tape quality deteriorates over time. Such deterioration
is a problem with many analog systems.
Digitizing the audio can reduce such deterioration. Digitized audio works as shown
in Figure 1.1. The figure shows an analog signal on a wire during a period of time. We
sample that signal at particular time intervals, shown by the dashed lines. Assuming the
analog signal can range from 0 Volts to 3 Volts, and that we plan to store each sample
using two bits, then we must round each sample to the nearest Volt (0, 1, 2, or 3), as
shown in the figure. We can store 0 Volts as the two bits 00, 1 Volt as the two bits 01,
2 Volts as the two bits 10, and 3 Volts as the two bits 11. Thus, we would convert the
shown analog signal into the following digital signal: 0001101011111101101000.
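The rounding step described above can be sketched in a few lines of Python. This is only an illustration: the function name `quantize` and the sample voltages are assumptions chosen so that the result matches the bit string in the text; they are not taken from Figure 1.1 itself.

```python
def quantize(sample_volts, bits=2, v_max=3.0):
    """Round a voltage in [0, v_max] to the nearest level and return its bit string."""
    levels = 2 ** bits - 1                         # 3 steps for two bits: 0, 1, 2, 3
    clamped = min(max(sample_volts, 0.0), v_max)   # keep the sample inside the range
    level = int(clamped / v_max * levels + 0.5)    # round to the nearest level
    return format(level, f"0{bits}b")              # e.g. level 2 -> "10"

# Hypothetical sample voltages, one per sampling interval (assumed values).
samples = [0.2, 1.1, 1.8, 2.3, 2.9, 3.0, 2.7, 1.4, 2.1, 1.6, 0.1]

digital = "".join(quantize(v) for v in samples)
print(digital)  # 0001101011111101101000
```

With two bits per sample, every voltage between 1.5 and 2.5 Volts ends up stored as 10, which is the rounding loss the text mentions.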
To record this digital signal, we just need to store 0s and 1s on the recording
media. We could use regular audio tape, using a short beep to represent a 1 and no beep
to represent a 0, for example. While the audio signal on the tape will deteriorate over
time, we can still certainly tell the difference between a beep and no beep, just like we
can tell the difference between a car horn being on or off. A slightly quieter beep is
still a beep. You've likely heard digitized data communicated using a manner similar to
such beeps when you've picked up a phone being used by a computer modem or a fax
machine. Even better than audio tape, we can record the digital signal using a media
specifically designed to store 0s and 1s. For example, the surface of a CD (compact
disk) can be configured to either reflect a laser beam to a sensor strongly or weakly,
thus storing 1s and 0s easily. Likewise, hard disks in computers use magnetic
particle orientation to store 0s and 1s, making such disks similar to audio tape, but
enabling faster access to random parts of the disk since the head can move sideways
across the top of the spinning disk.
To play back this digitized audio signal, we can simply convert the digital value of
each sampling period to an analog signal, as shown at the bottom of Figure 1.1. Notice
that the reproduced signal is not an exact replica of the original analog signal. However,
the faster we sample the analog signal and the more bits we use for each sample, the
closer the reproduced analog signal derived from the digitized signal will be to the
original analog signal. At some point, humans can't notice the difference between a
pure audio signal and one that has been digitized and then converted back to analog.
Another advantage of digitized audio is compression. Suppose that we'll be storing
each sample with ten bits, instead of two bits like above, to achieve much better quality
due to less rounding. But that's a lot more bits for the same audio: the signal in Figure
1.1 has eleven samples, and at ten bits per sample, that yields one hundred ten bits to
store the audio. If we sample hundreds or thousands of times a second, we end up with
huge numbers of bits. Suppose, though, that a particular audio recording has many
samples that have the value 0000000000 and the value 1111111111. We could compress
the digital file by using the following trick: if the first bit of a sample is 0, the
next bit being 0 means the sample is actually supposed to be expanded to
0000000000; the next bit being 1 means the sample is 1111111111. So 00 is shorthand
for 0000000000, and 01 is shorthand for 1111111111. If the first bit of a
sample is 1, then the next ten bits represent the actual sample. So the digitized signal
"0000000000 0000000000 0000001111 1111111111" would be compressed to
"00 00 10000001111 01." The receiver, which must know the compression
scheme, would decompress that signal into the original digitized signal. There are many
other tricks that can be used to compress digitized audio. Perhaps the most widely
known audio compression scheme is known as MP3, which is popular for compressing
digitized songs. A typical song might require many tens of megabytes uncompressed,
but compressed usually only requires about 3 or 4 megabytes. An audio CD can store
about 20 songs uncompressed, but about 200 songs compressed. Thanks to compression
(combined with higher-capacity disks), today's portable music players can store
thousands of songs, a capability undreamt of by most people in the 1990s.
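The shorthand trick described above is simple enough to try out. The following Python sketch implements exactly that scheme for ten-bit samples; the function names `compress` and `decompress` are made up for illustration, and real schemes such as MP3 work very differently.

```python
ALL_ZEROS = "0000000000"
ALL_ONES = "1111111111"

def compress(samples):
    """Encode ten-bit samples: 00 = all zeros, 01 = all ones, 1 + sample otherwise."""
    out = []
    for sample in samples:
        if sample == ALL_ZEROS:
            out.append("00")
        elif sample == ALL_ONES:
            out.append("01")
        else:
            out.append("1" + sample)   # leading 1 marks an uncompressed sample
    return "".join(out)

def decompress(bits):
    """Reverse compress(): walk the bit string and rebuild the ten-bit samples."""
    samples = []
    i = 0
    while i < len(bits):
        if bits[i] == "0":             # shorthand: the next bit selects zeros or ones
            samples.append(ALL_ZEROS if bits[i + 1] == "0" else ALL_ONES)
            i += 2
        else:                          # the next ten bits are the actual sample
            samples.append(bits[i + 1:i + 11])
            i += 11
    return samples

original = ["0000000000", "0000000000", "0000001111", "1111111111"]
packed = compress(original)
print(packed)                          # 00001000000111101
print(decompress(packed) == original)  # True
```

Here 40 bits shrink to 17. Note the cost of the scheme: any sample that is not all 0s or all 1s grows from ten bits to eleven, so the trick only pays off when the two special values are common.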
Digitized audio is widely used not only in music recording, but also in voice
communications. For example, digital cellular telephones digitize your voice and then
compress the digital signal before transmitting the signal, enabling far more cell
phones to operate in a particular region than possible using analog cell phones.
Figure 1.2 More and more analog products are becoming primarily digital.
(Timeline from 1995 to 2007 showing: satellites, portable music players, DVD players,
cell phones, video recorders, cameras, musical instruments, TVs, ???)
Pictures and video can be digitized in a manner similar to that described for audio.
Digital cameras, for example, store pictures in highly-compressed digital form, and
digital video recorders store video onto tapes or disks in compressed form too.
Digitized audio, pictures, and video are just a few of the hundreds of new and
future applications that benefit from digitization of analog phenomena. As shown in
Figure 1.2, over the past decade, numerous popular products previously based on
analog technology have converted primarily to digital technology. Portable music
players, for example, switched from cassette tapes to CDs in the middle 1990s, and
recently to MP3s and other digital formats. Early cell phones used analog
communication, but in the late 1990s digital communication, similar in idea to that
shown in Figure 1.1, became dominant. In the early 2000s, analog VHS video players gave
way to digital DVD players. Video recorders have begun to digitize video before storing
the video onto tape, while cameras have eliminated film entirely and instead store
photos using digital cards. Musical instruments are increasingly digital-based, with
electronic drums and keyboards increasing in popularity, and electric guitars with
digital processing appearing recently. Analog TV is also giving way to digital TV.
Hundreds of other devices have converted from analog to digital in past decades, such
as clocks and watches, household thermostats, human temperature thermometers (which now
work in the ear rather than under the tongue or other places), car engine controllers,
gasoline pumps, hearing aids, and so on.
Many devices were never analog, instead being introduced in digital form from the
very start. For example, video games have been digital since their inception.
Digitization requires that we encode things into 1s and 0s. Computations using
digital circuits require that we represent numbers using 1s and 0s. We introduce these
aspects of digital circuits now.
THE TELEPHONE
The telephone, patented by Alexander Graham Bell in the late 1800s (though invented
by Antonio Meucci), operates using the electromagnetic principle described earlier:
your speech creates sound waves that move a membrane, which moves a magnet, which
creates current on a nearby wire. Run that wire to somewhere far away, put a magnet
connected to a membrane near that wire, and the membrane will move, producing sound
waves that sound like you talking. Much of the telephone system today digitizes the
audio to improve quality and quantity of audio transmissions over long distances. A
couple of interesting facts about the telephone:
Believe it or not, Western Union actually turned down Bell's initial proposal to
develop the telephone, perhaps thinking that the then-popular telegraph was all people
needed.
Bell and his assistant Watson disagreed on how to answer the phone: Watson wanted
"Hello," which won, but Bell wanted "Hoy hoy" instead. (Fans of the TV show "The
Simpsons" may have noticed that Homer's boss, Mr. Burns, answers the phone with a
"hoy hoy.")
An early-style telephone.
(Source of some of the above material: www.pbs.org, transcript of "The Telephone").
Digital Encodings and Binary Numbers
Figure 1.3 A typical digital system.
(Figure labels: the bits 00100001; "It's 33 degrees.")
The previous section showed an example of a digital system, which involved digitizing an
audio signal into bits, which we could then process using a digital circuit to achieve
several benefits. Those bits encoded the data of interest. Encoding data into bits is a
central task in digital systems. Some of the data we want to process may already be in
digital form, while other data may be in analog form (e.g., audio, video, temperature)
and thus require conversion to digital data first, as illustrated at the top of
Figure 1.3. A digital system takes digital data as input, and produces digital data as
output.
Encoding analog phenomena
Any analog phenomenon can be digitized, and hence countless applications have evolved
and continue to evolve that digitize analog phenomena. Automobiles digitize
information about the engine temperature, car speed, fuel level, etc., so that an
on-chip computer can monitor and control the vehicle. The ventilator we introduced
earlier digitizes the measure of the air flow into the patient, so that a computer can
make calculations on how much additional flow to provide. And so on. Digitizing analog
phenomena requires:
A sensor that measures the analog physical phenomenon and converts the measured
value to an analog electrical signal. One example is the microphone (which measures
sound) in Figure 1.1. Other common examples include video capture devices (which
measure light), thermometers (which measure temperature), and speedometers (which
measure speed).
An analog-to-digital converter that converts the electrical signal into binary
encodings. The converter must sample (measure) the electrical signal at a particular
rate and convert each sample to some value of bits. Such a converter was featured in
Figure 1.1, and is shown as the A2D component in Figure 1.3.
Likewise, a digital-to-analog converter (shown as D2A in Figure 1.3) converts bits
back to an electrical signal, and an actuator converts that electrical signal back to
a physical phenomenon. Sensors and actuators together represent types of devices known
as transducers: devices that convert one form of energy to another.
In many examples throughout this book, we will utilize idealized sensors that
themselves directly output digitized data. For example, we might utilize a temperature
sensor that reads the present temperature and sets its 8-bit output to an encoding that
represents the temperature as a binary number (see next sections for binary number
encodings).
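As a rough illustration of such an idealized sensor, the Python sketch below turns a temperature reading into an 8-bit output. The function name `encode_temperature` is hypothetical, and the sketch assumes whole-number, non-negative temperatures, which the text does not specify.

```python
def encode_temperature(degrees, bits=8):
    """Return the temperature as an unsigned binary number with a fixed bit width."""
    if not 0 <= degrees < 2 ** bits:
        # An 8-bit output can only represent the values 0 through 255.
        raise ValueError(f"{degrees} does not fit in {bits} bits")
    return format(degrees, f"0{bits}b")

def decode_temperature(encoding):
    """Interpret the sensor's output bits as an unsigned binary number."""
    return int(encoding, 2)

print(encode_temperature(33))          # 00100001
print(decode_temperature("00100001"))  # 33
```

The value 33 is used because it is the reading shown in Figure 1.3, where the bits 00100001 correspond to "It's 33 degrees."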
Encoding digital phenomena
Other phenomena are inherently digital. Such phenomena can only assume one value
from a finite set of values.
Some digital phenomena can assume only one of two possible values, and thus can
be straightforwardly encoded as a single bit. For example, the following types of
sensors may output an electrical signal that assumes one of two values:
Motion sensor: outputs a positive voltage when motion is sensed, 0 volts when no
motion is sensed.
Light sensor: outputs a positive voltage when light is sensed, 0 V when dark.
Button (sensor): outputs a positive voltage when the button is pressed, 0 V when
not pressed.
We can straightforwardly encode each sensor's output to a bit, with 1 representing
the positive voltage and 0 representing 0 V. In examples throughout this book, we will
utilize idealized sensors that directly output the encoded bit value.
Other digital phenomena can assume several possible values. For example, a keypad
may have four buttons, colored red, blue, green, and black. A designer might create a
circuit such that when red is pressed, a three-bit output has the value 001; blue might
output 010, green 011, and black 100. If no button is pressed, the output might be 000.
Figure 1.4 illustrates such a keypad.
An even more general digital phenomenon is the English alphabet. Each character
comes from a finite set of characters, so typing on a keyboard results in digital, not
analog, data. We can convert the digital data to bits by assigning a bit encoding to
each character. A popular encoding of English characters is known as ASCII (standing
for American Standard Code for Information Interchange), which encodes each character
into seven bits. For example, the ASCII encoding for the uppercase letter 'A' is
"1000001," and for 'B' is "1000010." A lowercase 'a' is "1100001," and 'b' is
"1100010." Thus, the name "ABBA" would be encoded as "1000001 1000010 1000010
1000001." ASCII defines 7-bit encodings for all 26 letters (upper- and lowercase), the
numerical symbols 0 through 9, punctuation marks, and even a number of encodings for
nonprintable "control" operations. There are 128 encodings total in ASCII. A subset of
ASCII encodings is shown in Figure 1.5. Another encoding, Unicode, is increasing in
popularity due to its support of international languages. Unicode uses 16 bits per
character, instead of just the 7 bits used in ASCII, and represents characters from a
diversity of languages in the world.
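The ASCII encodings quoted above are easy to check in Python, whose built-in `ord` and `chr` functions map between characters and their character codes. The helper names `ascii_encode` and `ascii_decode` below are made up for this sketch.

```python
def ascii_encode(text):
    """Encode each character as its 7-bit ASCII code, separated by spaces."""
    groups = []
    for ch in text:
        code = ord(ch)
        if code > 127:                      # ASCII defines only 128 encodings
            raise ValueError(f"{ch!r} is not an ASCII character")
        groups.append(format(code, "07b"))  # seven bits, with leading zeros
    return " ".join(groups)

def ascii_decode(bits):
    """Turn space-separated 7-bit groups back into characters."""
    return "".join(chr(int(group, 2)) for group in bits.split())

print(ascii_encode("ABBA"))   # 1000001 1000010 1000010 1000001
print(ascii_encode("ab"))     # 1100001 1100010
print(ascii_decode("1000001 1000010 1000010 1000001"))  # ABBA
```

Notice that an uppercase letter and its lowercase counterpart differ in a single bit, which is visible in the pairs listed in Figure 1.5 as well.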
Figure 1.4 Keypad encodings.

Symbol    Encoding    Symbol     Encoding
R         1010010     r          1110010
S         1010011     s          1110011
T         1010100     t          1110100
L         1001100     l          1101100
N         1001110     n          1101110
E         1000101     e          1100101
0         0110000     9          0111001
.         0101110     !          0100001
<tab>     0001001     <space>    0100000
Figure 1.5 Sample ASCII encodings.
Encoding numbers
Perhaps the most important use of digital circuits is to perform arithmetic computations.
In fact, a key driver of early digital computer design was the arithmetic computations of
ballistic trajectories in World War II. To perform arithmetic computations, we need a way
to represent numbers using binary digits: we need binary numbers.
WHY BASE TEN?
Humans have ten fingers, so they chose a numbering system where each digit can
represent ten possible values. There's nothing magical about base ten. If humans had
nine fingers, we'd probably use a base nine numbering system. It turns out that base
twelve was used somewhat in the past too, because by using our thumb, we can easily
point to twelve different spots on the remaining four fingers on that thumb's hand:
the four tops of those fingers, the four middle parts of those fingers, and the four
bottoms of those fingers. That's likely why the number twelve is common in human
counting today, like the use of the term "dozen," and the twelve hours of a clock.
(Source: "Ideas and Information," Arno Penzias, W. W. Norton and Company).
Figure 1.6 Base ten number system. (Example shown: 523.)
The Web search engine Google's name comes from the word "googol," a 1 followed by
100 zeroes, apparently implying the engine can search a lot of information.
Figure 1.7 Base two number system.
I saw the following on a T-shirt, and found it rather funny: "There are 10 types of
people in the world: those who get binary, and those who don't."
To understand binary numbers, we might first ensure that we understand decimal
numbers. Decimal numbers use a base ten numbering system. The basic definition of base
ten is a numbering system where the rightmost digit represents the number of ones (10⁰)
we have, the next digit represents the number of groups of tens (10¹) we have, the next
digit represents the number of groups of ten tens (10²) we have, and so on, as
illustrated in Figure 1.6. So the digits "523" in base 10 represent
5*10² + 2*10¹ + 3*10⁰.
Because humans have ten fingers, they developed and used a base ten numbering
system. They came up with symbols to represent quantities ranging from no fingers (0)
to all the fingers but one (9); these are called "ones" rather than "fingers" though,
because we aren't always counting fingers. To represent a larger quantity than nine
ones, humans introduced another digit to represent the number of groups of all the
fingers, called "ten." Note that we don't need a unique symbol for the quantity ten
itself, since that quantity can be represented as one group of ten and no ones. To
represent more than nine tens, humans introduced yet another digit, to represent the
number of groups of ten tens, which are called "hundreds." To represent ten hundreds,
they introduced another digit, called "thousands." English (as spoken in America)
doesn't have a name for a group representing ten thousands, nor for the group
representing ten ten thousands, which is referred to as hundred thousands. The next
group is called millions, and further groups that are multiples of one thousand have
names too (billions, trillions, quadrillions, etc.).
Now that we better understand base ten numbers, we can introduce base two numbers,
known as binary numbers. Since digital circuits work with values that are either
"on" or "off," such circuits need only two symbols, rather than ten symbols. Let those
two symbols be 0 and 1. If we need to represent a quantity more than 1, we'll use
another digit, which will represent the number of groups of 2¹, which we'll call two.
So "10" in base two represents 1 two and 0 ones. Be careful not to call "10" ten;
instead, you might say "one-two." If we need a bigger quantity, we'll use another
digit, which will represent the number of groups of 2², which we'll call four. The
weights of each digit in base two are shown in Figure 1.7.
For example, the number 101 in base two equals 1*2² + 0*2¹ + 1*2⁰, or 5, in base
ten. In other words, 101 can be spoken as "one-four zero-two one-one." Most people
comfortable with binary might instead just say "one zero one." To be very clear, you
might say "one zero one, base two." But you should definitely NOT say "one-hundred one,
base two." 101 is one-hundred one in base ten, but the leftmost 1 does not represent
one-hundred in base two.
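The same weighted-sum rule covers both base ten and base two; only the base changes. The Python sketch below writes out the expansion of a digit string and its value. The function names `expansion` and `value` are chosen for illustration, with `^` standing for "to the power of" in the printed text.

```python
def expansion(digits, base):
    """Return the weighted-sum expansion of a digit string, e.g. '5*10^2 + ...'."""
    highest = len(digits) - 1               # exponent of the leftmost digit
    terms = [f"{d}*{base}^{highest - i}" for i, d in enumerate(digits)]
    return " + ".join(terms)

def value(digits, base):
    """Add up digit * base**position over all digits, rightmost position being 0."""
    highest = len(digits) - 1
    return sum(int(d) * base ** (highest - i) for i, d in enumerate(digits))

print(expansion("523", 10), "=", value("523", 10))  # 5*10^2 + 2*10^1 + 3*10^0 = 523
print(expansion("101", 2), "=", value("101", 2))    # 1*2^2 + 0*2^1 + 1*2^0 = 5
```

The digits "101" give 101 when read in base ten and 5 when read in base two, which is why saying "one-hundred one, base two" is misleading.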
COUNTING "CORRECTLY" IN BASE TEN
The fact that there are names for some of the groups in base ten, but not others,
prevents many people from gaining an intuitive understanding of base ten. Further
adding to the confusion are the abbreviated names for groups of tens: the numbers 10,
20, 30, ..., 90 should be called one ten, two ten, three ten, ..., nine ten, but
instead use abbreviated names: one ten as just "ten," two ten as "twenty," three ten
as "thirty," ..., and nine ten as "ninety." You can see how "ninety" is a shortening
of "nine ten." Furthermore, short names are also used for the numbers between 10 and
20. 11 should be "one ten one," but is instead "eleven," while 19 should be "one ten
nine" but is instead "nineteen." Table 1.1 indicates how to count "correctly" in base
ten (where I boldly define "correctly" as counting the way I think makes more sense).
Thus, the number 523 would be spoken as "five-hundred two-ten three" rather than
"five-hundred twenty-three." I believe that kids have a harder time learning math
because of the confusing number naming. For example, carrying a one from the ones
column to the tens column makes more sense if the ones column adds to "one ten seven"
rather than to "seventeen": the "one ten" obviously adds one to the tens column.
Learning binary is slightly harder for some students due to a lack of a solid
understanding of base 10, caused largely by the naming confusion. Perhaps, when a
store clerk tells you "that will be ninety-nine cents," you can correct him by saying
"you mean nine ten nine cents." If enough of us do this, perhaps it will catch on.
TABLE 1.1 Counting "correctly" in base ten.

0 to 9        As usual: "zero," "one," "two," etc.
10 to 99      10, 11, 12, ..., 19: "one ten," "one ten one," "one ten two," ... "one ten nine"
              20, 21, 22, ..., 29: "two ten," "two ten one," "two ten two," ... "two ten nine"
              30, 40, ..., 90: "three ten," "four ten," ... "nine ten"
100 to 900    As usual: "one hundred," "two hundred," ... "nine hundred." Even better would
              be to replace the word "hundred" by "ten to the power of 2."
1000 and up   As usual. Even better: replace "thousand" by "ten to the power of 3," "ten
              thousand" by "ten to the power of 4," etc., eliminating all the names.

When we are writing numbers of different bases and the base of the number is not
obvious, we indicate the base with a subscript, as follows: 101₂ = 5₁₀. We might say
this as "one zero one in base two equals five in base ten."
Note that since binary isn't as popular as decimal, people haven't created short
names for its groups of 2¹, 2², and so on, like they have for groups in base ten
(hundreds, thousands, millions, etc.). Instead, people just use the equivalent base ten
name for the group, a source of some confusion to people just learning binary.
Nevertheless, it may still be easier to think of each group in base two using base 10
names, rather than increasing powers of two, as shown in Figure 1.8.
Figure 1.8 Base two number system, with place weights 16, 8, 4, 2, 1.

EXAMPLE 1.1 Binary to decimal
Convert the following binary numbers to decimal numbers: 1, 110, 10000, 10000111, and
00110.
1₂ is just 1*2⁰, or 1₁₀.
110₂ is 1*2² + 1*2¹ + 0*2⁰, or 6₁₀. We might think of this using the group weights
shown in Figure 1.8: 1*4 + 1*2 + 0*1, or 6.
10000₂ is 1*16 + 0*8 + 0*4 + 0*2 + 0*1, or 16₁₀.
10000111₂ is 1*128 + 1*4 + 1*2 + 1*1 = 135₁₀. Notice this time that we didn't bother
to write the groups with a 0 bit.
00110₂ is the same as 110₂ above: the leading 0s don't change the value.
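The conversions in this example can be double-checked with a short Python sketch that walks the bits from right to left, doubling the place weight at each step and adding the weight wherever the bit is 1. The function name `binary_to_decimal` is an assumption for illustration; Python's built-in `int(bits, 2)` performs the same conversion.

```python
def binary_to_decimal(bits):
    """Sum the weights of the places that hold a 1, starting from the rightmost bit."""
    total = 0
    weight = 1                    # weight of the rightmost place
    for bit in reversed(bits):
        if bit == "1":
            total += weight       # places holding a 0 contribute nothing
        weight *= 2               # each place to the left weighs twice as much
    return total

for bits in ["1", "110", "10000", "10000111", "00110"]:
    print(bits, "=", binary_to_decimal(bits))
# 1 = 1
# 110 = 6
# 10000 = 16
# 10000111 = 135
# 00110 = 6
```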
Knowing powers of two
helps in learning binary:
lJ128 2 256
4 512
8 1024
16 2048
32 ...
64
EXAMPLE 1.2
1.2 The World of Digital Systems 13
When converting from binary to decimal, people often find it useful to be comfortable
knowing the powers of two, since each successive place to the left in a binary
number is two times the previous place. In binary, the first, rightmost place is 1, the
second place is 2, then 4, then 8, 16, 32, 64, 128, 256, 512, 1024, 2048, and so on. You
might stop at this point to practice counting up by powers of two: 1, 2, 4, 8, 16, 32, 64,
128, 256, 512, 1024, 2048, etc., a few times. Now, when you see the number 10000111,
you might move along the number from right to left and count up by powers of two for
each bit to determine the weight of the leftmost bit: 1, 2, 4, 8, 16, 32, 64, 128. The next
highest 1 has a weight of (counting up again) 1, 2, 4; adding 4 to 128 gives 132. The next
1 has a weight of 2; adding that to 132 gives 134. The rightmost 1 has a weight of 1;
adding that to 134 gives 135. Thus, 10000111 equals 135 in base ten.
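The place-by-place addition just described can also be sketched in C. This is only a minimal sketch, not a definitive routine: the function name binary_to_decimal and the choice to pass the binary number as a string of '0' and '1' characters are assumptions made for illustration, and the result is assumed to fit in an unsigned long.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds up the weights of the places holding a 1.
   Assumes bits contains only '0' and '1' and the value fits in unsigned long. */
unsigned long binary_to_decimal(const char *bits)
{
    size_t len = 0;
    while (bits[len] != '\0') {
        len++;
    }

    unsigned long weight = 1;   /* weight of the rightmost place */
    unsigned long value = 0;

    /* Move from the rightmost bit to the leftmost, doubling the weight each step. */
    for (size_t i = len; i > 0; i--) {
        char c = bits[i - 1];
        assert(c == '0' || c == '1');
        if (c == '1') {
            value += weight;
        }
        weight *= 2;
    }
    return value;
}
```

Calling binary_to_decimal("10000111") walks the weights 1, 2, 4, ..., 128 and adds 1 + 2 + 4 + 128, the same sum we formed by hand.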
Counting in binary
Counting from 0 to 7 in binary looks as follows: 000, 001, 010, 011, 100, 101, 110, 111.
An interesting fact about binary numbers: you can quickly determine whether a
binary number is odd just by checking if the rightmost digit has a 1. If the rightmost digit
is a 0, the number must be even, since the number is the sum of even numbers.
Converting between decimal and binary numbers using the subtraction method
As we saw earlier, converting a binary number to decimal is easy: we just add the
weights of each digit having a 1. Converting a decimal number to binary takes slightly
more effort. One method for converting a decimal number to a binary number that is easy
for humans to carry out by hand, which we'll call the subtraction method, is shown in
Table 1.2. The method starts with a binary number that is all 0s.
TABLE 1.2 Subtraction method for converting a decimal number to a binary number.
Step                          Description
1. Put 1 in highest place     Put a 1 in the highest binary place whose weight is less than or equal to the
                              decimal number.
2. Update decimal number      Update the decimal number by subtracting the highest binary place's weight from
                              the decimal number. The new decimal number is the remaining quantity to be
                              converted to binary. If the updated decimal number is not zero, return to step 1.
For example, we can convert the decimal number 12 as shown in Figure 1.9.
1. Put 1 in highest place: place 16 is too big (16 > 12); the next place, 8, is the highest that fits (8 < 12).
   Binary number is now 1000 (current value is 8).
2. Update decimal number: 12 - 8 = 4. Decimal number is not zero, so return to Step 1.
1. Put 1 in highest place: the next place, 4, is the highest that fits (4 = 4).
   Binary number is now 1100 (current value is 12).
2. Update decimal number: 4 - 4 = 0. Decimal number is zero, done.
Figure 1.9 Converting the decimal number 12 to binary using the subtraction method.
We can check our work by converting 1100 back to decimal: 1*8 + 1*4 + 0*2 + 0*1 = 12.
As another example, Figure 1.10 illustrates the subtraction method for converting the
decimal number 23 to binary. We can check our work by converting the result, 10111,
back to decimal: 1*16 + 0*8 + 1*4 + 1*2 + 1*1 = 23.
1. Put 1 in highest place: place 32 is too big, but 16 works.
   Binary number is now 10000 (current value is 16).
2. Update decimal number: 23 - 16 = 7. Decimal number is not zero, so return to Step 1.
1. Put 1 in highest place: the next place, 8, is too big (8 > 7); 4 works (4 < 7).
   Binary number is now 10100 (current value is 20).
2. Update decimal number: 7 - 4 = 3. Decimal number is not zero, so return to Step 1.
1. Put 1 in highest place: the next place, 2, works (2 < 3).
   Binary number is now 10110 (current value is 22).
2. Update decimal number: 3 - 2 = 1. Decimal number is not zero, so return to Step 1.
1. Put 1 in highest place: the next place, 1, works (1 = 1).
   Binary number is now 10111 (current value is 23).
2. Update decimal number: 1 - 1 = 0. Decimal number is zero, done.
Figure 1.10 Converting the decimal number 23 to binary using the subtraction method.
EXAMPLE 1.3 Decimal to binary
Convert the following decimal numbers to binary using the subtraction method: 8, 14, 99.
To convert 8 to binary, we start by putting a 1 in the 8's place, yielding 1000. Since 8 - 8 = 0, we
are done; the answer is 1000.
To convert 14 to binary, we start by putting a 1 in the 8's place (16 is too much), yielding 1000.
14 - 8 = 6, so we put a 1 in the 4's place, yielding 1100. 6 - 4 = 2, so we put a 1 in the 2's place,
yielding 1110. 2 - 2 = 0, so we are done; the answer is 1110. We can quickly check our work by
converting back to decimal: 8 + 4 + 2 = 14.
To convert 99 to binary, we start by putting a 1 in the 64's place (the next higher place, 128, is
too big; notice that being able to count by powers of two is handy in this problem), yielding
1000000. 99 - 64 is 35, so we put a 1 in the 32's place, yielding 1100000. 35 - 32 is 3, so we put a
1 in the 2's place, yielding 1100010. 3 - 2 is 1, so we put a 1 in the 1's place, yielding the final answer
of 1100011. We can check our work by converting back to decimal: 64 + 32 + 2 + 1 = 99.
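Although the subtraction method is meant for hand calculation, writing it as a short C function can make the two steps of Table 1.2 concrete. The sketch below is illustrative only: the name subtraction_method, the string output buffer, and the returned digit count are assumptions, not something defined in this chapter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes the binary digits of n into out as a
   NUL-terminated string and returns the number of digits written. */
size_t subtraction_method(unsigned int n, char *out, size_t out_size)
{
    /* Find the highest place whose weight is less than or equal to n. */
    unsigned long long weight = 1;
    size_t places = 1;
    while (weight * 2 <= n) {
        weight *= 2;
        places++;
    }
    assert(places + 1 <= out_size);   /* room for the digits and the terminator */

    /* Walk from that place down to the 1's place. */
    for (size_t i = 0; i < places; i++) {
        if (weight <= n) {
            out[i] = '1';                 /* step 1: put a 1 in this place */
            n -= (unsigned int)weight;    /* step 2: update the decimal number */
        } else {
            out[i] = '0';                 /* weight too big: leave a 0 */
        }
        weight /= 2;
    }
    out[places] = '\0';
    return places;
}
```

The loop visits every place rather than jumping straight to the next place that fits, which is the same as trying a place, finding it too big, and moving on, as we did with the 8's place when converting 23.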
Converting between decimal and binary numbers using the divide-by-2 method
An alte rnative approach for converting a decimal number to binary, perhaps less intuitive
tha n the s ubtracti on method but easier to automate using a compute r program, involves
re peated ly dividing the decimal number by 2-we' ll call thi s the divide-by-2 method. The
remainder at each s te p (ei ther 0 or I) becomes a bit in the binary numbe r, s tarting from
the least sig nifi cant (ri ghtmost) digit. For exampl e, the process of converting the dec imal
number 12 to binary using the divide-by-2 method is shown in Fi gure 1. 11.
1. Divide decimal number by 2: 12/2 = 6, remainder 0. Insert remainder into the binary number: 0
   (current value: 0). Continue since quotient (6) is greater than 0.
2. Divide quotient by 2: 6/2 = 3, remainder 0. Insert remainder into the binary number: 00
   (current value: 0). Continue since quotient (3) is greater than 0.
3. Divide quotient by 2: 3/2 = 1, remainder 1. Insert remainder into the binary number: 100
   (current value: 4). Continue since quotient (1) is greater than 0.
4. Divide quotient by 2: 1/2 = 0, remainder 1. Insert remainder into the binary number: 1100
   (current value: 12). Quotient is 0, done.
Figure 1.11 Converting the decimal number 12 to binary using the divide-by-2 method.
EXAMPLE 1.4 Decimal to binary using the divide-by-2 method
Convert the following numbers to binary using the divide-by-2 method: 8, 14, 99.
To convert 8 to binary, we start by dividing 8 by 2: 8/2=4, remainder 0. Then we divide the
quotient, 4, by 2: 4/2=2, remainder 0. Then we divide 2 by 2: 2/2=1, remainder 0. Finally, we divide
1 by 2: 1/2=0, remainder 1. We stop dividing because the quotient is now 0. Combining all the
remainders, least significant digit first, yields the binary number 1000. We can check this answer by
multiplying each binary digit by its weight and adding the terms: $1*2^3 + 0*2^2 + 0*2^1 + 0*2^0 = 8$.
To convert 14 to binary, we follow a similar process: 14/2=7, remainder 0. 7/2=3, remainder 1.
3/2=1, remainder 1. 1/2=0, remainder 1. Combining the remainders gives us the binary number 1110.
Checking the answer verifies that 1110 is correct: $1*2^3 + 1*2^2 + 1*2^1 + 0*2^0 = 8 + 4 + 2 + 0 = 14$.
To convert 99 to binary, the process is the same but naturally takes more steps: 99/2=49,
remainder 1. 49/2=24, remainder 1. 24/2=12, remainder 0. 12/2=6, remainder 0. 6/2=3, remainder
0. 3/2=1, remainder 1. 1/2=0, remainder 1. Combining the remainders together gives us the binary
number 1100011. We know from Example 1.3 that this is the correct answer.
Converting from any base to any other base using the divide-by-n method
We have been dividing by 2 in order to convert to base 2, but we can use the same basic
method to convert a base 10 number to a number of any base. To convert a number from
base 10 to base n, we simply repeatedly divide the number by n and add the remainder to
the new base n number, starting from the least significant digit.
EXAMPLE 1.5 Decimal to arbitrary bases using the divide-by-n method
Convert the number 3439 to base 10 and to base 7.
We know the number 3439 is 3439 in base 10, but let's use the divide-by-n (where n = 10)
method to illustrate that the method works for any base. We start by dividing 3439 by 10:
3439/10=343, remainder 9. We then divide the quotient by 10: 343/10=34, remainder 3. We do the same
with the new quotient: 34/10=3, remainder 4. Finally, we divide 3 by 10: 3/10=0, remainder 3.
Combining the remainders, least significant digit first, gives us the base 10 number 3439.
To convert 3439 to base 7, the approach is similar, except we now divide by 7. We begin by
dividing 3439 by 7: 3439/7=491, remainder 2. Continuing our calculations, we get: 491/7=70,
remainder 1. 70/7=10, remainder 0. 10/7=1, remainder 3. 1/7=0, remainder 1. Thus, 3439 in base
7 is 13012. Checking the answer verifies that we have the correct result:
$1*7^4 + 3*7^3 + 0*7^2 + 1*7^1 + 2*7^0 = 2401 + 1029 + 0 + 7 + 2 = 3439$.
Generally, a number can be converted from one base to another by first converting
that number to base ten, then converting the base ten number to the desired base using the
divide-by-n method.
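Because the divide-by-n method is the easier one to automate, a small C sketch may help show how the remainders are collected least significant digit first and then reversed. The function name divide_by_n, the digit symbols for bases above ten, and the limit of base 16 are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: converts value to the given base (2 through 16),
   writes the digits into out as a NUL-terminated string, and returns the
   number of digits written. */
size_t divide_by_n(unsigned int value, unsigned int base, char *out, size_t out_size)
{
    static const char symbols[] = "0123456789ABCDEF";
    char remainders[64];   /* remainders, least significant digit first */
    size_t count = 0;

    assert(base >= 2 && base <= 16);

    /* Repeatedly divide by the base, saving each remainder as a digit. */
    do {
        assert(count < sizeof remainders);
        remainders[count++] = symbols[value % base];
        value /= base;
    } while (value > 0);

    /* The first remainder is the rightmost digit, so reverse the order. */
    assert(count + 1 <= out_size);
    for (size_t i = 0; i < count; i++) {
        out[i] = remainders[count - 1 - i];
    }
    out[count] = '\0';
    return count;
}
```

With base set to 2 the function repeats the steps of the divide-by-2 method, and with base set to 7 it produces the digits 1, 3, 0, 1, 2 for the value 3439, as in Example 1.5.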
Hexadecimal and octal numbers
Base sixteen numbers, known as hexadecimal numbers or just hex, are
also popular in digital design, mainly because one base sixteen digit is
equivalent to four base two digits, making hexadecimal numbers a nice
shorthand representation for binary numbers. In base sixteen, the first digit
represents up to fifteen ones; the sixteen symbols commonly used are 0,
1, 2, ..., 9, A, B, C, D, E, F (so A=ten, B=eleven, C=twelve, D=thirteen,
E=fourteen, and F=fifteen). The next digit represents the number of
groups of $16^1$, the next digit the number of groups of $16^2$, etc., as shown
in Figure 1.12. So $8AF_{16}$ equals $8*16^2 + 10*16^1 + 15*16^0$, or $2223_{10}$.
Since one digit in base 16 represents 16 values, and four digits in base two
represent 16 values, each digit in base 16 represents four digits in base
two, as shown at the bottom of Figure 1.12. Thus, to convert $8AF_{16}$ to
binary, we convert $8_{16}$ to $1000_2$, $A_{16}$ to $1010_2$, and $F_{16}$ to $1111_2$, resulting
in $8AF_{16} = 100010101111_2$. You can see why hexadecimal is a popular
shorthand for binary: 8AF is a lot easier on the eye than 100010101111.
To convert a binary number to hexadecimal, we just substitute every
four bits with the corresponding hexadecimal digit. Thus, to convert $101101101_2$ to hex,
we group the bits into groups of four starting from the right, yielding 1 0110 1101. We
then replace each group of four bits with a single hex digit. 1101 is D, 0110 is 6, and 1 is
1, resulting in the hex number $16D_{16}$.
hex  binary    hex  binary
0    0000      8    1000
1    0001      9    1001
2    0010      A    1010
3    0011      B    1011
4    0100      C    1100
5    0101      D    1101
6    0110      E    1110
7    0111      F    1111
Figure 1.12 Base sixteen number system. Place weights are $16^4$, $16^3$, $16^2$, $16^1$, $16^0$; the hex digits 8 A F correspond to the binary groups 1000 1010 1111.
EXAMPLE 1.6 Hexadecimal to/from binary
Convert the following hexadecimal numbers to binary: FF, 1011, A0000. You may find it useful to
refer to Figure 1.12 to expand each hexadecimal digit to four bits.
$FF_{16}$ is 1111 (for the left F) and 1111 (for the right F), or $11111111_2$.
$1011_{16}$ is 0001, 0000, 0001, 0001, or $0001000000010001_2$. Don't be confused by the fact that
1011 didn't have any symbols but 1 and 0 (which makes the number look like a binary
number). We said it was base 16, so it was. If we said it was base 10, then 1011 would
equal one thousand and eleven.
$A0000_{16}$ is 1010, 0000, 0000, 0000, 0000, or $10100000000000000000_2$.
Convert the following binary numbers to hexadecimal: 0010, 01111110, 111100.
$0010_2$ is $2_{16}$.
$01111110_2$ is 0111 and 1110, meaning 7 and E, or $7E_{16}$.
$111100_2$ is 11 and 1100, which is 0011 and 1100, meaning 3 and C, or $3C_{16}$. Notice that we
start grouping bits into groups of four from the right, not the left.
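The grouping of bits into fours from the right can be sketched in C as follows. This is a sketch under stated assumptions: the function name binary_to_hex and the use of character strings for both the binary input and the hexadecimal output are choices made here for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: converts a string of '0'/'1' characters to hex by
   grouping bits into fours starting from the right. Returns the number of
   hex digits written to out (NUL-terminated). */
size_t binary_to_hex(const char *bits, char *out, size_t out_size)
{
    static const char symbols[] = "0123456789ABCDEF";
    size_t len = 0;
    while (bits[len] != '\0') {
        len++;
    }
    assert(len > 0);

    size_t digits = (len + 3) / 4;                 /* number of groups of four */
    assert(digits + 1 <= out_size);

    /* The leftmost group may hold fewer than four bits (as if padded with 0s). */
    size_t first = (len % 4 == 0) ? 4 : len % 4;
    size_t pos = 0;

    for (size_t d = 0; d < digits; d++) {
        size_t group = (d == 0) ? first : 4;
        unsigned int value = 0;
        for (size_t k = 0; k < group; k++) {
            char c = bits[pos++];
            assert(c == '0' || c == '1');
            value = value * 2 + (unsigned int)(c - '0');
        }
        out[d] = symbols[value];
    }
    out[digits] = '\0';
    return digits;
}
```

For the input 111100 the leftmost group is just 11, which the code treats the same way we treated it by hand, as 0011.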
The subtraction or divide-by-16 method can also be used to convert decimal to hexadecimal;
however, converting directly from decimal to hexadecimal can be a bit unwieldy
for humans since we are not used to working with powers of sixteen. Instead, it is often
quicker to convert from decimal to binary using the subtraction or divide-by-2 method,
and then convert from binary to hexadecimal by grouping sets of 4 bits.
EXAMPLE 1.7 Decimal to hexadecimal
Convert 29 base 10 to base 16.
To perform this conversion, we can first convert 29 to binary and then convert the binary result
to hexadecimal.
Converting 29 to binary is straightforward using the divide-by-2 method: 29/2=14, remainder
1. 14/2=7, remainder 0. 7/2=3, remainder 1. 3/2=1, remainder 1. 1/2=0, remainder 1. Thus, 29 is
11101 in base 2.
Converting $11101_2$ to hexadecimal can be done by grouping sets of four bits, so $11101_2$ is $1_2$
and $1101_2$, meaning $1_{16}$ and $D_{16}$, or $1D_{16}$.
Of course, we can use the divide-by-16 method to convert directly from decimal to hexadecimal.
Starting with 29, we divide by 16: 29/16=1, remainder 13 ($D_{16}$). 1/16=0, remainder 1.
Combining the remainders together gives us $1D_{16}$. Though this particular conversion was simple,
converting larger numbers directly from decimal to hexadecimal can be time-consuming, and the
two-step conversion may be preferable.
Base eight numbers, known as octal numbers, are sometimes used as a binary shorthand
too, since one base eight digit equals three binary digits. $503_8$ equals $5*8^2 + 0*8^1
+ 3*8^0 = 323_{10}$. We can convert $503_8$ directly to binary simply by expanding each digit
into three bits, resulting in $503_8$ = 101 000 011, or $101000011_2$. Likewise, we can convert
binary to octal by grouping the binary number into groups of three bits starting from the
right, and then replacing each group with the corresponding octal digit. Thus, $1011101_2$
yields 1 011 101, or $135_8$.
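Expanding each octal digit into three bits is mechanical enough to sketch in a few lines of C. The function name octal_to_binary and its string-based interface are assumptions made for this illustration, not part of the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: expands each octal digit ('0'..'7') into exactly
   three bits. Returns the number of bits written to out (NUL-terminated). */
size_t octal_to_binary(const char *octal, char *out, size_t out_size)
{
    size_t n = 0;
    assert(out_size >= 1);

    for (size_t i = 0; octal[i] != '\0'; i++) {
        assert(octal[i] >= '0' && octal[i] <= '7');
        assert(n + 4 <= out_size);   /* three bits plus the final terminator */

        unsigned int digit = (unsigned int)(octal[i] - '0');
        out[n++] = (digit & 4u) ? '1' : '0';   /* 4's place */
        out[n++] = (digit & 2u) ? '1' : '0';   /* 2's place */
        out[n++] = (digit & 1u) ? '1' : '0';   /* 1's place */
    }
    out[n] = '\0';
    return n;
}
```

Unlike the hand conversion, this sketch keeps any leading 0s that come from the leftmost octal digit; as noted earlier, leading 0s don't change the value.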
Appendix A discusses number representations further.
1.3 IMPLEMENTING DIGITAL SYSTEMS: PROGRAMMING
MICROPROCESSORS VERSUS DESIGNING DIGITAL CIRCUITS
Designers can implement a digital system for an application using one of two common
digital system implementation methods: programming a microprocessor or creating a
custom digital circuit (known as digital design).
As a concrete example, consider a simple application that turns on a lamp whenever
there is motion in a dark room. Assume a motion detector has an output wire called a that
outputs a 1 bit when motion is detected, and a 0 bit otherwise. Assume a light sensor has
an output wire b that outputs a 1 bit when light is sensed, and a 0 bit otherwise. And
assume a wire F turns on the lamp when F is 1, and turns off the lamp when 0. A drawing
of the system is shown in Figure 1.13(a).
The design problem is to determine what goes in the block named Detector. The
Detector block takes wires a and b as inputs, and generates a value on F, such that the
light turns on when motion is detected when dark. The Detector application is readily
implemented as a digital system, as the application's inputs and outputs obviously are
digital, having only two possible values each. A designer can implement the Detector
block by programming a microprocessor (Figure 1.13(b)) or using a custom digital circuit
(Figure 1.13(c)).
Figure 1.13 Motion-in-the-dark-detector system: (a) system block diagram, (b) implementation
using a microprocessor, (c) implementation using a custom digital circuit.
Software on Microprocessors: The Digital Workhorse
A "processor" processes, or transforms, data. A "microprocessor" is a programmable processor
implemented on a single computer chip; the "micro" just means small here. The microprocessor
term became popular in the 1980s when processors shrank down from multiple chips to
just one. The first single-chip microprocessor was the Intel 4004 chip in 1971.
Designers that need to work with digital phenomena often buy an off-the-shelf microprocessor
and write software for that microprocessor, rather than design a custom digital
circuit. Microprocessors are really the "workhorse" of digital systems, handling most
digital processing tasks.
Figure 1.14 Basic microprocessor's input and output pins: (a) input pins I0 through I7 and output pins P0 through P7, (b) photograph of a microprocessor package.
A microprocessor is a programmable digital device that executes a user-specified
sequence of instructions, known as a program or as software. Some of those instructions
read the microprocessor's inputs, others write to the microprocessor's outputs, and other
instructions perform computations on the input data. Let's assume we have a basic
microprocessor with eight input pins named I0, I1, ..., I7, and eight output pins named P0,
P1, ..., P7, as shown in Figure 1.14(a). A photograph of a real microprocessor package
with such pins is shown in Figure 1.14(b) (the ninth pin on this side is for power, on the
other side for ground).
A microprocessor-based solution to the motion-in-the-dark detector application is
illustrated in Figure 1.13(b), and a photograph of an actual physical implementation
is shown in Figure 1.15. The designer connects the a wire to the microprocessor input pin
I0, the b wire to input pin I1, and output pin P0 to the F wire. The designer could then
specify the instructions for the microprocessor by writing the following C code:
void main()
{
   while (1) {
      P0 = I0 && !I1; // F = a and !b
   }
}
C is one of several popular languages for describing the desired instructions to execute on
the microprocessor. The above C code works as follows. The microprocessor, after
being powered up and reset, executes the instructions within main's curly
brackets { }. The first instruction is "while (1)" which simply means
to repeat the instructions in the while's curly brackets forever. Inside
the while's curly brackets is only one instruction, "P0 = I0 && !I1;"
which assigns the microprocessor's output pin P0 with a 1 if the input pin
I0 is 1 and (written as &&) the input pin I1 is not 1 (meaning I1 is 0).
Thus, the output pin P0, which turns the lamp on or off, forever gets
assigned the appropriate value based on the input pin values, which come
from the motion and light sensors.
Figure 1.15 Physical motion-in-the-dark detector implementation using a microprocessor.
Figure 1.16 shows an example of signals a, b, and F over time,
with time proceeding to the right. As time proceeds, each signal
may be either 0 or 1, illustrated by each signal's associated line
being either low or high. We made a equal to 0 until 7:05, when we made a
become 1. We made a stay 1 until 7:06, when we made a return back to 0. We made
a stay 0 until 9:00, when we made a become 1 again, and then we made a become 0
at 9:01. On the other hand, we made b start as 0, and then become 1 sometime
between 7:06 and 9:00. The diagram shows what the value of F would be given the
C program executing on the microprocessor: when a is 1 and b is 0 (from 7:05 to
7:06), F will be 1. A diagram with time proceeding to the right, and the values
of digital signals shown by high or low lines, is known as a timing diagram. We
draw the input lines (a and b) to be whatever values we want, but then the output
line (F) must describe the behavior of the digital system.
Figure 1.16 Timing diagram of motion-in-the-dark detector system. The time axis is marked at 6:00, 7:05, 7:06, 9:00, and 9:01.
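One way to see how the output line of a timing diagram follows from the input lines is to compute F for a few sample times on an ordinary computer. The sketch below is only an illustration: the function names detector and trace_detector are hypothetical, ordinary variables stand in for the microprocessor pins, and the sample values are read off the times marked in Figure 1.16.

```c
#include <assert.h>
#include <stddef.h>

/* Same logic as the lamp program: F is 1 only when there is motion (a is 1)
   and no light is sensed (b is 0). */
int detector(int a, int b)
{
    assert((a == 0 || a == 1) && (b == 0 || b == 1));
    return a && !b;
}

/* Hypothetical helper: fills f[t] with the detector output for each sample
   of the input signals a and b. */
void trace_detector(const int *a, const int *b, int *f, size_t samples)
{
    for (size_t t = 0; t < samples; t++) {
        f[t] = detector(a[t], b[t]);
    }
}
```

If we sample at 6:00, 7:05, 7:06, 9:00, and 9:01, then a is 0, 1, 0, 1, 0 and b is 0, 0, 0, 1, 1, and the computed F is 1 only for the 7:05 sample, which agrees with the diagram.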
EXAMPLE 1.8 Outdoor motion notifier using a microprocessor
Let's use the basic microprocessor of Figure 1.14 to implement a system
that sounds a buzzer when motion is detected at any of three
motion sensors outside a house. We connect the motion sensors to
microprocessor input pins I0, I1, and I2, and connect output pin
P0 to a buzzer (Figure 1.17). (We assume the motion sensors and
buzzers have appropriate electronic interface to the microprocessor
pins.) We can then write the following C program:
void main()
{
   while (1) {
      P0 = I0 || I1 || I2;
   }
}
Figure 1.17 Motion sensors connected to microprocessor.
The program executes the statement inside the while loop repeatedly. That statement will set
P0 to 1 if I0 is 1 or (written as || in the C language) I1 is 1 or I2 is 1; otherwise the statement
sets P0 to 0.
EXAMPLE 1.9 Counting the number of active motion sensors
In this example, we'll use the basic microprocessor of Figure 1.14 to implement a simple digital
system that outputs in binary the number of motion sensors that presently detect motion. We'll assume
two motion sensors, meaning we'll need to output a two-bit binary number, which can represent the
possible counts 0 (00), 1 (01), and 2 (10). We'll connect the motion sensors to microprocessor
input pins I0 and I1 and output the binary number onto output pins P1 and P0. We can then write
the following C program:
void main()
{
   while (1) {
      if (!I0 && !I1) {
         P1 = 0; P0 = 0; // output 00, meaning zero
      }
      else if ((I0 && !I1) || (!I0 && I1)) {
         P1 = 0; P0 = 1; // output 01, meaning one
      }
      else if (I0 && I1) {
         P1 = 1; P0 = 0; // output 10, meaning two
      }
   }
}
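The same two output bits can also be obtained arithmetically, by adding the two input bits and splitting the sum into its 2's place and 1's place. The following is a sketch of that alternative, not the program given in the example: the struct count_bits, the function count_active, and the use of ordinary variables instead of the pins I0, I1, P1, and P0 are all assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the two output pins: p1 has weight 2, p0 has weight 1. */
struct count_bits {
    int p1;
    int p0;
};

/* Returns the number of active sensors (0, 1, or 2) as a two-bit binary number. */
struct count_bits count_active(int i0, int i1)
{
    assert((i0 == 0 || i0 == 1) && (i1 == 0 || i1 == 1));

    int count = i0 + i1;           /* 0, 1, or 2 */
    struct count_bits out;
    out.p1 = (count >> 1) & 1;     /* 2's place of the count */
    out.p0 = count & 1;            /* 1's place of the count */
    return out;
}
```

For every combination of the two inputs this yields the same P1 and P0 values as the if-else version, which shows the connection between counting and the binary place weights discussed earlier in this chapter.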
Intel named their evolving 1980s/90s desktop processors using numbers: 80286, 80386, 80486.
As PCs became popular, Intel switched to catchier names: the 80586 was called a Pentium
("penta" means 5), followed by the Pentium Pro, the Pentium II, and others. The names
dominated over the numbers.
Designers like to use microprocessors in their digital systems because
microprocessors are readily available, inexpensive, easy to program, and easy
to reprogram. It may surprise you to learn that you can buy certain
microprocessor chips for under $1. Such microprocessors are found in places
like telephone answering machines, microwave ovens, cars, toys, certain
medical devices, and even in shoes with blinking lights. Examples include the
8051 (originally designed by Intel), the 68HC11 (made by Motorola), and the
PIC (made by Microchip). Other microprocessors may cost tens of dollars,
found in places like cell phones, portable digital assistants, office automation
equipment, and medical equipment. Such processors include the ARM (made by the
ARM corporation), the MIPS (made by the MIPS corporation), and others. Other
microprocessors, like the well-known Pentium processors from Intel, may cost several
hundred dollars and may be found in desktop computers. Some microprocessors may
cost thousands of dollars and are found in a mainframe computer running perhaps
an airline reservation system. There are literally hundreds, possibly even thousands, of
different microprocessor types available, differing in performance, cost, power, and
other metrics. And many of the small low-power processors cost under $1.
Figure 1.18 Microprocessor chip packages: (a) PIC and 8051 microprocessors, costing about $1 each,
(b) a Pentium processor with part of its package cover removed, showing the silicon chip inside.
Some readers of this book may be familiar with software programming, others may
not. Knowledge of programming is not essential to learning the material in this book.
We will on occasion compare custom digital circuits with their corresponding software
implementations; the ultimate conclusions of those comparisons can be understood
without knowledge of programming itself.
Digital Design- When Microprocessors Aren't Good Enough
With microprocessors readily available, why would anyone ever need to design new digital
circuits, other than those relatively few people designing microprocessors themselves? The
reason is that software running on a microprocessor often isn't good enough for a particular
application. In many cases, software may be too slow. Microprocessors only execute
one instruction (or at most a few instructions) at a time. But a custom digital circuit can
execute dozens, or hundreds, or even thousands of computations in parallel. Many applications,
like picture or video compression, fingerprint recognition, voice command detection,
or graphics display, require huge numbers of computations to be done in a short period of
time in order to be practical. After all, who wants a voice-controlled phone that requires
minutes to decode your voice command, or a digital camera that requires minutes to
take each picture? In other cases, microprocessors are too big, or consume too much
power, or would be too costly, making custom digital circuits preferable.
For the motion-in-the-dark-detector application, an alternative to the microprocessor-based
design uses a custom digital circuit inside the Detector block. A circuit is an
interconnection of electric components. We must design a circuit that, for each different combination
of inputs a and b, generates the proper value on F. One such circuit is shown in Figure
1.13(c). We'll describe the components in that circuit later. But you've now seen one simple
example of designing a digital circuit to solve a design problem. The microprocessor also
has a circuit inside, but because that circuit is designed to execute programs rather than just
detect motion at night, the microprocessor's circuit may contain about ten thousand components,
compared to just two components in our custom digital circuit. Thus, our custom
digital circuit may be smaller, cheaper, faster, and consume less power than an implementation
on a microprocessor.
Many applications use both microprocessors and custom digital designs to attain a
system that achieves just the right balance of performance, cost, power, size, design time,
flexibility, etc.
EXAMPLE 1.10 Deciding among a microprocessor and custom digital circuit
We must implement a digital system to control a fighter jet's aircraft wing. In order to properly control the
aircraft, the digital system must execute, 100 times per second, a computation task that adjusts the
wing's position based on the aircraft's present and desired speeds, pitch, yaw, and other flight factors.
Suppose that software on a microprocessor would require 50 ms (milliseconds) for each
execution of the computation task, whereas a custom digital circuit would require 5 ms per execution.
Executing the computation task 100 times on the microprocessor would require 100 * 50 ms =
5000 ms, or 5 seconds. But we require those 100 executions to be done in 1 second, so the
microprocessor is not fast enough. Executing the task 100 times with the custom digital circuit would
require 100 * 5 ms = 500 ms, or 0.5 seconds. As 0.5 seconds is less than 1 second, the custom
digital circuit can satisfy the system's performance constraint. We thus choose to implement the
digital system as a custom digital circuit.
EXAMPLE 1.11
Partitioning tasks in a digital camera
A digital camera captures pictures digitally using several steps. When the shutter button is pressed, a
grid of a few million light-sensitive electronic elements capture the image, each element storing a
binary number (perhaps 16 bits) representing the intensity of light hitting the element. The camera
then performs several tasks: the camera reads the bits of each of these elements, compresses the tens
of millions of bits into perhaps a few million bits, and stores the compressed bits as a file in the
camera's flash memory, among other tasks. Table 1.3 provides sample task execution times on an
inexpensive low-power microprocessor versus a custom digital circuit.
TABLE 1.3 Sample digital camera task execution times (in seconds) on a
microprocessor versus a digital circuit.
Task        Microprocessor    Custom digital circuit
Read        5                 0.1
Compress    8                 0.5
Store       1                 0.8
We need to decide which tasks to implement on the microprocessor and which
to implement as a custom digital circuit, subject to the constraint that we should
strive to minimize the amount of custom digital circuitry in order to reduce chip
cost. Such decisions are known as partitioning. Three partitioning options are
shown in Figure 1.19. If we implement all three tasks on the microprocessor, the
camera will require 5 + 8 + 1 = 14 seconds to take a picture, too much time for the
camera to be popular with consumers. We could implement all the tasks as custom
digital circuits, resulting in 0.1 + 0.5 + 0.8 = 1.4 seconds. We could instead implement
the read and compress tasks with custom digital circuits, while leaving the store task
to the microprocessor, resulting in 0.1 + 0.5 + 1, or 1.6 seconds. We might decide on this
last implementation option, to save cost without much noticeable time overhead.
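The arithmetic behind each partitioning option can be sketched in C so that all eight possible partitions, not just the three discussed, can be compared. This sketch makes several assumptions for illustration: the names picture_time and custom_count are hypothetical, a partition is encoded as a bit mask (bit 0 for read, bit 1 for compress, bit 2 for store, with a 1 meaning custom circuit), the tasks are assumed to run one after another, and the store time of 1 second on the microprocessor is taken from the sum 5 + 8 + 1 in the example.

```c
#include <assert.h>

#define NUM_TASKS 3   /* read, compress, store */

/* Task execution times in seconds, as used in the example. */
static const double micro_time[NUM_TASKS]  = {5.0, 8.0, 1.0};
static const double custom_time[NUM_TASKS] = {0.1, 0.5, 0.8};

/* Total time to take a picture for one partition.
   Bit i of mask set means task i is a custom digital circuit. */
double picture_time(unsigned int mask)
{
    assert(mask < (1u << NUM_TASKS));
    double total = 0.0;
    for (int i = 0; i < NUM_TASKS; i++) {
        total += (mask & (1u << i)) ? custom_time[i] : micro_time[i];
    }
    return total;
}

/* Number of tasks implemented as custom circuits, a rough stand-in for cost. */
int custom_count(unsigned int mask)
{
    assert(mask < (1u << NUM_TASKS));
    int count = 0;
    for (int i = 0; i < NUM_TASKS; i++) {
        if (mask & (1u << i)) {
            count++;
        }
    }
    return count;
}
```

With this encoding, mask 0 is the all-microprocessor option (14 seconds), mask 7 is the all-custom option (about 1.4 seconds), and mask 3 is the mixed option chosen in the example (about 1.6 seconds with only two custom circuits).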
Figure 1.19 Digital camera implemented with: (a) a microprocessor (read, compress, and store),
(b) custom circuits, and (c) a combination of custom circuits and a microprocessor.
1.4 ABOUT THIS BOOK
Section 1.1 discussed how digital systems now appear everywhere around us and significantly
impact the way we live. Section 1.2 highlighted how learning digital design
accomplishes two goals: showing us how microprocessors work "under the hood," and
enabling us to implement systems using custom digital circuits rather than or alongside
microprocessors to achieve better implementations. This latter goal is becoming
increasingly significant since so many analog phenomena, like music and video, are becoming
digital. That section also introduced a key method of digitizing analog signals, namely
binary numbers, and described how to convert among decimal and binary numbers.
Section 1.3 described how designers tend to prefer to implement digital systems by
writing software that executes on a microprocessor, yet designers often use custom digital
circuits to meet an application's performance requirements or other requirements.
In the remainder of this book you will learn about the exciting and challenging field
of digital design, wherein we convert desired system functionality into a custom digital
circuit. Chapter 2 will introduce the most basic form of digital circuit, combinational
circuits, whose outputs are simply a function of the present values on the circuit's inputs.
That chapter will show how to use a form of math called Boolean algebra to describe our
desired circuit functionality, and will provide clear steps for converting Boolean equations
to circuits. Chapter 3 will introduce a more advanced type of circuit, sequential
circuits, whose outputs are a function not only of the present input values, but also of
previous input values; in other words, sequential circuits have memory. Such circuits are
commonly referred to as controllers. That chapter will show us how to use another
1.12 Convert the following decimal numbers to binary numbers using the divide-by-2 method:
(a) 9
(b) 15
(c) 32
(d) 140
1.13 Convert the following decimal numbers to binary numbers using the divide-by-2 method:
(a) 19
(b) 30
(c) 64
(d) 128
1.14 Convert the following decimal numbers to binary numbers using the divide-by-2 method:
(a) 3
(b) 65
(c) 90
(d) 100
1.15 Convert the following decimal numbers to binary numbers using the divide-by-2 method:
(a) 23
(b) 87
(c) 123
(d) 101
1.16 Convert the following binary numbers to hexadecimal:
(a) 11110000
(b) 11111111
(c) 01011010
(d) 1001101101101
1.17 Convert the following binary numbers to hexadecimal:
(a) 11001101
(b) 10100101
(c) 11110001
(d) 1101101111100
1.18 Convert the following binary numbers to hexadecimal:
(a) 11100111
(b) 11001000
(c) 10100100
(d) (JIll 11'11
1.19 Convert the following hexadecimal numbers to binary:
(a) FF
(b) F0A2
(c) 0F100
(d) 100
1.20 Convert the following hexadecimal numbers to binary:
(a) 4F5E
(b) 3FAD
(c) 3E2A
(d) DEED
1.21 Convert the following hexadecimal numbers to binary:
(a) B0C4
(b) 1EF03
(c) F002
(d) BEEF
1.22 Convert the following hexadecimal numbers to decimal:
(a) FF
(b) F0A2
(c) 0F100
(d) 100
1.23 Convert the following hexadecimal numbers to decimal:
(a) 10
(b) 4E3
(c) FF0
(d) 200
1.24 Convert the decimal number 128 to the following number systems:
(a) binary
(b) hexadecimal
(c) base three
(d) base five
(e) base fifteen
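Conversions between binary and hexadecimal work on groups of four bits, since each hexadecimal digit stands for exactly four bits. The sketch below illustrates that grouping; the function names and the example value 2C are our own and do not come from the exercises.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_binary(hex_str):
    """Replace each hexadecimal digit with its 4-bit binary pattern."""
    return "".join(format(HEX_DIGITS.index(d), "04b") for d in hex_str.upper())


def binary_to_hex(bit_str):
    """Pad on the left to a multiple of 4 bits, then convert each 4-bit group."""
    padded = bit_str.zfill((len(bit_str) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


print(hex_to_binary("2C"))      # prints 00101100
print(binary_to_hex("101100"))  # prints 2C
```

Note that leading zeros are padded on the left before grouping, so a binary number whose length is not a multiple of four still converts correctly.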
1.25 Compare the number of digits necessary to represent the following decimal numbers in binary,
octal, decimal, and hexadecimal representations. You need not determine the actual
representations, just the number of required digits. For example, representing the decimal number 12
requires four digits in binary (1100 is the actual representation), two digits in octal (14), two
digits in decimal (12), and one digit in hexadecimal (C).
(a) 8
(b) 60
(c) 300
(d) 1000
(e) 999,999
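One way to count digits without computing the representation is to count how many times the number can be divided by the base before reaching zero. The following is a minimal sketch of that idea; the name `digits_needed` is hypothetical, and the example reproduces only the decimal number 12 from the exercise statement.

```python
def digits_needed(value, base):
    """Count the digits required to write a non-negative integer in the given base."""
    if value == 0:
        return 1              # zero still needs one digit
    count = 0
    while value > 0:
        value //= base        # each division by the base strips one digit
        count += 1
    return count


for base in (2, 8, 10, 16):
    print(base, digits_needed(12, base))  # prints 2 4, then 8 2, then 10 2, then 16 1
```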
1.26 Determine the decimal number ranges that can be represented in binary, octal, decimal, and
hexadecimal using the following numbers of digits. For example, 2 digits can represent decimal
number range 0 through 3 in binary (00 through 11), 0 through 63 in octal (00 through 77), 0
through 99 in decimal (00 through 99), and 0 through 255 in hexadecimal (00 through FF).
(a) 1
(b) 3
(c) 6
(d) 8
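The two-digit example in the exercise statement follows from a general fact: n digits in base b can represent the values 0 through b^n - 1. A short sketch, with a function name of our own choosing, checks the two-digit example only.

```python
def largest_value(num_digits, base):
    """Largest decimal value representable with num_digits digits in the given base."""
    return base ** num_digits - 1


for base in (2, 8, 10, 16):
    print(base, largest_value(2, base))  # prints 2 3, then 8 63, then 10 99, then 16 255
```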
SECTION 1.3: IMPLEMENTING DIGITAL SYSTEMS: PROGRAMMING
MICROPROCESSORS VERSUS DESIGNING DIGITAL CIRCUITS
1.27 Use a microprocessor like that in Figure 1.14 to implement a system that sounds an alarm
whenever there is motion detected at the same time in three different rooms. Each
motion sensor output comes to us on a wire as a bit, 1 meaning motion, 0 meaning no motion.
We sound the alarm by setting an output wire "alarm" to 1. Show the connections to and from
the microprocessor, and the C code to execute on the microprocessor.
1.28 Use a microprocessor like that in Figure 1.14 to implement a system that counts the number of
cars in a parking lot with seven spaces. Each space has a sensor that outputs a 1 if a car is
present, and that outputs a 0 otherwise. The output should be written in binary over three
wires. Show the connections with the microprocessor and the C code. Hint: use a loop and an
integer variable to count the number of cars present, then use an if-else statement or a switch
statement to convert the integer to the appropriate 3-bit output.
1.29 Use a microprocessor like that in Figure 1.14 to implement a system that displays the number
of people in a waiting room onto an LED display. There are eight LEDs arranged in a row and
eight chairs in the waiting room, each equipped with a sensor that will output a 1 when the
seat is in use. The number of LEDs lit will correspond to the number of seats being occupied.
For instance, if two seats are occupied (regardless of which two seats those are), the first two
LEDs will light; if three seats are occupied, the first three LEDs in the row will light up.
Regardless of which particular seats are occupied, the lights will light up incrementally. Show
the connections with the microprocessor and the appropriate C code.
1.30 Suppose a particular TV set-top box at a hotel supports encrypted video, and that decrypting
each video frame consists of three subtasks A, B, and C. The execution times of each task on
a microprocessor versus a custom digital circuit are 100 ms versus [illegible] ms for A, 10 ms
versus 2 ms for B, and 15 ms versus [illegible] ms for C. Partition the tasks among the
microprocessor and custom digital circuitry, such that you minimize the amount of custom digital
circuitry, while meeting the constraint of decrypting at least 30 frames per second.
1.31 The owner of a baseball stadium wants to eliminate paper tickets for gaining entrance to
baseball games. She would like to sell tickets electronically and allow people attending the game
to enter by their fingerprint. The owner has two options for installing the fingerprint
recognition system. The first is a system that implements the fingerprint recognition
using software executing on a microprocessor. The second option is a custom digital circuit
designed specifically for fingerprint recognition. The software system requires 5.5 seconds to
recognize an individual's fingerprint and costs $50 per unit, whereas the digital circuit
requires 1.3 seconds for recognition and costs $100 per unit. The owner wants to ensure that
everyone attending the game will be able to enter the stadium before the game starts, and thus
needs to be able to support 100,000 people entering the stadium within 15 minutes. Compare
the two alternative systems in terms of how many people per minute each can support,
how many units of each system would be needed to support 100,000 people in 15 minutes,
and what the overall cost of installation would be for the two competing systems.
1.32 How many possible partitionings are there of a set of N tasks where each task can be
implemented on a microprocessor or as a custom digital circuit?
1.33 *Write a program that automatically partitions a set of 10 tasks among a microprocessor and
custom digital circuitry. Assume that each task has two associated execution times, one for the
microprocessor and the other for custom digital circuitry. Assume also that each task has an
associated size number, representing the amount of digital circuitry required to implement that
task. Your program should read in the execution times and sizes from a file, and
should seek to minimize the amount of digital circuitry while meeting a constraint
on the sum of the task execution times. Your program should output the partitioning (each
task's name and whether the task is mapped to the microprocessor or to custom digital
circuitry), the total execution time of the tasks for that partitioning, and the total digital circuit
size. Hint: you probably can't try all possible partitionings of the 10 tasks, so use a
partitioning approach that makes some educated guesses. Your program likely won't be able to
guarantee that it finds the best partitioning, but it should at least find a good partitioning.
DESIGNER PROFILE
Kelly first became interested in engineering while attending a talk about engineering at a
career fair in high school. "I was dazzled by the interesting ideas and the cool graphs." While in
college, though, she learned that "there was much more to engineering than ideas and graphs.
Engineers apply their ideas and skills to build things that really make a difference in
people's lives, for generations to come."

In her first few years as an engineer, Kelly has worked on a variety of projects "that may help
numerous individuals." One project was a ventilator system like the one mentioned earlier in
this chapter. "We designed a new control system that may enable people on ventilators to
breathe with more comfort while still getting the proper amount of oxygen." In addition, she
examined alternative implementations of that control system, including on a microprocessor,
as a custom digital circuit, and as a combination of the two. "Today's technologies, like
FPGAs, provide so many different options. We examined several options to see what the
tradeoffs were among them. Understanding the tradeoffs among the options is quite important
if we want to build the best system possible."

She also worked on a project that developed "small self-explanatory electronic blocks that
people could connect together to build useful electronic systems involving almost any kind of
sensor, like motion or light sensors. Those blocks could be used by kids to learn basic
concepts of logic and computers, concepts which are quite important to learn these days. Our
hope is that these blocks will be used as teaching tools in schools. The blocks can also be
used to help adults set up useful systems in their homes, perhaps to monitor an aging parent,
or a child at home sick. The potential for these blocks is great; it will be interesting to see
what impact they have."

"My favorite thing about engineering is the variety of skills and creativity involved. We are
faced with problems that need to be solved, and we solve them by applying known techniques
in creative ways. Engineers must continually learn new technologies, hear new ideas, and
track current products, in order to be good designers. It's all very exciting and challenging.
Each day at work is different. Each day is exciting and is a learning experience.

"Studying to be an engineer can be a great deal of work but it's worth it. The key is to stay
focused, to keep your mind open, and to make good use of available resources. Staying focused
means to keep your priorities in order; for example, as a student, studying comes first,
recreation second. Keeping your mind open means to always be willing to listen to different
ideas and to learn about new technologies. Making good use of resources means to aggressively
seek information, from the Internet, from books, and so on. You never know where you're
going to get your next important bit of information, and you won't get that information unless
you seek it."
2 Combinational Logic Design
2.1 INTRODUCTION
A digital circuit whose outputs depend solely on the present combination of the circuit
inputs' values is called a combinational circuit. Combinational circuits are a basic but
important class of digital circuits able to implement some systems, but more importantly
serving as the basis for more complex classes of circuits. This chapter introduces the
design of basic combinational circuits. Later chapters will deal with more combinational
circuits and with sequential circuits, whose outputs depend on the sequence
(history) of values that have appeared at the circuit's inputs. Figure 2.1 illustrates the
difference between combinational and sequential circuits.
Figure 2.1 Combinational versus sequential digital circuits. Combinational circuit (inputs a
and b, output F): if we know the present input bit values, then we can determine the output
value. If ab=00, then F is 0; if ab=01, then F is 0; if ab=10, then F is 1; if ab=11, then F is 0.
Sequential circuit (output F): we cannot determine the output value just from looking at the
present input values. We must also know the history of input values; e.g., if ab was 00 and
then 10, F is 0, but if ab was 11 and then 10, F is 1.
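To make the distinction in Figure 2.1 concrete, here is a small Python sketch. The combinational function follows the figure's table, reading the entry for ab=00 as 0. The sequential class is only one hypothetical circuit consistent with the two histories quoted in the figure; its internal rule and its initial history are assumptions made for illustration.

```python
def combinational_f(a, b):
    """Output depends only on the present inputs (here, F = a AND NOT b)."""
    return 1 if (a == 1 and b == 0) else 0


class SequentialF:
    """Hypothetical sequential circuit: output depends on present and previous inputs."""

    def __init__(self):
        self.previous = (0, 0)   # assumed initial history

    def step(self, a, b):
        # Assumed rule: F is 1 only when the present input is 10 and the previous was 11.
        f = 1 if (a, b) == (1, 0) and self.previous == (1, 1) else 0
        self.previous = (a, b)   # remember the present input for the next step
        return f


circuit = SequentialF()
circuit.step(0, 0)
print(circuit.step(1, 0))   # prints 0: ab was 00 and then 10

circuit = SequentialF()
circuit.step(1, 1)
print(circuit.step(1, 0))   # prints 1: ab was 11 and then 10
```

Notice that the combinational function needs no stored state at all, while the sequential version must keep at least one remembered value.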
The chapter will introduce the basic building blocks of combinational circuits,
known as logic gates, and will also introduce a form of mathematics, known as Boolean
algebra, that is useful for designing combinational circuits.
2.2 SWITCHES
Electronic switches form the basis of all digital circuits, so they make a good starting
point for the discussion of digital circuits. You use a type of switch, a light switch,
whenever you turn lights on or off. To understand a switch, it helps to understand some basic
electronics.
Electronics 101
Although understanding the electronics underlying digital logic gates is optional, many
people find that a basic understanding satisfies much curiosity and also helps in
understanding some of the non-ideal digital gate behavior later on.
Figure 2.2 9 V battery connected to a 2 ohm light bulb, resulting in 4.5 A of current.
You're probably familiar with the idea of electrons, or let's just say charged particles,
flowing through wires and causing lights to illuminate or stereos to blast music. An
analogous situation is water flowing through pipes and causing sprinklers to pop up or
turbines to turn. We now describe three basic electrical terms:
• Voltage is the difference in electric potential between two points. Voltage is
  measured in volts (V). Convention says that the earth, or ground, is 0 V. Informally,
  voltage tells us how "eager" the charged particles on one side of a wire are to get
  to ground (or any lower voltage) on the wire's other side. Voltage is analogous to
  the pressure of water trying to flow through a pipe: water under higher pressure
  is more eager to flow, even if the water can't actually flow, perhaps because of a
  closed faucet.
• Current is a measure of the flow of the charged particles. Informally, current tells
  us the rate that particles are actually flowing. Current is analogous to water
  flowing through a pipe. Current is measured in amperes (A), or amps for short.
• Resistance is the tendency of a wire (or anything, really) to resist the flow of
  current. Resistance is analogous to a pipe's diameter: a narrow pipe resists water
  flow, while a wide pipe lets water flow more freely. Electrical resistance is
  measured in ohms (Ω).
Consider a battery. The particles at the positive terminal want to flow to the
negative terminal. How "eager" are they to flow? That depends on the voltage
difference between the terminals: a 9 V battery's particles are more eager to flow
than a 1.5 V battery's particles, because the 9 V battery's particles have more
potential energy. Now suppose you connect the positive terminal through a light
bulb back to the negative terminal as shown in Figure 2.2. The 9 V battery will
result in more current flowing, and thus a brighter lit light, than the 1.5 V battery.
Precisely how much current will flow is determined using the equation:
V = IR (known as Ohm's Law)
where V is voltage, I is current, and R is resistance (in this case, of the light bulb).
So if the resistance were 2 ohms, a 9 V battery would result in 4.5 A (since 9 =
4.5*2) of current, while a 1.5 V battery would result in 0.75 A.
Rewriting the equation as I = V/R might make more intuitive sense: the
higher the voltage, the more current; the higher the resistance, the less current.
Ohm's Law is perhaps the most fundamental equation in electronics.
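The arithmetic in the battery example can be checked with a couple of lines of Python. This is only a sketch of I = V/R; the function and parameter names are our own.

```python
def current_amps(voltage_volts, resistance_ohms):
    """Ohm's Law rearranged as I = V / R."""
    return voltage_volts / resistance_ohms


print(current_amps(9, 2))     # prints 4.5  (9 V battery, 2 ohm bulb)
print(current_amps(1.5, 2))   # prints 0.75 (1.5 V battery, same bulb)
```

Doubling the resistance in either call halves the current, which matches the intuition that a higher resistance means less current.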
The Amazing Shrinking Switch
Now back to switches. Figure 2.3(b) shows that a switch has three parts; let's call them
the source input, the output, and the control input. The source input has higher voltage
than the output, so current wants to flow from the source input through the switch to the
output. The whole purpose of a switch is to block that current when the control sets the
switch "off," and to allow that current to flow when control sets the switch "on." For
example, when you flip a light switch up to turn the switch on, the switch causes the
source input wire to physically touch the output wire so current flows. When you flip the
switch down to turn the switch off, the switch physically separates the source input from
the output. In our water analogy, the control input is like a faucet valve that determines
whether water flows through a pipe.
Figure 2.3 (a) The evolution of switches: relays (1930s), vacuum tubes (1940s), discrete transistors
(1950s), and integrated circuits (ICs) containing transistors (1960s-present), pictured next to a
quarter to show the relative size. ICs originally held about ten transistors; now they can hold more
than a billion. (b) Simple view of a switch, with a source input, an output, and a control input
that sets the switch "off" or "on."
Switches are what cause digital circuits to utilize binary numbers made from bits: the on
or off nature of a switch corresponds to the 1s and 0s in binary. We now discuss
the evolution of switches over the 1900s, leading up to the CMOS transistor switches
commonly used today in digital circuits.
1930s-Relays
Engineers in the 1930s tried to devise ways to compute using electronically controlled
switches, that is, switches whose control input was another voltage. One such switch, an
electromagnetic relay like that in Figure 2.3(a), was already being used by the telephone
industry for switching telephone calls. A relay has a control input that is a type of magnet,
which becomes magnetized when the control has a positive voltage. In one type of relay, that
magnet pulls a piece of metal down, resulting in a connection from the source input to the
output, akin to pulling down a drawbridge to connect one road to another. When the
control input returns to 0 V, the piece of metal returns up again (perhaps pushed by a small
spring), disconnecting the source input from the output. In telephone systems, relays
enabled calls to be routed from one phone to another, without the need for those nice
human operators that previously would manually connect one phone's line to another.
1940s-Vacuum Tubes
Relays relied on metal parts moving up and down, and thus were rather slow. In the 1940s
and 1950s, vacuum tubes, shown in Figure 2.3(a) and originally used to amplify weak
electric signals like those in a telegraph, began to replace relays in computers. Vacuum
tubes had no moving parts, so the tubes were much faster than relays.
"DEBUGGING"
In 1947, a moth got stuck in one of the relays of the Mark II computer
at Harvard. To get the computer working properly again, technicians
found and removed the bug. Though the term "bug" had been used for
decades before by engineers to indicate a defect in mechanical or
electrical equipment, the removal of that moth in 1947 is considered
to be the origin of the term "debugging" in computer programming.
Technicians taped that moth to their written log (shown in the picture
to the side), and that moth is now on display at the National Museum
of American History in Washington, D.C.
Jack Kilby of Texas Instruments and Robert Noyce of Fairchild Semiconductor are often
credited with each having independently invented the IC.
The machine said to be the world's first general-purpose computer, the ENIAC (Electronic
Numerical Integrator And Computer), was completed in the U.S. in 1946. ENIAC
contained about 18,000 vacuum tubes and 1500 relays, weighed over 30 tons, was 100
feet long and 8 feet high (so it would likely not fit in any room of your house, unless you
have an absurdly big house), and consumed 174,000 watts of power. Imagine the heat
generated by a room full of 1740 100-watt light bulbs. That's hot. For all that, ENIAC
could compute about 5000 operations per second; compare that to the billions of
operations per second of today's personal computers, and even the tens of millions of
computations per second by a handheld cell phone.
Although vacuum tubes were faster than relays, they consumed a lot of power,
generated a lot of heat, and failed frequently.
Vacuum tubes were commonplace in many electronic appliances in the 1960s and
1970s. I remember taking trips to the store with my dad in the early 1970s to buy
replacement tubes for our television set. Vacuum tubes still live today in a few electronic
devices. One place you might still find tubes is in electric guitar amplifiers, where the tube's
unique-sounding audio amplification is still demanded by rock guitar enthusiasts who
want their version of classic rock songs to sound just like the originals.
1950s-Discrete Transistors
The invention of the transistor in 1947, credited to William Shockley, John Bardeen, and
Walter Brattain of Bell Laboratories (the research arm of AT&T), resulted in smaller and
lower-power computers. A solid-state (discrete) transistor, shown in Figure 2.3(a), uses a
small piece of silicon, "doped" with some extra materials, to create a switch. Since these
switches used "solid" materials rather than a vacuum or even moving parts in a relay, they
were commonly referred to as solid-state transistors. Solid-state transistors were smaller,
cheaper, faster, and more reliable than tubes, and became the dominant computer switch
in the 1950s and 1960s.
1960s-Integrated Circuits
The invention of the integrated circuit (IC) in 1958 really revolutionized computing.
An IC, a.k.a. a chip, packs numerous tiny transistors on a fingernail-sized piece of silicon.
So instead of 10 transistors requiring 10 discrete electronic components on your board,
10 transistors can be implemented on one component, the chip. Figure 2.3(a) shows a
picture of an IC that has a few million transistors. Though early ICs featured only tens of
transistors, improvements in IC technology have resulted in nearly ONE BILLION
transistors on a chip today. IC technology has shrunk transistors down to a totally different
scale. A vacuum tube (about 100 mm long) is to a modern IC transistor (about 100 nm) as
a skyscraper (about 0.5 km) is to the thickness of a credit card (about 0.5 mm).
I've been working in this field for two decades, and the number of transistors on a
chip still amazes me. The number 1 billion is bigger than most of us have an intuitive feel
for. Think of pennies, and consider the volume that 1 billion pennies would occupy.
Would they fit in your bedroom? The answer is probably no (unless you have a really
huge bedroom), since a typical bedroom is about 40 cubic meters, while 1 billion pennies
would occupy about 400 cubic meters. So you would need about 10 bedrooms, roughly
the size of an entire house, packed from wall to wall, floor to ceiling, with pennies, to
store all that money. And if we stacked the pennies, they would reach nearly 1000 miles
into the sky; for comparison, a jet flies at an altitude of about 5 miles. That's a lot of
pennies. But we manage to fit 1 billion transistors onto silicon chips of just a few square
centimeters. Truly amazing.
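If you'd like to check the penny estimate yourself, the sketch below redoes the arithmetic. It assumes the dimensions of a U.S. one-cent coin (19.05 mm diameter, 1.52 mm thick), which are not stated in the text, and it treats each penny as a solid cylinder while ignoring the air gaps between coins.

```python
import math

PENNY_DIAMETER_M = 0.01905    # assumed U.S. cent diameter, in meters
PENNY_THICKNESS_M = 0.00152   # assumed U.S. cent thickness, in meters
METERS_PER_MILE = 1609.344
NUM_PENNIES = 1_000_000_000

# Volume of one penny as a cylinder, then of a billion pennies.
penny_volume_m3 = math.pi * (PENNY_DIAMETER_M / 2) ** 2 * PENNY_THICKNESS_M
total_volume_m3 = penny_volume_m3 * NUM_PENNIES

# Height of a single stack of a billion pennies.
stack_height_miles = PENNY_THICKNESS_M * NUM_PENNIES / METERS_PER_MILE

print(round(total_volume_m3))      # a little over 400 cubic meters
print(round(stack_height_miles))   # a little under 1000 miles
```

Under these assumptions the results land close to the rounded figures quoted above: roughly ten 40-cubic-meter bedrooms, and a stack approaching 1000 miles.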
The wires that connect all those transistors on a chip, if straightened into one straight
wire, would be several miles long.
IC transistors are much smaller, more reliable, faster, and less power-hungry than
discrete transistors. Thus, IC transistors are now by far the most commonly used switch in
computing.
ICs of the early 1960s could hold tens of transistors, and are known today as small-scale
integration (SSI). As transistor sizes shrank, in the late 1960s and early 1970s, ICs
could hold hundreds of transistors, known as medium-scale integration (MSI). The 1970s
saw the development of large-scale integration (LSI) ICs with thousands of transistors,
while very-large-scale integration (VLSI) chips evolved in the 1980s. Since then, ICs
have continued to increase in their capacity, to around 1 billion transistors. To calibrate
your understanding of this number, consider that the first Pentium microprocessor of the
early 1990s required only about 3 million transistors, and some popular but relatively
small microprocessors require only about 100,000 transistors. Many of today's high-end
chips therefore contain dozens of microprocessors, and can conceivably contain hundreds
of the relatively small microprocessors (or just one or two big microprocessors).
IC density has been doubling roughly every 18 months since the 1960s. The doubling
of IC density every 18 months is widely known as Moore's Law, named after Gordon
Moore, a co-founder of Intel Corporation, who made predictions back in 1965 that the
number of components per IC would double every year or so. At some point, chip makers
won't be able to shrink transistors any further. After all, the transistor has to at least be
A SIGNIFICANT INVENTION
We now know that the invention of the transistor was the
start of the amazing computation and communication
revolutions that occurred in the latter half of the 20th
century, enabling us today to do things like see the world
on TV, surf the web, and talk on cell phones. But the
implications of the transistor were not known by most
people at the time of its invention. Newspapers did not
headline the news, and most stories that did appear
predicted simply that transistors would improve things
like radios and hearing aids. One may wonder what
recently invented but unnoticed technology might
significantly change the world once again.
HOW DO THEY MAKE TRANSISTORS SO SMALL? USING PHOTOGRAPHIC METHODS
If you took a pencil and made the smallest dot that you
could on a sheet of paper, that dot's area would hold
many thousands of transistors on a modern silicon chip.
How can chip makers create such tiny transistors? The
key lies in photographic methods. Chip makers lay a
special chemical onto the chip, special because the
chemical changes when exposed to light. Chip makers
then shine light through a lens that focuses the light
down to extremely small regions on the chip, similar
to how a microscope's lens lets us see tiny things by
focusing light, but in reverse. The chemical in the small
illuminated region changes, and then a solvent washes
away the chemical, but some regions stay because of
the light that changed that region. Those remaining
regions form parts of transistors. Repeating this process
over and over again, with different chemicals at
different steps, results not only in transistors, but also
wires connecting the transistors, and insulators
preventing crossing wires from touching.
Photograph of a Pentium processor's silicon chip having millions of transistors. Actual size is
about 1 cm each side.
wide enough to let electrons pass through. People have been predicting the end of
Moore's Law for over a decade now, but transistors keep shrinking.
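To get a feel for what doubling every 18 months means, consider this small sketch. It models only the idealized trend; real chips do not follow it exactly, and the function name and the starting count of 10 transistors are assumptions chosen for illustration.

```python
def projected_transistors(initial_count, years_elapsed, doubling_period_years=1.5):
    """Idealized Moore's Law trend: the count doubles once per doubling period."""
    return initial_count * 2 ** (years_elapsed / doubling_period_years)


# Two doubling periods (3 years) multiply the count by 4.
print(projected_transistors(10, 3))   # prints 40.0

# Fifteen years is ten doubling periods, a factor of 1024.
print(projected_transistors(10, 15))  # prints 10240.0
```

The striking part is the compounding: every 15 years of steady doubling multiplies the transistor count by about a thousand.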
Not only do smaller transistors and wires provide for more functionality in a chip,
but they also provide for faster circuits, in part because electrons need not travel as far to
get from one transistor to the next. This increased speed is the main reason why personal
computer clock speeds have improved so drastically over the past few decades, from
kilohertz frequencies in the 1970s to gigahertz frequencies in the early 2000s.
2.3 THE CMOS TRANSISTOR
The most popular type of IC transistor is the CMOS transistor. Although a detailed
explanation of how a CMOS transistor works is beyond the scope of this book,
I've found that a simplified explanation seems to satisfy much curiosity.
A chip is made primarily from the element silicon. A chip, also known as an
integrated circuit, or IC, is typically about the size of a fingernail. Even if you open up a
computer or other chip-based device, you would not actually see the silicon chip, since
chips are actually inside a larger, usually black, protective package. But you certainly
should be able to see those black packages, mounted on a printed circuit board, inside a
variety of household electronic devices.
Figure 2.4 illustrates a cross section of a tiny part of a silicon chip, showing the side
view of one type of CMOS transistor, an nMOS transistor. The transistor has the three
parts of a switch: (1) the source input; (2) the output, which is called the drain, I suppose
because electric particles flow to the drain like water flows to a drain; and (3) the control
input, which is called the gate, I suppose because the gate blocks the current flow like a
gate blocks a dog from escaping the backyard. A chip maker creates the source and drain
by injecting certain elements into the silicon. Figure 2.4 also shows the electronic symbol
of an nMOS transistor.
Suppose the drain was connected to a small positive voltage (modern technologies
use about 1 or 2 V) as the "power supply," and the source was connected through
a resistor to ground. Current would thus want to flow from drain to source, and on to
ground. (Note: unfortunately, convention is that current flow is defined using positive
charge, even though actually negatively charged electrons are flowing, so you may
notice that we say current flows from drain to source, even though electrons flow from
source to drain.) However, the silicon channel between source and drain is not normally a
conductor, or in other words, the channel is normally an insulator. We can think of an
insulator as an extremely large resistance. Since I = V/R, then I will essentially be 0. The
switch is off.
Figure 2.4 CMOS transistors: (a) transistor on silicon; a positive voltage at the gate attracts
electrons into the channel, turning the channel between source and drain into a conductor. (b)
nMOS transistor symbol with indication of conducting when gate = 1. (c) pMOS transistor symbol,
which conducts when gate = 0.
The really interesting thing about silicon is that we can change the channel
from an insulator to a conductor just by applying a small positive voltage to the
gate. That gate voltage doesn't result in current flow from the gate to channel,
because of the small insulator (oxide) between the gate and the channel. But that
gate voltage does create a positive electric field that attracts electrons, which have a
negative charge, from the larger silicon region into the channel region, akin to how
you can move paper clips on a tabletop by moving a magnet under the table. When
enough electrons gather into the channel, the channel all of a sudden becomes a
conductor. A conductor has extremely low resistance, so current flows almost freely
between drain and source. The switch is now on. As you can see, silicon is not quite
a conductor but not quite an insulator either, rather representing something in
between; hence the term semiconductor.
An analogy to the current trying to cross the channel is a person trying to cross
a river. Normally, the river might not have enough stepping stones for the person to
be able to walk across. But if we could attract stones from other parts of the river
into one pathway (the channel), the person could easily walk across the river
(Figure 2.5).
Figure 2.5 CMOS transistor operation analogy: a person may not be able to cross a river until just
enough stepping stones are attracted into one pathway. Likewise, electrons can't cross the channel
between source and drain until just enough electrons are attracted into the channel.
We mentioned that nMOS was one type of CMOS transistor. The other type is
pMOS. A pMOS transistor is similar except that the channel has the opposite functionality:
the channel is a conductor normally, and then doesn't conduct when the gate has a positive
voltage. Figure 2.4 shows the electronic symbol for a pMOS transistor. The use of these
two "complementary" types of transistors is where the C comes from in CMOS. The
MOS stands for Metal Oxide Semiconductor, but the reasons for that name go beyond the
scope of this discussion.
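Viewed purely as switches, the two transistor types can be summarized in a few lines of Python. This is a deliberately idealized sketch (a real transistor is an analog device), and the function names are our own; gate values of 1 and 0 stand for a positive gate voltage and 0 V, respectively.

```python
def nmos_conducts(gate):
    """Idealized nMOS switch: the channel conducts when the gate is 1."""
    return gate == 1


def pmos_conducts(gate):
    """Idealized pMOS switch: the channel conducts when the gate is 0."""
    return gate == 0


for gate in (0, 1):
    print(gate, nmos_conducts(gate), pmos_conducts(gate))
# prints: 0 False True
#         1 True False
```

For any gate value, exactly one of the two conducts, which is the "complementary" behavior that gives CMOS its name.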
SILICON VALLEY, AND THE SHAPE OF SILICON
Silicon Valley is not a city, but refers to an area in
Northern California, about an hour south of San
Francisco, that includes several cities like San Jose,
Mountain View, Sunnyvale, Milpitas, Palo Alto, and
others. The area is heavily populated by computer and
other high-technology companies, and to a large extent
is the result of Stanford University's (located in Palo
Alto) efforts to attract and create such companies. What
shape is silicon? Once, as my plane arrived in Silicon
Valley, the person next to me (who happened to be a
college senior studying Computer Science) asked
"What shape is a silicon anyways?" I eventually
realized he thought silicon was a type of polygon, like
a pentagon or an octagon. Well, the words do sound
similar. Silicon is not a shape, but an element, like
carbon or aluminum or silver. Silicon has an atomic
number of 14, has a chemical symbol of "Si," and is the
second most abundant element (next to oxygen) in the
earth's crust, found in items like sand and clay. Silicon
is used to make mirrors and glass, in addition to chips.
In fact, to the naked eye, a silicon chip actually looks
like a small mirror.
A chip package with its chip cover removed; you can see the mirror-like silicon chip in the center.
2.4 BOOLEAN LOGIC GATES- BUILDING BLOCKS FOR DIGITAL CIRCUITS
You've seen that CMOS transistors can be used to implement switches on an incredibly tiny
scale. However, trying to use switches as our building blocks to build complex digital circuits
is akin to trying to use small rocks to build a bridge, as illustrated in Figure 2.6. Sure, you
could probably build something from rudimentary building blocks, but the building process
would be a real pain. Switches (and small rocks) are just too low-level as building blocks.
[Figure 2.6 illustration. Top panel: "These blocks ... are hard to work with." Transistors are
hard to work with. Bottom panel: "The right building blocks ... enable greater designs." The
logic gates that we'll soon introduce enable greater designs.]
Figure 2.6 Having the right building blocks can make all the difference when building things.
Boolean Algebra and Its Relation to Digital Circuits
Fortunately, Boolean logic gates help us in the design task by representing digital circuit
building blocks that are much easier to work with than switches. Boolean logic was
developed in the mid-1800s by the mathematician George Boole, not to build digital
circuits (which weren't even a glimmer in anyone's eye back then), but rather as a scheme
for using algebraic methods to formalize human logic and thought.
An algebra is a branch of mathematics that uses letters or symbols to represent
numbers or values, where those letters/symbols can be combined according to a set of
known rules. Boolean algebra uses variables whose values can only be 1 or 0
(representing true or false, respectively) and whose operators, like AND, OR, and NOT, operate
on such variables and return 1 or 0. So we might declare variables x, y, and z, and then
say that z = x OR y, meaning z is 1 only if at least one of x or y is 1. Likewise, we
might say z = x AND NOT(y), meaning z is 1 only if x is 1 and y is 0. Contrast
Boolean algebra with the regular algebra you're familiar with from perhaps high school,
in which variable values could be integers (for example), and operators could be addition,
subtraction, and multiplication.
The basic Boolean operators are AND, OR, and NOT:
AND returns 1 if both its operands are 1. So the result of a AND b is 1 if both
a=1 and b=1, otherwise the result is 0.
OR returns 1 if either or both of its operands are 1. So the result of a OR b is 1
in any of the following cases: ab=01, ab=10, ab=11. Thus, the only time a OR
b is 0 is when ab=00. (Here "ab=01" is shorthand for "a=0, b=1.")
NOT returns 1 if its operand is 0. So NOT(a) returns 1 if a is 0, and returns 0 if a
is 1.
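As a hedged illustration, the three operators can be modeled in C as small functions on
int values restricted to 0 and 1. The function names bool_and, bool_or, and bool_not are
assumptions made for this sketch; they are not names used elsewhere in this book.

```c
/* Boolean operators modeled on int values restricted to 0 or 1.
   The names bool_and, bool_or, and bool_not are illustrative only. */

/* AND: 1 only if both operands are 1. */
int bool_and(int a, int b) { return (a == 1 && b == 1) ? 1 : 0; }

/* OR: 1 if either or both operands are 1 (ab=01, ab=10, ab=11). */
int bool_or(int a, int b) { return (a == 1 || b == 1) ? 1 : 0; }

/* NOT: 1 if the operand is 0, and 0 if the operand is 1. */
int bool_not(int a) { return (a == 0) ? 1 : 0; }
```

Calling bool_or(0, 0) corresponds to the single case ab=00 in which OR returns 0.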
We use Boolean logic operators all the time in everyday thought, such as in the
statement "I'll go to lunch if Mary goes OR John goes, AND Sally does not go." To represent
this using Boolean concepts, let F represent my going to lunch (F=1 means I'll go to
lunch, F=0 means I won't go). Let Boolean variables m, j, and s represent Mary, John,
and Sally each going to lunch. Then we can translate the above English sentence into the
Boolean equation:
F = (m OR j) AND NOT(s)
So F will equal 1 if either m or j is 1, and s is 0. Now that we've translated the
English sentence into a Boolean equation, we can perform several mathematical activities
with that equation.
One thing we can do is determine the value of F for different values of m, j, and s:
m=1, j=0, s=1 -> F = (1 OR 0) AND NOT(1) = 1 AND 0 = 0
m=1, j=1, s=0 -> F = (1 OR 1) AND NOT(0) = 1 AND 1 = 1
In the first case, I don't go to lunch; in the second, I do.
A second thing we could do is apply some algebraic rules (which we'll discuss later)
to modify the original equation to the equivalent equation:
F = (m AND NOT(s)) OR (j AND NOT(s))
In other words, I'll go to lunch if Mary goes AND Sally does not go, OR if John goes
AND Sally does not go. That statement, as different as it may look from the earlier one,
is equivalent to the earlier one.
A third thing we could do is formally prove properties about the equation. For
example, we could prove that if Sally goes to lunch (s=1), then I don't go to lunch (F=0)
no matter who else goes, using the equation:
F = (m OR j) AND NOT(1) = (m OR j) AND 0 = 0
No matter what the values of m and j, F will equal 0.
Noting all the mathematical activities we can do using Boolean equations, you can
start to see what Boole was trying to accomplish in formalizing human reasoning.
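The lunch equation and its rewritten form can be checked against each other by brute
force, since three Boolean variables have only 8 combinations. The following is a minimal
sketch in C; the function names lunch, lunch_rewritten, and lunch_forms_agree are
assumptions made for illustration.

```c
/* Original form: F = (m OR j) AND NOT(s). */
int lunch(int m, int j, int s) { return (m || j) && !s; }

/* Rewritten form: F = (m AND NOT(s)) OR (j AND NOT(s)). */
int lunch_rewritten(int m, int j, int s) { return (m && !s) || (j && !s); }

/* Returns 1 if the two forms agree for all 8 combinations of m, j, and s,
   and 0 as soon as one combination disagrees. */
int lunch_forms_agree(void) {
    for (int m = 0; m <= 1; m++)
        for (int j = 0; j <= 1; j++)
            for (int s = 0; s <= 1; s++)
                if (lunch(m, j, s) != lunch_rewritten(m, j, s))
                    return 0;
    return 1;
}
```

Exhaustive checking like this works only because the number of input combinations is
small; the algebraic rules discussed later give the same assurance without enumeration.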
EXAMPLE 2.1 Converting a problem statement to a Boolean equation
Convert the following problem statements to Boolean equations using AND, OR, and NOT
operators. F should equal 1 only if:
1. a is 1 and b is 1. Answer: F = a AND b
2. either of a or b is 1. Answer: F = a OR b
3. a and b are not both 0. Answer:
(a) Option 1: F = NOT(NOT(a) AND NOT(b))
(b) Option 2: F = a OR b
4. a is 1 and b is 0. Answer: F = a AND NOT(b).
Convert the following English problem statements to Boolean equations:
1. A fire sprinkler system should spray water if high heat is sensed and the system is set to
enabled. Answer: Let Boolean variable h represent "high heat is sensed," e represent
"enabled," and F represent "spraying water." Then an equation is: F = h AND e.
2. A car alarm should sound if the alarm is enabled, and either the car is shaken or the door
is opened. Answer: Let a represent "alarm is enabled," s represent "car is shaken," d
represent "door is opened," and F represent "alarm sounds." Then an equation is:
F = a AND (s OR d).
(a) Alternatively, assuming that our door sensor d represents "door is closed" instead of
open (meaning d=1 when the door is closed, 0 when open), we obtain the following
equation: F = a AND (s OR NOT(d)).
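The sprinkler and car alarm equations translate almost directly into C. The sketch below
is illustrative only; the function names are assumptions, and the second car alarm
function models the alternative sensor whose output is 1 when the door is closed.

```c
/* Fire sprinkler: F = h AND e. */
int sprinkler(int h, int e) { return h && e; }

/* Car alarm: F = a AND (s OR d), where d = 1 means the door is opened. */
int car_alarm(int a, int s, int d) { return a && (s || d); }

/* Car alarm with the alternative door sensor: d = 1 means the door is
   closed, so the equation becomes F = a AND (s OR NOT(d)). */
int car_alarm_closed_sensor(int a, int s, int d) { return a && (s || !d); }
```

For any physical situation the two alarm functions agree, because the closed-door sensor
reports the inverse of the opened-door sensor: car_alarm(a, s, d) equals
car_alarm_closed_sensor(a, s, !d).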
EXAMPLE 2.2 Evaluating Boolean equations
Evaluate the Boolean equation F = (a AND b) OR (c AND d) for the given values of variables
a, b, c, and d:
a=1, b=1, c=1, d=0. Answer: F = (1 AND 1) OR (1 AND 0) = 1 OR 0 = 1.
a=0, b=1, c=0, d=1. Answer: F = (0 AND 1) OR (0 AND 1) = 0 OR 0 = 0.
a=1, b=1, c=1, d=1. Answer: F = (1 AND 1) OR (1 AND 1) = 1 OR 1 = 1.
One might now be wondering what Boolean algebra has to do with building circuits
using switches. In 1938, an MIT graduate student named Claude Shannon wrote a paper
(based on his Master's thesis) describing how Boolean algebra could be applied to
switch-based circuits, by showing that "on" switches could be treated as a 1 (or true), and
"off" switches as a 0 (or false), by connecting those switches in a certain way
(Figure 2.7). His thesis is widely considered as the seed that developed into modern
digital design. Since Boolean algebra comes with a rich set of axioms, theorems,
postulates, and rules, we can use all those things to manipulate digital circuits using algebra.
In other words:
We can build circuits by doing math.
(Margin note: Shannon, by the way, is also considered the father of information theory due
to his later work on digital communication.)
[Figure 2.7 diagram: Boolean algebra (mid-1800s; Boole's intent: formalize human thought)
and switches (1930s; for telephone switching and other electronic uses) lead, through
Shannon (1938; showed application of Boolean algebra to design of switch-based circuits),
to digital design.]
Figure 2.7 Shannon applied Boolean algebra to switch-based circuits, providing a formal
basis to digital circuit design.
That's an extremely powerful concept. We'll be building circuits by doing math
throughout this chapter.
AND, OR, & NOT Gates
(Margin note: Earlier we said a "gate" was the switch control input of a CMOS transistor,
but now we're talking about "logic gates." In an unfortunate naming similarity, the same
word (gate) refers to two different things. Don't worry though; after the next section,
we'll just be using the word gate to refer to a "logic gate.")
To build digital circuits that can be manipulated using Boolean algebra, we first
implement the Boolean operators AND, OR, and NOT using small circuits of switches, and we
call those circuits Boolean logic gates. Then, we forget about switches, and instead use
Boolean logic gates as our building blocks. Suddenly, we have the power of Boolean
algebra at our fingertips when designing more complex circuits! This is akin to first
assembling rocks into three shapes of bricks, and then building structures like a bridge
from those bricks, as illustrated in Figure 2.6. Trying to build a bridge from small rocks
is much harder than building a bridge from the three basic brick shapes. Likewise, trying
to build a motion-in-the-dark circuit (or any digital circuit) from switches is harder than
building a circuit from Boolean logic gates.
Let's first implement Boolean logic gates using CMOS transistors, and then later
we'll show you how Boolean algebra helps build better circuits. You really don't have to
understand the underlying transistor implementations of logic gates to learn the digital
design methods in the rest of this book, and in fact many textbooks omit the transistor
discussion entirely. But an understanding of the underlying transistor implementation can
be quite satisfying to a student, leaving no "mysteries." Such an understanding can also
help in understanding the nonideal behavior of logic gates that one may later have to
learn to deal with in digital design.
We'll start by using "1" to represent the power supply's voltage level, which today is
usually around 1 to 2 V for CMOS technology (e.g., 0.7 V, or 1.3 V). Let's use "0" to
represent ground. Note that we could have chosen any two symbols or words, rather than "1"
and "0," to represent power and ground voltage levels. For example, we could have used
"true" and "false," or "H" and "L." Remember that the "1" does not necessarily
correspond to 1 V, and the "0" does not necessarily correspond to 0 V. In fact, each usually
represents a voltage range, like "1" representing any voltage between 1.2 V and 1.4 V.
[Figure 2.8 shows, for each of the NOT, OR, and AND gates, its symbol, truth table, and
transistor circuit. The truth tables are:]
NOT:   x | F        OR:   x y | F        AND:   x y | F
       0 | 1              0 0 | 0               0 0 | 0
       1 | 0              0 1 | 1               0 1 | 0
                          1 0 | 1               1 0 | 0
                          1 1 | 1               1 1 | 1
Figure 2.8 Basic logic gates' symbols, truth tables, and transistor circuits: (a) NOT
(inverter) gate, (b) 2-input OR gate, (c) 2-input AND gate. Warning: real AND and OR
gates aren't actually built this way, but rather in a more complex manner; see Section 2.8.
Figure 2.10 Inverter timing diagram.
NOT Gate
A NOT gate has an input x and an output F. F should always be the opposite, or inverse,
of x; for this reason, a NOT gate is commonly called an inverter. We can build a NOT
gate using one pMOS and one nMOS transistor, as shown in Figure 2.8(a). The triangle at
the top of the transistor circuit represents the positive voltage of the power supply, which
we represent as 1. The series of lines at the bottom of the circuit represents ground, which
we represent as 0. When the input x is 0, the pMOS transistor will conduct, but the
nMOS will not, as shown in Figure 2.9(a). In that case, we can think of the circuit as a
wire from 1 to F, so when x = 0, F = 1. On the other hand, when x is 1, the nMOS will
conduct, but the pMOS will not, as shown in Figure 2.9(b). In that case, we can think of
the circuit as a wire from 0 to F, so when x = 1, F = 0. The table in Figure 2.8, called a
truth table, summarizes the NOT gate's behavior by listing the gate's output for every
possible input.
Figure 2.9 Inverter conduction paths when:
(a) the input is 0, and (b) the input is 1.
Figure 2.10 shows a timing diagram for an inverter: when the input is 0, the output
is 1; when the input is 1, the output is 0.
Electrically, combining pMOS and nMOS in this way has the benefit of low power.
Notice in Figure 2.8(a) that for any value of x, either the pMOS or nMOS transistor
will be nonconducting. Thus (conceptually), current can never flow from the power
source to ground; this feature will also be true for the AND and OR gates we'll define
next. This feature makes CMOS circuits consume far less power than other transistor
technologies, and partly explains why CMOS is the most popular logic gate transistor
technology today.
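The inverter's behavior can be sketched as a switch-level model in C, treating each
transistor as a switch that either conducts or does not. This is a conceptual sketch only;
it ignores voltages and timing, and all function names are assumptions made for
illustration.

```c
/* A pMOS transistor conducts when its gate input is 0. */
int pmos_conducts(int gate) { return gate == 0; }

/* An nMOS transistor conducts when its gate input is 1. */
int nmos_conducts(int gate) { return gate == 1; }

/* Inverter: the pMOS connects power (1) to F, and the nMOS connects
   ground (0) to F. For x = 0 or x = 1 exactly one of them conducts,
   so F is 1 when the pMOS conducts and 0 otherwise. */
int inverter_output(int x) { return pmos_conducts(x) ? 1 : 0; }

/* Returns 1 if both transistors conduct at once, which would form a
   direct path from power to ground. */
int inverter_has_power_to_ground_path(int x) {
    return pmos_conducts(x) && nmos_conducts(x);
}
```

In this model inverter_has_power_to_ground_path returns 0 for both x = 0 and x = 1, which
mirrors the low-power argument above.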
OR Gate
A basic OR gate has two inputs x and y and an output F. F should be 1 only if at least
one of x or y is 1. We can build an OR gate using two pMOS transistors and two nMOS
transistors, as shown in Figure 2.8(b) (although we will see in Section 2.8 that OR gates
are actually built in a more complex manner). If at least one of x or y is 1, then we get a
connection from 1 to F, but no connection from 0 to F, so F is 1, as shown in Figure
2.11(a). If both x and y are 0, then we get a connection from 0 to F, but no connection
from 1 to F, so F is 0, as shown in Figure 2.11(b). The truth table for the OR gate appears
in Figure 2.8(b).
Figure 2.11 OR gate conduction paths when: (a) one input is 1, and (b) both inputs are 0.
Figure 2.12 OR gate timing diagram.
Figure 2.14 AND gate timing diagram.
Figure 2.12 shows a timing diagram for an OR gate. (See Section 1.3 for an
introduction to timing diagrams.) We set inputs x and y to each possible combination of values,
and show that F will be 1 if either or both inputs is a 1.
Larger OR gates, having more than two inputs, are also possible. If at least one of the
OR gate's inputs is 1, the output is 1. For a three-input OR gate, the transistor circuit of
Figure 2.8(b) would have three pMOS transistors on top and three nMOS transistors on
the bottom, instead of two transistors of each kind.
AND Gate
A basic AND gate has two inputs x and y and an output F. F should be 1 only if both x
and y are 1. We can build an AND gate using two pMOS transistors and two nMOS
transistors, as shown in Figure 2.8(c) (again, we will see in Section 2.8 that AND gates are
actually built in a more complex manner). If both x and y are 1, then we get a connection
from power to F, but no connection from ground to F, so F is 1, as shown in Figure
2.13(a). If at least one of x or y is 0, then we get a connection from ground to F, but no
connection from power to F, so F is 0, as shown in Figure 2.13(b). The truth table for the
AND gate appears in Figure 2.8(c).
Figure 2.13 AND gate conduction paths when: (a) all inputs are 1, and (b) an input is 0.
Figure 2.14 shows a timing diagram for an AND gate. We set inputs x and y to each
possible combination of values, and show that F will be 1 only if both inputs are a 1.
Larger AND gates, having more than two inputs, are also possible. The output is 1 only if
all the inputs are 1. For a three-input AND gate, the transistor circuit of Figure 2.8(c) would
have three pMOS transistors on top and three nMOS transistors on the bottom, instead of
two transistors of each kind.
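The behavior of gates with more than two inputs can be sketched in C with a loop over the
inputs. The sketch models only the logical behavior, not the transistor circuit, and the
names and_n, or_n, and3, and or3 are assumptions made for illustration.

```c
/* n-input AND: the output is 1 only if all the inputs are 1. */
int and_n(const int inputs[], int n) {
    for (int i = 0; i < n; i++) {
        if (inputs[i] == 0) return 0;  /* one 0 input forces the output to 0 */
    }
    return 1;
}

/* n-input OR: the output is 1 if at least one of the inputs is 1. */
int or_n(const int inputs[], int n) {
    for (int i = 0; i < n; i++) {
        if (inputs[i] == 1) return 1;  /* one 1 input forces the output to 1 */
    }
    return 0;
}

/* Three-input gates expressed with the general functions. */
int and3(int a, int b, int c) {
    const int in[3] = {a, b, c};
    return and_n(in, 3);
}

int or3(int a, int b, int c) {
    const int in[3] = {a, b, c};
    return or_n(in, 3);
}
```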
Building Simple Circuits Using Gates
Having built logic gate building blocks from transistors, we now show how to build
useful circuits from those building blocks. Recall the digital system example of Figure
1.13, the motion-in-the-dark detector. a=1 meant motion, and b=0 meant dark, so we
wanted F = a AND NOT(b). We can connect b through an inverter to get NOT(b), and
connect the result along with a into an AND gate whose output is F. The resulting circuit
appears in Figure 1.13(c), shown again to the left for convenience. We now provide more
examples.
EXAMPLE 2.3 Converting a Boolean equation to a circuit with logic gates
Convert the following equation to a circuit:
F = a AND NOT(b OR NOT(c))
We start by drawing F on the right, and then working backwards toward the inputs. (We
could instead start by drawing the inputs on the left and working toward the output.) The
equation for F ANDs two items: a, and the output of a NOT. We thus begin by drawing the
circuit of Figure 2.15(a). The NOT's input comes from an OR of two items: b, and NOT(c).
We thus complete the drawing in Figure 2.15(b) by including an OR gate and NOT gate as
shown.
Figure 2.15 Building the circuit for F: (a) partial, (b) complete.
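To connect the equation F = a AND NOT(b OR NOT(c)) with its gate-level structure, the
following C sketch computes one intermediate value per gate, with each local variable
playing the role of a wire. The gate function names and the wire names n1, n2, and n3
are assumptions made for illustration.

```c
/* Simple gate models on 0/1 values. */
static int not_gate(int x) { return !x; }
static int or_gate(int x, int y) { return x || y; }
static int and_gate(int x, int y) { return x && y; }

/* F = a AND NOT(b OR NOT(c)), one statement per gate. */
int example_circuit(int a, int b, int c) {
    int n1 = not_gate(c);      /* NOT(c) */
    int n2 = or_gate(b, n1);   /* b OR NOT(c) */
    int n3 = not_gate(n2);     /* NOT(b OR NOT(c)) */
    return and_gate(a, n3);    /* a AND NOT(b OR NOT(c)) */
}
```

The code lists the gates from inputs to output, the reverse of the drawing order used in
the example; both orders describe the same circuit. F is 1 only when a=1, b=0, and c=1.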
EXAMPLE 2.4 More examples converting Boolean equations to gates
Figure 2.16 provides two more examples of converting Boolean equations to circuits built from
logic gates. We again start from the output and work backwards to the inputs. The figure shows
the correspondence between equation operators and gates, and the order in which we place each
gate in the circuit.
F = (a AND NOT(b)) OR (b AND NOT(c))
Figure 2.16 Examples of converting Boolean equations to circuits.
EXAMPLE 2.5 Using AND and OR gates with more than two inputs
Figure 2.17(a) shows an implementation of the equation F = a AND b AND c, using two-input AND
gates. However, designers would typically instead implement such an equation using a single
three-input AND gate, shown in Figure 2.17(b). The function is the same, but the three-input AND
gate uses fewer transistors, 6 rather than 4 + 4 = 8 (as well as having less delay; more on delay
later). Likewise, F = a AND b AND c AND d would typically be implemented using a four-input AND
gate.
Figure 2.17 Using multiple-input AND gates: (a) using 2-input AND gates, (b) using a 3-input
AND gate.
The same approach applies to OR gates. For example, F = a OR b OR c would typically be
implemented using a single three-input OR gate.
We now provide examples starting from English problem descriptions, which we
convert to Boolean equations, and then finally implement as a circuit.
EXAMPLE 2.6 Seatbelt warning light
Suppose you want to design a system for an automobile that
illuminates a warning light whenever the driver's seat belt is
not fastened and the key is in the ignition. Assume the
following sensors:
a sensor with output s indicates whether the driver's
belt is fastened (s = 1 means the belt is fastened), and
a sensor with output k indicates whether the key is in
the ignition (k = 1 means the key is in).
Assume the warning light has a single input w that illuminates
the light when w is 1. So the inputs to our digital system are s
and k, and the output is w. w should equal 1 when both of the
following occur: s is 0 and k is 1.
Let's first write a simple C program executing on a
microprocessor to solve this design problem. If we connect s
to I0, k to I1, and w to P0, then our C code inside the C
program's main() function would be:
   while (1) {
      P0 = !I0 && I1;   /* light on when belt unfastened (s=0) and key in (k=1) */
   }
The code repeatedly checks the sensors and updates the warning light.
Now let's write a Boolean equation describing a circuit implementing the design:
w = NOT(s) AND k
Using the AND and NOT logic gates that we introduced earlier, we can easily complete the
design of our first system, by connecting s to a NOT gate, and connecting the resulting NOT(s)
and k to the inputs of a 2-input AND gate, as shown in Figure 2.18.
Figure 2.19 provides a timing diagram for the circuit. In a timing diagram, we can set
inputs to whatever values we want, but then we must draw the output line to match the circuit's
function. In the figure, we set s and k to 00, then 01, then 10, then 11. The only time that the
output w will be 1 is when s is 0 and k is 1, as shown in the figure.
Figure 2.18 Seat belt warning circuit.
Figure 2.19 Timing diagram for seat belt warning circuit.
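The input sequence used in the timing diagram (s and k set to 00, then 01, then 10, then
11) can be reproduced with a small C sketch that is testable on an ordinary computer,
without the microprocessor pins I0, I1, and P0. The function names belt_warn and
belt_warn_at_step, and the encoding of a step number as two bits, are assumptions made
for illustration.

```c
/* w = NOT(s) AND k */
int belt_warn(int s, int k) { return !s && k; }

/* Returns w for step 0..3 of the timing diagram, where the step number
   encodes the inputs as the two bits "sk": 00, 01, 10, 11. */
int belt_warn_at_step(int step) {
    int s = (step >> 1) & 1;  /* s is the high bit of the step number */
    int k = step & 1;         /* k is the low bit of the step number */
    return belt_warn(s, k);
}
```

Only step 1 (s=0, k=1) yields w=1, matching the single high pulse on w described for the
timing diagram.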
We stated earlier that logic gates are more
appropriate than transistors as building blocks
for designing digital circuits. Note, however,
that the logic gates are ultimately implemented
using transistors, as shown in Figure 2.20. For
C programmers, an analogy is that writing
software in C is easier than writing in assembly,
even though the C ultimately gets implemented
using assembly. Notice how much less intuitive
and less descriptive is the transistor-based
circuit in Figure 2.20 than the equivalent logic
gate-based circuit in Figure 2.18.
EXAMPLE 2.7 Seat belt warning light with driver sensor
Let's extend the previous example by adding a sensor,
with output p, that detects whether a person is
actually sitting in the driver's seat, and by changing
the system's behavior to only illuminate the warning
light when a person is detected in the seat (p=1). So the
new circuit equation is:
w = p AND NOT(s) AND k
In this case, we need a 3-input AND gate. The circuit
is shown in Figure 2.21.
Be aware that the order of the AND gate's
inputs does not matter.
Figure 2.20 Seat belt warning circuit using transistors.
Figure 2.21 Seat belt warning circuit with person sensor.
EXAMPLE 2.8 Seat belt warning light with initial illumination
Let's extend the previous example. Automobiles
typically light up all their warning lights when
you first turn the key, so you can check that all the
lights are working. Assume that our system
receives an input t that is 1 for the first 5 seconds after
a key is inserted into the ignition, and 0 afterward
(don't worry about who or what sets t in that way). So
we want w=1 when p=1 and s=0 and k=1, OR when
t=1. Note that when t=1, we illuminate the light,
regardless of the values of p, s, and k. The new circuit
equation is:
w = (p AND NOT(s) AND k) OR t
Figure 2.22 Extended seat belt warning circuit.
The circuit is shown in Figure 2.22.
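The claim that t=1 illuminates the light regardless of p, s, and k can be checked
exhaustively. The C sketch below is illustrative; the function names are assumptions.

```c
/* w = (p AND NOT(s) AND k) OR t */
int belt_warn_extended(int p, int s, int k, int t) {
    return (p && !s && k) || t;
}

/* Returns 1 if w is 1 for every combination of p, s, and k when t = 1. */
int initial_illumination_overrides_sensors(void) {
    for (int p = 0; p <= 1; p++)
        for (int s = 0; s <= 1; s++)
            for (int k = 0; k <= 1; k++)
                if (belt_warn_extended(p, s, k, 1) != 1)
                    return 0;
    return 1;
}
```

With t=0 the function reduces to the equation of the previous example, so the light is on
only when p=1, s=0, and k=1.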
Some circuit drawing rules and conventions
There are some rules and conventions that designers
commonly follow when drawing circuits of logic gates:
Logic gates have one or more inputs and one
output, but we typically don't label each gate's
inputs or output. Remember that the order of the
inputs into a gate doesn't impact the logical
behavior of the gate.
Each wire has an implicit direction, going from one
gate's output to another gate's input, but we
typically don't draw arrows showing each direction.
A single wire can be branched out into two (or
more) wires going to multiple gate inputs; the
branches have the same value as the single wire.
But two wires can NOT be merged into one
wire: what would be the value of that one wire
if the incoming two wires had different values?
2.5 BOOLEAN ALGEBRA
Logic gates are useful for implementing circuits, but equations are better for manipulating
circuits. The algebraic tools of Boolean algebra enable us to manipulate Boolean
equations so we can do things like simplify the equations, check if two equations are
equivalent, find the inverse of an equation, prove properties about the equations, etc.
Since a Boolean equation consisting of AND, OR, and NOT operations can be
straightforwardly transformed into a circuit of AND, OR, and NOT gates, manipulating Boolean
equations can be considered as manipulating digital circuits.
We'll informally introduce some of the most useful algebraic tools of Boolean
algebra. Appendix A provides a formal definition of Boolean algebra.
Notation and Terminology
We now define some notation and terminology for describing Boolean equations. We'll
use these definitions extensively throughout the book.
Operators
Writing out the AND, OR, and NOT operators in equations is cumbersome. Thus, Boolean
algebra uses simpler notation for those operators.
"NOT(a)" is typically written as a' or ā. We'll use a', which one speaks of as
"a prime." a' is also known as the complement of a, or the inverse of a.
"a OR b" is typically written as "a + b," specifically intended to look similar to
the addition operator in regular algebra. "a + b" is even referred to as the sum of
a and b. "a + b" is usually spoken of as "a or b."
"a AND b" is typically written as "a * b" or "a·b," specifically intended to look
similar to the multiplication operator in regular algebra, and even referred to as
the product of a and b. Just as in regular algebra, we can even write "ab" for the
product of a and b, as long as the fact that a and b are separate variables is clear.
"a * b" is usually spoken of as "a and b," or even just as "a b."
Mathematicians often use other notations for Boolean operators, but the above
notations seem to be the most popular among engineers, likely due to the intentional
similarity of those operators with regular algebra operators.
Using the simpler notation, our earlier seat belt example of:
w = (p AND NOT(s) AND k) OR t
could be rewritten more concisely as:
w = ps'k + t
which would be spoken of as "w equals p s prime k, or t."
EXAMPLE 2.9 Speaking Boolean equations
Speak the following equations:
1. F = a'b' + c. Answer: "F equals a prime b prime or c."
2. F = a + b * c'. Answer: "F equals a or b and c prime."
Convert the following spoken equations into written equations:
1. "F equals a b prime c prime." Answer: F = ab'c'.
2. "F equals a b c or d e prime." Answer: F = abc + de'.
The rules of Boolean algebra require that we evaluate expressions using the precedence
rule that * has precedence over +, that complementing a variable has precedence over *
and +, and that we of course compute what's in parentheses first. We can make the earlier
equation's order of evaluation explicit using parentheses as follows: w = (p * (s') *
k) + t.
Table 2.1 summarizes Boolean algebra precedence rules.
TABLE 2.1 Boolean algebra precedence, highest precedence first
Symbol   Name          Description
()       Parentheses   Evaluate expressions nested in parentheses first
'        NOT           Evaluate from left to right
*        AND           Evaluate from left to right
+        OR            Evaluate from left to right
Conventions
Although we borrowed the multiplication and addition operations from regular algebra
and even use the terms sum and product, we don't say "times" for AND or "plus" for OR.
Digital design textbooks typically name each variable using a single character, since
using a single character makes for concise equations like the equations above.
We'll be writing many equations, so conciseness will aid understanding by preventing
equations that wrap across multiple lines or pages. Thus, we'll usually follow the
convention of using single characters. However, when you describe digital systems using a
hardware description language or a programming language like C, you should probably
use much more descriptive names so that your code is readable. So instead of using "s"
to represent the output of a seat-belt-fastened sensor, you might instead use
"SeatBeltFastened."
EXAMPLE 2.10 Evaluating Boolean equations using precedence rules
Evaluate the following Boolean equations, assuming a=1, b=1, c=0, d=1.
1. F = a * b + c. Answer: * has precedence over +, so we evaluate the equation as
F = (1 * 1) + 0 = (1) + 0 = 1 + 0 = 1.
2. F = ab + c. Answer: the problem is identical to the previous problem, using the shorthand
notation for *.
3. F = ab'. Answer: we first evaluate b' because NOT has precedence over AND, resulting in
F = 1 * (1') = 1 * (0) = 1 * 0 = 0.
4. F = (ac)'. Answer: we first evaluate what is inside the parentheses, then we NOT the result,
yielding (1*0)' = (0)' = 0' = 1.
5. F = (a + b') * c + d'. Answer: The parentheses have highest precedence. Inside the
parentheses, NOT has highest precedence. So we evaluate the parentheses part as (1 + 1') =
(1 + (0)) = (1 + 0) = 1. Next, * has precedence over +, yielding (1 * 0) + 1' =
(0) + 1'. The NOT has precedence over the OR, giving (0) + (1') = (0) + (0)
= 0 + 0 = 0.
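For readers who know C, the operators !, &&, and || have the same relative precedence as
', *, and + in Table 2.1: ! binds tightest, then &&, then ||. The sketch below writes the
equations of this example in C with every grouping spelled out in parentheses, so the
order of evaluation implied by the precedence rules is visible. The function names eq1,
eq3, eq4, and eq5 are assumptions made for illustration (equation 2 is identical to
equation 1).

```c
/* 1. F = a * b + c: the AND is evaluated before the OR. */
int eq1(int a, int b, int c) { return (a && b) || c; }

/* 3. F = ab': the NOT is evaluated before the AND. */
int eq3(int a, int b) { return a && (!b); }

/* 4. F = (ac)': the parenthesized AND is evaluated first, then the NOT. */
int eq4(int a, int c) { return !(a && c); }

/* 5. F = (a + b') * c + d': parentheses first (with the NOT inside them
   evaluated first), then the AND, then the NOT of d, then the OR. */
int eq5(int a, int b, int c, int d) {
    return ((a || (!b)) && c) || (!d);
}
```

With a=1, b=1, c=0, d=1, these functions return 1, 0, 1, and 0 respectively, matching the
answers worked out above.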
Variables, Literals, Terms, and Sum of Products
Let's define a few more concepts, using the example equation: F(a, b, c) = a'bc +
abc' + ab + c.
Variable: A variable represents a quantity (0 or 1). The above equation has three
variables: a, b, and c. We typically use variables in Boolean equations to
represent the inputs of our system. Sometimes we explicitly list a function's variables,
as above ("F(a, b, c) = ..."). Other times we omit the list ("F = ...").
Literal: A literal is the appearance of a variable in either true or complemented
form. The above equation has 9 literals: a', b, c, a, b, c', a, b, and c.
Product term: A product term is a product of literals. The above equation has four
terms: a'bc, abc', ab, and c.
Sum-of-products: An equation written as an ORing of product terms is known as
being in sum-of-products form. The above example equation for F is in
sum-of-products form. The following equations are all in sum-of-products form:
abc + abc'
ab + a'c + abc
a + b' + ac (note that a product term can have just one literal).
The following equations are all NOT in sum-of-products form:
(a + b)c
(ab + bc)(b + c)
(a')' + b
a(b + c(d + e))
(ab + bc)'
Some Properties of Boolean Algebra
We now list some of the key rules of Boolean algebra. Assume a, b, and c are Boolean
variables, which each hold either the value of 0 or 1.
Basic Properties
The following properties, known as postulates, are assumed to be true:
Commutative
a + b = b + a
a * b = b * a
This property should be obvious. Just try it for different values of a and b.
Distributive
a * (b + c) = a * b + a * c
a + (b * c) = (a + b) * (a + c) (this one is tricky!)
Careful, the second one may not be obvious. It's different than regular algebra.
But you can verify that both of the distributive properties hold simply by
evaluating both sides for all possible values of a, b, and c.
Associative
(a + b) + c = a + (b + c)
(a * b) * c = a * (b * c)
Again, try it for different values of a, b, and c to see that this holds.
Identity
0 + a = a + 0 = a
1 * a = a * 1 = a
Makes intuitive sense, right? ORing a with 0 (a+0) just means that the result will
be whatever a is. After all, 1+0 is 1, while 0+0 is 0. Likewise, ANDing a with 1
(a*1) results in a. 1*1 is 1, while 0*1 is 0.
Complement
a + a' = 1
a * a' = 0
This also makes intuitive sense. Regardless of the value of a, a' is the opposite,
so you get a 0 and a 1, or you get a 1 and a 0. One of (a, a') will always be a 1,
so ORing them (a+a') must yield a 1. Likewise, one of (a, a') will always be a
0, so ANDing them (a*a') must yield a 0.
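The suggestion to try the properties for all possible values can be carried out
mechanically. The C sketch below checks every basic property for all combinations of a,
b, and c, and also shows that the second distributive form does not hold in general for
regular integer arithmetic. The function names are assumptions made for illustration, and
such a check confirms the properties for 0/1 values without replacing their role as
postulates.

```c
/* Returns 1 if the basic properties hold for every assignment of 0 or 1
   to a, b, and c; returns 0 at the first violation. */
int basic_properties_hold(void) {
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            for (int c = 0; c <= 1; c++) {
                /* Commutative */
                if ((a || b) != (b || a)) return 0;
                if ((a && b) != (b && a)) return 0;
                /* Distributive (the second form is the tricky one) */
                if ((a && (b || c)) != ((a && b) || (a && c))) return 0;
                if ((a || (b && c)) != ((a || b) && (a || c))) return 0;
                /* Associative */
                if (((a || b) || c) != (a || (b || c))) return 0;
                if (((a && b) && c) != (a && (b && c))) return 0;
                /* Identity */
                if ((0 || a) != a || (a || 0) != a) return 0;
                if ((1 && a) != a || (a && 1) != a) return 0;
                /* Complement */
                if ((a || !a) != 1) return 0;
                if ((a && !a) != 0) return 0;
            }
        }
    }
    return 1;
}

/* Regular integer arithmetic: returns 1 if a + b*c equals (a + b)*(a + c)
   for the given integers, which is not true in general. */
int regular_algebra_second_form_holds(int a, int b, int c) {
    return a + b * c == (a + b) * (a + c);
}
```

For example, with the integers 2, 3, and 4, the left side is 2 + 12 = 14 while the right
side is 5 * 6 = 30, so the second form fails in regular algebra even though it holds for
Boolean values.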
Let's now apply these basic properties to some digital design examples to see how these
properties can help us.
EXAMPLE 2.11 Applying the basic properties of Boolean algebra
Use the properties of Boolean algebra for the following problems:
Show that abc' is equivalent to c'ba.
The commutative property allows us to swap the operands being ANDed, so a*b*c' =
a*c'*b = c'*a*b = c'*b*a = c'ba.
Show that abc + abc' = ab.
The first distributive property allows us to factor out the ab term: abc + abc' =
ab(c+c'). Then, the complement property allows us to replace the c+c' by 1:
ab(c+c') = ab(1). Finally, the identity property allows us to remove the 1 from the
AND term: ab(1) = ab*1 = ab.
Show that the equation x + x'z is equivalent to x + z.
The second distributive property (the tricky one) allows us to replace x+x'z by
(x+x')*(x+z). The complement property allows us to replace (x+x') by 1, and the
identity property allows us to replace 1*(x+z) by x+z.
EXAMPLE 2.12 Simplification of an automatic sliding door system
Suppose you wish to design a system to control an
automatic sliding door, like one might find at
a grocery store's entrance. An input p to our system
indicates whether a sensor detects a person in front of
the door (p=1 means a person is detected). An input h
indicates whether the door should be manually held
open (h = 1) regardless of whether a person is detected.
An input c indicates whether the door should be forced
to stay closed (like when the store is closed for
business); c = 1 means the door should stay closed. The
latter two would normally be set by a manager with the
proper keys. An output f opens the door when f is 1.
We want to open the door if the door is set to be manually held open, OR if the door is not set to
be manually held open but a person is detected. However, in either case, we only open the door if
the door is not set to stay closed. We can translate these requirements into a Boolean equation as:
f = hc' + h'pc'
Figure 2.23 Initial door opener circuit.
We could build a circuit to implement this equation, as in Figure 2.23.
Now let's manipulate the equation using the properties described earlier. Looking at the
equation, we believe we can factor out the c'. We might then be able to simplify the
remaining "h+h'p" part too. Let's try some transformations, first factoring out c':
f = hc' + h'pc'
f = c'h + c'h'p          (by the commutative property)
f = c'(h + h'p)          (by the first distributive property)
f = c'((h+h')*(h+p))     (by the 2nd distributive property, the tricky one)
f = c'((1)*(h+p))        (by the complement property)
f = c'(h+p)              (by the identity property)
Note that the simpler equation still makes intuitive sense: we open the door only if the
door is not set to stay closed (c'), AND either the door is set to be manually held open (h)
OR a person is detected (p). A circuit implementing this equation is shown in Figure 2.24.
Thus, by applying the algebraic properties, we obtained a simpler circuit. In other words,
we used math to simplify the circuit.
Simplification of logic circuits will be the focus of Section 2.11.
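Because each input can only be 0 or 1, the result of the door simplification can also be
confirmed by brute force. The following Python sketch evaluates the original and the
simplified door equations for all eight input combinations. The function names are ours
and are not part of the example; & and | on the values 0 and 1 stand in for AND and OR,
and 1 - x stands in for NOT.

```python
from itertools import product

def f_original(h, p, c):
    # f = hc' + h'pc'
    return (h & (1 - c)) | ((1 - h) & p & (1 - c))

def f_simplified(h, p, c):
    # f = c'(h + p)
    return (1 - c) & (h | p)

# Compare the two equations on every combination of h, p, and c.
rows = list(product((0, 1), repeat=3))
same = all(f_original(h, p, c) == f_simplified(h, p, c) for h, p, c in rows)
print("equivalent:", same)
```

Such a check confirms the final answer, but unlike the algebraic steps it does not show
how to arrive at the simpler equation.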
Figure 2.24 Simplified door opener circuit.
EXAMPLE 2.13 Equivalence of two automatic sliding door systems
Suppose you found a really cheap device for automatic sliding door systems. The device had
inputs c, h, and p and output f, as in Example 2.12, but the device's documentation said that:
f = c'hp + c'hp' + c'h'p
Does that device do the same as that in Example 2.12? One way to check is to see if we can
manipulate the above equation into the equation in Example 2.12:
f = c'hp + c'hp' + c'h'p
f = c'h(p + p') + c'h'p     (by the distributive property)
f = c'h(1) + c'h'p          (by the complement property)
f = c'h + c'h'p             (by the identity property)
f = hc' + h'pc'             (by the commutative property)
That's the same as the original equation of Example 2.12, so the device should work for us.
Additional Properties
Let's consider some additional properties, which happen to be known as theorems
because they can be proven using the above postulates:
Null elements
a + 1 = 1
a * 0 = 0
These should be fairly obvious. 1 OR anything is going to be 1, while 0 AND
anything is going to be 0.
Idempotent Law
a + a = a
a * a = a
Again, this should be fairly obvious. If a is 1, 1+1=1 and 1*1=1, while if a is 0,
0+0=0 and 0*0=0.
Involution Law
(a')' = a
Again, fairly obvious. If a is 1, the first negation gives 0, while the second gives
1 again. Likewise, if a is 0, the first negation gives 1, while the second gives 0
again.
DeMorgan's Law
(a + b)' = a'b'
(ab)' = a' + b'
These are not as obvious. Their proofs are in Appendix A. Let's consider both equations
intuitively here. Consider (a + b)' = a'b'. The left side will only be 1
if (a + b) evaluates to 0, which only occurs when both a AND b are 0, meaning
a'b', the right side. Likewise, consider (ab)' = a' + b'. The left side will
only be 1 if (ab) evaluates to 0, meaning at least one of a OR b must be 0,
meaning a' + b', the right side. DeMorgan's Law can be stated in English as
follows: The complement of a sum equals the product of the complements; the
complement of a product equals the sum of the complements. DeMorgan's Law is
widely used, so take the time now to understand it and to remember it.
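Since a and b can each only be 0 or 1, each of these additional properties can also be
checked by trying every combination of values. The sketch below does so in Python; it is
a check over the values 0 and 1, not the algebraic proofs referred to above, and the names
NOT, theorems, and results are ours.

```python
from itertools import product

def NOT(a):
    return 1 - a

# Each entry returns True if the property holds for the given a and b.
theorems = {
    "null elements": lambda a, b: (a | 1) == 1 and (a & 0) == 0,
    "idempotent": lambda a, b: (a | a) == a and (a & a) == a,
    "involution": lambda a, b: NOT(NOT(a)) == a,
    "DeMorgan (sum)": lambda a, b: NOT(a | b) == (NOT(a) & NOT(b)),
    "DeMorgan (product)": lambda a, b: NOT(a & b) == (NOT(a) | NOT(b)),
}

# A property holds if it is true for all four combinations of a and b.
results = {
    name: all(check(a, b) for a, b in product((0, 1), repeat=2))
    for name, check in theorems.items()
}
for name, ok in results.items():
    print(name, "holds" if ok else "fails")
```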
Let's apply some of these additional properties in more examples.
EXAMPLE 2.14 Applying the additional properties
Convert the equation F = ab(c+d) into sum-of-products form.
The distributive property allows us to "multiply out" the equation to F = abc + abd.
Convert the equation F = wx(x'y + zy' + xy) into sum-of-products form, and make
any obvious simplifications.
The distributive property allows us to "multiply out" the equation: wx(x'y + zy' + xy) =
wxx'y + wxzy' + wxxy. That equation is in sum-of-products form. The complement
property allows us to replace wxx'y by w*0*y, and the null elements property means that
w*0*y = 0. The idempotent property allows us to replace wxxy by wxy (because xx = x).
The resulting equation is 0 + wxzy' + wxy = wxzy' + wxy.
Prove that x(x' + y(x' + y')) can never evaluate to 1.
Repeated application of the first distributive property yields: xx' + xy(x'+y') = xx'
+ xyx' + xyy'. The complement property tells us that xx'=0 and yy'=0, yielding 0
+ 0*y + x*0. The null elements property leads to 0 + 0 + 0, which equals 0. Thus,
the equation always evaluates to 0, regardless of the actual values of x and y.
Determine the opposite function of F = (ab' + c).
The desired function is G = F' = (ab' + c)'. DeMorgan's Law yields G = (ab')'
* c'. Applying DeMorgan's Law again to the first term yields G = (a' + (b')') *
c'. The involution property yields (a' + b)c'. Finally, the distributive property
yields G = a'c' + bc'.
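The four results of Example 2.14 can be double-checked by evaluating the starting and
ending expressions for every input combination. The Python sketch below assumes a small
helper named equivalent (our own name, not from the text) that compares two functions
over all combinations of 0 and 1.

```python
from itertools import product

def n(v):
    # Complement of a single bit.
    return 1 - v

def equivalent(f, g, num_vars):
    # True if f and g agree on every combination of num_vars input bits.
    return all(f(*row) == g(*row) for row in product((0, 1), repeat=num_vars))

# F = ab(c+d) versus abc + abd
check1 = equivalent(lambda a, b, c, d: a & b & (c | d),
                    lambda a, b, c, d: (a & b & c) | (a & b & d), 4)
# F = wx(x'y + zy' + xy) versus wxzy' + wxy
check2 = equivalent(lambda w, x, y, z: w & x & ((n(x) & y) | (z & n(y)) | (x & y)),
                    lambda w, x, y, z: (w & x & z & n(y)) | (w & x & y), 4)
# x(x' + y(x' + y')) versus the constant 0
check3 = equivalent(lambda x, y: x & (n(x) | (y & (n(x) | n(y)))),
                    lambda x, y: 0, 2)
# G = (ab' + c)' versus a'c' + bc'
check4 = equivalent(lambda a, b, c: n((a & n(b)) | c),
                    lambda a, b, c: (n(a) & n(c)) | (b & n(c)), 3)
print(check1, check2, check3, check4)
```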
EXAMPLE 2.15 Applying DeMorgan's Law in an aircraft lavatory sign
Commercial aircraft typically have an illuminated sign indicating whether a lavatory
(bathroom) is available. Suppose an aircraft has three lavatories. Each lavatory has a
sensor outputting 1 if the lavatory door is locked, 0 otherwise. Our circuit will have
three inputs, a, b, and c, coming from those sensors, as shown in Figure 2.25. If any
lavatory door is unlocked (whether one, two, or all three doors are unlocked), the circuit
should illuminate the "Available" sign by setting the circuit's output S to 1.
With this understanding, we recognize that the OR function suits the problem, as OR
outputs 1 if any of its inputs are 1, regardless of how many inputs are 1. We begin
writing an equation for S. S should be 1 if a is 0 OR b is 0 OR c is 0. Saying a is 0 is
the same as saying a'. Thus, the equation for S is:
S = a' + b' + c'
We translate the equation to the circuit in Figure 2.26.
We can apply DeMorgan's Law (in reverse) to the equation by noting that
(abc)' = a' + b' + c', so we can replace the equation by:
S = (abc)'
The circuit for that equation appears in Figure 2.27.
Figure 2.25 Aircraft lavatory sign block.
Figure 2.26 Aircraft lavatory sign circuit.
Figure 2.27 Circuit after applying DeMorgan's Law.
EXAMPLE 2.16 Proving a property of the automatic sliding door system
Your boss wants you to prove that the automatic sliding door circuit of Example 2.12 ensures
that the door will stay closed when the door is supposed to be forced to stay closed, namely,
when c=1. If the function f = c'(h+p) describes the sliding door, you can prove the door
will stay closed (f=0) using properties of Boolean algebra:
f = c'(h+p)
Let c = 1 (door forced closed)
f = 1'(h+p)
f = 0(h+p)
f = 0h + 0p     (by the distributive property)
f = 0 + 0       (by the null elements property)
f = 0
Therefore, no matter what the values of h and p, if c=1, f will equal 0: the door will stay
closed.
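The same property can be checked by evaluating f for every combination of h and p while
holding c at 1. This Python sketch is only a cross-check of the algebraic proof above; the
function name door and the list name outputs_when_closed are ours.

```python
from itertools import product

def door(c, h, p):
    # f = c'(h + p)
    return (1 - c) & (h | p)

# With c fixed at 1, collect f for every combination of h and p.
outputs_when_closed = [door(1, h, p) for h, p in product((0, 1), repeat=2)]
print(outputs_when_closed)
```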
EXAMPLE 2.17 Automatic sliding door with opposite polarity
In Example 2.12, we computed the function to open an automatic sliding door as:
f = c'(h+p)
Suppose your automatic door control has an input with the opposite polarity from what we
expect: 0 means open the door, while 1 means close. We can compute the function g that
opens the door, and simplify that function, as follows:
g = f'
g = (c'(h+p))'          (by substituting the equation for f)
g = (c')' + (h+p)'      (by DeMorgan's Law)
g = c + (h+p)'          (by the Involution Law)
g = c + h'p'            (by DeMorgan's Law)
YOUR PROBLEM IS MY PROBLEM
The use of Boolean algebra for digital design is an example of the powerful general concept
of mapping one problem to another. By mapping a new problem (digital design) to an old
problem (logic representation), the solutions (Boolean algebra) to the old problem can
be applied to the new problem. Immediately, the new problem can benefit from perhaps
decades of work on solving the old problem. Mapping one problem to another is extremely
common in engineering, especially in computing. After all, why reinvent the wheel?
2.6 REPRESENTATIONS OF BOOLEAN FUNCTIONS
A Boolean function is a mapping of each possible combination of values for the function's
variables (the inputs) to either a 0 or 1 (the output). An example of a Boolean
function described in regular English is a function F of variables a and b, such that the
function outputs 1 when a is 0 and b is 0, or when a is 0 and b is 1. There are several
better representations than English for describing a Boolean function, including equations,
circuits, and truth tables, as shown in Figure 2.28. Each representation has its own
advantages and disadvantages, and each is useful at different times during design. Yet all
the representations, as different as they look from one another, represent the very same
function. It's like how there are different ways to represent a particular recipe for
chocolate chip cookies: written words, pictures, or even a video. But no matter how the
recipe is represented, it's the same recipe.
Figure 2.28 Seven representations of the very same function F(a,b):
(a) two English descriptions, (b) two equations, (c) two circuits, (d) a truth table.
(a) English 1: "F outputs 1 when a is 0 and b is 0, or when a is 0 and b is 1."
    English 2: "F outputs 1 when a is 0, regardless of b's value."
(b) Equation 1: F(a,b) = a'b' + a'b
    Equation 2: F(a,b) = a'
(d) Truth table
a b F
0 0 1
0 1 1
1 0 0
1 1 0
Equations
One way to represent a Boolean function is by using an equation. An equation is a
mathematical statement equating one expression with another. F(a,b) = a'b' + a'b is
an example of an equation. The right-hand side of the equation is often referred to as an
expression, which evaluates to either 0 or 1.
We've seen that different equations can represent the same function. The equation
F(a,b) = a'b' + a'b represents the same function as does the equation F(a,b) =
a'. Both equations perform exactly the same mapping of the input values to output values:
pick any input values (e.g., a=0 and b=0), and both equations map those input values to the
same output value (e.g., a=0 and b=0 would be mapped to F=1 by either equation).
One advantage of an equation as a Boolean function representation compared to
other representations is that we can easily manipulate an equation using properties of
Boolean algebra, enabling us to simplify an equation, prove that two equations represent
the same function, prove properties about a function, and more.
Circuits
A second way to represent a Boolean function is using a circuit of logic gates. A circuit
is an interconnection of components. Because each logic gate component has a predefined
mapping of input values to output values, and because wires just transmit their
values unchanged, a circuit describes a function.
We've seen that different circuits can represent the same function. The two circuits in
Figure 2.28 both represent the same function F. The bottom circuit uses fewer gates, but
the function is exactly the same as the top circuit.
One advantage of a circuit as a Boolean function representation compared to other
representations is that a circuit may represent an actual physical implementation of a
Boolean function, and ultimately our goal is to implement digital circuits physically.
Another advantage is that a circuit drawn graphically can enable quick and easy
comprehension of a function by humans.
Truth Table
A third way to represent a Boolean function is using a truth table. A truth table's left
side lists the input variables, and shows all possible value combinations of those inputs,
with one row per combination, as shown in Figure 2.29. A truth table's right side would
then list the function's output value (1 or 0) for the row's particular combination of
input values, as was shown in Figure 2.28(d). Any function of two variables will have those
four input combinations on the left side. People usually list the input combinations in
order of increasing binary value (00=0, 01=1, 10=2, 11=3), as we've done above, though
strictly speaking, we could list the combinations in any order as long as we listed all
possible combinations. For any combination of input values (e.g., a=0, b=0), we merely
need to look at the corresponding value in the output column (in the case of a=0, b=0, the
output shown in Figure 2.28(d) is 1) to determine the function's output.
Inputs   Output
a  b     F
0  0
0  1
1  0
1  1
Figure 2.29 Truth table structure for a two-input function F(a,b).
Figure 2.30 shows the truth table structure for a two-input function, a three-input
function, and a four-input function. Defining a specific function would involve filling
in the rightmost column for F with a 0 or a 1 for each row.
Figure 2.30 Truth table structure for: (a) a two-input function F(a,b), (b) a three-input
function F(a,b,c), and (c) a four-input function F(a,b,c,d).
Truth tables are not only found in digital design. If you've studied basic biology, you've
likely seen a type of truth table describing the outcome of various gene pairs. For example,
the table below shows outcomes for different eye color genes. Each person has two genes
for eye color, one (labeled M) from the mom, one (labeled D) from the dad. Assuming only
two possible values for each gene, blue and brown, the table lists all possible combinations
of eye color gene pairs that a person may have. For each combination, the table lists the
outcome. Only when a person has two blue eye genes will they have blue eyes; having one or
two brown eye genes results in brown eyes (due to the brown eye gene being dominant over
the blue eye gene).
Gene pair        Outcome
M      D         F
blue   blue      blue
blue   brown     brown
brown  blue      brown
brown  brown     brown
Unlike equations and circuits, a Boolean function has only one truth table
representation.
One advantage of a truth table as a Boolean function representation compared to
other representations is the fact that a function has only one truth table representation, so
we can convert any other Boolean function representation to a truth table to determine if
different representations represent the same function: if they represent the same function,
their truth tables will be identical. Truth tables are also quite intuitive to human
readers, as a truth table clearly shows the output for every possible input. Thus, notice
that we used truth tables in Figure 2.8 to describe in an intuitive manner the behavior of
basic logic gates.
A drawback of truth tables is that for a large number of inputs, the number of truth
table rows can be extremely large. Given a function with n inputs, the number of input
combinations is 2^n. A function with 10 inputs would have 2^10 = 1024 possible input
combinations; you can't easily see much of anything in a table having 1024 rows. A function
with 16 inputs would have 65,536 rows in its truth table.
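To get a feel for how quickly the row count grows, the input side of a truth table can be
generated programmatically. The Python sketch below uses the standard library's
itertools.product; the function name input_rows is ours. The rows come out in order of
increasing binary value, matching the usual listing convention.

```python
from itertools import product

def input_rows(n):
    """Return all 2**n input combinations in increasing binary order."""
    return list(product((0, 1), repeat=n))

for n in (2, 3, 10, 16):
    print(n, "inputs ->", len(input_rows(n)), "rows")
```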
EXAMPLE 2.18 Capturing a function as a truth table
Create a truth table describing a function that detects whether a three-bit input's value,
representing a binary number, is 5 or greater. Table 2.2 shows a truth table for the
function. We first list all possible combinations of the three input bits, which we've
labeled a, b, and c. We then enter a 1 in the output row if the inputs represent 5, 6, or 7
in binary. We enter 0s in all remaining rows.
TABLE 2.2 Truth table for 5-or-greater function.
a b c F
0 0 0 0
0 0 1 0
0 1 0 0
0 1 1 0
1 0 0 0
1 0 1 1
1 1 0 1
1 1 1 1
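The table above can be reproduced by treating a, b, and c as the bits of a binary number,
with a as the most significant bit. The following Python sketch does that; the function
name five_or_greater and the variable table are ours.

```python
from itertools import product

def five_or_greater(a, b, c):
    value = 4 * a + 2 * b + c   # a is the most significant bit
    return 1 if value >= 5 else 0

# One row per input combination, in increasing binary order.
table = [(a, b, c, five_or_greater(a, b, c))
         for a, b, c in product((0, 1), repeat=3)]
print("a b c F")
for row in table:
    print(*row)
```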
Converting among Boolean Function Representations
Given the above representations, we can view combinational logic design as defining the
appropriate Boolean function to solve a particular problem, and then creating a circuit
representation of that function. Defining the appropriate Boolean function requires not
only that we think about what that function should be, but also that we capture that
function in some form, typically either as an equation or a truth table. Then, we must
convert the captured function representation into a circuit. Thus, combinational logic
design requires that we know how to convert from one Boolean function representation to
another. For the three representations we have discussed so far (equations, circuits, and
truth tables), there are six possible conversions from one representation to another,
which we now describe (Figure 2.31).
Figure 2.31 Possible conversions from one Boolean function representation to another.
1. Equations to circuits
Converting an equation to a circuit can be done straightforwardly by using an AND gate
for every AND operator, an OR gate for every OR operator, and a NOT gate for every
NOT operator. We already gave several examples of such conversions in Section 2.4.
2. Circuits to equations
Converting a circuit into an equation can be done by starting from the circuit's inputs, and
then writing the output of each gate as an expression involving the gate's inputs. The
expression of the last gate before the output represents the expression for the circuit's
function. For example, suppose we are given the circuit in Figure 2.32. To convert to an
equation, we start with the inverter, whose output will represent c'. We continue with the
OR gate; note that we can't determine the output for the AND gate yet, until we create
expressions for all that gate's inputs. The OR gate's output represents h+p. Finally, we
write the output of the AND gate as c'(h+p). Thus, the equation F(c,h,p) = c'(h+p)
represents the same function as the circuit.
Figure 2.32 Converting a circuit to an equation.
3. Equations to truth tables
Converting an equation to a truth table can be done by first creating a truth table
structure appropriate for the number of function input variables, and then evaluating the
right-hand side of the equation for each combination of input values. For example, to
convert the equation F(a,b) = a'b' + a'b to a truth table, we would first create the truth
table structure for a two-input function, as shown in Figure 2.30(a). We would then
evaluate the right-hand side of the equation for each row's combination of input values,
as follows:
a=0 and b=0, F = 0'*0' + 0'*0 = 1*1 + 1*0 = 1 + 0 = 1
a=0 and b=1, F = 0'*1' + 0'*1 = 1*0 + 1*1 = 0 + 1 = 1
a=1 and b=0, F = 1'*0' + 1'*0 = 0*1 + 0*0 = 0 + 0 = 0
a=1 and b=1, F = 1'*1' + 1'*1 = 0*0 + 0*1 = 0 + 0 = 0
We would therefore fill in the table's right column as shown in Figure 2.33. Note that we
applied properties of Boolean algebra (mostly the identity and null elements
properties) to evaluate the equations.
Inputs   Output
a  b     F
0  0     1
0  1     1
1  0     0
1  1     0
Figure 2.33 Truth table for F(a,b) = a'b' + a'b.
Notice that converting the equation F(a,b) = a' to a truth table results in exactly the
same truth table as shown in Figure 2.33. In particular, evaluating the right-hand side of
the equation for each row's combination of input values yields:
a=0 and b=0, F = 0' = 1
a=0 and b=1, F = 0' = 1
a=1 and b=0, F = 1' = 0
a=1 and b=1, F = 1' = 0
Some people find it useful to create intermediate columns in the truth table to compute
the equation's intermediate values, thus filling each column of the table from left to
right, moving to the next column only after filling all rows of the current column. An
example for the equation F(a,b) = a'b' + a'b is shown in Figure 2.34.
Inputs             Output
a  b  a'b'  a'b    F
0  0  1     0      1
0  1  0     1      1
1  0  0     0      0
1  1  0     0      0
Figure 2.34 Truth table for F(a,b) = a'b' + a'b with intermediate columns.
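The equation-to-truth-table conversion is mechanical enough to sketch in a few lines of
Python: build the input rows, then evaluate the right-hand side for each row. The helper
name truth_table and the function names f1 and f2 are ours; 1 - x stands in for NOT.

```python
from itertools import product

def truth_table(func, num_inputs):
    # One tuple per row: the input bits followed by the output bit.
    return [(*row, func(*row)) for row in product((0, 1), repeat=num_inputs)]

def f1(a, b):
    # F = a'b' + a'b
    return ((1 - a) & (1 - b)) | ((1 - a) & b)

def f2(a, b):
    # F = a'
    return 1 - a

table1 = truth_table(f1, 2)
table2 = truth_table(f2, 2)
for row in table1:
    print(*row)
```

Both functions yield the same list of rows, which is the programmatic counterpart of the
observation that both equations produce the same truth table.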
4. Truth tables to equations
To convert a truth table to an equation, we create a product term for each 1 in the output
column, and we then OR all the product terms. For the table below (Figure 2.35), we get the
terms shown in the rightmost column of that table. ORing those terms yields
F = a'b' + a'b.
Inputs   Output   Term
a  b     F        F = sum of
0  0     1        a'b'
0  1     1        a'b
1  0     0
1  1     0
Figure 2.35 Converting a truth table to an equation.
5. Circuits to truth tables
We can convert a combinational circuit to a truth table by first converting the circuit to
an equation (described earlier), and then converting the equation to a truth table
(described earlier).
6. Truth tables to circuits
We can convert a truth table to a circuit by first converting the truth table to an equation
(described earlier), and then converting the equation to a circuit (described earlier).
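Conversion 4 (truth tables to equations) can be sketched as follows in Python: for every
row whose output is 1, write each variable in true form if its bit is 1 and in
complemented form if its bit is 0, then OR the resulting product terms. The function name
table_to_equation and the row format are our own assumptions for illustration.

```python
def table_to_equation(names, rows):
    """rows: tuples of input bits followed by the output bit.
    Returns a sum-of-products string with one product term per output 1."""
    terms = []
    for *bits, out in rows:
        if out == 1:
            # True form for a 1 bit, complemented form (x') for a 0 bit.
            terms.append("".join(name if bit else name + "'"
                                 for name, bit in zip(names, bits)))
    return " + ".join(terms) if terms else "0"

rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
equation = table_to_equation("ab", rows)
print("F =", equation)
```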
EXAMPLE 2.19 Parity generator circuit design starting from a truth table
(For this example, starting from a truth table is a more natural choice than an equation.)
Nothing is perfect, and digital circuits are no exception. Sometimes a bit on a wire changes
when it's not supposed to. So a 1 becomes a 0, or a 0 becomes a 1. For example, a 0 may be
traveling along a wire, when suddenly some electrical noise comes out of nowhere and changes
the 0 to a 1. While we can reduce the likelihood of such errors, perhaps by using
well-insulated wires, we can't completely prevent such errors, nor can we detect and correct
all of them, but we can detect some of them. Designers typically look for situations where
errors are likely to occur, such as data being transmitted between two chips over long
wires, like from a computer over a printer cable to a printer, or from a computer over a
telephone line to another computer. For those situations, designers add circuits that at
least try to detect that an error has occurred, in which case the receiving circuit can ask
the sending circuit to resend the data.
One common method of detecting an error is called parity. Say we have 7 data bits to
transmit. We add an extra bit, called the parity bit, to make 8 bits total. The sender sets
the parity bit to a 1 if that would make the total number of 1s even; that's called even
parity. For example, if the 7 data bits were 0000001, then the parity bit would be 1, making
the total number of 1s equal to 2 (an even number). The complete 8 bits would be 00000011.
If the 7 data bits were 1011111, then the parity bit would be 0, making the total number of
1s equal to 6 (an even number). The complete 8 bits would be 10111110.
The receiver now can detect if a bit has changed during transmission by checking that
there's an even number of 1s in the 8 bits received. If even, the transmission is assumed
correct. If not even, an error occurred during transmission. For example, if the receiver
receives 00000011, the transmission is assumed to be correct, and the parity bit can be
discarded, leaving 0000001. Suppose instead that an error occurred and the receiver receives
10000011. Seeing the odd number of 1s, the receiver knows that an error occurred; note that
the receiver does not know which bit is erroneous. Likewise, 00000010 would represent an
error too. Notice in this case that the error occurred in the parity bit, but the receiver
doesn't know where the error occurred.
Let's describe a function that generates an even parity bit P for 3 data bits a, b, and c.
Starting from an equation is hard: what's the equation? For this example, starting with a
truth table is the natural choice, as shown in Table 2.3. For each configuration of data
bits (i.e., for each row in the truth table), we set the parity bit to make the total number
of 1s even. From the truth table, we then obtain the following equation for the parity bit:
P = a'b'c + a'bc' + ab'c' + abc
We could then design the circuit using four AND gates and an OR gate.
TABLE 2.3 Even parity for 3-bit data.
a b c P
0 0 0 0
0 0 1 1
0 1 0 1
0 1 1 0
1 0 0 1
1 0 1 0
1 1 0 0
1 1 1 1
Note that even parity doesn't mean for sure that the data is correct (note that we were
careful to say earlier that the transmission was "assumed" to be correct if parity was
correct). In particular, if two errors occur on different bits, then the parity will still
be even. For example, the sender may send 0110, but the receiver may receive 1111. 1111 has
even parity and thus looks correct. More powerful error detection methods are possible to
detect multiple errors like this one, but at the price of adding extra bits.
Odd parity is also a common kind of parity: the parity bit value makes the total number of
1s odd. There's no quality difference between even parity and odd parity; the key is simply
that the sender and receiver must both use the same kind of parity, even or odd.
A popular representation of letters and numbers is known as ASCII, which encodes each
character into 7 bits. ASCII adds 1 bit for parity, for a total of 8 bits per character.
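The sender and receiver sides of even parity can be sketched in Python as below. Bit
strings are represented as text such as "0000001", and the function names
even_parity_bit, send, looks_correct, and parity_equation are ours, chosen only for
illustration. The last function is the equation for P derived in this example, with
1 - x standing in for NOT.

```python
def even_parity_bit(data_bits):
    """Return '1' if data_bits has an odd number of 1s, so the total becomes even."""
    return "1" if data_bits.count("1") % 2 == 1 else "0"

def send(data_bits):
    # Append the parity bit to the data bits.
    return data_bits + even_parity_bit(data_bits)

def looks_correct(received):
    # The receiver only checks that the number of 1s is even.
    return received.count("1") % 2 == 0

def parity_equation(a, b, c):
    # P = a'b'c + a'bc' + ab'c' + abc
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return (na & nb & c) | (na & b & nc) | (a & nb & nc) | (a & b & c)

print(send("0000001"), send("1011111"))
print(looks_correct("10000011"), looks_correct("1111"))
```

The final call illustrates the limitation noted above: 1111 passes the check even if it
was sent as 0110, because two bit errors leave the parity even.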
EXAMPLE 2.20 Converting a combinational circuit to a truth table
Convert the circuit depicted in Figure 2.36(a) into a truth table.
We begin by converting the circuit to an equation. Starting from the gates closest to the
inputs (the leftmost AND gate and the inverter in this case), we label each gate's output as
an expression of the gate's inputs. We label the leftmost AND gate's output, for example, as
ab. Likewise, we label the leftmost inverter's output as c'. Continuing through the
circuit's gates, we label the rightmost inverter's output as (ab)'. Finally, we label the
rightmost AND gate's output as (ab)'c', which corresponds to the Boolean equation for F.
The fully labeled circuit is shown in Figure 2.36(b).
From the Boolean equation, we can now construct the truth table for the combinational
circuit. Since our circuit has three inputs (a, b, and c), there are 2^3 = 8 possible
combinations of inputs (i.e., 000, 001, 010, 011, 100, 101, 110, 111), so our truth table
has the eight rows shown in Figure 2.37. For each input combination, we compute the value of
F and fill in the corresponding entry in the truth table. For example, when a=0, b=0, and
c=0, F is (0*0)' * (0)' = 1 * 1 = 1. We compute the circuit's output for the remaining
combinations of inputs using a truth table with intermediate values, shown in Figure 2.37.
Figure 2.36 (a) Combinational circuit, and (b) circuit with gates' output expressions labeled.
Inputs Outputs
a b c ab (ab)' c' F
0 0 0 0 1 1 1
0 0 1 0 1 0 0
0 1 0 0 1 1 1
0 1 1 0 1 0 0
1 0 0 0 1 1 1
1 0 1 0 1 0 0
1 1 0 1 0 1 0
1 1 1 1 0 0 0
Figure 2.37 Truth table for the circuit's equation.
Standard Representation and Canonical Form
Truth tables as a Boolean function standard representation
We stated earlier that, although there are many possible equation representations and
circuit representations of the same Boolean function, there is only one possible truth
table representation of a Boolean function. Truth tables therefore represent a standard
representation of a function: for any function, there may be many possible equations,
and many possible circuits, but there is only one truth table. The truth table
representation is unique.
One use of a standard representation of a Boolean function is for comparing two
functions to see if they are equivalent. Suppose you wanted to check if two Boolean
equations were equivalent. One way would be to try to manipulate one equation to be the
same as the other equation, like we did in our automatic sliding door example in Example
2.13. But suppose we were not successful in getting them to be the same; is that because
they really are not the same, or because we just didn't manipulate the equation enough?
How do we really know the two equations are not the same?
A conclusive way to check if two functions are the same is to create a truth table for
each, and then check whether the truth tables are identical. So to determine whether
F = ab + a' is equivalent to F = a'b' + a'b + ab, we could generate truth tables for each,
using the method described earlier of evaluating the function for each output row, as
shown below.
F = ab + a'      F = a'b' + a'b + ab
a b F            a b F
0 0 1            0 0 1
0 1 1            0 1 1
1 0 0            1 0 0
1 1 1            1 1 1
We see that the two functions are indeed equivalent, because the outputs are identical for
each input combination. Now let's check if F = ab + a' is equivalent to F = (a+b)' by
comparing truth tables.
F = ab + a'      F = (a+b)'
a b F            a b F
0 0 1            0 0 1
0 1 1            0 1 0
1 0 0            1 0 0
1 1 1            1 1 0
As seen above, those two functions are clearly not equivalent. Comparing truth tables
leaves no doubt.
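The truth table comparison just described is straightforward to automate for small
functions. In the Python sketch below, the helper outputs (our name) returns a function's
output column in increasing binary order of the inputs, so two functions are equivalent
exactly when their output columns match.

```python
from itertools import product

def outputs(func, num_inputs):
    # Output column of the truth table, rows in increasing binary order.
    return [func(*row) for row in product((0, 1), repeat=num_inputs)]

def f(a, b):
    # F = ab + a'
    return (a & b) | (1 - a)

def g(a, b):
    # F = a'b' + a'b + ab
    return ((1 - a) & (1 - b)) | ((1 - a) & b) | (a & b)

def h(a, b):
    # F = (a+b)'
    return 1 - (a | b)

print("f vs g:", outputs(f, 2) == outputs(g, 2))
print("f vs h:", outputs(f, 2) == outputs(h, 2))
```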
While comparing truth tables works fine when a function has only 2 inputs, what if a
function has 5 inputs, or 10, or 32? Creating truth tables becomes increasingly
cumbersome, and in many cases just plain unrealistic, since a truth table's number of rows
equals 2^n, where n is the number of inputs. 2^n grows very quickly. 2^32 is approximately
4 billion, for example. We can't realistically expect to compare 2 tables of 4 billion rows
each.
However, in many cases, the number of output 1s in a truth table may be very small
compared to the number of output 0s. For example, consider a function G of 5 variables a,
b, c, d, and e: G = abcd + a'bcde. A truth table for this function would have 32 rows,
but only three 1s in the output column: one 1 from a'bcde, and two 1s from abcd
(which covers rows corresponding to abcde and abcde'). This leads to the question:
Is there a more compact but still standard representation of a Boolean function?
Canonical Form: Sum-of-Minterms Equation
The answer to the above question is "yes." The key is to create a standard representation
that only describes the situations where the function outputs 1, with the other situations
assumed to output 0. An equation, such as G = abcd + a'bcde, is indeed a representation
that only describes the situations where G is 1, but that representation is not unique,
that is, the representation is not standard. We therefore want to define a standard form of
a Boolean equation, known as a canonical form.
You've seen canonical forms in regular algebra. For example, the canonical form of a
polynomial of degree two is: ax^2 + bx + c. To check if the equation 9x^2 + 3x + 2 + 1
is equivalent to the equation 3 * (3x^2 + 1 + x), we convert each to canonical
form, resulting in 9x^2 + 3x + 3 for both equations.
One canonical form for a Boolean function is known as a sum-of-minterms. A
minterm of a function is a product term whose literals include every variable of the
function exactly once, in either true or complemented form. The function F(a,b,c) = a'bc
+ abc' + ab + c has four terms. The first two terms, a'bc and abc', are minterms.
The third term, ab, is not a minterm since c does not appear. Likewise, the fourth term, c,
is not a minterm, since neither a nor b appears in that term. An equation is in
sum-of-minterms form if the equation is in sum-of-products form, and every product term is
a minterm.
Convening any equation to sum-of-minterms canonical form can be done follo\\i n!!
just a few steps: -
1. First, we manipulate the equation until it is in sum-of-products form. Suppose we are given the equation F(a,b,c) = (a+b)(a'+ac)b. We manipulate it as follows:
   F = (a+b)(a'+ac)b
   F = (a+b)(a'b+acb)                 (by the distributive property)
   F = a(a'b+acb) + b(a'b+acb)        (distributive property)
   F = aa'b + aacb + ba'b + bacb      (distributive property)
   F = 0*b + acb + a'b + acb          (complement, commutative, idempotent)
   F = acb + a'b + acb                (null elements)
   F = acb + a'b                      (idempotent)
2. Second, we expand each term until every term is a minterm:
   F = acb + a'b
   F = acb + a'b*1                    (identity)
   F = acb + a'b*(c+c')               (complement)
   F = acb + a'bc + a'bc'             (distributive)
3. (Optional step) For neatness, we can arrange the literals within each term in a consistent order (say alphabetical), and we can also arrange the terms in the order they would appear in a truth table:
   F = a'bc' + a'bc + abc
The equation is now in sum-of-minterms form. The equation is in sum-of-products form, and every product term includes every variable exactly once.
An alternative canonical form is known as product-of-maxterms. A maxterm is a sum term in which every variable appears exactly once in either true or complemented form, such as (a + b + c') for a function of three variables a, b, and c. An equation is in product-of-maxterms form if the equation is the product of sum terms, and every sum term is a maxterm. An example of a function (different from that above) in product-of-maxterms form is J(a,b,c) = (a + b + c')(a' + b' + c'). To avoid confusing the reader, we will not discuss the product-of-maxterms form further here, as sum-of-minterms form is more common in practice, and sufficient for our purposes.
EXAMPLE 2.21 Comparing two functions using canonical form
Suppose we want to determine whether the functions G(a,b,c,d,e) = abcd + a'bcde and H(a,b,c,d,e) = abcde + abcde' + a'bcde + a'bcde(a' + c) are equivalent. We first convert G to sum-of-minterms form:
   G = abcd + a'bcde
   G = abcd(e+e') + a'bcde
   G = abcde + abcde' + a'bcde
   G = a'bcde + abcde' + abcde
We then convert H to sum-of-minterms form:
   H = abcde + abcde' + a'bcde + a'bcde(a' + c)
   H = abcde + abcde' + a'bcde + a'bcdea' + a'bcdec
   H = abcde + abcde' + a'bcde + a'bcde + a'bcde
   H = abcde + abcde' + a'bcde
   H = a'bcde + abcde' + abcde
Clearly, G and H are equivalent.
Note that checking the equivalence using truth tables would have resulted in 2 rather large truth tables having 32 rows each. Using sum of minterms was probably more appropriate here.
Compact sum-of-minterms representation
A more compact representation of sum-of-minterms form involves listing each minterm as a number, with each minterm's number determined from the binary representation of its variables' values. For example, a'bcde corresponds to 01111, or 15; abcde' corresponds to 11110, or 30; and abcde corresponds to 11111, or 31. Thus, we can say that the function H represented by the equation:
   H = a'bcde + abcde' + abcde
is the sum of the minterms 15, 30, and 31, which can be compactly written as:
   H = Σm(15, 30, 31)
The summation symbol means the sum, and the numbers inside the parentheses represent the minterms being summed on the right side of the equation.
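The minterm numbering can be checked mechanically. The following is a minimal Python sketch, not part of the original text: the helper name minterms is our own, and we assume the first variable is the most significant bit, as in the numbering above. It lists the minterm numbers of G and H from Example 2.21 and compares them.

```python
from itertools import product

def minterms(func, num_vars):
    """Return the sorted minterm numbers of a Boolean function.

    Each input combination is read as a binary number with the first
    variable as the most significant bit.
    """
    result = []
    for number, bits in enumerate(product((0, 1), repeat=num_vars)):
        if func(*bits):
            result.append(number)
    return result

# G = abcd + a'bcde
def G(a, b, c, d, e):
    return (a and b and c and d) or ((not a) and b and c and d and e)

# H = abcde + abcde' + a'bcde + a'bcde(a' + c)
def H(a, b, c, d, e):
    return ((a and b and c and d and e)
            or (a and b and c and d and not e)
            or ((not a) and b and c and d and e)
            or ((not a) and b and c and d and e and ((not a) or c)))

print(minterms(G, 5))                    # [15, 30, 31]
print(minterms(G, 5) == minterms(H, 5))  # True
```

Two functions are equivalent exactly when their minterm lists match, which is the comparison the canonical form makes possible without writing out either truth table by hand.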
Multiple-Output Combinational Circuits
Many combinational circuits not only involve more than one input, but also involve more than one output. The simplest approach to handling a multiple-output circuit is to treat each output separately, leading to a separate circuit for each output. Actually, the circuits need not be completely separate: they could share common gates. We'll show how to handle multiple-output circuits through examples.
EXAMPLE 2.22 Two-output combinational circuit
Design a circuit to implement the following two equations of three inputs a, b, and c:
   F = ab + c'
   G = ab + bc
We can design the circuit by simply creating two separate circuits, as in Figure 2.38(a).
Figure 2.38 Multiple-output circuit: (a) treated as two separate circuits, and (b) with gate sharing.
We can instead notice that the term ab is common to both equations. Thus, the two circuits can share the gate that computes ab, as shown in Figure 2.38(b).
EXAMPLE 2.23 Binary number to seven-segment display converter
Many electronic appliances display a number for us to read. Example appliances include a clock, an oven, and a telephone answering machine. A very simple device for displaying a single-digit number is a seven-segment display, illustrated in Figure 2.39.
Note: For this example, starting from a truth table is a more natural choice than an equation.
Figure 2.39 Seven-segment display: (a) connections of inputs to segments, (b) input values for numbers 0, 1, and 2 (abcdefg = 1111110, 0110000, and 1101101, respectively), (c) a pair of real seven-segment display components.
The display consists of seven light segments, each of which can be illuminated independently of the others. We can display the desired digit by setting the signals a, b, c, d, e, f, and g appropriately. So to display the digit 8, we set all seven signals to 1. To display the digit 1, we set b and c to 1.
A useful combinational circuit is one that converts a binary number to the seven-segment display signals a-g that display the number as a decimal digit. We need four inputs, say w, x, y, z, to represent the binary values of the ten possible digits 0 to 9. Table 2.4 describes the conversion of each binary number to the seven-segment display's signals. We decided to activate no segments for the numbers 10 through 15.
TABLE 2.4 4-bit binary number to seven-segment display truth table.
   w x y z | a b c d e f g | Displayed
   0 0 0 0 | 1 1 1 1 1 1 0 | 0
   0 0 0 1 | 0 1 1 0 0 0 0 | 1
   0 0 1 0 | 1 1 0 1 1 0 1 | 2
   0 0 1 1 | 1 1 1 1 0 0 1 | 3
   0 1 0 0 | 0 1 1 0 0 1 1 | 4
   0 1 0 1 | 1 0 1 1 0 1 1 | 5
   0 1 1 0 | 1 0 1 1 1 1 1 | 6
   0 1 1 1 | 1 1 1 0 0 0 0 | 7
   1 0 0 0 | 1 1 1 1 1 1 1 | 8
   1 0 0 1 | 1 1 1 1 0 1 1 | 9
   1 0 1 0 | 0 0 0 0 0 0 0 | (blank)
   1 0 1 1 | 0 0 0 0 0 0 0 | (blank)
   1 1 0 0 | 0 0 0 0 0 0 0 | (blank)
   1 1 0 1 | 0 0 0 0 0 0 0 | (blank)
   1 1 1 0 | 0 0 0 0 0 0 0 | (blank)
   1 1 1 1 | 0 0 0 0 0 0 0 | (blank)
We can create a custom logic circuit to implement the converter. Note that the above table is in the form of a truth table having multiple outputs (a through g). We can treat each output separately, so we design a circuit for a, then for b, etc. Looking at the 1s in the a column, we obtain the following equation for a:
   a = w'x'y'z' + w'x'yz' + w'x'yz + w'xy'z + w'xyz' + w'xyz + wx'y'z' + wx'y'z
Looking at the 1s in the b column, we obtain the following equation for b:
   b = w'x'y'z' + w'x'y'z + w'x'yz' + w'x'yz + w'xy'z' + w'xyz + wx'y'z' + wx'y'z
We could then proceed to create equations for the remaining outputs c through g. Finally, we would create a circuit for a having 8 4-input AND gates and an 8-input OR gate, another circuit for b having 8 4-input AND gates and an 8-input OR gate, and so on for c through g. We could, of course, have minimized the logic for each equation before creating each of the circuits.
You may notice that the equations for a and b have several terms in common. For example, the term w'x'y'z' appears in both equations. So it would make sense for both outputs to share one AND gate generating that term. Looking at the truth table, we see that the term w'x'y'z' is in fact needed for outputs a, b, c, d, e, and f (every segment lit when displaying 0), and thus the one AND gate generating that term could be shared by all six of those outputs. Likewise, each of the other required terms is shared by several outputs, meaning each gate generating each term could be shared among several outputs.
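Reading equations off the 1s of a truth table column is mechanical, so it can be sketched in a few lines of Python. This is an illustrative sketch only: the names SEGMENTS, minterm, equation, and sharing are our own, and the segment patterns are copied from Table 2.4 (inputs 10 through 15 are treated as all 0s).

```python
# Segment patterns "abcdefg" for digits 0-9, copied from Table 2.4.
SEGMENTS = {
    0: "1111110", 1: "0110000", 2: "1101101", 3: "1111001", 4: "0110011",
    5: "1011011", 6: "1011111", 7: "1110000", 8: "1111111", 9: "1111011",
}
BLANK = "0000000"  # inputs 10-15 activate no segments

def minterm(n):
    """Write minterm number n (0-15) as a product term in w, x, y, z."""
    bits = format(n, "04b")
    return "".join(v if bit == "1" else v + "'" for v, bit in zip("wxyz", bits))

def equation(segment):
    """Sum-of-minterms equation for one segment output, 'a' through 'g'."""
    col = "abcdefg".index(segment)
    terms = [minterm(n) for n in range(16) if SEGMENTS.get(n, BLANK)[col] == "1"]
    return segment + " = " + " + ".join(terms)

def sharing(n):
    """Outputs whose equations contain minterm n, so they could share its AND gate."""
    return [s for s, bit in zip("abcdefg", SEGMENTS.get(n, BLANK)) if bit == "1"]

print(equation("a"))
print(sharing(0))  # ['a', 'b', 'c', 'd', 'e', 'f']
```

Calling equation for each of the seven outputs yields the full set of unminimized equations, and sharing shows which outputs could reuse a given AND gate.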
2.7 COMBINATIONAL LOGIC DESIGN PROCESS
Based on the previous sections, we can define a straightforward method for designing combinational logic, summarized in Table 2.5.
TABLE 2.5 Combinational logic design process.
   Step 1: Capture the function
      Create a truth table or equations, whichever is most natural for the problem, to describe the desired behavior of the combinational logic.
   Step 2: Convert to equations
      This step is only necessary if you captured the function using a truth table instead of equations. Create an equation for each output by ORing all the minterms for that output. Simplify the equations if desired.
   Step 3: Implement as a gate-based circuit
      For each output, create a circuit corresponding to the output's equation. (Sharing gates among multiple outputs is OK optionally.)
Gate-based circuits designed such that the inputs feed into a column of AND gates that feeds into a single OR gate are known as two-level logic implementations.
EXAMPLE 2.24 Three 1s pattern detector
We want to implement a circuit that can detect whether a pattern of at least three 1s occurs anywhere in an 8-bit input, and that outputs a 1 in that case. The inputs are a, b, c, d, e, f, g, and h, and the output is y. So for an input of abcdefgh = 00011101, y should be 1, since there are three 1s in a row (on inputs d, e, and f). For an input of 10101011, the output should be 0, since there are not three 1s in a row anywhere. An input of 11110000 should result in y = 1, since having more than three 1s in a row should still output a 1. Such a circuit is an extremely simple example of a general class of circuits known as pattern detectors. Pattern detectors are widely used, for example, in image processing to detect things like humans or tanks in a digitized video image, or to detect specific spoken words in a digitized audio stream.
Note: For this example, starting from an equation is a more natural choice than a truth table.
Step 1: Capture the function. We could capture the function as a rather large truth table, listing out all 256 combinations of inputs and entering a 1 for y in each row where at least three 1s occur. However, a simpler method for capturing this particular function is to create an equation that lists the possible occurrences of three 1s in a row. One possibility is that of abc=111. Another is that of bcd=111. Likewise, if cde=111, def=111, efg=111, or fgh=111, we should output a 1. For each possibility, the values of the other inputs don't matter. So if abc=111, we output a 1, regardless of the values of d, e, f, g, and h. Thus, an equation describing y is simply:
   y = abc + bcd + cde + def + efg + fgh
Step 2: Convert to equations. We can skip this step since we already have an equation.
Step 3: Implement as a gate-based circuit. No simplification of the equation is possible. The resulting circuit is shown in Figure 2.40.
Figure 2.40 Three 1s pattern detector.
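As a sanity check on the equation, the following Python sketch (our own illustration; the function names are hypothetical) evaluates y = abc + bcd + cde + def + efg + fgh on all 256 input combinations and compares it against a direct search for three consecutive 1s.

```python
from itertools import product

def detect(a, b, c, d, e, f, g, h):
    """y = abc + bcd + cde + def + efg + fgh, with 0/1 inputs."""
    return ((a & b & c) | (b & c & d) | (c & d & e)
            | (d & e & f) | (e & f & g) | (f & g & h))

def has_three_ones(bits):
    """Reference check: does '111' occur anywhere in the bit string?"""
    return int("111" in "".join(str(bit) for bit in bits))

# Compare the equation against the reference on all 256 input combinations.
mismatches = [bits for bits in product((0, 1), repeat=8)
              if detect(*bits) != has_three_ones(bits)]
print(len(mismatches))                 # 0
print(detect(0, 0, 0, 1, 1, 1, 0, 1))  # 1
print(detect(1, 0, 1, 0, 1, 0, 1, 1))  # 0
```

The exhaustive comparison plays the role of the 256-row truth table that the equation let us avoid writing by hand.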
EXAMPLE 2.25 Number-of-1s counter
We want to design a circuit that counts the number of 1s present on 3 inputs a, b, c, and outputs that number in binary using 2 outputs, y and z. An input of 110 has two 1s, so our circuit should output 10. The number of 1s on 3 inputs can range from 0 to 3, so a 2-bit output is sufficient, since 2 bits can represent 0 to 3. A number-of-1s counter circuit is useful in a variety of situations, such as detecting the density of electronic particles hitting a collection of sensors by counting how many sensors are activated. As another example, there are airport parking lots that have sensors above each parking spot, coupled with signs that inform drivers of the number of available parking spots on a particular level of a multilevel parking structure (by counting the number of zeros, but that's the same as counting the number of 1s with all inputs first complemented).
Note: For this example, starting from a truth table is a more natural choice than an equation.
Step 1: Capture the function. Capturing the function for this example is most naturally achieved using a truth table. We list all the possible input combinations, and the desired output number, as in Table 2.6.
TABLE 2.6 Truth table for number-of-1s counter.
   Inputs   (# of 1s)   Outputs
   a b c                y z
   0 0 0      (0)       0 0
   0 0 1      (1)       0 1
   0 1 0      (1)       0 1
   0 1 1      (2)       1 0
   1 0 0      (1)       0 1
   1 0 1      (2)       1 0
   1 1 0      (2)       1 0
   1 1 1      (3)       1 1
Step 2: Convert to equations. We create equations for each output as follows:
   y = a'bc + ab'c + abc' + abc
   z = a'b'c + a'bc' + ab'c' + abc
We can simplify the first equation algebraically:
   y = a'bc + ab'c + ab(c' + c) = a'bc + ab'c + ab
Step 3: Implement as a gate-based circuit. We then create the final circuits for the two outputs, as shown in Figure 2.41.
Figure 2.41 Number-of-1s counter gate-based circuit.
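The equations, including the simplified form of y, can be checked against the intended behavior with a short Python sketch. This is an illustration under our own naming (count_ones is hypothetical): it treats yz as a 2-bit number and confirms it equals the number of 1s on the inputs.

```python
from itertools import product

def count_ones(a, b, c):
    """Return (y, z) from the Step 2 equations, using the simplified y."""
    na, nb, nc = 1 - a, 1 - b, 1 - c          # complemented inputs a', b', c'
    y = (na & b & c) | (a & nb & c) | (a & b)
    z = (na & nb & c) | (na & b & nc) | (a & nb & nc) | (a & b & c)
    return y, z

# The 2-bit output yz should equal the number of 1s for every input row.
for a, b, c in product((0, 1), repeat=3):
    y, z = count_ones(a, b, c)
    assert 2 * y + z == a + b + c

print(count_ones(1, 1, 0))  # (1, 0)
```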
Simplifying circuit notations
We used a couple of new simplifying notations in our circuits in the previous example. One simplifying notation is to list the inputs multiple times, to avoid having lines in our drawing crossing one another. An input listed multiple times is assumed to have been branched from the same input.
Another simplifying notation is the use of an inversion bubble at the input of a gate, rather than the use of a NOT gate. An input that is inverted into many gates is assumed to feed through a single inverter that is then branched out to those gates. An alternative simplification is to simply include complemented variables, like b', as inputs.
EXAMPLE 2.26 12-button keypad to 4-bit code converter
You've probably seen 12-button keypads in many different places, like on a telephone or at an ATM machine, as shown in Figure 2.42. The first row has buttons 1, 2, and 3, the second row has 4, 5, and 6, the third row has 7, 8, and 9, and the last row has *, 0, and #. The outputs of such a keypad consist of seven signals: one for each of the four rows (r1, r2, r3, and r4), and one for each of the three columns (c1, c2, and c3). Pushing a particular button causes exactly two outputs to become 1, corresponding to the row and column of that button. So pushing button "1" causes r1=1 and c1=1, while pushing button "#" causes r4=1 and c3=1. We want to design a circuit that converts the seven signals from the keypad into a 4-bit binary number wxyz indicating which button is pressed. We want buttons "0" to "9" to be coded as 0000 through 1001 (0 through 9 in binary), respectively. Let's encode button "*" as 1010, "#" as 1011, and let's let 1111 mean that no button is pressed. Let's assume for now that only one button can ever be pressed at a given time.
Note: For this example, starting from equations is a more natural choice than a truth table, although we used an informal table (not a truth table) to help us determine the equations.
Figure 2.42 12-button keypad.
We could capture the functions for w, x, y, and z using a truth table, with the seven inputs on the left side of the table, and the four outputs on the right side, but that table would have 2^7 = 128 rows, and most of those rows would correspond merely to multiple buttons being pressed. Let's try instead to capture the functions using equations. The informal Table 2.7 might help us get started.
TABLE 2.7 Informal table for the 12-button keypad to 4-bit code converter.
   Button   Signals   w x y z
   1        r1 c1     0 0 0 1
   2        r1 c2     0 0 1 0
   3        r1 c3     0 0 1 1
   4        r2 c1     0 1 0 0
   5        r2 c2     0 1 0 1
   6        r2 c3     0 1 1 0
   7        r3 c1     0 1 1 1
   8        r3 c2     1 0 0 0
   9        r3 c3     1 0 0 1
   *        r4 c1     1 0 1 0
   0        r4 c2     0 0 0 0
   #        r4 c3     1 0 1 1
   (none)             1 1 1 1
SLOW DOWN! THE QWERTY KEYBOARD
Inside a standard computer keyboard is a small microprocessor and a ROM. The microprocessor detects which key is being pressed, looks up the 8-bit code for that key (much like the 12-button keypad in Example 2.26) from the ROM, and sends that code to the computer. There's an interesting story behind the way the keys are arranged in a standard PC keyboard, which is known as a QWERTY keyboard because those are the keys that begin the top left row of letters. The QWERTY arrangement was made in the era of typewriters (shown in the picture below), which, in case you haven't seen one, had each key connected to an arm that would swing up and press an ink ribbon against paper.
[Picture: Keys connected to arms]
An annoying problem with typewriters was that arms would often get jammed side-by-side up near the paper if you typed too fast, like too many people getting jammed side-by-side while trying to simultaneously walk through a doorway.
[Picture: Arms stuck!]
So typewriter keys were arranged in the QWERTY arrangement to slow down typing by separating common letters, since slower typing reduced the occurrences of jammed keys. When PCs were invented, the QWERTY arrangement was the natural choice for PC keyboards, as people were accustomed to that arrangement. Some say the differently-arranged Dvorak keyboard enables faster typing, but that type of keyboard isn't very common, as people are just too accustomed to the QWERTY keyboard.
Using this table, we can derive equations for each of the four outputs, as follows:
   w = r3c2 + r3c3 + r4c1 + r4c3 + r1'r2'r3'r4'c1'c2'c3'
   x = r2c1 + r2c2 + r2c3 + r3c1 + r1'r2'r3'r4'c1'c2'c3'
   y = r1c2 + r1c3 + r2c3 + r3c1 + r4c1 + r4c3 + r1'r2'r3'r4'c1'c2'c3'
   z = r1c1 + r1c3 + r2c2 + r3c1 + r3c3 + r4c3 + r1'r2'r3'r4'c1'c2'c3'
We could then create a circuit for each output. Obviously, the last term of each equation could be shared by all four outputs. Likewise, other terms could be shared too (like r2c3).
Note that this circuit would not work well if multiple buttons can be pressed simultaneously. Our circuit will output either a valid or invalid code in that situation, depending on which buttons were pressed. A preferable circuit would treat multiple buttons being pressed as no button being pressed. We leave the design of that circuit as an exercise.
Circuits similar to what we designed above exist in computer keyboards, except that there are a lot more rows and columns.
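The four output equations can be exercised with a small Python sketch. This is our own illustration, not a circuit description: the names BUTTONS, keypad_code, and press are hypothetical, and the sketch assumes at most one button is pressed, as the example does.

```python
BUTTONS = {  # button -> (row, column), both numbered from 1
    "1": (1, 1), "2": (1, 2), "3": (1, 3),
    "4": (2, 1), "5": (2, 2), "6": (2, 3),
    "7": (3, 1), "8": (3, 2), "9": (3, 3),
    "*": (4, 1), "0": (4, 2), "#": (4, 3),
}

def keypad_code(r1, r2, r3, r4, c1, c2, c3):
    """Evaluate the equations for w, x, y, z and return the code as a string."""
    # Shared last term: r1'r2'r3'r4'c1'c2'c3' (no button pressed).
    none = (1-r1) & (1-r2) & (1-r3) & (1-r4) & (1-c1) & (1-c2) & (1-c3)
    w = r3 & c2 | r3 & c3 | r4 & c1 | r4 & c3 | none
    x = r2 & c1 | r2 & c2 | r2 & c3 | r3 & c1 | none
    y = r1 & c2 | r1 & c3 | r2 & c3 | r3 & c1 | r4 & c1 | r4 & c3 | none
    z = r1 & c1 | r1 & c3 | r2 & c2 | r3 & c1 | r3 & c3 | r4 & c3 | none
    return f"{w}{x}{y}{z}"

def press(button):
    """Drive the row and column signals for a single pressed button."""
    row, col = BUTTONS[button]
    rows = [int(row == i) for i in (1, 2, 3, 4)]
    cols = [int(col == j) for j in (1, 2, 3)]
    return keypad_code(*rows, *cols)

print(press("7"))                          # 0111
print(press("#"))                          # 1011
print(keypad_code(0, 0, 0, 0, 0, 0, 0))    # 1111
```

Passing two rows or two columns as 1 to keypad_code shows the weakness noted above: the ORed terms combine, and the result may or may not be a valid code.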
EXAMPLE 2.27 Sprinkler valve controller
Automatic lawn sprinkler systems use a digital system to control the opening and closing of water valves. A sprinkler system supports several different zones, such as the backyard, left side yard, right side yard, front yard, etc. Only one zone's valve can be opened at a time in order to maintain enough water in the sprinklers in that zone. Suppose a sprinkler system supports up to 8 zones. Typically, a sprinkler system is controlled by a small, inexpensive microprocessor executing a program that opens each valve only at specific times of the day and for specific durations. Suppose the microprocessor only has 4 output pins available to control the valves, not 8 outputs as required for the 8 zones. We can instead program the microprocessor to use 1 pin to indicate whether a valve should be opened, and use the 3 other pins to output the active zone (0, 1, ..., 7) in binary. Thus, we need to design a combinational circuit having 4 inputs, e (the enabler) and a, b, c (the binary value of the active zone), and having 8 outputs d7, d6, ..., d0 (the valve controls), as shown in Figure 2.43. When e=1, the circuit should decode the 3-bit binary input by setting exactly one output to 1.
Note: For this example, starting from equations is a more natural choice than a truth table.
Figure 2.43 Sprinkler valve controller block diagram.
Step 1: Capture the function. Valve 0 should be active when abc=000 and e=1. So the equation for d0 is:
   d0 = a'b'c'e
Likewise, valve 1 should be active when abc=001 and e=1, so the equation for d1 is:
   d1 = a'b'ce
The equations for the remaining outputs can be determined similarly:
   d2 = a'bc'e
   d3 = a'bce
   d4 = ab'c'e
   d5 = ab'ce
   d6 = abc'e
   d7 = abce
Step 2: Convert to equations. No conversion is needed since we already have equations.
Step 3: Implement as a gate-based circuit. The circuit implementing the equations is shown in Figure 2.44. The circuit we've designed is actually a commonly used component known as a decoder with enable. We'll introduce decoders as a building block in an upcoming section.
Figure 2.44 Sprinkler valve controller circuit (actually a 3x8 decoder with enable).
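The eight equations translate directly into code. The following Python sketch is our own illustration (the function name sprinkler_decoder is hypothetical); it evaluates d0 through d7 as 4-input AND terms and shows the effect of the enable input.

```python
def sprinkler_decoder(e, a, b, c):
    """Return [d0, d1, ..., d7] from the Step 1 equations, with 0/1 inputs."""
    na, nb, nc = 1 - a, 1 - b, 1 - c   # complemented inputs a', b', c'
    return [
        na & nb & nc & e,   # d0 = a'b'c'e
        na & nb & c & e,    # d1 = a'b'ce
        na & b & nc & e,    # d2 = a'bc'e
        na & b & c & e,     # d3 = a'bce
        a & nb & nc & e,    # d4 = ab'c'e
        a & nb & c & e,     # d5 = ab'ce
        a & b & nc & e,     # d6 = abc'e
        a & b & c & e,      # d7 = abce
    ]

print(sprinkler_decoder(1, 0, 1, 1))  # [0, 0, 0, 1, 0, 0, 0, 0], zone 3 open
print(sprinkler_decoder(0, 0, 1, 1))  # [0, 0, 0, 0, 0, 0, 0, 0], all valves closed
```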
2.8 MORE GATES
We earlier introduced three basic logic gates: AND, OR, and NOT. Designers commonly use several other types of gates too: NAND, NOR, XOR, and XNOR.
NAND & NOR
A NAND gate (short for "not AND") has the opposite output of an AND gate, outputting a 0 when all inputs are 1, and outputting a 1 if any input is a 0. A NAND gate has the same behavior as an AND gate followed by a NOT gate. Figure 2.45(a) illustrates a NAND gate.
A NOR gate (short for "not OR") has the opposite output of an OR gate, outputting a 0 if at least one input is a 1, and outputting a 1 if all inputs are 0. A NOR gate has the same behavior as an OR gate followed by a NOT gate. Figure 2.45(b) illustrates a NOR gate.
We earlier warned you in Section 2.4 that our CMOS transistor implementations of AND and OR gates were not realistic. Here's why. It turns out that pMOS transistors don't actually conduct 0s very well, but they conduct 1s just fine. Likewise, nMOS transistors don't conduct 1s well, but they conduct 0s just fine. The reasons for these asymmetries are beyond this book's scope. But the implications are that the AND and OR gates we built earlier (see Figure 2.8) are not feasible, since they rely on pMOS transistors to conduct 0s (but pMOS conducts 0s poorly) and nMOS transistors to conduct 1s (but nMOS conducts 1s poorly). On the other hand, if we swap power and ground in the AND and OR circuits of Figure 2.8, we obtain the gates shown in Figure 2.45(a) and (b). Those gates have the behavior of NAND and NOR gates, which makes sense since output 1s become replaced by 0s, and 0s by 1s.
Figure 2.45 Additional gates: (a) NAND, (b) NOR, (c) XOR, (d) XNOR. Truth tables for the 2-input gates:
   x y | NAND NOR XOR XNOR
   0 0 |  1    1   0   1
   0 1 |  1    0   1   0
   1 0 |  1    0   1   0
   1 1 |  0    0   0   1
We can still implement an AND gate in CMOS, but we would do so by appending a NOT gate at the output of a NAND gate (NAND followed by NOT gives us AND), as shown in Figure 2.46. Likewise, we would implement an OR gate by appending a NOT gate at the output of a NOR gate. But that's slower than a circuit directly implemented as NAND and NOR. Fortunately, we can apply straightforward methods to convert any AND/OR/NOT circuit to a NAND-only circuit, or to a NOR-only circuit. We'll describe those methods in Section 7.2.
Figure 2.46 AND gate in CMOS.
EXAMPLE 2.28 Aircraft lavatory sign using a NAND gate
Example 2.15 described a lavatory available sign using the following equation:
   S = (abc)'
Noticing that the term on the right side corresponds to a NAND, we can implement the circuit using a single NAND gate, as shown in Figure 2.47.
Figure 2.47 Circuit using NAND.
XOR & XNOR
A 2-input XOR gate, short for "exclusive or" and pronounced as "ex or," outputs a 1 if exactly one of the two inputs is a 1. So if such a gate has inputs a and b, then the output F is 1 if a=1 and b=0, or if b=1 and a=0. Figure 2.45(c) illustrates an XOR gate (for simplicity, we omit the transistor-level implementation of an XOR gate). For XOR gates with 3 or more inputs, the output is 1 only if the number of input 1s is odd. A 2-input XOR gate is equivalent to the function F = ab' + a'b.
An XNOR gate, short for "exclusive nor" and pronounced "ex nor," is simply the opposite of XOR. A 2-input XNOR is equivalent to F = a'b' + ab. Figure 2.45(d) illustrates an XNOR gate, omitting the transistor-level implementation for simplicity.
Interesting Uses of These Additional Gates
Detecting all 0s using NOR
A NOR gate can detect the situation of a data item equal to 0, since NOR outputs a 1 only when all inputs are 0. For example, suppose a byte (8-bit) input to your system is counting down from 99 to 0, and when the byte reaches 0, you want to sound an alarm. You can detect the byte being equal to 0 by simply connecting the 8 bits of the byte into an 8-input NOR gate.
Detecting equality using XNOR
XNOR gates can be used to compare two data items for equality, since a 2-input XNOR outputs a 1 only when the inputs are both 0 or are both 1. For example, suppose a byte input A (a7a6a5...a0) to your system is counting down from 99, and you want to sound an alarm when A has the same value as a second byte input B (b7b6b5...b0). You can detect such equality using eight 2-input XNOR gates, by connecting a7 and b7 to the first XNOR gate, a6 and b6 to the second XNOR gate, etc. Each XNOR gate tells us whether the bits in that particular position are equal. By ANDing all the XNOR outputs, we can tell whether every position is equal.
Generating and detecting parity using XOR
An XOR gate can be used to generate a parity bit for a set of data bits (see Example 2.19). XORing the data bits results in a 1 if there's an odd number of 1s in the data, so XOR computes the correct parity bit for even parity, since the XOR's output 1 would make the total number of 1s even. Notice that the truth table we created for generating an even parity bit in Table 2.3 does in fact represent a 3-bit XOR.
Likewise, an XNOR gate can be used to generate an odd parity bit.
XOR can also be used to detect proper parity. XORing the incoming data bits along with the incoming parity bit will yield 1 if the number of 1s is odd. Thus, for even parity, XOR can be used to indicate that an error has occurred, since the number of 1s is supposed to be even.
XNOR can be used to detect an error when odd parity is used.
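These uses of XOR and XNOR can be tried out in a short Python sketch. The helper names (xor_all, xnor, equal) and the sample data byte are our own assumptions, chosen only for illustration.

```python
from functools import reduce
from operator import xor

def xor_all(bits):
    """Multi-input XOR: 1 when the number of 1s among the inputs is odd."""
    return reduce(xor, bits, 0)

def xnor(a, b):
    """2-input XNOR: 1 when both inputs are equal."""
    return 1 - (a ^ b)

def equal(a_bits, b_bits):
    """AND together one 2-input XNOR per bit position."""
    return int(all(xnor(a, b) for a, b in zip(a_bits, b_bits)))

data = [1, 0, 1, 1, 0, 0, 1, 0]   # sample byte containing four 1s
parity = xor_all(data)            # even parity bit
received = data + [parity]
print(parity)                     # 0
print(xor_all(received))          # 0, no error detected
received[2] ^= 1                  # flip one bit to model a transmission error
print(xor_all(received))          # 1, error detected
print(equal(data, data))          # 1
```

Note that a single parity bit only reveals an odd number of flipped bits; flipping two bits of received would bring the XOR back to 0.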
Completeness of AND/OR/NOT, AND/NOT, OR/NOT, NAND, NOR
It should be fairly obvious that if you have AND gates, OR gates, and NOT gates, you can implement any Boolean function. This is because a Boolean function can be represented as a sum of products, which consists only of AND, OR, and NOT operations.
What might be slightly less obvious is that if you had only AND and NOT gates, you could still implement any Boolean function. Why? Here's a simple explanation: to obtain an OR, simply put NOT gates at the inputs and output of an AND. This works because F = (a'b')' = a'' + b'' (by DeMorgan's Law) = a + b.
Likewise, if you had only OR and NOT gates, you could implement any Boolean function. To obtain an AND, you could simply invert the inputs and output of an OR, since F = (a'+b')' = a''*b'' = ab.
It follows that if you only had NAND gates available to you, you could still implement any Boolean function. Why? Because we can think of a NOT gate as a 1-input NAND gate, and we can implement an AND gate using a NAND gate followed by a 1-input NAND gate. Since we can implement any Boolean function using NOT and AND, we can therefore implement any Boolean function using just NAND gates. A NAND gate is thus known as a universal gate.
Likewise, if you had only NOR gates, you could implement any Boolean function, because we can implement a NOT gate as a 1-input NOR gate, and an OR gate as a NOR followed by a 1-input NOR. Since NOT and OR implement any Boolean function, so can NOR. A NOR gate is thus also known as a universal gate.
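The NAND argument can be made concrete with a small Python sketch, in which every gate is built only from a single nand function. The function names are our own; the constructions follow the reasoning above (NOT as a 1-input NAND, AND as NAND followed by NOT, OR via DeMorgan's Law).

```python
def nand(*inputs):
    """NAND of one or more 0/1 inputs; with a single input it acts as NOT."""
    return 0 if all(inputs) else 1

def not_(a):
    return nand(a)                    # 1-input NAND

def and_(a, b):
    return nand(nand(a, b))           # NAND followed by a 1-input NAND

def or_(a, b):
    return nand(nand(a), nand(b))     # (a'b')' = a + b

for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_(a, b), or_(a, b), not_(a))
```

The printed columns match the AND, OR, and NOT truth tables, so any sum-of-products equation can be evaluated using nand alone.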
Number of Possible Logic Gates
Having seen several different types of basic 2-input logic gates (AND, OR, NAND, NOR, XOR, XNOR), one might wonder how many possible 2-input logic gates exist. That question is the same as asking how many Boolean functions exist for two variables. To answer the question, we first note that a two-variable function's truth table will have 2^2 = 4 rows. For each row, the function could output one of two possible values (0 or 1). Thus, as illustrated in Figure 2.48, there are 2 * 2 * 2 * 2 = 2^4 = 16 possible functions.
Figure 2.48 Counting the number of possible Boolean functions of two variables: each of the 4 truth table rows offers 2 choices for F (0 or 1), giving 2^4 = 16 possible functions.
Figure 2.49 lists all 16 of those functions. We indicate the 6 familiar functions in the figure. Some of the other functions are 0, a, b, a', b', and 1. The remaining functions are not necessarily common functions, but each could be useful for some particular application. Thus, we don't necessarily need to build logic gates to represent those functions, but we instead would build those functions as a circuit of the basic logic gates.
Figure 2.49 The 16 possible Boolean functions of two variables.
A more general question of interest is how many Boolean functions exist for a Boolean function of N variables. We can determine this number by first noting that an N-variable function will have 2^N rows in its truth table. Then, we note that each row can output one of two possible values. Thus, the number of possible functions will be 2 * 2 * ... * 2, with 2^N factors. Therefore, the total number of functions is:
   2^(2^N)
So there are 2^(2^3) = 2^8 = 256 possible Boolean functions of 3 variables, and 2^(2^4) = 2^16 = 65,536 possible functions of 4 variables.
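The counting argument can be confirmed by brute force for small N. The following Python sketch is our own illustration (both function names are hypothetical): it enumerates every possible output column for a truth table of n variables and compares the count against the formula.

```python
from itertools import product

def count_functions(n):
    """Number of Boolean functions of n variables: 2^(2^n)."""
    return 2 ** (2 ** n)

def all_functions(n):
    """Enumerate every function of n variables as a dict: input row -> output."""
    rows = list(product((0, 1), repeat=n))            # the 2^n truth table rows
    return [dict(zip(rows, outputs))
            for outputs in product((0, 1), repeat=len(rows))]

print(len(all_functions(2)))   # 16
print(count_functions(3))      # 256
print(count_functions(4))      # 65536
```

Enumeration is only practical for very small n, since the count grows doubly exponentially; count_functions(5) is already over 4 billion.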
2.9 DECODERS AND MUXES
Two additional components, a decoder and a multiplexer, are also commonly used as digital circuit building blocks, though they themselves can be built from logic gates.
Decoders
A decoder is a higher-level building block commonly used in digital circuits. A decoder decodes an input n-bit binary number by setting exactly one of the decoder's 2^n outputs to 1. For example, a 2-input decoder, illustrated in Figure 2.50, would have 2^2 = 4 outputs: d3, d2, d1, d0. If the two inputs i1i0 are 00, d0 would be 1 and the remaining outputs would be 0. If i1i0=01, d1 would be 1. If i1i0=10, d2 would be 1. If i1i0=11, d3 would be 1.
The internal design of a decoder is straightforward. Consider a 2x4 decoder. Each output d0, d1, d2, and d3 is a distinct function. d0 should be 1 only when i1=0 and i0=0, so d0 = i1'i0'. Likewise, d1 = i1'i0, d2 = i1i0', and d3 = i1i0. Thus, we build the decoder with one AND gate for each output, connecting the true or complemented values of i1 and i0 to each gate, as shown in Figure 2.50.
Figure 2.50 2x4 decoder: (a) outputs for possible input combinations, (b) internal design.
The internal design of a 3x8 decoder is similar: d0 = i2'i1'i0', d1 = i2'i1'i0, etc.
A decoder often comes with an extra input called enable. When enable is 1, the decoder
acts normally. But when enable is 0, the decoder outputs all 0s: no output is a 1. The
enable is useful when sometimes you don't want to activate any of the outputs. Without an
enable, one output of the decoder must be a 1, because the decoder has an output for every
possible value of the decoder's n-bit input. We used a decoder with enable in an earlier
figure. A block diagram of a decoder with enable appears in Figure 2.51.
Figure 2.51 Decoder with enable: (a) e=1: normal decoding, (b) e=0: all outputs 0.
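A minimal sketch of the enable behavior follows. The function name `decoder_with_enable` is hypothetical, and the model works from the numeric value of the inputs rather than from gates.

```python
def decoder_with_enable(e, inputs):
    """Model an n-input decoder with enable input e.

    `inputs` lists the input bits most-significant first.
    Returns [d0, d1, ...]; every output is 0 when e is 0.
    """
    n = len(inputs)
    value = 0
    for bit in inputs:
        value = (value << 1) | bit  # binary number carried by the inputs
    return [int(e == 1 and k == value) for k in range(2 ** n)]

print(decoder_with_enable(1, (1, 0)))  # normal decoding
print(decoder_with_enable(0, (1, 0)))  # disabled: no output is 1
```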
When designing a particular system, we check if part (or all) of the system's functionality
could be carried out by a decoder. Using a decoder reduces the amount of
combinational logic design that we need to perform, as you'll see in Example 2.30.
EXAMPLE 2.29 Basic questions about decoders
1. What would be a 2x4 decoder's output values when the inputs are 00? Answer: d0=1, d1=0,
d2=0, d3=0.
2. What would be a 2x4 decoder's output values when the inputs are 11? Answer: d0=0, d1=0,
d2=0, d3=1.
3. What input values of a 2x4 decoder cause more than one of the decoder's outputs to be 1 at the
same time? Answer: No such input values exist. Only one of a decoder's outputs can be 1 at a
given time.
4. What would the input values of a 2x4 decoder be if the output values are d0=0, d1=1, d2=0, d3=0?
Answer: The input values must be i1=0, i0=1.
5. What would the input values of a 2x4 decoder be if the output values are d0=1, d1=1, d2=0, d3=0?
Answer: This question is not valid. A decoder only has one output equal to 1 at any time.
6. How many outputs would a 5-input decoder have? Answer: $2^5$, or 32.
EXAMPLE 2.30 New Year's Eve countdown display
A New Year's Eve countdown display could make use of a decoder. The display may have 60 light
bulbs going up a tall pole. We want one light per second to turn on (with the previous one turning
off), starting from bulb 59 at the bottom of the pole, and ending with bulb 0 at the top. We could use
a microprocessor to count down from 59 to 0, but the microprocessor probably doesn't have 60
output pins that we could use to control each light. Our microprocessor could instead output
the numbers 59, 58, ..., 2, 1, 0 in binary on a 6-bit output port (thus outputting 111011, 111010,
..., 000010, 000001, 000000). We could connect those six bits to a 6-input, 64 ($2^6$)-output
decoder, with decoder output d59 lighting bulb 59, d58 lighting bulb 58, etc.
We'd probably want an enable on our decoder in this example, since we'd want all the lights
off until we started the countdown. The microprocessor would initially set enable to 0 so that no
lights would be illuminated. When the 60 second countdown begins, the microprocessor would set
enable to 1, and then output 59, then 58 (1 second later), then 57, etc. The final system would look
like that in Figure 2.52.
Figure 2.52 Using a 6x64 decoder to interface a microprocessor and a column
of lights for a New Year's Eve display. The microprocessor sets e=1 when the
last minute countdown begins, and then counts down from 59 to 0 in binary on
the pins i5..i0. Note that the microprocessor should never output 60, 61, 62, or 63
on i5..i0, and thus those outputs of the decoder go unused.
Notice that we implemented this system without having to design any gate-level combinational
logic; we merely used a decoder and connected it to the appropriate inputs and outputs.
Whenever you have outputs such that exactly one of those outputs should be set to 1
based on the value of inputs representing a binary number, think about using a decoder.
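The countdown in Example 2.30 can be mimicked in software to see which decoder output is active at each second. This is a sketch only: `to_bits`, `decoder_6x64`, and `countdown` are names invented here, and the microprocessor is modeled as a plain loop.

```python
def to_bits(value, width=6):
    """Return `value` as a list of bits, most-significant first (i5..i0)."""
    return [(value >> (width - 1 - j)) & 1 for j in range(width)]

def decoder_6x64(e, bits):
    """Model a 6x64 decoder with enable: returns [d0, ..., d63]."""
    value = int("".join(str(b) for b in bits), 2)
    return [int(e == 1 and k == value) for k in range(64)]

def countdown():
    """Return the decoder outputs before the countdown and at each second."""
    frames = [decoder_6x64(0, to_bits(59))]  # enable is 0: all lights off
    for count in range(59, -1, -1):
        frames.append(decoder_6x64(1, to_bits(count)))
    return frames

frames = countdown()
print(to_bits(59))           # bits sent on i5..i0 for the first bulb
print(sum(frames[0]))        # number of lights on before the countdown
print(frames[1].index(1))    # bulb lit during the first second
print(frames[-1].index(1))   # bulb lit during the last second
```

Because the loop never produces 60 through 63, outputs d60 through d63 stay at 0 in every frame, matching the note in the caption of Figure 2.52.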
Multiplexer (Mux)
A multiplexer ("mux" for short) is another higher-level building block in digital circuits.
An Mx1 multiplexer has M data inputs and 1 output, and allows only one input to pass
through to that output. A set of additional inputs, known as select inputs, determines
which input to pass through. Multiplexers are sometimes called selectors because they
select one input to pass through to the output.
A mux is like a railyard switch that connects multiple input tracks to a single output
track, as shown in Figure 2.53. The switch's control lever causes the connection of the
appropriate input track to the output track. Whether a train appears at the output depends
on whether a train exists on the presently selected input track. For a mux, the switch's
control is not a lever, but rather select inputs, which represent the desired connection in
binary. Rather than a train appearing or not appearing at the output, a mux outputs a 1 or
a 0 depending on whether the connected input has a 1 or a 0.
Figure 2.53 A multiplexer is like a railyard switch, determining which input track connects to the
single output track, according to the switch's control lever.
A 2-input multiplexer, known as a 2x1 multiplexer, has two data inputs i1 and i0,
one select input s0, and one data output d, as shown in Figure 2.54. If s0=0, i0's value
passes through. If s0=1, i1's value passes through.
The internal design of a 2x1 multiplexer is shown in Figure 2.54. When s0=0, the
top AND gate outputs 1*i0 = i0, and the bottom AND gate outputs 0*i1 = 0. Thus, the
OR gate outputs i0+0 = i0, so i0 passes through as desired. Likewise, when s0=1,
the bottom gate passes i1 while the top gate outputs 0, resulting in the OR gate
passing i1.
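The two-AND-one-OR structure just described can be written out directly. The function name `mux2x1` and the use of `1 - s0` to model the inverter are choices made for this sketch.

```python
from itertools import product

def mux2x1(i1, i0, s0):
    """Gate-level model of a 2x1 multiplexer."""
    top = i0 & (1 - s0)   # top AND gate: i0 AND s0'
    bottom = i1 & s0      # bottom AND gate: i1 AND s0
    return top | bottom   # OR gate combines the two AND outputs

# Exhaustively list the output for every input combination.
for i1, i0, s0 in product((0, 1), repeat=3):
    print(f"i1={i1} i0={i0} s0={s0} -> d={mux2x1(i1, i0, s0)}")
```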
Figure 2.54 2x1 multiplexer: (a) block symbol, (b) connections for s0=0 and s0=1, and (c)
internal design.
A 4-input multiplexer, known as a 4x1 multiplexer, has four data inputs i3, i2, i1,
and i0, two select inputs s1 and s0, and one data output d (a mux always has just one data
output, no matter how many inputs). A 4x1 mux block diagram is shown in Figure 2.55.
Figure 2.55 4x1 multiplexer: (a) block symbol and (b) internal design.
The internal design of a 4x1 multiplexer is shown in Figure 2.55. When s1s0=00,
the top AND gate outputs i0*1*1 = i0, the next AND gate outputs i1*0*1 = 0, the next
gate outputs i2*1*0 = 0, and the bottom gate outputs i3*0*0 = 0. The OR gate outputs
i0+0+0+0 = i0. Thus, i0 passes through, as desired. Likewise, when s1s0=01, the
second AND gate passes i1, while the remaining AND gates all output 0. When
s1s0=10, the third AND gate passes i2, and the other AND gates output 0. When
s1s0=11, the bottom AND gate passes i3, and the other AND gates output 0. For any
value on s1s0, only 1 AND gate will have two 1s for its select inputs and will thus pass
its data input; each of the other AND gates will have at least one 0 for its select inputs and will
thus output 0.
An 8x1 multiplexer would have 8 data inputs i7 ... i0, 3 select inputs s2, s1, and s0,
and one data output. More generally, an Mx1 multiplexer has M data inputs, $\log_2(M)$
select inputs, and one data output. Remember, a multiplexer always has just one output.
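The general Mx1 case can be sketched as M AND terms feeding one OR, each AND term gated by the select pattern that matches its index. The function name `mux` and the argument conventions (data listed as i0 first, select bits most-significant first) are assumptions of this sketch.

```python
def mux(data, select):
    """Model an Mx1 multiplexer.

    data:   [i0, i1, ..., i(M-1)], where M is a power of two.
    select: select bits, most-significant first, e.g. (s1, s0).
    """
    m = len(data)
    n_sel = m.bit_length() - 1  # log2(M) when M is a power of two
    assert 2 ** n_sel == m and len(select) == n_sel
    out = 0
    for k, d in enumerate(data):
        # Select pattern that routes input k to the output.
        pattern = [(k >> (n_sel - 1 - j)) & 1 for j in range(n_sel)]
        # The AND gate for input k passes d only when every select bit
        # (true or complemented) matches the pattern.
        passes = all(s if want else 1 - s for s, want in zip(select, pattern))
        out |= d & int(passes)  # OR gate accumulates the AND outputs
    return out

# Hypothetical input values for a 4x1 mux: i0=1, i1=1, i2=0, i3=0.
data = [1, 1, 0, 0]
print(mux(data, (0, 1)))  # s1s0=01 passes i1
print(mux(data, (1, 1)))  # s1s0=11 passes i3
```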
EXAMPLE 2.31 Basic questions about multiplexers
Assume a 4x1 multiplexer's four inputs presently have the following values: i0=1, i1=1, i2=0,
and i3=0. What would be the value on the multiplexer's output d for the following select input values?
1. s1s0 = 01. Answer: Because s1s0=01 passes input i1 through to d, d would have the
value of i1, which presently is 1.
2. s1s0 = 11. Answer: That configuration of select line input values passes i3 through, so d
would have the value of i3, which presently is 0.
3. How many select inputs must be present on a 16x1 multiplexer? Answer: Four select inputs
would be needed to uniquely identify which of the 16 inputs to pass through to the output, since
$\log_2(16) = 4$.
4. How many select lines are there on a 4x2 multiplexer? Answer: This question is not valid; there
is no such thing as a 4x2 multiplexer. A multiplexer has exactly one output.
5. How many data inputs are there on a multiplexer having five select inputs? Answer: Five select
inputs can uniquely identify one of $2^5 = 32$ inputs to pass through to the output.
EXAMPLE 2.32 Mayor's vote display using a multiplexer
Consider a small town with a very unpopular mayor. During every town meeting, the city
manager presents four proposals to the mayor, who then indicates his vote on the proposal
(approve or deny). Very consistently, right after the mayor indicates his vote, the town's
citizens boo and shout profanities at the mayor, no matter which way he votes. Having
had enough of this abuse, the mayor sets up a simple digital system (the mayor happens to
have taken a course in digital design), shown in Figure 2.56. He provides himself with four
switches that can be positioned up or down, outputting 1 or 0, respectively. When the time
comes during the meeting for him to vote on the first proposal, he places the first switch
either in the up (accept) or down (deny) position, but nobody else can see the position of the
switch. When the time comes to vote on the second proposal, he votes on the second proposal
by placing the second switch up or down. And so on.
When he has finished casting all his votes, he leaves the meeting and heads out for coffee. With the
mayor gone, the city manager powers up a large green/red light. When the input to the light is 0,
the light lights up red. When the input is 1, the light lights up green. The city manager controls two
switches that can route any of the mayor's switch outputs to the light, and so the manager steps
through each configuration of the switches, starting with configuration 00 (and calling out "The
mayor's vote on this proposal is ..."), then 01, then 10, and finally 11, causing the light to light
either green or red for each configuration depending on the positions of the mayor's switches. The
system can easily be implemented using a 4x1 multiplexer, as shown in Figure 2.56.
Figure 2.56 Mayor's vote display implemented using a 4x1 mux.
N-bit Mx1 Multiplexer
Muxes are often used to selectively pass through not just single bits, but N-bit data items.
For example, one set of inputs A may consist of four bits a3, a2, a1, a0, and another set
of inputs B may also consist of four bits b3, b2, b1, b0. We want to multiplex those
inputs to a four-bit output C, consisting of c3, c2, c1, c0. Figure 2.57(a) shows how to
accomplish such multiplexing using four 2x1 muxes.
Figure 2.57 4-bit 2x1 mux: (a) internal design using four 2x1 muxes for selecting among 4-bit data
items A or B, and (b) block diagram of a 4-bit 2x1 mux component; (c) the block diagram uses a
common simplifying notation, using one thick wire with a slanted line and the number 4 to
represent 4 single wires.
Because muxing data is so common, another common building block is that of an
N-bit-wide Mx1 multiplexer. So in our example, we would use a 4-bit 2x1 mux. Don't get
confused, though: an N-bit Mx1 multiplexer is really just the same as N separate Mx1
multiplexers, with all those muxes sharing the same select inputs. Figure 2.57(b) provides
the symbol for a 4-bit 2x1 mux.
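The point that an N-bit Mx1 mux is just N single-bit muxes sharing one select can be sketched as follows. The name `mux_nbit` is hypothetical, the select is given as a plain integer for brevity, and the bit values for A and B are made up for illustration.

```python
def mux_nbit(items, select_value):
    """Model an N-bit Mx1 mux.

    items:        list of M data items, each a list of N bits.
    select_value: integer 0..M-1 encoded by the shared select inputs.
    """
    width = len(items[0])
    assert all(len(item) == width for item in items)
    assert 0 <= select_value < len(items)
    result = []
    for b in range(width):
        # One single-bit Mx1 mux per bit position; its data inputs are
        # bit b of every item, and all muxes share the same select value.
        column = [item[b] for item in items]
        result.append(column[select_value])
    return result

A = [1, 0, 1, 0]  # a3, a2, a1, a0 (illustrative values)
B = [0, 1, 1, 1]  # b3, b2, b1, b0 (illustrative values)
print(mux_nbit([A, B], 0))  # s0=0 passes A to C
print(mux_nbit([A, B], 1))  # s0=1 passes B to C
```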
EXAMPLE 2.33 Multiplexed automobile above-mirror display
Some cars come with a display above the rearview mirror, as shown in Figure 2.58. The car's
driver can press a button named mode to select among displaying the outside temperature, the
average miles-per-gallon of the car, the instantaneous miles-per-gallon, and the approximate
miles remaining until the car runs out of gasoline. Assume the car's central computer sends
the data to the display as four 8-bit binary numbers, T (the temperature), A (average mpg), I
(instantaneous mpg), and M (miles remaining). T consists of 8 bits: t7, t6, t5, t4, t3, t2,
t1, t0. Likewise for A, I, and M. Assume the display system has two additional inputs x and
y, which always change according to the following sequence (00, 01, 10, 11) whenever the
mode button is pressed (we'll see in a later chapter how to create such a sequence). When
xy=00, we want to display T. When xy=01, we want to display A. When xy=10, we want to
display I, and when xy=11, we want to display M. Assume the outputs D go to a display that
knows how to convert the 8-bit binary number on D to a human-readable displayed number
like that in Figure 2.58.
Figure 2.58 Above-mirror display.
We can design the display using eight 4x1 multiplexers. A simpler representation of
that same design uses an 8-bit 4x1 multiplexer, as shown in Figure 2.59.
Figure 2.59 Above-mirror display using an 8-bit 4x1 mux.
Notice how many wires must be run from the car's central computer, which may be under the
hood, to the above-mirror display: 8 * 4 = 32 wires. That's a lot of wires. We'll see in a later
chapter how to reduce the number of wires.
Notice in the previous example how simple a design can be when we can utilize
higher-level building blocks. If we had to use regular 4x1 muxes, we would have 8 of
them, and lots of wires drawn. If we had to use gates, we would have 40 of them. Of
course, underlying our simple design in Figure 2.59 are in fact eight 4x1 muxes, and
underlying those are 40 gates. And underlying those gates are lots more transistors. We
see that the higher-level building blocks make our design task much more manageable.
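The wire and gate counts quoted above can be checked with a little arithmetic. The text does not break down the figure of 40 gates; the sketch below assumes each 4x1 mux is counted as four AND gates plus one OR gate, with inverters on the select inputs not counted. The helper name `gates_per_mux` is made up for this sketch.

```python
def gates_per_mux(m):
    """Gates in an Mx1 mux built as M AND gates feeding one OR gate.

    Assumption: inverters on the select inputs are not counted.
    """
    return m + 1

bits = 8  # width of each data item T, A, I, M
m = 4     # number of data items being selected among

print(bits * m)                 # data wires from the central computer
print(bits * gates_per_mux(m))  # gates underlying eight 4x1 muxes
```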
2.10 ADDITIONAL CONSIDERATIONS
Schematic Capture and Simulation
When we design a circuit, how do we know that we designed the circuit correctly? Perhaps
we created the truth table wrong, putting a 0 in an output column where we should have put
a 1. Or perhaps we wrote down the wrong minterm, writing xyz when we should have
written xyz'. For example, consider the number-of-ones counter in Example 2.25. We
created a truth table, then equations, and finally a circuit. Is the circuit correct?
One method of checking our work is to reverse engineer the function from the
circuit: starting with the circuit, we could convert the circuit to equations, and then the
equations to a truth table. If we get the same original truth table, then the circuit should
be correct. However, sometimes we start with an equation rather than a truth table, as in
Example 2.24. We can reverse engineer the circuit to an equation, but that equation
may be different than our original equation, especially if we algebraically manipulated the
original equation when designing the circuit. And checking that two equations are equivalent
may require converting to canonical form (sum-of-minterms), which may result in
huge equations if our function has a large number of inputs.
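For small functions, the equivalence check can be done by brute force: generate both truth tables and compare them row by row. This is a sketch under that assumption; `truth_table` and `equivalent` are names invented here, and the two example equations are hypothetical.

```python
from itertools import product

def truth_table(f, n):
    """Return the output column of f for all 2**n input combinations."""
    return [f(*bits) for bits in product((0, 1), repeat=n)]

def equivalent(f, g, n):
    """Two n-input functions are equivalent if their truth tables match."""
    return truth_table(f, n) == truth_table(g, n)

# Hypothetical example: F = ab + ab' simplifies algebraically to F = a.
original = lambda a, b: (a & b) | (a & (1 - b))
simplified = lambda a, b: a

print(truth_table(original, 2))
print(equivalent(original, simplified, 2))
```

The table has $2^n$ rows, so this approach becomes impractical for the same reason the text gives for canonical form: the size grows quickly with the number of inputs.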
In fact, even if we didn't make any mistakes in converting our mental understanding
of the desired function into a truth table or equation, how do we know that our mental
understanding was correct?
A commonly used method for checking that a circuit works as we expect is called
simulation. Simulation of a circuit is the process of providing sample inputs to the circuit
and running a computer program that computes the circuit's output for the given inputs.
We can then check that the output matches what we expect. The computer program that
performs simulation is called a simulator.
Figure 2.60 Display snapshot of a commercial schematic capture tool.
To use simulation to check a circuit, we must describe the circuit using a method that
enables computer programs to read the circuit. One method of describing a circuit is to
draw the circuit using a schematic capture tool. A schematic capture tool allows a user
to place logic gates on a computer screen and to draw wires connecting those gates. The
tool allows users to save their circuit drawings as computer files. All the circuit drawings
in this chapter have represented examples of schematics; for example, the circuit drawing
in Figure 2.50(b), representing a 2x4 decoder, was an example of a schematic. Figure 2.60
shows a schematic for the same design, drawn using a popular commercial schematic
capture tool. Schematic capture is used not only to capture circuits for simulator tools,
but also for tools that map our circuits to physical implementations, which will be
discussed in Chapter 7.

Figure 2.61 Simulation: (a) begins with us defining the input
signals over time, (b) automatically generates the output
waveforms when we ask the simulator to simulate the circuit.
Once we've created a circuit using schematic capture, we must provide the simulator
with a set of inputs for which we want to check for proper output. One way of providing
the inputs is by drawing waveforms for the circuit's inputs. An input's waveform
is a line that goes from left to right, representing the value of the input as time
proceeds to the right. At different times, we draw the line as high to represent 1, and low
to represent 0, as shown in Figure 2.61(a). After we are satisfied with our input waveforms,
we instruct the simulator to simulate our circuit for the given input waveforms.
The simulator determines what the circuit outputs would be for each unique combination
of inputs, and generates waveforms for the outputs, as illustrated in Figure 2.61(b). We
can then check that the output waveforms match the output values that we would expect
for each input. Such checking can be done visually, or by providing certain checking
statements (often called assertions) to the simulator.
Simulation still does not guarantee that our circuit is correct, but rather increases our
confidence that our circuit is correct.
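The idea of feeding input waveforms to a simulator and checking the output waveforms with assertions can be illustrated with a toy model. This is not how a commercial simulator works internally; `simulate` and `decoder_2x4` are names invented for this sketch, and a waveform is simply a list with one value per time step.

```python
def decoder_2x4(i1, i0):
    """Gate-level model of the 2x4 decoder: one AND gate per output."""
    n1, n0 = 1 - i1, 1 - i0  # inverters
    return {"d0": n1 & n0, "d1": n1 & i0, "d2": i1 & n0, "d3": i1 & i0}

def simulate(circuit, waveforms):
    """Apply input waveforms to a circuit and return output waveforms.

    waveforms: dict mapping input name -> list of values, one per time step.
    """
    steps = len(next(iter(waveforms.values())))
    outputs = {}
    for t in range(steps):
        result = circuit(**{name: wave[t] for name, wave in waveforms.items()})
        for name, value in result.items():
            outputs.setdefault(name, []).append(value)
    return outputs

# Input waveforms covering every combination of i1 and i0.
inputs = {"i1": [0, 0, 1, 1], "i0": [0, 1, 0, 1]}
outputs = simulate(decoder_2x4, inputs)
for name in ("d3", "d2", "d1", "d0"):
    print(name, outputs[name])

# Assertion-style check: exactly one output is 1 at every time step.
for t in range(4):
    assert sum(wave[t] for wave in outputs.values()) == 1
```

As the text notes, passing such checks raises confidence but proves correctness only for the input combinations that were actually simulated.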
Nonideal Gate Behavior-Delay
Ideally, logic gate outputs would change immediately in response to changes in
the gate's inputs. The timing diagrams earlier in this chapter all assumed such
ideal zero-delay gates, as shown again in Figure 2.62(a) for an OR gate. Unfortunately,
real gate outputs don't change immediately, but rather after some short
time delay. After all, even the fastest automobiles can't go from 0 to 60 miles-per-hour
in 0 seconds. The delay in gates is due in part to the fact that transistors
don't switch from nonconducting to conducting (or vice versa) immediately; it
takes some time for electrons to accumulate in the channel of an nMOS
transistor, for example. Furthermore, electric current travels at the speed of light, which,
while extremely fast, is still not infinitely fast. Additionally, wires aren't perfect and can
slow down electric current because of "parasitic" characteristics like capacitance and
inductance. The timing diagram in Figure 2.62(b) illustrates how a real gate's output
changes slightly after changes in the inputs. Modern CMOS gates may
take less than 1 nanosecond to respond to changes: extremely fast, but still not zero.
Figure 2.62 OR gate timing diagram: (a) without gate delay, (b) with gate delay.
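The difference between the two timing diagrams can be imitated with discrete time steps: the delayed output is the ideal output shifted later in time. This is a simplified sketch, not a physical model; the function name, the `delay` measured in whole time steps, and the `initial` output value are all assumptions made here.

```python
def or_gate_with_delay(x, y, delay=1, initial=0):
    """Return the OR gate output waveform, lagging the inputs by `delay` steps.

    x, y:    input waveforms (lists with one value per time step).
    initial: assumed output value before the first delayed result arrives.
    """
    assert len(x) == len(y) and 0 <= delay <= len(x)
    ideal = [a | b for a, b in zip(x, y)]  # zero-delay behavior
    return [initial] * delay + ideal[:len(ideal) - delay]

x = [0, 1, 1, 0, 0, 0]
y = [0, 0, 0, 0, 1, 0]
print(or_gate_with_delay(x, y, delay=0))  # ideal gate, as in part (a)
print(or_gate_with_delay(x, y, delay=1))  # delayed gate, as in part (b)
```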
Demultiplexers and Encoders
Two additional components, demultiplexers and encoders, can also be considered as
combinational building blocks. However, those components are far less commonly used
than their counterparts of multiplexers and decoders. Nevertheless, for completeness,
we'll briefly introduce those additional components here. You may notice throughout
this book that demultiplexers and encoders don't appear in many examples, if in any
examples at all.
Demultiplexer
A demultiplexer has roughly the opposite functionality of a multiplexer. Specifically, a
1xM demultiplexer has one data input, and based on the values of its select lines,
passes that input through to one of M outputs. The other outputs stay 0.
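A 1xM demultiplexer can be sketched in a couple of lines. The function name `demux` is hypothetical, and the select lines are represented by the integer they encode rather than by individual bits.

```python
def demux(data, select_value, m):
    """Model a 1xM demultiplexer.

    Routes `data` to output number `select_value`; other outputs stay 0.
    """
    assert 0 <= select_value < m
    return [data if k == select_value else 0 for k in range(m)]

print(demux(1, 2, 4))  # data input 1 routed to output 2
print(demux(0, 2, 4))  # data input 0: every output is 0
```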
Encoder
An encoder has the opposite functionality of a decoder. Specifically, an n x $\log_2(n)$
encoder has n inputs and $\log_2(n)$ outputs. Of the n inputs, exactly one is assumed to be 1
at any given time (such would be the case if the input consisted of a sliding or rotating
switch with n possible positions, for example). The encoder outputs a binary value over
the $\log_2(n)$ outputs, indicating which of the n inputs was a 1. For example, a 4x2 encoder
would have four inputs d3, d2, d1, d0, and two outputs e1, e0. For an input of 0001, the
output is 00. 0010 yields 01, 0100 yields 10, and 1000 yields 11. In other words,
d0=1 results in an output of 0 in binary, d1=1 results in an output of 1 in binary, d2=1
results in an output of 2 in binary, and d3=1 results in an output of 3 in binary.
A priority encoder has similar behavior, but handles situations where more than one
input is 1 at the same time. A priority encoder gives priority to the highest input that is a
1, and outputs the binary value of that input. For example, if a 4x2 priority encoder has
inputs d3 and d1 both equal to 1 (so the inputs are 1010), the priority encoder gives
priority to d3, and hence outputs 11.
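Both encoder variants can be sketched for the 4x2 case. The function names are hypothetical. The OR-based equations for the plain encoder follow from the input/output pairs listed above, and the all-zeros behavior of the priority encoder is an assumption of this sketch, since the text does not specify it.

```python
def encoder_4x2(d3, d2, d1, d0):
    """4x2 encoder; assumes exactly one input is 1."""
    assert d3 + d2 + d1 + d0 == 1
    e1 = d3 | d2  # e1 is 1 for inputs 2 and 3
    e0 = d3 | d1  # e0 is 1 for inputs 1 and 3
    return e1, e0

def priority_encoder_4x2(d3, d2, d1, d0):
    """4x2 priority encoder: the highest-numbered input that is 1 wins."""
    if d3:
        return 1, 1
    if d2:
        return 1, 0
    if d1:
        return 0, 1
    # d0 is 1, or no input is 1 (assumed here to also give 00).
    return 0, 0

print(encoder_4x2(0, 1, 0, 0))           # input 0100
print(priority_encoder_4x2(1, 0, 1, 0))  # input 1010: d3 has priority
```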
2.11 COMBINATIONAL LOGIC OPTIMIZATIONS
AND TRADEOFFS (SEE SECTION 6.2)
The earlier sections in this chapter described how to create basic combinational circuits.
This section, Section 2.11, physically appears in this book as Section 6.2, and describes
how to make those circuits better (smaller, faster, etc.), namely, how to make optimizations
and tradeoffs. One use of this book covers combinational logic optimizations and
tradeoffs immediately after introducing basic combinational logic design, meaning covering
that section now (as Section 2.11). An alternative use of the book covers that section
later (as Section 6.2), after also introducing basic sequential design, datapath components,
and register-transfer level design, namely, after Chapters 3, 4, and 5.
2.12 COMBINATIONAL LOGIC DESCRIPTION USING HARDWARE
DESCRIPTION LANGUAGES (SEE SECTION 9.2)
Hardware description languages (HDLs) allow designers to describe their circuits using a
textual language rather than as circuit drawings. This section, Section 2.12, introduces the
use of HDLs to describe combinational logic. The section physically appears in the book
as Section 9.2. One use of this book introduces HDLs now (as Section 2.12), immediately
after introducing basic combinational logic. An alternative use of the book introduces
HDLs later (as Section 9.2), after mastery of basic combinational, sequential, and
register-transfer level design.
2.13 CHAPTER SUMMARY
Section 2.1 introduced the idea of using a custom digital circuit to implement a system's
desired functionality and defined combinational logic as a digital circuit whose outputs
are a function of the circuit's present inputs. Section 2.2 provided a brief history of digital
switches, starting from relays in the 1930s to today's CMOS transistors, with the main
trend being the amazing pace at which switch size and delay have continued to shrink for
the past several decades, leading to ICs capable of containing a billion transistors or
more. Section 2.3 described the basic behavior of a CMOS transistor, just enough
information to remove the mystery of how transistors work. Section 2.4 introduced three
fundamental building blocks for building digital circuits: AND gates, OR gates, and
NOT gates (inverters), which are far easier to work with than transistors. Section 2.5
showed how Boolean algebra could be used to represent circuits built from AND, OR,
and NOT gates, enabling us to build and manipulate circuits by using math, an
extremely powerful concept. Section 2.6 introduced several different representations of
Boolean functions, namely equations, circuits, and truth tables. Section 2.7 described a
straightforward three-step process for designing combinational circuits, and gave several
examples of building real circuits using the three-step process. Section 2.8 described why
NAND and NOR gates are actually more commonly used than AND and OR gates in
CMOS technology, and showed that any circuit built from AND, OR, and NOT gates
could be built with NAND gates alone or NOR gates alone. That section also introduced
two other commonly used gates, XOR and XNOR. Section 2.9 introduced two additional
commonly used combinational building blocks, decoders and multiplexers. Section 2.10
introduced schematic capture tools, which allow us to draw our circuits such that
computer programs can read those circuits, and also introduced simulation, which generates
the output waveforms for user-provided input waveforms to help us verify that we created
a circuit correctly. That section also discussed how real gates actually have a small delay
between the time that inputs change and the time that the gate's output changes. The
section also introduced some less commonly used combinational building blocks,
demultiplexers and encoders.
2.14 EXERCISES

Any problem noted with an asterisk (*) represents an especially challenging problem.
SECTION 2.2: SWITCHES
2.1 A microprocessor in 1980 used about 10,000 transistors. How many of those microprocessors
would fit in a modern chip having 3 billion transistors?
2.2 The first Pentium microprocessor had about 3 million transistors. How many of those
microprocessors would fit in a modern chip having 1 billion transistors?
2.3 Define Moore's Law.
2.4 Assume for a particular year that a particular size chip using state-of-the-art technology can
contain 1 billion transistors. Assuming Moore's Law holds, how many transistors will the
same size chip be able to contain in ten years?
2.5 Assume a cell phone contains 50 million transistors. How big would such a cell phone be if
the phone used vacuum tubes instead of transistors, assuming a vacuum tube has a volume of
1 cubic inch?
2.6 A modern desktop processor, such as the Pentium 4, has about 300 million transistors. How
big would the processor be if we used vacuum tubes?
SECTION 2.3: THE CMOS TRANSISTOR
2.7 Describe the behavior of the CMOS transistor circuit shown in Figure 2.63, clearly
indicating when the transistor circuit conducts.
2.8 If we apply a voltage to the gate of a CMOS transistor, why doesn't the current flow from
the gate to the transistor's source or drain?
Figure 2.63 Circuit combining two transistors.
SECTION 2.4: BOOLEAN LOGIC GATES - BUILDING BLOCKS FOR DIGITAL CIRCUITS
2.9 Which Boolean operation, AND, OR, or NOT, is appropriate for each of the following:
(a) Detecting motion in any motion sensor surrounding a house (each motion sensor outputs
1 when motion is detected).
(b) Detecting that three buttons are being pressed simultaneously (each button outputs 1 when
a button is being pressed).
(c) Detecting the absence of light from a light sensor (the light sensor outputs 1 when light is
sensed).
2.10 Convert the following English problem statements to Boolean equations:
(a) A flood detector should turn on a pump if water is detected and the system is set to enabled.
(b) A house energy monitor should sound an alarm if it is night and light is detected inside a
room but motion is not detected.
(c) An irrigation system should open the sprinkler's water valve if the system is enabled and
neither rain nor freezing temperatures are detected.
2.11 Evaluate the Boolean equation F = (a AND b) OR c OR d for the given values of variables
a, b, c, and d:
(a) a=1, b=1, c=1, d=0
(b) a=0, b=1, c=1, d=0
(c) a=1, b=1, c=0, d=0
(d) a=1, b=0, c=1, d=1
2.12 Evaluate the Boolean equation F = a AND (b OR c) AND d for the given values of variables
a, b, c, and d:
(a) a=1, b=1, c=0, d=1
(b) a=0, b=0, c=0, d=1
(c) a=1, b=0, c=0, d=0
(d) a=1, b=0, c=1, d=1
2.13 Evaluate the Boolean equation F = a AND (b OR (c AND d)) for the given values of
variables a, b, c, and d:
(a) a=1, b=1, c=0, d=1
(b) a=0, b=0, c=0, d=1
(c) a=1, b=0, c=0, d=0
(d) a=1, b=0, c=1, d=1
2.14 Show the conduction paths and output value of the OR gate transistor circuit in Figure 2.11
when: (a) x = 1 and y = 0, (b) x = 1 and y = 1.
2.15 Show the conduction paths and output value of the AND gate transistor circuit in Figure 2.13
when: (a) x = 1 and y = 0, (b) x = 1 and y = 1.
2.16 Convert each of the following equations directly to gate-level circuits:
(a) F = ab' + bc + c'
(b) F = ab + b'c'd'
(c) F = ((a + b') * (c' + d)) + (c + d + e')
2.17 Convert each of the following equations directly to gate-level circuits:
(a) F = a'b' + b'c
(b) F = ab + bc + cd + de
(c) F = ((ab)' + (c)) + (d + ef)'
2.18 Convert each of the following equations directly to gate-level circuits:
(a) F = abc + a'bc
(b) F = a + bcd' + ae + f'
(c) F = (a + b) + (c' * (d + e + fg))
2.19 We want to design a system that sounds a buzzer inside our home whenever motion outside is
detected at night. Assume we have a motion sensor with output M that indicates whether
motion is detected (M=1 means motion detected) and a light sensor with output L that
indicates whether light is detected (L=1 means light is detected). The buzzer inside the home has a
single input B that when 1 creates a loud warning sound. Using AND, OR, and NOT gates,
create a simple digital circuit to implement the motion-detector-at-night system.
2.20 A DJ ("disc jockey," meaning someone who plays the music at a party) would like a system to
automatically control a strobe light and disco ball in a dance hall depending on whether music
is playing and anyone is dancing. Assume we have a sound sensor with output S that indicates
whether music is playing (S=1 means music is playing) and a motion sensor M that indicates
whether people are dancing (M=1 means people are dancing). The strobe light has an input L
that turns the light on when L is 1, and the disco ball has an input B that turns the ball on
when B is 1. The DJ wants the disco ball to turn on only when music is playing and nobody
is dancing, and the DJ wants the strobe light to turn on only when music is playing and people
are dancing. Using AND, OR, and NOT gates, create a simple digital circuit to activate: (a) the
disco ball, and (b) the strobe light.
2.21 We want to concisely describe the following situation using a Boolean equation. We want to fire a
football coach (by setting F=1) if he is mean (represented by M=1). If he is not mean, but has a
losing season (represented by the Boolean variable L=1), we want to fire him anyway. Write an
equation that translates the situation directly to a Boolean equation for F, without any
simplification.
SECTION 2.5: BOOLEAN ALGEBRA
2.22 For the function F = a + a'b + acd + c':
(a) List all the variables.
(b) List all the literals.
(c) List all the product terms.
2.23 For the function F = a'd' + a'c + b'cd' + cd:
(a) List all the variables.
(b) List all the literals.
(c) List all the product terms.
2.24 Let variables T represent being tall, H being heavy, and F being fast. Let's consider anyone
who is not tall as short, not heavy as light, and not fast as slow. Write a Boolean equation to
represent the following:
(a) You may ride a particular amusement park ride only if you are either tall and light, or
short and heavy.
(b) You may NOT ride an amusement park ride if you are either tall and light, or short and
heavy. Use algebra to simplify the equation to sum-of-products.
(c) You are eligible to play on a particular basketball team if you are tall and fast, or tall and
slow. Simplify this equation.
(d) You are NOT eligible to play on a particular football team if you are short and slow, or if
you are light. Simplify to sum-of-products form.
(e) You are eligible to play on both the basketball and football teams above, based on the
above criteria. Hint: combine the two equations into one equation by ANDing them.
2.25 Let variables S represent a package being small, H being heavy, and E being expensive. Let's
consider a package that is not small as big, not heavy as light, and not expensive as inexpensive.
Write a Boolean equation to represent the following:
(a) You can deliver packages only if the packages are either small and expensive, or big and
inexpensive.
(b) You can NOT deliver a package that is as listed above. Use algebra to simplify the equation
to sum-of-products.
(c) You can load the packages into your truck only if the packages are small and light, small
and heavy, or big and light. Simplify the equation.
(d) You can NOT load the packages described above. Simplify to sum-of-products.
2.26 Use algebraic manipulation to convert the following equation to sum-of-products form:
F = a(b + c)(d') + ac'(b + d)
2.27 Use algebraic manipulation to convert the following equation to sum-of-products form:
F = a'b(c + d') + a(b' + c) + a(b + d)c'
2.28 Use DeMorgan's Law to find the inverse of the following equation: F = abc + a'b.
Reduce to sum-of-products form. Hint: Start with F' = (abc + a'b)'.
2.29 Use DeMorgan's Law to find the inverse of the following equation: F = ac' + abd' +
acd. Reduce to sum-of-products form.
SECTION 2.6: REPRESENTATIONS OF BOOLEAN FUNCTIONS
2.30 Convert the following Boolean equations to a digital circuit:
(a) F(a, b, c) = a'bc + ab
(b) F(a, b, c) = a'b
(c) F(a, b, c) = abc + ab + a + b + c
(d) F(a, b, c) = c'
2.31 Create a Boolean equation representation of the digital circuit in Figure 2.64.
Figure 2.64 Combinational circuit F.
2.32 Create a Boolean equation representation for the digital circuit in Figure 2.65.
Figure 2.65 Combinational circuit G.
2.33 Convert each of the Boolean equations in Exercise 2.30 to a truth table.
2.34 Convert each of the following Boolean equations to a truth table:
(a) F(a, b, c) = a' + bc'
(b) F(a, b, c) = (ab)' + ac' + bc
(c) F(a, b, c) = ab + ac + ab'c' + c'
(d) F(a, b, c, d) = a'bc + d'
2.35 Fill in Table 2.8's columns for the equation: F = ab + b'.
TABLE 2.8 Truth table. [Table entries not recoverable.]
TABLE 2.9 Truth table. [Table entries not recoverable.]
2.36 Convert the function F shown in the truth table in Table 2.9 to an equation. Don't minimize
the equation.
TABLE 2.10 Truth table. [Table entries not recoverable.]
TABLE 2.11 Truth table. [Table entries not recoverable.]
2.37 Use algebraic manipulation to minimize the equation obtained in Exercise 2.36.
2.38 Convert the function F shown in the truth table in Table 2.10 to an equation. Don't minimize
the equation.
2.39 Use algebraic manipulation to minimize the equation obtained in Exercise 2.38.
2.40 Convert the function F shown in the truth table in Table 2.11 to an equation. Don't minimize
the equation.
2.41 Use algebraic manipulation to minimize the equation obtained in Exercise 2.40.
2.42 Create a truth table for the circuit of Figure 2.64.
2.43 Create a truth table for the circuit of Figure 2.65.
2.44 Convert the function F shown in the truth table in Table 2.9 to a digital circuit.
2.45 Convert the function F shown in the truth table in Table 2.10 to a digital circuit.
2.46 Convert the function F shown in the truth table in Table 2.11 to a digital circuit.
2.47 Convert the following Boolean equations to canonical sum-of-minterms form:
(a) F(a, b, c) = a'bc + ab
(b) F(a, b, c) = a'b
(c) F(a, b, c) = abc + ab + a + b + c
(d) F(a, b, c) = c'
2.48 Determine whether the Boolean functions F = (a + b)' * a and G = a + b' are
equivalent, using: (a) algebraic manipulation, and (b) truth tables.
2.49 Determine whether the Boolean functions F = ab' and G = (a' + ab)' are equivalent,
using: (a) algebraic manipulation, and (b) truth tables.
2.50 Determine whether the Boolean function G = a'b'c + ab'c + abc' + abc is equivalent
to the function represented by the circuit in Figure 2.66.
Figure 2.66 Combinational circuit H.
2.51 Determine whether the two circuits in Figure 2.67 are equivalent circuits using: (a) algebraic
manipulation, and (b) truth tables.
Figure 2.67 Combinational circuits F and G.
2.52 Figure 2.68 shows two circuits in which the inputs of the circuits are unlabeled.
(a) Determine whether the two circuits are equivalent. Hint: Try all possible labelings of the
inputs for both circuits.
(b) How many circuit comparisons would you need to perform to determine if two circuits with
10 unlabeled inputs are equivalent?
Figure 2.68 Combinational circuits F and G.
SECTION 2.7: COMBINATIONAL LOGIC DESIGN PROCESS
2.53 A museum has three rooms, each with a motion sensor (m0, m1, and m2) that outputs 1 when
motion is detected. At night, the only person in the museum is one guard who walks
from room to room. Create a circuit that sounds an alarm (by setting an output A to 1) if
motion is ever detected in more than one room at a time (i.e., in two or three rooms), meaning
there must be an intruder or intruders in the museum. Start with a truth table.
2.54 Create a circuit for the museum of Exercise 2.53 that detects whether the guard is properly
patrolling the museum, detected by exactly one motion sensor being 1. (If no motion sensor is
1, the guard must be sitting or sleeping.)
2.55 Consider the museum security alarm function of Exercise 2.53, but for a museum with 10
rooms. A truth table is not a good starting point (too many rows), nor is an equation describing
when the alarm should sound (too many terms). However, the inverse of the alarm function can
be straightforwardly captured as an equation. Design the circuit for the 10-room security system
by designing the inverse of the function, and then just adding an inverter before the circuit's
output.
2.56 A network router connects multiple computers together and allows them to send messages to
each other. If two or more computers send messages simultaneously, they collide and the
messages must be resent. Using the combinational design process of Table 2.5, create a collision
detection circuit for a router that connects 4 computers. The circuit has 4 inputs labeled M0
through M3 that are 1 when the corresponding computer is sending a message and 0
otherwise. The circuit has one output labeled C that is 1 when a collision is detected and 0
otherwise.
2.57 Using the combinational design process of Table 2.5, create a 4-bit prime number detector.
The circuit has four inputs, N3, N2, N1, and N0, that correspond to a 4-bit number (N3 is the
most significant bit) and one output named P that outputs a 1 when the input is a prime
number and 0 otherwise.
2.58 A car has a fuel-level detector that outputs the current fuel level as a 3-bit binary number, with
000 meaning empty and 111 meaning full. Create a circuit that illuminates a "low fuel"
indicator light (by setting an output L to 1) when the fuel level drops below level 3.
2.59 A car has a tire-pressure sensor that outputs the current tire pressure as a 5-bit binary
number. Create a circuit that illuminates a "low tire pressure" indicator light (by setting an
output T to 1) when the tire pressure drops below 16. Hint: You might find it easier to create a
circuit that detects the inverse function. You can then just append an inverter to the output
of that circuit.
SECTION 2.8: MORE GATES
2.60 Show the conduction paths and output value of the NAND gate transistor circuit in Figure
2.44 when: (a) x = 1 and y = 0, (b) x = 1 and y = 1.
2.61 Show the conduction paths and output value of the NOR gate transistor circuit in Figure 2.45
when: (a) x = 1 and y = 0, (b) x = 0 and y = 0.
2.62 Show the conduction paths and output value of the AND gate transistor circuit in Figure 2.46
when: (a) x = 1 and y = 1, (b) x = 0 and y = 1.
2.63 Two people, denoted using variables A and B, want to ride with you on your motorcycle. Write
a Boolean equation that indicates that exactly one of the two people can come (A=1 means A
can come, A=0 means A can't come). Then use XOR to simplify your equation.
2.64 Simplify the following equation by using XOR wherever possible: F = a'b + ab' +
cd' + c'd + ac.
2.65 Use XOR to create a circuit that outputs a 1 when the number of 1s on inputs a, b, c, d is
odd.
2.66 Use XOR or XNOR to create a circuit that detects if all inputs a, b, c, d are 0s.
2.67 Use XOR or XNOR to create a circuit that detects if an even number of the inputs a, b, c, d
are 1s.
2.68 Show that a 4-bit XOR gate is an odd function (meaning the output is 1 only if the number of
input 1s is odd).
SECTION 2.9: DECODERS AND MUXES
2.69 Design a 3x8 decoder using AND, OR, and NOT gates.
2.70 Design a 4x16 decoder using AND, OR, and NOT gates.
2.71 Design a 3x8 decoder with enable using AND, OR, and NOT gates.
2.72 Design an 8x1 multiplexer using AND, OR, and NOT gates.
2.73 Design a 16x1 multiplexer using AND, OR, and NOT gates.
2.74 Design a 4-bit 4x1 multiplexer using 4x1 multiplexers.
2.75 Create a circuit that rings a bell whenever motion is detected from one of two motion sensors.
A switch S determines which sensor to pay attention to: S=0 means ring the bell when there's
motion at motion sensor 1, S=1 means motion sensor 2.
2.76 A home entertainment center has four different audio sources that can be played over the same
set of speakers. Each audio source, named A, B, C, and D, is connected using wires on which
the digitized audio signal is transmitted. The user selects which audio source is to be played
using a rotary switch with four outputs, S0, S1, S2, S3, of which exactly one will be 1 at
any given time. If S0 = '1', the audio source A should be played, if S1 = '1', the audio source
B should be played, and so on. Create a digital circuit with a single multi-bit output O that will
output the user's selected audio source.
SECTION 2.10: ADDITIONAL CONSIDERA TIONS
2.77 Design a 1x4 demultiplexer using AND, OR, and NOT gates.
2.78 Design a 1x8 demultiplexer using AND, OR, and NOT gates.
2.79 Design a 4x2 encoder using AND, OR, and NOT gates.
2.80 Design an 8x3 encoder using AND, OR, and NOT gates. Assume that only one input will be
1 at any given time.
2.81 Design a 4x2 priority encoder using AND, OR, and NOT gates. Assume that every input being
0 is encoded as 00.
DESIGNER PROFILE
Samson enjoyed physics and math in college, and focused his advanced studies on integrated
circuit (IC) design, believing the industry to have a great future. Years later now, he
realizes he was right: "Looking back 20 years in high tech, we have experienced four major
revolutions: the PC revolution, digital revolution, communication revolution, and Internet
revolution, all four enabled by the IC industry. The impact of these revolutions to our daily
life is profound."
He has found his job to be "very challenging, interesting, and exciting. I continually learn
new skills to keep up, and to do my job more efficiently."
One of Samson's key design projects was for digital television, namely, high-definition TV
(HDTV), involving companies like Zenith, Philips, and Intel. In particular, he led the
12-person design team that built Intel's first Liquid Crystal on Silicon (LCoS) chip for
rear-projection HDTV. "Traditional LCoS chips are analog. They apply different analog voltages
on each pixel of the display chip so it can produce an image. But analog LCoS is very sensitive
to noise and temperature variation. We used digital signals to do pulse width modulation on
each pixel." Samson is quite proud of his team's accomplishments: "Our HDTV picture quality
was much better."
Samson also worked on the 200-member design team for Intel's Pentium II processor. That was a
very different experience. "For the smaller team, each person had more responsibility, and
overall efficiency was high. For the large team project, each person worked on a specific part
of the project: the chip was divided into clusters, each cluster into units, and each unit had
a leader. We relied heavily on design flows and methodologies."
Samson has seen the industry's peaks and valleys during the past two decades: "Like any
industry, the IC job market has its ups and downs." He believes the industry survives the low
points in large part due to "innovation." "Brand names sell products, but without innovation,
markets go elsewhere. So we have to be very innovative, creating new products so that we are
always ahead in the global competition."
But "innovation doesn't grow on trees," Samson points out. "There are two kinds of
innovations. The first is invention, which requires a good understanding of the physics behind
technology. For example, to make an analog TV into a digital TV, we must know how human eyes
perceive video images, which parts can be digitized, how digital images can be produced on a
silicon chip, etc. The second kind of innovation reuses existing technology for a new
application. For example, we can reuse advanced space technologies in a new non-space product
serving a bigger market. eBay is another example: it reused Internet technology for on-line
auctions. Innovations lead to new products, and thus new jobs for many years."
Thus, Samson points out that "The industry is counting on new engineers from college to be
innovative, so they can continue to drive the high tech industry forward. When you graduate
from college, it's up to you to make things better."
Sequential Logic Design-Controllers
3.1 INTRODUCTION
The output of a combinational circuit is a function only of the circuit's present inputs.
A combinational circuit has no memory: we cannot store bits into a combinational
circuit and later read out the bits that we saved. Combinational circuits by themselves
are rather limited in their usefulness. Designers instead typically use combinational
circuits as part of larger circuits called sequential circuits, which are circuits that do
have memory. A sequential circuit is a circuit whose outputs depend not only on the circuit's
present inputs, but also on the circuit's present state, which is all the bits stored in the circuit.
The circuit's state in turn depends on the past sequence of values that have appeared at
the circuit's inputs.
An everyday example of a combinational circuit is a doorbell: push the button (the
input) now, and the bell (the output) rings. Push the button again, and the bell rings again.
Push the button tomorrow, or next week, and the bell rings the same each time. A doorbell
has no state, no memory; its output value (whether the bell rings or not) depends
solely on its present input value (whether the button is pressed or not). In contrast, an
example of a sequential circuit is an automatic garage door system: push the button (the
input) now, and the door opens. Push the button again, and this time the door closes. Push
the button tomorrow, and the door opens again. The system's output (whether the door
opens or closes) depends on the state of the system (whether the door is open or
closed), which in turn depends on the sequence of past input values since we turned on
the system.
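The doorbell and garage door contrast can be sketched in a few lines of Python. This is only an illustration and not part of the original text: the names `doorbell` and `GarageDoor` are made up here, and the door is assumed to start closed.

```python
def doorbell(button_pressed: bool) -> bool:
    """Combinational: the output depends only on the present input."""
    return button_pressed


class GarageDoor:
    """Sequential: the output depends on the input and on stored state."""

    def __init__(self) -> None:
        self.is_open = False  # one stored bit (assumed initial state: closed)

    def press_button(self) -> str:
        # The same input (a button press) gives a different result
        # depending on the stored state, which the press also updates.
        self.is_open = not self.is_open
        return "opens" if self.is_open else "closes"


door = GarageDoor()
presses = [door.press_button() for _ in range(3)]
rings = [doorbell(True) for _ in range(3)]
print(presses)
print(rings)
```

Three identical presses ring the bell identically, but move the door differently each time, because only the door carries state from one press to the next.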
Most digital systems with which you are familiar involve sequential circuits that
store bits. A handheld calculator must contain a sequential circuit, because the calculator
must store the numbers you enter in order to operate on those numbers. A digital camera
stores pictures. A traffic light controller stores information indicating which light is
presently green. A circuit that counts down from 59 to 0 stores the present count value, to
know what the next value should be.
In this chapter, we describe sequential circuit building blocks, and the design of
a certain class of sequential circuits known as controllers.
3.2 STORING ONE BIT-FLIP-FLOPS
To build a sequential circuit, we need a building block that enables us to store a bit. By
store a bit, we mean that we can save a bit in the block (say a 1) and later come back to see
what we saved. As an example, suppose we want to build the flight attendant call-button
system in Figure 3.1. An airline passenger can push the Call button to turn on a small blue
light above the passenger's seat, indicating to a flight attendant that the passenger
needs service. The light stays on even after the call button is released. The light can be
turned off by pressing the Cancel button. Since the light has to stay on even after the call
button is released, we need a way to "remember" that the call button was pressed. We can
remember by using a bit storage block, and storing a 1 in the block when the call button
is pressed, and storing a 0 when the cancel button is pressed. We connect the output of
this bit storage block to the blue light. The light illuminates when the block's output is 1.
Figure 3.1 Flight attendant call-button system. Pressing Call turns on the light,
which stays on after Call is released. Pressing Cancel turns off the light.
To introduce the internal design of such a bit storage block, we'll introduce several
increasingly complex circuits able to store a bit: a basic SR latch, a level-sensitive SR
latch, a level-sensitive D latch, and an edge-triggered D flip-flop. The D flip-flop will then
be used to create a block capable of storing multiple bits, known as a register, which will
serve as our primary bit storage block in the rest of the book. Each successive circuit
eliminates some problem of the previous one, leading to the robust D flip-flop and then register.
Be aware that designers rarely use bit storage blocks other than D flip-flops. We
introduce the other blocks primarily to provide the reader with the underlying intuition of
the D flip-flop's design.
Feedback-The Basic Storage Method
The basic method used to store a bit in a digital circuit is feedback. You've surely experienced
feedback in the form of audio feedback, when someone talking into a microphone stood in
front of the speaker, causing a loud continuous humming sound to come out of the speakers
(in turn causing everyone to cover their ears and snicker). The talker generated a sound that
was picked up by the microphone, came out the speakers (amplified), was picked up again by
the microphone, came out the speakers again (amplified even more), etc. That's feedback.
Feedback in audio systems is annoying, but in digital systems is extremely useful.
Intuitively, we know that we need to somehow feed the output of a logic gate back into the
gate itself, so that the stored bit ends up looping around and around, like a dog chasing its
own tail. We might try the circuit in Figure 3.2.
Suppose initially Q is 0 and S is 0. At some point, suppose we set S to 1. That causes Q to
become 1, and that 1 feeds back into the OR gate, causing Q to be 1, etc. So even when S returns
to 0, Q stays 1. Unfortunately, Q stays 1 from then on, and we have no way of resetting Q to 0.
But hopefully you understand the basic idea of feedback now: we did successfully store a 1
using feedback.
Figure 3.2 First (failed) attempt at using feedback to store a bit.
We draw in Figure 3.3 the timing diagram for our attempted feedback circuit from
Figure 3.2. Note that we assume the OR gate has a small input-to-output delay, as was
discussed in Section 2.10. Initially, we assume both OR gate inputs are 0 (Figure 3.3(a)).
Then we set S to 1 (Figure 3.3(b)), which causes Q to become 1 slightly later (Figure
3.3(c)), which in turn causes t to become 1 slightly later (Figure 3.3(d)). Finally, when
we change S back to 0 (Figure 3.3(e)), Q will stay 1 because t is 1. The first curved line
with an arrow indicates that the event of S changing from 0 to 1 causes the event of Q
changing from 0 to 1. The second curved line with an arrow indicates that the event of Q
changing from 0 to 1 in turn causes the event of t changing from 0 to 1. And that 1 then
continues to loop around, forever, with no way of S resetting Q to 0.

Figure 3.3 Tracing the behavior of our first attempt at bit storage (Q stays 1 forever).
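The failed OR-gate attempt can be mimicked with a short Python sketch. It is an illustration under simplifying assumptions: gate delay is ignored (each loop step stands for "after the gate has settled"), and the function name `or_feedback` is hypothetical.

```python
def or_feedback(s_values, q=0):
    """Return Q after each step for the circuit Q = S OR Q."""
    trace = []
    for s in s_values:
        q = s | q  # the OR gate's second input is its own output
        trace.append(q)
    return trace


# S is pulsed to 1 once and then returned to 0.
trace = or_feedback([0, 1, 0, 0])
print(trace)
```

Once S has been 1, every later value in the trace is 1 regardless of S, which is the "no way of resetting Q" problem described above.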
Basic SR Latch
It turns out that the simple circuit in Figure 3.4, called a basic SR latch, implements the
bit storage building block we desire. The circuit consists of just a pair of cross-coupled
NOR gates. Making the circuit's S input equal to 1 causes Q to become 1, while making R
equal to 1 causes Q to become 0. Making both S and R equal to 0 causes whatever
value Q is to keep looping around. In other words, S "sets" the latch to 1, and R
"resets" the latch to 0; hence the letters S (for set) and R (for reset).
Let's see why the basic SR latch works as it does. Recall that a NOR gate
outputs 1 when all the gate's inputs equal 0; if at least one input equals 1,
the NOR gate outputs 0.
Figure 3.5 SR latch when S=0 and R=1.
Suppose that we make S=0 and R=1, as in the SR latch circuit of Figure 3.5, and that
we don't initially know the values of Q and t. Since the bottom gate of the circuit has at
least one input equal to 1 (R), the gate outputs 0. In the timing diagram, R becoming 1
causes Q to become 0. In the circuit, Q's 0 feeds back to the top NOR gate, which will have
both its inputs equal to 0 and its output equal to 1. In the timing diagram, Q becoming 0
causes t to become 1. In the circuit, that 1 feeds back to the bottom NOR gate, which has
at least one input equal to 1 (actually, both inputs equal 1), and so the bottom gate will
continue to output 0. Thus the output Q equals 0, and all values are stable.
Now suppose we make S=0 and R=0, as in Figure 3.6. The bottom gate
still has at least one input equal to 1 (the input coming from the top gate), so the
bottom gate continues to output 0. The top gate continues to have both inputs
equal to 0 and continues to output 1. The output Q will thus still equal 0. Thus
the earlier R=1 stored a 0 into the SR latch, also known as resetting the latch,
and that 0 remains stored even when we return R to 0.
Figure 3.6 SR latch when S=0 and R=0, after R equaled 1.
Figure 3.7 SR latch when S=1 and R=0.
Now let's make S=1 and R=0, as in Figure 3.7. The top gate in the circuit
now has one input equal to 1, so the top gate outputs a 0. The timing diagram
shows the change of S from 0 to 1 causing t to change from 1 to 0. The
top gate's 0 output feeds back to the bottom gate, which now has both inputs
equal to 0 and outputs 1. The timing diagram shows the change of t from 1
to 0 causing Q to change from 0 to 1. The bottom gate's (Q's) 1 output feeds
back to the top gate, which has at least one input equal to 1 (actually, both
inputs equal 1 now), so the top gate continues to output 0. The output Q
therefore equals 1, and all values are stable.
Now let's make S=0 and R=0 again, as in Figure 3.8. The top gate still has at
least one input equal to 1 (the input coming from the bottom gate), so the top
gate continues to output 0. The bottom gate continues to have both inputs equal
to 0 and continues to output 1. The output Q is still equal to 1. Thus, the
earlier S=1 stored a 1 into the SR latch, also known as setting the
latch, and that 1 remains stored even when we return S to 0.
Figure 3.8 SR latch when S=0 and R=0, after S equaled 1.
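The reset, hold, set, hold sequence traced above can be reproduced with a small Python sketch of two cross-coupled NOR gates. This is an illustrative model, not the book's own material: the helper names `nor` and `settle` are hypothetical, and evaluating the top gate and then the bottom gate repeatedly until nothing changes is a modeling shortcut rather than real gate timing.

```python
def nor(a: int, b: int) -> int:
    """2-input NOR: outputs 1 only when both inputs are 0."""
    return 0 if (a or b) else 1


def settle(s: int, r: int, q: int, t: int, max_passes: int = 10):
    """Re-evaluate both NOR gates until their outputs stop changing.

    t is the top gate's output, NOR(S, Q); q is the bottom gate's
    output, NOR(R, t). Returns the stable pair (q, t).
    """
    for _ in range(max_passes):
        new_t = nor(s, q)
        new_q = nor(r, new_t)
        if (new_q, new_t) == (q, t):
            return q, t
        q, t = new_q, new_t
    raise RuntimeError("latch did not settle")


q, t = 1, 0  # arbitrary starting values, standing in for "unknown"
q_trace = []
for s, r in [(0, 1), (0, 0), (1, 0), (0, 0)]:  # reset, hold, set, hold
    q, t = settle(s, r, q, t)
    q_trace.append(q)
print(q_trace)
```

The sketch deliberately avoids the input combination S=1 and R=1; in every other case the loop reaches a stable pair in which t is the opposite of Q.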
The basic SR latch can be used to implement the flight attendant call-button system
(Figure 3.9). We connect the call button to S, the cancel button to R, and the light to
Q. Pressing the call button sets Q to 1, thus turning on the light. Q stays 1 even when
the call button is released. Pressing the cancel button resets Q to 0, thus turning off the
light. Q stays 0 even when the cancel button is released.
Figure 3.9 Flight attendant call-button system using a basic SR latch.
Level-Sensitive SR Latch
A problem with the basic SR latch is that S and R both equaling 1 at the same time causes
undefined behavior: we might have stored a 1, we might have stored a 0, or we might
even cause the latch output to oscillate from 1 to 0 to 1 to 0, and so on. Let's see why.
If S=1 and R=1, both gates have at least one input equal to 1, and thus both gates
output 0, as shown in Figure 3.10(a). A problem occurs when we return S and R to 0.
Suppose S and R return to 0 at exactly the same time. Then both gates will have all 0s at
their inputs, so their outputs will change from 0s to 1s, as shown in Figure 3.10(b). Those
1s feed back to the gate inputs, causing the gates to output 0s, as shown in Figure 3.10(c).
Those 0s feed back to the gate inputs again, causing the gates to output 1s. And so on.
Going from 1 to 0 to 1 to 0 and so on is called oscillation. Oscillation is not a desirable
feature of a bit storage block.
Figure 3.10 The situation of S=1 and R=1 causes problems: Q oscillates when SR returns to 00.
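The oscillation can be seen in a Python sketch that updates both NOR gates at the same instant, which corresponds to the idealized case of exactly equal gate delays. The names `nor` and `step` are hypothetical, and one loop iteration stands for one gate delay.

```python
def nor(a: int, b: int) -> int:
    """2-input NOR: outputs 1 only when both inputs are 0."""
    return 0 if (a or b) else 1


def step(s: int, r: int, q: int, t: int):
    """One gate delay with both gates switching simultaneously.

    Both new outputs are computed from the old outputs, modeling
    perfectly matched delays. Returns the new pair (q, t).
    """
    return nor(r, t), nor(s, q)


q, t = 0, 0  # both gates output 0 while S=1 and R=1
history = []
for _ in range(6):
    q, t = step(0, 0, q, t)  # S and R have both returned to 0
    history.append((q, t))
print(history)
```

With matched delays the pair of outputs flips between (1, 1) and (0, 0) indefinitely, so Q never settles.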
In a real circuit, the delays of the upper and lower gates and wires would be
different from one another. So after a time of oscillation, one of the gates gets ahead
of the other (outputting a 1 before the other does, then a 0 before the other one does,
etc.), until it gets far enough ahead to cause the circuit to enter a situation of either
Q=0 or Q=1. Which case will happen, we don't know. Such a situation, in which the final
value of a memory circuit depends on the delays of gates and wires, is known as a race
condition. Figure 3.11 shows a race condition involving oscillation but ending with a
stable situation of Q=1. But we didn't know which value Q would eventually settle into
(it could have settled into Q=0), so the fact that Q=1 is not useful to us in our use of
the bit storage block.
Figure 3.11 Q eventually settles to either 0 or 1, due to race condition.
In our flight attendant call-button system, if the passenger pushes both buttons at the
same time, the result could be that the blue light starts oscillating, and then the light either
ends up on or off.
In summary, S and R should never both equal 1 in an SR latch.
In practice, we would never actually connect buttons directly to an SR latch's inputs
(we did that just for the purpose of an intuitive example). So we can safely assume the S
and R inputs come from a digital circuit. Thus, we can design that digital circuit such that
S and R should never both equal 1. But even if we try to design that circuit such that S
and R should never both be 1, we could still find that S and R inadvertently both become
1 at the same time. For example, consider the simple circuit in Figure 3.12. In theory, S
and R can't both be 1: if X=1, then S=1 but R=0. If X=0, R may equal 1 but S=0. So S
and R can't both be 1, in theory.
In reality, S and R could both be 1 for a short time in this circuit, because of
the delay of real gates, as introduced in Figure 2.62. Suppose X has been 0 and Y has
been 1 for a long time, so S=0 and R=1. Then suppose we change X to 1. S will change
to 1 almost immediately, but R will stay 1 for a short while as the new value of X
propagates through the inverter and the AND gate, after which R changes to 0. If each
component has a delay of 1 ns (nanosecond), then S and R would actually both be 1 for
2 ns (Figure 3.13). Temporary values on signals caused by gate delays are referred to as
glitches.
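The 2 ns overlap can be checked with a small discrete-time Python sketch. It assumes, based on the description above rather than on the figure itself, that S is wired directly to X and that R = Y AND (NOT X), with the inverter and the AND gate each taking 1 ns. The function name `simulate` and the 1 ns time step are choices made for this illustration.

```python
def simulate(x_wave, y=1):
    """Return the (S, R) pair for each nanosecond of the X waveform.

    x_wave[i] is the value of X during nanosecond i. Each gate output
    reflects what its inputs were 1 ns earlier.
    """
    prev_x = x_wave[0]
    inv_out = 1 - prev_x   # assume the circuit starts out settled
    r = y & inv_out
    samples = []
    for x in x_wave:
        # Both gate outputs update from the previous nanosecond's values.
        r, inv_out = y & inv_out, 1 - prev_x
        s = x              # S follows X with negligible delay
        samples.append((s, r))
        prev_x = x
    return samples


x_wave = [0, 0, 1, 1, 1, 1]   # X rises at the start of nanosecond 2
result = simulate(x_wave)
overlap_ns = sum(1 for s, r in result if s == 1 and r == 1)
print(result)
print(overlap_ns)
```

In this model S and R are both 1 for two consecutive samples after X rises, matching the 2 ns glitch described in the text.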
Figure 3.12 Conceptually, S and R can't both be 1 in this sample circuit. But in reality,
they can, due to the delay of the inverter and AND gate.
Figure 3.13 Gate delays can cause SR = 11.
A partial solution to this problem is to add an enable input C to the SR latch, as
shown in Figure 3.14. When C=1, the S and R signals propagate through the two
AND gates to the S1 and R1 inputs of the basic SR latch circuit, because S*1=S and
R*1=R. However, when C=0, the two AND gates cause S1 and R1 to be 0, regardless
of the values of S and R. Thus, when C=0, the basic latch's value cannot change. (You
might note that a difference in the top and bottom AND gate delays could result in S1
and R1 both being 1 for a very short time equal to that difference, but that time is too
short to cause a problem.)
Figure 3.14 Level-sensitive SR latch: an SR latch with enable input C.
The introduction of the enable input leads to the idea of setting the enable to 1 only
when we are sure that S and R have stable values. Figure 3.15 shows the inverter/AND
circuit from Figure 3.12, this time using an SR latch with an enable input. If we change
X, we should wait for at least 2 ns before setting the enable input C to 1 in order to ensure
that the SR inputs to the latch are stable and are not equal to 11.
Figure 3.15 Level-sensitive SR latch: an SR latch with enable input C.
An SR latch with an enable is known as a level-sensitive SR latch, because the latch
is only sensitive to its S and R inputs when the level of the enable input is 1. Such a latch
is also called a transparent latch, because setting the enable input to 1 makes the internal
SR latch transparent to the S and R inputs.
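The effect of the enable input can be summarized in a behavioral Python sketch. It models only the stored value, not gate delays, and the function name `level_sensitive_sr` is hypothetical; raising an error for S1 = R1 = 1 is a choice made here to flag the disallowed case.

```python
def level_sensitive_sr(s: int, r: int, c: int, q: int) -> int:
    """Return the next stored value of a level-sensitive SR latch.

    The enable C gates S and R through AND gates, producing the
    internal signals S1 and R1 that drive the basic SR latch.
    """
    s1 = s & c
    r1 = r & c
    if s1 and r1:
        raise ValueError("S1 = R1 = 1 is not allowed in an SR latch")
    if s1:
        return 1   # set
    if r1:
        return 0   # reset
    return q       # hold the stored bit


q = 0
q = level_sensitive_sr(1, 1, 0, q)  # C=0: a glitch on S and R is ignored
q = level_sensitive_sr(1, 0, 1, q)  # C=1: the latch is set
q = level_sensitive_sr(0, 1, 0, q)  # C=0: R is ignored, the 1 is held
print(q)
```

With C=0 the stored bit holds no matter what S and R do, which is why waiting for S and R to stabilize before raising C avoids the SR = 11 glitch.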
You may have noticed that the top NOR gate of an SR latch outputs the opposite value as
the bottom gate, which is connected to the output Q. Thus, we can include an output Q' on
an SR latch almost for free, just by connecting the top gate to that output. Most latches do
in fact come with both Q and Q' outputs. The symbol for a level-sensitive SR latch with
such dual outputs is shown in Figure 3.16.
Figure 3.16 Symbol for dual-output level-sensitive SR latch.
Clocks and Synchronous Circuits
The level-sensitive SR latch uses an enable signal C that we must set to 1 after we are
sure S and R are stable. But how do we decide when to set the enable C to 1? Most
sequential circuits simply use an enable signal that pulses at a constant rate. For example,
we could make the enable signal go high for 10 ns, then low for 10 ns, then high for
10 ns, then low for 10 ns, etc., as in Figure 3.17.
Figure 3.17 An example of a clock signal named clk. Circuit inputs should only change while
clk = 0, such that latch inputs will be stable when clk = 1.
signal that is low for 10 ns, high for I ns, low for 10 ns. hIgh for I ns. etc. . .
Such a pul sing enable signal is called a clock signal. because the Ignal li cks (hIgh,
low, hi gh. low) like a clock. A ci rcuit whose storage elements (Ill thIS case. latc.hes) can
only change when a clock signal is active is known as a synchronous sequenttal CirCUli, or
j ust synchronous circllit (the sequential aspect is impli ed-there's no such thlllg as a
synchronous combinational circuit). A sequent ial circuit that does not use a clock is
caHed an asynchronous circll it. We leave the important but challengi ng topic of asyn-
chronous circui t design for a more advanced di gital design textbook. The majori ty of
sequential circuits designed and used today are synchronous.
Desi gners typicall y use an a ci ll ator to generate a clock ignal. An oscillator is a
circuit that outputs a signal that aitemates between I and 0 at a constant freq uency, like
that in Figure 3. 17. An osci ll ator component typicall y has no inputs (other than power),
and has an output representing the clock signal.
HOW DOES IT WORK? QUARTZ OSCILLATORS
Conceptually, an oscillator can be thought of as an inverter feeding back to itself. If C
is initially 1, the value will feed back through the inverter and so C will become 0, which
feeds back through the inverter causing C to become 1 again, and so on. The oscillation
frequency would depend on the delay of the inverter. Real oscillators must regulate the
oscillation frequency more precisely. A common type of oscillator uses quartz, a mineral
consisting of silicon dioxide in crystal form. Quartz happens to be such that it vibrates
if we apply an electric current, and that vibration is at a precise frequency determined by
the quartz size and shape. Furthermore, when quartz vibrates, it generates a voltage. So, by
making quartz a specific size and shape and then applying a current, we get a precise
electronic oscillator. We attach the oscillator to an IC's clock signal input. Some ICs come
with a built-in oscillator.
Freq.       Period
100 GHz     0.01 ns
10 GHz      0.1 ns
1 GHz       1 ns
100 MHz     10 ns
10 MHz      100 ns
A clock signal's period is the time after which the signal repeats itself, or more
simply, the time between successive 1s. The signal in Figure 3.17 has a period of 20 ns.
A clock cycle refers to one such segment of time, meaning one segment where the clock
is 1 and then 0. Figure 3.17 shows three and a half clock cycles. A clock signal's
frequency is the number of cycles per second, and is computed as 1/(the clock period). The
signal in Figure 3.17 has a frequency of 1/20 ns = 50 MHz. The units of frequency are
Hertz, or Hz, where 1 Hz = 1 cycle per second. MHz is short for Megahertz, meaning one
million Hz.
A convenient way to mentally convert common computer clock periods to frequencies,
and vice versa, is to remember that a 1 ns period equals a 1 GHz (Gigahertz,
meaning 1 billion Hz) frequency. Then, if one is slower (or faster) by a factor of 10, the
other is slower (or faster) by a factor of 10 also. So a 10 ns period equals 100 MHz,
while a 0.1 ns period equals 10 GHz.
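The period-to-frequency relationship can be checked with a few lines of code. The following is only a small sketch; the function names are made up for illustration and are not part of the text.

```python
def period_to_frequency_hz(period_ns: float) -> float:
    """Frequency = 1 / period, with the period given in nanoseconds."""
    return 1.0 / (period_ns * 1e-9)


def frequency_to_period_ns(frequency_hz: float) -> float:
    """Period = 1 / frequency, reported in nanoseconds."""
    return 1e9 / frequency_hz


# A 20 ns period (as in Figure 3.17) and the table's rule-of-thumb values.
for period in (0.1, 1, 10, 20, 100):
    mhz = period_to_frequency_hz(period) / 1e6
    print(f"{period} ns period -> {mhz:g} MHz")
```

Each factor-of-10 change in the period shows up as the opposite factor-of-10 change in the frequency, which is the mental shortcut described above.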
D Flip-Flop
While the SR latch is useful for introducing the notion of storing a bit in a digital circuit,
most circuits actually use slightly more advanced devices, namely, D latches and D
flip-flops, to store bits.
Level-Sensitive D Latch: A Basic Bit Store
The SR latch has the annoying problem of entering an undefined state if the S and R
inputs are both 1 when the clock is high. Ensuring that we design circuits that don't
set S and R to both 1 imposes a burden on the designer. One way to relieve designers
of this burden is to instead use a new type of latch, called a D latch, shown in Figure
3.18.
A D latch stores whatever value is present at the latch's D input when C = 1,
and holds that value when C = 0. Internally, the latch's D input connects to S directly,
and to R through an inverter. Figure 3.19 provides a timing diagram of the D latch
for sample input values on D and C. When D is 1 and C is 1, the latch is set to 1,
because S is 1 and R is 0. When D is 0 and C is 1, the latch is reset to 0, because R is 1
and S is 0. By making R the opposite of S, we are assured that S and R won't both be
1 at the same time, as long as we only change S and R when C is 0.
Figure 3.18 D latch internals.
Figure 3.19 D latch timing diagram.
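The D latch's behavior can be sketched in software. The model below is a zero-delay behavioral sketch, not a gate-level simulation; the class and method names are hypothetical. It mirrors the internal structure described above: D drives S directly and drives R through an inverter, and both only take effect while the enable C is 1.

```python
class DLatch:
    """Zero-delay behavioral model of a level-sensitive D latch."""

    def __init__(self) -> None:
        self.q = 0  # stored bit (assumed to start at 0)

    def update(self, d: int, c: int) -> int:
        s = d & c          # D connects to S directly (gated by enable C)
        r = (1 - d) & c    # D connects to R through an inverter
        # s and r can never both be 1 here, because r uses the inverse of d.
        if s:
            self.q = 1     # set
        elif r:
            self.q = 0     # reset
        return self.q      # when c == 0, neither branch runs: the bit is held


latch = DLatch()
print(latch.update(d=1, c=0))  # enable low: D is ignored
print(latch.update(d=1, c=1))  # enable high: latch stores D
print(latch.update(d=0, c=0))  # enable low again: stored bit is held
```

Because the model ignores gate delays, it cannot show the brief SR=11 glitch that a real D latch may see when D changes while C is 1.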
The symbol for a D latch with dual outputs (Q and Q') is shown in Figure 3.20.
Figure 3.20 D latch symbol.
Edge-Triggered D Flip-Flop: A Robust Bit Store

The D latch still has a potentially nasty problem that can cause unpredictable circuit
behavior: signals can propagate from a latch output to another latch's input
while the clock signal is 1. For example, consider the circuit in Figure 3.21. When
Clk = 1, the value on Y will be loaded into the first latch and appear at that latch's output.
If Clk still equals 1, then that value will also get loaded into the second latch. The value
will keep propagating through the latches until Clk returns to 0. Through how many
latches will the value propagate? It's hard to say. We would have to know the precise
timing delay information of each latch.
Figure 3.21 A problem with latches: through how many latches will Y propagate for each pulse
of Clk_A? For Clk_B?
Figure 3.22 illustrates this propagation problem in more detail. Suppose D1 is
initially 0 for a long time, changes to 1 long enough to be stable, and then Clk becomes 1.
Q1 will thus change from 0 to 1 after about three gate delays, and thus D2 will also
change from 0 to 1, as shown in the left timing diagram. If Clk is still 1, then that new
value for D2 will propagate through the AND gates of the second latch, causing S2 to
change from 0 to 1 and R2 from 1 to 0, thus changing Q2 from 0 to 1, as shown in the
left timing diagram. Also note in the left timing diagram that changing D2 while C2 = 1
causes S2 and R2 to both equal 1 for a short time, due to the extra delay on the path to
R2 caused by the inverter, though the time that both are 1 is probably too short to cause a
problem.
You might suggest making the clock signal such that the clock is 1 only for a short
amount of time, so there's not enough time for the new output of a latch to propagate to
the next latch's inputs. But how short is short enough? 50 ns? 10 ns? 1 ns? 0.1 ns? And if
we make the clock's time at 1 too short, that time may not be long enough for the bit at a
latch's D input to stabilize in the latch's feedback circuit, and we might therefore not
successfully store the bit, as illustrated in Figure 3.22(c).
Figure 3.22 A problem with level-sensitive latches: (a) while C = 1, Q1's new value may propagate
to D2, (b) such propagation can cause S2 and R2 to both be 1 for a short time while the latch's
enable is 1 (but SR = 11 is never supposed to occur), or can cause an unknown number of latches
along a chain to get updated, (c) trying to shorten the clock's high time to avoid propagation to the
next latch, but long enough to allow a latch to reach a stable feedback situation, is hard, because
making the clock's high time too short prevents proper loading of the latch.
A good solution is to design a more robust block for storing a bit: a block that stores
the bit at the D input at the instant that the clock rises from 0 to 1. Note that we didn't
say that the block stores the bit instantly. Rather, the bit that will eventually get stored
into the block is the bit that was stable at D at the instant that the clock rose from 0 to 1.
Such a block is called an edge-triggered D flip-flop. The word "edge" refers to the vertical
part of the line representing the clock signal, when the signal transitions from 0 to 1.
Figure 3.23 shows three cycles of a clock signal, and indicates the three rising
clock edges of those cycles.
Figure 3.23 Rising clock edges.
Edge-Triggered D Flip-Flop Using a Master-Servant Design. One way to design an
edge-triggered D flip-flop is to use two D latches, as shown in Figure 3.24.
The first D latch, known as the master, is enabled (can store new values on Qm) when
Clk is 0 (due to the inverter), while the second D latch, known as the servant, is enabled
when Clk is 1. Thus, while Clk is 0, the bit on D is stored into the master latch, and
hence Qm and Ds are updated, but the servant latch does not store this new bit because
the servant latch is not enabled since Clk is not 1. When Clk becomes 1, the master
latch becomes disabled (retains its stored value), thus holding whatever bit was at the D
input just before the clock changed from 0 to 1. Also, when Clk is 1, the servant latch
becomes enabled, thus storing the bit that the master is storing, which is the bit that was
at the D input just before Clk changed from 0 to 1, hence implementing an edge-triggered
storage block.
(Note: The common name is actually "master-slave." Some choose instead to use the term
"servant" due to some people finding the term "slave" offensive. Others use the terms
"primary-secondary.")
Figure 3.24 A D flip-flop implementing an edge-triggered bit storage block, internally using two D
latches in a master-servant arrangement. The master D latch stores its Dm input while Clk = 0, but
the new value appearing at Qm and hence at Ds does not get stored into the servant latch, because
the servant latch is disabled when Clk = 0. When Clk becomes 1, the servant D latch becomes
enabled and thus gets loaded with whatever value was in the master latch just before Clk changed
from 0 to 1.
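The master-servant arrangement can be sketched in software as two latch models sharing one clock, with the master seeing the inverted clock. This is a zero-delay behavioral sketch under assumed names (DLatch, DFlipFlop, update), not a timing-accurate model.

```python
class DLatch:
    """Level-sensitive D latch: follows D while enabled, holds otherwise."""

    def __init__(self) -> None:
        self.q = 0

    def update(self, d: int, c: int) -> int:
        if c == 1:
            self.q = d
        return self.q


class DFlipFlop:
    """Rising-edge-triggered D flip-flop built from two D latches."""

    def __init__(self) -> None:
        self.master = DLatch()
        self.servant = DLatch()

    def update(self, d: int, clk: int) -> int:
        qm = self.master.update(d, 1 - clk)   # master enabled while clk == 0
        return self.servant.update(qm, clk)   # servant enabled while clk == 1


ff = DFlipFlop()
print(ff.update(d=1, clk=0))  # master captures 1; servant still holds 0
print(ff.update(d=1, clk=1))  # rising edge: servant loads the master's 1
print(ff.update(d=0, clk=1))  # D changes while clk is 1: output unaffected
print(ff.update(d=0, clk=0))  # master captures 0; servant holds 1
print(ff.update(d=0, clk=1))  # next rising edge: servant loads 0
```

The third call is the interesting one: a level-sensitive latch would have passed the new D value through, while the flip-flop keeps the bit that was present just before the clock rose.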
The edge-triggered block using two internal latches thus prevents the stored bit from
propagating through more than one latch when the clock is 1. Consider the chain of
flip-flops in Figure 3.25, which is similar to the chain in Figure 3.21 but with D
flip-flops in place of D latches. We know that Y will propagate through exactly one
flip-flop on each clock cycle.
Figure 3.25 Using D flip-flops, we now know through how many flip-flops Y will propagate for
Clk_A and for Clk_B: one flip-flop exactly per pulse, for either clock signal.
The drawback of a master-servant approach is that we now need two D latches to
store one bit. So Figure 3.25 shows four flip-flops, but there are two latches inside each
flip-flop, for a total of eight latches.
There are many alternative methods other than the master-servant method for
designing an edge-triggered flip-flop. In fact, there are hundreds of different designs for
latches and flip-flops beyond the designs we showed above, with those designs differing
in terms of their size, speed, power, etc. When we use an edge-triggered flip-flop, we
usually don't worry about whether the flip-flop achieves edge-triggering using the
master-servant method or using some other method. We need only know that the flip-flop is
edge-triggered, meaning the data value present when the clock edge is rising is the value that
gets loaded into the flip-flop, and that appears at the flip-flop's output some time later.
We've actually been describing what's known as positive or rising edge-triggered
flip-flops, which are triggered by the clock signal going from 0 to 1. There are also flip-flops
known as negative or falling edge-triggered flip-flops, which are triggered by the clock signal
going from 1 to 0. We can build a negative edge-triggered D flip-flop using a master-servant
design where the second latch's clock input is inverted, rather than the first latch's.
Positive edge-triggered flip-flops are drawn using a small triangle at the clock input, and
negative edge-triggered flip-flops are drawn using a small triangle along with an inversion
bubble, as shown in Figure 3.26.
Figure 3.26 Positive (shown on the left) and negative (right) edge-triggered D flip-flops. The
sideways triangle input represents an edge-triggered clock input.
Bear in mind that although our master-servant design doesn't change the output until the
falling clock edge, the flip-flop is still positive edge-triggered, because the flip-flop stores
the value that was at the D input at the instant that the clock edge was rising.
Latches versus Flip-Flops: Various textbooks define the terms latch and flip-flop differently.
We'll use what seems to be the most common convention among designers, namely:
  A latch is level-sensitive, and
  A flip-flop is edge-triggered.
So saying "edge-triggered flip-flop" is redundant, since flip-flops are by definition
edge-triggered. Likewise, saying "level-sensitive latch" is redundant, since latches are by
definition level-sensitive.
Figure 3.27 uses an example timing diagram to illustrate the difference between
level-sensitive and edge-triggered bit storage blocks. The figure provides an example of a
clock signal and a value on a signal D. The next signal trace is for the Q output of a D
latch, which as we know is level-sensitive. The latch ignores the first pulse on D (labeled
as 3 in the figure) because Clk is low. However, when Clk becomes high (1), the latch
output follows the D input, so when D changes from 0 to 1 (4), so does the latch
output (7). The latch ignores the next changes on D when Clk is low (5), but then
follows D again when Clk is high (6, 8).
Figure 3.27 Latch versus flip-flop timing.
Compare this with the next signal trace, showing the behavior of a rising-edge-triggered
D flip-flop. The flip-flop samples D at the first rising clock edge (1), finding D to be
0. The flip-flop thus stores and outputs a 0 (9). The flip-flop samples D at the next rising
clock edge (2), finding D to be 1, and thus stores and outputs a 1 (10). Notice that the
flip-flop ignores all changes to D that occur between the rising clock edges (3, 4, 5, 6), even
ignoring changes on D when the clock is high (4, 6).
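The difference between the two traces can be reproduced on a sampled waveform. The sketch below uses made-up clock and D samples (they are not the values of Figure 3.27) and hypothetical function names. Each list entry is one time step, and the flip-flop model stores the D value seen one step before a rising edge, standing in for "the value that was stable at the edge."

```python
def d_latch_trace(clk, d, q0=0):
    """Level-sensitive: Q follows D whenever clk is 1, otherwise holds."""
    q, out = q0, []
    for c, x in zip(clk, d):
        if c == 1:
            q = x
        out.append(q)
    return out


def d_flip_flop_trace(clk, d, q0=0):
    """Edge-triggered: Q loads the D value present just before a 0-to-1 edge."""
    q, out = q0, []
    prev_c, prev_d = clk[0], d[0]
    for c, x in zip(clk, d):
        if prev_c == 0 and c == 1:   # rising clock edge
            q = prev_d
        out.append(q)
        prev_c, prev_d = c, x
    return out


# Illustrative samples: a D pulse while clk is low, then D changes while clk is high.
clk = [0, 0, 0, 1, 1, 0, 0, 1, 1, 0]
d   = [0, 1, 0, 0, 1, 1, 1, 1, 0, 0]

print("latch    :", d_latch_trace(clk, d))
print("flip-flop:", d_flip_flop_trace(clk, d))
```

The latch output changes in the middle of both clock-high intervals, while the flip-flop output changes only once, right at the second rising edge.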
EXAMPLE 3.1 Flight attendant call-button using a D flip-flop

Let's design our flight attendant call-button system using a D flip-flop. If Call is pressed,
we want to store a 1. If Cancel is pressed, we want to store a 0. If neither is pressed,
we want to store whatever is presently stored, meaning Q. We thus need a simple
combinational circuit in front of the D input, described by the truth table in Table 3.1.
If Call=0 and Cancel=0 (the first two rows), D equals Q's value. If Call=0 and Cancel=1
(the next two rows), D=0. If Call=1 and Cancel=0 (the next two rows), D=1. And if
both Call=1 and Cancel=1 (the last two rows), we'll give priority to the Call button, so D=1.

TABLE 3.1 D truth table for call-button system.
Call  Cancel  Q  |  D
 0      0     0  |  0
 0      0     1  |  1
 0      1     0  |  0
 0      1     1  |  0
 1      0     0  |  1
 1      0     1  |  1
 1      1     0  |  1
 1      1     1  |  1

After algebraic simplification, we obtain the following equation for D:
D = Cancel'Q + Call
The final system is shown in Figure 3.28.
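One way to gain confidence in the simplified equation is to compare it against the stated behavior for all eight input combinations. The following is a small sketch with hypothetical function names; d_spec encodes the prose rules (Call has priority, then Cancel, otherwise keep Q) and d_equation encodes D = Cancel'Q + Call.

```python
from itertools import product


def d_spec(call: int, cancel: int, q: int) -> int:
    """D as described in words: Call wins, then Cancel, else keep Q."""
    if call:
        return 1
    if cancel:
        return 0
    return q


def d_equation(call: int, cancel: int, q: int) -> int:
    """D = Cancel'Q + Call."""
    return ((1 - cancel) & q) | call


print("Call Cancel Q | D")
for call, cancel, q in product((0, 1), repeat=3):
    d = d_equation(call, cancel, q)
    assert d == d_spec(call, cancel, q)  # equation agrees with the description
    print(f"  {call}     {cancel}    {q} | {d}")
```

The rows are generated in the same order as the truth table, so the printed D column can be compared against it directly.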
The D flip-flop-based design uses more gates than the SR latch-based design in
Figure 3.9 (which could have just as easily used an SR flip-flop). One reason for the
extra gates is that a D flip-flop always stores its D input on every clock cycle, so we
must explicitly feed Q back into D to maintain the same value. In contrast, we could just
set S and R to 0 to maintain the same value with an SR flip-flop. Furthermore, we
must convert the button presses to the appropriate D input value, requiring extra logic,
rather than just setting either S or R to 1. In the late 1970s and early 1980s, those extra
gates were a big deal, because ICs came with just a few gates on them, so extra gates often
meant extra ICs, meaning extra size, cost, power, etc. But today, in the era of million-gate
ICs, the savings of an SR flip-flop are trivial. In modern design, nearly all designs use
D flip-flops, not SR flip-flops.
Figure 3.28 Flight attendant call-button system: (a) block diagram, and (b) implemented using a
D flip-flop.
As a point of information, designers commonly refer to flip-flops simply as flops.
We went through several intermediate designs before arriving at our robust D
flip-flop design for our desired bit storage block. Figure 3.29 summarizes those designs,
including their features and their problems, leading to the robust edge-triggered D
flip-flop. In looking over the summary, notice that the D flip-flop relies on an internal
SR latch to maintain a stored bit between clock cycles, and relies on the designer to
introduce feedback outside the D flip-flop to maintain a stored bit across clock
cycles.
Figure 3.29 Increasingly better bit storage blocks, leading to the D flip-flop.
  SR latch. Feature: S=1 sets Q to 1, R=1 resets Q to 0. Problem: SR=11 yields
  undefined Q.
  Level-sensitive SR latch. Feature: S and R only have effect when C=1. We can design the
  outside circuit so SR=11 never happens when C=1. Problem: avoiding SR=11 can be a burden.
  D latch. Feature: SR can't be 11 if D is stable before and while C=1, and will be 11
  for only a brief glitch even if D changes while C=1. Problem: C=1 too long propagates new
  values through too many latches; too short may not enable a store.
  D flip-flop. Feature: Only loads D value present at rising clock edge, so values can't
  propagate to other flip-flops during same clock cycle. Tradeoff: uses more gates internally
  than D latch, and requires more external gates than SR, but gate count is less of an
  issue today.
Basic Register-Storing Multiple Bits
A register is a sequential component that can store multiple bits. We can build a basic
register simply by using multiple flip-flops, as shown in Figure 3.30. That register can
hold 4 bits. When the clock rises, all 4 flip-flops get loaded with inputs I0, I1, I2, and
I3 simultaneously.
Figure 3.30 A basic 4-bit register internal design (left) and block symbol (right).
This register, made simply from multiple flip-flops, is the most basic form of a
register, so basic that some companies refer to such a register simply as a "4-bit D
flip-flop." We'll describe more advanced registers, namely, registers with more features and
operations, in Chapter 4.
EXAMPLE 3.2 Temperature history display using registers
We want to design a system that records the outside temperature every hour and displays the last
three recorded temperatures, so that an observer can see the temperature trend. An architecture of
the system is shown in Figure 3.31.
A timer generates a pulse on signal C every hour. A temperature sensor outputs the
temperature as a 5-bit binary number ranging from 0 to 31, corresponding to those temperatures in
Celsius. Three displays convert their 5-bit binary inputs into a numerical display.
Figure 3.31 Temperature history display system.
(In practice, we would actually avoid connecting the timer output C to a clock input, instead
only connecting an oscillator output to a clock input.)
We can implement the TemperatureHistoryStorage component using three 5-bit registers, as
shown in Figure 3.32. Each pulse on signal C loads Ra with the present temperature on inputs
x4..x0 (by loading the 5 flip-flops inside Ra with the 5 input bits). At the same time that register
Ra gets loaded with that present temperature, register Rb gets loaded with the value that was in Ra.
Likewise, Rc gets loaded with Rb's value. All three loads happen at the same time, namely, on the
rising edge of C. The effect is that the values that were in Ra and Rb just before the clock edge get
shifted into Rb and Rc, respectively.
Figure 3.32 Internal design of the TemperatureHistoryStorage component.
Figure 3.33 shows sample values in the registers for several clock cycles, assuming all the
registers initially held 0s, and assuming that as time proceeds the inputs x4..x0 have the values
shown at the top of the timing diagram.
Figure 3.33 Example of values in the TemperatureHistoryStorage registers. One particular data
item, 18, is shown moving through the registers on each clock cycle.
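The simultaneous loading of Ra, Rb, and Rc can be sketched in a few lines. The class below is an illustrative model with assumed names; the temperature readings are made up for the sketch (only the value 18 appears in the figure caption). The key detail is the single tuple assignment, which reads all old register values before writing any new ones, just as all three registers load on the same rising edge of C.

```python
class TemperatureHistoryStorage:
    """Three 5-bit registers that shift on each rising edge of C."""

    def __init__(self) -> None:
        self.ra = self.rb = self.rc = 0  # assume all registers start at 0

    def rising_edge(self, x: int):
        x &= 0b11111  # inputs x4..x0 are only 5 bits wide (0 to 31)
        # Right-hand side is evaluated first, so Rb gets the OLD Ra value
        # and Rc gets the OLD Rb value, like registers sharing one clock edge.
        self.ra, self.rb, self.rc = x, self.ra, self.rb
        return self.ra, self.rb, self.rc


storage = TemperatureHistoryStorage()
for reading in (18, 21, 24, 25):  # hypothetical hourly readings
    print(reading, "->", storage.rising_edge(reading))
```

Writing the three updates as separate sequential statements (Ra first, then Rb, then Rc) would instead copy the new reading into all three registers at once, which is the software analogue of the latch propagation problem described earlier.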
This example demonstrates one of the great things about synchronous circuits built from
edge-triggered flip-flops: many things happen at once, yet we need not be concerned about signals
propagating too fast through a register to another register. The reason we need not be concerned is
because registers only get loaded on the rising clock edge, which effectively is an infinitely small
period of time, so by the time signals propagate through a register to a second register, it's too
late: that second register is no longer paying attention to its data inputs.
We should mention that, in practice, designers typically try to avoid connecting any
signal other than an oscillator output to the clock input of a flip-flop or register. So in
practice, we might try to avoid connecting the signal C to the registers' clock inputs, since
C comes from a timer output, not an oscillator. We'll show in Chapter 4, Example 4.3,
how to design a similar system using an oscillator for the clock.
3.3 FINITE-STATE MACHINES (FSMS) AND CONTROLLERS
Registers store bits in a digital circuit. Stored bits mean the circuit has memory, also
known as state, resulting in what are known as sequential circuits. While a register
storing bits happens to result in a circuit with state, we can actually use state to design
circuits that have a specific behavior over time. For example, we can specifically
design a circuit that outputs a 1 for exactly three cycles whenever a button is pressed.
Or we could design a circuit that blinks lights in a specific pattern. Or we could design
a circuit that detects if three buttons get pushed in a particular sequence and that then
unlocks a door. In all these cases, we would be making use of state to create specific
time-ordered behavior for our circuit. A sequential circuit that controls Boolean
outputs based on Boolean inputs and a specific time-ordered behavior is often referred
to as a controller.
EXAMPLE 3.3 Three-cycles-high laser timer: a poorly done first design
Consider the design of a part of a laser surgery system, such as a system for scar
removal or corrective vision. Such systems work by turning on a laser for a precise
amount of time (see "How does it work? Laser surgery"). A general architecture of such
a system is shown in Figure 3.34.
Figure 3.34 Laser timer system.
A surgeon activates the laser by pressing the button. Assume the laser should then stay on
for exactly 30 ns. Assuming our clock's period is 10 ns, 30 ns means 3 clock cycles. (Assume
that b is synchronized with the clock and stays high for only 1 clock cycle.) We need to design
a controller component that, once detecting that b=1, holds x high for exactly 3 clock cycles,
thus turning on the laser for exactly 30 ns.
This is one example for which a software solution may not work. Using just regular
programming statements reading input ports and writing output ports, we may not have a way to
hold an output port high for exactly 30 ns, for example, when the microprocessor clock frequency
is not fast enough, or when each statement takes 2 cycles to execute.
(Note: The previous example illustrated the need for a way of describing the desired behavior
of a sequential circuit.)
Let's try to create a sequential circuit implementation for the system. After thinking about
the problem for a while, we might come up with the (not so good) implementation in Figure 3.35.
Figure 3.35 First (bad) attempt to implement the laser surgery system.
Knowing we need to hold the output high for three clock cycles, we used three flip-flops, with
the idea being that we'll shift a 1 through those three flip-flops, taking three clock cycles for
the bit to move through all three flip-flops. We ORed the flip-flop outputs to generate signal x,
so that if any flip-flop contains a 1, the laser will be on. We made b the input to the first
flip-flop, so when b=1, the first flip-flop stores a 1 on the next clock cycle. One clock cycle
later, the second flip-flop will get loaded with 1, and assuming b has now returned to 0, the
first flip-flop will get loaded with 0. One clock cycle later, the third flip-flop will get loaded
with 1, and the second flip-flop with 0. One clock cycle later, the third flip-flop will get
loaded with 0. Thus, the circuit held the output x at 1 for three clock cycles after the button
was pressed.
We did not do a very good job implementing this system. First of all, what happens if
the surgeon presses the button a second time before the three cycles are completed? Such a
situation could cause the laser to stay on too long. Is there a simple way to fix our circuit to
account for that behavior? Second, we didn't use any orderly method for designing the
circuit. We came up with the ORing of flip-flop outputs, but how did we come up with
that? Will that method work for all time-ordered behavior that we might want to design?
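The first weakness can be made concrete with a short simulation of the three-flip-flop circuit. This is a behavioral sketch with a hypothetical function name: each list entry is the value of b sampled at one rising clock edge, and x is the OR of the three flip-flop outputs after that edge.

```python
def bad_laser_timer(b_per_edge):
    """Shift b through three flip-flops and OR their outputs to form x."""
    q1 = q2 = q3 = 0
    x_trace = []
    for b in b_per_edge:
        # All three flip-flops load on the same rising edge, so each one
        # takes the OLD value of its predecessor.
        q1, q2, q3 = b, q1, q2
        x_trace.append(q1 | q2 | q3)
    return x_trace


# One press: x is high for exactly three cycles.
print(bad_laser_timer([1, 0, 0, 0, 0, 0]))

# A second press two cycles later: x stays high for five cycles.
print(bad_laser_timer([1, 0, 1, 0, 0, 0, 0]))
```

In the second run a new 1 enters the chain before the first 1 has left it, so the OR gate never sees all three outputs at 0 until five cycles have passed, and the laser stays on too long.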
We need two things to do a better job at designing circuits having time-ordered
behavior. First, we need a way to explicitly represent the desired time-ordered behavior;
we'll introduce the finite-state machine representation for this purpose. Second, we need
an orderly method for implementing such behavior as a sequential circuit; we'll
introduce such a standard method.
HOW DOES IT WORK? LASER SURGERY
Laser surgery has become very popular in the past decade, and has been enabled due to digital
systems. Lasers, invented in the early 1960s, generate an intense narrow beam of coherent light
with photons having a single wavelength and being in phase (like being in rhythm) with one
another. In contrast, a regular light's photons fly out in all directions, with a diversity of
wavelengths. Think of a laser as a platoon of soldiers marching in synch, while a regular light is
more like kids running out of school at the end-of-the-day bell. A laser's light can be so intense
as to even cut steel. The ability of a digital circuit to carefully control the location,
intensity, and duration of the laser is what makes lasers so useful for surgery.
One popular use of lasers for surgery is for scar removal. The laser is focused on the damaged
cells slightly below the surface, causing those cells to be vaporized. The laser can also be used
to vaporize skin cells that form bumps on the skin, due to scars or moles. Similarly, lasers can
reduce wrinkles by smoothing the skin around the wrinkle to make the crevices more gradual and
hence less obvious, or by stimulating tissue under the skin to stimulate new collagen growth.
Another popular use of lasers for surgery is for correcting vision. In one popular laser eye
surgery method, the surgeon cuts open a flap on the surface of the cornea, and the laser then
reshapes the cornea by thinning the cornea in a particular pattern, with such thinning
accomplished through vaporizing cells.
A digital system controls the laser's location, energy, and duration, based on programmed
information of the procedure. The availability of lasers, combined with low-cost high-speed
digital circuits, makes such precise and useful surgery now possible.
Finite-State Machines (FSMs)
In the previous chapter, you saw that we could design a combinational circuit by first
describing the desired circuit behavior using a mathematical formalism known as a
Boolean equation, and then converting the equation to a circuit. For a sequential circuit, a
Boolean equation alone is not sufficient to describe behavior; we need a more powerful
mathematical formalism that incorporates time.
Finite-state machines (FSMs) are just such a method. The name is a bit awkward,
but the concept is straightforward. An FSM consists of several things, the
most important of which is a set of states representing every possible state, or
mode, of a system.

I like to refer to my daughter's hamster as an intuitive example. After having a hamster as
a family pet, I've learned that hamsters basically have four states: Sleeping, Eating,
RunningOnTheWheel, and TryingToEscape. They spend most of their day sleeping (being
nocturnal), a bit of time eating or running on the wheel, and the rest of their time
desperately trying to escape from their cage.
As a more electronics-oriented example, let's design a system that repeatedly sets
an output x to 0 for one clock cycle and to 1 for one clock cycle. The system clearly
has only two states, which we'll call Off and On. In state Off, x = 0; in state On, x = 1.
We can show those states, and the transitions between them, using the state diagram in
Figure 3.36.
Figure 3.36 A simple state diagram (left) and the timing diagram describing the state diagram's
behavior (right). Above the timing diagram, we see the FSM going from one state to the other in
each clock cycle. "clk^" represents the rising edge of the clock signal.
Assume we start in state Off. The diagram shows that x is set to 0 while the system
is in state Off. The diagram also shows that on the next rising edge of the clock signal,
clk^, the system transitions to state On, and the diagram shows that x is set to 1 in that
state. On the next rising edge of the clock, the diagram shows that the system
transitions to state Off again. A timing diagram showing the system's behavior is shown in
Figure 3.36.
Recall in Example 3.3 that we wanted a system that held its output high for three
cycles. Toward that end, let's extend the simple state diagram of Figure 3.36 to have one
off state and three on states, as shown in Figure 3.37. The output will be 0 for one cycle,
and then 1 for three cycles, as shown in the timing diagram of the figure.
Figure 3.37 Three-cycles-high system: state diagram (left), timing diagram (right).
We can introduce input conditions on the transitions to further extend the behavior.
We extend the state diagram in Figure 3.38 by changing the condition on the transition
from state Off to state On1, such that the new condition requires not just a rising clock,
but also that b=1. We also add a transition from Off back to Off, with the condition of a
rising clock and b=0. The timing diagram in the figure shows the state and output
behavior for the given input values on b.
Figure 3.38 Three-cycles-high system: state diagram (left), timing diagram (right).
From the above examples, we can see that a finite-state machine, or FSM, is a
mathematical formalism consisting of several things:
• A set of states. Our example had four states: {On1, On2, On3, Off}.
• A set of inputs, and a set of outputs. Our example had one input: {b}, and one
output: {x}.
• An initial state, namely, a state to start in when we power up the system. An
FSM's initial state can be indicated graphically by a single directed edge, with no
source state, that points to the initial state. An FSM can only have one initial state.
Our example's initial state was Off.
• A description of the next state to go to based on the current state and the values of
the inputs. Our example used directed edges with associated input conditions to
tell us the next state. Those edges with conditions are known as transitions.
• A description of what output values to generate in each state. Our example assigns
a value to x in every state. Assigning an output in an FSM is known as an action.
We used a graphical representation of an FSM, known as a state diagram, to show
the FSM for our example. We could have represented the FSM textually instead, but state
diagrams are very popular for visualizing an FSM's behavior.
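As a textual counterpart to the state diagram, the five components listed above can be written down directly as data. The following is only a sketch: the class name FSM, its field names, and the run helper are hypothetical names chosen for illustration, and the example models the three-cycles-high system of Figure 3.38 with one loop iteration standing for one rising clock edge.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class FSM:
    states: Set[str]                      # set of states
    inputs: Set[str]                      # set of input names
    outputs: Set[str]                     # set of output names
    initial: str                          # the single initial state
    next_state: Callable[[str, Dict[str, int]], str]   # transitions
    actions: Dict[str, Dict[str, int]]    # output values assigned in each state

    def run(self, input_values):
        """Return the (state, outputs) pair seen in each clock cycle.

        Each entry of input_values holds the inputs sampled at the rising
        clock edge that ends that cycle.
        """
        state = self.initial
        trace = []
        for values in input_values:
            trace.append((state, self.actions[state]))
            state = self.next_state(state, values)
        return trace


def laser_next(state, values):
    # Off waits for b=1; the On states advance on every rising clock edge.
    if state == "Off":
        return "On1" if values["b"] == 1 else "Off"
    return {"On1": "On2", "On2": "On3", "On3": "Off"}[state]


laser_timer = FSM(
    states={"Off", "On1", "On2", "On3"},
    inputs={"b"},
    outputs={"x"},
    initial="Off",
    next_state=laser_next,
    actions={"Off": {"x": 0}, "On1": {"x": 1}, "On2": {"x": 1}, "On3": {"x": 1}},
)

# b is pressed once while Off and once more while the laser is already on.
b_values = [0, 1, 0, 1, 0, 0, 0]
trace = laser_timer.run([{"b": b} for b in b_values])
states_seen = [state for state, _ in trace]
x_seen = [outputs["x"] for _, outputs in trace]
print(states_seen)
print(x_seen)
```

Keeping the transitions as a function and the actions as a table mirrors the split between "next state" and "output" descriptions in the list above; other representations (for example, a transition table) would serve equally well.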
EXAMPLE 3.4 FSM for the three-cycles-high laser timer
We can create an FSM to describe the earlier introduced laser timer system. The system might have
four states: Off, On1, On2, and On3. In the Off state, the laser should be off (x=0). The On1 state
would be the first clock cycle the laser is on (x=1), On2 the second cycle, and On3 the third cycle.
The state diagram of the FSM is in fact identical to that shown in Figure 3.38.
Here's how the FSM should be interpreted. We start in our initial state Off. We stay in state Off
until one of its two outgoing transitions has a true condition. One of those transitions has the
condition of b' AND a rising clock (b'*clk^); in that case, we transition right back to state Off. The
other of those transitions has the condition of b AND a rising clock (b*clk^); in that case, we
transition to state On1. We stay in state On1 until its one outgoing transition's condition, a rising
clock, becomes true, in which case we transition to state On2. Likewise, we stay in On2 until the
next rising clock, transitioning to On3. We stay in On3 until the next rising clock, causing a
transition back to state Off. In state Off, we have associated the action of setting x=0, while in
states On1, On2, and On3, we have associated the action of setting x=1.
Thus, we have precisely described the desired time-ordered behavior of the laser timer system
using an FSM.
It's interesting to examine the behavior of this FSM if the button is pressed a second time
while the laser is on. Notice that the transitions among the On states are independent of the value of
b. So this system will always turn the laser on for exactly three cycles and then return to the Off
state to await another press of the button.
Simplifying FSM Notation: Making the Rising Clock Implicit
Thus far, we have included the rising clock edge (clk^) as part of the condition of every FSM
transition. We included that edge because we are only considering the design of sequential
circuits that are synchronous and that use rising edge-triggered flip-flops to store bits.
Synchronous sequential circuits with edge-triggered flip-flops make up the vast majority of
sequential circuits in modern design practice. As such, most textbooks and designers, to make
their state diagrams more readable, follow the convention that every transition in an FSM is
implicitly ANDed with a rising clock edge. For example, a transition labeled "a'" actually
means "a'*clk^." Henceforth, we will not include the rising clock edge when drawing FSM
transitions, and we will follow the convention that every transition is implicitly ANDed with a
rising clock edge. Figure 3.39 illustrates the laser timer state diagram from Figure 3.38,
redrawn using an implicit clock. A transition with no associated condition simply transitions
on the next clock cycle, because of the implicit rising clock edge.

Figure 3.39 Laser timer state diagram assuming every transition is ANDed with a rising clock.

"STATE" I UNDERSTAND, BUT WHY THE TERMS "FINITE" AND "MACHINE"?
Finite-state machines, or FSMs, have a rather awkward name that sometimes causes
confusion. The term "finite" is there to contrast FSMs with a similar representation used in
mathematics that can have an infinite number of states; that representation is not very useful
in digital design. FSMs, in contrast, have a limited, or finite, number of states. The term
"machine" is used in its mathematical or computer science sense, being a conceptual object
that can execute an abstract language; specifically, that sense of machine is not hardware.
Finite-state machines are also known as finite-state automata. FSMs are used for many things
other than just digital design.

Let's consider a few more examples showing how to describe time-ordered behavior
using FSMs.
EXAMPLE 3.5 Secure car key
Have you noticed that the keys for new automobiles have a thicker plastic head than in the
past (see Figure 3.40)? Believe it or not, the reason is that there is a computer chip inside the
head of the key, implementing a secure car key. In a basic version of such a secure car key, when
the driver turns the key in the ignition, the car's computer (which is under the hood and
communicates using what's known as a basestation) sends out a radio signal asking the car key's
chip to respond by sending an identifier via a radio signal. The chip in the key then responds by
sending the identifier (ID) using what's known as a transponder (a transponder "transmits" in
"response" to a request). If the basestation does not receive a response, or the key's response has
an ID different than the ID programmed into the car's computer, the computer shuts down and the
car won't start.
Figure 3.40 Why are the heads of keys getting thicker? Note that the key on the right is thicker than
the key on the left. The key on the right has a computer chip inside that sends an identifier to the
car's computer, thus helping to reduce car thefts.
Let's design the controller for such a key having an ID of 1011 (real IDs are typically 32 bits
long or more, not just 4 bits). Assume the controller has an input a that is 1 when the car's
computer requests the key's ID. Thus the controller initially waits for the input a to become 1.
The key should then send its ID (1011) serially, starting with the rightmost bit, on an output r:
the key sends 1 on the first clock cycle, 1 on the second cycle, 0 on the third cycle, and finally 1
on the fourth cycle. The FSM for the controller is shown in Figure 3.41. Note that the FSM sends
the bits starting from the bit on the right, which is known as the least significant bit (LSB).
Figure 3.41 Secure car key FSM. Inputs: a; Outputs: r.
Figure 3.42 provides a timing diagram for the FSM for a particular situation. When we set
a=1, the FSM enters state K1 and outputs r=1. The FSM then proceeds through K2, K3, and K4,
outputting r=1, 0, and 1, respectively, even though we returned input a to 0.
Figure 3.42 Secure car key timing diagram.
Timing diagrams represent a particular situation defined by how we set the inputs. What
would have happened if we had held a=1 for many more clock cycles? The timing diagram in
Figure 3.43 illustrates that situation. Notice how the FSM, after returning to state Wait, proceeds
to state K1 again on the next cycle.
Figure 3.43 Secure car key timing diagram for a different sequence of values on input a.
The computer chip in the car key has circuitry that converts radio signals to bits and vice versa.
"So my car key may someday need its batteries replaced?" you might ask. Actually, no: those
chips in keys draw their power as well as their clock from the magnetic component of the radio-
frequency field generated from the computer basestation. The extremely low power requirement
makes custom digital circuitry, rather than software on a microprocessor, the preferred
implementation method.
Computer chip keys make stealing cars a lot harder: no more "hot-wiring" to start a car, since
the car's computer won't work unless it also receives the correct identifier. And the method above is
actually an overly simplistic method; many cars have more sophisticated communication between
the computer and the key, involving several communications in both directions, even using
encrypted communication, making fooling the car's computer even harder. A drawback of secure
car keys is that you can't just run down to the local hardware store and copy those keys for $5 any
longer; copying keys requires special tools that today can run $50-$100. A common problem while
computer chip keys were becoming popular was that low-cost locksmiths didn't realize the keys had
chips in them, so copies were made and the car owners went home and later couldn't figure out why
their car wouldn't start, even though the key fit in the ignition slot and turned.
EXAMPLE 3.6 Code detector
You've probably seen doors in airports or hospitals that require a person to press a particular
sequence of buttons (i.e., a code) to unlock the door. For example, there might be three buttons,
colored red, green, and blue, and a fourth button for starting the code. Pressing the start button,
then the following button sequence (red, blue, green, red) unlocks the door, while any other
sequence does not unlock the door. Such a system may have the general architecture shown
in Figure 3.44. An extra output from the buttons component, a, is 1 whenever any button is
pressed.
Figure 3.44 Code detector architecture.
We can describe the behavior of the CodeDetector block using an FSM, captured as the state
diagram shown in Figure 3.45.
For simplicity, assume that the buttons each have a special circuit that synchronizes the button
press with the clock signal, and creates a pulse exactly one clock cycle wide for each unique press
of the button. This is necessary to ensure that the current state doesn't inadvertently change to
another state if a button press lasts longer than a single clock cycle. (We'll design such a
synchronization circuit in Example 3.9.)
Figure 3.45 Code detector FSM. Inputs: s, r, g, b, a; Outputs: u.
The behavior of the FSM is as follows:
• The FSM begins in the Wait state. As long as the start button is not pressed (s'), the FSM
stays in Wait; when the start button is pressed (s), the FSM goes to the Start state.
• Being in the Start state means the FSM is now ready to detect the sequence red, blue, green,
red. If no button is pressed (a'), the FSM stays in Start. If a button is pressed AND that button
is the red button (ar), the FSM goes to state Red1. If a button is pressed AND that button
is not the red button (ar'), the FSM returns to the Wait state. Note that when in the Wait
state, further presses of the colored buttons would be ignored, until the start button is pressed
again.
• The FSM stays in state Red1 as long as no button is pressed (a'). If a button is pressed and
that button is blue (ab), the FSM goes to state Blue; if that button is not blue (ab'), the
FSM returns to state Wait.
• Likewise, the FSM stays in state Blue as long as no button is pressed (a'), and goes to state
Green on condition ag, and state Wait on condition ag'.
• Finally, the FSM stays in Green if no button is pressed, and goes to state Red2 on condition
ar, and to state Wait on condition ar'.
• If the FSM makes it to state Red2, that means that the user pressed the buttons in the correct
sequence; thus, Red2 sets u=1, unlocking the door. Note that all other states set u=0.
The FSM then returns to state Wait.
Recall that every transition's condition is implicitly ANDed with a rising clock edge.
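The transition rules just listed can be exercised in software before any circuit exists. The sketch below is an illustration only: the helper names press, next_state, and door_unlocks are hypothetical, each list entry stands for one clock cycle, and a is computed as the OR of all four buttons, which is an assumption about how the buttons component derives it.

```python
# For each "expecting" state: (button that must be pressed, state to go to).
EXPECTED = {
    "Start": ("r", "Red1"),
    "Red1": ("b", "Blue"),
    "Blue": ("g", "Green"),
    "Green": ("r", "Red2"),
}


def press(*buttons):
    """Build one cycle of inputs s, r, g, b, a from the buttons pressed."""
    inp = {name: int(name in buttons) for name in "srgb"}
    inp["a"] = int(any(inp.values()))   # assumed: a = 1 if any button is pressed
    return inp


def next_state(state, inp):
    if state == "Wait":
        return "Start" if inp["s"] else "Wait"
    if state == "Red2":
        return "Wait"                    # unconditional return after unlocking
    button, target = EXPECTED[state]
    if not inp["a"]:
        return state                     # a': stay and keep waiting
    return target if inp[button] else "Wait"


def door_unlocks(cycles):
    """Return True if the FSM ever reaches Red2 (the only state with u=1)."""
    state = "Wait"
    unlocked = False
    for inp in cycles:
        state = next_state(state, inp)
        unlocked = unlocked or state == "Red2"
    return unlocked


correct = [press("s"), press("r"), press(), press("b"), press("g"), press("r")]
wrong = [press("s"), press("r"), press("g"), press("b"), press("r")]
print(door_unlocks(correct), door_unlocks(wrong))
```

The idle cycle (press() with no buttons) in the correct sequence shows the a' self-loops at work: the FSM simply stays where it is until the next one-cycle pulse arrives.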
Checking FSM Behavior
Correctly defining the behavior of a system is hard. The earlier we find problems, the
easier they are to fix. So after we create the FSM, we might take time to ask questions
about how the system behaves under certain input situations and then verify that the
FSM responds as we expect. Consider the code detector FSM in Figure 3.45. What happens
if the user presses the start button and then presses all three colored buttons
simultaneously, four times in a row? Well, the way we defined the FSM, the door
would unlock! A solution to this undesired situation is to modify the conditions on the
arcs that go back to the Wait state. Rather than the condition ar', we could use the
condition a(r'+b+g). Thus, when the FSM expects the red button, then not pressing the red
button, or pressing the blue or green button, causes a transition back to the Wait state,
and so does not unlock the door. Likewise when we are expecting other specific buttons.
An improved FSM is shown in Figure 3.46. Fixing the FSM was easy; trying to fix a
circuit derived from the FSM would have been much harder.
It turns out that the FSM in Figure 3.46 still has a problem, a fairly serious one.
We'll describe that problem in Example 3.13.
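The "all three colored buttons at once" question can be checked mechanically. The following sketch compares the original return-to-Wait condition (for example ar') with the revised one (a(r'+b+g)). The function name unlocks is hypothetical, the start button is assumed to have been pressed already, and the revised return condition is checked first; that ordering is an assumption of this sketch, since the text does not say which transition wins when more than one condition holds.

```python
COLORS = {"r", "g", "b"}
SEQUENCE = ["r", "b", "g", "r"]   # button expected in Start, Red1, Blue, Green


def unlocks(presses, improved):
    """presses: four sets of colored buttons, one set per one-cycle pulse."""
    if len(presses) != len(SEQUENCE):
        return False
    for expected, pressed in zip(SEQUENCE, presses):
        others_pressed = bool(pressed & (COLORS - {expected}))
        if improved:
            # Revised arc: expected button missing OR any other color pressed.
            back_to_wait = expected not in pressed or others_pressed
        else:
            # Original arc: only checks that the expected button is missing.
            back_to_wait = expected not in pressed
        if back_to_wait:
            return False
    return True                    # reached Red2, so u=1


all_three = [{"r", "g", "b"}] * 4
correct = [{"r"}, {"b"}, {"g"}, {"r"}]
print(unlocks(all_three, improved=False))   # original FSM
print(unlocks(all_three, improved=True))    # revised return conditions
print(unlocks(correct, improved=True))
```

Asking such questions of an executable model is cheap, which is the point the text makes about fixing the FSM rather than a circuit derived from it.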
Standard Controller Architecture for Implementing
an FSM as a Sequential Circuit
Now that we've seen how to describe sequential behavior using an FSM, we need a
structured method to convert the FSM to a sequential circuit. The method is actually very
straightforward when we use a standard implementation architecture for the circuit,
consisting of a state register and combinational logic, together known as a controller. There
are many other ways to implement an FSM, but sticking to the standard architecture
results in a straightforward design method. The standard architecture may not yield the
minimum number of transistors, but as we've mentioned many times, that's not a
drawback these days.
A standard controller architecture for an FSM consists of a state register and
combinational logic. The standard architecture for the laser timer FSM of Figure 3.39 is
shown in Figure 3.47.
The state register is a 2-bit register that holds a binary number representing the present
state (in this case, the register is 2 bits wide to represent each of the 4 possible states).
The combinational logic's inputs are the inputs of the FSM (in this case, b), as well as
the state register's outputs (s1 and s0). The combinational logic's outputs are the outputs of
the FSM (x), as well as the next state bits to be loaded into the state register (n1 and n0). The
details of the combinational logic determine the behavior of the circuit. The process for creating
those details will be covered in the next section.
A more general view of the standard controller architecture appears in Figure 3.48. That
figure assumes a state register that is m bits wide.
Figure 3.47 Standard controller architecture for the laser timer.
Figure 3.48 Standard controller architecture: general view.
3.4 CONTROLLER DESIGN
Five-Step Controller Design Process
We can design a controller using a five-step process summarized in Table 3.2. We'll
illustrate this process with some examples.
TABLE 3.2 Controller design process.
Step    Description
Step 1  Capture the FSM
        Create an FSM that describes the desired behavior of the controller.
Step 2  Create the architecture
        Create the standard architecture by using a state register of
        appropriate width, and combinational logic with inputs being the state
        register bits and the FSM inputs, and outputs being the next state bits
        and the FSM outputs.
Step 3  Encode the states
        Assign a unique binary number to each state. Each number
        representing a state is known as an encoding. Any encoding will do
        as long as each state has a unique encoding.
Step 4  Create the state table
        Create a truth table for the combinational logic such that the logic
        will generate the correct FSM outputs and next state signals. Ordering
        the inputs with state bits first makes this truth table describe the state
        behavior, so the table is a state table.
Step 5  Implement the combinational logic
        Implement the combinational logic using any method.
EXAMPLE 3.7 Three-cycles-high laser timer controller (continued)
We can implement the laser timer (see Example 3.4) as a sequential circuit using the five-step process.
Step 1: Capture the FSM. The FSM was created earlier (see Figure 3.39).
Step 2: Create the architecture. The standard controller architecture for the laser timer FSM
was shown in Figure 3.47. The state register has two bits to represent each of the four
states. The combinational logic has external input b and inputs s1 and s0 coming from the
state register, and has external output x and outputs n1 and n0 going to the state register.
Step 3: Encode the states. We can encode the states as follows: Off: 00, On1: 01, On2: 10,
On3: 11. Remember, any nonredundant encoding is fine. The state diagram with encoded
states is shown in Figure 3.49.
Figure 3.49 Laser timer state diagram with encoded states.
Step 4: Create the state table. Given the implementation architecture and the binary encoding
of each state, we can create the state table for the combinational logic, as shown in
Table 3.3. Listing the inputs from the state register first in the input columns allows
us to easily see which rows correspond to which states. We fill in all combinations of
inputs on the left, as usual for a truth table. For each row, we look at the state diagram in
Figure 3.49 to determine the appropriate outputs. For the two rows starting with
s1s0 = 00 (state Off), x should be 0. If b = 0, the controller should stay in state Off, so
n1n0 should be 00. If b = 1, the controller should go to state On1, so n1n0 should be 01.
Likewise, for the two rows starting with s1s0 = 01 (state On1), x should be 1 and the
next state should be On2 (regardless of the value of b), so n1n0 should be 10. We
complete the last four rows similarly.
Be careful to note the difference between the FSM inputs and outputs of Figure 3.49,
and the combinational logic inputs and outputs of Figure 3.50; the latter includes the
bits from and to the state register.

TABLE 3.3 State table for laser timer controller.
        Inputs      Outputs
        s1 s0 b     x  n1 n0
Off     0  0  0     0  0  0
        0  0  1     0  0  1
On1     0  1  0     1  1  0
        0  1  1     1  1  0
On2     1  0  0     1  1  1
        1  0  1     1  1  1
On3     1  1  0     1  0  0
        1  1  1     1  0  0

Step 5: Implement the combinational logic. We finish the design by using the combinational
logic design process from Chapter 2. From the truth table, we obtain the following
equations for the three combinational logic outputs:
x  = s1 + s0 (note from the table that x=1 if s1=1 or s0=1)
n1 = s1's0b' + s1's0b + s1s0'b' + s1s0'b
n1 = s1's0 + s1s0'
n0 = s1's0'b + s1s0'b' + s1s0'b
n0 = s1's0'b + s1s0'
We then obtain the sequential circuit in Figure 3.50, implementing the FSM.
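Steps 4 and 5 can be cross-checked by generating the state table from the encoded FSM and comparing every row against the simplified equations. This is a minimal sketch; the names fsm_row and logic are hypothetical, and Boolean values are held as Python integers 0 and 1.

```python
from itertools import product

ENCODING = {"Off": (0, 0), "On1": (0, 1), "On2": (1, 0), "On3": (1, 1)}
DECODE = {bits: name for name, bits in ENCODING.items()}


def fsm_row(s1, s0, b):
    """Outputs (x, n1, n0) read directly from the encoded state diagram."""
    state = DECODE[(s1, s0)]
    x = 0 if state == "Off" else 1
    if state == "Off":
        nxt = "On1" if b else "Off"
    else:
        nxt = {"On1": "On2", "On2": "On3", "On3": "Off"}[state]
    n1, n0 = ENCODING[nxt]
    return x, n1, n0


def logic(s1, s0, b):
    """The simplified equations from step 5."""
    x = int(s1 or s0)                                        # x  = s1 + s0
    n1 = int((not s1 and s0) or (s1 and not s0))             # n1 = s1's0 + s1s0'
    n0 = int((not s1 and not s0 and b) or (s1 and not s0))   # n0 = s1's0'b + s1s0'
    return x, n1, n0


# Rows are (s1, s0, b, x, n1, n0), in the same order as the state table.
state_table = [(s1, s0, b) + fsm_row(s1, s0, b)
               for s1, s0, b in product((0, 1), repeat=3)]
equations_match = all(logic(*row[:3]) == row[3:] for row in state_table)

for row in state_table:
    print(row)
print("equations match table:", equations_match)
```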
Many textbooks will organize the state table in different ways than that in Table 3.3.
However, we intentionally organize the table so that it serves both as a state table and a
truth table that can be used to design the combinational logic of the controller.
Figure 3.50 Final implementation of the three-cycles-high laser timer controller.
EXAMPLE 3.8 Understanding the laser timer controller's behavior
To better understand how a controller implements an FSM, let's trace through the behavior of the
laser timer controller. Suppose we are initially in state 00 (s1s0=00), b is 0, and the clock is
currently low. As shown in Figure 3.51 (left), based on the combinational logic, x will be 0 (the
desired output in state 00), n1 will be 0, and n0 will be 0, meaning the value 00 will be waiting at
the state register's inputs. Thus, on the next clock edge, 00 will be loaded into the state register,
meaning we stay in state 00, which is correct.
Figure 3.51 Tracing the behavior of the three-cycles-high laser timer controller.
Now suppose b becomes 1. As shown in Figure 3.51 (middle), x will still be 0, as desired. n1
will be 0, but n0 will be 1, meaning the value 01 will be waiting at the state register's inputs. Thus,
on the next clock edge, 01 will be loaded into the state register, as desired.
As shown in Figure 3.51 (right side), soon after 01 is loaded into the state register, x will
become 1 (after the register is loaded, there's a slight delay as the new values for s1 and s0
propagate through the combinational logic gates). That output is correct: we should output x=1 when
in state 01. Also, n1 will become 1 and n0 will equal 0, meaning the value 10 will be waiting at the
state register inputs. Thus, on the next clock edge, 10 will be loaded into the state register, as desired.
After 10 is loaded into the state register, x will stay 1, and n1n0 will become 11. When another
clock edge comes, 11 will be loaded into the register, x will stay 1, and n1n0 will become 00.
When another clock edge comes, 00 will be loaded into the register. Soon after, x will become
0, and if b is 0, n1n0 will stay 00; if b is 1, n1n0 will become 01. Notice we're back where we
started.
Understanding how a state register and combinational logic implement a state machine can
take a while, since in a particular state (indicated by the value presently in the state register), we
generate the external output for that state, and we generate the signals for the next state, but we
don't transition to that next state (i.e., we don't load the state register) until the next clock edge.
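The trace above can be reproduced with a small cycle-by-cycle model of the controller: combinational logic computes x and the waiting next-state bits from the present state, and the state register is loaded only at the clock edge. This is a sketch under simplifying assumptions: gate delays are ignored, the register is assumed to start at 00, and the names comb_logic and simulate are hypothetical.

```python
def comb_logic(s1, s0, b):
    """Combinational logic of the laser timer controller (Example 3.7)."""
    x = int(s1 or s0)
    n1 = int((not s1 and s0) or (s1 and not s0))
    n0 = int((not s1 and not s0 and b) or (s1 and not s0))
    return x, n1, n0


def simulate(b_per_cycle):
    """Return (present state, x, waiting next state) for each clock cycle."""
    s1, s0 = 0, 0                      # assumed power-up state: Off = 00
    trace = []
    for b in b_per_cycle:
        x, n1, n0 = comb_logic(s1, s0, b)
        trace.append((f"{s1}{s0}", x, f"{n1}{n0}"))
        s1, s0 = n1, n0                # rising clock edge: load the register
    return trace


trace = simulate([0, 1, 0, 0, 0, 0])
for state, x, waiting in trace:
    print(f"state={state}  x={x}  n1n0 waiting={waiting}")
```

Each printed line shows both the output for the present state and the value waiting at the register inputs, which is the distinction the last paragraph of the example draws.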
EXAMPLE 3.9 Button press synchronizer
We want to build a circuit that synchronizes a button press to a clock signal, such that when a user
presses the button, the result is a signal that is high for exactly one clock cycle. Such a synchronized
signal is useful to prevent a single button press that lasts multiple cycles from being interpreted as
multiple button presses. Figure 3.52 uses a timing diagram to illustrate the desired circuit behavior.
Figure 3.52 Desired timing diagram of the button press synchronizer.
The circuit's input will be a signal bi, and the output a signal bo. When bi becomes 1,
representing the button being pressed, we want to set bo to 1 for exactly one cycle. We then wait for
bi to return to 0 again, and then wait for bi to become 1 again, which would represent the next
pressing of the button.
Step 1: Capture the FSM. Figure 3.53(a) shows an FSM describing the circuit's behavior. The
FSM waits in state A, outputting bo=0, until bi is 1. The FSM then transitions to state
B, outputting bo=1. The FSM will then transition to either state A or C, which both set
bo=0 again, so that bo was 1 for just one cycle, as desired. The FSM goes from B to A if
bi returned to 0. If bi is still 1, the FSM goes to state C, where the FSM waits for bi
to return to 0, causing a transition back to state A.
Step 2: Create the architecture. Since the FSM has three states, the architecture has a two-bit
state register, as shown in Figure 3.53(b).
Step 3: Encode the states. We can straightforwardly encode the three states as 00, 01, and
10, as shown in Figure 3.53(c).
Step 4: Create the state table. We convert the FSM with encoded states to a state table as
shown in Figure 3.53(d). For the unused state 11, we have chosen to output bo=0 and
return to state 00.
Step 5: Implement the combinational logic. We derive the equations for each combinational
logic output, as shown in Figure 3.53(e), and then create the final circuit as shown.
Figure 3.53 Button press synchronizer design steps: (a) initial FSM, (b) architecture, (c)
FSM with encoded states, (d) state table, (e) final circuit with implemented combinational logic.

State table (Figure 3.53(d)):
          Inputs       Outputs
          s1 s0 bi     n1 n0 bo
A         0  0  0      0  0  0
          0  0  1      0  1  0
B         0  1  0      0  0  1
          0  1  1      1  0  1
C         1  0  0      0  0  0
          1  0  1      1  0  0
unused    1  1  0      0  0  0
          1  1  1      0  0  0

Equations (Figure 3.53(e)):
n1 = s1's0bi + s1s0'bi
n0 = s1's0'bi
bo = s1's0bi' + s1's0bi = s1's0
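To see that the derived equations give one output pulse per press, regardless of how long the button is held, the controller can be stepped through a sample input. The sketch below is illustrative only: sync_logic and synchronize are hypothetical names, the register is assumed to start in state A (00), and each list entry is the value of bi during one clock cycle.

```python
def sync_logic(s1, s0, bi):
    """Equations from Figure 3.53(e)."""
    n1 = int((not s1 and s0 and bi) or (s1 and not s0 and bi))  # s1's0bi + s1s0'bi
    n0 = int(not s1 and not s0 and bi)                          # s1's0'bi
    bo = int(not s1 and s0)                                     # s1's0
    return n1, n0, bo


def synchronize(bi_per_cycle):
    """Return the value of bo in each clock cycle."""
    s1, s0 = 0, 0                      # assumed start in state A = 00
    bo_per_cycle = []
    for bi in bi_per_cycle:
        n1, n0, bo = sync_logic(s1, s0, bi)
        bo_per_cycle.append(bo)
        s1, s0 = n1, n0                # rising clock edge
    return bo_per_cycle


# One press held for three cycles, then a second press held for one cycle.
bi = [0, 1, 1, 1, 0, 0, 1, 0]
bo = synchronize(bi)
print(bi)
print(bo)
```

The pulse on bo appears in the cycle after bi rises, because the FSM must first move from A to B on a clock edge.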
EXAMPLE 3.10 Sequence generator
We want to design a circuit with four outputs: w, x, y, and z. The circuit should generate the
following sequence of output patterns: 0001, 0011, 1100, and 1000. After 1000, the circuit should
repeat the sequence, starting at 0001 again. We want the circuit to generate the next pattern only
on a rising clock edge.
Sequence generators are common in a range of systems. For example, we might want to blink a
set of four lights in a particular pattern, such as in a festive lights display. We might instead want to
rotate an electric motor a fixed number of degrees on each clock cycle by powering magnets around
the motor in a specific sequence to attract the magnetized motor to the next position in the
rotation; such a motor is known as a stepper motor, since the motor rotates in steps.
We can design the sequence generator controller using our five-step process.
Step 1: Capture the FSM. We capture the system's behavior as the FSM shown in
Figure 3.54. The FSM has four states, which we've labeled A, B, C, and D (though any
other four unique names would do just fine).
Figure 3.54 Sequence generator FSM. Inputs: none; Outputs: w, x, y, z.
Step 2: Create the architecture. The standard controller architecture for the sequence
generator will have a 2-bit state register to represent the four possible states, no inputs
to the logic, and outputs w, x, y, z from the logic, along with outputs n1 and n0, as
shown in Figure 3.55.
Figure 3.55 Sequence generator controller architecture.
Step 3: Encode the states. We can encode the states as follows: A: 00, B: 01, C: 10, D: 11.
Any other encoding with a unique code for each state would also do fine.
Step 4: Create the state table. The state table for the FSM with encoded states is shown in
Table 3.4.

TABLE 3.4 State table for sequence generator controller.
      Inputs    Outputs
      s1 s0     w  x  y  z  n1 n0
A     0  0      0  0  0  1  0  1
B     0  1      0  0  1  1  1  0
C     1  0      1  1  0  0  1  1
D     1  1      1  0  0  0  0  0

Step 5: Implement the combinational logic. We derive the equations for each output of the
combinational logic from the table. After some algebraic simplification, the equations
are as follows:
w  = s1
x  = s1s0'
y  = s1's0
z  = s1'
n1 = s1 xor s0
n0 = s0'
The final circuit is shown in Figure 3.56.
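As a quick check of step 5, the equations can be iterated from state A to confirm that they produce the required pattern sequence and then repeat. This is a sketch only; seq_logic and generate are hypothetical names, and starting in state A (00) is an assumption, since the example does not state the initial state explicitly.

```python
def seq_logic(s1, s0):
    """Equations from step 5 of the sequence generator design."""
    w = s1
    x = int(s1 and not s0)     # x  = s1s0'
    y = int(not s1 and s0)     # y  = s1's0
    z = int(not s1)            # z  = s1'
    n1 = int(s1 != s0)         # n1 = s1 xor s0
    n0 = int(not s0)           # n0 = s0'
    return (w, x, y, z), (n1, n0)


def generate(cycles):
    """Return the wxyz pattern output in each clock cycle."""
    s1, s0 = 0, 0              # assumed initial state A = 00
    patterns = []
    for _ in range(cycles):
        (w, x, y, z), (n1, n0) = seq_logic(s1, s0)
        patterns.append(f"{w}{x}{y}{z}")
        s1, s0 = n1, n0        # rising clock edge
    return patterns


patterns = generate(5)
print(patterns)
```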
Figure 3.56 Sequence generator controller architecture.
EXAMPLE 3.11 Secure car key controller (continued)
Let's complete the design for the secure car key controller from Example 3.5. We already carried
out the Capture the FSM step of the five-step process, with the FSM shown in Figure 3.41. The
remaining steps are as follows.
Step 2: Create the architecture. Since the FSM has five states, we'll need a 3-bit state
register. A 3-bit state register can represent eight states, so three states will be unused. The
input to the logic is signal a, while the outputs are signal r and next state outputs n2,
n1, and n0. The architecture is shown in Figure 3.57.
Step 3: Encode the states. Let's encode the states using a straightforward binary encoding of
000 through 100. The FSM with state encodings is shown in Figure 3.58.
Figure 3.57 Secure car key controller architecture.
Figure 3.58 Secure car key FSM with encoded states. Inputs: a; Outputs: r.
Step 4: Create the state table. The FSM converted to a state table is shown in Table 3.5. For
the unused states, we have chosen to set r=0 and the next state to 000.
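The conversion in step 4 is mechanical, so it can be sketched as a short program that walks all sixteen input combinations and fills in r and the next-state bits from the encoded FSM. The names next_state and state_table are hypothetical, and the choice of r=0 and next state 000 for the unused encodings follows the text.

```python
ENCODING = {"Wait": 0b000, "K1": 0b001, "K2": 0b010, "K3": 0b011, "K4": 0b100}
# Output r in each state: the ID 1011 is sent starting with the rightmost bit.
OUTPUT_R = {"Wait": 0, "K1": 1, "K2": 1, "K3": 0, "K4": 1}


def next_state(state, a):
    if state == "Wait":
        return "K1" if a else "Wait"
    return {"K1": "K2", "K2": "K3", "K3": "K4", "K4": "Wait"}[state]


def state_table():
    """Return rows of (s2s1s0, a, r, n2n1n0) for all 16 input combinations."""
    decode = {code: name for name, code in ENCODING.items()}
    rows = []
    for code in range(8):
        for a in (0, 1):
            name = decode.get(code)
            if name is None:
                r, nxt = 0, 0b000          # unused state: r=0, go to 000
            else:
                r, nxt = OUTPUT_R[name], ENCODING[next_state(name, a)]
            rows.append((format(code, "03b"), a, r, format(nxt, "03b")))
    return rows


rows = state_table()
for row in rows:
    print(row)
```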
TABLE 3.5 State table for secure car key controller.
          Inputs        Outputs
          s2 s1 s0 a    r  n2 n1 n0
Wait      0  0  0  0    0  0  0  0
          0  0  0  1    0  0  0  1
K1        0  0  1  0    1  0  1  0
          0  0  1  1    1  0  1  0
K2        0  1  0  0    1  0  1  1
          0  1  0  1    1  0  1  1
K3        0  1  1  0    0  1  0  0
          0  1  1  1    0  1  0  0
K4        1  0  0  0    1  0  0  0
          1  0  0  1    1  0  0  0
Unused    1  0  1  0    0  0  0  0
          1  0  1  1    0  0  0  0
          1  1  0  0    0  0  0  0
          1  1  0  1    0  0  0  0
          1  1  1  0    0  0  0  0
          1  1  1  1    0  0  0  0

Step 5: Implement the combinational logic. We can design four circuits, one for each output,
to implement the combinational logic. We leave this step as an exercise for the reader.
More on Controller Design
Converting a circuit to an FSM
We showed in Section 2.6 that a circuit, truth table, and equation were all ways of
representing the same combinational function. Similarly, a circuit, state table, and
FSM are all ways of representing the same sequential function.
We have been converting an FSM to a circuit using a five-step process. We can
also convert a circuit to an FSM by applying the five-step process of Table 3.2
in reverse. In general, converting a circuit to an equation or FSM is known as reverse
engineering the behavior of the circuit.
EXAMPLE 3.12 Converting a sequential circuit to an FSM
Given the sequential circuit in Figure 3.59, find an equivalent FSM.
We start from step 5 of the 5-step process described in Table 3.2. The combinational
circuit has already been implemented, and we can proceed to step 4, where we create a state
table.
The combinational logic in the controller architecture has 3 inputs: 2 inputs, s0 and s1,
represent the contents of the state register, and 1 input, x, is an external input. Thus the state
table will have 8 rows, since there are 2^3 = 8 possible combinations of inputs.
After we set up the state table and enumerate all possible combinations of inputs
(e.g., s1s0x=000, ..., s1s0x=111), we use the techniques described in Section 2.6 to
fill in the values of the outputs. For example, consider the output y. From the combinational
circuit, we see that y = s1'. Knowing this, we add a 1 in the y column of the state table in
every row where s1 = 0, and we add a 0 to the remaining spaces in the y column. Now consider
n0, which we see has the Boolean equation n0 = s1's0'x. Accordingly, we set n0 to 1 when
s1 = 0 and s0 = 0 and x = 1. We fill in the columns for z and n1 using a similar analysis and
move on to the next step.
Figure 3.59 A sequential circuit with unknown behavior.
In step 3, we must encode the states. Naturally, the states have already been encoded, but we
can still name each state. We arbitrarily choose the labels A, B, C, and D, seen in Table 3.6.
3.4 Controller Design
TABLE 3.6 State table for sequential
circuit
Inputs
OUlputs
5 I sO nl nO y
127
Slep 2 call s for Ihe creal ion of Ihe slandard
archit ecture. This step requires no work
Since the controll er architecture was already
defined. A
0 0 0 0
0
Finall y, in Slep I. we caplure Ihe FSM. Ini -
tiall y. we can set lip an FSM di agram with the
rOllr slates we've labeled in step 3, shown in
Figure 3.60(a). Nexl, we lisl Ihe va lues of Ihe
FSM outputs y and Z next to each state. For
example. in Siale A (51 sO = 00). Ihe OUlputs y
and z are 1 and O. respectivel y. so we list
"y l = 10" wilh Sial e A in Ihe FSM.
Outpuls: y, Z
0)
0
0 0
yz:10 yz: 10
0
0
0 0
yz:oo
YZ:01
(a)
(b)
0
B
0
0
C
D
0 I 0
0
0 0 0
0
1 I 0 0
0 0 0 0 0
0 I I 0 0
0 0 0 0 0
I 0 0 0 0
Inputs: x: Outputs: y, z
YZ: 10
yz:01
(c)
Figure 3.60 Converting a Slale lable 10 an FSM diagram: (a) inilial FSM. (b) with OUlputs
specified. and (c) FSM wi lh OUIPUI S and transilions specifi ed.
After listing the outputs for states B, C, and D, shown in Figure 3.60(b), we turn to the
state transitions specified in the state table by n1 and n0. Consider the first row of the
state table, which says that n1n0 = 00 when s1s0x = 000. In other words, when in state A
(s1s0 = 00), the next state is state A (n1n0 = 00) if x is 0. We can represent this in the
FSM diagram by drawing an arrow from state A back to state A and labeling the new
transition "x'". Now consider the second row of the state table, which indicates that from
state A, we transition to state B when x = 1. We add a transition arrow from state A to B
and label it "x". After labeling all the transitions, we are left with the FSM in
Figure 3.60(c).
You may notice that state D cannot be reached from any other state and transitions to
state A on any input. We reasonably infer that the original FSM had only three states and
state D is an extra, unused state. For completeness, it is preferable to leave state D in
the final diagram, however.
Given any synchronous circuit consisting of logic gates and flip-flops, we can redraw the
circuit as consisting of a state register and logic (our standard controller architecture)
just by grouping all the flip-flops together. Thus, the approach described above works for
any synchronous circuit, not just a circuit drawn in the form of our standard controller
architecture.

Common Pitfalls
Mistakes are commonly made when capturing an FSM, regarding the transitions leaving a
state. In short, one and only one transition condition should ever evaluate to true during
any rising clock edge. The properties are:
1. Only one condition should be true: For a given state, for any rising clock edge,
no more than one transition condition should be true. For example, consider an
FSM with inputs a and b, and a state State1 with two outgoing transitions, one
labeled "a", and the other labeled "b." What happens when a = 1 and b = 1?
Which transition should the FSM take? The FSM designer must ensure that the
conditions are exclusive, meaning only one could possibly ever be true at one time. In
the example, the designer might label the transitions "a" and "a'b" to solve the
problem. Actually, a particular type of FSM, known as a nondeterministic FSM,
does allow more than one condition to be true and chooses among them in some
arbitrary way, but when designing circuits, we usually want deterministic
behavior, so we don't consider nondeterministic FSMs further.
2. One condition should be true: For a given state, for any rising clock edge, one of
the transitions from that state must be taken. In other words, every input combination
should be accounted for in every state. Designers sometimes forget to ensure
this. For example, consider an FSM with inputs a and b, and a state State1 with
two outgoing transitions, one labeled "a", and the other labeled "a'b." What
happens if the FSM is in State1, and a = 0 and b = 0? Neither of the two transitions
from State1 has a true condition. The FSM is not fully specified; we need
to add a third transition, indicating what state to go to if a'b' is true. With that
third condition, we have covered all possible values of a and b. A commonly forgotten
transition is a transition pointing from a state back to itself.
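Both properties can also be checked by brute force over every input combination. The following Python sketch is illustrative only: the function name check_transitions and the use of lambdas to represent conditions are assumptions made here, not a tool from the text.

```python
from itertools import product

def check_transitions(conditions, num_inputs):
    """Check the transition conditions leaving one state.

    conditions: list of functions, each taking num_inputs bits and
                returning a truthy value when that transition is taken.
    Returns (conflicts, gaps):
      conflicts: input combinations where more than one condition is true
      gaps:      input combinations where no condition is true
    """
    conflicts, gaps = [], []
    for bits in product((0, 1), repeat=num_inputs):
        true_count = sum(1 for cond in conditions if cond(*bits))
        if true_count > 1:
            conflicts.append(bits)
        elif true_count == 0:
            gaps.append(bits)
    return conflicts, gaps

# Inputs are (a, b) for the State1 examples in the text.
overlapping = [lambda a, b: a == 1, lambda a, b: b == 1]              # "a" and "b"
incomplete = [lambda a, b: a == 1, lambda a, b: a == 0 and b == 1]    # "a" and "a'b"
complete = incomplete + [lambda a, b: a == 0 and b == 0]              # add "a'b'"

if __name__ == "__main__":
    print(check_transitions(overlapping, 2))  # conflict at ab=11, gap at ab=00
    print(check_transitions(incomplete, 2))   # no conflict, gap at ab=00
    print(check_transitions(complete, 2))     # no conflict, no gap
```

Exhaustive enumeration grows as 2^n in the number of inputs, so it suits the small per-state condition sets discussed here rather than very wide input sets.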
We can verify the above two properties using Boolean algebra. For the first property
of only one condition being true, we can check that the AND of every pair of conditions
on transitions of a state always results in 0. For example, if a state has two transitions,
one with condition a and the other with condition a'b, using transformations of Boolean
algebra we obtain:
  a * a'b
= (a*a')*b
= 0 * b
= 0
For the second property of one condition being true, we can check that the OR of all
the conditions on transitions of a state always results in 1. Considering the same example
of a state that has two transitions, one with condition a and the other with condition a'b,
using transformations of Boolean algebra we obtain:
  a + a'b
= a*(1+b) + a'b
= a + ab + a'b
= a + (a+a')b
= a + b
Clearly, the OR of those two conditions is not 1, but rather a+b. Thus, if a and b
were both 0, neither condition would be true, and therefore the next state would not be
specified in the FSM. Above, we fixed this problem by adding another transition, a'b'.
Checking yields:
  a + a'b + a'b'
= a + a'(b+b')
= a + a'*1
= a + a'
= 1
Analyzing the equations for the transitions of every state and proving they equal
either 1 or 0 is a lot of work. Therefore, a good FSM capture tool will detect the above
two situations and inform the designer of the situation.
EXAMPLE 3.13 Verifying transition properties for the code detector FSM
As evidence that this "pitfall" is indeed common, we admit that the mistake we made in
Figure 3.46 was genuine, and not just made for educational purposes. A reviewer of the
book caught it. We left the mistake in and added this example, to stress the point that
the mistake is common.
Figure 3.46 shows an FSM for a code detector. We want to verify the "only one condition
should be true" property for the transitions leaving state Start. There are three
conditions: ar, a', and a(r'+b+g). We thus have three pairs of conditions, which we AND
and prove each equal 0 as follows:
  ar * a'
= (a*a')*r
= 0*r
= 0

  a' * a(r'+b+g)
= (a'*a)*(r'+b+g)
= 0*(r'+b+g)
= 0

  ar * a(r'+b+g)
= (a*a)*r*(r'+b+g)
= a*r*(r'+b+g)
= arr' + arb + arg
= 0 + arb + arg
= arb + arg
= ar(b+g)
It appears our FSM is not properly specified, as the AND of the third pair of conditions
does not result in 0, which in turn means both conditions could be true at the same time,
resulting in a nondeterministic FSM (if both conditions are true, what is the next state?).
Recall from the code detector problem description that we want to transition from the
Start state to the Red1 state when a button is pressed (a = 1) and that button is the red
button, and no other colored button is pressed. The FSM in Figure 3.46 has the condition
ar. Our mistake was underspecifying this condition; it should instead be arb'g'. In other
words, a button has been pressed (a) and it is the red button (r) and the blue button has
not been pressed (b') and the green button has not been pressed (g'). The transition from
Start back to that same state could then be written as a(rb'g')' (which is the same as in
Figure 3.46 after applying DeMorgan's Law). After this change, we can again try to verify
the "only one condition is true" property for all pairs of the three conditions arb'g',
a', and a(rb'g')':
  arb'g' * a'
= aa'*rb'g'
= 0*rb'g'
= 0

  a' * a(rb'g')'
= 0*(rb'g')'
= 0

  arb'g' * a(rb'g')'
= a*a*(rb'g')*(rb'g')'     (write rb'g' as Y for clarity)
= a*a*Y*Y'
= a*a*0
= 0
We would need to change the transition conditions of the other states similarly, and then
check the pairs of conditions for those transitions too.
To verify the "one condition is true" property for state Start, we OR the three conditions
and prove they equal 1:
  arb'g' + a' + a(rb'g')'
= a' + arb'g' + a(rb'g')'     (write rb'g' as Y for clarity)
= a' + aY + aY'
= a' + a(Y+Y') = a' + a(1)
= a' + a
= 1
We would need to check the property for all other states too.
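The algebra in this example can be cross-checked by enumerating all 16 combinations of a, r, b, and g. The Python sketch below is an illustration only; the function names are hypothetical, and the conditions are transcribed from the text (ar, a', a(r'+b+g) for the original FSM, and arb'g', a', a(rb'g')' for the corrected one).

```python
from itertools import product

def original(a, r, b, g):
    # Conditions leaving Start as drawn in Figure 3.46: ar, a', a(r'+b+g)
    return [a and r, not a, a and (not r or b or g)]

def corrected(a, r, b, g):
    # Corrected conditions: arb'g', a', a(rb'g')'
    y = r and not b and not g          # Y = rb'g'
    return [a and y, not a, a and not y]

def true_count(conditions, bits):
    return sum(bool(c) for c in conditions(*bits))

def overlaps(conditions):
    """Input combinations (a, r, b, g) where more than one condition is true."""
    return [bits for bits in product((0, 1), repeat=4)
            if true_count(conditions, bits) > 1]

def uncovered(conditions):
    """Input combinations (a, r, b, g) where no condition is true."""
    return [bits for bits in product((0, 1), repeat=4)
            if true_count(conditions, bits) == 0]

if __name__ == "__main__":
    print("original overlaps: ", overlaps(original))
    print("corrected overlaps:", overlaps(corrected))
    print("corrected gaps:    ", uncovered(corrected))
```

The overlapping combinations found for the original conditions are exactly those where ar(b+g) = 1, which agrees with the result of the third AND above.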
Simplifying FSM Notations: Unassigned Outputs
We already introduced the simplifying FSM notation of every transition being implicitly
ANDed with a rising clock edge. Another commonly used simplification involves
assigning outputs. If an FSM has many outputs, listing the assignment of every output in
every state can become cumbersome, and make the relevant behavior of the FSM hard to
discern. A common simplifying notation is as follows: if an output is not explicitly
assigned in a state, the output is implicitly assigned a 0.
Simplifying Circuit Drawings: Implicit Clock Connections
Many if not most sequential circuits have a single clock signal connected to all sequential
components. We know a component is sequential because of the small triangle input
drawn on the component's block symbol. Many circuit drawings therefore use a
simplification wherein the clock signal is assumed to be connected to all sequential
components. This simplification leads to less cluttered wiring in the drawing.
Mathematical Formalisms in Combinational and Sequential Circuit Design
We have described two mathematical formalisms, Boolean functions and FSMs, for
designing combinational and sequential circuits, respectively. Note that we did not have to
use those formalisms to design circuits. Recall that our first attempt at building a
three-cycles-high laser timer in Figure 3.35 just had us connecting components together in
the hopes of creating a correctly working circuit. However, using those formalisms provides
for a structured and sound method of designing circuits. Those formalisms also provide the
basis for powerful automated tools to assist us with design, such as a tool that would
automatically check for the common pitfalls described earlier in this section, tools that
automatically convert Boolean equations or FSMs into circuits, tools that verify that two
circuits are equivalent, tools that simulate our systems, etc. And we have not touched
on all the benefits of those mathematical formalisms relating to automating the various
aspects of designing circuits, and verifying the circuits behave properly. The importance of
using sound mathematical formalisms to guide design cannot be overstated.
3.5 MORE ON FLIP-FLOPS AND CONTROLLERS
Other Flip-Flop Types
Today, designers generally use registers to implement their bit storage needs, and those
registers typically are built from D flip-flops. However, in the past, transistors were
more scarce than today. Thus, designers often utilized other types of flip-flops, having
more functionality than D flip-flops, to reduce the logic gates required outside of the
flip-flops, and hence to reduce the number of ICs necessary to implement a circuit. Those
flip-flop types included SR, JK, and T flip-flops.
SR Flip-Flop
The SR flip-flop is similar to the SR latch described earlier, with additional logic to make
the circuit triggered by the edge of a clock, rather than just the level of the clock.
JK Flip-Flop
The JK flip-flop is similar to an SR flip-flop, with J corresponding to S, and with K
corresponding to R (I remember this by thinking of "K" standing for "Klear" or clear). The
JK flip-flop's behavior differs from the SR flip-flop when both inputs are 1. Recall that an
SR flip-flop's behavior is undefined when both inputs are 1. A JK flip-flop, in contrast,
toggles when both inputs are set to 1 (at the next clock edge, of course). To toggle means
to change to the opposite state, meaning if the present stored bit is 1, the next stored bit
would be 0. Likewise, if the present stored bit is 0, the next stored bit would be 1.
T Flip-Flop
A T flip-flop acts like a JK flip-flop with the JK inputs tied together to form the T input.
In other words, whenever T is 0, the flip-flop maintains its current state, but whenever T
is 1, the flip-flop toggles (think of "T" for "Toggle").
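The behaviors described above can be summarized as next-state functions, each giving the bit stored after a clock edge from the present bit q and the inputs. This Python sketch is a behavioral illustration only; the function names are hypothetical, and raising an error for S = R = 1 is simply one way to represent "undefined."

```python
def d_next(q, d):
    """D flip-flop: the stored bit becomes D."""
    return d

def sr_next(q, s, r):
    """SR flip-flop: set, reset, or hold; S = R = 1 is undefined."""
    if s and r:
        raise ValueError("SR flip-flop behavior is undefined when S = R = 1")
    if s:
        return 1
    if r:
        return 0
    return q

def jk_next(q, j, k):
    """JK flip-flop: like SR, but J = K = 1 toggles the stored bit."""
    if j and k:
        return 1 - q
    if j:
        return 1
    if k:
        return 0
    return q

def t_next(q, t):
    """T flip-flop: a JK flip-flop with J and K tied together as T."""
    return jk_next(q, t, t)

if __name__ == "__main__":
    for q in (0, 1):
        print(f"q={q}: JK(1,1)->{jk_next(q, 1, 1)}  T(0)->{t_next(q, 0)}  T(1)->{t_next(q, 1)}")
```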
Nonideal Flip-Flop Behavior
Generally, when we first learn about digital design, we assume ideal behavior for logic
gates and flip-flops, just like when we first learn physics of motion, we assume there's no
friction or wind resistance. There is, however, a nonideal behavior of flip-flops,
metastability, that is such a common problem in real digital design practice, we feel
obliged to discuss the issue briefly here. Digital designers in practice should study
metastability and possible solutions quite thoroughly before doing serious designs.
Metastability comes from failing to meet flip-flop setup or hold times, which we now
introduce.
Setup Times and Hold Times
Flip-flops are built from wires and logic gates, and wires and logic gates have delays.
Thus, a real flip-flop imposes some restrictions on when the flip-flop's inputs can change
relative to the clock edge, in order to ensure correct operation despite those delays. Two
important restrictions are:
Setup time: The inputs of a flip-flop (e.g., the D input) must be stable for a
minimum amount of time, known as the setup time, before a clock edge arrives.
This intuitively makes sense: the input values must have time to propagate
through any internal logic and be waiting at the internal gates' inputs before the
clock pulse arrives.
Hold time: The inputs of a flip-flop must remain stable for a minimum amount of
time, known as the hold time, after a clock edge arrives. This also makes intuitive
sense: the clock signal must have time to propagate through the internal gates to
create a stable feedback state.
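The two restrictions amount to a forbidden window around each clock edge in which the input must not change. A minimal Python sketch of such a check follows; the function name, the integer picosecond time unit, and the treatment of the window boundaries (a change exactly one setup time before the edge, or exactly one hold time after it, is accepted) are assumptions made for illustration.

```python
def timing_violations(clock_edges, data_changes, t_setup, t_hold):
    """List setup/hold violations for one flip-flop input.

    clock_edges:  times of rising clock edges (integers, e.g. picoseconds)
    data_changes: times at which the D input changes value
    t_setup:      required stable time before each edge
    t_hold:       required stable time after each edge
    Returns a list of (edge_time, change_time, kind) tuples.
    """
    violations = []
    for edge in clock_edges:
        for change in data_changes:
            if edge - t_setup < change <= edge:
                violations.append((edge, change, "setup"))
            elif edge < change < edge + t_hold:
                violations.append((edge, change, "hold"))
    return violations

if __name__ == "__main__":
    # Hypothetical numbers: 10 ns clock period, 1 ns setup time, 0.5 ns hold time.
    edges = [10_000, 20_000, 30_000]
    changes = [4_000, 19_500, 30_200]
    for v in timing_violations(edges, changes, t_setup=1_000, t_hold=500):
        print(v)
```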
A related restriction is on the minimum clock pulse width: the pulse must be wide
enough to ensure that the correct values propagate through the internal logic and create a
stable feedback state.
A flip-flop typically comes with a datasheet describing setup times, hold times, and
minimum clock pulse widths.
Figure 3.61 illustrates an example of a setup time violation. D changed to 0 too close
to the rising clock. The result is that R was not 1 long enough to create a stable feedback
situation in the cross-coupled NOR gates with Q being 0. Instead, Q glitches to 0 briefly.
That glitch feeds back to the top NOR gate, causing Q' to glitch to 1 briefly. That glitch
feeds back to the bottom NOR gate, and so on. The oscillation would likely continue until a
race condition caused the circuit to settle into a stable situation of Q = 0 or Q = 1, or
the circuit could enter a metastable state, which we now describe.
Figure 3.61 Setup time violation: D changed to 0 (1) too close to the rising clock, u changed
to 1 after the inverter delay (2), and then R changed to 1 after the AND gate delay (3). But
then the clock pulse was over, causing R to change back to 0 (4) before a stable feedback
situation with Q = 0 occurred in the cross-coupled NOR gates. R's change to 1 did cause Q to
change to 0 after the NOR gate delay (5), but R's change back to 0 caused Q to change right
back to 1 (6). The glitch of a 0 on Q fed back into the top NOR gate, causing Q' to glitch
to 1 (7). That glitch of a 1 fed back to the bottom NOR gate, causing another glitch of a 0
on Q. That glitch runs around the cross-coupled NOR gate circuit (oscillation); a race
condition would eventually cause Q to settle to 1 or 0, or possibly enter a metastable
state (to be discussed).
Metastability
If a designer fails to ensure that a circuit obeys the setup and hold times of a flip-flop,
the result could be that the flip-flop enters a metastable state. A flip-flop in a
metastable state is in a state other than a stable 0 or a stable 1. Metastable in general
means that a system is only marginally stable: the system has other states that are far
more stable. A flip-flop in a metastable state may have an output with a value that is not
a 0 or a 1, instead outputting a voltage somewhere between that of a 0 and that of a 1.
That voltage may also oscillate somewhat. That's a problem. Since a flip-flop's output is
connected to other components like logic gates and other flip-flops, that strange voltage
value may cause other components to output strange values, and soon the values throughout
our entire circuit can be in bad shape.
Why would we ever violate setup and hold times? After all, within a circuit we design,
we can measure the longest possible path from any flip-flop output to any flip-flop input.
As long as we make the clock period sufficiently longer than that longest path, we can
ensure our circuit obeys setup times. Likewise, we can ensure that hold times are satisfied
too. The problem is that our circuit likely has to interface to external inputs, and we
can't control when those inputs change, meaning those inputs may violate setup and hold
times when connected to flip-flop inputs. For example, an input may be connected from a
button being pressed by a user. The user can't be told to press the button so many
nanoseconds before a clock edge and to be sure to hold the button so many nanoseconds after
the edge so that setup and hold times are satisfied. So metastability is a problem
primarily when a flip-flop has inputs that are not synchronized with the circuit's clock;
such inputs are said to be asynchronous.
Designers usually try to synchronize a circuit's asynchronous input to the circuit's clock
before propagating that input to components in the circuit. A common way to synchronize an
asynchronous input is to first feed the asynchronous input into a single D flip-flop, and
then use the output of that flip-flop wherever the input is needed, as shown for the
asynchronous input ai in Figure 3.62. Using a single flip-flop as shown also eliminates a
second problem of different values of the same signal appearing at the various internal
flip-flops at a clock edge, due to different path delays.
Figure 3.62 Feeding external inputs into a single flip-flop can reduce metastability
problems.
"Hold on now!" you might say. Doesn't that synchronizing flip-flop experience the setup and
hold time problem, and hence the same metastability issue? Yes, that's true. But at least
the asynchronous input directly affects only one flip-flop, rather than perhaps several or
dozens of flip-flops and other components. And that synchronizer flip-flop is specifically
introduced for synchronization purposes and has no other purpose, whereas other flip-flops
are being used to store bits for other purposes. We can therefore choose a flip-flop for
the synchronizer that minimizes the metastability problem: we can choose an extremely fast
flip-flop, and/or one with very small setup and hold times, and/or one with special
circuitry to minimize metastability. That flip-flop may be bigger than normal or consume
more power than normal, but there's only one such flip-flop per asynchronous input, so
those issues aren't a problem. Bear in mind that no matter what we do, though, the
synchronizer flip-flop could still become metastable, but at least we can minimize the
odds of a metastable state happening by choosing a good flip-flop.
Another thing to consider is that a flip-flop will typically not stay metastable for
long. Eventually, the flip-flop will "topple" over to a stable 0 or a stable 1. It's like
how a coin tossed onto the ground may spin for a while (a metastable state) but will
eventually topple over to a stable head or tail. What many designers therefore do is
introduce two or more flip-flops in series for synchronization purposes, as shown in
Figure 3.63. So even if the first flip-flop becomes metastable, that flip-flop will likely
reach a stable state before the next clock cycle, and thus the second flip-flop is even
less likely to go metastable. Thus the odds of a metastable signal actually making it to
our circuit's normal flip-flops are very low. This approach has the obvious drawback of
delaying changes on the input signal by several cycles; in Figure 3.63, the rest of the
circuit won't see a change on the input ai for three cycles.
Figure 3.63 Synchronizer flip-flops reduce probability of metastability in our regular
flip-flops.
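The delay introduced by a chain of synchronizer flip-flops can be seen in a small behavioral model. The Python sketch below is an illustration under stated assumptions: it treats every flip-flop as ideal (it does not model metastability itself, only the cycle delay), the function name is hypothetical, and the chain length of three is chosen here for the example rather than read from the figure.

```python
def synchronize(samples, num_flops=2):
    """Pass an asynchronous input through num_flops D flip-flops in series.

    samples[i] is the input's value at rising clock edge i.
    Returns the last flip-flop's output after each edge.
    """
    flops = [0] * num_flops          # assume all flip-flops start at 0
    outputs = []
    for value in samples:
        # All flip-flops load on the same edge: each takes the value its
        # predecessor held before the edge; the first takes the raw input.
        flops = [value] + flops[:-1]
        outputs.append(flops[-1])
    return outputs

if __name__ == "__main__":
    ai = [0, 1, 1, 1, 1, 1]          # input rises before clock edge 1
    print(synchronize(ai, num_flops=3))
```

In this model the rise sampled at edge 1 reaches the last flip-flop's output only after edge 3, showing the price paid for the extra synchronizer stages.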
As clock periods become shorter and shorter, the odds of the first flip-flop stabilizing
before the next clock cycle decrease, so metastability is becoming a more challenging issue
as clock periods shrink. Many advanced methods have been proposed to deal with the issue.
Nevertheless, no matter how hard we try, metastability will always be a possibility,
meaning our circuit may fail. We can minimize the likelihood of failure, but we can't
completely eliminate failures due to metastability. Designers often rate their designs
using a measure called mean time between failures, or MTBF. Designers typically aim
for MTBFs of many years. Many students find this concept (that we can't design fail-proof
circuits) somewhat disconcerting. Yet, that concept is the real situation in design.
Designers of serious high-speed digital circuits should study the problem of
metastability, and modern solutions to the problem, thoroughly.
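To give a feel for how an MTBF figure might be estimated, the sketch below uses a first-order model that is widely quoted in device application notes but is not taken from this text: MTBF = e^(t_resolve / tau) / (t_window * f_clk * f_data). Here t_resolve is the time allowed for a metastable output to settle, tau and t_window are device-specific constants, and f_clk and f_data are the clock frequency and the rate of input changes. The function name and all numeric values are hypothetical.

```python
import math

def synchronizer_mtbf(t_resolve, tau, t_window, f_clk, f_data):
    """Estimate mean time between metastability failures, in seconds.

    t_resolve: settling time available before the value is used (s)
    tau:       device resolution time constant (s)      -- assumed value
    t_window:  device metastability window constant (s) -- assumed value
    f_clk:     clock frequency (Hz)
    f_data:    average rate of asynchronous input changes (Hz)
    """
    return math.exp(t_resolve / tau) / (t_window * f_clk * f_data)

if __name__ == "__main__":
    # Hypothetical device constants and rates, for illustration only.
    one_cycle = synchronizer_mtbf(5e-9, 0.2e-9, 0.1e-9, 100e6, 1e6)
    two_cycles = synchronizer_mtbf(15e-9, 0.2e-9, 0.1e-9, 100e6, 1e6)
    print(f"about one cycle to settle:  {one_cycle:.3e} s")
    print(f"about two cycles to settle: {two_cycles:.3e} s")
```

Under this model, extra settling time raises the MTBF exponentially, which is consistent with the benefit of adding synchronizer flip-flops in series, while a faster clock lowers it.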
Flip-Flop Reset and Set Inputs
Some D flip-flops (as well as other flip-flop types) come with extra inputs that can force
the flip-flop to 0 or 1, independently of the D input. One such input is a clear, or reset,
input, which forces the flip-flop to 0. Another such input is a set input, which forces the
flip-flop to 1. Reset and set inputs are very useful for initializing flip-flops to an
initial value (e.g., initializing all flip-flops to 0s) when powering up or resetting a
system. These reset and set inputs should not be confused with the R and S inputs of an RS
latch or flip-flop. The reset and set inputs are control inputs to any type of flip-flop
(D, RS, T, JK) that take priority over the normal data inputs of the flip-flop.
Figure 3.64 D flip-flops with: (a) synchronous reset R, (b) asynchronous reset AR, and (c)
asynchronous reset and set.
The reset and set inputs of a flip-flop may be either synchronous or asynchronous. A
synchronous reset input forces the flip-flop to 0, regardless of the value on the D input,
during a rising clock edge. For the flip-flop in Figure 3.64(a), setting R to 1 forces the
flip-flop to 0 on the next clock edge. Likewise, a synchronous set input forces the
flip-flop to 1 on a rising clock edge. The reset and set inputs thus have priority over the
D input. If a flip-flop has both a synchronous reset and a synchronous set input, the
flip-flop datasheet must inform the flip-flop user which has priority if both inputs are
set to 1.
An asynchronous reset forces the flip-flop to 0 independently of the clock signal. The
clock does not need to be rising, or even be 1, for the asynchronous reset to occur, hence
the term "asynchronous." Likewise, an asynchronous set, also known as preset, can be used
to asynchronously force the flip-flop to 1.
We omit discussion of how synchronous/asynchronous reset/set inputs would be internally
designed in a flip-flop.
Sample behavior of a flip-flop's asynchronous reset input is shown in Figure 3.65. We
assume the flip-flop initially stores 1. Setting AR to 1 forces the flip-flop to 0,
independent of any clock edge. When the next clock edge appears, AR is still 1, so the
flip-flop stays 0 even though the input D is 1. When AR returns to 0, the flip-flop follows
the D input on successive clock edges, as shown.
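The difference between the two kinds of reset can be captured in a small behavioral model. The Python class below is a sketch under assumptions: the class and method names are hypothetical, the asynchronous reset is modeled as an event applied whenever AR changes, and set inputs are left out for brevity.

```python
class DFlipFlop:
    """Behavioral D flip-flop with synchronous (R) and asynchronous (AR) reset."""

    def __init__(self, q=0):
        self.q = q

    def set_async_reset(self, ar):
        """Apply the AR input at any time, independent of the clock."""
        if ar:
            self.q = 0               # takes effect immediately
        return self.q

    def clock_edge(self, d, r=0, ar=0):
        """Rising clock edge: either reset has priority over D."""
        self.q = 0 if (r or ar) else d
        return self.q

if __name__ == "__main__":
    # Replays the scenario described for Figure 3.65.
    ff = DFlipFlop(q=1)                        # flip-flop initially stores 1
    print(ff.set_async_reset(1))               # AR = 1 forces 0 with no clock edge
    print(ff.clock_edge(d=1, ar=1))            # edge arrives, AR still 1: stays 0
    ff.set_async_reset(0)                      # AR returns to 0
    print(ff.clock_edge(d=1))                  # flip-flop follows D again
```

By contrast, the synchronous input r has no effect in this model until clock_edge is called, which mirrors the description of Figure 3.64(a).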
Figure 3.65 Asynchronous reset forces the flip-flop to 0, independent of clk or D.
Initial State of a Controller
Particularly observant readers may have come up with a question when we implemented
FSMs as controllers in this section: what happened to the indication of the initial state
of an FSM when we designed the controller implementing the FSM? The initial state of an
FSM is the state that the FSM starts in when the FSM is first activated, or in controller
terms, when the controller is first powered on. For example, the laser timer controller
FSM in Figure 3.39 has an initial state of Off. When we converted our graphical FSM to a
state table in this section, we ignored the initial state information. Thus, all of our
controller circuits start in some random state based on whatever values happen to appear in
the state register when we power up the circuit. Not knowing the initial state of a circuit
could pose a problem. For example, we don't want our laser timer controller to start in a
state that immediately turns on the laser.
One solution is to add an additional input, reset, to every controller. Setting reset
to 1 should cause a load of the initial state into the state register. This initial state
should be forced into the state register. The reset and set inputs of a flip-flop come in
very handy in this situation. We can simply connect the controller's reset input to the
reset and set inputs of the state register's flip-flops in a way that sets the flip-flops
to the initial state when reset is 1. For example, if the initial state of a state register
should be 01, then we could connect the controller's reset input to the reset and set
inputs of the flip-flops as shown in Figure 3.66.
Of course, for this reset functionality to work as desired, the designer must ensure that
the controller's reset input is 1 when the system is first powered up. Ensuring the reset
input is 1 during power up can be handled using an appropriate electronic circuit connected
to the on/off switch, the description of which is beyond our scope.
Note that, if the synchronous reset or set inputs of a flip-flop are used, then the
earlier-discussed setup and hold times, and associated metastability issues, apply to those
reset and set inputs.
Figure 3.66 Three-cycles-high laser timer controller with a reset input that loads the
state register with the initial state 01.
Nonideal Controller Behavior: Output Glitches
Glitching is the presence of temporary values on a wire, typically caused by different
delays of different logic paths leading to that wire. We saw an example of glitching in
Figure 3.13. Glitching will also often occur when a controller changes states, due to
different path lengths from each of the controller's state register flip-flops to the
controller's outputs. Consider the three-cycles-high laser timer design in Figure 3.50. The
laser should be off (output x=0) in state s1s0=00 and on (x=1) in states s1s0=01,
s1s0=10, and s1s0=11. However, the delay from s1 to x's OR gate in the figure could
be longer than the delay from s0 to that OR gate. The result could be that when the state
register changes state from s1s0=01 to s1s0=10, the OR gate's inputs could momentarily
see a 00. The OR gate would thus output 0 momentarily (a glitch). In the laser
timer example, that glitch could momentarily shut off the laser, an undesired situation.
Even worse would be glitches that momentarily turn on a laser.
Real designers must determine whether such glitching would really pose a problem in
their particular system, and if so, those designers should take action to avoid glitching.
One solution in the laser timer example might be to insert a D flip-flop after x's OR gate
in Figure 3.50. Doing so would shift the x output later by 1 clock cycle (still resulting
in three cycles high, however), but should eliminate glitches seen at the x output, as only
the stable value appearing at the output would be loaded into the flip-flop on a clock edge.
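The glitch described above can be reproduced with a tiny timing model of the OR gate's inputs during the transition from s1s0=01 to s1s0=10. This Python sketch is illustrative only: the function name, the unit time steps, and the delay values are assumptions, and the OR gate's own delay is ignored.

```python
def or_gate_trace(delay_s1, delay_s0, t_end):
    """Value of x = s1 OR s0 at each time step after the state register
    changes from s1s0 = 01 to s1s0 = 10 at time 0.

    delay_s1, delay_s0: wire/logic delay from each flip-flop to the OR gate.
    """
    trace = []
    for t in range(t_end):
        s1_seen = 1 if t >= delay_s1 else 0   # s1 rises 0 -> 1 after its delay
        s0_seen = 0 if t >= delay_s0 else 1   # s0 falls 1 -> 0 after its delay
        trace.append(s1_seen | s0_seen)
    return trace

if __name__ == "__main__":
    print(or_gate_trace(delay_s1=3, delay_s0=1, t_end=6))  # s1 path slower: x dips to 0
    print(or_gate_trace(delay_s1=2, delay_s0=2, t_end=6))  # equal delays: x stays 1
```

A flip-flop placed after the OR gate would sample x only at the next clock edge, well after the trace has settled, so it would capture the final stable 1 rather than the temporary 0.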
Active-Low Inputs (Negative Logic)
Until now, we have assumed active-high inputs on flip-flops and other components. An
active-high input is a control input whose associated operation is activated by setting
the input to 1. For example, if an input can reset a flip-flop, we assumed that input
caused the reset when the input's value was 1. However, a component can instead have an
active-low input. An active-low input (also known as a negative logic input) is a control
input whose operation is activated by setting the input to 0. Figure 3.67 depicts a D
flip-flop with an active-low synchronous reset input; the circle at the R input indicates
that the R input is active-low. Thus, to reset the flip-flop to 0, we would set R to 0,
whereas for normal D flip-flop operation, we would set R to 1.
Figure 3.67 D flip-flop with active-low synchronous reset input.
Active-low inputs can occur on any component with a control input, not just on flip-flops.
For example, the enable control input on a decoder could be active-low. Setting that
enable to 0 (meaning the decoder is enabled) would cause normal decoder operation, while
setting the input to 1 (meaning the decoder is disabled) would result in all outputs
being 0.
When discussing the behavior of a component, designers will often use the term
assert to mean setting a control input to the value that activates the associated
operation. Thus, we might say that one must "assert" the R input of the D flip-flop in
Figure 3.67 in order to reset the flip-flop to 0. Using the term assert avoids possible
confusion that could occur when some control inputs are active-high and others are
active-low.
Active-low inputs typically exist when the internal design of the component requires
fewer gates when implemented with an active-low input than with an active-high input.
3.6 SEQUENTIAL LOGIC OPTIMIZATIONS AND TRADEOFFS
(SEE SECTION 6.3)
The earlier sections described how to design basic sequential logic. This section, which
physically appears in this book as Section 6.3, describes how to create better sequential
logic (smaller, faster, etc.) using optimizations and tradeoffs. One use of this book
describes sequential logic design optimizations and tradeoffs immediately after
introducing basic sequential logic design, meaning now. An alternative use describes
sequential logic design optimizations and tradeoffs later, after completing the
introduction of basic datapath components and RTL design (Chapters 4 and 5).
3.7 SEQUENTIAL LOGIC DESCRIPTION USING HARDWARE
DESCRIPTION LANGUAGES (SEE SECTION 9.3)
This section, which physically appears in this book as Section 9.3, introduces the use of
HDLs for describing sequential logic. One use of this book introduces such HDL use
immediately after introducing basic sequential logic design, meaning now. An alternative
use introduces such HDL use later.
3.8 PRODUCT PROFILE-PACEMAKER
A pacemaker is an electronic device that provides electrical stimulation to the heart to
help regulate the heart's beating, steadying a heart whose natural "pacemaker" is
not working properly, perhaps due to disease. Implantable pacemakers, which are
surgically placed under the skin as shown in Figure 3.68, are worn by over 1/2 million
Americans. They are powered by a battery that lasts ten years or more. Pacemakers have
improved the quality of life as well as lengthened the lives of many millions of people.
A heart has two atria (left and right) and two ventricles (left and right). The ventricles
push the blood out to the arteries, while the atria receive the blood from the veins. A
very simple pacemaker has one sensor to detect a natural contraction in the heart's right
ventricle, and one output wire to deliver electrical stimulation to that right ventricle if
the natural contraction doesn't occur within a specified time period, typically just under
one second. Such electrical stimulation causes a contraction not only in the right
ventricle, but also the left ventricle.
Figure 3.68 Pacemaker with leads (left), and pacemaker's location under the skin (right).
Courtesy of Medtronic, Inc.
We can descnbe the behaVIOr 0 h P h pac' maker con i ting of a controller and
69 Th I ft ' de of the figu re sows tee .
Figure 3. . e e Sl . h the timer when t - 1. pon being reset. the
. Th " h n input t wh,c resets .
a umer. e umer as a . 8 d If the timer counts down to O. the lImer
timer begins counting down from id ;befOre rcaching O. in which case the timer
sets its output z to 1. II mer COhU . re t:rt counting down from 0.8 seconds again.
d t z to 1 and Instead t e lImer . . . h
ocs not se. h' h ' 1 when a contraction In the ng t ven.
The controller has an input s. w IC IS h' h the controller sets to 1 when the controller
tricle. The cOOlroller has an output p. w IC
wants to cause a paced contraction.
Figure 3.69 A basic pacemaker's controller.
The right side of the figure shows the controller's behavior as an FSM. Initially, the
controller resets the timer in state ResetTimer by setting t = 1. Normally, the controller
waits in state Wait, and stays in that state as long as a contraction is not detected (s') and
the timer does not reach 0 (z'). If the controller detects a natural contraction (s), then the
controller again resets the timer and returns to waiting. On the other hand, if the
controller sees that the timer has reached 0 (z = 1), then the controller goes to state Pace,
which paces the heart by setting p = 1, after which the controller returns to waiting again.
Thus, as long as the heart contracts naturally, the pacemaker applies no stimulation to the
heart. But if the heart doesn't contract naturally within 0.8 seconds of the last contraction
(natural or paced), the pacemaker forces a contraction.
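The controller behavior just described can be sketched in software. The following Python
model is only an illustrative sketch, not the book's implementation: the names State,
OUTPUTS, next_state, and run are hypothetical. It also rests on two assumptions that the text
does not spell out: a detected contraction (s) takes priority if s and z are both 1, and the
controller passes through ResetTimer after Pace so that the 0.8-second countdown restarts
after a paced contraction.

```python
from enum import Enum


class State(Enum):
    RESET_TIMER = "ResetTimer"
    WAIT = "Wait"
    PACE = "Pace"


# Outputs (t, p) asserted in each state.
OUTPUTS = {
    State.RESET_TIMER: (1, 0),  # t = 1 resets the timer
    State.WAIT: (0, 0),
    State.PACE: (0, 1),         # p = 1 paces the heart
}


def next_state(state, s, z):
    """Return the state entered on the next rising clock edge."""
    if state is State.WAIT:
        if s:                       # natural contraction detected
            return State.RESET_TIMER
        if z:                       # timer reached 0 with no contraction
            return State.PACE
        return State.WAIT           # s'z': keep waiting
    if state is State.PACE:
        return State.RESET_TIMER    # assumption: restart the timer after pacing
    return State.WAIT               # ResetTimer always moves on to Wait


def run(inputs, state=State.RESET_TIMER):
    """Trace (state name, t, p) for a list of (s, z) pairs, one pair per clock cycle."""
    trace = []
    for s, z in inputs:
        t, p = OUTPUTS[state]
        trace.append((state.value, t, p))
        state = next_state(state, s, z)
    return trace


if __name__ == "__main__":
    # One natural contraction (s = 1), then a timeout (z = 1) that forces pacing.
    for row in run([(0, 0), (0, 0), (1, 0), (0, 0), (0, 1), (0, 0)]):
        print(row)
```

Feeding the model a natural contraction followed by a timer timeout shows the controller
resetting the timer in the first case and entering Pace in the second.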
The atria receive blood from the veins, and contract to push the blood into the ventricles.
The atrial contractions occur just before the ventricular contractions. Therefore,
many pacemakers, known as "atrioventricular" pacemakers, sense and pace not just the
ventricular contractions, but also the atrial contractions. Such pacemakers thus have two
sensors, and two output wires for electrical stimulation, and may provide better cardiac
output, with the desirable result being higher blood pressure (Figure 3.70).
Figure 3.70 An atrioventricular pacemaker's controller FSM (using the convention that FSM
outputs not explicitly set in a state are implicitly set to 0). Inputs: sa, za, sv, zv.
Outputs: pa, ta, pv, tv.
The pacemaker has two timers, one for the right atrium (TimerA) and one for the
right ventricle (TimerV). The controller initially resets TimerA in state ResetTimerA, and
then waits for a natural atrial contraction, or for the timer to reach 0. If the controller
detects a natural atrial contraction (sa), then the controller skips pacing of the atrium. On
the other hand, if TimerA reaches 0 first, then the controller goes to state PaceA, which
causes a contraction in the atrium by setting pa = 1. After an atrial contraction (either
natural or paced), the controller resets TimerV in state ResetTimerV, and then waits for a
natural ventricular contraction, or for the timer to reach 0. If a natural ventricular
contraction occurs, the controller skips pacing of the ventricle. On the other hand, if TimerV
reaches 0 first, then the controller goes to state PaceV, which causes a contraction in the
ventricle by setting pv = 1. The controller then returns to the initial state.
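The atrioventricular controller can likewise be captured as a small table-driven model. This
is a hedged sketch rather than the book's design: the dictionary FSM, the function step, and
the wait-state names WaitA and WaitV are hypothetical, and the sketch assumes that a sensed
contraction takes priority over a timer timeout when both occur in the same cycle.

```python
# Each entry maps a state to (outputs that are 1 in that state, next-state function).
# Outputs not listed for a state are implicitly 0, following the figure's convention.
FSM = {
    "ResetTimerA": ({"ta"}, lambda i: "WaitA"),
    "WaitA": (set(), lambda i: "ResetTimerV" if i["sa"]
              else ("PaceA" if i["za"] else "WaitA")),
    "PaceA": ({"pa"}, lambda i: "ResetTimerV"),
    "ResetTimerV": ({"tv"}, lambda i: "WaitV"),
    "WaitV": (set(), lambda i: "ResetTimerA" if i["sv"]
              else ("PaceV" if i["zv"] else "WaitV")),
    "PaceV": ({"pv"}, lambda i: "ResetTimerA"),
}


def step(state, sa=0, za=0, sv=0, zv=0):
    """Return (outputs in the present state, state after the next rising clock edge)."""
    outputs_high, transition = FSM[state]
    outputs = {name: int(name in outputs_high) for name in ("pa", "ta", "pv", "tv")}
    return outputs, transition({"sa": sa, "za": za, "sv": sv, "zv": zv})


if __name__ == "__main__":
    state = "ResetTimerA"
    # Atrium contracts naturally, ventricle does not and must be paced.
    for inputs in ({}, {"sa": 1}, {}, {"zv": 1}, {}):
        outputs, nxt = step(state, **inputs)
        print(state, outputs, "->", nxt)
        state = nxt
```

Keeping the transitions in a table makes it easy to see that each pacing state is reached
only through a timeout, and that both the natural and the paced path lead to the same next
reset state.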
Most modern pacemakers can have the timer parameters programmed through radio signals, so
that doctors can try different treatments without having to surgically remove the pacemaker.
This example demonstrates the usefulness of FSMs in describing a controller's
behavior. Real pacemakers have controllers with tens or even hundreds of states to deal
with various details that we left out of the example for simplicity.
With the advent of very low-power microprocessors, a trend in pacemaker design is
that of implementing the FSM on a microprocessor rather than with a custom sequential
circuit. Microprocessor implementation yields the advantage of easy reprogramming of
the FSM, expanding the range of treatments that a doctor can experiment with.
3.9 CHAPTER SUMMARY
Section 3.1 introduced the concept of sequential circuits, namely circuits that store bits,
meaning the circuits have memory, known as state. Section 3.2 developed a series of
increasingly robust bit storage blocks, including the SR latch, D latch, D flip-flop, and
finally a register, which can store multiple bits. The section also introduced the concept of
a clock, which synchronizes the loads of registers. Section 3.3 introduced finite-state
machines (FSMs) for capturing the desired behavior of a sequential circuit, and a standard
architecture able to implement FSMs, with an FSM implemented using the
architecture known as a controller. Section 3.4 then described a five-step process for
converting an FSM to a controller implementation. Section 3.5 highlighted some types of
flip-flops other than the D flip-flop, those other types being popular in the past. That
section also described several timing issues related to the use of flip-flops, including setup
time, hold time, and metastability. The section introduced asynchronous clear and set
inputs to flip-flops, and described their use for initializing an FSM to its initial state.
Section 3.8 highlighted a cardiac pacemaker and illustrated the use of an FSM to describe
the pacemaker's behavior.
Designing a combinational circuit begins by capturing the desired circuit behavior
using either an equation or a truth table, and then following a several-step process to
convert the behavior to a combinational circuit. Designing a sequential circuit begins by
capturing the desired circuit behavior as an FSM, and then following a several-step
process to convert the behavior to a circuit consisting of a register and a combinational
circuit, known as a controller. Conceptually, then, with the knowledge in Chapters 2 and
3, we can build any digital circuit. However, many digital circuits deal with input data
many bits wide, such as five 32-bit inputs. Imagine how complex our equations, truth
tables, or FSMs would be if they involved 5*32 = 160 inputs. Fortunately, components
have been developed specifically to deal with data inputs and thus simplify the design
process, components that will be described in the next chapter.
3.10 EXERCISES
Any problem noted with an asterisk (*) represents an especially challenging problem.
SECTION 3.2: STORING ONE BIT-FLIP-FLOPS
3.1 Compute the clock period for the following clock frequencies:
(a) 50 kHz (early computers)
(b) 300 MHz (Sony Playstation 2 processor)
(c) 1 GHz (Intel Pentium 4 processor)
(d) 10 GHz (PCs of the early 2000s)
(e) 1 THz (1 terahertz)
3.2 Compute the clock period for the following clock frequencies.
(a) 32.768 kHz
(b) 100 MHz
(c) 1.5 GHz
(d) 2.4 GHz
3.3 Compute the clock frequency for the following clock periods.
(a) 1 s
(b) 1 ms
(c) 20 ns
(d) 1 ns
(e) 1.5 ps
3.4 Compute the clock frequency for the following clock periods.
(a) 500 ms
(b) 400 ns
(c) 4 ns
(d) 20 ps
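Exercises 3.1 through 3.4 all rest on the reciprocal relationship between clock period and
clock frequency, period = 1 / frequency. As a minimal sketch (the helper names clock_period
and clock_frequency are hypothetical, and the sample values are deliberately not taken from
the exercises), the conversion can be checked as follows:

```python
def clock_period(frequency_hz):
    """Clock period in seconds for a clock frequency given in hertz."""
    return 1.0 / frequency_hz


def clock_frequency(period_s):
    """Clock frequency in hertz for a clock period given in seconds."""
    return 1.0 / period_s


if __name__ == "__main__":
    # A 1 MHz clock has a period of one microsecond.
    print(clock_period(1e6), "seconds")
    # A 50 ns period corresponds to a 20 MHz clock.
    print(clock_frequency(50e-9), "Hz")
```

Remember to convert prefixes first: kHz, MHz, GHz, and THz are 10^3, 10^6, 10^9, and 10^12
Hz, while ms, ns, and ps are 10^-3, 10^-9, and 10^-12 seconds.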
3.5 *Assume scientists have developed a chip having perfect transistors and wires with no
resistance, meaning signals within this chip can travel at the speed of light, or 3x10^8
meters/second. Assuming our digital circuit has a width of […] mm and a height of […] mm,
compute the clock period and clock frequency, where the longest distance any signal must
travel in a single clock period is:
(a) one-eighth of the width of the circuit
(b) one-half the height of the circuit
(c) the width of the circuit
(d) diagonally across the circuit
(e) the perimeter of the circuit
3.6 Trace the behavior of an SR latch for the following situation: Q, S, and R are 0 and have
been for a long time, then S changes to 1 and stays there for a long time, then S changes
back to 0. Using a timing diagram, show the values that appear on every wire for every
change on a wire. Assume logic gates have a tiny but nonzero delay.
3.7 Repeat Exercise 3.6, but assume that S was changed to 1 just long enough for the signal to
propagate through one logic gate, after which S was changed back to 0; in other words, S did
not satisfy the hold time of the latch.
3.8 Trace the behavior of a level-sensitive SR latch (see Section 3.2) for the input pattern
in Figure 3.71. Assume S1, R1, and Q are initially 0. Complete the timing diagram, assuming
logic gates have a tiny but nonzero delay.
Figure 3.71 SR latch input pattern timing diagram for Exercise 3.8 (signals C, S, R, S1, R1, Q).
3.9 Trace the behavior of a level-sensitive SR latch (see Section 3.2) for the input pattern
in Figure 3.72. Assume S1, R1, and Q are initially 0. Complete the timing diagram, assuming
logic gates have a tiny but nonzero delay.
Figure 3.72 SR latch input pattern timing diagram for Exercise 3.9 (signals C, S, R, S1, R1, Q).
3.10 Trace the behavior of a level-sensitive SR latch (see Section 3.2) for the input pattern
in Figure 3.73. Assume S1, R1, and Q are initially 0. Complete the timing diagram, assuming
logic gates have a tiny but nonzero delay.
Figure 3.73 SR latch input pattern timing diagram for Exercise 3.10 (signals C, S, R, S1, R1, Q).
3.11 Trace the behavior of a D latch (see Figure 3.18) for the input pattern in Figure 3.74.
Assume Q is initially 0. Complete the timing diagram, assuming logic gates have a tiny but
nonzero delay.
Figure 3.74 D latch input pattern timing diagram for Exercise 3.11 (signals C, D, S, R, Q).
3.12 Trace the behavior of a D latch (see Figure 3.18) for the input pattern in Figure 3.75.
Assume Q is initially 0. Complete the timing diagram, assuming logic gates have a tiny but
nonzero delay.
Figure 3.75 D latch input pattern timing diagram for Exercise 3.12 (signals C, D, S, R, Q).
3.13 Trace the behavior of an edge-triggered D flip-flop using a master-servant design (see
Figure 3.24) for the input pattern in Figure 3.76. Assume each internal latch initially stores
a 0. Complete the timing diagram, assuming logic gates have a tiny but nonzero delay.
Figure 3.76 Edge-triggered D flip-flop input pattern timing diagram for Exercise 3.13
(signals C, D/Dm, Cm, Qm/Ds, Cs, Qs).
3.14 Trace the behavior of an edge-triggered D flip-flop using the master-servant design (see
Figure 3.24) for the input pattern in Figure 3.77. Assume each internal latch initially stores
a 0. Complete the timing diagram, assuming logic gates have a tiny but nonzero delay.
Figure 3.77 Edge-triggered D flip-flop input pattern timing diagram for Exercise 3.14
(signals C, D/Dm, Cm, Qm/Ds, Cs, Qs).
3.15 Compare the behavior of D latch and D flip-flop devices by completing the timing diagram
in Figure 3.78. Assume each device initially stores a 0. Provide a brief explanation of the
behavior of each device.
Figure 3.78 D latch and D flip-flop input pattern timing diagram for Exercise 3.15
(signals C, D, Q (D latch), Q (D flip-flop)).
3.16 Compare the behavior of D latch and D flip-flop devices by completing the timing diagram
in Figure 3.79. Assume each device initially stores a 0. Provide a brief explanation of the
behavior of each device.
Figure 3.79 D latch and D flip-flop input pattern timing diagram for Exercise 3.16
(signals C, D, Q (D latch), Q (D flip-flop)).
3.17 Create a circuit of three level-sensitive D latches connected in series (the output of
one is connected to the input of the next). Show how a clock with a long high-time can cause
the value at the input of the first D latch to trickle through more than one latch during the
same clock cycle.
3.18 Repeat Exercise 3.17 using edge-triggered D flip-flops, and show how the input of the
first D flip-flop does not trickle through to the next flip-flop no matter how long the clock
signal is high.
3.19 Using D flip-flops, create a circuit with input X and an output Y, such that Y always
equals X delayed by two clock cycles.
3.20 Using four registers, design a circuit that stores the previous four values seen at an
8-bit input D. The circuit should have a single 8-bit output that can be configured using two
inputs s1 and s0 to output any one of the four registers. (Hint: use an 8-bit 4x1 mux.)
3.21 Consider three 4-bit registers connected together as shown in Figure 3.80. Assume the
initial values in the registers are unknown. Trace the behavior of the registers by
completing the timing diagram of Figure 3.81.
Figure 3.80 Register configuration.
Figure 3.81 4-bit register input pattern timing diagram for Exercise 3.21
(signals a3..a0, C, b3..b0, c3..c0, d3..d0).
3.22 Consider three 4-bit registers connected together as shown in Figure 3.83. Assume the
initial values in the registers are unknown. Trace the behavior of the registers by completing
the timing diagram of Figure 3.82.
Figure 3.82 4-bit register input pattern timing diagram for Exercise 3.22
(signals a3..a0, C, b3..b0, c3..c0, d3..d0).
SECTION 3.3: FINITE-STATE MACHINES (FSMs) AND CONTROLLERS
3.23 Draw a state diagram for an FSM that has an input X and an output Y. Whenever X changes
from 0 to 1, Y should become 1 for two clock cycles and then return to 0, even if X is still
1. (Assume for this problem and all other FSM problems that an implicit rising clock is ANDed
with every FSM transition condition.)
3.24 Draw a state diagram for an FSM with no inputs and three outputs, x, y, and z. xyz should
always follow the following sequence: 000, 001, 010, 100, repeat. The output should change
only on a rising clock edge. Make 000 the initial state.
3.25 Do Exercise 3.24, but add an input I that can stop the sequence when set to 0. When input
I returns to 1, the sequence resumes from where it left off.
3.26 Do Exercise 3.25, except the sequence starts from 000 whenever I returns to 1.
3.27 A wristwatch display can show one of four items: the time, the alarm, the stopwatch, or
the date, controlled by two signals s1 and s0 (00 displays the time, 01 the alarm, 10 the
stopwatch, and 11 the date; assume s1s0 control an N-bit-wide mux that passes through the
appropriate register). Pressing a button B (which sets B = 1) sequences the display to the
next item (if the presently displayed item is the date, the next item is the current time).
Create a state diagram for an FSM describing this sequencing behavior, having an input bit B,
and two output bits s1 and s0. Be sure to only sequence forward by one item each time the
button is pressed, regardless of how long the button is pressed; in other words, be sure to
wait for the button to be released after sequencing forward one item. Use short but
descriptive names for each state. Make displaying the time be the initial state.
Figure 3.83 Register configuration.
3.28 Extend the state diagram you created in Exercise 3.27 by adding an input R. R = 1 forces
the FSM to return to the state that displays the time.
3.29 Draw a state diagram for an FSM with an input gcnt and three outputs, x, y, and z. The
xyz outputs generate a sequence called a Gray code in which exactly one of the three outputs
changes from 0 to 1 or from 1 to 0. The Gray code sequence that the FSM should output is
000, 010, 011, 001, 101, 111, 110, 100, repeat. The output should change only on a rising
clock edge when the input gcnt = 1. Make the initial state 000.
3.30 Trace through the execution of the FSM you created in Exercise 3.29 by completing the
timing diagram in Figure 3.84, where C is the clock input and S is the n-bit state register.
Assume S is initially 000.
Figure 3.84 FSM input pattern timing diagram for Exercise 3.30 (signals gcnt, C, S).
3.31 Draw a timing diagram for the FSM in Figure 3.85, such that the FSM starts in state Wait,
gets to state EN, and returns to Wait. Describe the behavior of the circuit in English.
Figure 3.85 FSM for Exercise 3.31. Inputs: s, r. Outputs: a, en.
3.32 For FSMs with the following numbers of states, indicate the smallest possible number of
bits for a state register representing those states:
(a) 4
(b) 8
(c) 9
(d) 23
(e) 900
3.33 How many possible states can be represented by a 16-bit register?
3.34 If an FSM has N states, what is the maximum number of possible transitions that could
exist in the FSM (assuming there are a large number of inputs, meaning the number of
transitions is not limited by the number of inputs)?
3.35 *Assuming one input and one output, how many possible four-state FSMs exist?
3.36 *Suppose you are given two FSMs that execute concurrently. Describe an approach for
merging those two FSMs into a single FSM with identical functionality as the two separate
FSMs, and provide an example. If the first FSM has N states and the second has M states, how
many states will the merged FSM have?
3.37 Sometimes dividing a large FSM into two smaller FSMs results in simpler circuitry.
Divide the FSM shown in Figure 3.88 into two FSMs, one containing G0-G3, the other containing
G4-G7. You may add additional states, transitions, and inputs or outputs between the two
FSMs, as required. Hint: you will need to introduce signals between the FSMs for one FSM to
tell the other FSM to go to some state.
SECTION 3.4: CONTROLLER DESIGN
3.38 Using the five-step process for designing a controller, convert the FSM of Figure 3.86 to
a controller, implementing the controller using a state register and logic gates.
Figure 3.86 FSM for Exercise 3.38
3.39 Using the five-step process for designing a controller, convert the FSM of Figure 3.87 to
a controller, implementing the controller using a state register and logic gates.
Figure 3.87 FSM for Exercise 3.39
3.40 Using the five-step process for designing a controller, convert the FSM you created for
Exercise 3.24 to a controller, implementing the controller using a state register and logic
gates.
3.41 Using the five-step process for designing a controller, convert the FSM you created for
Exercise 3.27 to a controller, implementing the controller using a state register and logic
gates.
3.42 Using the five-step process for designing a controller, convert the FSM you created for
Exercise 3.29 to a controller, implementing the controller using a state register and logic
gates.
3.43 Using the five-step process for designing a controller, convert the FSM in Figure 3.88 to
a controller, stopping once you have created the state table. Note: your state table will be
quite large, having 32 rows; you might therefore want to use a computer tool, like a word
processor or spreadsheet, to draw the table.
Figure 3.88 FSM for Exercises 3.37 and 3.43. Inputs: g, r. Outputs: x, y, z.
3.44 Create an FSM that has an input X and an output Y. Whenever X changes from 0 to 1, Y
should become 1 for five clock cycles and then return to 0, even if X is still 1. Using the
five-step process for designing a controller, convert the FSM to a controller, stopping once
you have created the state table.
3.45 The FSM in Figure 3.89 has two problems: one state has two transitions whose conditions
could simultaneously evaluate to true, and another state has transitions that aren't
guaranteed to have at least one of the transition conditions true. By ORing and ANDing the
conditions for each state's transitions, prove that these problems exist. Then, fix these
problems by refining the FSM, taking your best guess as to what was the FSM creator's intent.
3.46 Reverse engineer the behavior of the sequential circuit shown in Figure 3.90.
Figure 3.90 A sequential circuit to be reverse engineered.
SECTION 3.5: MORE ON FLIP-FLOPS AND CONTROLLERS
3.47 Consider three T flip-flops connected as shown in Figure 3.92. Trace the behavior of the
flip-flops by completing the timing diagram in Figure 3.91. Assume all the flip-flops
initially contain 0s.
Figure 3.91 T flip-flop input pattern timing diagram for Exercise 3.47 (signals T, C, Q1, Q2, Q3).
3.48 Show how to connect four T flip-flops together to create a circuit that counts from 0 to
15 in binary and back to 0 again; in other words, that counts 0000, 0001, 0010, ..., 1111, and
back to 0000 again. Hint: consider using the Q output of a flip-flop as the clock input of
another flip-flop. Assume all the flip-flops initially contain 0s.
Figure 3.92 Three T flip-flops.
3.49 Define metastability.
3.50 Design a controller with a 4-bit state register that gets synchronously initialized to
state 1010 when an input reset is set to 1.
3.51 *Design a D flip-flop with asynchronous reset, AR, and asynchronous set, AS, inputs using
basic logic gates.
DESIGNER PROFILE
Brian got his bachelor's degree in Electrical Engineering and then worked for several
years. Realizing the future demand for digital design targeting an increasingly popular
type of digital chip known as FPGAs (see Chapter 7), he returned to school to obtain a
master's degree in Electrical Engineering with a thesis topic targeting digital design for
FPGAs. He has been employed at two different companies, and is now working as an
independent digital design consultant.
He has worked on a number of projects, including a system that prevents house fires by
tripping a circuit breaker when current running in the circuit indicates arcing is occurring,
a microprocessor architecture for speeding up the processing of digitized video, and a
mammography machine for precise location detection of tumors in humans.
One of the projects he has found most interesting was a baggage scanner for detecting
explosives. "In that system, there is a lot of data being acquired as well as motors
running, x-rays being beamed, and other things happening, all at the same time. To be
successful, you have to pay attention to detail, and you have to communicate with the other
design teams so everyone is on the same page." He found that project particularly
interesting because "I was working on a small part of a very large, complex machine. We had
to stay focused on our part of the design, while at the same time being mindful of how all
the parts were going to fit together in the end." Thus, being able to work alone as well as
in large groups was important, requiring good communication and team skills. And being able
to understand not only a part of the system, but also important aspects of the other parts
was also important, requiring knowledge of diverse topics.
Brian is now an independent digital design consultant, something that many electrical
engineers, computer engineers, and computer scientists choose to do after getting experience
in their field. "I like the flexibility that being a consultant offers. On the plus side, I
get to work on a wide variety of projects. The drawback is that sometimes I only get to work
on a small part of a project, rather than seeing a product through from start to finish. And
of course being an independent consultant means there's less stability than a regular
position at a company, but I don't mind that."
Brian has taken advantage of the flexibility provided by consulting by taking a part-time
job teaching an undergraduate digital design course and an embedded systems course at a
university. "I really enjoy teaching, and I have learned a lot through teaching. And I enjoy
introducing students to the field of embedded systems."
Asked what he likes most about the field of digital design, he says, "I like building
products that make people's lives easier, or safer, or more fun. That's satisfying."
Asked to give advice to students, he says that one important thing is "to ask questions.
Don't be afraid of looking dumb when you ask questions at a new job. People don't expect you
to know everything, but they do expect you to ask questions when you are unsure. Besides,
asking questions is an important part of learning."
4 Datapath Components
4.1 INTRODUCTION
Chapters 2 and 3 introduced increasingly complex building blocks that can be used to
build digital circuits. Those blocks included logic gates, multiplexors, decoders, basic
registers, and finally controllers. Controllers are good for implementing systems having
some number of control input signals and generating some number of control output signals.
For example, if we see a particular control input become 1 (corresponding perhaps
to a button being pressed), then we may want to generate a 1 on a control output
(corresponding perhaps to a light turning on). In this chapter, we instead focus on creating
building blocks that are good for systems having data inputs and outputs. In general,
digital systems have two types of inputs (and outputs as well):
Control: A control input is typically one bit, representing a particular event
outside the system, like a button being pressed, or representing a particular state
of something outside the system, like a door being closed or a car being
at an intersection. Control inputs could sometimes be grouped into multiple bits,
like 4 bits representing which of 16 buttons is pressed, or 2 bits representing each
of 4 possible states of a door (closed, open 1/3rd, open 2/3rd, or fully open).
Control inputs are typically used directly to influence a controller's present state.
Data: A data input is typically multiple bits, collectively representing a single
entity. For example, a 32-bit input may represent a temperature in binary. A 7-bit
input may represent the present floor location of an elevator in a 100-floor building.
A data input may be a single bit, differing from a single-bit control input in that we
don't directly rely on that bit's value to influence the controller's present state.
Not all inputs can be strictly classified as either control or data; there are some inputs
that fall somewhere on the border in between the two types. But most inputs can be
classified as one or the other. (And, of course, a digital system also has power inputs,
ground inputs, and clock inputs too, in addition to control and data inputs.)
Controllers are a good building block for building systems consisting mainly of
control inputs and control outputs. But we also need building blocks for systems
consisting of data inputs and outputs. In particular, we need registers to hold the data, and
functional units to operate on (e.g., add or divide) the data. Such components are known
as register-transfer level (RTL) components, also known as datapath components, and a
circuit composed of such components is known as a datapath.
Datapaths can become quite complex, and therefore it is crucial to build datapaths
from a set of datapath components that each encapsulate an appropriately high level of
functionality. For example, if you were asked what components make up an automobile,
you would probably list components like an engine, tires, a chassis, a body, and so on.
Each of those components encapsulates a high-level function of the automobile. You
thought of a tire, not of the rubber, steel wires, valve stem, valve, sidewalls, and other
parts that make up the tire. Those detailed parts make up the design of a tire, not an
automobile. A tire is an appropriately high level of component when thinking of a car; a valve
stem is not. Likewise, when we design datapaths, we must have a set of datapath
components at the appropriately high level; logic gates are too low-level.
This chapter defines such a set of datapath components, and also introduces simple
datapaths. In Chapter 5, we'll see how to create more advanced datapaths, and how to
combine datapaths and controllers to build an even higher-level component known as a
processor.
4.2 REGISTERS
An N-bit register is a sequential component able to store N bits. Typical register widths
(the number of bits N) are 8, 16, and 32 bits, though any width is possible. The bits in a
register often represent data, such as 8 bits representing a temperature as a binary number.
The common name used for storing data into a register is loading, although the words
writing and storing are also used. The opposite action of loading a register is known as
reading a register's contents. Reading consists merely of connecting to the register's
outputs; note that reading therefore is not synchronized with the clock, and furthermore,
note that reading does not remove the bits from the register or change them in any way.
Registers come in a variety of styles. We'll introduce some of the most common
styles in this section. Registers are perhaps the most fundamental datapath component, so
we will provide numerous examples of their design and their use.
Parallel load Register
The most basic type of register, shown in Figure 3.30 in Chapter 3, consists of a
set of flip-flops that get loaded on every clock cycle. That basic register is useful as the
state register in a controller, since the state register is loaded on every clock cycle.
However, for most other uses of registers, we want some way to control whether or not a
register gets loaded on a particular clock cycle; on some cycles we want to load,
whereas on other cycles we just want to keep the previous value.
WHY THE NAME "REGISTER"?
Historically, the term "register" referred to a sign or chalkboard onto which people could
temporarily write out cash transactions, and later perform bookkeeping using those
transactions. The term generally refers to a device for storing data. In this context, since
a collection of flip-flops stores data, the name register seems quite appropriate.
Figure 4.1 4-bit parallel load register: (a) internal design, (b) paths when load=0 and
load=1, and (c) register block symbol.
. I I d' g of a reoister by adding a 2x I rnuluplexor In front
We can achieve contra over oa In .0 . 1 d' .
c h 4 b't reoister In F,oure 4.I(a). Whe n the oa sIgnal IS 0
of each flip-flop as shown ,or t e - I 0 " 0 I h . F'
. ' . I fl' fl 0els loaded with its own va ue. as sown m Igure
and the clock signa l rISes, eac l iP- op" ., d
. h fl ' fl . resent contents. the register S conte nts a not change
4 I (b) Because 0 IS t e IP- op s P . . . h fl ' fl
. '1 d ' 0 Wh the load sional is I and tile clock signal nses, eac [P- op gets
when oa IS. en 0 . I d d .
loaded with one of Ihe data inputs 10. I I , 12, or I 3- thus, the regIster gets oa e With
Ihe data inputs whe n loa d is 1. ".
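The effect of the multiplexors can be illustrated with a small behavioral model. This is
only a sketch under stated assumptions: the class ParallelLoadRegister and its method
rising_edge are hypothetical names, and the flip-flops are assumed to start at 0, whereas
real flip-flops power up with unknown contents.

```python
class ParallelLoadRegister:
    """Behavioral model of an N-bit register with a load control input."""

    def __init__(self, width=4):
        self.width = width
        self.q = [0] * width  # assumption: all flip-flops initially hold 0

    def rising_edge(self, load, inputs):
        """Model one rising clock edge and return the register's new contents."""
        if len(inputs) != self.width:
            raise ValueError("expected %d data inputs" % self.width)
        # One 2x1 mux per flip-flop: pass the data input when load is 1,
        # otherwise feed the flip-flop's own Q output back into its D input.
        d = [i if load else q for i, q in zip(inputs, self.q)]
        self.q = d
        return list(self.q)


if __name__ == "__main__":
    reg = ParallelLoadRegister()
    print(reg.rising_edge(1, [1, 0, 1, 1]))  # load = 1: the inputs are stored
    print(reg.rising_edge(0, [0, 0, 0, 0]))  # load = 0: previous contents are kept
```

Note that the model still "loads" the flip-flops on every edge, just as the circuit does;
with load at 0 the value loaded simply happens to be the value already stored.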
A register with a load line that controls whether the register is loaded with inputs,
with all those inputs being loaded in parallel, is known as a parallel load register. Figure
4.1(c) provides a block symbol for a 4-bit parallel load register. A block symbol of a
component shows a component's inputs and outputs, without showing the component's
internal details.
Because registers are such a fundamental component in datapaths, we present a number of
examples involving registers, to ensure the reader gains sufficient comfort with registers.
EXAMPLE 4.1 Basic example using registers
Figure 4.2 shows a simple connection of three registers R0, R1, and R2. Suppose we are told
that the input values on a3..a0 have the values shown in the timing diagram in Figure 4.3(a).
We can then determine the values in registers R0, R1, and R2, as shown in Figure 4.3(b).
Before the first clock edge, we do not know the values in the registers, so we show the
registers' contents as "????". The contents are actually some combination of four 0 and 1
values, but we don't know what those particular values are.
Figure 4.2 Basic register example.
Before the first clock edge, we are given that a3..a0 become 1111. Thus, on the first clock
edge, R0 will be loaded with 1111. At the same moment, R1 and R2 will be loaded with the value
in R0, which is "????", so R1 and R2 will still have contents of "????".
Figure 4.3 Basic register example: (a) timing diagram, and (b) the contents of each register.
Before clock edge 2, we are given that a3..a0 change to 0001. Thus, on the second clock edge, R0 will be loaded with 0001. Simultaneously, R1 will be loaded with the value of R0, which was 1111, and R2 will be loaded with the value of R0 inverted, meaning 0000.
Before the third clock edge, we are given that a3..a0 change to 1010. On the third clock edge, R0 will be loaded with 1010, while simultaneously R1 gets 0001, and R2 gets 1110.
We are given that a3..a0 stay at 1010 before the fourth clock edge. On the fourth edge, R0 again will be loaded with 1010, while simultaneously R1 gets 1010 and R2 gets 0101.
As a3..a0 stay at 1010 before the fifth clock edge, then on the fifth edge, R0 again will be loaded with 1010, while R1 again gets 1010 and R2 again gets 0101.
The important feature to notice in this example is that the R0, R1, and R2 registers all get loaded simultaneously. Thus, even though R0 gets loaded with a new value on a clock edge, R1 and R2 get the previous value, not the new value, on that same clock edge.
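The simultaneous loading can be traced with a short simulation. The sketch below assumes the connections described in the example (R0 takes a3..a0, R1 takes R0, R2 takes R0 inverted); the function names are hypothetical, and unknown contents are modeled as the string "????".

```python
def invert(bits):
    """Bitwise inversion of a bit string; unknown contents stay unknown."""
    if "?" in bits:
        return bits
    return "".join("1" if b == "0" else "0" for b in bits)


def simulate(a_values):
    r0 = r1 = r2 = "????"  # contents are unknown before the first edge
    history = []
    for a in a_values:     # one entry per rising clock edge
        # The right-hand side is evaluated from the OLD contents first,
        # then all three registers update together.
        r0, r1, r2 = a, r0, invert(r0)
        history.append((r0, r1, r2))
    return history


for edge, regs in enumerate(simulate(["1111", "0001", "1010", "1010", "1010"]), 1):
    print(edge, *regs)
```

The single tuple assignment is what mirrors the hardware: R1 and R2 see the value R0 held before the edge, not the value R0 receives on that edge.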
EXAMPLE 4.2 Weight sampler
Consider a scale at a grocery store used to weigh fruit. The scale may have a display that shows the present weight. We want to add a second display, and a button that the user can press to remember the present weight (sometimes called "sampling"), so that when the fruit is removed, the remembered weight continues to be displayed on the second display. A diagram of the system is shown in Figure 4.4.
Assume the scale outputs the present weight as a 4-bit number, and the "Present weight" and "Saved weight" displays automatically convert their binary number to the proper displayed value. We can design the WeightSampler block using a 4-bit parallel load register. We connect the button signal b to the load input of the register. The output connects to the "Saved weight" display. Whenever b is 1, the weight value gets loaded into the register, and thus appears on the second display. When b returns to 0, the register keeps its value, so the second display continues to show the same weight, even if other items are placed on the scale and the first display changes.
Figure 4.4 Weight sampler implemented using a 4-bit parallel load register.
EXAMPLE 4.3 Temperature history display using registers (again)
Recall Example 3.2 of Chapter 3, in which a timer generated a pulse on an input C every hour. We connected that input C to the clock inputs of three registers, and those registers were connected such that the first register would be loaded with the present temperature, the second register would get the previous temperature, and the third register would get the temperature before that one, on the rising edge of C. However, in practice, we typically do not connect any input other than a clock signal (from an oscillator) to a register's clock input. We can therefore redesign the system to use a clock signal as the register clock input, by using parallel load registers. We could then connect the input C to the load inputs of the registers, as shown in Figure 4.5.
The oscillator frequency can be faster than 1 pulse per hour. In fact, due to the nature of how oscillators are made (see "How Does It Work? Quartz Oscillators" on page 102 in Chapter 3), oscillator frequencies are usually at least in the kilohertz range.
Figure 4.5 Internal design of the TemperatureHistoryStorage component, using parallel load registers.
We must ensure that when the timer generates its hourly pulse on C, the pulse is 1 for only one clock cycle. Otherwise, the registers would get loaded more than once during a single pulse (because during that pulse, multiple rising clock edges would occur, and registers get loaded on each rising clock edge), and so the present temperature would get loaded into two or even all three registers. We can accomplish a single-cycle high output by using the same clock as input to the timer, and then designing the timer's internal state machine to only set C=1 for one state, similar to how we set an output to 1 for exactly three states in Example 3.7 in Chapter 3.
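The effect of the pulse width can be seen in a small simulation. This is a sketch under assumptions: the names ra, rb, and rc for the three registers are illustrative, the registers are assumed to start at 0, and each list element corresponds to one rising edge of the fast clock.

```python
def run(temps, c_pulses):
    """temps[i] is the temperature input and c_pulses[i] the value of C at edge i."""
    ra = rb = rc = 0  # assumed initial contents
    for t, c in zip(temps, c_pulses):
        if c:  # load inputs are 1: all three registers load on this edge
            ra, rb, rc = t, ra, rb
    return ra, rb, rc


# C high for exactly one clock cycle: the history advances once.
print(run([20, 20, 20], [0, 1, 0]))  # (20, 0, 0)

# C high for two clock cycles: the same temperature is stored twice.
print(run([20, 20, 20], [0, 1, 1]))  # (20, 20, 0)
```

The second call shows the failure the text warns about: a pulse that spans two rising clock edges pushes the present temperature into two registers.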
EXAMPLE 4.4 Automobile above-mirror display using parallel-load registers
In Chapter 2, we described an example of a system above a rearview mirror that could display one of four 8-bit inputs, T, A, I, and M. In that example, we assumed the car's central computer was connected to the above-mirror system using 32 lines (4*8). Thirty-two wires is a lot of wires to have to connect from the computer to above the rearview mirror. Instead, assume that the computer connects to the above-mirror system using 8 data lines (C), 2 control lines a1a0 that specify which data item presently appears on C (being T when a1a0=00, A when a1a0=01, I when a1a0=10, and M when a1a0=11), and a load control line load, for a total of 11 lines, rather than 32 lines. The computer can send the data items in any order, at any time. The above-mirror system should simply store data items in the appropriate register (according to a1a0) when the data items arrive, and thus the system needs four parallel-load registers in which to store each data item. The control lines a1a0 will therefore serve as the "address" that tells us which register to load. As in the earlier example, inputs xy determine which value to pass through to the 8-bit display output D (with xy sequenced by the user pressing the mode button).
We can design the system as shown in Figure 4.6. The figure uses a popular "shorthand" notation that replaces a group of wires by a single thicker wire having a slanted line and number indicating the number of wires in the group.
Figure 4.6 Above-mirror display design. a1a0, set by the car's central computer, determines which register to load with C, while load=1 enables such loading. xy, which are independent of a1a0 and are set by the user pressing the mode button, determine which register to output to the display D.
The decoder decodes a1a0 to enable exactly one of the four registers. The load line enables the decoder: if load is 0, no decoder output is 1 and so no register gets loaded. The multiplexer part of the system is the same as in the earlier example.
Let's see how this system works for a sample sequence of inputs. Suppose initially that all registers store 0s and xy=00. Thus, the display will show 0. If the user presses the mode button four times, the inputs xy will sequence through 01, 10, 11, and back to 00, for each press still displaying 0 (since all registers are 0s). Now suppose that during some clock cycle, the car's computer sets a1a0=01, load=1, and C=00001010. Then register 1 will be loaded with 00001010. Since xy=00, the display will still show the contents of register 0, and thus the display will show 0. Now, if the user presses the mode button, xy will become 01, and the display will show the decimal value of register 1's 00001010 value, which is ten in decimal. Pressing the mode button again will change xy to 10, and the display will show the contents of register 2, which is 0. At any time in the future, the car's computer can load the other registers, or reload register 1, with new values, in any order. Note that the loading of the registers is independent from the display of those registers.
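The sample sequence above can be replayed with a small behavioral model. This is a sketch, not the circuit of Figure 4.6: the class and method names are hypothetical, the registers are assumed to start at 0, and the decoder and mux are modeled as list operations.

```python
class AboveMirror:
    """Behavioral sketch of the above-mirror system (hypothetical name)."""

    def __init__(self):
        self.regs = [0, 0, 0, 0]  # four 8-bit registers, assumed to start at 0

    def rising_edge(self, a1a0, load, c):
        # Decoder with enable: at most one register's load input is 1.
        enables = [load == 1 and a1a0 == k for k in range(4)]
        self.regs = [c if en else r for en, r in zip(enables, self.regs)]

    def display(self, xy):
        # 4x1 mux: xy picks which register drives the display output D.
        return self.regs[xy]


m = AboveMirror()
m.rising_edge(a1a0=0b01, load=1, c=0b00001010)
print(m.display(0b00), m.display(0b01), m.display(0b10))  # 0 10 0
```

Note that `rising_edge` never looks at xy and `display` never looks at a1a0, which reflects the independence of loading and displaying.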
EXAMPLE 4.5 Computerized checkerboard
Checkers (known in some countries as "draughts") is one of the world's most popular board games. A checkerboard consists of 64 squares, formed from 8 columns and 8 rows. Each player starts with 12 checkers (pieces) on the board. A computerized checkerboard may replace the checkers by using an LED (light-emitting diode) in each square. An on LED represents a checker in a square; an off LED represents no checker. For simplicity of the example, ignore the issue of each player having his own color of checkers. An example board is shown in Figure 4.7(a).
Figure 4.7 An electronic checkerboard: (a) eight 8-bit registers (R7 through R0) can be used to drive the 64 LEDs, using one register per column, and (b) detail of how one register connects to a column's LEDs and how the value 10100010 stored in that register would light three LEDs.
A computerized checkerboard typically has a microprocessor that keeps track of where each piece is located, moves pieces according to user commands or according to a checker-playing program (when playing against the computer), keeps score, etc.
Notice that the microprocessor must set values for 64 bits, one bit for each square. However, the inexpensive type of microprocessor used in such a device does not have 64 pins. The microprocessor needs external registers to store those bits that drive the LEDs, and will write to those registers one at a time. The microprocessor writes to the registers so fast, though, that an observer would probably see all the LEDs change at the same time, not noticing that some LEDs are changing microseconds earlier than others.
Let's use one register per column, meaning we'll need eight registers total, as shown below the checkerboard in Figure 4.7(a), with those registers named R7 through R0. Each of a register's 8 bits corresponds to a particular row in the register's column, indicating whether the respective LED is on or off, as shown in Figure 4.7(b). The eight registers are connected to the microprocessor. The microprocessor uses eight pins (D) for data, three pins (i2, i1, i0) for addressing the appropriate register (which is decoded into a load line for each of the 8 registers), and one pin (e) for the register load line (implemented using the decoder's enable), for a total of 12 pins, a number much more feasible than 64 pins. To configure the checkerboard for the beginning of a game, the microprocessor would create the sequence of register writes shown in Figure 4.8.
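A register write through the decoder can be sketched as follows. The function name is hypothetical and the registers are assumed to start cleared; the two data values are the ones this example gives for the first two writes (R0 and R1), and the remaining columns are left out because their values appear only in the figure.

```python
def write(registers, i, e, d):
    """One clock edge: a 3x8 decoder with enable e selects register i to load d."""
    return [d if (e == 1 and i == k) else r for k, r in enumerate(registers)]


regs = [0] * 8                               # R0..R7, assumed to start cleared
regs = write(regs, i=0, e=1, d=0b10100010)   # first rising edge: R0
regs = write(regs, i=1, e=1, d=0b01000101)   # second rising edge: R1
regs = write(regs, i=5, e=0, d=0b11111111)   # e=0: no register is loaded

pins = 8 + 3 + 1                             # data + address + enable
print(pins, [format(r, "08b") for r in regs[:2]])
```

Each 1 bit in a register corresponds to one lit LED in that column, so `bin(regs[0]).count("1")` gives the three lit LEDs mentioned for the value 10100010.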
Figure 4.8 Timing diagram indicating an input sequence that can be used to initialize the checkerboard.
HOW DOES IT WORK? COMPUTERIZED BOARD GAMES.
Many of you have played a computerized board game, like checkers, backgammon, or chess, either using boards with small displays to represent pieces, or perhaps using a graphics program on a personal computer or website. The main method the computer uses for choosing among possible next moves is called lookahead. For the current configuration of pieces on the board, the computer considers all possible single moves that it might make. For each such move, it might also consider all possible single moves by the opponent. For each new configuration resulting from possible moves, the computer evaluates the configuration's goodness, or quality, and picks a move that may lead to the best configuration. The number of moves that the computer looks ahead (one computer move, one opponent move, another computer move, another opponent move) is called the lookahead amount. Good programs might look ahead three, four, five moves, or more. Looking ahead is costly in terms of compute time and memory: if each player has 10 possible moves per turn, then looking ahead two moves results in 10*10 = 100 configurations to evaluate, three moves in 10*10*10 = 1000 configurations, four moves in 10,000 configurations, and so on. Good game-playing programs will "prune" configurations that appear to be very bad and thus unlikely to be chosen by an opponent, just as humans do, to reduce the configurations to be considered. Computers can examine millions of configurations, whereas humans can only mentally examine perhaps a few dozen. Chess, being perhaps the most complex of popular board games, has attracted extensive attention since the early days of computing. Alan Turing, considered one of the fathers of Computer Science, wrote much about using computers for chess, and is credited as having written the first computer chess program in 1950. However, humans proved better than computer chess programs until 1997, when IBM's Deep Blue computer defeated the reigning world champion in a classic chess match. Deep Blue had 30 IBM RS-6000 SP processors connected to 480 special-purpose chess chips, and could evaluate 200 million moves per second, and hence many billions of moves in a few minutes. Today, chess tournaments not only match humans against computers but also programs against programs, many hosted by the International Computer Games Association.
(Source: Chess History, Bill Wall)
On the first rising clock edge, R0 gets loaded with 10100010. On the second rising clock edge, R1 gets loaded with 01000101. And so on. After eight clock cycles, the registers would contain the desired values, and the board's LEDs would be lit, as shown in Figure 4.9.
Shift Register
One thing we might want to do with a register is shift the register's contents to the left or to the right. Shifting to the right means to move each stored bit one flip-flop to the right. If a 4-bit register originally stores 1101, shifting right would result in 0110, as shown in Figure 4.10(a). We dropped the rightmost bit (in this case a 1), and we shifted a 0 into the leftmost bit. To build a register capable of shifting to the right, we conceptually need to connect the register's flip-flops in a manner similar to that shown in Figure 4.10(b).
Figure 4.9 Checkerboard after loading registers for initial checker positions.
Figure 4.10 Right shift example: (a) sample contents before (1101) and after (0110) a right shift, and (b) bit-by-bit view of the shift.
We can create a register able to shift to the right as shown in Figure 4.11. The register includes two control inputs, shr and shr_in. shr=1 causes a right shift on a rising clock edge, while shr=0 causes the register to maintain its present value. shr_in is the bit that we want to shift into the leftmost register bit during a shift operation.
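The shift behavior is easy to check with a short sketch. The function name is hypothetical, and the register contents are modeled as a list of bits with the leftmost bit first.

```python
def shift_register_edge(q, shr, shr_in):
    """Return the register contents after one rising clock edge."""
    if shr == 1:
        # shr_in enters on the left; the rightmost bit is dropped.
        return [shr_in] + q[:-1]
    return list(q)  # shr=0: maintain the present value


q = [1, 1, 0, 1]
print(shift_register_edge(q, shr=1, shr_in=0))  # [0, 1, 1, 0]
print(shift_register_edge(q, shr=0, shr_in=0))  # [1, 1, 0, 1]
```

The first call reproduces the 1101 to 0110 example; the second shows that shr_in has no effect while shr is 0.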
Figure 4.11 Shift register: (a) implementation, (b) paths when shr=1, and (c) block symbol.
Rotate Register
A rotate register is a slight variation of a shift register in which the outgoing bit gets shifted back in as the incoming bit. So on a right rotate, the rightmost bit gets shifted into the leftmost bit, as seen in Figure 4.12.

Figure 4.12 Right rotate example: (a) register contents before and after the rotate, and (b) bit-by-bit view of the rotate operation.
Implementing a rotate register is achieved by modifying the design of Figure 4.11, feeding the rightmost flip-flop output, rather than the shr_in input, into the leftmost mux's i1 input. A rotate register needs some way to get values into the register, either via a shift or via parallel load.
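A right rotate differs from a right shift only in where the incoming bit comes from. A minimal sketch, with a hypothetical function name and the leftmost bit first in the list:

```python
def rotate_right(q):
    """One right rotate: the rightmost bit re-enters as the leftmost bit."""
    return [q[-1]] + q[:-1]


q = [1, 1, 0, 1]
print(rotate_right(q))  # [1, 1, 1, 0]

# Rotating a 4-bit register four times restores the original contents.
r = q
for _ in range(4):
    r = rotate_right(r)
print(r == q)  # True
```

Because no bit is ever dropped, a rotate by itself can never introduce a new value, which is why the register still needs a shift or parallel load path.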
EXAMPLE 4.6 Above-mirror display using shift registers
In Example 4.4, we redesigned the connection between a car's central computer and an above-mirror display system to reduce the number of wires from 32 down to 8+2+1=11. However, even 11 wires is a lot of wires to have to run from the computer to above the mirror. Let's reduce the wires even further by using shift registers in the above-mirror system. The inputs to the above-mirror system from the car's computer will be one data bit C, two address lines a1a0, and a shift line shift, for a total of only 4 wires. When the computer wants to write to one of the above-mirror system's registers, the computer will set a1a0 appropriately and will then set shift to 1 for exactly eight clock cycles. For each of those eight clock cycles, the computer will set C to one bit of the 8-bit data to be loaded, starting with the least-significant bit on the first clock cycle, and ending with the most-significant bit on the eighth clock cycle. We can thus design the above-mirror system as shown in Figure 4.13.
Figure 4.13 Above-mirror display design using shift registers to reduce the number of lines coming from the car's computer. The computer sets a1a0 to the desired register to load, and then holds shift=1 for eight clock cycles, with C equaling the register contents bit-by-bit, one bit per clock cycle, resulting in the desired register being loaded with the sent 8-bit value. (In the figure, the C line is 1 bit, rather than 8 bits like before.)
HOW DOES IT WORK? COMPUTER COMMUNICATIONS IN AN AUTOMOBILE USING SERIAL DATA TRANSFER.
Modern automobiles contain dozens of computers distributed throughout the car: some under the hood, some in the dash, some above the mirror, some in the door, some in the trunk, etc. Running wires throughout the car so those computers can communicate is a challenge. Thus, most automobile computers communicate serially, meaning one bit at a time, like the communication in Example 4.6, to reduce the number of wires. A popular serial communication scheme in automobiles is known as the "CAN bus," short for Controller Area Network, which is now an international standard defined by ISO (International Organization for Standardization) standard number 11898.
When shift=1, the appropriate register gets a new value shifted in during the next eight clock cycles. This method achieves the same result as a parallel load from eight separate inputs, but utilizes fewer wires.
This example demonstrates a form of communication between digital circuits known as serial communication, in which the circuits communicate data by sending the data one bit at a time.
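The eight-cycle transfer can be sketched as a sender that places bits on C least-significant bit first and a receiver that shifts right on each clock edge. Both function names are hypothetical, and the receiving register is assumed to start at 0.

```python
def send_serial(value, width=8):
    """Bits of value, least-significant first, as placed on C one per cycle."""
    return [(value >> k) & 1 for k in range(width)]


def receive(bits, width=8):
    reg = [0] * width          # leftmost (most-significant) bit first
    for c in bits:             # one rising clock edge per bit while shift=1
        reg = [c] + reg[:-1]   # shift right; C enters the leftmost flip-flop
    return int("".join(map(str, reg)), 2)


print(receive(send_serial(0b00001010)))  # 10
```

Sending the least-significant bit first matters: after eight right shifts, the first bit sent has travelled to the rightmost flip-flop, so the register ends up holding the value in its normal bit order.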
Multifunction Registers
Many registers can perform a variety of operations (also called functions), like load, shift right, shift left, rotate right, rotate left, etc. The register user selects the presently desired operation by setting the register's control inputs. We'll now introduce some multifunction registers.
Register with Parallel Load and Shift Right
A popular combination of operations on a register is that of both parallel load and shift. We can design a 4-bit register capable of parallel load and shift right, the details of which are shown in Figure 4.14(a). Figure 4.14(b) shows a block symbol of the register.
Figure 4.14 4-bit register with parallel load and shift right operations: (a) internal design, and (b) block symbol.
Notice that we used a 4x1 mux, rather than a 2x1 mux, in front of each flip-flop, because each flip-flop can now receive its next bit from one of three locations (the fourth mux input is unused). The register has two control inputs, with the control behavior shown in Figure 4.15.
s1  s0  Operation
0   0   Maintain present value
0   1   Parallel load
1   0   Shift right
1   1   (unused - let's load 0s)
Figure 4.15 Operation table of a 4-bit register with parallel load and shift right operations.
HOW DOES IT WORK? WIRELESS AND USB COMMUNICATION BETWEEN DIGITAL DEVICES.
Serial communication between digital devices, such as between personal computers, laptops, printers, cameras, etc., is ubiquitous. The popular USB interface is a serial communication scheme (USB is short for Universal Serial Bus) used to connect personal computers and other devices together by wire. Furthermore, nearly all wireless communication schemes, such as WiFi and Bluetooth, use serial communication, sending one bit at a time over a radio frequency. While data communication between devices may be serial, computations inside devices are typically done in parallel. Thus, shift registers are commonly used inside circuits to convert internal parallel data into serial data to be sent to another device, and to receive serial data and convert that data into parallel data for internal device use.
Let's examine the mux and flip-flop of the rightmost bit. When s1s0=00, the mux passes the present flip-flop value back to the flip-flop, causing the flip-flop to get reloaded with its present value on the next rising clock, thus maintaining the present value. When s1s0=01, the mux passes the external I0 input to the flip-flop, causing the flip-flop to get loaded. When s1s0=10, the mux passes the present value of the flip-flop output from the left, Q1, thus causing a right shift. s1s0=11 is not a legal input to the register and thus should never occur; the mux passes 0s in this case.
Register with Parallel Load, Shift Left, and Shift Right
Adding a shift left operation to the above 4-bit register is straightforward, and is shown in Figure 4.16. Instead of connecting 0s to the I3 input of each 4x1 mux, we instead connect the output from the flip-flop to the right. The rightmost mux's I3 input would be connected to an additional input shl_in.
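The per-bit mux structure can be modeled directly. The sketch below is a behavioral approximation with a hypothetical function name: each bit position builds its four mux data inputs (own value, external input, left neighbor, right neighbor) and s1s0 selects one of them, using the encoding 00 maintain, 01 parallel load, 10 shift right, 11 shift left.

```python
def next_state(q, s1, s0, i, shr_in=0, shl_in=0):
    """q and i are lists of bits, leftmost first; returns contents after one edge."""
    sel = 2 * s1 + s0
    n = len(q)
    nxt = []
    for k in range(n):
        left = q[k - 1] if k > 0 else shr_in        # source for a right shift
        right = q[k + 1] if k < n - 1 else shl_in   # source for a left shift
        mux_inputs = [q[k], i[k], left, right]      # mux data inputs 0..3
        nxt.append(mux_inputs[sel])
    return nxt


q = [1, 1, 0, 1]
print(next_state(q, 0, 0, [0, 0, 1, 1]))  # [1, 1, 0, 1]  maintain
print(next_state(q, 0, 1, [0, 0, 1, 1]))  # [0, 0, 1, 1]  parallel load
print(next_state(q, 1, 0, [0, 0, 1, 1]))  # [0, 1, 1, 0]  shift right
print(next_state(q, 1, 1, [0, 0, 1, 1]))  # [1, 0, 1, 0]  shift left
```

Adding an operation in this model means adding one more entry to `mux_inputs`, which parallels widening the mux in the hardware design.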
Figure 4.16 4-bit register with parallel load, shift left, and shift right operations: (a) internal design, (b) block symbol.
UNUSED INPUTS.
The example in Figure 4.14 included a mux with 4 inputs, of which we only used 3 inputs. Notice that we actually set the unused input to a particular value, rather than simply leaving the input unconnected. Remember that the input is controlling transistors inside the component: if we don't assign a value to the input, will the internal transistors conduct or not conduct? We don't really know, and so we could get undesired behavior from the mux. Leaving inputs unconnected should not be done. On the other hand, leaving outputs unconnected is no problem: an unconnected output may have a 1 or 0 that simply doesn't control anything else.
The register has the operations shown in Figure 4.17.
s1  s0  Operation
0   0   Maintain present value
0   1   Parallel load
1   0   Shift right
1   1   Shift left
Figure 4.17 Operation table of a 4-bit register with parallel load, shift left, and shift right operations.
Load/Shift Register with Separate Control Inputs for Each Operation
Registers typically don't come with control inputs that encode the operation into the minimum number of bits like the control inputs on the registers we designed above. Instead, each operation usually has its own control input. So a register with the operations of load, shift left, and shift right might have the inputs and operation table shown in Figure 4.18. The four possible operations (maintain, shift left, shift right, and load) really only require two control inputs, but the figure shows that the register has three control inputs: ld, shr, and shl.
ld  shr  shl  Operation
0   0    0    Maintain present value
0   0    1    Shift left
0   1    0    Shift right
0   1    1    Shift right - shr has priority over shl
1   0    0    Parallel load
1   0    1    Parallel load - ld has priority
1   1    0    Parallel load - ld has priority
1   1    1    Parallel load - ld has priority
Figure 4.18 Operation table of a 4-bit register with separate control inputs for parallel load, shift left, and shift right.
Notice that if the user sets more than one control input to 1, we must decide what operation to perform. If the user sets both shr and shl, we'll give priority to shr. If the user asserts ld and either or both of shr and shl, we'll give priority to ld.
The internal design of such a register is similar to the load/shift register designed above, except that the three control inputs of ld, shl, and shr need to be mapped to the two control inputs s1 and s0 of the earlier register, using a simple combinational circuit, as shown in Figure 4.19.
Figure 4.19 A small combinational circuit maps the control inputs ld, shr, and shl to the mux select inputs s1 and s0.
(a)
Inputs        Outputs   Note
ld  shr  shl  s1  s0    Operation
0   0    0    0   0     Maintain value
0   0    1    1   1     Shift left
0   1    0    1   0     Shift right
0   1    1    1   0     Shift right
1   0    0    0   1     Parallel load
1   0    1    0   1     Parallel load
1   1    0    0   1     Parallel load
1   1    1    0   1     Parallel load
(b)
ld  shr  shl  Operation
0   0    0    Maintain value
0   0    1    Shift left
0   1    X    Shift right
1   X    X    Parallel load
Figure 4.20 Truth tables describing operations of a register with left/right shift and parallel load along with the mapping of the register control inputs to the internal 4x1 mux select lines: (a) complete operation table defining the mapping of ld, shr, and shl to s1 and s0, and (b) a compact version of the operation table.
We can design that combinational circuit starting from a simple truth table shown in Figure 4.20(a).
We thus obtain the following equations for the register's combinational circuit:
s1 = ld'*shr'*shl + ld'*shr*shl' + ld'*shr*shl
s0 = ld'*shr'*shl + ld
Replacing the combinational circuit box in Figure 4.19 by the gates described by the above equations would complete the register's design.
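The two equations can be checked exhaustively against the stated priorities (ld over shr over shl). The sketch below uses hypothetical function names; the simplified forms s1 = ld'*(shr + shl) and s0 = ld + shr'*shl are not from the text but follow by Boolean algebra, and the code verifies them as well.

```python
from itertools import product


def select_lines(ld, shr, shl):
    """Sum-of-products equations for the mux select lines, as given above."""
    n = lambda x: 1 - x  # logical NOT on a 0/1 value
    s1 = (n(ld) & n(shr) & shl) | (n(ld) & shr & n(shl)) | (n(ld) & shr & shl)
    s0 = (n(ld) & n(shr) & shl) | ld
    return s1, s0


def by_priority(ld, shr, shl):
    """Reference behavior: ld beats shr, shr beats shl."""
    if ld:
        return (0, 1)  # parallel load
    if shr:
        return (1, 0)  # shift right
    if shl:
        return (1, 1)  # shift left
    return (0, 0)      # maintain present value


for ld, shr, shl in product((0, 1), repeat=3):
    s1, s0 = select_lines(ld, shr, shl)
    assert (s1, s0) == by_priority(ld, shr, shl)
    # Simplified forms (an assumption checked here, not taken from the text).
    assert s1 == (1 - ld) & (shr | shl)
    assert s0 == ld | ((1 - shr) & shl)
    print(ld, shr, shl, "->", s1, s0)
```

With only three control inputs there are eight cases, so checking every row is cheap and catches a mistyped product term immediately.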
Register datasheets typically show the register operation table in a compact form, taking advantage of the priorities among the control inputs, as shown in Figure 4.20(b). A single X in a row means that row is actually two rows in the complete table, with one row having 0 in the position of the X, the other row having 1. Two Xs in a row means that row is actually four rows in the complete table, one row having 00 in the positions of those Xs, another row having 01, another 10, and another 11. And so on for three Xs, representing 8 rows. Note that putting higher priority control inputs to the left in the table keeps the table's operations nicely organized.
Register Design Process
Table 4.1 describes a general process for designing a register with any number of functions.
TABLE 4.1 Four-step process for designing a multifunction register.
Step                             Description
1. Determine mux size            Count the number of operations (don't forget the maintain present value operation!) and add in front of each flip-flop a mux with at least that number of inputs.
2. Create mux operation table    Create an operation table defining the desired operation for each possible value of the mux select lines.
3. Connect mux inputs            For each operation, connect the corresponding mux data input to the appropriate external input or flip-flop output (possibly passing through some logic) to achieve the desired operation.
4. Map control lines             Create a truth table that maps external control lines to the internal mux select lines, with appropriate priorities, and then design the logic to achieve that mapping.
We'll illustrate the register design process with another example.
EXAMPLE 4.7 Register with load, shift, and synchronous clear and set
We want to design a register with the following operations: load, shift left, synchronous clear, and synchronous set, with unique control inputs for each operation (ld, shl, clr, set). The synchronous clear operation on a register means to load all 0s into the register on the next rising clock edge. The synchronous set operation means to load all 1s into the register on the next rising clock edge. The term synchronous is included because some registers come with asynchronous clear or asynchronous set operations. Following the register design method of Table 4.1, we perform the following steps:
Step 1: Determine mux size. There are 5 operations: load, shift left, synchronous clear, synchronous set, and maintain present value. Don't forget the maintain present value operation, as that operation is implicit.
Step 2: Create mux operation table. We'll use the first 5 inputs of an 8x1 mux for the desired 5 operations. For the remaining 3 mux inputs, we'll choose to maintain the present value, though those mux inputs should never be utilized. The table is shown in Figure 4.21.
s2  s1  s0  Operation
0   0   0   Maintain present value
0   0   1   Parallel load
0   1   0   Shift left
0   1   1   Synchronous clear
1   0   0   Synchronous set
1   0   1   Maintain present value
1   1   0   Maintain present value
1   1   1   Maintain present value
Figure 4.21 Operation table for a register with load, shift, and synchronous clear and set.
Step 3: Connect mux inputs. We connect the mux inputs as shown in Figure 4.22, which for simplicity shows only the nth flip-flop and mux of the register.
Figure 4.22 Nth bit-slice of a register with the following operations: maintain present value, parallel load, shift left, synchronous clear, and synchronous set.
Step 4: Map control lines. We'll give clr highest priority, followed by set, ld, and shl, so the register control inputs would be mapped to the 8x1 mux select lines as shown in Figure 4.23.
Inputs              Outputs
clr  set  ld  shl   s2  s1  s0   Operation
0    0    0   0     0   0   0    Maintain present value
0    0    0   1     0   1   0    Shift left
0    0    1   X     0   0   1    Parallel load
0    1    X   X     1   0   0    Set to all 1s
1    X    X   X     0   1   1    Clear to all 0s
Figure 4.23 Truth table for the control lines of a register with the Nth bit-slice shown in Figure 4.22.
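A priority table like Figure 4.23 can be checked by expanding every X and comparing against the mux operation table of Figure 4.21. The sketch below uses hypothetical names (`set_` avoids Python's built-in `set`), and the sum-of-products expressions in `select_lines` are one form consistent with the table, written here as an assumption to be verified rather than as the only possible implementation.

```python
from itertools import product

# Mux select value (s2, s1, s0) -> operation, from the operation table.
OPS = {
    (0, 0, 0): "maintain",
    (0, 0, 1): "load",
    (0, 1, 0): "shift left",
    (0, 1, 1): "clear",
    (1, 0, 0): "set",
}


def select_lines(clr, set_, ld, shl):
    n = lambda x: 1 - x  # logical NOT on a 0/1 value
    s2 = n(clr) & set_
    s1 = (n(clr) & n(set_) & n(ld) & shl) | clr
    s0 = (n(clr) & n(set_) & ld) | clr
    return s2, s1, s0


def by_priority(clr, set_, ld, shl):
    """Reference behavior: clr, then set, then ld, then shl."""
    if clr:
        return "clear"
    if set_:
        return "set"
    if ld:
        return "load"
    if shl:
        return "shift left"
    return "maintain"


# All 16 input combinations, i.e. the table with every X expanded.
for v in product((0, 1), repeat=4):
    assert OPS[select_lines(*v)] == by_priority(*v)
print("all 16 rows agree")
```

The check also confirms that the three unused select values (101, 110, 111) are never produced, since a lookup of any of them in `OPS` would raise a KeyError.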
Looking at each output in Figure 4.23, we derive the equations describing the circuit that maps the external control inputs to the mux select lines as follows:
s2 = clr'*set
s1 = clr'*set'*ld'*shl + clr
s0 = clr'*set'*ld + clr
We could then create a combinational circuit implementing those equations, to map the external register control inputs to the mux select lines, and hence completing the register's design.
Some registers come with asynchronous clear and/or asynchronous set control inputs. Those inputs could be implemented by connecting them to asynchronous clear or asynchronous set inputs that exist on the flip-flops themselves.
4.3 ADDERS
Adding two binary numbers is perhaps the most common operation performed on data in a digital system. An N-bit adder is a datapath component that adds two N-bit binary numbers A and B, and generates an N-bit sum S and a 1-bit carry C. For instance, a 4-bit adder adds two 4-bit numbers, like 0111 and 0001, resulting in a 4-bit sum, like 1000, with a carry of 0. 1111 + 0001 would result in a carry of 1 and a sum of 0000 (or 10000 if you treat the carry bit and sum bits as one 5-bit result). N is often referred to as the width of the adder. Designing fast yet size-efficient adders is a subject that has received considerable attention for many decades.
Although it appears that we could design an N-bit adder by following the combinational logic design process of Table 2.5, it turns out that building an N-bit adder following that process is not very practical when N is much larger than 4. A 4-bit adder has two 4-bit inputs, meaning eight inputs total, and has four sum outputs and a carry output. So we could design the adder using the standard combinational logic design process of Table 2.5. For example, a 2-bit adder, which adds two 2-bit numbers, could be designed by starting with the truth table depicted in Figure 4.24. We could then implement the logic using a two-level logic gate based implementation for each output.
    Inputs       Outputs
a1 a0 b1 b0     c  s1 s0
0  0  0  0      0  0  0
0  0  0  1      0  0  1
0  0  1  0      0  1  0
0  0  1  1      0  1  1
0  1  0  0      0  0  1
0  1  0  1      0  1  0
0  1  1  0      0  1  1
0  1  1  1      1  0  0
1  0  0  0      0  1  0
1  0  0  1      0  1  1
1  0  1  0      1  0  0
1  0  1  1      1  0  1
1  1  0  0      0  1  1
1  1  0  1      1  0  0
1  1  1  0      1  0  1
1  1  1  1      1  1  0
Figure 4.24 Truth table for a 2-bit adder.
The problem with such an approach is that for wider adders, the approach results in truth tables that are too large and in too many gates. A 16-bit adder has 16 + 16 = 32 inputs, meaning the truth table would have over four billion rows. A two-level logic gate based implementation of that table would likely require millions of gates. To illustrate this point, we performed an experiment in which we used the standard combinational logic design process to create adders of increasing width, starting with 1-bit adders on up. We used the most advanced commercial logic design tool available, and asked the tool to create a design using two levels of logic (one level of AND gates feeding into an OR gate for each output) and using the minimum number of gates (actually, transistors).
The plot in Figure 4.25 summarizes our results. Notice how fast the number of transistors grows as the adder width is increased. This fast growth is an effect of exponential growth: for an adder width of N, the number of truth table rows is proportional to 2^N (more precisely, to 2^(N+N)). Clearly, this exponential growth prohibits us from using the standard design process for adders wider than perhaps 8 to 10 bits. We could not complete our experiments for adders larger than 8 bits; the tool simply could not complete the design in a reasonable amount of time. The tool needed 3 seconds to build the 6-bit adder, 40 seconds to build the 7-bit adder, and 30 minutes for the 8-bit adder. The 9-bit adder didn't finish after one full day.

Figure 4.25 Why large adders aren't built using standard two-level combinational logic: notice the exponential growth. How many transistors would a 32-bit adder require? (Plot of number of transistors versus adder width N.)

Looking at this data, can you predict the number of transistors required by a 16-bit adder or a 32-bit adder using two levels of gates? From the figure, it looks like the number of transistors is doubling for each increase in N, with about 1000 transistors for N=5, 2000 transistors for N=6, 4000 transistors for N=7, and 8000 transistors for N=8. Assuming that trend continues for larger adders, then a 16-bit adder would have 8 more doublings beyond the 8-bit adder, meaning multiplying the size of the 8-bit adder by 2^8 = 256. So a 16-bit adder would require 8000 * 256 = about two million transistors. A 32-bit adder would require an additional 16 doublings, meaning multiplying by 2^16 = 64K, so 2 million * 64K = over 100 billion transistors. That's an outrageous number of transistors. We clearly need another approach for designing larger adders.
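The extrapolation above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, assuming (as the text does) that the transistor count doubles with each added bit and starts from about 8000 transistors at N=8; the function name and its parameters are illustrative, not from the text.

```python
def estimated_transistors(width, base_width=8, base_count=8000):
    """Extrapolate the two-level adder size, assuming one doubling per added bit."""
    if width < base_width:
        raise ValueError("extrapolation is only defined for width >= base_width")
    doublings = width - base_width
    return base_count * 2 ** doublings


if __name__ == "__main__":
    for n in (8, 16, 32):
        # 16 bits gives about two million; 32 bits gives over 100 billion.
        print(n, estimated_transistors(n))
```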
Adder-Carry-Ripple Style
An alternative approach to the standard combinational logic design process for adding two binary numbers is to instead create a circuit that mimics how we add binary numbers by hand, which is one column at a time. Consider the addition of a binary number A=1111 (15 in base 10) and B=0110 (6 in base 10), column by column, shown in Figure 4.26.

Figure 4.26 Adding two binary numbers by hand, column by column.
For each column, we add three bits together, and we generate a sum bit for the column and a carry bit for the next column. The first column is an exception in that we only add two bits together, but still generate a sum and a carry bit. The carry of the last column becomes the fifth bit of the sum. The sum is 10101 (21 in base 10).
We can create a combinational component to perform the required addition for a single column. The inputs and outputs of such components are shown in Figure 4.27. Thus, all we need to do is design those components that perform the addition in each column, and connect them together as shown in Figure 4.27 to create a 4-bit adder. Bear in mind, though, that this method of creating an adder is intended to enable efficient design of wider adders, like those with 8 bits and above. We are illustrating the method using only a 4-bit adder because that size adder keeps our figures small and readable, but if all we really needed was a 4-bit adder, the standard combinational logic design process for two-level logic would probably work just fine.
Figure 4.27 Using combinational components to add two binary numbers column by column.
We'll now design the components in each column of Figure 4.27.
Half-Adder
A half-adder is a combinational component that adds two bits (a and b), and generates a sum (s) and carry-out (co) bit. (Note that we did not say that a half-adder adds two 2-bit numbers; a half-adder merely adds two bits.) The component on the right in Figure 4.27 that adds the rightmost column's two bits (a and b) and generates the sum (s) and carry-out (co) bit is a half-adder. We can design a half-adder using the straightforward combinational logic design process from Chapter 2, as follows:
Inputs   Outputs
a  b     co s
0  0     0  0
0  1     0  1
1  0     0  1
1  1     1  0
Figure 4.28 Truth table for a half-adder.
Step 1: Capture the function. We'll use a truth table to capture the function. The appropriate truth table is shown in Figure 4.28.
Step 2: Convert to equations. We can clearly see that co = ab and that s = a'b + ab'. Note that the equation s = a'b + ab' is the same as s = a xor b.
Step 3: Create the circuit. The circuit for a half-adder, implementing the above equations, is shown in Figure 4.29(a). Figure 4.29(b) shows a block symbol of a half-adder.
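The half-adder equations can be checked against ordinary addition with a short Python sketch. The function name half_adder is illustrative; bits are modeled as the integers 0 and 1, with & for AND and ^ for XOR.

```python
def half_adder(a, b):
    """Return (co, s) for two input bits, using co = ab and s = a xor b."""
    co = a & b
    s = a ^ b
    return co, s


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            co, s = half_adder(a, b)
            # The 2-bit value formed by (co, s) equals the arithmetic sum a + b.
            print(a, b, "->", co, s)
```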
Figure 4.29 Half-adder: (a) circuit, and (b) block symbol.

Full-Adder
A full-adder is a combinational component that adds three bits (a, b, and ci) and generates a sum (s) and a carry-out (co) bit. (Note that we did not say that a full-adder adds two 3-bit numbers; a full-adder merely adds three bits.) The three components in Figure 4.27 that add the two bits of a column (a and b) along with the carry from the column on the right (ci) and generate the sum (s) and carry-out (co) bits are full-adders. We can design a full-adder using the straightforward combinational logic design process, as follows:
Step 1: Capture the function. We'll use a truth table to capture the function, shown in Figure 4.30.
Step 2: Convert to equations. We obtain the following equations for co and s. For simplicity, let's write ci as c. We'll use algebraic methods to simplify the equations.
co = a'bc + ab'c + abc' + abc
co = a'bc + abc + ab'c + abc + abc' + abc
co = (a'+a)bc + (b'+b)ac + (c'+c)ab
co = bc + ac + ab
s = a'b'c + a'bc' + ab'c' + abc
s = a'(b'c + bc') + a(b'c' + bc)
s = a'(b xor c) + a(b xor c)'
s = a xor b xor c
Inputs     Outputs
a  b  ci   co s
0  0  0    0  0
0  0  1    0  1
0  1  0    0  1
0  1  1    1  0
1  0  0    0  1
1  0  1    1  0
1  1  0    1  0
1  1  1    1  1
Figure 4.30 Truth table for a full-adder.
During algebraic simplification for co, we noted that each of the first three terms could be combined with the last term abc, as each of the first three terms differed from the last term in just one literal. We thus created three instances of the last term abc (which doesn't change the function) and combined them with each of the first three terms. Don't worry if you aren't able to come up with that simplification on your own right now; Section 6.2 introduces methods to make such simplification more straightforward. If you have read that section, you might try using a K-map (introduced in that section) to simplify the equations.
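A quick way to gain confidence in the simplification is to compare the sum-of-minterms form with the simplified form for all eight input combinations. The Python sketch below does that; the function names are illustrative, and bits are modeled as the integers 0 and 1.

```python
from itertools import product


def full_adder_minterms(a, b, c):
    """Unsimplified equations, read directly from the truth table."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    co = (na & b & c) | (a & nb & c) | (a & b & nc) | (a & b & c)
    s = (na & nb & c) | (na & b & nc) | (a & nb & nc) | (a & b & c)
    return co, s


def full_adder(a, b, c):
    """Simplified equations: co = bc + ac + ab, s = a xor b xor c."""
    co = (b & c) | (a & c) | (a & b)
    s = a ^ b ^ c
    return co, s


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        # Both forms agree, and (co, s) read as a 2-bit number equals a + b + c.
        print(a, b, c, full_adder_minterms(a, b, c), full_adder(a, b, c))
```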
Step 3: Create the circuit. The circuit for a full-adder is shown in Figure 4.31(a), and the full-adder's block symbol is shown in Figure 4.31(b).

Figure 4.31 Full-adder: (a) circuit, and (b) block symbol.

4-Bit Carry-Ripple Adder
Using three full-adders and one half-adder, we can create a 4-bit carry-ripple adder, which adds two 4-bit numbers and generates a 4-bit sum, shown in Figure 4.32. The 4-bit carry-ripple adder also generates a carry-out bit.

Figure 4.32 4-bit adder: (a) carry-ripple implementation with 3 full-adders and 1 half-adder, and (b) block symbol.
We can include a carry-in bit with the 4-bit adder, which enables us to connect 4-bit adders together to build larger adders. We include the carry-in bit by replacing the half-adder (which was in the rightmost bit position) by a full-adder, shown in Figure 4.33.
Figure 4.33 4-bit adder: (a) carry-ripple implementation with 4 full-adders, with a carry-in input, and (b) block symbol.
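The structure of the carry-ripple adder with a carry-in, a chain of full-adders with each carry-out feeding the next carry-in, can be sketched in Python as follows. This is a behavioral model only; the names full_adder and carry_ripple_adder, and the choice to pass operands as integers, are assumptions made for illustration.

```python
def full_adder(a, b, ci):
    """One column: returns (co, s) for bits a, b and carry-in ci."""
    co = (b & ci) | (a & ci) | (a & b)
    s = a ^ b ^ ci
    return co, s


def carry_ripple_adder(a, b, ci=0, width=4):
    """Add two width-bit numbers column by column, rightmost column first."""
    total = 0
    carry = ci
    for i in range(width):
        carry, bit = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= bit << i
    return carry, total  # (carry-out, width-bit sum)


if __name__ == "__main__":
    co, s = carry_ripple_adder(0b1111, 0b0110)
    # 1111 + 0110: carry-out 1 and sum 0101, i.e. 10101 as a 5-bit result.
    print(co, format(s, "04b"))
```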
Let's analyze the behavior of this adder. Suppose that all inputs have been 0s for a long time, meaning that S will be 0000, co will be 0, and all ci values of the full-adders will also be 0. Now suppose that A becomes 0111 and B becomes 0001 at the same time (whose sum we know should be 01000). Those new values of A and B will propagate through the full-adders. Suppose the delay of a full-adder is 2 ns. So 2 ns after A and B change, the sum outputs of the full-adders will change, as shown in Figure 4.34(a). So s3 will become 0+0+0=0 (with co3=0), s2 will become 1+0+0=1 (with co2=0), s1 will become 1+0+0=1 (with co1=0), and s0 will become 1+1=0 (with co0=1). But 0111 + 0001 should not be 00110; instead, the sum should be 01000. What went wrong?
Figure 4.34 Example of adding 0111+0001 using a 4-bit carry-ripple adder: output after (a) 2 ns (1 FA delay), (b) 4 ns (2 FA delays), (c) 6 ns (3 FA delays), and (d) 8 ns (4 FA delays). The output will exhibit temporarily incorrect (spurious) results until the carry bit from the rightmost bit has had a chance to propagate (ripple) all the way through to the leftmost bit.
Nothing went wrong; the carry-ripple adder simply isn't done yet after just 2 ns. After 2 ns, co0 changed from 0 to 1. Now, we must allow time for that new value of co0 to proceed through the next full-adder. Thus, after another 2 ns, s1 will equal 1+0+1=0, and co1 will become 1. So after 4 ns (two full-adder delays), the output will be 00100, as shown in Figure 4.34(b).
(The term "ripple-carry" adder is actually more common. I prefer the term "carry-ripple" for consistent naming with other adder types, like carry-select and carry-lookahead, which we describe in Chapter 6.)
Keep waiting. After a third full-adder delay, the new value of co1 will have propagated through the next full-adder, resulting in s2 becoming 1+0+1=0, with co2 becoming 1. So after three full-adder delays, the output will be 00000, as shown in Figure 4.34(c).
Just a little more patience. After a fourth full-adder delay, co2 has had time to propagate through the last full-adder, resulting in s3 becoming 0+0+1=1, with co3 staying 0. Thus, after four full-adder delays, the output will be 01000, as shown in Figure 4.34(d), and 01000 is the correct result.
To recap, until the carry bits have had time to ripple through all the adders, from right to left, the output was not correct. The intermediate output values are known as spurious values. The delay of the 4-bit adder, meaning the time we must wait until the output is the stable correct value, is equal to the delay of four full-adders, or 8 ns in this case, which is the time for the carry bits to ripple through all the adders; hence the term carry-ripple adder.
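The spurious intermediate outputs described above can be reproduced with a small step-by-step simulation. The sketch below is a simplified timing model, not a circuit simulator: it assumes all four columns are full-adders with the rightmost carry-in tied to 0, that every full-adder has the same delay, and that all carries start at 0. The name ripple_timeline is illustrative.

```python
def full_adder(a, b, ci):
    """Returns (co, s) for one column."""
    return (b & ci) | (a & ci) | (a & b), a ^ b ^ ci


def ripple_timeline(a, b, width=4, fa_delay_ns=2):
    """List (time_ns, output) pairs, one per full-adder delay after A and B change."""
    carries = [0] * width  # carry-outs as they stood one delay earlier
    timeline = []
    for step in range(1, width + 1):
        new_carries, sums = [], []
        for i in range(width):
            ci = 0 if i == 0 else carries[i - 1]  # each stage sees the OLD carry
            co, s = full_adder((a >> i) & 1, (b >> i) & 1, ci)
            new_carries.append(co)
            sums.append(s)
        carries = new_carries
        bits = [carries[-1]] + sums[::-1]  # carry-out, then s3 down to s0
        timeline.append((step * fa_delay_ns, "".join(str(x) for x in bits)))
    return timeline


if __name__ == "__main__":
    for time_ns, output in ripple_timeline(0b0111, 0b0001):
        print(time_ns, "ns:", output)
```

For 0111+0001 the model steps through 00110, 00100, and 00000 before settling on 01000 after four full-adder delays.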
Students often initially confuse full-adders and N-bit adders. A full-adder adds 3 bits. In contrast, a 3-bit adder adds two 3-bit numbers. A full-adder produces one sum bit and one carry bit. In contrast, a 3-bit adder produces three sum bits and one carry bit. A full-adder is usually used to add only one column of two binary numbers, whereas an N-bit adder is used to add two N-bit numbers.
An N-bit adder often comes with a carry-in bit, so that the adder can be cascaded with other N-bit adders to form larger adders. Figure 4.35(a) shows an 8-bit adder built from two 4-bit adders. We would set the carry-in bit (ci) on the right to 0 when adding two 8-bit numbers. Figure 4.35(b) shows a block symbol of that 8-bit adder.
Figure 4.35 8-bit adder: (a) carry-ripple implementation built from two 4-bit carry-ripple adders, and (b) block symbol.
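Cascading works because the carry-out of the low 4-bit adder becomes the carry-in of the high 4-bit adder. The Python sketch below models each 4-bit adder behaviorally (using integer addition rather than gates) to keep the focus on the cascade; the names adder4 and adder8 are illustrative.

```python
def adder4(a, b, ci=0):
    """Behavioral 4-bit adder with carry-in: returns (co, 4-bit sum)."""
    total = (a & 0xF) + (b & 0xF) + (ci & 1)
    return total >> 4, total & 0xF


def adder8(a, b, ci=0):
    """8-bit adder built from two cascaded 4-bit adders."""
    co_low, s_low = adder4(a & 0xF, b & 0xF, ci)              # bits 3..0
    co, s_high = adder4((a >> 4) & 0xF, (b >> 4) & 0xF, co_low)  # bits 7..4
    return co, (s_high << 4) | s_low


if __name__ == "__main__":
    print(adder8(2, 3))      # small values: no carry between the halves
    print(adder8(200, 100))  # sum exceeds 255, so the carry-out is set
```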
EXAMPLE 4.8 DIP-switch-based adding calculator
Let's design a very simple calculator that can add two 8-bit binary numbers and produce an 8-bit result. The input binary numbers will come from two 8-switch DIP switches, and the output will be displayed using 8 LEDs, as illustrated in Figure 4.36. An 8-bit DIP (Dual Inline Package) switch is a simple digital component having switches that a user can by hand move up or down, with up outputting a 1 on the corresponding pin, and down outputting a 0. An LED (light-emitting diode) is just a small light that illuminates when the LED's input is 1, and is dark when the input is 0.
We can implement this calculator by utilizing an 8-bit carry-ripple adder for the CALC block, as shown in Figure 4.36. When a user moves the switches on a DIP switch, the new values propagate through the adder's gates, generating intermittent outputs and hence causing rapid blinking of some of the LEDs, until the values have finally propagated through the entire circuit, at which point the output stabilizes and the LEDs display the correct new sum.

Figure 4.36 8-bit DIP-switch-based adding calculator. The addition 2+3=5 is shown.

If we want to avoid the blinking of the LEDs due to the intermittent values, we can introduce a button e (for "equals") to the system, which indicates when the new value should be displayed. We press e only after having configured both DIP switches to represent the new inputs to be summed. We can utilize the e input with a register, as in Figure 4.37. We connect the e input to the load input of a parallel load register. When a user moves the switches on the DIP switches, new intermittent values appear at the adder outputs, but are blocked at the register's inputs, as the register holds its previous value and hence the LEDs display that previous value. When the e button is pressed, then on the next clock edge the register will be loaded, and the LEDs will then display the new value.
Notice that the displayed value will be correct only if the sum is 255 or less. We could connect co to a ninth LED to display sums from 256 up to 510 (the largest possible sum, 255+255).

Figure 4.37 8-bit DIP-switch-based adding calculator, using a register to block spurious LED outputs. The LEDs only get updated after the button is pressed, which loads the output register.
Delay and Size of a 32-Bit Carry-Ripple Adder
Assuming full-adders are implemented using two levels of gates (ANDs followed by an OR), and that every gate has a delay of 1 ns, let's compute the total delay of a 32-bit carry-ripple adder. Let's also compute the size of such an adder.
To determine the delay, note first that the carry must ripple from the first full-adder to the 32nd full-adder. The delay of the first full-adder is 2 gates * 1 ns/gate = 2 ns. The new carry must now ripple through the second full-adder, resulting in another 2 ns. And so on. Thus, the total delay of the 32-bit carry-ripple adder is 2 ns/full-adder * 32 full-adders = 64 ns.
To determine the size, note that a full-adder requires approximately five gates (we say approximately because the 3-input OR gate in a full-adder requires more transistors than each 2-input AND gate, and the 3-input XOR gate requires even more transistors). Since the 32-bit adder has 32 full-adders, the total size of the 32-bit carry-ripple adder is 5 gates/full-adder * 32 full-adders = 160 gates.
The 32-bit carry-ripple adder has a long delay, but a reasonable number of gates. In Section 6.4, we'll see how to build faster adders, at the expense of using more gates, but still using a reasonable number of gates.

EXAMPLE 4.9 Compensating weight scale using an adder
A scale, such as a bathroom scale, uses a sensor to determine the weight of an object (e.g., a person) on the scale. The sensor's readings for the same object may change over time, due to wear and tear on the sensing system (such as a spring losing elasticity), resulting perhaps in reporting a weight that is a few pounds too low. Thus, the scale may have a knob that the user can turn to compensate for the low reported weight. The knob indicates the amount to add to a given weight before displaying the weight. Suppose that a knob can be set to change an input compensation amount by a value of 0, 1, 2, ..., or 7, as shown in Figure 4.38.
We can implement the system using an 8-bit carry-ripple adder, as shown in the figure. On every rising clock edge, the display register will be loaded with the sum of the currently sensed weight plus the compensation amount.
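The delay and size estimate worked out earlier in this section for the 32-bit carry-ripple adder reduces to two multiplications, which the following Python sketch makes explicit. The default values (1 ns per gate, two gate levels and roughly five gates per full-adder) are the assumptions stated in that analysis; the function name is illustrative.

```python
def carry_ripple_cost(width, gate_delay_ns=1, gate_levels_per_fa=2, gates_per_fa=5):
    """Return (delay in ns, approximate gate count) for a width-bit carry-ripple adder."""
    fa_delay_ns = gate_levels_per_fa * gate_delay_ns
    delay_ns = fa_delay_ns * width  # the carry must ripple through every full-adder
    gates = gates_per_fa * width    # one full-adder per bit position
    return delay_ns, gates


if __name__ == "__main__":
    for n in (4, 8, 32):
        print(n, carry_ripple_cost(n))
```

Both delay and size grow linearly with the width, in contrast to the exponential growth of the two-level design.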
Figure 4.38 Compensating scale: the dial outputs a number from 0 to 7 (000 to 111), which gets added to the sensed weight and then displayed.

4.4 SHIFTERS
Shifting is a common operation applied to data. Shifting can be used to manipulate bits, like when we want to reverse the bits of a number. Shifting is useful for communicating data serially, as was done in Example 4.6.
Shifting is also useful for multiplying or dividing by a factor of 2. In base 10, you are familiar with the idea that multiplying by 10 can be done by simply appending a 0 to a number. For example, 5 times 10 is 50. Appending a 0 is the same as shifting left one position. Likewise, in base 2, multiplying by 2 can be done by appending a 0, meaning by shifting left one position. So 0101 times 2 is 1010. Furthermore, in base 10, multiplying by 100 can be done by appending two 0s, or shifting left twice. So in base 2, multiplying by 4 can be done by shifting left twice. Shifting left three times in base 2 is like multiplying by 8. And so on. And since shifting left is the same as multiplying by 2, shifting right is the same as dividing by 2. So 1010 divided by 2 is 0101.
Although shifting can be done using a shift register, sometimes we find the need to use a separate combinational component that performs the shift, and that can shift by different numbers of positions and in different directions.
Simple Shifters
An N-bit shifter is a combinational component that can shift an N-bit input by some amount to generate an N-bit output.
The simplest shifter shifts one position in one direction. Say we want a shifter that shifts left by 1 position. That simple shifter's design is straightforward, consisting of just wires, as shown for the 4-bit left shifter in Figure 4.39(a). Note that the shifter has an additional input that is the value to shift into the rightmost bit.
Figure 4.39 Combinational shifters: (a) left shifter with block symbol shown at bottom, (b) left shift or pass component, (c) left/right shift or pass component.
A more advanced shifter can either shift one position when an additional input sh is 1, or can pass the inputs through to the outputs unshifted when sh is 0. We can design that shifter using 2x1 muxes, shown in Figure 4.39(b).
An even more advanced shifter can shift left or right one position, shown in Figure 4.39(c). When both shift control inputs are 0, the inputs pass through unchanged. When shL=1, the shifter shifts left, and when shR=1, the shifter shifts right. When both those control inputs are 1, the shifter could be designed to output 0s by connecting 0s to the I3 inputs of the muxes (not shown). Further extensions of the simple shifter are possible, such as allowing shifts of one position or two positions. Such multifunction shifters' internal designs require larger muxes, and mapping of the control signals to the mux select lines, just as was necessary in designing multifunction registers.
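The behavior of the left/right shift or pass component can be sketched in Python as below. This models only the input/output behavior, not the mux wiring. The parameter names in_l and in_r (the bits shifted in at the left and right ends) are hypothetical, and outputting 0s when both controls are 1 follows the design option mentioned in the text.

```python
def shifter(i, sh_l, sh_r, in_l=0, in_r=0, width=4):
    """Shift i left or right by one position, or pass it through unchanged."""
    mask = (1 << width) - 1
    i &= mask
    if sh_l and sh_r:
        return 0                                   # both controls set: output 0s
    if sh_l:
        return ((i << 1) | (in_r & 1)) & mask      # in_r enters the rightmost bit
    if sh_r:
        return (i >> 1) | ((in_l & 1) << (width - 1))  # in_l enters the leftmost bit
    return i                                       # neither control set: pass


if __name__ == "__main__":
    print(format(shifter(0b0101, 1, 0), "04b"))  # left shift doubles 0101
    print(format(shifter(0b1010, 0, 1), "04b"))  # right shift halves 1010
```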
EXAMPLE 4.10 Approximate Celsius to Fahrenheit converter using a shifter
We are given a digital thermometer that digitizes a temperature in Celsius into an 8-bit binary number C. So 30 degrees Celsius would be digitized as 00011110. We want to convert that temperature to Fahrenheit, again using 8 bits. The equation for converting is:
F = C*9/5 + 32
Let's assume that we are not concerned about accuracy, so we'll replace the equation by a simpler one:
F = C*2 + 32
We can design the converter straightforwardly using a left shifter (with a shift in value of 0) to compute C*2, and then an adder to add 32 (00100000), as in Figure 4.40.
Figure 4.40 Celsius to Fahrenheit converter.
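The converter's datapath, a left shift followed by an addition, can be sketched in Python as follows. The sketch assumes 8-bit values that wrap on overflow; the function names are illustrative. An exact conversion is included only for comparison.

```python
def shift_left_1(value, shift_in=0, width=8):
    """Left shifter: shift one position, placing shift_in in the rightmost bit."""
    mask = (1 << width) - 1
    return ((value << 1) | (shift_in & 1)) & mask


def approx_fahrenheit(c):
    """Approximate F = C*2 + 32 using a left shift and an 8-bit add."""
    return (shift_left_1(c) + 32) & 0xFF


def exact_fahrenheit(c):
    """Exact conversion F = C*9/5 + 32, for comparison only."""
    return c * 9 / 5 + 32


if __name__ == "__main__":
    c = 0b00011110  # 30 degrees Celsius
    print(approx_fahrenheit(c), exact_fahrenheit(c))
```

For 30 degrees Celsius the approximation gives 92 where the exact value is 86, which illustrates the accuracy given up for the simpler circuit.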
FAHRENHEIT VERSUS CELSIUS
The U.S. represents temperature using Fahrenheit, whereas most of the world uses the metric system's Celsius. Presidents and other U.S. leaders have desired to switch to the metric system for almost as long as the U.S. has existed, and several acts have been passed over the centuries, the most recent being the Metric Conversion Act of 1975 (amended several times since). The Act designates the metric system as the preferred system of weights and measures for U.S. trade and commerce. Yet switching to metric has been slow, and few Americans today are comfortable with metric. The problem with such a slow transition was poignantly demonstrated in 1999 when the Mars Climate Orbiter, costing several hundred million dollars, was destroyed when entering the Mars atmosphere too quickly. The reason: "a navigation error resulted from some spacecraft commands being sent in English units instead of being converted to metric units." (Source: www.nasa.gov). Perhaps if all readers of this book in the U.S. use Celsius when they talk, we'll help speed up the transition? So instead of saying "It's a warm ninety degrees outside today," say "It's a warm thirty-two degrees outside today." Actually, we might say "It's a warm three ten and two degrees outside today" (remember correct counting in Chapter 1?).

EXAMPLE 4.11 Temperature averager
Recall Example 4.3, in which registers were used to save a history of temperature values over the last three clock periods. We want to extend this system to save the last four values instead of three. We also want the system to compute the average of the last four values and output that average on an output Tavg. The average of four values Ra, Rb, Rc, and Rd is (Ra+Rb+Rc+Rd)/4. Note that dividing by 4 is the same as shifting right by two. Thus, we can design the system using a right shifter that shifts by two places (with a shift in value of 0), as shown in Figure 4.41.
Figure 4.41 Temperature averager using a right shifter to divide by 4.
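The averaging computation, a sum followed by a right shift by two, can be sketched in Python as below. The sketch assumes the intermediate sum keeps all of its bits (four 8-bit values can sum to a 10-bit number); how the adders in the figure handle that width is not shown here. The function name is illustrative.

```python
def average_of_four(ra, rb, rc, rd):
    """Average four values by summing and shifting right two places."""
    total = ra + rb + rc + rd  # may need up to two bits more than the inputs
    return total >> 2          # shifting right by two divides by 4, dropping the remainder


if __name__ == "__main__":
    print(average_of_four(20, 21, 22, 23))  # the fractional part is discarded
```

Because the shifted-out bits are discarded, the result is the average rounded down.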
Barrel Shifter
An N-bit barrel shifter is a general purpose N-bit shifter that can shift or rotate any number of positions. For simplicity, let's consider only left shifts for the moment. An 8-bit barrel shifter can shift left by 1 position, 2 positions, 3 positions, 4 positions, 5 positions, 6 positions, or 7 positions (and of course 0 positions, meaning no shift is done). An 8-bit barrel shifter therefore requires 3 control inputs, say x, y, and z, to specify the distance of the shift. xyz=000 may mean no shift, xyz=001 shift by 1 position, xyz=010 shift by 2 positions, etc.
We could design such a barrel shifter by placing an 8x1 mux in front of each of the 8 shifter outputs, connecting xyz to each of the eight muxes' select inputs, and then connecting the mux inputs with the appropriate shifter inputs for each configuration of x, y, and z. So I0 (corresponding to xyz=000, meaning no shift) of each mux would just get the present bit's shifter input. I1 (corresponding to xyz=001, meaning left shift by one position) would get the shifter input one position to the right. I2 (xyz=010, meaning left shift by two positions) would get the shifter input two positions to the right. And so on.
Such a design, while conceptually straightforward, has too many wires being routed about. And the design does not scale well to larger bit widths, such as a 32-bit barrel shifter: a 32x1 multiplexor cannot be built with two levels of gates (AND/OR), because gates with 32 inputs are too big to be implemented efficiently, and must instead be implemented using multiple levels of smaller gates.
A more elegant design for an 8-bit barrel shifter consists of 3 cascaded simple shifters, as shown in Figure 4.42. The first simple shifter can shift left four positions (or none), the second can shift left by two positions (or none), and the third by one position (or none). Notice that the shifts "add" to one another: shifting left by two, then by one, results in a total shift of three positions. Thus, by configuring each shifter appropriately, we can obtain a total shift of any amount between zero and seven. Connecting the control inputs xyz to the shifters is easy: just think of xyz as a binary number representing the amount of the shift, where x represents shifting by four, y shifting by two, and z shifting by one. So we just connect x to the left-by-four shifter, y to the left-by-two shifter, and z to the left-by-one shifter.
Figure 4.42 8-bit barrel shifter (left shift only).
The above design considered a barrel shifter that could only shift left. We can easily extend the barrel shifter to support both left and right shifts. We would replace the internal left shifters by shifters that could shift left or right, thus each having a control input indicating the direction. The barrel shifter would then have a direction control input also, connected to each internal shifter's direction control input.
Finally, we can easily extend the barrel shifter to support shifts and rotates. We would replace the internal shifters by rotators that could either shift or rotate, thus each having a control input indicating whether to shift or rotate. The barrel shifter would then have a shift-or-rotate control input also, connected to each internal shifter's shift-or-rotate control input.
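The cascaded left-shift-only barrel shifter can be modeled in Python to confirm that the three stages together cover every shift amount from zero to seven. This is a behavioral sketch; the function names are illustrative, and zeros are assumed to be shifted in on the right.

```python
def shift_left_or_pass(value, amount, enable, width=8):
    """Simple shifter stage: shift left by a fixed amount if enabled, else pass."""
    mask = (1 << width) - 1
    return (value << amount) & mask if enable else value & mask


def barrel_shift_left(value, x, y, z, width=8):
    """Three cascaded stages: x selects shift by four, y by two, z by one."""
    after_four = shift_left_or_pass(value, 4, x, width)
    after_two = shift_left_or_pass(after_four, 2, y, width)
    return shift_left_or_pass(after_two, 1, z, width)


if __name__ == "__main__":
    # xyz = 011 is binary 3: shift by two, then by one, for a total of three.
    print(format(barrel_shift_left(0b00000101, 0, 1, 1), "08b"))
```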

4.5 COMPARATORS
We often want to compare two binary numbers to see if they are equal, or if one is greater than the other. For example, we might want to sound an alarm if a thermometer measuring human body temperature reports a temperature greater than 103 degrees Fahrenheit (39.4 degrees Celsius). Comparator components perform such comparison of binary numbers.
Equality (Identity) Comparator
An N-bit equality comparator (sometimes called an identity comparator) is a datapath component that compares two N-bit inputs A and B, setting an output control signal to 1 if the two inputs are equal. Two N-bit inputs, say two 4-bit inputs A=a3a2a1a0 and B=b3b2b1b0, are equal if each of their corresponding bit pairs are equal. So A=B if a3=b3, a2=b2, a1=b1, and a0=b0.
Following the combinational logic design process of Table 2.5, we can start by capturing the function of a 4-bit equality comparator as an equation:
eq = (a3b3+a3'b3') * (a2b2+a2'b2') * (a1b1+a1'b1') * (a0b0+a0'b0')
Each term detects if the corresponding bits are equal, namely, if both bits are 1 or both bits are 0. The expressions inside each of the parentheses represent the behavior of an XNOR gate (recall from Chapter 2 that an XNOR gate outputs 1 if the gate's two input bits are equal), so we can replace the above equation by the equivalent equation:
eq = (a3 xnor b3) * (a2 xnor b2) * (a1 xnor b1) * (a0 xnor b0)
We convert the equation to the circuit in Figure 4.43.
Figure 4.43 Equality comparator: (a) internal design, and (b) block symbol.
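The XNOR-based equation translates directly into a short Python sketch: one XNOR per bit pair, with the results ANDed together. The function names are illustrative, and the width parameter is an assumption added to show how the design scales.

```python
def xnor(a, b):
    """XNOR of two bits: 1 when the bits are equal."""
    return 1 - (a ^ b)


def equality_comparator(a, b, width=4):
    """eq = (a3 xnor b3) * (a2 xnor b2) * ... * (a0 xnor b0)."""
    eq = 1
    for i in range(width):
        eq &= xnor((a >> i) & 1, (b >> i) & 1)
    return eq


if __name__ == "__main__":
    print(equality_comparator(0b1011, 0b1011), equality_comparator(0b1011, 0b1001))
```

Widening the comparator by one bit adds one more loop iteration, mirroring the one extra XNOR gate in hardware.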
Of course, we could have built the comparator starting with a truth table, but that would be cumbersome for a large comparator, with too many rows in the truth table to easily work with by hand. A truth table approach enumerates all the possible situations for which all the bits are equal, since only those situations would have a 1 in the column for the output eq. For two 4-bit numbers, one such situation will be 0000=0000. Another will be 0001=0001. Clearly, there will be as many situations as there are 4-bit binary numbers, meaning there will be 2^4 = 16 situations where both are equal. For two 8-bit numbers, there will be 256 equal situations. For two 32-bit numbers, there will be four billion equal situations. A comparator built with such an approach will be large if we don't minimize the equation, and that minimization will be hard with such large numbers of terms. Our XNOR-based design looks to be much simpler and scales to wide inputs wonderfully: widening the inputs by one more bit involves merely adding one more gate.
Magnitude Comparator- Carry-Ripple Style
An N-bit magnitude comparator is a datapath component that compares two N-bit binary numbers A and B, and indicates whether A>B, A=B, or A<B.
We have already seen several times that designing certain datapath components by starting with a truth table involves too large of a truth table. Let's instead design a magnitude comparator by considering how we compare numbers by hand. Consider comparing two 4-bit numbers A=a3a2a1a0=1011 and B=b3b2b1b0=1001. We start by looking at the high-order bits of A and B, namely, a3 and b3. Since they are equal (both are 1), we look at the next pair of bits, a2 and b2. Again, since they are equal (both are 0), we look at the next pair of bits, a1 and b1. Since a1>b1 (1>0), we conclude that A>B.
Thus, comparing two binary numbers takes place by comparing from the high bit-pairs down to the low bit-pairs. As long as bit-pairs are equal, we need to compare the next lower bit-pair. As soon as a bit-pair is different, we conclude that A>B if ai=1 and bi=0, or that A<B if bi=1 and ai=0. We can thus design a magnitude comparator using the structure shown in Figure 4.44.
Figure 4.44 4-bit magnitude comparator: (a) internal design using identical components in each stage, and (b) block symbol.
Each stage works as follows. If in_gt=1 (meaning a higher stage determined A>B), this stage need not compare bits, and just sets out_gt=1. Likewise, if in_lt=1 (meaning a higher stage determined A<B), this stage just sets out_lt=1. If in_eq=1 (meaning higher stages were all equal), this stage must compare bits, setting out_gt=1 if a=1 and b=0, setting out_lt=1 if a=0 and b=1, and setting out_eq=1 if a and b both equal 1 or both equal 0.
We could capture the function of a stage's block using a truth table with 5 inputs. For brevity, though, we simply use the following equations derived from the earlier explanation of how each stage works; the circuit for each stage would follow directly from these equations:
out_gt = in_gt + (in_eq * a * b')
out_lt = in_lt + (in_eq * a' * b)
out_eq = in_eq * (a XNOR b)
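The stage equations can be checked with a small behavioral model. The following Python sketch is only an illustration, not part of the circuit design; the function names `compare_stage` and `magnitude_compare` are hypothetical, and bits are assumed to be passed as the integers 0 and 1, most significant bit first.

```python
def compare_stage(in_gt, in_eq, in_lt, a, b):
    """One comparator stage; every argument and result is a single bit (0 or 1)."""
    out_gt = in_gt | (in_eq & a & (1 - b))    # in_gt + (in_eq * a * b')
    out_lt = in_lt | (in_eq & (1 - a) & b)    # in_lt + (in_eq * a' * b)
    out_eq = in_eq & (1 - (a ^ b))            # in_eq * (a XNOR b)
    return out_gt, out_eq, out_lt


def magnitude_compare(a_bits, b_bits):
    """Ripple the comparison through the stages, high-order bit pair first.

    a_bits and b_bits are equal-length lists of bits, most significant first.
    Returns the tuple (AgtB, AeqB, AltB).
    """
    gt, eq, lt = 0, 1, 0    # external inputs Igt=0, Ieq=1, Ilt=0 force a comparison
    for a, b in zip(a_bits, b_bits):
        gt, eq, lt = compare_stage(gt, eq, lt, a, b)
    return gt, eq, lt
```

Calling `magnitude_compare([1, 0, 1, 1], [1, 0, 0, 1])` models the comparison of A=1011 with B=1001 from the by-hand example.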
Figure 4.45 The "rippling" within a magnitude comparator.
Figure 4.45 shows how this comparator works for an input of A=1011 and B=1001. We can view the comparator's behavior as consisting of four stages.
In Stage3, shown in Figure 4.45(a), we start by setting the external input Ieq=1, to force the comparator to actually do the comparison. Stage3 has in_eq=1, and since a3=1 and b3=1, then out_eq will become 1, while out_gt and out_lt will become 0.
In Stage2, shown in Figure 4.45(b), we see that since out_eq of Stage3 connects to in_eq of Stage2, then Stage2's in_eq will be 1. Since a2=0 and b2=0, out_eq will become 1, while out_gt and out_lt will be 0.
In Stage1, shown in Figure 4.45(c), we see that since Stage2's out_eq is connected to Stage1's in_eq, Stage1's in_eq will be 1. Since a1=1 and b1=0, out_gt will become 1, while out_eq and out_lt will be 0.
In Stage0, shown in Figure 4.45(d), we see that the outputs of Stage1 cause Stage0's in_gt to become 1, which directly causes Stage0's out_gt to become 1 and causes out_eq and out_lt to be 0. Notice that the values of a0 and b0 are irrelevant. Since Stage0's outputs connect to the comparator's external outputs, AgtB will be 1, while AeqB and AltB will be 0.
Because of the way the result ripples through the stages in a manner similar to a carry-ripple adder, a magnitude comparator built this way is often referred to as having a carry-ripple style implementation, even though what's rippling is not really a carry bit.
The 4-bit magnitude comparator can be connected straightforwardly with another 4-bit magnitude comparator to build an 8-bit magnitude comparator, and likewise to build any size comparator, simply by connecting the comparison outputs of one comparator (AgtB, AeqB, AltB) with the comparison inputs of the next comparator (Igt, Ieq, Ilt).
If each stage is built from two levels of logic, and a gate has a 1 ns delay, then each stage will have a 2 ns delay. So the delay of a carry-ripple style 4-bit magnitude comparator is 4 stages * 2 ns/stage = 8 ns. A 32-bit comparator built with this style will have a delay of 32 stages * 2 ns/stage = 64 ns.
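As a rough illustration of the cascading and delay arguments, the following Python sketch models a 4-bit comparator block behaviorally and chains two of them into an 8-bit comparator. The names `comparator_4bit`, `comparator_8bit`, and `ripple_delay_ns` are hypothetical, and the model assumes at most one of Igt, Ieq, Ilt is 1.

```python
def comparator_4bit(igt, ieq, ilt, a, b):
    """Behavioral model of a 4-bit comparator block with comparison inputs.

    a and b are integers in the range 0..15. Returns (AgtB, AeqB, AltB).
    """
    if igt:                 # a higher-order block already decided A>B
        return (1, 0, 0)
    if ilt:                 # a higher-order block already decided A<B
        return (0, 0, 1)
    if not ieq:             # no comparison requested: all outputs stay 0
        return (0, 0, 0)
    return (int(a > b), int(a == b), int(a < b))


def comparator_8bit(a, b):
    """Two 4-bit blocks: the high block's outputs feed the low block's inputs."""
    high = comparator_4bit(0, 1, 0, (a >> 4) & 0xF, (b >> 4) & 0xF)
    return comparator_4bit(*high, a & 0xF, b & 0xF)


def ripple_delay_ns(stages, gate_delay_ns=1, levels_per_stage=2):
    """Worst-case delay when every stage must wait for the stage before it."""
    return stages * levels_per_stage * gate_delay_ns
```

With the default parameters, `ripple_delay_ns(4)` and `ripple_delay_ns(32)` reproduce the 8 ns and 64 ns figures computed above.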
EXAMPLE 4.12 Computing the minimum of two numbers using a comparator
We want to design a combinational component that takes two 8-bit inputs A and B, and outputs an 8-bit output C that is the minimum of A and B. We can use a magnitude comparator and an 8-bit 2x1 multiplexor to implement this component, as shown in Figure 4.46.
Figure 4.46 A combinational component to compute the minimum of two numbers: (a) internal design using a magnitude comparator, and (b) block symbol.
If A<B, the comparator's AltB output will be 1. In this case, we want to pass A through the mux, so we connect AltB to the 8-bit 2x1 mux's select input, and A to the mux's I1 input. If AltB is 0, then either AgtB=1 or AeqB=1. If AgtB=1, we want to pass B. If AeqB=1, we can pass either A or B (since they are identical), and so let's pass B. We thus simply connect B to the I0 input of the 8-bit 2x1 mux. In other words, if A<B, we'll pass A, and if A is not less than B, we'll pass B.
Notice that we set the comparator's Ieq control input to 1, and the Igt and Ilt inputs to 0. These values force the comparator to compare its data inputs.
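The wiring of this example can be mirrored in a few lines of Python. This is a behavioral sketch only; `mux_2x1` and `min_component` are hypothetical names, and the comparator is reduced to the single output (AltB) that the design actually uses.

```python
def mux_2x1(i0, i1, s):
    """2x1 multiplexor: passes i1 when select s is 1, otherwise i0."""
    return i1 if s else i0


def min_component(a, b):
    """Minimum of two 8-bit values using a comparator output and a mux."""
    a &= 0xFF                     # keep the inputs within 8 bits
    b &= 0xFF
    altb = int(a < b)             # comparator's AltB output (Ieq=1, Igt=Ilt=0)
    return mux_2x1(i0=b, i1=a, s=altb)   # A on I1, B on I0, AltB on select
```

Because B sits on the I0 input, the equal case passes B, matching the choice made in the text.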
4.6 COUNTERS
Up-Counter
An N-bit counter is an extended N-bit register component that can increment or decrement its own value on each clock cycle, when a count enable control input is 1. Increment means to add 1, while decrement means to subtract 1. A counter that can increment is known as an up-counter, a counter that can decrement is known as a down-counter, and a counter that can increment and decrement is known as an up/down-counter. A 4-bit up-counter would thus count the following sequence: 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111, 0000, 0001, etc. Notice that a counter wraps around (also known as rolling over) from the highest value (1111) to 0. Likewise, a down-counter would wrap around from 0 to the highest value. A control output on the counter, often called terminal count, or tc, becomes 1 during the clock cycle that the counter has reached its last (terminal) count value, after which the counter will roll over.
Figure 4.47 shows the block symbol of a 4-bit up-counter. When cnt=1, the counter increments its own value on every clock cycle. When cnt=0, the counter maintains its present value. On the cycle that the counter rolls over from 1111 to 0000, the counter sets tc=1 for that cycle, returning tc to 0 on the next cycle.
Figure 4.47 4-bit up-counter block symbol.
We can design an N-bit up-counter using the register design process described in Table 4.1: the incremented value of the register would be fed into a mux input, and the counter's control lines would be mapped to the mux select lines. A simpler view of an up-counter design is shown in Figure 4.48, assuming an incrementer component exists to add 1 to the present value. When cnt=0, the register should maintain its present value. When cnt=1, the register should be loaded with its present value plus 1. Note that the 4-input AND gate causes terminal count tc to become 1 when the counter reaches 1111.
Figure 4.48 4-bit up-counter internal design.
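To make the register, incrementer, and AND-gate roles concrete, here is a minimal behavioral sketch in Python. The class name `UpCounter4` and its `clock` method are hypothetical; each call to `clock` stands for one rising clock edge.

```python
class UpCounter4:
    """Behavioral model of a 4-bit up-counter: register + incrementer + AND gate."""

    def __init__(self, value=0):
        self.value = value & 0b1111        # contents of the 4-bit register

    @property
    def tc(self):
        # The 4-input AND gate: 1 only when all four register bits are 1.
        return int(self.value == 0b1111)

    def clock(self, cnt):
        """Apply one rising clock edge with count-enable input cnt."""
        if cnt:
            self.value = (self.value + 1) & 0b1111   # load present value plus 1
        # cnt == 0: the register keeps its present value
        return self.value
```

Starting from 0000 and clocking with cnt=1, the model reaches 1111 after 15 edges (where tc is 1) and wraps to 0000 on the next edge.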
Incrementer
We need to design a combinational circuit for the incrementer. We could simply use an N-bit adder, by setting the B input to 0001 and the carry-in to 0. But using an N-bit adder is overkill; we don't need all the logic involved in an N-bit adder, because B is always just 0001. Instead, observe in Figure 4.49 that adding 1 to a binary number involves only two bits per column, not three bits per column like when adding two general binary numbers. Recall that a half-adder adds two bits (see Section 4.3). Thus, a simple incrementer could be built using half-adders, as shown in Figure 4.50.
Figure 4.50 4-bit incrementer: (a) internal design, and (b) block symbol.
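A half-adder chain of this kind can be sketched in Python as follows. This is an illustrative model under stated assumptions: the function names are hypothetical, bits are listed least significant first, and the constant 1 being added enters the first half-adder in place of a second operand bit.

```python
def half_adder(a, b):
    """Add two bits; returns (carry, sum)."""
    return a & b, a ^ b


def incrementer(bits):
    """Add 1 to a binary number using one half-adder per bit.

    bits is a list of bits, least significant first.
    Returns (carry_out, sum_bits), with sum_bits also least significant first.
    """
    carry = 1                      # the "1" being added enters the first column
    sum_bits = []
    for a in bits:
        carry, s = half_adder(a, carry)
        sum_bits.append(s)
    return carry, sum_bits
```

For example, `incrementer([1, 1, 0, 0])` models adding 1 to 0011, the case drawn in Figure 4.49.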
We could instead design an incrementer using the combinational logic design process from Chapter 2. We would start with a truth table, shown in Figure 4.51. We obtain each output row simply by adding 1 to the corresponding input row binary number. We would then derive an equation for each output. For example, we can easily see that the equation for co is co=a3a2a1a0. We can also easily see that s0=a0'. We would derive equations for the remaining outputs, and then implement the circuit for each output. The resulting incrementer would have a total delay of only two gate levels, which is less delay than the incrementer in Figure 4.50 built from half-adders.
carries:  0 1 1
          0 0 1 1
        +       1
        ---------
        0 0 1 0 0
Figure 4.49 Adding 1 to a binary number requires only 2 bits per column.

    Inputs          Outputs
a3 a2 a1 a0     co s3 s2 s1 s0
 0  0  0  0      0  0  0  0  1
 0  0  0  1      0  0  0  1  0
 0  0  1  0      0  0  0  1  1
 0  0  1  1      0  0  1  0  0
 0  1  0  0      0  0  1  0  1
 0  1  0  1      0  0  1  1  0
 0  1  1  0      0  0  1  1  1
 0  1  1  1      0  1  0  0  0
 1  0  0  0      0  1  0  0  1
 1  0  0  1      0  1  0  1  0
 1  0  1  0      0  1  0  1  1
 1  0  1  1      0  1  1  0  0
 1  1  0  0      0  1  1  0  1
 1  1  0  1      0  1  1  1  0
 1  1  1  0      0  1  1  1  1
 1  1  1  1      1  0  0  0  0
Figure 4.51 Truth table for four-bit incrementer.
We could use the same combinational logic design process to build larger incrementers. Recall that we said in Section 4.3 that building adders using the combinational logic design process was not very practical. Yet here we built an incrementer using the combinational logic design process. A key difference to note is that a 4-bit adder has 8 inputs, whereas a 4-bit incrementer has only 4 inputs. Thus, we can build wider incrementers as two-level logic implementations using the combinational logic design process. Of course, at some point, even the number of inputs for an incrementer gets too large, in which case we might chain smaller incrementers together to make a wider incrementer.
EXAMPLE 4.13 Up-counter used in the above-mirror display
In Example 4.4 and Example 4.6, we assumed pressing a mode button would cause inputs xy to sequence from 00, 01, 10, 11, and back to 00 again. A simple design to achieve such sequencing, assuming the mode input is 1 for exactly one clock cycle per button press (see Example 3.9), utilizes an up-counter, as shown in Figure 4.52.
Figure 4.52 Sequencer for xy inputs of above-mirror display.
EXAMPLE 4.14 1 Hz pulse generator using a 256 Hz oscillator
Suppose we have a 256 Hz oscillator, but we want a 1 Hz pulse signal. We can convert the 256 Hz signal to a 1 Hz signal p using an 8-bit counter. The 8-bit counter wraps around every 256 cycles, so we can simply connect the oscillator signal to the counter's clock input, set the counter's count enable input cnt to 1, and then use the counter's tc output as the pulse signal, as shown in Figure 4.53. A 1 Hz signal may be useful for driving a clock or a watch, for example, since 1 Hz means 1 pulse per second.
Figure 4.53 Clock divider.
Down-Counter
A down-counter can be designed similarly to an up-counter, replacing the incrementer by a decrementer, as shown for the 4-bit down-counter in Figure 4.54. A decrementer could be designed in a similar manner as an incrementer, starting from a truth table like that in Figure 4.51. Note that the terminal count tc becomes 1 when the down-counter reaches 0000, implemented using a NOR gate; recall that NOR outputs 1 when all its inputs are 0. The reason the down-counter detects 0000 for tc, rather than 1111 like the up-counter, is because a down-counter wraps around after 0000, as in the following count sequence: 0100, 0011, 0010, 0001, 0000, 1111, 1110, and so on.
Figure 4.54 4-bit down-counter design.
Up/Down-Counter
An up/down-counter can count either up or down. It requires an input signal dir to indicate the count direction, in addition to the count enable signal cnt. We'll let dir=0 mean to count up and dir=1 mean to count down. Figure 4.55 shows the design of such a 4-bit up/down-counter with synchronous clear. A 2x1 mux passes either the decremented or incremented value, with dir selecting among the two: dir=0 (count up) passes the incremented value and dir=1 (count down) passes the decremented value. The passed value gets loaded into the 4-bit register if cnt=1. dir also selects whether to pass the NOR or AND output to the terminal count tc external output: dir=0 (count up) selects the AND, while dir=1 (count down) selects the NOR.
Figure 4.55 4-bit up/down-counter design.
Alternatively, we could design an up/down-counter using the register design process of Section 4.2, by directly connecting the incrementer and decrementer outputs to muxes in front of each flip-flop, and mapping the clr, cnt, and dir control signals to the mux select lines.
Notice that we also added a control input clr, which we could have added to the up-counter and down-counter too, that when 1 synchronously clears the register, meaning resetting the register to all 0s on a rising clock edge. We used a 4-bit register with clear to support the clear operation.
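The control behavior of the up/down-counter can be summarized in a short Python sketch. The function names are hypothetical, the direction input is spelled `direction` to avoid clashing with Python's built-in `dir`, and the sketch assumes that clr takes precedence over cnt.

```python
def up_down_next_value(value, clr, cnt, direction, bits=4):
    """Register contents after one rising clock edge.

    direction=0 counts up and direction=1 counts down. This sketch assumes
    clr takes precedence over cnt.
    """
    mask = (1 << bits) - 1
    if clr:
        return 0                              # synchronous clear to all 0s
    if not cnt:
        return value                          # counting disabled: hold value
    incremented = (value + 1) & mask
    decremented = (value - 1) & mask
    return decremented if direction else incremented   # the 2x1 mux


def up_down_terminal_count(value, direction, bits=4):
    """tc: AND of the register bits when counting up, NOR when counting down."""
    mask = (1 << bits) - 1
    and_out = int(value == mask)
    nor_out = int(value == 0)
    return nor_out if direction else and_out
```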
EXAMPLE 4.15 Light sequencer
We want to design a strip of 8 light bulbs, such that the bulbs illuminate one at a time, right to left, and then repeat illuminating in that sequence. The sequence should proceed at the rate of one bulb per second. Such a lighting display might be attractive outside a restaurant or movie theater, for example.
For simplicity, assume we have an oscillator that generates a 1 Hz clock signal (meaning one rising clock edge per second). We'll connect this clock to a 3-bit up-counter, and connect the counter's three outputs to a 3x8 decoder, as shown in Figure 4.56.
Figure 4.56 Light sequencer.
When the power is on, the system counts up (we don't know what the initial value of the counter was, but it doesn't really matter), wrapping around from 111 to 000. We don't need the tc output in this example.
Notice that we used a 3-bit counter with a decoder, and not an 8-bit counter, even though there were 8 outputs. An 8-bit counter would generate the sequence 00000000, 00000001, 00000010, ..., 11111110, 11111111. That sequence is not the desired sequence.
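The difference between a 3-bit counter with a decoder and an 8-bit counter shows up clearly in a small simulation. In this hedged Python sketch the names `decoder_3x8` and `light_sequence` are hypothetical, and decoder output d0 is assumed to drive the rightmost bulb.

```python
def decoder_3x8(i2, i1, i0):
    """3x8 decoder: exactly one of the outputs d0..d7 is 1."""
    index = (i2 << 2) | (i1 << 1) | i0
    return [int(k == index) for k in range(8)]     # [d0, d1, ..., d7]


def light_sequence(start=0, cycles=8):
    """Bulb patterns over successive clock cycles, drawn with d7 on the left."""
    count = start & 0b111                          # 3-bit up-counter contents
    frames = []
    for _ in range(cycles):
        d = decoder_3x8((count >> 2) & 1, (count >> 1) & 1, count & 1)
        frames.append("".join(str(bit) for bit in reversed(d)))
        count = (count + 1) & 0b111                # wraps from 111 to 000
    return frames
```

Whatever the starting count, each frame lights exactly one bulb, and the pattern repeats every 8 cycles.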
Counter with Parallel Load
Counters often come with the ability to initialize the count value, achieved by loading the counter's register with parallel data. Figure 4.57 shows the design of a 4-bit up-counter with parallel load. When control input ld is 1, the 2x1 mux passes load data input L to the register; when ld is 0, the mux passes the incremented value. Furthermore, we OR the counter's ld and cnt signals to generate the load signal for the register. When cnt is 1, the incremented value will be loaded. When ld is 1, the parallel load data will be loaded. Even if cnt is 0, ld=1 causes the register to be loaded. A down-counter or up/down-counter could similarly be extended to have a parallel load.
Figure 4.57 Internal design of a 4-bit up-counter with load.
Parallel load is useful when we want to generate a pulse signal that is not directly obtainable from letting a counter wrap around and pulse its tc output naturally. An N-bit counter naturally wraps around every 2^N cycles. What if we want a pulse every X cycles, where X is not a power of 2? For example, say we have a 4-bit down-counter, which normally pulses the tc output and wraps around every 16 cycles, and suppose we want to pulse every 9 cycles. We can achieve a pulse every 9 cycles by setting the load data input L to 9-1, or 8 (1000), and by connecting the tc output to the load control input ld, as shown in Figure 4.58. When the counter reaches its lowest value (0000), tc will become 1, causing the ld input to become 1. Thus, on the next clock cycle, the counter will load 1000, rather than wrapping around to 1111. (Note: the load occurs on the next cycle, not the present cycle, because tc changes to 1 after the rising clock edge, so the new value for ld doesn't get seen until the next clock edge.) The counter would thus count in the sequence 8, 7, 6, 5, 4, 3, 2, 1, 0, pulsing tc and then returning to 8. The reason we load 9-1, rather than 9, even though we want a pulse every 9 cycles, is because we must remember that 0 is included in the count sequence, just as the count from 15 down to 0 takes 16 cycles.
Figure 4.58 A counter setup that pulses tc every 9 cycles.
We could instead use an up-counter for the same purpose, but we must make the load value equal to the total cycles minus the desired cycles. So for the above example, we would use a load value of 16 - 9 = 7 (0111). The counter would count the sequence 7, 8, 9, 10, 11, 12, 13, 14, 15, pulsing tc and then returning to 7.
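Both setups can be traced cycle by cycle with a short simulation. The Python sketch below is illustrative only: the function names are hypothetical, cnt is assumed to be held at 1, and the counter is assumed to start out already holding the load value.

```python
def down_counter_pulses(load_value, cycles, bits=4):
    """Down-counter whose tc output drives its own ld input.

    Returns a list of (value, tc) pairs, one per clock cycle.
    """
    value = load_value                 # assumed: counter starts already loaded
    trace = []
    for _ in range(cycles):
        tc = int(value == 0)           # NOR gate detects 0000
        trace.append((value, tc))
        if tc:                         # ld = tc takes effect at the next edge
            value = load_value
        else:
            value = (value - 1) % (1 << bits)
    return trace


def up_counter_pulses(load_value, cycles, bits=4):
    """Same idea with an up-counter, which pulses tc at its highest value."""
    top = (1 << bits) - 1
    value = load_value
    trace = []
    for _ in range(cycles):
        tc = int(value == top)         # AND gate detects all 1s
        trace.append((value, tc))
        value = load_value if tc else value + 1
    return trace
```

Loading 8 into the down-counter, or 16 - 9 = 7 into the up-counter, gives tc pulses spaced 9 cycles apart in both traces.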
EXAMPLE 4.16 New Year's Eve countdown display
In Example 2.30, we utilized a microprocessor to output the numbers 59 down to 0, and a decoder to illuminate one of 60 lights based on that output. In this example, we'll replace the microprocessor by a down-counter with parallel load to output 59 down to 0. Suppose we have an 8-bit down-counter available, which can count from 255 down to 0. We need to load 59 and then count down. Assume the user can press a button called reset to load 59 into the counter, and then the user can move a switch called countdown from the 0 position (don't count) to the 1 position (count) to begin the countdown. The system implementation is shown in Figure 4.59.
Figure 4.59 Happy New Year countdown system using a down-counter.
Notice that the tc signal is our "Happy New Year" indication. We've connected that signal to an output called fireworks, which we'll assume activates a device that ignites fireworks. Happy New Year!
EXAMPLE 4.17 1 Hz pulse generator using a 60 Hz oscillator
In the U.S., electricity to the home operates as an alternating current with a frequency of 60 Hz. Many appliances convert this signal to a 60 Hz digital signal, and then convert the 60 Hz digital signal to a 1 Hz signal, to drive a clock or other device needing to keep track of time at the granularity of seconds. Unlike Example 4.14, we can't simply use a counter of a particular bitwidth, since no basic up-counter wraps around after 60 cycles: a 5-bit counter wraps around every 32 cycles, while a 6-bit counter wraps every 64 cycles. Let's start with a 6-bit counter, which counts from 0 to 63 and then wraps around to 0. We'll add some extra logic, as shown in Figure 4.60. The extra logic should detect when the counter has counted up to 59, and should clear the counter back to 0 on the next rising clock edge rather than letting the counter continue counting to 60 and beyond. Fifty-nine as a 6-bit binary number is 111011. Thus the AND gate in Figure 4.60 detects 111011, in which case the AND gate output sets the counter clear input to 1. We assume the counter's clear input clr takes precedence over the counter's count input cnt. Since the AND output will pulse every 60 cycles and the input clock frequency is 60 Hz, this circuit converts a 60 Hz input clock into a 1 Hz output clock. A circuit that converts an input clock into a new clock with a lower frequency is known as a clock divider.
Figure 4.60 Clock divider.
Timer
A common use of a counter is as the central component within another device called a timer. A timer is a special type of counter that measures time. Measuring time is a very common task in a digital system.
One type of timer is based on a down-counter. We store a value into the counter, and wait for the terminal count (0) to be reached. If we know the counter's oscillator frequency, then we can load a value corresponding to a desired time interval. For example, suppose we want to know when one second has passed, using a counter having a clock frequency of 1 kHz. We would thus load 999 (in binary, meaning 1111100111) into the counter and enable counting. After 1 second, the counter would reach 0 and assert its terminal count output, notifying us that 1 second has passed. (Note: We load 999, rather than 1000, because we must remember that 0 is part of the count. Try counting from 9 down to 0, raising a finger each time you say a number. Notice that when you reach 0, ten fingers are up.) A timer may repeat this process automatically, using the terminal count to automatically reload the desired time value (999 in our example) into the counter. Such a timer might be used in any type of watch or clock. Our earlier three-cycles-high laser timer (from Chapter 3) could have been built using a timer component, especially if instead of wanting the laser high for three cycles, we wanted the laser high for a period of time like 1.5 seconds.
Another type of timer is based on an up-counter. We reset that counter to 0, and then enable counting when some event occurs that we want to time. When the event ends, we disable the counter, after which the counter contains the number of cycles that occurred during the event. Knowing the time of one clock cycle, we multiply the number of cycles by the time of one clock cycle to obtain the total time for the event. For example, if we time an event as lasting 500 clock cycles, and the timer's oscillator frequency is 1 kHz, then the time for the event was 500 cycles * 0.001 s/cycle = 0.5 s. We illustrate this type of timer using an example.
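The arithmetic behind the two kinds of timer is simple enough to capture in two helper functions. This Python sketch is illustrative only; the function names and parameters are hypothetical.

```python
def timer_load_value(clock_hz, interval_s):
    """Value to load into a down-counter timer for a desired interval.

    One less than the number of cycles, because 0 is part of the count.
    """
    cycles = round(clock_hz * interval_s)
    return cycles - 1


def elapsed_seconds(cycle_count, clock_hz):
    """Time measured by an up-counter timer: cycles times the clock period."""
    return cycle_count / clock_hz
```

With a 1 kHz clock, `timer_load_value(1000, 1)` gives the 999 used above, and `elapsed_seconds(500, 1000)` gives the 0.5 s of the second example.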
EXAMPLE 4.18 Highway speed measuring system
Many highways and freeways have systems that measure the speed of cars at various parts of the highway and upload that speed information to a central computer. Such information is used by law enforcement, traffic planners, and radio and Internet traffic reports.
One technique for measuring the speed of a car uses two sensors embedded under the road, as illustrated in Figure 4.61. When a car is over a sensor, the sensor outputs a 1; otherwise, the sensor outputs a 0. A sensor's output travels on underground wires to a speed-measuring computer box, some of which are above the ground and others of which are underground. The speed measurer determines speed by dividing the distance between the sensors (which is fixed and known) by the time required for a vehicle to travel from the first sensor to the second sensor. If the distance between the sensors is 0.01 miles, and a vehicle takes 0.5 seconds to travel from the first to the second sensor, then the vehicle's speed is 0.01 miles / (0.5 seconds * (1 hour / 3600 seconds)) = 72 miles per hour.
To measure the time between the sensors, we can construct a simple FSM that controls a 16-bit timer, as shown in Figure 4.61. State S0 clears the timer to 0. The FSM transitions to state S1 when a car passes over the first sensor. S1 starts the timer counting up. The FSM stays in S1 until the car passes over the second sensor, causing a transition to state S2. S2 stops counting and computes the time using the timer's output C. Assuming a 1 kHz clock input to the timer, meaning each cycle is 0.001 s, then the time would be C * 0.001 s. That time would then be combined with the 0.01 mile distance and the 3600 seconds-per-hour conversion, as in the formula above, to determine the speed. We omit the implementation details of the speed computation, which would most likely be implemented as software on a microprocessor.
Figure 4.61 Measuring vehicle speeds in a highway speed measuring system.
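The speed computation that the text leaves to software might look like the following Python sketch. It is only an illustration under the example's stated numbers; the function name `speed_mph` and its default parameters (a 1 kHz timer clock and sensors 0.01 miles apart) are assumptions taken from the example, not a specification of the real system.

```python
def speed_mph(timer_count, clock_hz=1000, distance_miles=0.01):
    """Vehicle speed in miles per hour from the timer's cycle count C.

    time in hours = (C / clock_hz) seconds * (1 hour / 3600 seconds)
    speed         = distance / time
    """
    if timer_count <= 0:
        raise ValueError("timer count must be positive")
    # Rearranged as distance * 3600 * clock_hz / C to keep a single division.
    return distance_miles * 3600 * clock_hz / timer_count
```

A timer count of 500 cycles (0.5 seconds at 1 kHz) corresponds to the 72 miles per hour computed in the example.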
HOW DOES IT WORK? CAR SENSORS IN ROADS.
How does a highway speed sensor or a traffic light car sensor know that a car is present in a particular lane? The main method today uses what's called an inductive loop. A loop of wire is placed just under the pavement; you can usually see the cuts, as in Figure 4.62(a). That loop of wire has a particular "inductance," which is an electronics term describing the wire's opposition to a change in electric current; higher inductance means the wire has higher opposition to changes in current. It turns out that placing a big hunk of metal (like a car) near the loop of wire changes the wire's inductance. (Why? Because the metal disrupts the magnetic field created by a changing current in the wire, but that's getting beyond our scope.) The traffic light control circuit keeps checking the wire's inductance (perhaps by trying to change the current and seeing how much the current really changes in a certain time period), and if the inductance differs from normal, the circuit assumes a car is above the loop of wire.
Many people think that the loops seen in the pavement are scales that measure weight; I've seen bicyclists jumping up and down on the loops trying to get a light to change. That doesn't work, but it sure is entertaining to watch.
Many others believe that small cylinders attached to a traffic light's support arms, like that in Figure 4.62(b), detect vehicles. Those instead are typically devices that detect a special encoded radio or infrared-light signal from emergency vehicles, causing the traffic light to turn green for the emergency vehicle (e.g., 3M's "Opticom" system). Such systems are another example of digital systems, reducing the time needed by emergency vehicles to reach the scene of an emergency as well as reducing accidents involving the emergency vehicle itself proceeding through a traffic light, thus often saving lives.
Figure 4.62 (a) Inductive loop for detecting a vehicle on a road, (b) emergency vehicle signal sensor for changing an intersection's traffic light to green for the approaching emergency vehicle.
4.7 MULTIPLIER-ARRAY STYLE
An NxN multiplier is a datapath component that multiplies two N-bit input binary numbers A (the multiplicand) and B (the multiplier), and outputs an (N+N)-bit result. For example, an 8x8 multiplier multiplies two 8-bit binary numbers and outputs a 16-bit result. Designing an NxN multiplier in two levels of logic using the standard combinational design process will result in too complex of a design, as we've already seen for previous operations like addition and comparison. For multipliers with N greater than 4 or so, we need a more efficient method.
We can create a reasonably sized multiplier by mimicking how we perform multiplication by hand. Consider multiplying two 4-bit binary numbers 0110 and 0011 by hand:
    0110       (the top number is called the multiplicand)
    0011       (the bottom number is called the multiplier)
    ----       (each row below is called a partial product)
    0110       (because the rightmost bit of the multiplier is 1, and 0110*1=0110)
   0110        (because the second bit of the multiplier is 1, and 0110*1=0110)
  0000         (because the third bit of the multiplier is 0, and 0110*0=0000)
+0000          (because the leftmost bit of the multiplier is 0, and 0110*0=0000)
--------
00010010       (the product is the sum of all the partial products: 18, which is 6*3)
Each partial product is easily obtained by ANDing the present multiplier bit with the multiplicand. Thus, multiplication of two 4-bit numbers A (a3a2a1a0) and B (b3b2b1b0) can be represented as follows:
                              a3    a2    a1    a0
x                             b3    b2    b1    b0
--------------------------------------------------
                            b0a3  b0a2  b0a1  b0a0   (pp1)
                      b1a3  b1a2  b1a1  b1a0     0   (pp2)
                b2a3  b2a2  b2a1  b2a0     0     0   (pp3)
+         b3a3  b3a2  b3a1  b3a0     0     0     0   (pp4)
--------------------------------------------------
      p7    p6    p5    p4    p3    p2    p1    p0
After generating the partial products (pp1, pp2, pp3, and pp4) by ANDing the present multiplier bit with each multiplicand bit, we merely need to sum those partial products together. We can use three adders of varying widths for computing that sum. The resulting design is shown in Figure 4.63.
This design has a reasonable size, about three times bigger than a carry-ripple adder. The design has reasonable speed. The delay consists of 1 gate-delay for generating the partial products, plus the delay of the adders. If each adder is a carry-ripple adder, then the 5-bit adder delay will be 5*2 = 10 gate-delays, the 6-bit adder delay will be 6*2 = 12 gate-delays, and the 7-bit adder delay will be 7*2 = 14 gate-delays. If we assume that the total delay of the adders is simply the sum of the adder delays, then the total delay would thus be 1 + 10 + 12 + 14 = 37 gate-delays. However, the total delay of carry-ripple adders when chained together is actually less than their sum; see Exercise 4.15.
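The partial-product scheme and the delay estimate can both be reproduced in a few lines of Python. This is a behavioral sketch, not the gate-level design; the function names are hypothetical, and extending the adder widths to N+1 through 2N-1 for sizes other than N=4 is an assumption of this sketch rather than something stated in the text.

```python
def array_multiply(a, b, n=4):
    """Multiply two n-bit numbers by summing shifted partial products.

    Returns (product, partial_products), with partial_products[i] holding
    the row generated by multiplier bit i (pp1 first).
    """
    partial_products = []
    for i in range(n):
        b_i = (b >> i) & 1
        row = a if b_i else 0             # AND of b_i with each multiplicand bit
        partial_products.append(row << i) # shifted left by the bit position
    return sum(partial_products), partial_products


def adder_chain_delay_estimate(n=4, gate_delays_per_bit=2):
    """Simple-sum estimate: 1 gate-delay for the ANDs plus each adder's delay.

    Assumes carry-ripple adders of widths n+1 .. 2n-1 (5, 6, and 7 for n=4).
    """
    adder_widths = range(n + 1, 2 * n)
    return 1 + sum(width * gate_delays_per_bit for width in adder_widths)
```

For the by-hand example, `array_multiply(0b0110, 0b0011)` returns a product of 18, and `adder_chain_delay_estimate(4)` returns the 37 gate-delays computed above.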
Figure 4.63 Internal design of a 4-bit by 4-bit array-style multiplier.
Delays for larger multipliers, which will have an even longer chain of adders, will be even longer. Faster multiplier designs are possible, at the expense of more gates.
4.8 SUBTRACTORS
An N-bit subtractor is a datapath component that takes two N-bit binary inputs A and B, and outputs an N-bit result S equaling A-B.
Subtractor for Positive Numbers Only
Subtraction gets slightly more complex when we consider negative results, like 5 - 7 = -2, because thus far we haven't discussed representation of negative numbers. For now, let's assume we are only dealing with positive numbers, so the subtractor's inputs are positive, and the result is always positive. This could be the case, for example, when we are designing a system that only subtracts smaller numbers from larger numbers, such as when compensating a sampled temperature that will always be greater than 80 using a small compensation value that will always be less than 10.
Designing an N-bit subtractor using the standard combinational logic design process suffers from the same exponential size growth problem as an N-bit adder. (See Section 4.3.) Instead, we can again try to mimic subtraction by hand in hardware.
Figure 4.64 shows subtraction of 4-bit binary numbers "by hand." Starting with the first column, we see that a is less than b (0 < 1), necessitating a borrow from the previous column. The first column result is then 10 - 1 = 1 (in base ten, two minus one equals one). The second column has a 0 for a because of the borrow by the first column, making a < b (0 < 1), generating a borrow from the third column, which must itself borrow from the fourth column. The result of the second column is then 10 - 1 = 1. The third column, because of the borrow generated by the second column, has an a of 1, which is not less than b, so the result of the third column is 1 - 1 = 0. The fourth column has a=0 due to the borrow from the third column, and since b is also 0, the result is 0 - 0 = 0.
Figure 4.64 Design of a 4-bit subtractor: (a) subtraction "by hand", (b) borrow-ripple implementation with four full-subtractors with a borrow-in input wi, and (c) block symbol.
Based on the above-described behavior, we could create the internal design of a full-subtractor combinational component to implement the behavior of each column, with a full-subtractor having an input wi representing a borrow by the previous column, and an output wo representing a borrow from the next column, in addition to the inputs a and b and the output s. (We use w's for the borrows rather than b's because b is already used for the input; the w comes from the end of the word borrow.) We leave the design of a full-subtractor as an exercise for the reader.
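Before designing the gates, it can help to model what each column must do. The Python sketch below describes the column behavior arithmetically rather than as logic equations, so it is a specification to check a full-subtractor design against and not the design itself. The function names are hypothetical, and bits are assumed to be listed least significant first.

```python
def subtract_column(a, b, wi):
    """One column of by-hand subtraction; returns (wo, s).

    wi is the borrow taken from this column by the column to its right;
    wo is the borrow this column takes from the column to its left.
    """
    difference = a - wi - b
    if difference < 0:
        return 1, difference + 2      # borrow: add binary 10, then subtract
    return 0, difference


def borrow_ripple_subtract(a_bits, b_bits):
    """Subtract B from A column by column, rippling the borrow leftward.

    Both arguments are equal-length bit lists, least significant bit first.
    Returns (final_borrow, s_bits), with s_bits least significant first.
    """
    borrow = 0
    s_bits = []
    for a, b in zip(a_bits, b_bits):
        borrow, s = subtract_column(a, b, borrow)
        s_bits.append(s)
    return borrow, s_bits
```

For instance, `borrow_ripple_subtract([0, 1, 0, 1], [1, 1, 1, 0])` models 1010 - 0111, operands chosen here to be consistent with the column-by-column walk-through above.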
EXAMPLE 4.19 DIP-switch-based adding/subtracting calculator
In Example 4.8, we designed a simple calculator that could add two 8-bit binary numbers and produce an 8-bit result, using DIP switches for inputs, and a register plus LEDs for output. Let's extend that calculator to allow the user to choose among addition and subtraction operations. We'll introduce a single-switch DIP switch that sets a signal f (for "function") as another system input. When f=0, the calculator should add; when f=1, the calculator should subtract.
One implementation of this calculator would use an adder, a subtractor, and a multiplexor, as in Figure 4.65. The f input chooses which component, the adder or subtractor, to pass through the mux to the register inputs. When the user presses e, either the addition or subtraction result gets loaded into the register and displayed at the LEDs.
This example assumes the result of a subtraction is always a positive number, never negative. It also assumes that the result is always between 0 and 255.
Figure 4.65 8-bit DIP-switch-based adding/subtracting calculator. Input f selects between addition and subtraction.
EXAMPLE 4.20 Color space converter: RGB to CMYK
Computer monitors, digital cameras, scanners, printers, and other electronic devices deal with
color images. Those devices treat an image as millions of tiny pixels (short for "picture
elements"), which are indivisible dots representing a tiny part of the image. Each pixel has a color, so
an image is just a collection of colored pixels. A good computer monitor may support over 10
million unique colors for each pixel. How does a monitor create each unique color for a pixel? In
a common method used in what are known as RGB monitors, the monitor has three light sources
inside: red, green, and blue. Any color of light can be created by adding specific intensities of
each of the three colors. Thus, for each pixel, the monitor shines a specific intensity of red, of
green, and of blue at that pixel's location on the monitor's screen, so that the three colors add
together to create the desired pixel color. Each subcolor (red, green, or blue) is typically
represented as an 8-bit binary number (thus each ranging from 0 to 255), meaning a color is represented
by 8+8+8=24 bits. An (R, G, B) value of (0, 0, 0) represents black. (10, 10, 10) represents a very
dark gray, while (200, 200, 200) represents a light gray. (255, 0, 0) represents red, while (100, 0,
0) represents a darker (nonintense) red. (255, 255, 255) represents white. (109, 35, 201) represents
some mixture of the three base colors. Representing color using intensity values for red, green,
and blue is known as an RGB color space.
RGB color space is great for computer monitors and certain other devices, but not the best for
some other devices, like printers. Mixing red, green, and blue ink on paper will not result in white,
but rather in black. Why? Because ink is not light; rather, ink reflects light. So red ink reflects red
light, absorbing green and blue light. Likewise, green ink absorbs red and blue light. Blue ink
absorbs red and green light. Mix all three inks together on paper, and the mixture absorbs all light,
reflecting none, thus yielding black. Printers therefore use a different color space based on the
complementary colors of red/green/blue, namely, cyan/magenta/yellow, known as a CMY color space.
Cyan ink absorbs red, reflecting green and blue (the mixture of which is cyan). Magenta ink
absorbs green light, reflecting red and blue (which is magenta). Yellow ink absorbs blue, reflecting
red and green (which is yellow).
Notice that a color printer may have three color ink cartridges, one cyan, one
magenta, and one yellow. Figure 4.66 shows such ink cartridges for a particular color
printer. Some printers have a single cartridge instead of three, with that single
cartridge internally containing separated fluid compartments for the three colors.
A printer must convert a received RGB image into CMY. Let's design a fast circuit
to perform that conversion. Given three 8-bit values for R, G, and B for a particular pixel,
the equations for C, M, and Y are simply:
C = 255 - R
M = 255 - G
Y = 255 - B
(255 is the maximum value of an 8-bit number). A circuit to perform such conversion
can be built using subtractors, as shown in Figure 4.67.
Figure 4.66 A color printer mixes cyan, magenta, and yellow inks to create any color. The picture
shows inside a color printer having those three color cartridges on the right, labeled C, M, and Y.
Such printers may use black ink directly (the big cartridge on the left), rather than mixing the three
colors, to make grays and blacks, in order to create a better-looking black and to conserve the more
expensive color inks.
Figure 4.67 RGB to CMY converter.
Actually, the conversion needs to be slightly more complex. Ink isn't perfect, meaning that mixing
cyan, magenta, and yellow yields a black that doesn't look as black as you might expect. Furthermore,
colored inks are expensive compared to black ink. Therefore, color printers use black ink whenever
possible. One way to maximize use of black ink is to factor out the black from the C, M, and Y values.
In other words, a (C, M, Y) value of (250, 200, 200) can be thought of as (200, 200, 200) plus
(50, 0, 0). The (200, 200, 200), which is a shade of gray, can be generated using black ink. The
remaining (50, 0, 0) can be generated using a small amount of cyan, and no magenta or yellow ink at
all, thus saving precious color ink. A CMY color space extended with black is known as a CMYK color
space (the "K" comes from the last letter in the word "black"; "K" is used instead of "B" to avoid
confusion with the "B" from "blue").
Figure 4.68 RGB to CMYK converter.
An RGB to CMYK converter can thus be described as:
K = Minimum(C, M, Y)
C2 = C - K
M2 = M - K
Y2 = Y - K
where C, M, and Y are defined as earlier. We thus create the circuit in Figure 4.68 for converting an
RGB color space to a CMYK color space. We've used the RGBtoCMY component from Figure
4.67. We've also used two of the MIN component that we created in Example 4.12 to
compute the minimum of two numbers; using two such components computes the minimum of
three numbers. Finally, we use three more subtractors to remove the K value from the C, M, and Y
values. In a real printer, the imperfections of ink and paper require even more adjustments. A more
realistic color space converter multiplies the R, G, and B values by a series of constants, which can
be described using matrices:
| C |   | m00 m01 m02 |   | R |
| M | = | m10 m11 m12 | * | G |
| Y |   | m20 m21 m22 |   | B |
Further discussion of such a matrix-based converter is beyond the scope of this example.
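As a rough software check of the equations in this example, the following Python sketch mirrors the datapath: three subtractors from 255, two chained two-input minimum operations, and three more subtractors. The function names rgb_to_cmy and rgb_to_cmyk are hypothetical and chosen only for illustration; the sketch models behavior, not the circuit's timing or structure.

```python
def rgb_to_cmy(r, g, b):
    """Model the three subtractors of the RGB-to-CMY converter."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError("each color value must fit in 8 bits (0..255)")
    return 255 - r, 255 - g, 255 - b


def rgb_to_cmyk(r, g, b):
    """Factor the common black component K out of C, M, and Y."""
    c, m, y = rgb_to_cmy(r, g, b)
    k = min(min(c, m), y)  # two 2-input MIN components, chained
    return c - k, m - k, y - k, k


# (C, M, Y) = (250, 200, 200) corresponds to (R, G, B) = (5, 55, 55).
print(rgb_to_cmyk(5, 55, 55))
```

With the (250, 200, 200) value discussed above, the sketch separates out K = 200 and leaves only a small amount of cyan.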
Representing Negative Numbers: Two's Complement
The subtractor design in the previous section assumed we only dealt with positive input
numbers and positive results. But in many systems, we may obtain results that are negative,
and in fact, our input values may even be negative numbers. We thus need a way to
represent negative numbers using bits.
One obvious but not very effective representation is known as signed-magnitude. In
this representation, the highest-order bit is used only to represent the number's sign, with
0 meaning positive and 1 meaning negative. The remaining low-order bits represent the
magnitude of the number. In this representation, and using 4-bit numbers, 0111 would
represent +7, while 1111 would represent -7. Thus, four bits could represent -7 to 7.
(Notice, by the way, that both 0000 and 1000 would represent 0, the former representing
0, the latter -0.) Signed-magnitude is easy for humans to understand, but doesn't lend
itself easily to the design of simple arithmetic components like adders and subtractors.
For example, if an adder's inputs use signed-magnitude representation, the adder would
have to look at the highest-order bit, and then internally perform either an addition or a
subtraction, using different circuits for each.
Instead, the most common method of representing negative numbers and performing
subtraction in a digital system actually uses a trick that allows us to use an adder to
perform subtraction. Using an adder to perform subtraction would enable us to keep our
simple adder, and to use the same component for both addition and subtraction.
The key to performing subtraction using addition lies in what are known as complements.
We'll first introduce complements in the base ten number system just so you can
familiarize yourself with the concept, but bear in mind that the
intention is to use complements in base two, not base ten.
Note: We are introducing ten's complement just for intuition purposes; we'll actually be using
two's complement.
Consider subtraction involving two single-digit base ten
numbers, say 7 - 4. The result should be 3. Let's define the
complement of a single-digit base ten number A as the number
that when added to A results in a sum of ten. So the complement
of 1 is 9, of 2 is 8, and so on. Figure 4.69 provides the
complements for the numbers 1 through 9.
The wonderful thing about a complement is that you can
use it to perform subtraction using addition, by replacing the
number being subtracted with its complement, then by adding,
and then by finally throwing away the carry. For example:
7 - 4 -> 7 + 6 = 13 -> 3
Number  Complement
1       9
2       8
3       7
4       6
5       5
6       4
7       3
8       2
9       1
Figure 4.69 Complements in base ten.
We replaced 4 by its complement, 6, and then added 6 to 7 to obtain 13. Finally, we
then threw away the carry, leaving 3, which is the correct result. Thus, we performed
subtraction using addition.
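The replace, add, and drop-the-carry steps can be tried out with a short Python sketch. The helper names tens_complement and subtract_by_adding are hypothetical, and the sketch assumes 0 <= b <= a with both values fitting in the given number of decimal digits (one digit for the 7 - 4 case; the digits parameter is an illustrative generalization).

```python
def tens_complement(x, digits=1):
    """Return the number that, added to x, gives 10**digits.

    Note that computing the complement this way still uses a subtraction.
    """
    return 10 ** digits - x


def subtract_by_adding(a, b, digits=1):
    """Compute a - b by adding b's complement and dropping the carry.

    Assumes 0 <= b <= a < 10**digits.
    """
    total = a + tens_complement(b, digits)  # exactly 10**digits too much
    return total % (10 ** digits)           # throw away the carry


print(subtract_by_adding(7, 4))  # 7 + 6 = 13, drop the carry
```

The sum is always exactly one carry too large, which is why discarding the carry yields the difference.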
Figure 4.70 Subtracting by adding: subtracting a number (4) is the same as adding the number's
complement (6) and then dropping the carry, since by definition of the complement, the result will
be exactly 10 too much. After all, that's how the complement was defined: the number plus its
complement equals 10.
A number line helps us visualize why complements work, as shown in Figure 4.70.
Complements work for any number of digits. Say we want to perform subtraction
using two two-digit base ten numbers, perhaps 55 - 30. The complement of 30 would be
the number that when added to 30 results in 100, so the complement of 30 is 70. 55 + 70
is 125. Throwing away the carry yields 25, which is the correct result for 55 - 30.
So using complements achieves subtraction using addition.
"Not so fast!" you might say. In order to determine the complement, don't we have to
perform subtraction? We know that 6 is the complement of 4 by computing 10 - 4 = 6.
We know that 70 is the complement of 30 by computing 100 - 30 = 70. So haven't we
just moved the subtraction to another step, the step of computing the complement?
Yes. Except, it turns out that in base two, we can compute the complement in a much
simpler way: just by inverting all the bits and adding 1. For example, consider
computing the complement of the 3-bit base-two number 001. The complement would be the
number that when added to 001 yields 1000; you can probably see that the complement
should be 111. Using the same method for computing the complement as we did in base
ten, we compute the two's complement of 001 as: 1000 - 001 = 111, so 111 is the
complement of 001. However, it just so happens that if we invert all the bits of 001 and
add 1, we get the same result! Inverting the bits of 001 yields 110; adding 1 yields
110+1 = 111, the correct complement.
Note: Two's complement can be computed simply by inverting the bits and adding 1, thus
avoiding the need for subtraction when computing a complement.
Note: The highest-order bit in two's complement acts as a sign bit: 0 means positive,
1 means negative.
Thus, to perform a subtraction, say 011 - 001, we would perform the following:
011 - 001
-> 011 + ((001)'+1)
= 011 + (110+1)
= 011 + 111
= 1010 (throw away the carry)
-> 010
That's the correct answer, and didn't involve any subtractions, only an invert and
additions.
We omit discussion as to why one can compute the complement in base two by
inverting the bits and adding 1; for our purposes, we just need to know that that trick
works for binary numbers.
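For readers who want to check the trick by experiment, here is a minimal Python sketch. One way to see why it works: inverting an n-bit number x yields (2^n - 1) - x, so adding 1 yields 2^n - x, which is the complement as defined above. The function names twos_complement and subtract are hypothetical, and the width parameter is an assumption of the sketch.

```python
def twos_complement(x, width):
    """Invert all 'width' bits of x, then add 1 (keeping only 'width' bits)."""
    mask = (1 << width) - 1
    inverted = x ^ mask          # bitwise inversion within the given width
    return (inverted + 1) & mask


def subtract(a, b, width):
    """Compute a - b using only an inversion and additions."""
    mask = (1 << width) - 1
    total = a + twos_complement(b, width)
    return total & mask          # throw away the carry


print(format(twos_complement(0b001, 3), "03b"))   # complement of 001
print(format(subtract(0b011, 0b001, 3), "03b"))   # 011 - 001
```

The sketch reproduces the 3-bit walk-through above: the complement of 001 is 111, and 011 - 001 gives 010.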
There are actually two types of complements of a binary number. The type we've
been using above is known as the two's complement, obtained by inverting all the bits of
the binary number and adding 1. Another type is known as the one's complement, which
is obtained simply by inverting all the bits, without adding a 1. The two's complement is
much more commonly used in digital circuits and results in simpler logic.
Two's complement leads to a simple way to represent negative numbers. Say we have
four bits to represent numbers, and we want to represent both positive and negative
numbers. We can choose to represent positive numbers as 0000 to 0111 (0 to 7). Negative
numbers would be obtained by taking the two's complement of the positive numbers,
because a - b is the same as a + (-b). So -1 would be represented by taking the
two's complement of 0001, or (0001)'+1 = 1110+1 = 1111. Likewise, -2 would
be (0010)'+1 = 1101+1 = 1110. -3 would be (0011)'+1 = 1100+1 = 1101.
And so on. -7 would be (0111)'+1 = 1000+1 = 1001. Notice that the two's
complement of 0000 is 1111+1 = 0000 (discarding the carry). Two's complement representation
has only one representation of 0, namely, 0000 (unlike signed-magnitude representation, which had
two representations of 0). Also notice that we can represent -8 as 1000. So two's
complement is slightly asymmetric, representing one more negative number than positive
numbers. A 4-bit two's-complement number can represent any number from -8 to +7.
Say you have 4-bit numbers and you want to store -5. -5 would be (0101)'+1 =
1010+1 = 1011. Now you want to add -5 to 4 (or 0100). So we simply add: 1011 +
0100 = 1111, which is -1, the correct answer.
Note that negative numbers all have a 1 in the highest-order bit; thus, the highest-order
bit in two's complement is often referred to as the sign bit, 0 indicating a positive
number, 1 a negative number.
If you want to know the magnitude of a two's complement negative number, you can
obtain the magnitude by taking the two's complement again. So to determine what
number 1111 represents, we can take the two's complement of 1111: (1111)'+1 =
0000+1 = 0001. We put a negative sign in front to yield -0001, or -1.
A quick way for humans to mentally figure out the magnitude of a negative number
in 4-bit two's complement (having a 1 in the high-order bit) is to subtract the magnitude
of the three lower bits from 8. So for 1111, the low three bits are 111 or 7, so the
magnitude is 8 - 7 = 1, which in turn means that 1111 represents -1. For an 8-bit two's
complement number, we would subtract the magnitude of the lower 7 bits from 128. So
10000111 would be -(128 - 7) = -121.
To summarize, we can represent negative numbers using two's complement
representation. Addition of two's complement numbers proceeds unmodified: we just add the
numbers. Even if one or both numbers are negative, we simply add the numbers. We
perform subtraction of A - B by taking the two's complement of B and then adding that
two's complement to A, resulting in A + (-B). We compute the two's complement of B
by simply inverting the bits of B and then adding 1.
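The representation summarized above can be explored with a small Python sketch. The names encode and decode are hypothetical helpers, not components from the text; decode follows the "subtract the low bits from 8 (or 128)" shortcut described earlier, generalized to any width as an assumption of the sketch.

```python
def encode(value, width):
    """Return the width-bit two's complement bit pattern for a signed value."""
    lowest, highest = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError("value does not fit in the given width")
    return value & ((1 << width) - 1)


def decode(bits, width):
    """Interpret a width-bit pattern as a signed two's complement number."""
    sign = (bits >> (width - 1)) & 1
    low = bits & ((1 << (width - 1)) - 1)   # all bits below the sign bit
    if sign:
        return -((1 << (width - 1)) - low)  # e.g. -(8 - low) for 4 bits
    return low


minus_five = encode(-5, 4)
print(format(minus_five, "04b"))
# Addition proceeds unmodified; only the low 4 bits of the sum are kept.
print(decode((minus_five + encode(4, 4)) & 0b1111, 4))
```

Here -5 encodes as 1011, and adding 0100 gives 1111, which decodes back to -1, matching the example in the text.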
Building a Subtractor Using an Adder and Two's Complement
With knowledge of the two's complement representation, we can now see how to subtract
using an adder. To compute A - B, we compute A + (-B), which is the
same as A + B' + 1 because -B can be computed as
B' + 1 in two's complement. Thus, to perform subtraction,
we invert B, and input a 1 to the carry-in of an
adder, as shown in Figure 4.71.
Figure 4.71 Two's complement subtractor built with an adder.
Adder/Subtractor
We can straightforwardly design an adder/subtractor component, having an input sub, such that
when sub=1, the component subtracts, but when sub=0, the component adds, as shown in
Figure 4.72(a). The N-bit 2x1 multiplexor passes B when sub=0, and passes B' when sub=1. sub
is connected to cin also, so that cin is 1 when subtracting. Actually, XORs can be used instead of
the inverters and mux, as shown in Figure 4.72(b). When sub=0, the output of the XOR equals the
other input's value. When sub=1, the output of the XOR is the inverse of the other input's value.
Figure 4.72 (a) Two's complement adder/subtractor using a mux, and (b) alternative circuit for B
using XOR gates.
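The XOR-based adder/subtractor can be modeled bit by bit in Python. This is only a behavioral sketch: it assumes a carry-ripple arrangement of full-adders for the internal adder, and the names full_adder and adder_subtractor are hypothetical.

```python
def full_adder(a, b, cin):
    """One-bit full-adder: returns (sum, carry-out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def adder_subtractor(a, b, sub, width=8):
    """Add when sub=0, subtract when sub=1; returns (result, carry-out)."""
    carry = sub                          # sub also drives the adder's cin
    result = 0
    for i in range(width):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ sub        # XOR passes b (sub=0) or inverts it (sub=1)
        s, carry = full_adder(ai, bi, carry)
        result |= s << i
    return result, carry


total, _ = adder_subtractor(0b00001111, 0b00000001, sub=0)
diff, _ = adder_subtractor(0b00001111, 0b00000001, sub=1)
print(format(total, "08b"), format(diff, "08b"))
```

Using one XOR per bit avoids the separate inverters and mux while producing the same B or B' value at the adder's input.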
EXAMPLE 4.21 DIP-switch-based adding/subtracting calculator (continued)
Let's revisit our DIP-switch-based adding/subtracting calculator of Example 4.19. Observe that at
any given time the output displays the results of either the adder or subtractor, but never both.
Thus, we really don't need both an adder and a subtractor in parallel;
instead, we can use a single adder/subtractor component. Assuming the DIP switches have been set
to 00001111 and 00000001, setting f=0 (add) versus f=1 (subtract) should result in the following
computations:
00001111 + 00000001 = 00010000
00001111 - 00000001 = 00001111 + 11111110 + 1 = 00001110
We achieve this simply by connecting f to the sub input of the adder/subtractor, as shown in
Figure 4.73.
Figure 4.73 8-bit DIP-switch-based adding/subtracting calculator, using
an adder/subtractor and two's complement number representation.
Let's consider signed numbers using two's complement. If the user is unaware that two's
complement representation is being used and the user will only be inputting positive numbers using the
DIP switches, then the user should only use the low-order 7 switches of the 8-switch DIP inputs,
leaving the eighth switch in the 0 position, meaning the user can only input numbers ranging from
0 (00000000) to 127 (01111111). The reason the user can't use the eighth bit is that in two's
complement representation, making the highest-order bit a 1 causes the number to represent a
negative number.
If the user is aware of two's complement, then the user could use the DIP switches to represent
negative numbers too, from -1 (11111111) down to -128 (10000000). Of course, the user will
need to check the leftmost LED to determine whether the output represents a positive number or a
negative number in two's complement form.
Detecting Overflow
When we perform arithmetic using binary numbers of a fixed bitwidth, sometimes the
result is wider than the fixed bitwidth, a situation known as overflow. For example,
consider adding two 4-bit binary numbers (just regular binary numbers for now, not two's
complement numbers) and storing the result as another 4-bit number. Adding 1111 +
0001 yields a result of 10000, a 5-bit number, which is bigger than the 4 bits we have
to store the result. In other words, 15 + 1 = 16, and 16 requires 5 bits to represent in
binary. We can easily detect overflow when adding two binary numbers simply by
looking at the carry-out bit of the adder: if the carry-out bit is 1, overflow has occurred.
So a 4-bit adder adding 1111 + 0001 would output 1 + 0000, where the 1 is the
carry-out, indicating overflow.
When using two's complement numbers, detecting overflow is somewhat more complicated.
Suppose again we have 4-bit numbers but now those numbers are in two's complement form.
Consider the addition of two positive numbers, such as 0111 and 0001 in Figure 4.74(a). A 4-bit
adder would output 1000, but that is incorrect: the result of 7 + 1 should be 8, but 1000
represents -8 in two's complement. The problem is that the largest positive number we can
represent in 4-bit two's complement form is 7. Thus, when adding two positive numbers, we can
detect overflow by checking whether the most significant bit is a 1 in the result.
Likewise, consider the addition of two negative numbers, such as 1111 and 1000 in
Figure 4.74(b). An adder would output a sum of 0111 (and a carry-out of 1). 0111 is
incorrect: -1 + -8 should be -9, but 0111 is +7. The problem is that the most negative
number we can represent with 4-bit two's complement is -8. Thus, when adding two
negative numbers, we can detect overflow by checking whether the most significant bit is a 0
in the result.
Figure 4.74 Two's complement overflow detection comparing sign bits: (a) when adding
two positive numbers (0111 + 0001 = 1000, overflow), (b) when adding two negative numbers
(1111 + 1000 = 0111, overflow), (c) no overflow (1000 + 0111 = 1111). If the numbers' sign bits
have the same value, which differs from the result's sign bit, overflow has occurred.
Notice that adding a positive with a negative, or a negative with a positive, can never
result in overflow. The result will always be less negative than the most negative number,
or less positive than the most positive number. For example, the extreme is the addition of
-8 + 7, which is -1. Increasing -8 or decreasing 7 in that addition still results in a number
between -8 and 7.
So detecting overflow in two's complement involves detecting that both input
numbers were positive but yielded a negative result, or that both input numbers were
negative but yielded a positive result. Restated, detecting overflow in two's complement
involves detecting that the sign bits of both inputs are the same as one another but differ
from the result's sign bit. If we call the sign bit of one input a and the sign bit of the other
input b, and the sign bit of the result r, then the following equation outputs 1 when there
is overflow:
overflow = abr' + a'b'r
Although the circuit implementing the above overflow detection equation is quite
simple and intuitive, we can create an even simpler circuit if our adder generates a
carry-out. The simpler method merely compares the carry into the sign bit column with the
carry out of the sign bit column: if the carry in and carry out differ, overflow has
occurred.
Figure 4.75 illustrates this method for several cases. In Figure 4.75(a), the carry into the
sign bit is 1, whereas the carry out is 0. Because the carry in and carry out differ, overflow
has occurred. A circuit detecting whether two bits differ is just an XOR gate, which is slightly
simpler than the circuit of the previous method. We omit discussion as to why this method
works, but looking at the cases in Figure 4.75 should help provide the intuition.
Figure 4.75 Two's complement overflow detection comparing carry into and out of the
sign bit column: (a) when adding two positive numbers, (b) when adding two negative
numbers, (c) no overflow. If the carry into the sign bit column differs from the
carry out of that column, overflow has occurred.
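Both overflow detection methods can be compared side by side with a short Python sketch. The function name add_with_overflow is hypothetical, and the sketch assumes both inputs are already width-bit patterns; it computes the sign-bit equation and the carry-in versus carry-out comparison for the same addition.

```python
def add_with_overflow(a, b, width=4):
    """Add two width-bit two's complement patterns.

    Returns (result, overflow_by_sign_bits, overflow_by_carries).
    """
    mask = (1 << width) - 1
    sign_pos = width - 1
    total = a + b
    result = total & mask

    sa = (a >> sign_pos) & 1
    sb = (b >> sign_pos) & 1
    sr = (result >> sign_pos) & 1
    # overflow = abr' + a'b'r
    by_sign = (sa & sb & (sr ^ 1)) | ((sa ^ 1) & (sb ^ 1) & sr)

    low_mask = (1 << sign_pos) - 1
    carry_in = ((a & low_mask) + (b & low_mask)) >> sign_pos  # carry into sign column
    carry_out = total >> width                                # carry out of sign column
    by_carry = carry_in ^ carry_out                           # XOR detects a difference

    return result, by_sign, by_carry


for a, b in [(0b0111, 0b0001), (0b1111, 0b1000), (0b1000, 0b0111)]:
    print(format(a, "04b"), format(b, "04b"), add_with_overflow(a, b))
```

For the three cases shown in Figure 4.74, the two methods agree: overflow for 0111 + 0001 and for 1111 + 1000, and no overflow for 1000 + 0111.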
WHY SUCH CHEAP CALCULATORS?
Several earlier examples dealt with designing simple calculators. Cheap calculators, costing
less than a dollar, are easy to find. Calculators are even given away for free by many
companies selling something else. But a calculator internally contains a chip implementing a
digital circuit, and chips normally aren't cheap. Why are some calculators such a bargain?
The reason is known as economy of scale, which means that products are often cheaper when
produced in large volumes. Why? Because the design and setup costs can be amortized over
larger numbers. Suppose it costs $1,000,000 to design a custom calculator chip and to set up
the chip's manufacturing (not so unreasonable a number); design and setup costs are often
called nonrecurring engineering, or NRE, costs. If you plan to produce and sell one such chip,
then you need to add $1,000,000 to the selling price of that chip if you want to break even
(meaning to recover your design and setup costs) when you sell the chip. If you plan to
produce and sell 10 such chips, then you need to add $1,000,000/10 = $100,000 to the
selling price of each chip. If you plan to produce and sell 1,000,000 such chips, then you need
to add only $1,000,000/1,000,000 = $1 to the selling price of each chip. And if you plan to
produce and sell 10,000,000, you need to add a mere $1,000,000/10,000,000 = $0.10 = 10 cents
to the selling price of each chip. If the actual raw materials only cost 20 cents per chip,
and you add another 10 cents per chip for profit, then I can buy the chip from you for a mere
40 cents. And I can then give away such a calculator for free, as many companies do, as an
incentive for people to buy something else.
4.9 ARITHMETIC-LOGIC UNITS: ALUS
An N-bit arithmetic-logic unit (ALU) is a datapath component able to perform a variety
of arithmetic and logic operations on two N-bit wide data inputs, generating an N-bit data
output. Example arithmetic operations include addition and subtraction. Example logic
operations include AND, OR, XOR, etc. Control inputs to the ALU indicate which particular
operation to perform.
To understand the need for an ALU component, consider Example 4.22.
EXAMPLE 4.22 Multi-function calculator without using an ALU
Let's extend our earlier DIP-switch-based calculator to support eight operations, determined by a
three-switch DIP switch that provides three inputs x, y, and z to our system, as shown in Figure
4.76. For each combination of the three switches, we want to perform the operations shown in Table
4.2, on the 8-bit data inputs A and B, generating the 8-bit output on S.
TABLE 4.2 Desired calculator operations
Inputs                                          Sample output if
x  y  z   Operation                             A=00001111, B=00000101
0  0  0   S = A + B                             S=00010100
0  0  1   S = A - B                             S=00001010
0  1  0   S = A + 1                             S=00010000
0  1  1   S = A                                 S=00001111
1  0  0   S = A AND B (bitwise AND)             S=00000101
1  0  1   S = A OR B (bitwise OR)               S=00001111
1  1  0   S = A XOR B (bitwise XOR)             S=00001010
1  1  1   S = NOT A (bitwise complement)        S=11110000
The table includes several bitwise operations (AND, OR, XOR, and complement). A bitwise
operation applies to each corresponding pair of bits of A and B separately.
We can design a circuit for our calculator as shown in Figure 4.76, using a separate datapath
component to compute each operation: we use an adder to compute the addition, a subtractor to
compute the subtraction, an incrementer to compute the increment, and so on. However, that
circuit is very inefficient with respect to the number of wires, power consumption, or delay. There
are too many wires that must be routed to all those components, and to the mux, which
will have 8*8 = 64 inputs. Furthermore, every operation is computed all the time, which wastes
power. Imagine instead that we were dealing not with 8-bit numbers, but with 32-bit numbers,
and we wanted to support not just 8 operations but 32 operations. Then we would have even
more wires (32*32 = 1024 at the mux inputs), and even more power consumption. Furthermore,
a 32x1 mux will require several levels of gates, because for practical reasons, a 32-input
logic gate (inside the mux) will likely need to be implemented using several levels of smaller
logic gates.
Figure 4.76 8-bit DIP-switch-based multifunction calculator, using separate components
for each function.
We saw in the above example that using separate components for each operation is
not efficient. To solve the problem, we observe that the calculator can only be configured
to do one operation at a time, so there is no need to compute all the operations in parallel
as was done in the example. Instead, we can create a single component (an ALU) that can
compute any of the eight operations. Such a component would be more area and power
efficient, and would have less delay because a large mux would not be needed.
Let's start with an adder as our base internal ALU design. To avoid confusion, we'll
call the inputs to the internal adder IA and IB, short for "internal A" and "internal B," to
distinguish those inputs from the external ALU inputs A and B. We start with the design
shown in Figure 4.77(a). The ALU consists of an adder, and some logic in front of the
adder's inputs. We'll call that logic an arithmetic/logic extender, or AL-extender. The
purpose of the AL-extender is to set the adder inputs based on the values of the ALU's
control inputs x, y, and z, such that the desired arithmetic or logic result appears at the
adder's output. The AL-extender actually consists of eight identical components labeled
abext, one for each pair of bits ai and bi, as shown in Figure 4.77(b). It also has a
component cinext to compute the cin bit.
Figure 4.77 Arithmetic-logic unit: (a) ALU design based on a single adder with an
arithmetic/logic extender, and (b) arithmetic/logic extender detail.
Thus, we need to design the abext and cinext components to complete the ALU
design. Consider the first four calculator operations from Table 4.2, which are all
arithmetic operations:
When xyz=000, S=A+B. So in that case, we want IA=A, IB=B, and cin=0.
When xyz=001, S=A-B. So we want IA=A, IB=B', and cin=1.
When xyz=010, S=A+1. So we want IA=A, IB=0, and cin=1.
When xyz=011, S=A. So we want IA=A, IB=0, and cin=0. Notice that A will
pass through the adder, because A+0+0=A.
The last four ALU operations are all logical operations. We can compute the desired
operation in the abext component, and input the result to IA. We then set IB to 0 and cin
to 0, so that the value on IA passes through the adder unchanged.
One possible design of abext places an 8x1 mux in front of each output of the abext
and cinext components, with x, y, and z as the select inputs, in which case we would set
each mux data input as described above. A more efficient and faster design would create
a custom circuit for each component output. We leave the completion of the internal
design of the abext and cinext components as an exercise for the reader.
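The intended behavior of the AL-extender and adder can be checked against Table 4.2 with a word-level Python sketch. This is only a behavioral model under stated assumptions, not a gate-level design of abext or cinext: the function names al_extender and alu are hypothetical, and the extender works on whole 8-bit words rather than one bit pair at a time.

```python
MASK = 0xFF  # 8-bit data width


def al_extender(a, b, x, y, z):
    """Return (IA, IB, cin) for the adder, based on control inputs x, y, z."""
    op = (x << 2) | (y << 1) | z
    if op == 0b000:
        return a, b, 0             # S = A + B
    if op == 0b001:
        return a, b ^ MASK, 1      # S = A - B, computed as A + B' + 1
    if op == 0b010:
        return a, 0, 1             # S = A + 1
    if op == 0b011:
        return a, 0, 0             # S = A
    if op == 0b100:
        return a & b, 0, 0         # logic result passes through the adder
    if op == 0b101:
        return a | b, 0, 0
    if op == 0b110:
        return a ^ b, 0, 0
    return a ^ MASK, 0, 0          # S = NOT A


def alu(a, b, x, y, z):
    """The adder sums whatever the extender presents to it."""
    ia, ib, cin = al_extender(a, b, x, y, z)
    return (ia + ib + cin) & MASK


A, B = 0b00001111, 0b00000101
for op in range(8):
    x, y, z = (op >> 2) & 1, (op >> 1) & 1, op & 1
    print(x, y, z, format(alu(A, B, x, y, z), "08b"))
```

Running the loop with the sample inputs A=00001111 and B=00000101 should reproduce the sample output column of Table 4.2.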
Example 4.23 redesigns the multifunction calculator of Example 4.22, this time
utilizing an ALU.
EXAMPLE 4.23 Multi-function calculator using an ALU
Example 4.22 built an eight-function calculator without an ALU. The result was wasted area
and power, complex wiring, and long delay. Using the above-designed ALU, the calculator could
instead be built as shown in Figure 4.78. Notice the simple and efficient design.
Figure 4.78 8-bit DIP-switch-based multi-function calculator, using an ALU.
4.10 REGISTER FILES
An MxN register file is a datapath memory component that provides efficient access to a
collection of M registers, where each register is N bits wide. To understand the need for a
register file component in building good datapaths, rather than just using M separate
registers, consider Example 4.24.
EXAMPLE 4.24 Above-mirror display system using 16 32-bit registers
Recall the above-mirror display system from Example 4.4. Four 8-bit registers were multiplexed to
an 8-bit output. Suppose instead that the system required sixteen 32-bit registers, to display more
values, each of more precision. We would therefore need a 32-bit-wide 16x1 multiplexor, as shown
in Figure 4.79. From a purely digital logic perspective, the design is just fine. But in practice, that
multiplexor is very inefficient. Count the number of wires that would be fed into that multiplexor:
16x32 = 512 wires. That's a lot of wires to try to route from the registers to the multiplexor; try
plugging 512 wires into the back of one stereo system for a hands-on demonstration. Having too many
wires in a small area is known as congestion.
Figure 4.79 Above-mirror display design, assuming sixteen 32-bit registers. The mux has too
many input wires, resulting in congestion. Also, the data lines C are fanned out to too many
registers, resulting in weak current.
Likewise, consider routing the data input to all sixteen registers. Each data input wire is being
branched into sixteen subwires. Imagine electric current being like a river of water: branching a
main river into smaller rivers will yield much less water flow in each smaller river than in
the main river. Likewise, branching a wire, known as fanout, can only be done so many times
before the branched wires' currents are too small to control the components they drive. Furthermore,
low-current wires may be very slow also, so fanout can create long delays over wires too.
The fanout and congestion problems illustrated in the previous example can be solved
by observing that we never need to load more than one register at a time, and that we never
need to read more than one register at a time either. An MxN register file solves the fanout
and congestion problems by grouping the M registers into a single component, with that
component having a single N-bit wide data input, and a single N-bit wide data output. The
wiring inside the component is done carefully to handle fanout and congestion. Figure
4.80 shows a block symbol of a 16x32 register file (16 registers, each 32 bits wide).
Consider writing a value to a register in a register file. We would place the data to be written on the input W_data. We then need a way to indicate which register we actually want to write. Since there are 16 registers, we need four bits to specify a particular register. Those four bits are called the register's address. We would thus write the desired register's address on the input W_addr. For example, if we wanted to write to register 7, we would set W_addr=0111. To indicate that we actually want to write on a particular clock cycle (we won't want to write on every cycle), we would set the input W_en to 1. The collection of inputs W_data, W_addr, and W_en is known as a register file's write port.
Reading is similar. We would specify the register to read on input R_addr, and set R_en=1. Those values would cause the register file to output the addressed register's contents onto output R_data. R_data, R_addr, and R_en are known as a register file's read port. The read port and write port are independent of one another. Thus, during the same clock cycle, we can write to one register, and read from another (or the same) register.
Let's consider how to internally design a register file. For simplicity, consider a 4x32 register file, rather than the 16x32 register file described above. One internal design of a 4x32 register file is shown in Figure 4.81. Let's consider the circuitry for writing to this register file, found in the left half of the figure. If W_en=0, the register file won't write to any register, because the write decoder's outputs will be all 0s. If W_en=1, then the write decoder decodes W_addr and sets to 1 the load input of exactly one register. That register will be written on the next clock cycle with the value on W_data.
Figure 4.81 Internal design of a 4x32 register file, with a 2x4 write decoder.
Such components are more commonly known as "tri-state" drivers rather than "three-state." But "tri-state" is a registered trademark of National Semiconductor Corp., so rather than include the required trademark symbol after every use of the term "tri-state," many documents use the term "three-state."
Notice the circled one-input one-output component placed on the W_data line (there would actually be 32 such components since W_data is 32 bits wide). That component is known as a driver, sometimes called a buffer, illustrated in Figure 4.82(a). A driver's output equals its input, but the output is a stronger (higher current) signal. Remember the fanout problem we described in Example 4.24? A driver reduces the fanout problem. In Figure 4.81, the W_data lines only fan out to two registers before they go through the driver. The driver's output then fans out to only two more registers. Thus, instead of a fanout of four, the W_data lines have a fanout of only two (actually three if you count the driver itself). The insertion of drivers is beyond the scope of this book, and is instead a subject for a VLSI design book or an advanced digital design book. But seeing at least one example of the use of a driver hopefully gives you an idea of one reason why a register file is a useful component: the component hides the complexity of fanout from a designer.
Figure 4.82 (a) Driver: q=d. (b) Three-state driver: when c=1, q=d; when c=0, q='Z' (like no connection).
To understand the read circuitry, you must first understand the behavior of another new component that we've introduced: the triangular component having two inputs and one output. That component is known as a three-state driver or three-state buffer, illustrated in Figure 4.82(b). When the control input c is 1, the component acts like a regular driver: the component's output equals its input. However, when the control input c is 0, the driver's output is neither 0 nor 1, but instead what is known as high-impedance, written as 'Z'. High-impedance can be thought of as no connection at all between the driver's input and output. "Three-state" means the driver has three possible output states: 0, 1, and Z.
Let's now consider the circuitry for reading from the register file, found in the right half of Figure 4.81. If R_en=0, the register file won't read from any register, since the read decoder's outputs will be all 0s, meaning all the three-state drivers will output Z's, and thus the output R_data will be high-impedance. If R_en=1, then the read decoder decodes R_addr and sets to 1 the control input of exactly one three-state driver, which will pass its register value through to the R_data output.
Be aware that each shown three-state driver actually represents a set of 32 three-state drivers, one for each of the 32 wires coming from the 32-bit registers and going to the 32-bit R_data output. All 32 drivers in a set are controlled by the same control input.
The wires fed by the various three-state drivers are known as a bus, as indicated in Figure 4.81. A bus is a popular alternative to a multiplexor when each mux data input is many bits wide and/or when there are many mux data inputs, because a bus results in less congestion.
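The read path described above can be sketched behaviorally in Python. This is only an illustrative model, not a circuit description: the function names three_state, bus_value, and read_port are hypothetical, each register is treated as a single integer rather than 32 separate wires, and the contention check is an added assumption about what should happen if two drivers were ever enabled at once.

```python
def three_state(d, c):
    """Three-state driver: pass d through when c is 1, else output 'Z'."""
    return d if c == 1 else "Z"


def bus_value(driver_outputs):
    """Resolve a shared bus; at most one driver may be enabled at a time."""
    active = [v for v in driver_outputs if v != "Z"]
    if len(active) > 1:
        raise ValueError("bus contention: more than one driver enabled")
    return active[0] if active else "Z"


def read_port(registers, r_addr, r_en):
    """Read side of a register file: the decoder enables one driver."""
    # The read decoder outputs all 0s when r_en is 0, otherwise a single 1.
    enables = [1 if (r_en == 1 and i == r_addr) else 0
               for i in range(len(registers))]
    return bus_value([three_state(q, c) for q, c in zip(registers, enables)])


regs = [5, 22, 177, 9]
print(read_port(regs, r_addr=3, r_en=1))  # 9
print(read_port(regs, r_addr=3, r_en=0))  # Z
```

Because the decoder enables at most one driver, the bus never sees two competing values, which is why no multiplexor is needed on the read side.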
Notice that the register file design scales well to larger numbers of registers. The write data lines can be driven by more drivers if necessary. The read data lines are fed from three-state drivers and thus there is no congestion at a single multiplexor. The reader may wish to compare the register file design in Figure 4.81 with the design in Figure 4.6, which was essentially a poor design of a register file.
Figure 4.83 provides example timing diagrams describing writing and reading of a register file. During cycle1, we do not know the contents of the register file, so the register file's contents are shown as "?". During cycle1, we set W_data=9 (in binary, of course), W_addr=3, and W_en=1. Those values cause a write of 9 to register file location 3 on the first clock edge. Notice that we had set R_en=0, so the register file outputs nothing ("Z"), and the value we put on R_addr does not matter (the value is a "don't care," written as "X").
Figure 4.83 Writing and reading a register file. (Timing diagram over cycles 1 to 6 showing clk, W_data, W_addr, W_en, R_data, R_addr, R_en, and the register contents.)
During cycle2, we set W_data=22, W_addr=1, and W_en=1. These values cause a write of 22 to register file location 1 on clock edge 2.
During cycle3, we set W_en=0, so then it doesn't matter to what values we set W_data and W_addr. We also set R_addr=3 and R_en=1. Those values cause the register file to read out the contents of register file location 3 onto R_data, causing R_data to output 9. Notice that the reading is not synchronized to clock edges: R_data changes soon after R_en becomes 1. Examining the design of Figure 4.81 should make clear why reading is not synchronous: setting R_en to 1 simply enables the output decoder to turn on one set of the three-state buffers.
During cycle4, we return R_en to 0. Note that this causes R_data to become "Z" again.
During cycle5, we want to simultaneously write and read the register file. We read location 1 (which causes R_data to become 22) while writing location 2 with the value 177.
Finally, during cycle6, we want to simultaneously read and write the same register file location. We set R_addr=3 and R_en=1, causing location 3's contents of 9 to appear on R_data shortly after setting those values. We also set W_addr=3, W_data=555, and W_en=1. On clock edge 6, 555 thus gets stored into location 3. Notice that soon after that clock edge, R_data also changes to 555.
The ability to simultaneously read and write locations of a register file, even the same location, is a widely used feature of register files. The next example makes use of that feature.
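The six cycles just described can be replayed with a small behavioral model. The sketch below is an assumption-laden illustration rather than the book's design: the class name RegisterFile and its method names are hypothetical, writes are modeled as happening at the clock edge that ends each cycle, reads are modeled as purely combinational, and W_en is assumed to be 0 during cycle4 (the text does not say).

```python
class RegisterFile:
    """Behavioral model: synchronous write port, asynchronous read port."""

    def __init__(self, num_regs):
        self.regs = ["?"] * num_regs  # contents are unknown at first

    def read(self, r_addr, r_en):
        # Reading is combinational: the output follows R_addr/R_en at once.
        return self.regs[r_addr] if r_en == 1 else "Z"

    def clock_edge(self, w_data, w_addr, w_en):
        # Writing happens only on a clock edge, and only if W_en is 1.
        if w_en == 1:
            self.regs[w_addr] = w_data


rf = RegisterFile(4)
# (W_data, W_addr, W_en, R_addr, R_en) for cycles 1 to 6; None = don't care.
cycles = [
    (9, 3, 1, None, 0),
    (22, 1, 1, None, 0),
    (None, None, 0, 3, 1),
    (None, None, 0, None, 0),  # assumption: no write during cycle4
    (177, 2, 1, 1, 1),
    (555, 3, 1, 3, 1),
]
trace = []
for w_data, w_addr, w_en, r_addr, r_en in cycles:
    before = rf.read(r_addr, r_en)        # R_data during the cycle
    rf.clock_edge(w_data, w_addr, w_en)   # edge at the end of the cycle
    after = rf.read(r_addr, r_en)         # R_data just after that edge
    trace.append((before, after))
    print(before, after)
```

In the last cycle the model shows R_data as 9 before the edge and 555 after it, matching the simultaneous read and write of location 3 described above.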
EXAMPLE 4.25 Above-mirror display system using a 16x32 register file
Example 4.4 used four 8-bit registers for an above-mirror display system. Example 4.24 extended the system to use sixteen 32-bit registers, resulting in fanout and congestion problems. We can redo that design using a register file. The design is shown in Figure 4.84. Since the system always outputs one of the register values to the display, we tied the R_en input to 1. Notice that the writing and reading of particular registers are independent of one another.
Figure 4.84 Above-mirror display design, using a 16x32 register file.
A register file having one read port and one write port is sometimes referred to as a dual-ported register file. To make clear that the two ports consist of one read port and one write port, such a register file may be referred to as follows: dual-ported (1 read, 1 write) register file.
A register file may actually have just one port, which would be used for both reading and writing. Such a register file has only one set of data lines that can serve as inputs or outputs, one set of address inputs, an enable input, and one more input indicating whether we wish to write or read the register file. Such a register file is known as a single-ported register file.
Multiported (2 Read, 1 Write) Register File. Many register files have three ports: one write port, and two read ports. Thus, in the same clock cycle, two registers can be read simultaneously, and another register written. Such a register file is especially useful in a microprocessor, since a typical microprocessor instruction operates on two registers and stores the result in a third register, like in the instruction "R0 <- R1 + R2."
We can create a second read port in a register file by adding another set of lines, Rb_data, Rb_addr, and Rb_en. We would introduce a second read decoder with inputs Rb_addr and enable input Rb_en, a second set of three-state drivers, and a second bus connected to the Rb_data output.
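A minimal behavioral sketch of a 2-read, 1-write register file follows, showing how an instruction like "R0 <- R1 + R2" could use all three ports in one cycle. The class name RegisterFile2R1W, the method names, and the Ra_ naming of the first read port are assumptions made for illustration; the initial contents of 0 are also assumed.

```python
class RegisterFile2R1W:
    """Behavioral model with two read ports (Ra, Rb) and one write port."""

    def __init__(self, num_regs):
        self.regs = [0] * num_regs  # assumption: registers start at 0

    def read_a(self, ra_addr, ra_en=1):
        # First read port: its own decoder, drivers, and bus.
        return self.regs[ra_addr] if ra_en == 1 else "Z"

    def read_b(self, rb_addr, rb_en=1):
        # Second read port: a second decoder, set of drivers, and bus.
        return self.regs[rb_addr] if rb_en == 1 else "Z"

    def clock_edge(self, w_data, w_addr, w_en):
        # Single write port, written on the clock edge when enabled.
        if w_en == 1:
            self.regs[w_addr] = w_data


rf = RegisterFile2R1W(16)
rf.clock_edge(7, 1, 1)  # R1 = 7
rf.clock_edge(5, 2, 1)  # R2 = 5

# One cycle of "R0 <- R1 + R2": both reads occur during the cycle,
# and the sum is written on the clock edge that ends the cycle.
total = rf.read_a(1) + rf.read_b(2)
rf.clock_edge(total, 0, 1)
print(rf.regs[0])  # 12
```

With only one read port, the same instruction would need two cycles just to fetch its operands, which is one reason the second read port is so common.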
Other Register File Variations. Register files come in all sorts of configurations. Typical numbers of registers in a register file range from 4 to 1024, and typical register widths range from 8 bits to 64 bits per register, but may vary beyond those ranges. Register files may have one port, two ports, three ports, or even more, but increasing to many more than three ports can slow down the register file's performance and increase its size significantly, due to the difficulty of routing all those wires around inside the register file. Nevertheless, you'll occasionally run across register files with perhaps 3 write ports and 3 read ports, when concurrent access is critical.
4.11 DATAPATH COMPONENT TRADEOFFS (SEE SECTION 6.4)
For each datapath component that we introduced in previous sections, we created the most basic and easy-to-understand implementation. In this section, which physically appears in the book as Section 6.4, we describe alternative implementations of several datapath components. Each alternative trades off one design criterion for another; most of those alternatives trade off larger size in exchange for less delay. One use of this book covers those alternative implementations immediately after introducing the basic implementations (meaning now). Another use of the book covers those alternative implementations later, after showing how to use datapath components during register-transfer level design.
4.12 DATAPATH COMPONENT DESCRIPTION USING HARDWARE DESCRIPTION LANGUAGES (SEE SECTION 9.4)
This section, which physically appears in the book as Section 9.4, shows how to use HDLs to describe several datapath components. One use of the book describes such HDL use now, while another use describes such HDL use later.
4.13 PRODUCT PROFILE: AN ULTRASOUND MACHINE
If you or someone you know has ever had a baby, then you may have seen ultrasound images of that baby before he/she was born, like the image of a fetus' head in Figure 4.85(a).
Figure 4.85 (a) Ultrasound image of a fetus, created using an ultrasound device that is simply placed on the mom's abdomen (b) and that forms the image by generating sound waves and listening to the echoes. Photos courtesy of Philips Medical Systems.
That image wasn't taken by a camera somehow inserted into the uterus, but rather by an ultrasound machine pressed against the mom's skin and pointed toward the fetus. Ultrasound imaging is now common practice in obstetrics, mainly helping doctors to track the fetus' progress and correct potential problems early, but also giving parents a huge thrill when they get their first glimpse of their baby's head, hands, and little feet.
Functional Overview
This section briefly describes the key functional ideas of how ultrasound imaging works. Digital designers don't typically work in a vacuum; instead, they apply their skills to particular applications, and thus designers typically learn the key functional ideas underlying those applications. We therefore introduce you to the basic ideas of ultrasound applications.
Ultrasound imaging works by sending sound waves into the body and listening to the echoes that return. Objects like bones yield different echoes than objects like skin or fluids, so an ultrasound machine processes the different echoes to generate images like that in Figure 4.85(a): strong echoes might be displayed as white, weak ones as black. Today's ultrasound machines rely heavily on fast digital circuits to generate the sound waves, listen to the echoes, and process the echo data to generate good quality images in real time.
Figure 4.86 Basic components of an ultrasound machine.
Figure 4.86 illustrates the basic parts of an ultrasound machine. Let's discuss each part individually.
Transducer
A transducer converts energy from one form to another. You're certainly familiar with one type of transducer, a stereo speaker, which converts electrical energy into sound by changing the current in a wire, which causes a nearby magnet to move back and forth, which pushes the air and hence creates sound. Another familiar transducer is a dynamic microphone, which converts sound into electrical energy by letting sound waves move a magnet, which induces current changes in a nearby wire. In an ultrasound machine, the transducer converts electrical pulses into sound pulses, and sound pulses (the echoes) into electrical pulses, but the transducer uses piezoelectric crystals instead of magnets. Applying electric current to such a crystal causes the crystal to change shape rapidly, or vibrate, thus generating sound waves, typically in the 1 to 30 Megahertz frequency range. Humans can't hear much above 30 kilohertz; the term "ultrasound" refers to the fact that the frequency is beyond human hearing. Inversely, sound waves (echoes) hitting the crystal create electric current. An ultrasound machine's transducer component may contain hundreds of such crystals, which we can think of as hundreds of transducers. Each such transducer is considered to form a channel.
Beamformer
A beamformer electronically "focuses" and "steers" the sound beam of an array of transducers to or from particular focal points, without actually moving any hardware like a dish to obtain such focusing and steering.
Real designers must often learn about the domain for which they will design. Many designers consider such learning about domains, like ultrasound, as one of the fascinating features of the job.
To understand the idea of beamforming, we must first understand the idea of additive sound. Consider two loud fireworks exploding at the same time, one 1 mile away from you, and the other 2 miles away. You'll hear the closer firework after about 5 seconds, assuming sound travels 0.2 miles/second (or 1 mile every 5 seconds), a reasonable approximation. You'll hear the farther firework after about 10 seconds. So you'll hear "boom ... (five seconds pass) ... boom." However, suppose instead that the closer firework exploded 5 seconds later than the farther one. Then you'll hear both at the same time: one big "BOOM!" That's because the two sounds add together. Now suppose there are 100 fireworks spread throughout a city, and you want all the sound from those fireworks to reach one particular house (perhaps somebody you don't like very much) at the same time. You can do this by exploding the closer fireworks later than the farther fireworks. If you time everything just right, that particular house will hear a tremendously loud single "BOOOOOM!!!!," probably rattling the house's walls pretty good, as if one huge firework had exploded. Other houses throughout the city will instead hear a series of quieter booms, since the timing of the explosions doesn't result in all the sounds adding at those other houses.
Now you understand a basic principle of beamforming: If you have multiple sound sources (fireworks in our example, transducers in an ultrasound machine) in different locations, you can cause the sound to add together at any desired point in space, by carefully timing the generation of sound from each source such that all the sound waves arrive at the desired point at the same time. In other words, you can electronically focus and steer the sound beam by introducing appropriate delays. Focusing and steering the sound to a particular point is useful because then that point will produce a much louder echo than all other points, so we can easily hear the echo from that point over all the echoes from other points.
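The timing rule above reduces to simple arithmetic: the farthest source fires first, and every closer source waits by the difference in travel time. The Python sketch below illustrates this under stated assumptions: the function name transmit_delays is hypothetical, sources and the focal point are given as 2-D coordinates, and sound is assumed to travel in straight lines at a constant speed.

```python
import math


def transmit_delays(sources, focal_point, speed):
    """Delay for each source so all wavefronts reach focal_point together."""
    # Travel time from each source to the focal point.
    travel = [math.dist(s, focal_point) / speed for s in sources]
    latest = max(travel)
    # The farthest source fires at time 0; closer sources wait.
    return [latest - t for t in travel]


# Fireworks from the text: 1 mile and 2 miles away, sound at 0.2 miles/second.
delays = transmit_delays([(1.0, 0.0), (2.0, 0.0)], (0.0, 0.0), 0.2)
print(delays)  # closer firework waits about 5 seconds, farther one 0
```

Changing the focal point changes every delay, which is why the delays have to be programmable rather than fixed.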
Figure 4.87 illustrates the concept of electronic focusing and steering, using two sound sources to focus and steer a beam to a desired point X.
Figure 4.87 Focusing sound at a particular point using beamforming: (a) first time step: only the bottom transducer generates sound, (b) second time step: the top transducer now generates sound too, (c) third time step: the two sound waves add at the focal point, (d) an illustration showing that the top transducer is two time steps away from the focal point, while the bottom transducer is three time steps away, meaning the top transducer should generate sound one time step later than the bottom transducer.
At the first time step (Figure 4.87(a)), the bottom source has begun transmitting its sound wave. After two time steps (Figure 4.87(b)), the top source has begun transmitting its sound wave. After three time steps (Figure 4.87(c)), the waves from both sources reach the focal point, adding together. They will continue adding as long as the waves from both sources are in phase with one another. We can simplify the drawing by showing only the lines from the sources to the focal point, as shown in Figure 4.87(d).
An ultrasound machine uses this ability to electronically focus and steer sound, in order to scan, point by point, the entire region in front of the transducers. The machine does such scanning perhaps tens or hundreds of times per second.
For each focal point, the machine needs to listen to the echo that comes back from whatever object is located at the focal point, to determine if that object is bone, skin, blood, etc., utilizing the fact that each such object generates a different echo. Remember, the echo from the focal point will be louder than echoes from other points, because the sound adds at that point. We can use beamforming to also focus in on a particular point in space that we want to listen to. In the same way that we generated sound pulses with particular delays to focus the sound on a particular point, likewise, to "listen" to the sounds from a particular point, we also want to introduce delays to the signals received by the transducers. That's because the sounds will arrive at the closer transducers sooner than at the farther transducers, so by using appropriate delays, we can "line up" the signals from each transducer so that the sounds coming from the focal point all add together. This concept is shown in Figure 4.88.
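The "line up and add" idea for listening can be sketched on sampled signals. The following Python is a simplified illustration, not the machine's actual processing: the function names delay_signal and delay_and_sum are hypothetical, signals are short lists of integers, delays are whole time steps, and delayed signals are padded with zeros.

```python
def delay_signal(samples, delay):
    """Delay a sampled signal by `delay` time steps, padding with zeros."""
    return [0] * delay + list(samples)


def delay_and_sum(channels, delays):
    """Delay each channel, then add the channels sample by sample."""
    delayed = [delay_signal(ch, d) for ch, d in zip(channels, delays)]
    length = max(len(ch) for ch in delayed)
    # Pad shorter channels so every channel has the same number of samples.
    padded = [ch + [0] * (length - len(ch)) for ch in delayed]
    return [sum(column) for column in zip(*padded)]


# The echo pulse reaches the top (closer) transducer one step before the bottom.
top = [0, 1, 0, 0]
bottom = [0, 0, 1, 0]
print(delay_and_sum([top, bottom], [0, 0]))  # [0, 1, 1, 0]
print(delay_and_sum([top, bottom], [1, 0]))  # [0, 0, 2, 0, 0]
```

Without the delay the two pulses stay separate and weak; delaying the top channel by one step makes them coincide and add, as in Figure 4.88(d).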
Figure 4.88 Listening to sound from a particular point using beamforming: (a) first time step, (b) second time step: the top transducer has heard the sound first, (c) third time step: the bottom transducer hears the sound at this time, (d) delaying the top transducer by one time step results in the waves from the focal point adding, amplifying the sound.
Note that there will certainly be echoes from other points in the region, but those coming from the focal point will be much stronger; hence, the weaker echoes can be filtered out.
Notice that beamforming can be used to listen to a particular point even if the sounds coming from that point are not echoes coming from our own sound pulses; the sound could be coming from the object at the point itself, such as a car engine or a person talking. Beamforming is the electronic equivalent to pointing a big parabolic dish in a particular direction, but beamforming requires no moving parts.
Beamforming is tremendously common in a wide variety of sonar applications, such as observing a fetus, observing a human heart, searching for oil underground, monitoring the surroundings of a submarine, spying, etc. Beamforming is used in some hearing aids having multiple microphones, to focus in on the source of detected speech; in that case, the beamforming must be adaptive. Beamforming can be used in multimicrophone cell phones to focus in on the user's voice, and can even be used in cellular telephone base stations (using radio signals though, not sound waves) to focus a signal going to or coming from a cell phone.
Signal Processor, Scan Converter, and Monitor
The signal processor analyzes the echo data of every point in the scanned region, by filtering out noise (see Section 5.11 for a discussion on filtering), interpolating between points, assigning a level of gray to each point depending on the echoes heard (echoes corresponding to bones might be shaded as white, liquid as black, and skin as gray, for example), and other tasks. The result is a gray-scale image of the region. The scan converter steps through this image to generate the necessary signals for a black-and-white monitor, and the monitor displays the image.
Digital Circuits in an Ultrasound Machine's Beamformer
Many of the control and signal processing tasks in an ultrasound machine are carried out using software running on one or more microprocessors, typically special microprocessors specifically designed for digital signal processing, known as digital signal processors, or DSPs. But certain tasks are much more amenable to custom digital circuitry, such as those in the beamformer.
Sound Generation and Echo Delay Circuits
Beamforming during the sound generation step consists of providing appropriate delays to hundreds of transducers. Those delays vary depending on the focal point, so they can't be built into the transducers themselves. Instead, we can place a delay circuit in front of each transducer, as shown in Figure 4.89. For a given focal point, the DSP writes the appropriate delay value into each delay circuit, by writing the delay value on the bus labeled delay_out, writing the "address" on the lines labeled addr, and enabling the decoder. The decoder will thus set the load line of one of the OutDelay components. After writing to every such component, the DSP starts all of them simultaneously by setting start_out to 1. Each OutDelay component will, after the specified delay, pulse its o output, which we'll assume causes the transducer to generate sound. The DSP would then set start_out to 0, and then listen for the echo.
Figure 4.89 Transducer output delay circuits for two channels.
We can implement the OutDelay component using a down-counter with parallel load, as in Figure 4.90. The parallel load inputs L and ld load the down-counter with its count value. The cnt input commences the down-counting; when the counter reaches zero, the counter pulses tc. The data output of the counter is unused in this implementation.
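A behavioral sketch of such a down-counter based delay is given below. It is a rough model under stated assumptions, not the circuit of Figure 4.90: the class name OutDelay and the done flag are introduced here for illustration, the pulse is modeled as a return value of 1 on exactly one clock cycle, and the exact cycle on which a real counter asserts its terminal-count output may differ by one from this model.

```python
class OutDelay:
    """Down-counter with parallel load; pulses once when the count hits zero."""

    def __init__(self):
        self.count = 0
        self.done = False  # assumption: pulse only once per loaded value

    def clock_edge(self, L, ld, cnt):
        """Advance one clock cycle; return 1 on the cycle the pulse occurs."""
        if ld == 1:
            self.count = L  # parallel load of the delay value
            self.done = False
            return 0
        if cnt == 1 and not self.done:
            if self.count == 0:
                self.done = True  # terminal count reached: emit the pulse
                return 1
            self.count -= 1
        return 0


d = OutDelay()
d.clock_edge(L=3, ld=1, cnt=0)  # the DSP writes a delay of 3
pulses = [d.clock_edge(L=0, ld=0, cnt=1) for _ in range(6)]
print(pulses)  # [0, 0, 0, 1, 0, 0]
```

Loading a different value into each channel's counter and then asserting the count input on all of them together staggers the pulses by exactly the programmed amounts.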
After the ultrasound machine sends out sound waves focused on a particular focal point, the machine must listen to the echo coming back from that focal point. This listening requires appropriate delays for each transducer to account for the differing distances of each transducer from the focal point. Thus, each transducer needs another delay circuit for delaying the received echo signal, as shown in Figure 4.91. The EchoDelay component receives on input t the signal from the transducer, which we'll assume has been digitized into a stream of N-bit values. The component should output that signal on output t_delayed, delayed by the appropriate amount. The delay amount can be written by the DSP using the component's d and ld inputs.
We can implement the EchoDelay component using a series of registers, as shown in Figure 4.92. That implementation can delay the output signal by 0, 1, 2, or 3 clock cycles, simply using the appropriate select line values for the 4x1 mux. A longer register chain, along with a larger mux, would support longer delays. The DSP configures the delay amount by writing to the top register, which sets the 4x1 mux select lines. A more flexible implementation of the EchoDelay component would instead use a timer component.
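The register chain and mux can be modeled in a few lines. This sketch assumes a chain of three registers initialized to 0, with mux input 0 carrying the undelayed signal and inputs 1 to 3 carrying the successive register outputs; the class name EchoDelay and the method names are hypothetical.

```python
class EchoDelay:
    """Three-register chain with a 4x1 mux selecting a delay of 0 to 3 cycles."""

    def __init__(self):
        self.chain = [0, 0, 0]  # outputs of the three chained registers
        self.select = 0         # delay amount written by the DSP

    def load_delay(self, d):
        if not 0 <= d <= 3:
            raise ValueError("this chain supports delays of 0 to 3 cycles")
        self.select = d

    def output(self, t):
        # Mux input 0 is the undelayed signal; inputs 1..3 are register outputs.
        return ([t] + self.chain)[self.select]

    def clock_edge(self, t):
        # On each clock edge, every register loads the value before it.
        self.chain = [t] + self.chain[:2]


ed = EchoDelay()
ed.load_delay(2)
out = []
for t in [7, 8, 9, 10, 11]:
    out.append(ed.output(t))
    ed.clock_edge(t)
print(out)  # [0, 0, 7, 8, 9]
```

Supporting a delay of up to k cycles this way takes k registers and a (k+1)-input mux per channel, which hints at why a timer-based design becomes attractive for long delays.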
Figure 4.90 OutDelay circuit.
Figure 4.91 Transducer output and echo delay circuits for one channel.
Figure 4.92 EchoDelay circuit.
Summation Circuits: Adder Tree
The output of each transducer, appropriately delayed, should be summed to create a single echo signal from the focal point, as was illustrated in Figure 4.88. That illustration had only two transducers, and thus only one adder. What if we have 256 transducers, as would be more likely in a real ultrasound machine? How do we add 256 values? We could add the values in a linear way, illustrated in Figure 4.93(a) for eight values. The delay of that circuit is roughly equal to the delay of seven adders. For 256 values, the delay would roughly be that of 255 adders. That's a very long delay.
Figure 4.93 Adding many numbers: (a) linearly, (b) using an adder tree. Note that both methods use seven adders.
We can do better by reorganizing how we compute the sum, using a configuration of adders known as an adder tree. In other words, rather than computing ((((((A+B)+C)+D)+E)+F)+G)+H, depicted in Figure 4.93(a), we could instead compute ((A+B)+(C+D)) + ((E+F)+(G+H)), as shown in Figure 4.93(b). The answer comes out the same and uses the same number of adders, but the latter method computes four additions in parallel, then two additions in parallel, and then performs a last addition. The delay is thus only that of three adders. For 256 values, the tree's first level would compute 128 additions in parallel, the second level would compute 64 additions, then 32, then 16, then 8, then 4, then 2, and finally 1 last addition. Thus, that adder tree would have eight levels, meaning a total delay equal to eight adder delays. That's a lot faster than the 255 adder delays of the linear approach: roughly 32 times faster, in fact.
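The adder counts and delays quoted above can be checked with a short Python sketch. It counts adders and levels rather than modeling real gate delays, and the function names linear_sum and adder_tree_sum are hypothetical; the handling of an odd number of values (passing the leftover value to the next level) is an added assumption, since the text only discusses powers of two.

```python
def linear_sum(values):
    """Chain of adders: returns (sum, adder count, adder delays on the path)."""
    total = values[0]
    for v in values[1:]:
        total += v
    return total, len(values) - 1, len(values) - 1


def adder_tree_sum(values):
    """Pairwise adder tree: returns (sum, adder count, number of levels)."""
    level, adders, levels = list(values), 0, 0
    while len(level) > 1:
        # All additions within one level happen in parallel.
        pairs = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        adders += len(pairs)
        if len(level) % 2:  # an odd value passes through to the next level
            pairs.append(level[-1])
        level, levels = pairs, levels + 1
    return level[0], adders, levels


data = list(range(256))
print(linear_sum(data))      # (32640, 255, 255)
print(adder_tree_sum(data))  # (32640, 255, 8)
```

Both arrangements use 255 adders for 256 values; only the longest path through them differs, 255 adder delays versus 8.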
The output of the adder tree can be fed into a memory to keep track of the results for the DSP, which may access the results sometime after they are generated.
Multipliers
We presented a greatly simplified version of beamforming above. In reality, many other factors must be considered during beamforming. Several of those considerations can be accounted for by multiplying each channel with specific constant values, which the DSP again sets individually for each channel. For example, focusing on a point close to the handheld device may require us to more heavily weigh the incoming signals of transducers near the center of the device. A channel may therefore actually include a multiplier, as shown in Figure 4.94. The DSP could write to the register shown, which would represent a constant by which the transducer signal would be multiplied.
Figure 4.94 Channel extended with a multiplier.
Our introduction of the ultrasound machine is greatly simplified from a real machine, yet even in this simplified introduction, you can see several of this chapter's datapath components in use. We used a down-counter to implement the OutDelay component, and several registers along with muxes for the EchoDelay component. We used many adders to sum the incoming transducer signals, and we used a multiplier to weigh the incoming signals.
Future Challenges in Ultrasound
Over the past two decades, ultrasound machines have moved from mostly analog machines to mostly digital machines. The digital systems consist of both custom digital circuits and software on DSPs and microprocessors, working together to create real-time images.
One of the main trends in ultrasound machines involves creating three-dimensional (3-D) images in real time. Most ultrasound machines of the 1990s and 2000s generated two-dimensional images, with the quality of those images (e.g., more focal points per image) improving during those decades. In contrast to two-dimensional ultrasound, generating 3-D images requires viewing the region of interest from different perspectives, just like people view things from their two eyes. Such generation also requires extensive computations to create a 3-D image from the two (or more) perspectives. The result is a picture like that in Figure 4.95.
Figure 4.95 3-D ultrasound image of a fetus's face. Photo courtesy of Philips Medical Systems.
That's a fetus' face. Impressive, isn't it? Keep in mind that image is made solely from sound waves bouncing into a woman's womb. Color can also be added to distinguish among different fluids and tissues. Those computations take time, but faster processors, coupled with clever custom digital circuits, are bringing real-time 3-D ultrasound closer to reality.
Another trend is toward making ultrasound machines smaller and lighter, so that they can be used in a wider variety of health care situations. Early machines were big and heavy, with more recent ones coming on rollable carts. Some recent versions are handheld. A related trend is making ultrasound machines cheaper, so that perhaps every doctor could have a machine in every examination room, every ambulance could carry a machine to help emergency personnel ascertain the extent of certain wounds, and so on.
Ultrasound is used for numerous other medical applications, such as imaging of the heart to detect artery or valve problems. Ultrasound is also used in various other applications, like submarine region monitoring.
4.14 CHAPTER SUMMARY
In this chapter, we began (Section 4.1) by introducing the idea of new building blocks intended for common operations on multibit data, with those blocks being known as datapath components, or register-transfer-level components. We then introduced a number of datapath components, including registers, shifters, adders, comparators, counters, multipliers, subtractors, arithmetic-logic units, and register files. For each component, we examined two aspects: the internal design of the component, and the use of the component in applications.
We ended (Section 4.13) by describing some principles underlying the operation of an ultrasound machine, and showing how several of the datapath components might be used to implement parts of such a machine. One thing you might notice is how designing a real ultrasound machine would require some knowledge of the domain of ultrasound. The requirement that a software programmer or digital designer have some understanding of an application domain is quite common.
In the coming chapter, you'll apply your knowledge of combinational logic design, sequential logic design (controller design), and datapath components, to build digital circuits that can implement very general and powerful computations.
4.15 EXERCISES
Exercises marked with an asterisk (*) represent especially challenging problems.
For exercises relating to datapath components, each problem indicates whether the
problem emphasizes the component's internal design or the component's use.
SECTION 4.2: REGISTERS

4.1 Trace the behavior of an 8-bit parallel load register with input I, output Q, and load
control input ld by completing the following timing diagram.
[Timing diagram: ld, Q]
4.2 Trace the behavior of an 8-bit parallel load register with input I, output Q, load control
input ld, and synchronous clear input clr by completing the following timing diagram.
[Timing diagram: ld, clr, clk, Q]
4.3 Design a 4-bit register with 2 control inputs s1 and s0, 4 data inputs I3, I2, I1, and I0,
and 4 data outputs Q3, Q2, Q1, and Q0. When s1s0=00, the register maintains its value. When
s1s0=01, the register loads I3..I0. When s1s0=10, the register clears itself to 0000. When
s1s0=11, the register complements itself, so, for example, 0000 would become 1111, and
1010 would become 0101. (Component design problem.)
4.4 Repeat the previous problem, but when s1s0=11, the register reverses its bits, so 1110
would become 0111, and 1010 would become 0101. (Component design problem.)
4.5 Design an 8-bit register with 2 control inputs s1 and s0, 8 data inputs I7..I0, and 8 data
outputs Q7..Q0. s1s0=00 means maintain the present value, s1s0=01 means load, and s1s0=10
means clear. s1s0=11 means to swap the high nibble with the low nibble (a nibble is 4 bits),
so 11110000 would become 00001111, and 11000101 would become 01011100. (Component
design problem.)
4.6 The radar gun used by a police officer is always outputting a radar signal and measuring
the speed of the cars as they pass. However, when the officer wants to ticket an individual for
speeding, he must save the measured speed of the car on the radar unit. Build a system to
implement a speed save feature for the radar gun. The system has an 8-bit speed input S, an
input B from the save button on the radar gun, and an 8-bit output D that will be sent to the
radar gun's speed display. (Component use problem.)
SECTION 4.3: ADDERS
4.7 Trace the values appearing at the outputs of a 3-bit carry-ripple adder for every one
full-adder delay time period, when adding 111 with 011. Assume all inputs were previously
zero for a long time.
4.8 Assuming all gates have a delay of 1 time unit, compute the longest time required to add
two numbers using an 8-bit carry-ripple adder.
4.9 Assuming AND gates have a delay of 2 time units, OR gates have a delay of 1 time unit, and
XOR gates have a delay of 3 time units, compute the longest time required to add two
numbers using an 8-bit carry-ripple adder.
4.10 Design a carry-ripple adder using carry-ripple adders. (Component use problem.)
4.11 Design an adder that computes the sum of three 8-bit numbers, using 8-bit carry-ripple
adders. (Component use problem.)
4.12 Design an adder that computes the sum of four 8-bit numbers, using 8-bit carry-ripple
adders. (Component use problem.)
4.13 Design a digital thermometer system that can compensate for errors in the temperature
sensing device's output T, which is an 8-bit input to our system. The compensation amount
can be positive only, and comes to our system via inputs a, b, and c from a 3-pin DIP switch.
Our system should output the compensated temperature on an 8-bit output U. (Component use
problem.)
4.14 Repeat the previous problem, except that the compensation amount can be positive or
negative, coming to our system via four inputs a, b, c, and d from a 4-pin DIP switch. The
compensation amount is in two's complement form (so the person setting the DIP switch had
better know that!). Design the circuit. What is the range by which the input temperature can
be compensated? (Component use problem.)
4.15 We can add three 8-bit numbers by chaining one 8-bit carry-ripple adder to the output of
another 8-bit carry-ripple adder. Assuming every gate has a delay of 1 time unit, compute the
longest delay of this three 8-bit number adder. Hint: you may have to look carefully inside
the carry-ripple adders, even inside the full-adders, to correctly compute the longest delay
from any input to any output. (Component use problem.)
SECTION 4.4: SHIFTERS
4.16 Design an 8-bit shifter that shifts its inputs two bits to the right (shifting in 0s) when
the shifter's shift control input is 1. (Component design problem.)
4.17 Design a circuit that outputs the average of four 8-bit inputs representing binary numbers
(not in two's complement form). (Component use problem.)
4.18 Design a circuit that takes an 8-bit input D representing a binary number (not in two's
complement form), and outputs twice that value. (Component use problem.)
4.19 Design a circuit that outputs nine times its 8-bit input D representing binary numbers
(not in two's complement form). Hint: use a shifter and an adder. (Component use problem.)
4.20 Design a special multiplier circuit that can multiply its 16-bit input by 1, 2, 4, 8, 16,
or 32, specified by three inputs a, b, c (abc=000 means no multiply, abc=001 means multiply
by 2, abc=010 means by 4, abc=011 means by 8, abc=100 means by 16, abc=101 means by 32).
Hint: Use a predefined component described in this chapter. (Component use problem.)
4.21 Trace through the execution of the barrel shifter shown in Figure 4.42, when I=01100101,
x = 1, y = 0, z = 1. Be sure to show how the input I is shifted after each internal shifter
stage.
4.22 Trace through the execution of the barrel shifter shown in Figure 4.42, when I=10011011,
x = 0, y = 1, z = 0. Be sure to show how the input I is shifted after each internal shifter
stage.
4.23 Using the barrel shifter shown in Figure 4.42, what settings of the inputs x, y, and z are
required to shift the input I left by six positions?
SECTION 4.5: COMPARATORS
4.24 Trace through the execution of the 4-bit magnitude comparator shown in Figure 4.45 when
a = 15 and b = 12. Be sure to show how the comparisons propagate through the individual
comparators.
4.25 Design a comparator that determines if three 4-bit numbers are equal, by connecting 4-bit
magnitude comparators together and using additional components if necessary. (Component
use problem.)
4.26 Design a 4-bit carry-ripple style magnitude comparator that has two outputs, a
greater-than or equal-to output gte, and a less-than or equal-to output lte. Be sure to clearly
show the equations used in developing the individual 1-bit comparators and how they are
connected to form the 4-bit circuit. (Component design problem.)
4.27 Design a S-bil magnitude comparator. (Compollelll design problem.)
4.28 Design a circuit that outputs 1 if the circuit's 8-bit input equals 99:
(a) using an equality comparator,
(b) using gates only.
Hint: In the case of (b), you need only 1 AND gate and some inverters. (Component use
problem.)
4.29 Use magnitude comparators and logic to design a circuit that computes the minimum of
three 8-bit numbers. (Component use problem.)
4.30 Use magnitude comparators and logic to design a circuit that computes the maximum of two
16-bit numbers. (Component use problem.)
4.31 Use magnitude comparators and logic to design a circuit that outputs 1 when an 8-bit
input is between 75 and 100, inclusive. (Component use problem.)
4.32 You are to design a human body temperature alarm system for a hospital. Your system takes
an 8-bit input representing the temperature, which can range from 0 to 255. If the measured
temperature is 95 or less, you should set output A to 1. If the temperature is 96 to 104, you
should set output B to 1. If the temperature is 105 or above, you should set output C to 1.
(Component use problem.)
4.33 You are working as a weight guesser in an amusement park. Your job is to try to guess the
weight of an individual before they step on the scale. If your guess is not within ten pounds
of the individual's actual weight (higher or lower), the individual wins a prize. Build a
weight guess analyzer system that outputs whether the guess was within ten pounds. The weight
guess analyzer has an 8-bit input G, an 8-bit input W from the scale with the actual weight,
and a bit output C that is 1 if the weight was within the defined limits of the game.
(Component use problem.)
SECTION 4.6: COUNTERS
4.34 Design a 4-bit up-counter that has two control inputs: cnt enables counting up, while
clear synchronously resets the counter to all 0s:
(a) using a parallel load register as a building block,
(b) using flip-flops and muxes directly by following the register design process of Section 4.2.
(Component design problem.)
4.35 Design a 4-bit down-counter that has three control inputs: cnt enables counting up, clear
synchronously resets the counter to all 0s, and set synchronously sets the counter to all 1s:
(a) using a parallel load register as a building block,
(b) using flip-flops and muxes directly by following the register design process of Section 4.2.
(Component design problem.)
4.36 Design a 4-bit up-counter with an additional output upper. upper outputs a 1 whenever the
counter is within the upper half of the counter's range, 8 to 15. Use a basic 4-bit up-counter
as a building block. (Component design problem.)
4.37 Design a 4-bit up/down-counter that has four control inputs: cnt_up enables counting up,
cnt_down enables counting down, clear synchronously resets the counter to all 0s, and set
synchronously sets the counter to all 1s. If both count control inputs cnt_up and cnt_down are
1, the counter will retain its current count value. Use a parallel load register as a building
block. (Component design problem.)
4.38 Design a circuit for a 4-bit decrementer. (Component design problem.)
4.39 Design an electronic turnstile system using a 64-bit counter. The input is a bit A, which
is 1 for exactly one clock cycle whenever a person walks through the turnstile. The output is
a 64-bit binary number. A second input B is 1 whenever a reset button is pressed, and should
reset the output to 0s. Knowing that California's Disneyland attracts about 15,000 visitors per
day, and assuming they all pass through your one turnstile, how many days would pass before
your counter would roll over? (Component use problem.)
4.40 (a) Using an up-counter with a synchronous clear control input, and extra logic, design a
circuit that outputs a 1 every 99 clock cycles.
(b) Design the counter from part (a), but use a down-counter with parallel load.
(c) What are the tradeoffs between the two designs from parts (a) and (b)?
(Component use problem.)
4.41 (a) Give the count range for the following sized up-counters: 8-bits, 12-bits, 16-bits,
20-bits, 32-bits, 40-bits, 64-bits, and 128-bits.
(b) For each size of counter in part (a), assuming a 1 Hz clock, indicate how many minutes,
hours, days, etc., the counter would count before wrapping around.
SECTION 4.7: MULTIPLIER-ARRAY STYLE
4.42 Assuming all gates have a delay of 1 time unit, which of the following designs will
compute the 8-bit multiplication A*9 faster:
(a) a circuit as designed in Exercise 4.19, or
(b) an 8-bit array style multiplier with one of its inputs connected to a constant value of nine.
4.43 Design an 8-bit array-style multiplier. (Component design problem.)
4.44 Design a more accurate version of the Celsius to Fahrenheit converter from Example 4.10.
The new conversion circuit receives a digitized temperature in Celsius as a 16-bit binary
number C and outputs the temperature in Fahrenheit as a 16-bit output F. Our more accurate
equation for calculating an approximate conversion from Celsius to Fahrenheit is:
F = C*30/16 + 32. (Component use problem.)
SECTION 4.8: SUBTRACTORS
4.45 Create the internal design of a full-subtractor. (Component design problem.)
4.46 Convert the following two's complement binary numbers to decimal numbers:
(a) 00001111
(b) 10000000
(c) 10000001
(d) 11111111
(e) 10010101
4.47 Convert the following two's complement binary numbers to decimal numbers:
(a) 01001101
(b) 00011010
(c) 11101001
(d) 10101010
(e) 11111100
4.48 Convert the following two's complement binary numbers to decimal numbers:
(a) 11100000
(b) 01111111
(c) 11110000
(d) 11000000
(e) 11100000
4.49 Convert the following 9-bit two's complement binary numbers to decimal numbers:
(a) 011111111
(b) 111111111
(c) 100000000
(d) 110000000
(e) 111111110
4.50 Convert the following decimal numbers to 8-bit two's complement binary form:
(a) 2
(b) -1
(c) -23
(d) -128
(e) 126
(f) 127
(g) 0
4.51 Convert the following decimal numbers to 8-bit two's complement binary form:
(a) 29
(b) 100
(c) 125
(d) -29
(e) -100
(f) -125
(g) -2
4.52 Convert the following decimal numbers to 8-bit two's complement binary form:
(a) 6
(b) 26
(c) -8
(d) -30
(e) -60
(f) -90
(g) -120
4.53 Convert the following decimal numbers to 9-bit two's complement binary form:
(a) 1
(b) -1
(c) -256
(d) -255
(e) 255
(f) -8
(g) -128
4.54 Using 4-bit subtractors, build a subtractor that has three 8-bit inputs, A, B, and C, and a
single 8-bit output F, where F=(A-B)-C. (Component use problem.)
4.55 You are given a digital thermometer that digitizes a temperature into a 16-bit binary
number K in Kelvin. Build a system to convert that temperature to a 16-bit Fahrenheit value.
Use the following equation to provide an approximate conversion: F = (K-273)*2 + 32.
(Component use problem.)
SECTION 4.9: ARITHMETIC-LOGIC UNITS-ALUS
4.56 Design an ALU with two 8-bit inputs A and B, and control signals x, y, and z. The ALU
should support the operations described in Table 4.3. Use an 8-bit adder and an
arithmetic/logic extender. (Component design problem.)
TABLE 4.3 Desired ALU operations.
Inputs
x  y  z    Operation
0  0  0    S = A - B
0  0  1    S = A + B
0  1  0    S = A * 8
0  1  1    S = A / 8
1  0  0    S = A NAND B (bitwise NAND)
1  0  1    S = A XOR B (bitwise XOR)
1  1  0    S = Reverse A (bit reversal)
1  1  1    S = NOT A (bitwise complement)
4.57 Design an ALU with two 8-bit inputs A and B, and control signals x, y, and z. The ALU
should support the operations described in Table 4.4. Use an 8-bit adder and an
arithmetic/logic extender. (Component design problem.)
TABLE 4.4 Desired ALU operations.
Inputs
x  y  z    Operation
0  0  0    S = A + B
0  0  1    S = A AND B (bitwise AND)
0  1  0    S = A NAND B (bitwise NAND)
0  1  1    S = A OR B (bitwise OR)
1  0  0    S = A NOR B (bitwise NOR)
1  0  1    S = A XOR B (bitwise XOR)
1  1  0    S = A XNOR B (bitwise XNOR)
1  1  1    S = NOT A (bitwise complement)
4.58 An instructor teaching Boolean algebra wants to help her students learn and understand
basic Boolean operators by providing the students with a calculator capable of performing
bitwise AND, NAND, OR, NOR, XOR, XNOR, and NOT operations. Using the ALU specified in
Exercise 4.57, build a simple logic calculator using DIP switches for input and LEDs for
output. The logic calculator should have three DIP switch inputs to select which logic
operation to perform. (Component use problem.)
SECTION 4.10: REGISTER FILES
4.59 Design an 8x32 two-port (1 read, 1 write) register file. (Component design problem.)
4.60 Design a 4x4 three-port (2 read, 1 write) register file. (Component design problem.)
4.61 Design a 10x14 register file (one read port, one write port). (Component design problem.)
*4.62 Create a speed-dial system for a telephone. Eight special buttons b0-b7 access each
stored number. The most recently dialed number exists as nine digits stored in nine 8-bit
registers R0-R8. When the phone user presses another button S simultaneously with any button
b0-b7, the most recently dialed number gets stored in the button's corresponding storage. When
the user presses a button b0-b7 by itself, the number in that button's storage gets read out
and placed on nine 8-bit outputs P0-P8. Hint: use nine register files and some extra logic.
(Component use problem.)
DESIGNER PROFILE
Roman began studying Computer Science in college due to his interest in software
development. During his undergraduate studies, his interests expanded to include digital
design and embedded systems and eventually led him to become involved in research developing
new methods to help designers quickly build large integrated circuits (IC). Roman continued
his education through graduate studies and received his M.S. in Computer Science, after which
Roman worked for both a large company designing integrated circuits (IC) for consumer
electronics as well as a start-up company focusing on high-performance processing.
Roman enjoys working as both a software developer and hardware engineer and believes that
"fundamentally software and hardware design are very similar, both relying on efficiently
solving difficult problems. While good problem solving skills are important, good learning
skills are also important." Contrary to what many students may believe, he points out that
"learning is a fundamental activity and skill that does not end when you receive your degree.
In order to solve problems, you often are required to learn new skills, adopt new programming
languages and tools, and determine if existing solutions will help you solve the problems you
face as an engineer." Roman points out that digital design has changed at a rapid pace over
the last few decades, requiring engineers to learn new design techniques, learn new
programming languages, such as VHDL or SystemC, and be able to adopt new technologies to stay
successful. "As the industry continues to advance at such a rapid pace, companies do not only
hire engineers for what they already know, but more so on how well those engineers can
continue to expand their knowledge and learn new skills." He points out that "college provides
students with an excellent opportunity to not only learn the essential information and skills
from their course work but also to learn additional information on their own, possibly by
learning different programming languages, getting involved in research, or working on larger
design projects."
Roman is motivated by his enjoyment of the work he does as well as being able to work with
other engineers who share his interests. "Motivation is one of the keys to success in an
engineering career. While motivation can come from many different sources, finding a career
that you are truly interested in and enjoy really helps. Co-workers are also a great source of
motivation as well as knowledge and technical advice. Working as a member of a team that
communicates well is very rewarding. You are able to motivate each other and use your
strengths along with the strengths of your co-workers to achieve goals far beyond that which
you could achieve on your own."
5
Register-Transfer Level
(RTL) Design
5.1 INTRODUCTION
In the previous chapters, we've defined the combinational and sequential components
needed to build digital systems. In this chapter, we'll learn to build interesting and useful
digital systems from those components. In particular, we'll put together datapath components
to build datapaths, and we'll use controllers to control those datapaths. The combination of a
controller and datapath is known as a processor. Some processors, like those in personal
computers, are programmable; those processors are the focus of Chapter 8. Other processors
are custom-designed for a particular task, and are not programmable; design of such custom
processors is the focus of this chapter.
Digital designers today focus largely on designing custom processors, as opposed to
designing lower-level digital components. We can define a custom processor as a digital
circuit that implements a computer algorithm, that is, a sequence of instructions that carry
out a particular task. For example, we can define an algorithm to filter out noise from a
digitized stream of audio, and we can then create a processor to implement that algorithm.
Another algorithm might encrypt data for secure electronic commerce purposes. An algorithm
might compare a fingerprint to a set of 10,000 fingerprints to quickly enable a police officer
to determine if someone is a wanted criminal. An image processing algorithm might detect a
tank in a large video image. Beamforming, part of the ultrasound machine example in the
previous chapter, can be thought of as another algorithm, implemented using the processor
design described in that chapter. In fact, several of our examples in the previous chapter,
like the above-mirror display, DIP-switch-based calculator, and color space converter, can
actually be thought of as very simple processors implementing simple algorithms.
Processors can be designed using different design methods. The most common method in
practice today is known as register-transfer level design. Register-transfer level design, or
RTL design, actually consists of a wide variety of approaches, but in general, a designer
specifies the registers of a design, describes the possible transfers and operations performed
on input, output, or register data, and defines the control that specifies when to transfer
and operate on data.
Recall the design processes we defined for combinational logic design in Chapter 2,
and for sequential logic (controller) design in Chapter 3:
In the combinational logic design process outlined in Table 2.5,
1. The first step was to capture the desired behavior of the combinational logic,
with either a truth table or an equation.
2. The remaining steps were to convert the behavior to a circuit.
In the sequential logic (controller) design process in Table 3.2,
1. The first step was to capture the desired behavior of the sequential logic, using
a finite-state machine.
2. The remaining steps were to convert the behavior to a circuit.
It should therefore come as no surprise that:
1. The first step of an RTL design method will be to capture the desired behavior of
the processor. We'll introduce the concept of a high-level state machine for capturing
RTL behavior.
2. The remaining steps will be to convert the behavior to a circuit.
Figure 5.1 illustrates the idea that the design process can be viewed as first capturing
behavior and then converting the behavior to structure. That process applies regardless of
whether we are performing combinational logic design, sequential logic design, or RTL design.
Figure 5.1 The design process: capture behavior, then convert to circuit.
In this chapter, we will introduce the RTL design process, also known as the RTL design
method. As the process is largely creative, we will utilize numerous examples to illustrate
the process. We will also introduce several high-level components that are useful during RTL
design, including memory components and queue components.
5.2 RTL DESIGN METHOD
RTL design is carried out using a wide variety of methods in practice, but it may be
useful to define a general method as in Table 5.1.
TABLE 5.1 RTL design method.
Step 1  Capture a high-level state machine
        Describe the system's desired behavior as a high-level state machine.
        The state machine consists of states and transitions. The state machine
        is "high-level" because the transition conditions and the state actions
        are more than just Boolean operations on bit inputs and outputs.
Step 2  Create a datapath
        Create a datapath to carry out the data operations of the high-level
        state machine.
Step 3  Connect the datapath to a controller
        Connect the datapath to a controller block. Connect external Boolean
        inputs and outputs to the controller block.
Step 4  Derive the controller's FSM
        Convert the high-level state machine to a finite-state machine (FSM)
        for the controller, by replacing data operations with setting and reading
        of control signals to and from the datapath.
A fifth step may be necessary, in which one selects a clock frequency. Designers
seeking high performance may choose a clock frequency that is the fastest possible based
on the longest register-to-register delay in the final circuit.
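As a rough illustration of that clock-frequency choice, the sketch below picks the longest of several register-to-register delays and converts it to a maximum frequency. The path names and delay values are hypothetical, chosen only for the example, and the sketch ignores register setup time and clock skew, which a real design would also have to account for.

```python
# Hypothetical register-to-register path delays in nanoseconds.
# These names and numbers are illustrative assumptions, not values from the text.
path_delays_ns = {
    "tot -> adder -> tot": 4.0,
    "tot -> comparator -> state register": 3.0,
    "state register -> controller logic -> state register": 2.0,
}


def max_clock_frequency_mhz(delays_ns):
    """Return the fastest clock frequency (in MHz) the longest path allows."""
    critical_ns = max(delays_ns.values())
    # The clock period must be at least the longest delay, and f = 1 / T.
    # A period in ns converts to a frequency in MHz as 1000 / period.
    return 1000.0 / critical_ns


# The path with the largest delay limits the clock frequency.
critical_path = max(path_delays_ns, key=path_delays_ns.get)
print(critical_path, max_clock_frequency_mhz(path_delays_ns), "MHz")
```

With these assumed delays, the 4 ns adder path is the limiting one, so the clock period can be no shorter than 4 ns.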
Implementing the controller's FSM as a sequential circuit, as we learned in Chapter
3, would then complete the design.
Notice that the first step captures the desired behavior, while the remaining steps
convert that behavior to a circuit.
We'll first provide a small and simple example as a "preview" of the RTL design
method's steps, before we define each step in more detail.
EXAMPLE 5.1 Soda machine dispenser
We are to design a processor for a soda dispenser. A coin detector provides our processor
with a 1-bit input c that becomes 1 for one clock cycle when a coin is detected, and an 8-bit
input a indicating the coin's value in cents. Another 8-bit input s indicates the cost of a
soda (this cost can be set by the machine owner). Once the processor has seen coins whose
value equals or exceeds the cost of a soda, the processor should set an output bit d to 1 for
one clock cycle, causing a soda to be dispensed (this machine has only one type of soda). The
system does not give change; any excess money is kept. Figure 5.2 provides a block symbol of
the system.
Figure 5.2 Soda dispenser block symbol.
Step 1 of our RTL design method is to capture the desired behavior of the system. Figure 5.3
shows a high-level state machine describing the desired behavior. The first state, Init, sets
the output d to 0 and initializes a local register tot to 0. tot will keep track of how many
cents the system has seen so far. The state machine then enters state Wait. (Recall from
Chapter 3 that a transition with no condition has an implicit "true" condition, and thus
transitions on the next rising clock edge.) The state machine stays there as long as no coin
is detected and the total cents seen so far is less than the cost of a soda. When a coin is
detected, the state machine goes to state Add, which adds the coin's value to tot, and then
returns to state Wait. Once tot is greater than or equal to (in other words, not less than)
the cost of a soda, the state machine goes to state Disp, which dispenses a soda by setting d
to 1. The state machine then returns to state Init.
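The behavior described in Step 1 can be sketched in software as a cycle-by-cycle simulation. This is only an illustrative model, not the book's notation: the function name, the list-of-samples input format, and the 8-bit wraparound of tot are assumptions of the sketch, and it assumes input a still holds the coin's value during the cycle spent in state Add.

```python
def soda_dispenser(samples, s):
    """Simulate the high-level state machine, one clock cycle per sample.

    samples: list of (c, a) input pairs, one pair per clock cycle.
    s: cost of a soda in cents.
    Returns a list of (state, d) pairs, one per clock cycle.
    """
    state, tot, trace = "Init", 0, []
    for c, a in samples:
        d = 0  # d is 0 unless a state explicitly sets it to 1
        if state == "Init":
            tot = 0
            nxt = "Wait"
        elif state == "Wait":
            if c:
                nxt = "Add"       # a coin was detected
            elif tot < s:
                nxt = "Wait"      # no coin, and not enough money yet
            else:
                nxt = "Disp"      # no coin, and tot is not less than s
        elif state == "Add":
            tot = (tot + a) & 0xFF  # tot is an 8-bit register (assumed to wrap)
            nxt = "Wait"
        else:  # Disp
            d = 1
            nxt = "Init"
        trace.append((state, d))
        state = nxt
    return trace


# Two 25-cent coins with a 50-cent soda: d becomes 1 for one cycle in Disp.
inputs = [(0, 0), (1, 25), (0, 25), (1, 25), (0, 25), (0, 0), (0, 0), (0, 0)]
for state, d in soda_dispenser(inputs, 50):
    print(state, d)
```

Each loop iteration plays the role of one rising clock edge, so the update to tot made in Add is first seen by the comparison in the following Wait cycle.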
Figure 5.3 Soda dispenser high-level state machine. Inputs: c (bit), a (8 bits), s (8 bits).
Outputs: d (bit). Local registers: tot (8 bits).
Step 2 is to create a datapath. We'll need a local register for tot, an adder connected to
tot and a to compute tot + a, and a comparator connected to tot and s to compute tot < s. The
resulting datapath appears in Figure 5.4.
Step 3 is to connect the datapath to a controller. Figure 5.5 shows the connections. Notice
that the controller's inputs and outputs are all just one-bit signals.
Step 4 is to derive the controller's FSM. The FSM has the same states and transitions as the
high-level state machine, but utilizes the datapath to perform any data operations. Figure
5.6 shows the FSM for the controller. In the high-level state machine, state Init had a data
operation of tot = 0 (tot is 8 bits wide, so that assignment of 0 is not a single-bit
operation). We replace that assignment by setting tot_clr = 1, which clears the tot register
to 0. State Wait's transitions had data operations comparing tot < s. Now we have a
comparator computing that comparison for the controller, so the controller need only look at
the result of that comparison in the signal tot_lt_s. State Add had a data operation of
tot = tot + a. The datapath computes that addition for the controller using the adder, so the
controller merely needs to set tot_ld = 1 to cause the addition result to be loaded into the
tot register.
To complete the design, we would implement the controller's FSM as a state register and
combinational logic. Figure 5.7 shows a partial state table for the controller, with the
states encoded as Init: 00, Wait: 01, Add: 10, and Disp: 11. To complete the controller
design, we would complete the state table, create a 2-bit state register, and create a
circuit for each of the five outputs from the table, as discussed in Chapter 3. Appendix C
provides details of completing the controller's design. That appendix also traces through the
functioning of the controller and datapath with one another.
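To make the split between controller and datapath concrete, the sketch below models them as two separate pieces that communicate only through the one-bit signals tot_lt_s, tot_ld, and tot_clr named in the example. The class and function names are hypothetical, the priority of clear over load is an assumption (the FSM never asserts both), and input a is assumed to still hold the coin's value during the Add cycle.

```python
class Datapath:
    """The tot register, an adder, and a comparator."""

    def __init__(self):
        self.tot = 0

    def tot_lt_s(self, s):
        # Comparator output: a single bit sent to the controller.
        return int(self.tot < s)

    def clock(self, a, tot_ld, tot_clr):
        # Register update on the rising clock edge.
        if tot_clr:
            self.tot = 0
        elif tot_ld:
            self.tot = (self.tot + a) & 0xFF  # adder result, 8 bits


# Controller outputs per state: (d, tot_ld, tot_clr).
OUTPUTS = {
    "Init": (0, 0, 1),
    "Wait": (0, 0, 0),
    "Add": (0, 1, 0),
    "Disp": (1, 0, 0),
}


def next_state(state, c, tot_lt_s):
    """Controller transitions, using only one-bit inputs."""
    if state == "Init":
        return "Wait"
    if state == "Wait":
        if c:
            return "Add"
        return "Wait" if tot_lt_s else "Disp"
    if state == "Add":
        return "Wait"
    return "Init"  # from Disp


def run(samples, s):
    """Clock the controller and datapath together; return d for each cycle."""
    dp, state, d_values = Datapath(), "Init", []
    for c, a in samples:
        d, tot_ld, tot_clr = OUTPUTS[state]
        nxt = next_state(state, c, dp.tot_lt_s(s))  # uses the pre-edge tot
        dp.clock(a, tot_ld, tot_clr)
        state = nxt
        d_values.append(d)
    return d_values


print(run([(0, 0), (1, 25), (0, 25), (1, 25), (0, 25), (0, 0), (0, 0), (0, 0)], 50))
```

Note that the controller code never touches an 8-bit value: every data operation happens inside the Datapath class, which mirrors the point that the controller's inputs and outputs are all single bits.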
Figure 5.5 Soda dispenser controller and datapath connections.
Figure 5.6 Soda dispenser controller FSM. Inputs: c, tot_lt_s (bit). Outputs: d, tot_ld,
tot_clr (bit).
Figure 5.7 Soda dispenser controller's state table (partial).
        Inputs                  Outputs
State   s1  s0  c  tot_lt_s     n1  n0  d  tot_ld  tot_clr
Init    0   0   0  0            0   1   0  0       1
        0   0   0  1            0   1   0  0       1
        0   0   1  0            0   1   0  0       1
        0   0   1  1            0   1   0  0       1
Wait    0   1   0  0            1   1   0  0       0
        0   1   0  1            0   1   0  0       0
        0   1   1  0            1   0   0  0       0
        0   1   1  1            1   0   0  0       0
Add     1   0   0  0            0   1   0  1       0
        ...
Disp    1   1   0  0            0   0   1  0       0
        ...
The previous example gave a preview of the RTL design method. Notice that we
started with a high-level state machine, which wasn't just an FSM because there were
local registers declared, and because there were data operations (rather than just Boolean
operations) in the states and on the transitions. We then created a datapath to implement
those local registers and to carry out the data operations. We further needed a controller to
control that datapath. We defined the behavior of that controller to be the same as the
behavior of the high-level state machine, except the controller's FSM used datapath
control signals to carry out and evaluate the datapath operations. Finally, we could design
the controller using Chapter 3's controller design process.
We now discuss each RTL design method step in more detail, while illustrating each
step with another example.
Step 1-Creating a High-Level State Machine
A high-level state machine is a computation model similar to a finite-state machine, but
with additional features that enable the description of computations involving more than
just Boolean data.
Recall that a finite-state machine (FSM) consists of inputs, outputs, states, state
actions (a mapping of states to output values), and state transitions (a mapping of states
and inputs to next states). However, the inputs and outputs of an FSM are limited to
Boolean types, actions are limited to Boolean equations, and transition conditions are
limited to Boolean expressions. These limitations make specifying computations
involving data cumbersome, other than for just single-bit data.
Figure 5.3 showed a high-level state machine describing the behavior of a soda dispenser
processor. Notice that the state machine is not an FSM, for the several reasons highlighted
in Figure 5.8. One reason is that the state machine has inputs that are 8-bit types,
whereas FSMs only allow inputs and outputs of Boolean types (a single bit each). Another
reason is that the state machine declares a local register tot to store intermediate data,
whereas FSMs don't allow local data storage; the only "stored" item in an FSM is the state
itself. A third reason is that the state actions and transition conditions involve data
operations, like tot = 0 (remember that tot is 8 bits wide), tot < s (there's no "<" Boolean
operator), and tot = tot + a (where the "+" is addition, not OR, and there's no addition
Boolean operator), whereas an FSM allows only Boolean equations and expressions.
Figure 5.8 Soda dispenser high-level state machine with non-FSM constructs highlighted.
Inputs: c (bit), a (8 bits), s (8 bits). Outputs: d (bit). Local registers: tot (8 bits).
Therefore, a useful form of high-level state machine is an extension of an FSM in
which:
inputs and outputs may involve data types beyond just single bits,
local registers may be declared (of various data types), and
actions and conditions may involve general arithmetic equations and expressions,
rather than just Boolean equations and expressions.
Such a high-level state machine is not the only possible extension to an FSM. A number
of varieties of extended FSMs exist. However, we will be utilizing the above-described
extended FSM variety throughout this chapter. That particular variety of high-level state
machine is sometimes called an FSM with data, or FSMD.
We will continue to use the following conventions for high-level state machines,
which we also used for FSMs:
Each transition is implicitly ANDed with a rising clock edge.
Any bit output not explicitly assigned a value in a state is implicitly assigned a 0.
Note: this convention does not apply for multibit outputs.
We now provide another example of describing a system using a high-level state
machine.
EXAMPLE 5.2 Laser-based distance measurer-High-level state machine
There are countless applications that require one to accurately measure the distance of an
object from a known point. For example, road builders need to determine the length of a
stretch of road. Map makers need to accurately determine the locations and heights of hills
and mountains and the sizes of lakes. A giant crane for constructing skyrise buildings needs
to accurately determine the distance of the crane arm from the base. In all of these
applications, stringing out a tape measure to measure the distance is not very practical.
A better method involves laser-based distance measurement.
In laser-based distance measurement, a laser is pointed at the object of interest. The laser
is briefly turned on, and a timer is started. The laser light, traveling at the speed of light,
travels to the object and reflects back. A sensor detects the reflection of the laser light,
causing the timer to stop. Knowing the time T taken by the light to travel to the object and
back, and knowing that the speed of light is 3x10^8 meters/second, we can compute the
distance D easily by the equation: 2D = T seconds * 3x10^8 meters/second. Laser-based
distance measurement is illustrated in Figure 5.9.
Figure 5.9 Laser-based distance measurement: 2D = T sec * 3x10^8 m/sec, where D is the
distance to the object of interest.
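The relation 2D = T * 3x10^8 can be tried out numerically. The following is a minimal
sketch, not part of the original design; the function name and the sample round-trip time
are assumptions chosen only for illustration.

```python
# Speed of light as used in the text (an approximation of the physical constant).
SPEED_OF_LIGHT_M_PER_S = 3e8


def distance_from_round_trip(t_seconds):
    """Return the one-way distance D in meters, from 2D = T * 3x10^8."""
    return t_seconds * SPEED_OF_LIGHT_M_PER_S / 2


if __name__ == "__main__":
    # Hypothetical round-trip time of 10 microseconds.
    print(distance_from_round_trip(10e-6), "meters (approximately)")
```

The division by 2 reflects that T covers the trip to the object and back.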
Let's design a processor to control the laser and the timer and to compute distances up to
2000 meters. A block diagram of the system is shown in Figure 5.10. The system has a bit
input B, which equals 1 when the user presses a button to start the measurement. Another
bit input S comes from the sensor, and is 1 when the reflected laser is detected. A bit output
L controls the laser, turning the laser on when L is 1. Finally, an N-bit output D indicates
the distance in binary, in units of meters; we'll assume a display converts that binary number
into a decimal number and displays the result on an LCD for the user to read. D will have to
be at least 11 bits, since 11 bits can represent the numbers 0 to 2047, and we want to
measure distances up to 2000 meters. Let's make D 16 bits.
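The 11-bit minimum can be checked with a short calculation. This is a small sketch; the
helper name bits_needed is an assumption introduced here, not something from the text.

```python
def bits_needed(max_value):
    """Smallest number of bits that can represent 0..max_value in unsigned binary."""
    return max(1, max_value.bit_length())


print(bits_needed(2000))  # 11
print(2**11 - 1)          # 2047, the largest value 11 bits can hold
```

Choosing 16 bits rather than 11 simply leaves headroom and matches a common register width.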
Figure 5.10 Block diagram of the laser-based distance measurement system: B (from button),
L (to laser), S (from sensor).
Step 1-Create a high-level state machine.
We can describe the overall control of the system using a high-level state machine. To
facilitate the creation of the state machine, we enumerate the sequence of events underlying
the measurement system:
The system powers on. Initially, the system's laser is off and the system outputs a distance
of 0 meters.
The system should then wait for the user to initiate measurement by pressing a button, B.
After the button is pressed, the system should turn the laser on. We'll choose to leave the
laser on for one clock cycle.
After the laser is pulsed, the system should wait for the sensor to detect the laser's
reflection. Meanwhile, the system should count how much time passes from the time the laser
was pulsed until the reflection is sensed.
After the reflection is detected, the system should use the amount of time passed since the
laser was pulsed to compute the distance to the object of interest. The system should then
return to waiting for the user to press the button so that a new measurement can be taken.
The above sequence guides our construction of a high-level state machine. We begin with an
initial state, which we call S0. S0's task is to ensure that when our system powers on, it
does not output an incorrect distance, and it does not turn the laser on (possibly injuring
the unsuspecting user). Specifying this behavior as a high-level state machine is
straightforward and seen in Figure 5.11. Notice that the high-level state machine differs
from an FSM in that the state's actions use a data type that is larger than one bit (namely,
D is 16 bits). However, the high-level state machine itself follows the convention that every
transition is implicitly ANDed with a rising clock edge, so the state machine only transitions
during clock edges (just like for an FSM).
Figure 5.11 Partial high-level state machine for measurement system: initialization.
Inputs: B, S (1 bit each). Outputs: L (bit), D (16 bits). S0: L = 0 (laser off), D = 0
(distance = 0).
Note that even though the assignments L = 0 and D = 0 look the same, the assignment L = 0
assigns a 0 bit to the one-bit output L, whereas the assignment D = 0 assigns the 16-bit
binary number 0 (which is actually 0000000000000000) to the 16-bit output D. Some other
notations distinguish bit assignments from data assignments using different notations, such
as enclosing a bit in single quotes. For example, the bit assignment L = 0 could be written
instead as L = '0'.
After initialization, the measurement system waits for the user to press the button B, which
initiates the measurement process. When the user presses the button, B will equal 1, and the
measurement system should proceed to activate the laser. To perform the waiting, we add a
state after S0, which we call S1, shown in Figure 5.12. The shown transitions cause the state
machine to remain in state S1 while B = 0 (meaning B' is true).
When B=1, the laser should stay on for one cycle. In other words, when B=1, the state machine
should transition to a state that turns the laser on, followed by a state that turns the laser
off. We'll call the laser-on state S2 and the laser-off state S3. Figure 5.13 shows how S2 and
S3 are connected in the high-level state machine.
Figure 5.12 Partial high-level state machine for measurement system: waiting for a button
press. Inputs: B, S (1 bit each). Outputs: L (bit), D (16 bits). S0: L = 0, D = 0. S1: stays
in S1 while B' (button not pressed); leaves S1 on B (button pressed).
Figure 5.13 Partial high-level state machine for measurement system: pulsing the laser for one
cycle. S0: L = 0, D = 0. S1: stays in S1 while B'; goes to S2 on B. S2: L = 1 (laser on).
S3: L = 0 (laser off).
In state S3, the state machine should wait until the sensor detects the laser's reflection
(S=1). The state machine remains in S3 while S=0. As mentioned in the earlier sequence of
events, the state machine should meanwhile count the duration between the laser being pulsed
and the laser's reflection being sensed. From the discussion of timers in Chapter 4, we know
that with a given clock period, we can measure time by counting the number of clock cycles and
multiplying that number by the clock period (time = cycles * (1/clock frequency)). Thus,
we use a local register, which we'll call Dctr, to count clock cycles. The state machine
increments Dctr as long as the state machine is waiting for the laser's reflection. (For
simplicity, we ignore the possibility that no reflection is ever detected.) We must also
initialize Dctr to 0, which we choose to do in state S1. With these modifications, our
high-level state machine is seen in Figure 5.14.
Figure 5.14 Partial high-level state machine for measurement system: waiting for the laser
reflection and counting clock cycles. Inputs: B, S (1 bit each). Outputs: L (bit), D (16 bits).
Local registers: Dctr (16 bits). S0: L = 0, D = 0. S1: Dctr = 0 (reset cycle count); stays in
S1 while B'; goes to S2 on B. S2: L = 1. S3: L = 0, Dctr = Dctr + 1 (count cycles); stays in
S3 while S' (no reflection).
Once the reflection is detected (S=1), our high-level state machine should compute the distance
D that is being measured. From Figure 5.9, we know that 2*D = T sec * 3x10^8 m/sec. We also
know that the time T in seconds is Dctr * (1/clock frequency). To simplify the system's design,
let's assume the clock frequency is 3x10^8 Hz, or 300 MHz. Since light travels 3x10^8 meters per
second, each clock cycle would thus correspond to one meter. Thus, with a 300 MHz clock, Dctr
counts the number of meters that the laser beam traveled from the measurer to the object and
back to the measurer. To count just the distance from the measurer to the object, we divide
Dctr by 2 (algebraic simplification of the equations in this paragraph verifies that
D = Dctr/2). We'll perform this calculation in a state we will call S4. Our final high-level
state machine is shown in Figure 5.15.
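Figure 5.15 High-level state machine for measurement system: calculating the value of D.
Inputs: B, S (1 bit each). Outputs: L (bit), D (16 bits). Local registers: Dctr (16 bits).
S0: L = 0, D = 0. S1: Dctr = 0; stays in S1 while B'; goes to S2 on B. S2: L = 1. S3: L = 0,
Dctr = Dctr + 1; stays in S3 while S'; goes to S4 on S. S4: D = Dctr/2 (calculate D); returns
to S1.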
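The algebraic simplification behind D = Dctr/2 can be written out step by step. The derivation
below uses only the quantities given in the text; the symbol f for the clock frequency is
introduced here for brevity.

```latex
\begin{align*}
2D &= T \cdot 3\times10^{8}\ \text{m/s}, \qquad T = \frac{\mathit{Dctr}}{f} \\
2D &= \frac{\mathit{Dctr}}{3\times10^{8}\ \text{Hz}} \cdot 3\times10^{8}\ \text{m/s}
    = \mathit{Dctr}\ \text{m} \qquad (f = 3\times10^{8}\ \text{Hz}) \\
D  &= \frac{\mathit{Dctr}}{2}\ \text{m}
\end{align*}
```

The clock frequency was chosen to cancel the speed of light, which is why no multiplier is
needed and a divide by 2 suffices.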
We can summarize the behavior of the high-level state machine in Figure 5.15 as follows:
S0 is the initial state. In state S0, the state machine initializes the laser to off by
setting L=0 and sets the output D=0 too. The machine then transitions to S1.
S1 clears Dctr to 0 and then waits until the button is pressed. When the button is pressed,
the machine transitions to state S2.
S2 turns on the laser. The machine then transitions to S3.
S3 turns off the laser and increments Dctr every clock cycle (with a 300 MHz clock, every
cycle corresponds to one meter). The machine stays in S3, incrementing Dctr during each
clock cycle, until the reflection is sensed, at which time the machine transitions to state S4.
S4 sets the output D to the counted number of cycles divided by two, which corresponds to
the measured distance in meters. The machine then returns to state S1, which waits for the
button to be pressed again.
A real laser-based distance measurer might use a faster clock frequency in order to measure
distance with a greater precision than just 1 meter.
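The state-by-state summary above can be exercised with a small software model. The following
is a minimal sketch under stated assumptions: one loop iteration stands for one clock cycle,
register updates are applied within the cycle rather than exactly at the closing clock edge,
and the function name and input format are hypothetical.

```python
def simulate_measurer(inputs):
    """Simulate the high-level state machine summarized above.

    inputs: list of (B, S) pairs, one pair per clock cycle.
    Returns a list of (state, L, D) tuples, one per cycle.
    """
    state = "S0"
    D = 0      # 16-bit data output
    Dctr = 0   # 16-bit local register counting clock cycles
    trace = []
    for B, S in inputs:
        L = 0  # bit output defaults to 0 unless the state sets it
        if state == "S0":
            D = 0                          # distance = 0, laser off
            nxt = "S1"
        elif state == "S1":
            Dctr = 0                       # reset cycle count
            nxt = "S2" if B else "S1"      # wait for button press
        elif state == "S2":
            L = 1                          # laser on for one cycle
            nxt = "S3"
        elif state == "S3":
            Dctr = (Dctr + 1) & 0xFFFF     # count cycles, 16 bits wide
            nxt = "S4" if S else "S3"      # wait for reflection
        else:  # S4
            D = Dctr // 2                  # calculate D
            nxt = "S1"
        trace.append((state, L, D))
        state = nxt
    return trace


if __name__ == "__main__":
    # Button pressed in the third cycle, reflection sensed four cycles into S3.
    stimulus = [(0, 0), (0, 0), (1, 0), (0, 0),
                (0, 0), (0, 0), (0, 0), (0, 1), (0, 0)]
    for row in simulate_measurer(stimulus):
        print(row)
```

A software loop like this only mirrors the sequencing; the later steps of the method turn the
same behavior into a datapath and a controller.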
The high-level state machine described above is just one type of FSM variation. A different
state machine variation that was previously quite popular was called Algorithmic
State Machines, or ASMs. ASMs are similar to flowcharts, except that ASMs include a
notion of a clock that enables transitions from one state to another (a traditional flowchart
does not have an explicit clock concept). ASMs, like flowcharts, contain more "structure"
than a state machine. A state machine can transition from any state to any other state,
whereas an ASM restricts transitions in a way that causes the computation to look more
like an algorithm, an ordered sequence of instructions. An ASM uses several types of
boxes, including state boxes, condition boxes, and output boxes. ASMs typically also
allowed local data storage and data operations.
The advent of hardware description languages (see Chapter 9) seems to have largely
replaced the use of ASMs, as hardware description languages contain the constructs
supporting algorithmic structure, and much more. Thus, we do not describe ASMs further.
Step 2-Creating a Datapath
Given a high-level state machine, we want to create a datapath that can implement all the
data storage and computations on non-Boolean data types present in the high-level state
machine. Doing so will enable us to then replace the state machine by an FSM
that merely controls the datapath. We can decompose the create-a-datapath step into
several substeps:
Step 2: Create a datapath
(a) Make all data inputs and outputs to be datapath inputs and outputs.
(b) Implement the data storage by adding a register component into the datapath for
every declared register in the high-level state machine. Furthermore, we typically
want to add a register component for every data output.
(c) Methodically examine each state and each transition, adding and connecting new
datapath components to implement new data computations. We add multiplexors
in front of component inputs as they become necessary in order to share a
component among multiple signals that use the same component in different states.
Sometimes we find that a component already exists (e.g., a register) but that we
need to add a new control input to that component (e.g., a clear input on a register
to set the register to 0).
A common term used to describe the adding of a component into a design is
instantiation. Thus, we say that we "instantiate a new register" rather than we "add a new
register." Using the term "instantiate" rather than "add" helps avoid possible confusion
with the use of the term "add" to mean arithmetic addition (e.g., saying "we add two
registers" could otherwise be confusing). When we instantiate a new component, we should
give that component a name that is unique from any other datapath component name. So
if we instantiate a register, we might call it "Register1." If we instantiate another register,
we might call it "Register2." Actually, we should give meaningful names whenever
possible. So we might call one register "TemperatureReg," and another register
"HumidityReg."
When we instantiate a new component, we may create additional datapath inputs
corresponding to the control inputs of the component. For example, instantiating a register
will create new datapath inputs corresponding to the register's load and clear control
inputs. We should give unique names to each new datapath control input, ideally
describing which component the input controls and the control operation performed. For
example, if we instantiate a register named Register1, we might then create two new
datapath inputs named Register1_load and Register1_clear. Likewise, we may need to utilize
control outputs of a component, like the output of a comparator, in which case we should
give those outputs unique names too.
EXAMPLE 5.3 Laser-based distance measurer-Creating a datapath
We now continue Example 5.2 by proceeding to the second step of the RTL design method.
Step 2-Create a datapath
We can follow the substeps of this step to create the datapath shown in Figure 5.17:
(a) Output D is a data output (16 bits), so we make D an output of the datapath, as shown in
Figure 5.16(i).
(b) We need a register to implement the 16-bit local register Dctr. Noting that the operations
on Dctr are clear (in state S1) and increment (in state S3), we can implement that register
by instantiating a 16-bit up-counter, as shown in Figure 5.16(ii). Furthermore, as we want
to control when the output D changes (notice that we only change D in state S4), we
instantiate a 16-bit register Dreg at the output D, as shown in Figure 5.16(iii). We extend the
Dctr counter and Dreg register control signals to be inputs to the datapath, with each
signal having a unique name, as in Figure 5.16(iv).
Figure 5.16 Partial datapath for the laser-based distance measurer: (i) the 16-bit data output
D, (ii) Dctr, a 16-bit up-counter, (iii) Dreg, a 16-bit register at output D, (iv) the datapath
control inputs Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt.
(c) Noting that S4 writes D with Dctr divided by 2, we insert a right shifter between Dctr
and Dreg to implement the divide by 2, as shown in Figure 5.17.
Figure 5.17 The datapath for the laser-based distance measurement system. Control inputs:
Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt. Data output: D.
The resulting datapath in Figure 5.17 is a very simple datapath, but a datapath nonetheless.
The previous example did not require any multiplexors, so we'll illustrate separately why
sometimes multiplexors must be instantiated. Consider the sample high-level state machine
portion shown in Figure 5.18(a). Figure 5.18(b) shows the datapath after implementing the
actions of state T0. Those actions require an adder, with the E and F registers connected to
the A and B inputs of that adder. Figure 5.18(c) shows that datapath after implementing the
actions of state T1. That state also requires an adder, but because one already exists in the
datapath, we need not instantiate another adder. However, the R and G registers must
connect to the A and B inputs of that adder, but those inputs of the adder already have
connections from E and F. We therefore need to instantiate multiplexors, as shown in Figure
5.18(d). Notice that we create unique names for each mux's control input.
Figure 5.18 Instantiating datapath muxes: (a) sample high-level state machine portion, with
local registers E, F, G, R (16 bits), (b) datapath after implementing T0's actions,
(c) datapath after implementing T1's actions, resulting in two sources for each adder input,
(d) datapath after instantiating muxes to handle the multiple sources, with mux select inputs
add_A_s0 and add_B_s0.
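The effect of placing muxes in front of a shared adder can be sketched in software. This is an
illustration only: the select names add_A_s0 and add_B_s0 come from the figure, but which
select value chooses which register is an assumption made here, as is the 16-bit wraparound.

```python
def mux2(sel, in0, in1):
    """2-to-1 multiplexor: passes in0 when sel is 0, in1 when sel is 1."""
    return in1 if sel else in0


def shared_adder(E, F, G, R, add_A_s0, add_B_s0):
    """One adder shared by two states via muxes on its A and B inputs.

    Assumed select encoding: 0 picks the sources used in state T0 (E and F),
    1 picks the sources used in state T1 (R and G).
    """
    a = mux2(add_A_s0, E, R)
    b = mux2(add_B_s0, F, G)
    return (a + b) & 0xFFFF  # 16-bit result, as the registers are 16 bits wide


print(shared_adder(E=1, F=2, G=3, R=4, add_A_s0=0, add_B_s0=0))  # E + F
print(shared_adder(E=1, F=2, G=3, R=4, add_A_s0=1, add_B_s0=1))  # R + G
```

The controller would set the two select inputs differently in T0 and T1, so a single adder
serves both states.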
Step 3-Connecting the Datapath to a Controller
Step 3 of the RTL design method is actually quite straightforward. We simply create a
controller block having the system's Boolean inputs and outputs, and we connect the
controller block with the datapath control inputs and outputs.
EXAMPLE 5.4 Laser-based distance measurer-Connecting the datapath to a controller
Continuing the previous example, we proceed to step 3 of the RTL design method:
Step 3-Connect the datapath to a controller.
We connect the datapath to a controller as shown in Figure 5.19. We connect the control inputs
and outputs (B, L, and S) to the controller, and the data output (D) to the datapath. We also
connect the controller to the datapath control inputs (Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt).
Normally we don't draw the clock generator block, but we've explicitly shown the clock
generator in the figure to make clear that the generator must be exactly 300 MHz.
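Figure 5.19 Controller/datapath for the laser-based distance measurer. The controller takes B
(from button) and S (from sensor), outputs L (to laser), and drives the datapath control
inputs Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt; the datapath outputs the 16-bit D (to
display). A 300 MHz clock drives both blocks.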
Step 4-Deriving the Controller's FSM
If we created our datapath correctly, deriving an FSM for the controller is straightforward.
The FSM will have the same states and transitions as the high-level state machine.
We merely define the FSM's inputs and outputs (all will now be single bits), and replace
any data computations in the actions and conditions by the appropriate datapath control
signal values. Remember, we created the datapath specifically to carry out those
computations, and therefore we should only need to appropriately configure the datapath
control signals to implement each particular computation at the right time.
EXAMPLE 5.5 Laser-based distance measurer-Deriving the controller's FSM
We continue the previous example by going to step 4 of the RTL design method.
Step 4-Derive the controller's FSM.
The last step is to design the controller's internals. We can describe the controller's
behavior by refining our high-level state machine from Figure 5.15 into an FSM, replacing the
"high-level" actions and conditions, like Dctr = 0, by actual controller input and output
signal assignments and conditions, like Dctr_clr = 1, as shown in Figure 5.20. Notice that the
FSM does not directly indicate the computations that are happening in the datapath. For
example, S4 loads Dreg with Dctr/2, but the FSM itself only shows Dreg's load signal being
activated. Thus, the overall system behavior can be determined from the FSM by looking also at
the datapath.
Figure 5.20 FSM description of the controller for the laser-based distance measurer. The
desired action in each state is shown in italics in the bottom row; the corresponding bit
signal assignment that achieves that action is shown in bold.
Inputs: B, S. Outputs: L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt.
S0: L = 0, Dreg_clr = 1, Dreg_ld = 0, Dctr_clr = 0, Dctr_cnt = 0 (laser off; clear Dreg)
S1: L = 0, Dreg_clr = 0, Dreg_ld = 0, Dctr_clr = 1, Dctr_cnt = 0 (clear count)
S2: L = 1, Dreg_clr = 0, Dreg_ld = 0, Dctr_clr = 0, Dctr_cnt = 0 (laser on)
S3: L = 0, Dreg_clr = 0, Dreg_ld = 0, Dctr_clr = 0, Dctr_cnt = 1 (laser off; count up)
S4: L = 0, Dreg_clr = 0, Dreg_ld = 1, Dctr_clr = 0, Dctr_cnt = 0 (load Dreg with Dctr/2;
stop counting)
HOW DOES IT WORK?-AUTOMOTIVE ADAPTIVE CRUISE CONTROL
The early 2000s saw the advent of automobile cruise control systems that not only maintained
a particular speed, but also maintained a particular distance from the car in front, thus
slowing the automobile down when necessary. Such "adaptive" cruise control thus adapts to
changing highway traffic. Adaptive cruise controllers must measure the distance to the car in
front. One way to measure that distance uses a laser-based distance measurer, with the laser
and sensor placed in the front grill of the car, connected to a circuit and/or microprocessor
that computes the distance. The distance is then input to the cruise control system, which
determines when to speed up or slow down the automobile.
Recall from Chapter 3 that we typically follow the convention that output signals not
explicitly assigned in a state are implicitly assigned 0. Following that convention, the FSM
would look as in Figure 5.21. We may still choose to explicitly show the assignment of 0
(e.g., L = 0 in state S3) when that assignment is a key action of a state. The key actions of
each state were bolded in Figure 5.20.
Figure 5.21 FSM description of the controller for the distance measurer, using the convention
that FSM outputs not explicitly assigned a value in a state are implicitly assigned 0.
Inputs: B, S. Outputs: L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt.
S0: L = 0, Dreg_clr = 1 (laser off; clear Dreg)
S1: Dctr_clr = 1 (clear count)
S2: L = 1 (laser on)
S3: L = 0, Dctr_cnt = 1 (laser off; count up)
S4: Dreg_ld = 1, Dctr_cnt = 0 (load Dreg with Dctr/2; stop counting)
We would complete the design by implementing this FSM, using a 3-bit state register and
combinational logic to describe the next state and output logic, as was described in Chapter 3.
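The split between controller and datapath can be mimicked in software to check that the
control signals alone reproduce the intended behavior. The following is a minimal sketch under
stated assumptions: registers update once per loop iteration (standing for a rising clock
edge), clear takes priority over load or count (never exercised here, since the FSM never
asserts both), and the function names are hypothetical.

```python
# Controller outputs per state; signals not listed for a state are implicitly 0.
CONTROL = {
    "S0": {"Dreg_clr": 1},
    "S1": {"Dctr_clr": 1},
    "S2": {"L": 1},
    "S3": {"Dctr_cnt": 1},
    "S4": {"Dreg_ld": 1},
}
SIGNALS = ("L", "Dreg_clr", "Dreg_ld", "Dctr_clr", "Dctr_cnt")


def next_state(state, B, S):
    """Next-state logic of the controller FSM."""
    if state == "S0":
        return "S1"
    if state == "S1":
        return "S2" if B else "S1"
    if state == "S2":
        return "S3"
    if state == "S3":
        return "S4" if S else "S3"
    return "S1"  # from S4


def run(inputs):
    """Run controller plus datapath; inputs is a list of (B, S) per cycle.

    Returns (log, final_Dreg), where log holds (state, L, Dreg) per cycle,
    with Dreg sampled before that cycle's clock edge.
    """
    state, Dctr, Dreg = "S0", 0, 0
    log = []
    for B, S in inputs:
        c = {name: CONTROL[state].get(name, 0) for name in SIGNALS}
        log.append((state, c["L"], Dreg))
        shifted = Dctr >> 1  # right shifter between Dctr and Dreg
        # Register updates at the clock edge, computed from current values.
        if c["Dreg_clr"]:
            new_Dreg = 0
        elif c["Dreg_ld"]:
            new_Dreg = shifted
        else:
            new_Dreg = Dreg
        if c["Dctr_clr"]:
            new_Dctr = 0
        elif c["Dctr_cnt"]:
            new_Dctr = (Dctr + 1) & 0xFFFF
        else:
            new_Dctr = Dctr
        state, Dctr, Dreg = next_state(state, B, S), new_Dctr, new_Dreg
    return log, Dreg


if __name__ == "__main__":
    stimulus = [(0, 0), (0, 0), (1, 0), (0, 0),
                (0, 0), (0, 0), (0, 0), (0, 1), (0, 0)]
    log, final_D = run(stimulus)
    print(log)
    print("D =", final_D)
```

Note that the controller dictionary contains no arithmetic at all; the counting and the
divide by 2 live entirely in the datapath portion of the loop.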
5.3 RTL DESIGN EXAMPLES AND ISSUES
RTL design involves a certain amount of creativity and insight. Thus, a good way to begin
to learn RTL design is perhaps through seeing several examples. We now provide additional
examples of the RTL design method, through which we also explain some detailed issues.
Simple Bus Interface Design Example
EXAMPLE 5.6 Simple bus interface
Processors typically need to transfer data to and from other processors. They typically
communicate such data over a bus, to reduce wire congestion problems that might otherwise
occur (see Section 4.10). Suppose 16 different processors each has a 32-bit output connected
to a single 32-bit bus named D. Suppose another processor, a master processor, may want to
read the output of any of those 16 processors. (Let's call those 16 processors peripherals,
which is a common term for a processor that is auxiliary to a master processor.) The master
processor outputs a 4-bit address, A, that all the 16 peripherals can read, with each
peripheral having its own unique address (0000, or 0001, or 0010, etc.). Because the master
processor must always set the address lines to a value, but might not always want to read,
the master processor has another output, rd, that the master processor sets to 1 when reading,
and 0 when not reading. So if the master processor wants to read the value in peripheral
number five, the master processor would set the address lines A to 0101, then set rd to 1.
The master processor would then read the data lines D (perhaps storing the read data into a
local register), and then return rd to 0. Additionally, the value on D should not change while
the master processor is reading.
A block diagram of the system is shown in Figure 5.22. Such an arrangement is very similar
to the arrangement in a desktop computer, where a master processor can read peripheral
processor registers; peripherals might include a disk drive, a CD-ROM drive, a keyboard, a
modem, etc.
Figure 5.22 Bus interface example.
We have just described what is known as a bus protocol. A bus protocol defines a sequence
of actions over a set of data, address, and control lines, to carry out a data transfer over
those lines from one processor to another.
A bus interface implements a bus protocol for a processor. Let's implement the bus interface
for one of the peripheral processors. Figure 5.23 provides a block diagram for a peripheral
divided into a main part and a bus interface part. The main part's output Q is an input to the
bus interface. Let's assume the peripheral's own address is another input, called Faddr, to
the bus interface. Faddr might come from a DIP switch, or perhaps another register. The bus
interface also has inputs and outputs that connect to the bus signals rd, D, and A.
Figure 5.23 Bus interface block diagram.
Step 1 of our RTL design method is to create a high-level state machine. Based on the
bus protocol we defined, a peripheral's bus interface part sends data only if the address on
input A matches the address on input Faddr AND the processor requests a read by setting rd
to 1. While the bus interface waits for an instruction from the master processor to send data,
the bus interface should not interfere with what another processor may be writing to the
shared data lines, D. Thus, while waiting for a matching address and rd=1, the bus interface
should drive D with no value (known as high impedance, represented as "Z").
When the bus interface detects a matching address and rd=1, the bus interface should output
data from the input Q (from the main part) to the output D. However, we must also ensure that
D does not change while the master processor is reading from the bus interface. We can keep
the value on D stable by storing Q into a local register Q1. As long as the bus interface is
not sending data, the bus interface updates Q1 with the current value of Q. When the bus
interface is sending data, the bus interface does not update Q1 and outputs Q1 on D,
causing D to not change during a send.
We can see that the bus interface's implementation of the bus protocol can be described by a
high-level state machine using two states, shown in Figure 5.24: a state in which the bus
interface waits to be able to send data (WaitMyAddress) and a state in which the bus
interface sends data (SendData).
Figure 5.24 High-level state machine of the sending half of a simple bus interface.
Inputs: rd (bit); Q (32 bits); A, Faddr (4 bits). Outputs: D (32 bits). Local register:
Q1 (32 bits). WaitMyAddress: D = "Z", Q1 = Q. SendData: D = Q1.
Figure 5.25 provides a sample timing diagram of the state machine's behavior (W for state
WaitMyAddress, SD for SendData). As long as the system is in the W state, the system
outputs Z on D. When rd=1 and A=Faddr, the system outputs the contents of Q1 beginning at the
next clock cycle's rising edge. The system continues to output Q1 as long as rd=1. When rd
returns to 0, the system returns to the WaitMyAddress state at the next rising clock edge and
hence outputs Z again.
Figure 5.25 Bus interface timing diagram. State sequence shown: W, W, SD, W, W, SD, SD, W;
D carries Q1 during SD cycles and Z otherwise.
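The two-state behavior described above can be modeled with a short simulation. This is a
minimal sketch, assuming one loop iteration per clock cycle and a hypothetical input format
of (rd, A, Q) per cycle; the string "Z" stands in for high impedance.

```python
def simulate_bus_interface(cycles, Faddr):
    """Simulate the sending half of the bus interface.

    cycles: list of (rd, A, Q) tuples, one per clock cycle.
    Faddr: this peripheral's own 4-bit address.
    Returns a list of (state, D), using W for WaitMyAddress and SD for SendData.
    """
    state = "W"
    Q1 = 0  # local 32-bit register holding a stable copy of Q
    out = []
    for rd, A, Q in cycles:
        if state == "W":
            out.append(("W", "Z"))      # drive nothing onto the shared bus
            nxt = "SD" if (rd == 1 and A == Faddr) else "W"
            Q1 = Q                      # keep tracking Q while not sending
        else:
            out.append(("SD", Q1))      # drive the stored value; Q1 is not updated
            nxt = "SD" if rd == 1 else "W"
        state = nxt
    return out


if __name__ == "__main__":
    # Peripheral five is addressed for three cycles while Q keeps changing.
    stimulus = [(0, 0, 10), (1, 5, 11), (1, 5, 12), (1, 5, 13), (0, 5, 14), (1, 3, 15)]
    for row in simulate_bus_interface(stimulus, Faddr=5):
        print(row)
```

In the sample stimulus, Q changes during the send, yet D keeps the value captured in Q1, which
is the stability property the protocol requires.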
Step 2 is to create a datapath, as shown on the right in Figure 5.26. The datapath contains an
equality comparator to compare A and Faddr, a 32-bit register Q1, and a 32-bit-wide
three-state driver to enable driving of D by nothing or by Q1. A, Faddr, and Q are the
datapath's data inputs, and D is the only data output.
Figure 5.26 Datapath (right) and controller FSM description (left) for the simple bus
interface. Controller inputs: rd, A_eq_Faddr (bit). Controller outputs: Q1_ld, D_en (bit).
WaitMyAddress: D_en = 0, Q1_ld = 1. SendData: D_en = 1, Q1_ld = 0.
Step 3 is to connect the datapath to a controller, as shown in Figure 5.26. The controller has
one external control input, rd, and also gets a control input from the datapath, A_eq_Faddr,
indicating whether A equals Faddr. The controller has two control outputs to the datapath,
with Q1_ld causing Q1 to be loaded with Q, and D_en controlling the three-state driver.
Step 4 is to derive the controller's FSM. We simply replace the data operations in the
high-level state machine of Figure 5.24 by the appropriate control signals, as shown on the
left side of Figure 5.26. We replace A = Faddr by the signal A_eq_Faddr, the actions of
D = "Z" and of D = Q1 by D_en = 0 and D_en = 1, and the action of Q1 = Q by Q1_ld = 1. We
would then implement the FSM using a state register (in this case only 1 bit) and
combinational logic.
You may have heard of several popular buses, like the PCI (Peripheral Component Interconnect)
bus in a personal computer. Those are the buses that a PC "card" plugs into in a PC, like the
card shown in Figure 5.27. You can see on the card the metal pads of the bus; each pad
corresponds to one wire of the bus. The bus protocol for PCI is far more complex than the
protocol in the above example. Hundreds of other "standard" bus protocols exist. Designers
not needing to interface to other chips often define their own bus protocol for communication.
Figure 5.27 PCI card plugged into a PC's PCI slot.
ALL ='S ARE NOT EQUAL.
Figure 5.24 showed two distinct uses of the "=" symbol. In a state's actions, "=" meant
"assign the value of the right side to the left side," e.g., D = Q1. On a transition, "="
meant "the left and right sides are the same," e.g., A = Faddr. Be careful not to confuse
those two meanings of the "=" symbol. Some languages use different symbols to distinguish the
two meanings. For example, the C language uses "=" for "assign" and "==" for "the same."
VHDL uses ":=" (or "<=") for "assign", and "=" for "the same."
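The same distinction shows up in most programming languages. As a small illustration, here is
how the two meanings look in Python, which follows the C convention; the variable values are
arbitrary and chosen only for the example.

```python
Q1 = 0x1234
D = Q1                    # "=" assigns: D now holds the value of Q1

A = 0b0101
Faddr = 0b0101
matches = (A == Faddr)    # "==" compares: True when both sides are the same

print(D == Q1, matches)
```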
Video Compression-Sum-of-Absolute Differences (SAD) Design Example
EXAMPLE 5.7 Video compression-sum of absolute differences
After a 2004 natural disaster in Indonesia, a TV news reporter reported from the scene by
"camera phone." The video was smooth as long as the scene wasn't changing significantly.
When the scene changed (like panning across the landscape), the video became very jerky,
because the camera phone had to transmit complete pictures rather than just differences,
resulting in fewer frames transmitted over the limited bandwidth of the camera phone.
Digitized video is becoming increasingly commonplace, like in the case of the increasingly popular DVD (see Section 6.7 for further information on DVDs). A straightforward digitized video consists of a sequence of digitized pictures, where each picture is known as a frame. However, such digitized video results in huge data files. Each pixel of a frame is stored as several bytes, and let's say a frame contains about a million pixels. Let's assume then that we require about 1 Mbyte per frame, and we play approximately 30 frames per second (a normal rate for a TV), so that's 1 Mbyte/frame * 30 frames/sec = 30 Mbytes/sec. One minute of video would require 60 sec * 30 Mbytes/sec = 1.8 Gbytes, and 60 minutes would require 108 Gbytes. A 2-hour movie would require over 200 Gbytes. That's a lot of data, more than can be downloaded quickly over the Internet, or stored on a DVD, which can only hold between 5 Gbytes and 15 Gbytes. In order to make practical use of digitized video with web pages, digital camcorders, cellular telephones, or even with DVDs, we need to compress those files into much smaller files. A key technique in compressing video is to recognize that successive frames often have much similarity, so instead of sending a sequence of digitized pictures, we can send one digitized picture frame (a "base" frame), followed by data describing just the difference between the base frame and the next frame. We can send just the difference data for numerous frames, before sending another base frame. Such a method results in some loss of quality, but as long as we send base frames frequently enough, the quality may be acceptable.
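The storage arithmetic above can be checked with a short sketch. The per-frame size and frame rate come from the text, and the 0.01 Mbyte difference size comes from Figure 5.28; the function names and the use of 1 Gbyte = 1000 Mbytes are assumptions made only for this illustration.

```python
# Back-of-the-envelope storage estimate for digitized video.
# 1 Mbyte per frame and 30 frames/sec are the text's round numbers.
MBYTES_PER_FRAME = 1
FRAMES_PER_SEC = 30


def uncompressed_gbytes(seconds):
    """Storage, in Gbytes, for `seconds` of video sent as full frames."""
    mbytes = MBYTES_PER_FRAME * FRAMES_PER_SEC * seconds
    return mbytes / 1000  # assumes 1 Gbyte = 1000 Mbytes


def compressed_mbytes(frames, diff_mbytes=0.01):
    """One base frame followed by (frames - 1) difference records."""
    return MBYTES_PER_FRAME + (frames - 1) * diff_mbytes


one_minute = uncompressed_gbytes(60)            # about 1.8 Gbytes
one_hour = uncompressed_gbytes(60 * 60)         # 108 Gbytes
two_hours = uncompressed_gbytes(2 * 60 * 60)    # 216 Gbytes, "over 200"
ten_frames = compressed_mbytes(10)              # about 1.09 Mbytes
```

The last value matches the Figure 5.28 caption: ten frames shrink from 10 Mbytes to roughly 1.09 Mbytes when nine of them are sent as differences.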
Of course, if there's a major change from one frame to the next (like for a change of scene, or lots of activity), we can't use the difference method. Video compression devices therefore need to quickly estimate the similarity between two successive digitized frames to determine whether frames can be sent using the difference method. A common way to determine the similarity of two frames is to compute what is known as the sum of absolute differences (SAD). For each pixel in frame 1, we compute the difference between that pixel and the corresponding pixel in frame 2. Each pixel is represented by a number, so difference means difference in numbers. Suppose we represent a pixel with a byte (real pixels are usually represented by at least three bytes), and we are comparing the pixel in the upper left of frames 1 and 2 in Figure 5.28. Frame 1's upper-left pixel has a value of 255. Frame 2's pixel is clearly the same, so would have a value of 255 also. Thus,
5 Register-Transfer Level (RTL) Design
[Figure 5.28 panels: (a) digitized frame 1 (1 Mbyte) and digitized frame 2 (1 Mbyte); (b) digitized frame 1 (1 Mbyte) and the difference of frame 2 from frame 1 (0.01 Mbyte).]
Figure 5.28 A key principle of video compression recognizes that successive frames have much similarity: (a) sending every frame as a distinct digitized picture, (b) instead, sending a base frame and then difference data, from which the original frames can later be reconstructed. If we could do this for 10 frames, (a) would require 1 Mbyte * 10 = 10 Mbytes, while (b) (compressed) would require only 1 Mbyte + 9 * 0.01 Mbyte = 1.09 Mbytes, an almost 10x size reduction.
the difference of these two pixels is 255 - 255 = 0. We might compare the next pixels of both frames in that row, finding the difference to be 0 again. And so on for all the pixels in that row for both frames, as well as the next several rows. However, when we compute the difference of the leftmost pixel of the middle row, where that black circle is located, we see that frame 1's pixel will be black, say with a value of 0. On the other hand, frame 2's corresponding pixel will be white, say with a value of 255. So the difference is 255 - 0 = 255. Likewise, somewhere in the middle of that row, we'll find another difference, this time with frame 1's pixel white (255) and frame 2's pixel black (0); the difference is again 255 - 0 = 255. Note that we only care about the difference, not which is bigger or smaller, so we are actually looking at the absolute value of the difference between the frame 1 and frame 2 pixels. By summing the absolute value of the differences for every pair of pixels, we get a number representing the similarity of the two frames: 0 means identical, and bigger numbers mean less similar. If the resulting sum is below some threshold (e.g., below 1,000), we might then apply the method of sending the difference data, as in Figure 5.28(b); we don't explain how to compute the difference data here, as that is beyond the scope of this example. If the sum is above the threshold, then the difference between the blocks is too great, so we might instead send the full digitized frame for frame 2. Video with similarity among frames will achieve a higher compression than video with plenty of changes.
Actually, video compression methods compute similarity not between two entire frames, but rather between corresponding 16x16 pixel blocks; yet the idea is the same.
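As a rough software illustration of the computation just described, the sketch below computes the sum of absolute differences for two blocks of one-byte pixels and applies a threshold. The function names and the flat-list representation of a block are assumptions for this example; the threshold of 1,000 is the illustrative value used in the text.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must contain the same number of pixels")
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def can_send_difference(block_a, block_b, threshold=1000):
    """True if the blocks are similar enough to send only difference data."""
    return sad(block_a, block_b) < threshold


# A 16x16 block stored as 256 one-byte pixel values, all white (255).
frame1_block = [255] * 256

# Same block with one pixel turned black: SAD is 255, below the threshold.
slightly_changed = list(frame1_block)
slightly_changed[0] = 0

# Same block with four pixels turned black: SAD is 4 * 255 = 1020.
heavily_changed = list(frame1_block)
for k in range(4):
    heavily_changed[k] = 0
```

Identical blocks give a sum of 0, and each differing pixel adds the magnitude of its difference regardless of which frame holds the larger value.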
Computing the sum of absolute differences is slow in software, so that task may be done using a custom digital circuit, while other tasks may remain in software. For example, you might find a SAD circuit inside a digital camcorder, or a cellular telephone that supports video. Let's design such a circuit. A block diagram is shown in Figure 5.29(a). The circuit's inputs will be a 256-byte memory A, holding the contents of a 16x16 block of pixels of frame 1, and another 256-byte memory B, holding the corresponding block of frame 2. Memories will be discussed in Section 5.6; for now, consider the memory as a register file, and ignore details of the interface to the memories. Another circuit input go tells the circuit when to begin computing. An output sad will present the result after some number of clock cycles.
[Figure 5.29(a): block diagram of the SAD component, with inputs A, B, and go, and output sad.]
Inputs: A, B (256-byte memory); go (bit)
Outputs: sad (32 bits)
Local registers: sum, sad_reg (32 bits); i (9 bits)
[Figure 5.29(b) state diagram labels: !go; i<256; sum=sum+abs(A[i]-B[i]); i=i+1]
Figure 5.29 Sum-of-absolute-differences (SAD) component: (a) block diagram, and (b) high-level state machine.
Step 1 of our RTL design method is to create a high-level state machine. We can describe the behavior of the SAD component using the high-level state machine shown in Figure 5.29(b). We declare the inputs, outputs, and local registers sum, i, and sad_reg. The sum register will hold the running sum of differences; we make this register 32 bits wide. The i register will be used to index into the current pixel in the block memories; i will range from 0 to 256, and therefore we'll make it 9 bits wide. sad_reg will be connected to the sad output (it's good practice to register your data outputs), so will be 32 bits wide, like the sad output. The state machine initially waits for the input go to become 1. The state machine then initializes registers sum and i to 0. The state machine then enters a loop: if i is less than 256, the state machine computes the absolute value of the difference of the two blocks' pixels indexed by i (the notation A[i] refers to the data in word i of memory A), updates the running sum, increments i, and repeats. Otherwise, the state machine loads sad_reg with the sum, which now represents the final sum, and returns to the first state to wait for the go signal to become 1 again.
One point to reemphasize is that the order of actions in a state does not impact the results, because all those actions occur simultaneously. Thus, for the state inside the loop, arranging the actions as "sum = sum + abs(A[i]-B[i]); i = i + 1" or as "i = i + 1; sum = sum + abs(A[i]-B[i])" does not impact the results. Either arrangement uses the old value of i.
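One way to see why the order of actions within a state does not matter is to simulate the high-level state machine with the rule "compute all next values from the old register values, then commit them together." The sketch below is only a behavioral model under stated assumptions: the state names S1 through S4 follow the text's later cycle count, the name S0 for the wait state is made up here, and go is simply held at 1.

```python
def run_sad_hlsm(A, B, n=256):
    """Behavioral model of the SAD high-level state machine.

    Each loop pass models one clock cycle: next register values are computed
    only from the OLD values in `regs`, then loaded all at once.
    """
    regs = {"sum": 0, "i": 0, "sad_reg": 0}
    state = "S0"          # hypothetical name for the wait-for-go state
    go = 1                # assume go is already asserted
    while True:
        nxt = dict(regs)  # registers keep their value unless assigned
        if state == "S0":
            next_state = "S1" if go else "S0"
        elif state == "S1":
            nxt["sum"] = 0
            nxt["i"] = 0
            next_state = "S2"
        elif state == "S2":
            next_state = "S3" if regs["i"] < n else "S4"
        elif state == "S3":
            # Written with i first on purpose: both right-hand sides read the
            # old i, so the order of these two lines makes no difference.
            nxt["i"] = regs["i"] + 1
            nxt["sum"] = regs["sum"] + abs(A[regs["i"]] - B[regs["i"]])
            next_state = "S2"
        else:  # S4: load the output register, then stop the simulation
            nxt["sad_reg"] = regs["sum"]
            return nxt["sad_reg"]
        regs = nxt        # the clock edge: all loads take effect together
        state = next_state


block_a = list(range(256))
block_b = [255 - x for x in range(256)]
result = run_sad_hlsm(block_a, block_b)
expected = sum(abs(a - b) for a, b in zip(block_a, block_b))
```

Because S3 reads `regs` and writes `nxt`, swapping its two assignment lines leaves `result` unchanged, which mirrors the point made in the paragraph above.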
Step 2 of our RTL design method is to create a datapath. We see from the high-level state machine that we'll need a subtractor, an absolute-value component (which we have not designed earlier, but is straightforward to design), an adder, and a comparison of i to 256. We build the datapath shown in Figure 5.30. The adder will be 32 bits wide, so the 8-bit input coming from the abs component will need to have 0s appended for its high 24 bits.
Step 3 is to connect the datapath to a controller block, as shown in Figure 5.30. Note that we've defined the interface to the A and B memories, consisting of a read line, address lines, and data lines. Also note that we haven't explicitly listed the inputs and outputs of the controller, as they can be seen at the periphery of the controller's block.
Step 4 is to convert the high-level state machine to an FSM. We show the FSM on the left side of Figure 5.30. For convenience, we've shown the original high-level actions, and we've shown their replacement by FSM actions.
Figure 5.30 SAD datapath and controller FSM.
To complete the design, we would convert the FSM to a controller implementation (a state register and combinational logic), as described in Chapter 3.
Comparing Software and Custom Circuit Implementations
In Example 5.7, we said that the output appears after some number of clock cycles. Let's determine exactly how many cycles. After go becomes 1, our state machine will spend one cycle initializing registers in S1, then will spend two cycles in each of the 256 loop iterations (states S2 and S3), and finally one more cycle to update the output register in state S4, for a total of 1 + 2*256 + 1 = 514 cycles.
If we executed SAD in software, we would likely need more than two clock cycles per loop iteration. We would need perhaps two cycles to load internal registers, then a cycle for subtract, perhaps two cycles for absolute value, and a cycle for sum, for a total of six cycles per iteration. The custom circuit we built, at two cycles per iteration, is thus about three times faster for computing SAD, assuming equal clock frequencies.
We'll see in Section 6.5 that we could actually build a SAD circuit that is much faster.
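The cycle estimates above reduce to a few lines of arithmetic. The sketch below simply restates the text's accounting; the function names are made up for this illustration, and the software cycle costs are the text's "perhaps" figures, not measurements.

```python
LOOP_ITERATIONS = 256  # one iteration per pixel of a 16x16 block


def circuit_cycles(iterations=LOOP_ITERATIONS):
    """1 cycle to initialize (S1), 2 per iteration (S2, S3), 1 to output (S4)."""
    return 1 + 2 * iterations + 1


def software_cycles_per_iteration():
    """The text's rough per-iteration estimate for a software loop."""
    load, subtract, absolute, add = 2, 1, 2, 1
    return load + subtract + absolute + add


CIRCUIT_CYCLES_PER_ITERATION = 2
total_circuit_cycles = circuit_cycles()
speedup = software_cycles_per_iteration() / CIRCUIT_CYCLES_PER_ITERATION
```

With these inputs, `total_circuit_cycles` is 514 and `speedup` is 3.0, the "about three times faster" figure, which holds only under the stated assumption of equal clock frequencies.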
DIGITAL VIDEO: IMAGINING THE FUTURE.
People seem to have an insatiable appetite for good quality video, and thus much attention is placed on developing fast and/or power-efficient encoders and decoders for digital video devices, like DVD players and recorders, digital video cameras, cell phones supporting digital video, video conferencing units, TVs, TV set-top boxes, etc. It's interesting to think toward the future: assuming video encoding/decoding becomes even more powerful and digital communication speeds increase, we might imagine video displays (with audio) on our walls at home or work that continually display what's happening at another home (perhaps our mom's house) or at a partner office on the other side of the country, like a virtual window to another place. Or we might imagine portable devices that enable us to continually see what someone else wearing a tiny camera sees, perhaps our child or spouse. Those developments could significantly change our living patterns.
RTL Design Pitfalls and Good Practice
Pitfall: Assuming a Register Is Updated in the State in Which the Register Is Written
Perhaps the most common mistake in creating a high-level state machine is assuming that a register is updated in the state in which the register is written. Such an assumption is incorrect, and can lead to unexpected behavior when the state machine reads the register in the same state, and likewise when the state machine reads the register in a transition condition leaving that state. For example, Figure 5.31(a) shows a simple high-level state machine. Examine the state machine, and then answer the following two questions:
What will be the value of Q after state A?
What will be the final state: C or D?
The value of a will not be 99; a 's
value will actually be unknown. The
reason is illustrated by the timing
diagram in Fi gure 5.31 (b). State A
confi gures the datapath to load a 99
int o R on the next clock edge. and COn-
figures the data path to load the value
of register R into register a on the next
clock edge. When the next clock edge
occurs. both those loads Occur Silllll/'
ralleoll s/y. a therefore gets whatever
val ue was in R JUSt before the next
clock edge. which is unknown.
Furthermore, the final state will
not be D. but will rather be C. The
reason is illustrated by the timi ng
diagram in Figure 5. 3 I (b). tate B
configures the datapath to load 100
int o R on the next clock yele. and
configures the controll er to load the
Local ,egislero. A, 0 (8 brts)
(a)
o
(b)
Figure 5.31 High-fe,el sml. machine
thai behavcs diffenenl than some people
may e.'(peC'L due to reads of a reruter in
the arne tate as ",Tiles to that ;'cister:
(al smle m3 hine. (b) timing di3i.un,
next tate ba ed on the transition conditi on. R is 99. and therefore the transition ndition
R<lOO is true. meaning the controll er wi ll be configured to load tate C into the tale
regi . ter. not state D. On the next lock edge. R be ames 100. and the ne\ t tate become C.
The key is to always remember that a state's actions configure the datapath and controller such that the next clock edge will load the desired values, but those values don't actually get loaded until that clock edge. Thus, any expressions in a state's actions or outgoing transitions will be using the previous values of registers, not the values being assigned in that state itself. By the same reasoning, all the actions of a state occur simultaneously on the next clock edge, and thus could be written in any order.
Assuming that the designer actually wanted Q to equal 99 and the final state to be D, then a solution is to add an extra state before reading the value of a register that we assign. Figure 5.32(a) shows a new state machine in which the assignment of Q=R has been moved to state B, after R=99 has taken effect. Furthermore, the state machine has a new state, B2, that simply allows R to be updated with the new value before we read that value in the transition conditions. The timing diagram in Figure 5.32(b) shows the behavior that the designer expected.
An alternative solution for the transition issue in this case would be to utilize comparison values that take into account that the old value is being used. So instead of comparing R to 100, the comparisons might instead compare to 99.
[Figure 5.32(a) declaration: Local registers: R, Q (8 bits)]
Figure 5.32 High-level state machine that avoids reading just-assigned registers: (a) state machine, (b) timing diagram.
Avoiding this pitfall is the reason that we included state S2 in Example 5.7.
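The pitfall and its fix can be reproduced with a small simulation. The sketch below is a hypothetical model, not the book's figures: the exact actions and transitions of Figures 5.31 and 5.32 are inferred from the prose (state A assigns R=99 and Q=R; state B loads 100 into R and branches on R<100), and `None` stands in for an unknown register value.

```python
def run(machine, start, final_states):
    """Step a high-level state machine one clock edge at a time.

    `machine` maps a state name to (actions, pick_next), where `actions` maps
    a register name to a function of the OLD register values, and `pick_next`
    chooses the next state from the OLD register values.
    """
    regs = {"R": None, "Q": None}   # None models an unknown initial value
    state = start
    while state not in final_states:
        actions, pick_next = machine[state]
        new_values = {name: expr(regs) for name, expr in actions.items()}
        next_state = pick_next(regs)
        regs.update(new_values)     # the clock edge: all loads happen at once
        state = next_state
    return regs, state


# Assumed reading of Figure 5.31(a): R is read in the state that writes it.
pitfall = {
    "A": ({"R": lambda r: 99, "Q": lambda r: r["R"]}, lambda r: "B"),
    "B": ({"R": lambda r: 100}, lambda r: "C" if r["R"] < 100 else "D"),
}

# Assumed reading of Figure 5.32(a): Q=R moved to B, and B2 added before the test.
fixed = {
    "A": ({"R": lambda r: 99}, lambda r: "B"),
    "B": ({"Q": lambda r: r["R"], "R": lambda r: 100}, lambda r: "B2"),
    "B2": ({}, lambda r: "C" if r["R"] < 100 else "D"),
}

pitfall_regs, pitfall_final = run(pitfall, "A", {"C", "D"})
fixed_regs, fixed_final = run(fixed, "A", {"C", "D"})
```

Under these assumptions the first machine ends in state C with Q still unknown, while the second ends in state D with Q equal to 99, matching the behavior described in the text.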
Pitfall: Reading Outputs
Another common mistake is to create a high-level state machine in which an external output is read in the state machine. Outputs can only be written and cannot be read. For example, Figure 5.33(a) shows an invalid high-level state machine; the read of P in state T is not allowed. If you wish to read an output, then create and use a local register. Figure 5.33(b) shows use of a local register R to avoid reading output P.
[Figure 5.33(a) declarations: Inputs: A, B (8 bits); Outputs: P (8 bits)]
[Figure 5.33(b) declarations: Inputs: A, B (8 bits); Outputs: P (8 bits); Local register: R (8 bits)]
Figure 5.33 (a) Reading an output is not allowed, (b) using a local register.
Good Design Practice: Registered Data Outputs
It's a good idea to always ensure your design has a register at every data output. Doing so prevents those outputs from displaying spurious values. For example, the state machine of Figure 5.33(b) could be implemented as a datapath in which output P is directly connected to the output of an adder, as shown in Figure 5.34(a). P will therefore output spurious values for some time after R is loaded with A, while the addition is being computed. Furthermore, if B or A changes in some other states, P will also change, but such change is likely not the intended behavior of the state machine; P should only change when we explicitly assign P in a state. Another problem is that any processor using the P output must take into account the adder when computing longest register-to-register delays to determine a circuit's critical path (see Section 5.4).
Figure 5.34 (a) P will exhibit spurious values, (b) registering P solves the problem.
Therefore, we will follow the design practice of always putting a register directly before the data output, as shown in Figure 5.34(b). Even if we don't explicitly declare the register as a local register, we always assume it is there in interpreting the high-level state machine, and we always add that register when creating the datapath. Alternatively, we can explicitly declare that register, and then assume that the output is directly connected to that register; this is the approach we took in Example 5.7, in which we declared the register sad_reg. It's good practice to not read this register; the register's only purpose is to connect to the output port.
Registering data outputs does have the potential disadvantage of delaying writes to the output port by one cycle, depending on the example.
Data-Dominated RTL Design
We can consider RTL designs as falling into one of two categories: control-dominated designs and data-dominated designs.
A control-dominated design is a design whose controller contains most of the complexity of the design. When creating such a design, a designer focuses mostly on the design of the controller, meaning design effort goes mostly into defining the state behavior of the system. Once the designer has defined that state behavior, he/she can derive the datapath straightforwardly from that state behavior. A control-dominated design typically responds to external inputs in a precise amount of time, and typically has a simple datapath.
A data-dominated design is a design whose datapath contains most of the complexity of the design. When creating such a design, a designer focuses mostly on the design of the datapath, meaning design effort goes mostly into instantiating and interconnecting datapath components. Once the designer has defined the datapath, he/she can define the controller's state behavior straightforwardly. A data-dominated design typically has a lot of parallelism in its datapath, and the datapath may be large. For a data-dominated design, designers often skip the first step of our RTL design method of Table 5.1.
The laser-based distance measurer example in the previous section was an example of control-dominated design, since the complexity of the design was really in the controller, not the datapath.
The terms "control-dominated" and "data-dominated" are descriptive, and can't be used to strictly categorize designs. Some designs will exhibit properties of both types of designs. It's like the terms "introvert" and "extrovert" for describing people: while the terms are useful, people can't be strictly categorized as either introverts or extroverts, since many people are somewhere in between, or exhibit features of both categories. The example of the simple bus interface was an example of a design that has a similar amount of control and data design. The video compression SAD circuit, at least the way we designed it, was also a mix of control and data.
RTL design is very much a creative process. Two designers may come up with very different designs for the same system, following perhaps different design methods, with those designs differing in terms of performance, size, and other metrics.
FIR Filter Design Example
As our previous examples were either control-dominated or a mix of control and data, we now provide an example of a data-dominated design.
EXAMPLE 5.8 FIR filter
A digital filter takes a stream of digital inputs and generates a stream of digital outputs with some feature of the input stream removed or modified. Figure 5.35 shows a block diagram of a popular digital filter known as an FIR filter. X and Y are N bits wide each, such as 12 bits each. As a filtering example, consider the following stream of digital temperature values on X coming from a car engine temperature sensor sampled every second: 180, 180, 181, 240, 180, 181. That 240 is probably not an accurate measurement, as a car engine's temperature cannot jump 60 degrees in one second. A digital filter would remove such "noise" from the input stream, generating perhaps an output stream on Y like: 180, 180, 181, 181, 180, 181.
Figure 5.35 General block diagram of an FIR filter.
An FIR filter (usually pronounced by saying the letters "F" "I" "R"), short for "Finite Impulse Response" filter, is a popular general digital filter design that can be used for a wide variety of filtering goals. Figure 5.35 shows a block diagram of an FIR filter. The basic idea of an FIR filter is simple: the present output is obtained by multiplying the present input value by a constant, and adding that result to the previous input value times a constant, and adding that result to the next earlier input value times a constant, and so on. In a sense, adding to previous values in this manner results in a weighted average. We describe digital filtering and FIR filters in more detail in Section 5.11. For the purpose of this example, we merely need to know that an FIR filter can be described by the following equation:
y(t) = c0 × x(t) + c1 × x(t-1) + c2 × x(t-2)
An FIR filter with three terms, as in the above equation, is known as a 3-tap FIR filter. Real FIR filters typically have many tens of taps; we use only three taps for the purpose of illustration. A filter designer using an FIR filter achieves a particular filtering goal simply by choosing the FIR filter's constants.
We wish to design a circuit to implement an FIR filter. Because the FIR filter equation is just data transformation and no control, let's skip Step 1 of the RTL design method and go straight to Step 2, designing the datapath. We'll need a register for each tap to hold x(t), x(t-1), and x(t-2). On each clock cycle, we'll want to move x(t-1) to x(t-2), to move x(t) to x(t-1), and to load x(t) with the present input. We thus start the datapath with three registers, connected as shown in Figure 5.36. Notice how the data moves to the right on each clock cycle, so that register xt0 holds the current input sample, xt1 holds the previous input sample, and xt2 holds the sample before the previous one. For the example, we'll assume data is 12 bits wide.
[Figure 5.36 labels: 3-tap FIR filter; input X feeding registers xt0, xt1, xt2, which hold x(t), x(t-1), x(t-2); output Y; all lines 12 bits wide.]
Figure 5.36 Beginning to build the datapath for the FIR filter: inserting and connecting the x(t), x(t-1), and x(t-2) registers.
Now we need another register for each tap to hold the constant value c0, c1, or c2; we'll worry later about how those registers will be loaded. We'll also need a multiplier for each tap, to multiply the tap's x value by the constant c value. The datapath with the constant registers and multipliers is shown in Figure 5.37.
Figure 5.37 Extending the datapath for the FIR filter: inserting and connecting the c0, c1, and c2 registers, along with the multipliers, for each tap. For simplicity, clock connections are not shown, and all data lines are assumed to be 12 bits wide.
The output Y is the sum of each tap's product. We can thus insert adders to compute the sum, and we can connect that sum to the output Y, as shown in Figure 5.38.
We have completed the heart of the FIR filter datapath design. We now need to provide a method for a user to load values into the constant registers c0, c1, and c2. Let's create another input C to the filter, a load line CL, and a 2-bit address Ca1 and Ca0, that the filter user can use to load a particular constant register. Ca1Ca0=00 indicates that register c0 should be loaded, 01 indicates that c1 should be loaded, and 10 indicates that c2 should be loaded. Loading of the value on input C into the appropriate register occurs on a clock edge only when CL=1. We can straightforwardly design the circuit for such loading using a decoder, as shown in Figure 5.39. Address lines Ca1 and Ca0 feed into a 2x4 decoder, thus enabling the appropriate register (note that address 11 is unused). The load input CL is connected to the decoder's enable input. Note that we've also added a register at the Y output, which is generally good design practice. Such a register ensures the output doesn't fluctuate as intermediate products and sums are computed, and reduces the likelihood of the user accidentally extending the critical path by connecting Y through a lot of combinational logic before loading Y into a register.
Figure 5.38 Computing the output Y of the FIR filter as the sum of the tap products (all data lines are assumed to be 12 bits wide).
Figure 5.39 Finalizing the FIR filter datapath with circuitry for loading the constant registers. We've also added a register on the Y output, which is good practice. The critical path, the longest register-to-register delay, is shown as a dotted line.
Our RTL design method involves two steps after designing the datapath to complete the controller. However, this particular design does not require a controller, not even a simple one! This example is indeed an extreme example of a data-dominated design.
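To make the datapath's behavior concrete, the sketch below models the 3-tap filter of this example in software. It is a simplified behavioral model under stated assumptions: the class and method names are made up, all registers are assumed to start at 0, 12-bit overflow is ignored, and constant loading is modeled as a separate call rather than as part of the same clock edge that shifts the samples.

```python
class Fir3Tap:
    """Behavioral model of the 3-tap FIR datapath with a registered Y output."""

    def __init__(self):
        self.xt = [0, 0, 0]   # registers xt0, xt1, xt2 (assumed reset to 0)
        self.c = [0, 0, 0]    # constant registers c0, c1, c2
        self.yreg = 0         # output register feeding Y

    def load_constant(self, address, value):
        """Model a load with CL=1; `address` is the 2-bit value Ca1Ca0."""
        if address not in (0, 1, 2):
            raise ValueError("address 11 is unused")
        self.c[address] = value

    def clock(self, x):
        """One clock edge: every register loads from the OLD register values."""
        y_next = sum(c * xv for c, xv in zip(self.c, self.xt))
        self.xt = [x, self.xt[0], self.xt[1]]   # data moves to the right
        self.yreg = y_next
        return self.yreg


fir = Fir3Tap()
for address in range(3):
    fir.load_constant(address, 1)   # hypothetical constants c0 = c1 = c2 = 1

outputs = [fir.clock(x) for x in [1, 2, 3, 4, 5]]
```

With all three constants set to 1, `outputs` is `[0, 1, 3, 6, 9]`: each value is the sum of the three samples held in the registers before that clock edge, which shows the one-cycle lag introduced by registering Y.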
Comparing Software and Custom Circuit Implementations
It's interesting to compare the performance of the hardware implementation of a 3-tap FIR filter with a software implementation. The critical path goes from the xt and c registers, through one multiplier, and through two adders, before reaching the Y register yreg.
HOW DOES IT WORK? VOICE QUALITY ON CELL PHONES.
Cellular telephones have become commonplace over the past decade. Cell phones operate in environments far noisier than regular "landline" telephones, including noise from automobiles, wind, crowds of talking people, etc. Thus, filtering out such noise is especially important in cell phones. Your cell phone contains at least one, and probably more like several, microprocessors and custom digital circuits. After converting the analog audio signal from the microphone into a digital audio stream of bits, part of the job of those digital systems is to filter out the background noise from the audio signal. Pay attention next time you talk to someone using a cell phone in a noisy environment, and notice how much less noise you hear than is probably actually heard by the microphone. As circuits continue to improve in speed, size, and power, filtering will likely improve further. Some state-of-the-art phones may even use two microphones, coupled with beamforming techniques (see Section 4.13), to focus in on a user's voice.
For the hardware implementation, let's assume that the adder has a 2 ns delay. Let's also assume that chaining the adders together results in the delays adding, so that two adders chained together have a delay of 4 ns (detailed analysis of the internal gates of the adders could show the delay to actually be slightly less). Let's assume the multiplier has a 20 ns delay. Then the critical path, or longest register-to-register delay (to be discussed further in Section 5.4), would be from c0 to yreg, going through the multiplier and two adders as shown in Figure 5.39. That path's length would be 20 + 4 = 24 ns. Note that the path from c1 to yreg would be equally long, but not longer. A critical path of 24 ns means the datapath could be clocked at a frequency of 1 / 24 ns = 42 MHz. In other words, a new sample could appear at X every 24 ns, and new output would appear at Y every 24 ns.
Now let's consider the hardware performance of a larger-sized filter: a 100-tap FIR filter rather than a 3-tap filter. The main performance difference is that we'll need to add 100 values rather than just three. Recall from Section 4.13 that an adder tree is a fast way to add many values. One hundred values will require a tree with seven levels: 50 additions, then 25, then 13 (roughly), then 7, then 4, then 2, then 1. So the total delay would be 20 ns (for the multiplier) plus seven adder delays (7 * 2 ns = 14 ns), for a total delay of 34 ns.
For a software implementation, we'll assume 10 ns per instruction. Assume each multiplication or addition would require two instructions. A 100-tap filter would need approximately 100 multiplications and 100 additions, so the total time would be (100 multiplications * 2 instr/mult + 100 additions * 2 instr/add) * 10 ns per instruction = 4000 ns. In other words, the hardware implementation would be over 100 times faster (4000 ns / 34 ns) than the software implementation. A hardware implementation could therefore process 100 times more data than a software implementation, resulting in much better filtering.
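The delay estimates in this comparison follow from a few formulas, sketched below. The component delays and instruction costs are the text's assumed values; the function names are made up for this illustration, and the adder-tree depth is computed as the smallest number of halving levels needed to reduce n values to one.

```python
ADDER_NS = 2        # assumed adder delay from the text
MULTIPLIER_NS = 20  # assumed multiplier delay from the text


def adder_tree_levels(n_values):
    """Levels in a balanced adder tree that sums n_values inputs."""
    return (n_values - 1).bit_length()   # equals ceil(log2(n)) for n >= 1


def hardware_delay_ns(taps):
    """Critical path: one multiplier followed by the adder tree."""
    return MULTIPLIER_NS + adder_tree_levels(taps) * ADDER_NS


def software_time_ns(taps, ns_per_instr=10, instr_per_op=2):
    """One multiply and one add per tap, each costing instr_per_op instructions."""
    return (taps * instr_per_op + taps * instr_per_op) * ns_per_instr


def max_frequency_mhz(period_ns):
    """Clock frequency in MHz for a given period in ns."""
    return 1000 / period_ns


three_tap_ns = hardware_delay_ns(3)       # 20 + 2 levels * 2 ns = 24 ns
hundred_tap_ns = hardware_delay_ns(100)   # 20 + 7 levels * 2 ns = 34 ns
hundred_tap_sw_ns = software_time_ns(100) # 4000 ns
speedup = hundred_tap_sw_ns / hundred_tap_ns
```

For three taps the two tree levels correspond to the two chained adders of Figure 5.39, giving 24 ns and roughly 42 MHz; for 100 taps the ratio of 4000 ns to 34 ns is a little under 118.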
5.4 DETERMINING CLOCK FREQUENCY
RTL design produces a processor, consisting of a datapath and a controller. Inside the datapath and controller are registers, and registers require a clock signal. A clock signal must have a particular frequency. The frequency will determine how fast the system will execute the specified tasks. Obviously, a lower frequency will result in slower execution, while a higher frequency will result in faster execution. Conversely, a larger period is slower, while a smaller period is faster.
Designers of digital circuits often (but not always) want their systems to execute as fast as possible. However, a designer cannot choose an arbitrarily high clock frequency (meaning an arbitrarily small period). Consider, for example, the simple circuit in Figure 5.40, in which registers a and b feed through an adder into register c. The adder has a delay of 2 ns, meaning that when the adder's inputs change, the adder's outputs will not be stable until after 2 ns; before 2 ns, the adder's outputs will have spurious values (see Section 4.3 for a description of spurious values appearing at an adder's outputs). If the designer chooses a clock period of 10 ns, the circuit should work fine. Shortening the period to 5 ns will speed the execution. But shortening the period to 1 ns will result in incorrect circuit behavior. One clock cycle might load new values into registers a and b. The next clock cycle will load register c 1 ns later (as well as a and b), but the output of the adder won't be stable until 2 ns have passed. The value loaded into register c will thus be some spurious value that has no useful meaning, and will not be the sum of a and b.
Figure 5.40 Longest path is 2 ns.
Thus, a designer must be careful not to set the clock frequency too high. To determine the highest possible frequency, a designer must analyze the entire circuit, and find the longest path delay from any register to any other register, or from any circuit input to any register. The longest register-to-register or input-to-register delay in a circuit is known as the circuit's critical path. Designers then choose a clock whose period is longer than the circuit's critical path.
Figure 5.41 illustrates a circuit with at least four possible paths from any register to any other register:
• One path starts at register a, goes through the adder, and ends at register c. This path's delay is 2 ns.
• Another path starts at register a, goes through the adder, through the multiplier, and ends at register d. This path's delay is 2 ns + 5 ns = 7 ns.
• Another path starts at register b, goes through the adder, through the multiplier, and ends at register d. This path's delay is also 2 ns + 5 ns = 7 ns.
• The last path starts at register b, goes through the multiplier, and ends at register d. This path's delay is 5 ns.
Figure 5.41 Determining the critical path: Max(2, 7, 7, 5) = 7 ns.
The longest path is thus 7 ns (there are actually two such paths). Thus, the clock period must be at least 7 ns.
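The critical-path calculation above is a maximum over the path delays. Here is a minimal C sketch, assuming the path delays have already been collected into an array; the function names critical_path_ns and max_frequency_mhz are illustrative and do not come from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the longest delay (in ns) among n register-to-register paths. */
unsigned critical_path_ns(const unsigned delays_ns[], size_t n)
{
    assert(delays_ns != NULL && n > 0);
    unsigned longest = delays_ns[0];
    for (size_t i = 1; i < n; i++) {
        if (delays_ns[i] > longest) {
            longest = delays_ns[i];
        }
    }
    return longest;
}

/* Highest whole-MHz clock frequency whose period is at least period_ns.
   Since f = 1 / T, a 1 ns period corresponds to 1000 MHz. */
unsigned max_frequency_mhz(unsigned period_ns)
{
    assert(period_ns > 0);
    return 1000u / period_ns;   /* integer division rounds down */
}
```

Called with the four delays 2, 7, 7, and 5 ns, critical_path_ns returns 7. A 7 ns period corresponds to a clock of at most about 142 MHz.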
5.4 Determining Clock Frequency
The above analysis assumes that the only delay between registers is caused by logic delays. In reality, wires also have a delay. In the 1980s and 1990s, the delay of logic dominated over the delay of wires, and wire delays were often negligible. But in modern chip technologies, the delay of wires may equal or even exceed the delay of logic, and thus wire delays cannot be ignored. Wire delays add to a path's length just as logic delays do. Figure 5.42 illustrates a path length calculation with wire delays included.
Figure 5.42 Longest path is 3 ns considering wire delays.
Furthermore, the above analysis does not consider setup times for the registers. Recall from Section 3.5 that flip-flop inputs (and hence register inputs) must be stable for a specified amount of time before a clock edge. The setup time adds to the path length.
Even considering wire delays and setup times, designers typically choose a clock period that is still longer than the critical path, by an amount depending on how conservative the designer wants to be with respect to ensuring the circuit works under a variety of operating conditions. Certain conditions can change the delay of circuit components, conditions like very high temperature, very low temperature, vibration, age, etc. Generally, the longer the period beyond the critical path, the more conservative the design. For example, we might determine that the critical path is 7 ns, but we might choose a clock period of 10 ns, or even 15 ns, the latter being quite conservative.
If low power is a design goal, then a designer might choose an even longer period, such as 100 ns, to reduce circuit power. Why reducing the clock frequency reduces power will be discussed in Section 6.6.
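The period selection described above can be sketched as simple arithmetic: add the logic, wire, and setup delays along the critical path, then stretch the result by a safety margin. The function name, the use of picoseconds, and the percentage-margin parameter are assumptions made for illustration. The text does not prescribe a formula for the margin.

```c
#include <assert.h>
#include <limits.h>

/* Clock period in picoseconds: the critical-path delay (logic + wire +
   setup) stretched by a designer-chosen margin given in percent.
   The result is rounded up so the period is never shorter than intended. */
unsigned long clock_period_ps(unsigned long logic_ps, unsigned long wire_ps,
                              unsigned long setup_ps, unsigned margin_percent)
{
    unsigned long path_ps = logic_ps + wire_ps + setup_ps;
    unsigned long factor = 100UL + margin_percent;
    assert(path_ps <= (ULONG_MAX - 99UL) / factor);   /* overflow guard */
    return (path_ps * factor + 99UL) / 100UL;         /* ceiling division */
}
```

For a 7 ns critical path (7000 ps) and a 50% margin, the function returns 10500 ps, slightly more conservative than the 10 ns choice mentioned above.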
When analyzing a processor (controller and datapath) to find the critical path, a designer must be aware that register-to-register paths exist not just within the datapath (Figure 5.43(a)), but also within the controller (Figure 5.43(b)), between the controller and datapath (Figure 5.43(c)), and even between the processor and external components.
The number of possible paths in a circuit can be quite large. Consider a circuit with N registers that has paths from every register to every other register. Then there are N*N possible register-to-register paths. For example, if N is 3 and the three registers are named A,
CONSERVATIVE CHIP MAKERS, AND PC OVERCLOCKING
Chip makers usually publish their chips' maximum clocking frequency somewhat lower than the real maximum, perhaps 10%, 20%, or even 30% lower. Such conservatism reduces the chances that the chip will fail in unanticipated situations, such as extremes of hot or cold weather, or slight variations in the chip manufacturing process. Many personal computer enthusiasts have taken advantage of such conservatism by "overclocking" their PCs, meaning to set the clock frequency higher than a chip's published maximum, by changing the PC's BIOS (basic input/output system) settings. Numerous Web sites post statistics on the successes and failures of people trying to overclock nearly every PC processor; the norm is about 10% higher than the published maximum. I don't recommend overclocking (for one, you may damage the microprocessor due to overheating), but it is interesting to see the presence of such conservative design.
5 Register-Transfer Level (RTL) Design
Figure 5.43 Critical paths throughout a circuit: (a) within a datapath, (b) within a controller, (c) between a controller and datapath.
B, and C, then the possible paths are: A->A, A->B, A->C, B->A, B->B, B->C, C->A, C->B, C->C, for 3*3 = 9 possible paths. For N=50, there may be up to 2500 possible paths. Because of the large number of possible paths, automated tools can be of great assistance. Timing analysis tools automatically analyze all paths to determine the longest path, and may also ensure that setup and hold times are satisfied throughout the circuit.
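To give a feel for what such a tool automates, the sketch below scans all N*N register pairs in a delay table and reports the longest delay. It is a toy under stated assumptions: the table layout, the NO_PATH marker, and the function name are hypothetical, and real timing analysis tools also derive the delays from the circuit and check setup and hold times.

```c
#include <assert.h>
#include <stddef.h>

#define NO_PATH (-1)   /* marks a register pair with no connecting path */

/* Scans an n-by-n table stored row by row, where delay_ns[src * n + dst]
   is the delay from register src to register dst, or NO_PATH.
   Returns the longest delay found, or NO_PATH if the table has no paths. */
int longest_path_ns(const int *delay_ns, size_t n)
{
    assert(delay_ns != NULL);
    int longest = NO_PATH;
    for (size_t src = 0; src < n; src++) {
        for (size_t dst = 0; dst < n; dst++) {   /* n * n candidate paths */
            int d = delay_ns[src * n + dst];
            if (d > longest) {
                longest = d;
            }
        }
    }
    return longest;
}
```

For N = 50 the two loops visit the 2500 candidate pairs mentioned above.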
5.5 BEHAVIORAL-LEVEL DESIGN: C TO GATES (OPTIONAL)
As transistors per chip continue to increase, and hence designers build more complex digital systems that use those additional transistors, digital system behavior becomes increasingly difficult to understand. Frequently, a designer building a new digital system finds it useful to first describe the desired system behavior using a programming language, like C, C++, or Java, in order to first get that desired behavior correct. (Alternatively, the designer may use the high-level programming constructs in a hardware description language, like VHDL or Verilog, to first get the desired behavior correct.) Then, the designer converts that programming language description to an RTL design, by following the RTL design method that usually starts with a high-level state machine RTL description. Converting a system's programming language description to an RTL description is known as behavioral-level design. We'll introduce behavioral-level design using an example.
EXAMPLE 5.9 Sum-of-absolute-differences in C for video compression
Recall Example 5.7, in which we created a sum-of-absolute-differences component. In that example, we started with a high-level state machine, but that state machine wasn't very easy to understand. We can more easily describe the computation of the sum of absolute differences using C code, as shown in Figure 5.44.
int SAD (byte A[256], byte B[256]) // not quite C syntax
{
   uint sum; short uint i;
   sum = 0;
   i = 0;
   while (i < 256) {
      sum = sum + abs(A[i] - B[i]);
      i = i + 1;
   }
   return (sum);
}
Figure 5.44 C program description of a sum-of-absolute-differences computation. The C program may be easier to develop and easier to understand than a state machine.
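The figure's code is marked as not quite C syntax. A compilable version might look like the sketch below, which assumes that byte maps to uint8_t and uint to uint32_t. Those type choices and the lowercase function name sad are assumptions made here, not part of the figure.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SAD_LEN 256

/* Sum of absolute differences of two 256-byte arrays. */
uint32_t sad(const uint8_t a[SAD_LEN], const uint8_t b[SAD_LEN])
{
    assert(a != NULL && b != NULL);
    uint32_t sum = 0;
    int i = 0;
    while (i < SAD_LEN) {
        /* a[i] and b[i] are promoted to int, so the difference may be
           negative before abs() is applied. */
        sum = sum + (uint32_t)abs(a[i] - b[i]);
        i = i + 1;
    }
    return sum;
}
```

The largest possible result is 256 * 255 = 65280, so a 32-bit sum cannot overflow.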
That code is much easier to understand for most people than the high-level state machine in Figure 5.29. Thus, for some designs, C code (or something similar) is the most natural starting point. To begin the RTL design method, we could convert this code to a high-level state machine, like that in Figure 5.29, and then proceed to complete the RTL design method and hence design the circuit.
It is instructive to define a structured method for converting C code to a high-level state machine. Defining such a method makes clear to us that C code can be automatically compiled either to software on a programmable processor, or to a custom digital circuit. We point out that most designers that start with C code and then continue with RTL design do not necessarily follow a particular method in performing such conversion. However, automated tools do follow a method having some similarities to the one we now describe. We also point out that the conversion method will sometimes result in "extra" states that you might notice could be combined with other states. These extra states would be combined by a later optimization step, though we'll combine some of them as we follow the method.
We consider three types of statements in C code: assignment statements, while loops, and condition statements (if-then-else). We provide high-level state machine templates for each such statement.
An assignment statement in C translates simply into a state in a state machine, with the state's actions carrying out the assignment, as shown in Figure 5.45.
Figure 5.45 Template for assignment statement (target = expression;).
An if-then statement in C translates into a state that checks the condition of the if statement, and branches to the states for the then part if the condition is true, otherwise branching past those states to an end state, as shown in Figure 5.46.
Figure 5.46 Template for if-then statement:
if (cond) {
   // then stmts
}
We can translate an if-then-else statement in C into a similar state machine with a state that checks the condition of the if statement, but this time branching to states for the else part if the if condition is false, as shown in Figure 5.47.
The else part commonly contains another if statement, as C programmers may have multiple else if parts in a region of code.
Figure 5.47 Template for if-then-else statement:
if (cond) {
   // then stmts
}
else {
   // else stmts
}
Finally, a while loop statement in C translates into states similar to an if-then statement, except that after executing the while's statements, if the while condition is true, the state machine branches back to the condition check, rather than to the end state, as shown in Figure 5.48. Only when the condition is false can we reach the end state.
Figure 5.48 Template for while loop statement:
while (cond) {
   // while stmts
}
Given these simple templates, we can convert a wide variety of C programs to high-level state machines, from which we already know how to create circuit designs following our RTL design method.
EXAMPLE 5.10 Converting an if-then-else statement to a state machine
We are given the C-like code shown in Figure 5.49(a), which computes the maximum of two data inputs X and Y. We can translate that code to a state machine by first translating the if-then-else statement to states using the method of Figure 5.47, as shown in Figure 5.49(b). We then translate the then statements to states, and then the else statements, yielding the final state machine in Figure 5.49(c).
Inputs: uint X, Y
Outputs: uint Max
if (X > Y) {
   Max = X;
}
else {
   Max = Y;
}
Figure 5.49 Behavioral-level design starting from C code: (a) C code for computing the max of two numbers, (b) translating the if-then-else statement to a high-level state machine, (c) translating the then and else statements to states. From the state machine in (c), we could use our RTL design method to complete the design. Note: max can be implemented more efficiently; we use max here to provide an easy-to-understand example.
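One way to check a hand translation like this is to simulate the resulting state machine in software. The sketch below models the if-then-else template as a C switch statement, with each loop iteration standing for one clock cycle. The state names CHECK, THEN_PART, ELSE_PART, and END are hypothetical labels chosen here, not names from the figure.

```c
#include <assert.h>

/* States produced by the if-then-else template. */
enum max_state { CHECK, THEN_PART, ELSE_PART, END };

/* Steps the state machine until the end state is reached,
   then returns the value of the Max output. */
unsigned max_by_state_machine(unsigned x, unsigned y)
{
    enum max_state state = CHECK;
    unsigned max = 0;
    while (state != END) {
        switch (state) {
        case CHECK:       /* condition state: branch on X > Y */
            state = (x > y) ? THEN_PART : ELSE_PART;
            break;
        case THEN_PART:   /* then statements: Max = X */
            max = x;
            state = END;
            break;
        case ELSE_PART:   /* else statements: Max = Y */
            max = y;
            state = END;
            break;
        default:          /* END is never seen inside the loop */
            assert(0);
            break;
        }
    }
    return max;
}
```

Each call passes through exactly one of the two assignment states, mirroring the two branches of the template.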
EXAMPLE 5.11 SAD C code to high-level state machine conversion
We wish to convert the C program description of the sum-of-absolute-differences example of Example 5.9 to a high-level state machine. The code is shown in Figure 5.50(a), written as an infinite loop rather than a procedure call, and using an input "go" to indicate when the system should compute the SAD. The "while (1)" statement, after some optimization, translates just to a transition from the last state back to the first state, so we'll hold off on adding that transition until we have formed the rest of the state machine. We begin with the statement "while (!go);" which, based on the template approach, translates to the states shown in Figure 5.50(b). Since the loop has no statements in the loop body, we can simplify the loop's states as shown in Figure 5.50(c). Figure 5.50(c) also shows the states for the next two statements, which are assignment statements. Since those two assignments could be done simultaneously, we merge the two states into one, as shown in Figure 5.50(d). We then translate the next while loop, using the while loop template, to the states shown in Figure 5.50(e). We fill in the states for the while loop's statements in Figure 5.50(f), merging the two assignment statements' states into one state since the assignments can be done simultaneously. Figure 5.50(g) shows the state for the last statement of the C code, which assigns sad = sum. We eliminate obviously unnecessary empty states, and add a transition from the last state to the first state to account for the entire code being enclosed in a "while (1)" loop. Notice the similarity between our final high-level state machine in Figure 5.50(g) and the high-level state machine we designed from scratch in Figure 5.29.
Inputs: byte A[256], B[256]
        bit go;
Output: int sad
main()
{
   uint sum; short uint i;
   while (1) {
      while (!go);
      sum = 0;
      i = 0;
      while (i < 256) {
         sum = sum + abs(A[i] - B[i]);
         i = i + 1;
      }
      sad = sum;
   }
}
Figure 5.50 Behavioral-level design of the sum-of-absolute-differences code: (a) original C code, written as an infinite loop, (b) translating the statement "while (!go);" to a state machine, (c) simplified states for "while (!go);" and states for the assignment statements that follow, (d) merging the two assignment states into one, (e) inserting the template for the next while loop, (f) inserting the states for that while loop, merging two assignment statements into one, (g) the final high-level state machine, with the "while (1)" included by transitioning from the last state to the first state, and with obviously unnecessary states removed.
We will need to map the C data types to bits at some point. For example, the C code declares i to be a short unsigned integer, which means 16 bits. So we could declare i to be 16 bits in the high-level state machine. Or, knowing the range of i to be 0 to 256, we could instead define i to be 9 bits (C doesn't have a 9-bit wide data type).
We could then proceed to design a controller and datapath from this state machine, as was done in Figure 5.30. Thus, we can translate C code to a circuit using a straightforward automatable method.
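The final state machine of this example can be mimicked in C to check that it computes the same result as the original loop. The sketch below is an illustration only: the state names are hypothetical, go is modeled as already asserted, and the simulation stops after one computation instead of returning to the first state as the "while (1)" loop would.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum sad_state { WAIT_GO, INIT, LOOP_CHECK, LOOP_BODY, WRITE_SAD };

/* Runs the state machine for one computation and returns the value
   written to the sad output. */
uint32_t sad_state_machine(const uint8_t a[256], const uint8_t b[256])
{
    assert(a != NULL && b != NULL);
    enum sad_state state = WAIT_GO;
    const int go = 1;
    uint32_t sum = 0, sad = 0;
    uint16_t i = 0;             /* 9 bits would suffice, as noted above */
    int done = 0;

    while (!done) {
        switch (state) {
        case WAIT_GO:           /* while (!go); */
            state = go ? INIT : WAIT_GO;
            break;
        case INIT:              /* sum = 0; i = 0; merged into one state */
            sum = 0;
            i = 0;
            state = LOOP_CHECK;
            break;
        case LOOP_CHECK:        /* condition of while (i < 256) */
            state = (i < 256) ? LOOP_BODY : WRITE_SAD;
            break;
        case LOOP_BODY:         /* both loop assignments in one state */
            sum += (uint32_t)abs(a[i] - b[i]);
            i = (uint16_t)(i + 1);
            state = LOOP_CHECK;
            break;
        case WRITE_SAD:         /* sad = sum; */
            sad = sum;
            done = 1;
            break;
        }
    }
    return sad;
}
```

The sequential C code updates sum before i, so reading a[i] with the old i matches the behavior of a state whose actions all use the register values from before the clock edge.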
Through the previous examples, you have seen how C code can be converted to a custom digital circuit using methods that are fully automatable.
General C code can contain additional types of statements, some of which can be easily translated to states. For example, a for loop can be translated to states by first transforming the for loop into a while loop. A switch statement can be translated by first translating the switch statement to if-then-else statements.
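The two rewrites just mentioned can be illustrated on small functions. The examples below are hypothetical and written only to show the transformations; each pair of functions computes the same result.

```c
/* Original form: a for loop. */
unsigned sum_for(unsigned n)
{
    unsigned sum = 0;
    for (unsigned i = 0; i < n; i++) {
        sum += i;
    }
    return sum;
}

/* The same computation with the for loop rewritten as a while loop,
   which the while loop template can then turn into states. */
unsigned sum_while(unsigned n)
{
    unsigned sum = 0;
    unsigned i = 0;          /* for-loop initialization */
    while (i < n) {          /* for-loop condition */
        sum += i;
        i++;                 /* for-loop update, moved to the end of the body */
    }
    return sum;
}

/* Original form: a switch statement. */
unsigned op_switch(unsigned op, unsigned a, unsigned b)
{
    switch (op) {
    case 0:  return a + b;
    case 1:  return a - b;
    default: return 0;
    }
}

/* The same selection rewritten as an if-then-else chain. */
unsigned op_if_else(unsigned op, unsigned a, unsigned b)
{
    if (op == 0) {
        return a + b;
    } else if (op == 1) {
        return a - b;
    } else {
        return 0;
    }
}
```

These simple rewrites assume loop bodies without continue statements and switch cases that do not fall through; such cases need extra care.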
Some C constructs pose problems for converting to a circuit, though. For example, pointers and recursion are not easy to translate. Thus, tools that automate behavioral design from C code typically impose restrictions on the allowable C code that can be handled by the tool. Such restrictions are known as subsetting the language.
While we have emphasized C code in this section, obviously any similar language, such as C++, Java, VHDL, Verilog, etc., can be converted to custom digital circuits, with appropriate language subsetting.
5.6 MEMORY COMPONENTS
Register-transfer level design involves instantiating and connecting datapath components to form datapaths, controlled by controllers. RTL design often utilizes some additional components outside the datapath and controller.
One such component is a memory. An MxN memory is a memory component able to store M data items of N bits each. Each data item in a memory is known as a word. Figure 5.51 depicts the storage available in an MxN memory.
We can generally categorize memories into two groups: RAM memory, which can be written to and read from, and ROM memory, which can only be read from. However, as we shall see, the distinction between the two categories is blurring due to new technologies.
Figure 5.51 Logical view of a memory: an MxN memory holds M words, each N bits wide.
Random Access Memory (RAM)
A RAM is logically the same as a register file (see Section 4.10): both components are memories whose words (each of which can be thought of as a register) can be individually read and written using address inputs. The differences between a RAM and a register file are:
• The size of M. We typically refer to smaller memories (from 4 to 512 or perhaps even 1024 words or so) as register files, and larger memories as RAMs.
• The bit storage implementation. For large numbers of words, a compact implementation becomes increasingly important. Thus, a RAM typically uses a very compact implementation for bit storage, rather than using a flip-flop.
• The memory's physical shape. For large numbers of words, the physical shape of the memory's implementation becomes important. A tall rectangular shape will have some short wires and some long wires, whereas a square shape will have all medium-length wires. A RAM therefore typically has a square shape, to reduce the memory's critical path. Reads are performed by first reading out an entire row of words, and then selecting the appropriate word (column) out of that row.
There's no clear-cut border between what defines a register file and what defines a RAM. Smaller memories (typically) tend to be called register files, and larger memories tend to be called RAMs. But you'll often see the terms used quite interchangeably.
A typical RAM is single-ported. Some RAMs are dual-ported. Adding more ports to RAMs is much less common than to register files, because a RAM's larger size makes the delay and size overhead of extra ports much more costly. Nevertheless, a RAM can have an arbitrary number of read ports and write ports, just like a register file.
Figure 5.52 shows a block diagram for a 1024x32 single-port RAM (M=1024, N=32). data is a 32-bit wide set of data lines that can serve either as input lines during writes or as output lines during reads. addr is a 10-bit input serving as the address lines during reads or writes. rw is a 1-bit control input that indicates whether the present operation should be a read or a write (e.g., rw = 0 means read, rw = 1 means write). en is a 1-bit control input that enables the RAM for reading or writing. If we don't want to read or write during a particular clock cycle, we set en to 0 to prevent a read or write (regardless of the value of rw).
Figure 5.52 1024x32 RAM block symbol.
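The interface just described can be captured in a small behavioral model. The sketch below is a software analogy, not a hardware description: it ignores clocking and the electrical behavior of the shared data lines, and the struct and function names are invented for illustration. The pointer argument data stands in for the bidirectional data lines.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_WORDS 1024          /* M = 1024 words, so addr needs 10 bits */

struct ram {
    uint32_t word[RAM_WORDS];   /* N = 32 bits per word */
};

/* Models one access to a single-port RAM.
   en = 0:          no operation; *data is left untouched.
   en = 1, rw = 1:  write *data into word[addr].
   en = 1, rw = 0:  read word[addr] into *data. */
void ram_access(struct ram *m, unsigned addr, int rw, int en, uint32_t *data)
{
    assert(m != 0 && data != 0);
    assert(addr < RAM_WORDS);
    if (!en) {
        return;
    }
    if (rw) {
        m->word[addr] = *data;
    } else {
        *data = m->word[addr];
    }
}
```

Writing a value to one address and then reading that address with rw = 0 returns the stored value, while any access with en = 0 leaves both the memory and the data value unchanged.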
WHY IS IT CALLED "RANDOM ACCESS" MEMORY?
In the early days of digital design, RAMs did not exist. If you had information you wanted your digital circuit to store, you stored it on a magnetic drum, or a magnetic tape. Tape drives (and drum drives too) had to spin the tape to get the head, which could read or write onto the tape, above the desired memory location. If the head was currently above location 900, and you wanted to write to location 999, the tape would have to spin past 901, 902, ... 998, until location 999 was under the head. In other words, the tape was accessed sequentially. When RAM was first released, its most appealing feature was that any "random" address could be accessed in the same amount of time as any other address, regardless of the previously read address. That's because there is no "head" used to access a RAM, and no spinning of tapes or drums. Thus, the term "random access" memory was used, and has stuck to this day.
Figure 5.53 shows the logical internal structure of an MxN RAM. By "logical" structure, we mean that we can think of the structure being implemented in that way, although a real physical implementation may possess a different actual structure. (As an analogy, a logical structure of a telephone includes a microphone and a speaker connected to a phone line, although real physical telephones vary tremendously in their implementations, including handheld devices, headsets, wireless connections, built-in answering machines, etc.) The main part of the RAM structure is the grid of bit storage blocks, also known as cells. A collection of N cells forms a word, and there are M words. The address inputs feed into a decoder, each output of which enables all the cells in one word corresponding to the present address values. The enable input en can disable the decoder and prevent any word from being enabled. The read/write control input rw also connects to every cell to control whether the cell will be written with wdata, or read out to rdata. The data lines are connected through one word's cell to the next word's cell, so each cell must be designed to only output its contents when enabled and thus output nothing when disabled, to avoid interfering with another cell's output.
Figure 5.53 Logical internal structure of a RAM. The address inputs, up to addr(A-1), feed the decoder, where A = log2 M; the read data outputs are rdata(N-1) through rdata0.
Notice that the RAM in Figure 5.53 has the same inputs and outputs as the RAM block diagram in Figure 5.52, except that the RAM in Figure 5.53 has separate write and read data lines whereas Figure 5.52 has a single set of data lines (a single port). Figure 5.54 shows how the separate lines might be combined inside a RAM having just a single set of data lines.
Figure 5.54 RAM data input/output for a single port.
Bit Storage in a RAM
Compared to a register file, the key feature of a RAM is its compactness. Recall from Chapter 3 that we implemented a bit storage block using a D flip-flop. Because RAMs store large numbers of bits, RAMs utilize a bit storage block that is more compact than a flip-flop. We thus discuss briefly the internal design of the bit storage blocks inside two popular types of RAM: static RAM and dynamic RAM. However, be forewarned that the internal design of those blocks involves electronics issues beyond the scope of this book, and instead is within the scope of textbooks on VLSI or advanced digital design. Fortunately, a RAM component hides the complexity of its internal electronics by using a memory controller, and thus a digital designer's interaction with a RAM remains as discussed in the previous section.
Static RAM
Static RAM uses a bit storage block involving two inverters connected in a loop, as shown in Figure 5.55. A bit d will go through the bottom inverter to become d', then back through the top inverter to become d again; thus, the bit is stored in the inverter loop. Notice that this bit storage block has an extra line, data', passing through it, compared with the "logical" RAM structure in Figure 5.53.
Figure 5.55 SRAM cell.
To write a bit into this inverter loop, we set the data line to the value of the desired bit, and data' to the complement. So to store a 1, the memory controller sets data=1 and data'=0, as shown in Figure 5.56. (To store a 0, the controller would have set data=0 and data'=1.) The controller then sets enable=1, which turns on both shown transistors. The data and data' values thus appear in the inverter loop as shown (overwriting whatever value was there before). Fully understanding why this circuit works involves electrical details beyond the scope of this discussion.
Figure 5.56 Writing a 1 to an SRAM cell.
Reading the stored bit can be done by first setting the data and data' lines both to 1 (an act known as precharging), and then by setting enable to 1. One of the enabled transistors will have a 0 at one end, causing the precharged 1 on data or data' to drop to a voltage slightly less than a regular logic 1. Both the data and data' lines connect to a special circuit called a sense amplifier that detects whether the voltage on data is slightly higher than on data', meaning logic 1 is stored, or whether the voltage on data' is slightly higher than on data, meaning logic 0 is stored. Again, details of the electronics are beyond the scope of this discussion.
Notice that the bit storage block of Figure 5.57 utilizes six transistors: two inside each of the two inverters, and two transistors outside the inverters. Six transistors are fewer than needed inside a D flip-flop. A tradeoff is that special circuitry must be used to read a bit stored in this bit storage block, whereas a D flip-flop outputs regular logic values directly. Such special circuitry slows the access time of the stored bit.
DRAM chips first appeared in the early 1970s, and could hold only a thousand bits. Modern DRAMs can hold many billions of bits.
RAM based on a six-transistor bit storage block, or similar such block, is known as a static RAM, or SRAM. A static RAM maintains the stored bit as long as power is supplied to the transistors. Except, of course, when the block is being written, the stored bit does not change; it is static (not changing).
Dynamic RAM
An alternative popular bit storage block used in RAM has only a single transistor per block. Such a block utilizes a (relatively large) capacitor at the output of the transistor, as shown in Figure 5.58(a).
Writing can occur when enable is 1: data of 1 will charge the top plate of the capacitor to a 1, while data of 0 will make it 0. When enable is returned to 0, a 1 on the top plate will begin to discharge across to the bottom plate of the capacitor and on to ground. (Why? Because that's what a capacitor does.) However, the capacitor is intentionally designed to be relatively large, so that the discharge takes a long time, during which time the bit d is effectively stored in the capacitor. Figure 5.58(b) provides a timing diagram illustrating the charge and discharge of the capacitor.
Figure 5.57 Reading an SRAM.
Figure 5.58 DRAM bit storage: (a) bit storage block, (b) discharge.
Reading can be done by first setting data to a voltage midway between 0 and 1, and then setting enable to 1. The value stored in the capacitor will alter the voltage on the data line, and that altered voltage can be sensed by special circuits connected to the data line that amplify the sensed value to either a logic 1 or a logic 0.
It turns out that reading the charge stored in the capacitor discharges the capacitor. Thus, the RAM must immediately write the read bit back to the bit storage block after reading the block. The RAM must contain a memory controller that automatically performs such a write back.
Because a bit stored in the capacitor gradually discharges to ground, the RAM must refresh every bit storage block before the bits completely discharge and hence the stored bit is lost. To refresh a bit storage block, the RAM must read the block and then write the read bit back to the block. Such refreshing may be done every few microseconds. The RAM must include a built-in memory controller that automatically performs these refreshes.
Note that the RAM may be busy refreshing itself at a time that we wish to read the RAM. Furthermore, every read must be followed by an automatic write. Thus, RAM based on one-transistor plus capacitor technology may be slower to access.
Because the stored bit changes (discharges) even when power is supplied and we are not writing the bit storage block, RAM based on the one-transistor plus capacitor bit storage block is known as dynamic RAM, or DRAM.
Compared to SRAM, DRAM is even more compact, requiring only one transistor per bit storage block rather than six transistors. The tradeoff is that DRAM requires refreshing, which ultimately slows the access time. Another tradeoff, not alluded to above, is that creating the relatively large capacitor in a DRAM requires a special chip fabrication process, and thus incorporating DRAM with regular logic can be costly. In the 1990s, incorporating DRAM with regular logic on the same chip was nearly unheard of. Technology advancements, however, have led to DRAM and logic appearing on the same chip in more and more cases.
Figure 5.59 graphically depicts the compactness advantages of SRAM over register files, and DRAM over SRAM, for storing the same number of bits.
Figure 5.59 Depiction of compactness benefits of SRAM and DRAM: an MxN memory implemented as a register file, as an SRAM, and as a DRAM (not to scale).
Using a RAM
Figure 5.60 shows timing diagrams describing how to write and read the RAM of Figure 5.52. The timing diagram in Figure 5.60 shows how to write the values 500 and 999 into locations 9 and 13 during clock edges 1 and 2, respectively. The next cycle shows how to read location 9 of the RAM, by setting addr=9 and rw=0 (meaning read). Shortly after rw becomes 0, data becomes 500 (the value we had previously stored in location 9). Notice that we had to disable our writing of data first (by setting it to Z), so as not to interfere with the data being read from the RAM. Also notice that this RAM's read functionality is asynchronous.
Figure 5.60 Reading and writing a RAM: (a) timing diagrams (rw = 1 means write; after clock edge 1, RAM[9] equals 500, and after clock edge 2, RAM[13] equals 999), (b) setup, hold, and access times.
The delay between our setting the rw line to read and the read data stabilizing at the data output is known as the RAM's access time or read time.
We now provide an example of using a RAM in an RTL design.
EXAMPLE 5.12 Digital sound recorder using a RAM
Let's design a system that can record sound, and can play that recorded sound. Such a recorder is found in various toys, in telephone answering machines, in cell phone announcements, and numerous other devices. We'll need an analog-to-digital converter to digitize the sound, a RAM to store the digitized sound, a digital-to-analog converter to output the digitized sound, and a processor to control both converters and the RAM. Figure 5.61 shows a block diagram of the system.
Figure 5.61 Utilizing a RAM in a digital sound recorder system (microphone, 4096x16 RAM, speaker).
To store digitized sound, the processor block can implement the high-level state machine segment shown in Figure 5.62. The machine first initializes its internal address counter a to 0 in state S. Next, in state T, the machine loads a value into the analog-to-digital converter to cause a new analog sample to be digitized, and sets the three-state buffer to pass that digitized value to the RAM's data lines. That state also sets the RAM address to the counter a's value, and sets the control lines to enable writing. The machine then transitions to state U, whose transitions check the value of a against 4095. That state also increments a. (Remember that the transitions from U will use the old value, not the incremented value, of a. Thus, the transitions compare with 4095, not 4096.) The machine returns to state T and hence continues writing samples in memory addresses as long as the memory is not yet filled (a < 4095). Notice that the comparison is with 4095, not 4096. This is because the action in state U of a = a + 1 does not occur until the next clock edge, so the comparison of a < 4095 on state U's outgoing transitions uses the old value of a, not the incremented value. (See the Section 5.3 discussion of common pitfalls.)
Figure 5.62 State machine for storing digitized sound in RAM.
To play back the stored digitized sound, the processor block can implement the high-level state machine segment shown in Figure 5.63. After initializing the counter a in state V, the machine enters state W. State W disables the three-state buffer, to avoid interfering with the RAM's output data that will appear during RAM reads. That state also sets the RAM address lines, and sets the RAM control lines to enable reading. Read data will thus appear on the data lines. The next state X loads a value into the digital-to-analog converter, to convert the data just read from RAM to the analog signal. That state also increments the counter a. The machine returns to state W to continue reading, until the entire memory has been read.
Figure 5.63 State machine for playing sound from the RAM.
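The recording state machine's comparison against 4095 rather than 4096 can be checked with a short simulation. The sketch below is an illustration under stated assumptions: the input array samples stands in for the analog-to-digital converter, the DONE state is a hypothetical stopping point that the text does not describe, and the old value of a is captured explicitly to mimic a register that updates only at the next clock edge.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_WORDS 4096

enum rec_state { S, T, U, DONE };

/* Simulates the recording state machine and returns the number of
   RAM writes performed. */
unsigned record(uint16_t ram[RAM_WORDS], const uint16_t samples[RAM_WORDS])
{
    assert(ram != 0 && samples != 0);
    enum rec_state state = S;
    unsigned a = 0;
    unsigned writes = 0;

    while (state != DONE) {
        switch (state) {
        case S:                     /* a = 0 */
            a = 0;
            state = T;
            break;
        case T:                     /* digitize a sample, write RAM[a] */
            ram[a] = samples[a];
            writes++;
            state = U;
            break;
        case U: {
            unsigned old_a = a;     /* transitions see the old value of a */
            a = a + 1;              /* takes effect at the next clock edge */
            state = (old_a < 4095) ? T : DONE;
            break;
        }
        default:                    /* DONE is never seen inside the loop */
            assert(0);
            break;
        }
    }
    return writes;
}
```

The simulation performs exactly 4096 writes, to addresses 0 through 4095. Comparing the old value against 4096 instead would attempt one write past the end of the memory.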
Read-Only Memory (ROM)
A Read-Only Memory (ROM) is a memory that can be read from, but not written to. Because of being read only, the bit-storage mechanism in a ROM can be made to have several advantages over a RAM, including:
• Compactness. A ROM's bit storage may be even smaller than a RAM's.
• Nonvolatility. A ROM's bit storage maintains its contents even after the power supply to the ROM is shut off; when turned back on, the ROM's contents can be read again. In contrast, a RAM loses its contents when power is shut off. A memory that loses its contents when power is shut off is known as volatile, while a memory that maintains its contents without power is known as nonvolatile.
• Speed. A ROM may be faster to read than a RAM, especially than a DRAM.
• Low power. A ROM does not consume power to maintain its contents, in contrast to a RAM. Thus, a ROM consumes less power than a RAM.
Therefore, when the data stored in a memory will not change, we might choose to store that data in a ROM to gain the above advantages.
Figure 5.64 1024x32 ROM block symbol.

Figure 5.64 shows a block symbol of a 1024x32 ROM. The logical internal structure of an MxN
ROM is shown in Figure 5.65. Notice that the internal structure is very similar to the internal
structure of a RAM shown in Figure 5.53. Bit storage blocks forming a word are enabled by a
decoder output, with the decoder input being the address. However, because a ROM can only
be read and cannot be written, there is no need for an rw input control to specify read versus
write, nor for wdata inputs to provide data being written. Also, because no synchronous
writes occur in a ROM, the ROM does not have a clock input. In fact, not only is a ROM an
asynchronous component, but a ROM can be thought of as a combinational component (when
we only read from the ROM; we'll see variations later).
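Viewing a ROM as a combinational component means its output depends only on the present address and enable inputs. A minimal sketch of that view follows; the helper name make_rom is hypothetical, and returning None for a disabled ROM is an assumption standing in for an undriven output.

```python
def make_rom(contents, word_bits):
    """Return a read function for a ROM with fixed contents."""
    mask = (1 << word_bits) - 1
    words = tuple(w & mask for w in contents)   # contents never change after creation

    def read(addr, en=1):
        # No clock and no stored state: the result is a pure function of the inputs.
        return words[addr] if en else None      # None models a disabled output (assumption)

    return read
```

Because read has no side effects, calling it twice with the same address always yields the same word, which is the defining property of combinational logic.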
Some readers might at this point be wondering how we write the initial contents of a
ROM that we then can only read. After all, if we can't write the contents of a ROM at all,
then the ROM is really of no use to us. Obviously, there must be a way to write the
contents of a ROM, but in ROM terminology, the writing of the initial contents of a ROM is
known as ROM programming. ROM types differ in their bit storage block implementations,
which in turn causes differences in the methods used for ROM programming. We
now describe several popular bit storage block implementations for ROMs.
ROM Types
Figure 5.65 Logical internal structure of a ROM. Let A = log2 M. The address inputs addr0
through addr(A-1) and the enable input en feed a decoder whose outputs are the word enables;
each bit storage block (a "cell") connects a word enable to one of the data lines data(N-1)
through data0.

Mask-programmed ROM
Figure 5.66 illustrates the bit storage cell for a mask-programmed ROM. A mask-programmed
ROM has its contents programmed when the chip is manufactured, by directly wiring 1s to
cells that should store a 1, and 0s to cells that should store a 0. Recall that a "1" is actually
a higher-than-zero voltage coming from one of several power input pins to a chip; thus,
wiring a 1 means wiring the power input pin directly to the cell. Likewise, wiring a 0 to a
cell means wiring the ground pin directly to the cell. Be aware that Figure 5.66 presents a
logical view of a mask-programmed ROM cell; the actual physical design of such cells may be
somewhat different. For example, a common design strings several vertical cells together to
form a large NOR-like logic gate. We leave details for more advanced textbooks on CMOS
circuit design.

Figure 5.66 Mask-programmed ROM cells: left cell programmed with 1, right cell with 0.

Wires are placed onto chips during manufacturing by using a combination of light-sensitive
chemicals and light passed through lenses and "masks" that block the light from
reaching regions of the chemicals. (See Chapter 7 for further details.) Hence the term
"mask" in mask-programmed ROM.
Mask-programmed ROM has the best compactness of any ROM type, but the contents
of the ROM must be known during chip manufacturing. This ROM type is best
suited for high-volume well-established products in which compactness or very low cost
is critical, and in which programming of the ROM will never be done after the ROM's
chip is manufactured.
Fuse-Based Programmable ROM: One-Time Programmable (OTP) ROM
Figure 5.67 illustrates the bit storage cell for a fuse-based ROM. A fuse-based ROM uses a
fuse in each cell. A fuse is an electrical component that initially conducts from one end to
the other just like a wire, but whose connection from one end to the other can be destroyed
("blown") by passing a higher-than-normal current through the fuse. A blown fuse does not
conduct and is instead an open circuit (no connection). In the figure, the cell on the left has
its fuse intact, so when the cell is enabled, a 1 appears on the data line. The cell on the right
has its fuse blown, so when the cell is enabled, nothing appears on the data line (special
electronics will be necessary to convert that nothing to a logic 0).

Figure 5.67 Fuse-based ROM cells: left cell programmed with 1, right cell with 0.

A fuse-based ROM is manufactured with all fuses intact, so the initially stored contents
are all 1s. A user of this ROM can program the contents by connecting the ROM to
a special device, known as a programmer, that provides higher-than-normal currents to
only those fuses in cells that should store 0s. Because a user can program the contents of
this ROM, the ROM is known as a programmable ROM, or PROM.
A blown fuse cannot be changed back to its initial conducting form. Thus, a fuse-based
ROM can only be programmed once. Fuse-based ROMs are therefore also known as
one-time programmable (OTP) ROMs.
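The one-way nature of fuse programming can be sketched in software. This is only an illustrative model under stated assumptions: the class name FusePROM and its methods are hypothetical, and each bit is treated as one fuse that reads 1 while intact and 0 once blown.

```python
class FusePROM:
    """Toy model of a fuse-based PROM: bits start at 1 and can only be cleared."""

    def __init__(self, num_words, word_bits):
        self.mask = (1 << word_bits) - 1
        self.words = [self.mask] * num_words   # all fuses intact: contents are all 1s

    def program(self, addr, value):
        # Blowing fuses can only turn 1s into 0s, so the result is the AND
        # of the old contents and the requested value.
        self.words[addr] &= value & self.mask

    def read(self, addr):
        return self.words[addr]
```

For instance, programming 0xF0 and then 0x3C into the same 8-bit word leaves 0x30, because a bit that has been cleared can never return to 1.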
Erasable PROM (EPROM)
Figure 5.68 depicts a logical view of an erasable PROM cell. An erasable PROM, or EPROM,
cell uses a special type of transistor, having what is known as a floating gate, in each cell.
The details of a floating gate transistor are beyond the scope of this section, but briefly, a
floating gate transistor has a special gate in which electrons can be "trapped." A transistor
with electrons trapped in its gate stays in the nonconducting situation, and thus is
programmed to store a 0. Otherwise, the cell is considered to store a 1. Special electronic
circuitry converts sensed current on the data lines to logic 1 or 0.

Figure 5.68 EPROM cells: left cell programmed with 1, right cell (with trapped electrons)
programmed with 0.

An EPROM cell initially has no electrons trapped in any floating gate transistors, so
the initially stored contents are all 1s. A programmer device applies higher-than-normal
voltage to those transistors in cells that should store 0s. That high voltage causes
electrons to tunnel through a small insulator into the floating gate region. When the voltage is
removed, the electrons do not have enough energy to tunnel back, and thus are trapped as
shown in the right cell of Figure 5.68.
The electrons can be freed by exposing the electrons to ultraviolet (UV) light of a particular
wavelength. The UV light energizes the electrons such that they tunnel back through the
small insulator, thus escaping the floating gate region. Exposing an EPROM chip to UV light
therefore "erases" all the stored 0s, restoring the chip to having all 1s as contents, after
which the EPROM can be programmed again. Hence the term "erasable" PROM. Such a chip can
typically be erased and reprogrammed about ten thousand times or more, and can retain its
contents without power for ten years or more. Because a chip usually appears inside a black
package that doesn't pass light, a chip with an EPROM requires a window in that package
through which UV light can pass, as shown in Figure 5.69.

Figure 5.69 The "window" in the package of a microprocessor that uses an EPROM to store
programs.

EEPROM and Flash Memory
An electrically erasable PROM, or EEPROM, utilizes the EPROM programming method
of using high voltage to trap electrons in a floating gate transistor. However, unlike an
EPROM that requires UV light to free the electrons and hence erase the PROM, an
EEPROM uses another high voltage to free the electrons. EEPROMs thus avoid the need
for placing the chip under UV light.
Because EEPROMs use voltages for erasing, those voltages can be applied to specific
cells only. Thus, while EPROMs must be erased in their entirety, EEPROMs can be
erased one word at a time. Thus, we can erase and reprogram certain words in an
EEPROM without changing the contents of other words.
Some EEPROMs require a special programmer device for programming. However,
most modern EEPROMs do not require special voltages to be applied to the pins, and also
include internal memory controllers that manage the programming process. Thus, we can
reprogram an EEPROM's contents (or part of its contents) without ever removing the chip
from the system that the EEPROM serves; such an EEPROM is known as being in-system
programmable. Most such devices can therefore be read and written in a manner very
similar to a RAM.
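The difference in erase granularity between an EPROM and an EEPROM can be made concrete with a short sketch. The function names and the 8-bit word width are assumptions chosen for illustration; following the description above, an erased word reads as all 1s and programming can only change 1s to 0s.

```python
ERASED = 0xFF  # assumed 8-bit words; an erased word reads as all 1s


def eprom_erase(words):
    """UV erase: every word in the chip returns to all 1s."""
    return [ERASED] * len(words)


def eeprom_erase_word(words, addr):
    """Electrical erase of a single word; all other words are untouched."""
    out = list(words)
    out[addr] = ERASED
    return out


def program_word(words, addr, value):
    """Programming traps electrons, so it can only clear bits (1 -> 0)."""
    out = list(words)
    out[addr] &= value
    return out
```

Reprogramming one EEPROM word is then an erase of that word followed by a program of that word, while the neighboring words keep their contents.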
Figure 5.70 shows a block diagram of an EEPROM. Notice that the data lines are bidirectional,
just as was the case for RAM. The EEPROM has a control input write: write=0 indicates a read
operation (when en=1), while write=1 indicates that the data on the data lines should be
programmed into the word at the address specified by the address lines. Programming a word
into an EEPROM takes time, though, perhaps several, dozens, hundreds, or even thousands of
clock cycles. Therefore, an EEPROM may have a control output busy to indicate that
programming is not yet complete. While the device is busy, the EEPROM user should not try
writing to a different word. Fortunately, most EEPROMs will load the data to be programmed
and the address into internal registers, freeing the circuit that is writing the EEPROM from
having to hold these values constant during programming.

Figure 5.70 1024x32 EEPROM block symbol (32-bit data, 10-bit address, en and write inputs,
busy output).

Modern EEPROMs can be programmed tens of thousands to millions of times or more, and can
retain their contents for tens to one hundred years or more without power.
While erasing one word at a time is fine for some applications that utilize EEPROM,
other applications need to erase large blocks of memory quickly; for example, a digital
camera application would need to erase a block of memory corresponding to an entire
picture. Flash memory is a type of EEPROM in which all the words within a large block of
memory can be erased very quickly, perhaps simultaneously, rather than one word at a
time. A flash memory may be completely erased by setting an erase control input to 1.
Many flash memories also allow only a specific region, known as a block or sector, to be
erased while other regions are left untouched.

Using a ROM
We now provide examples of using a ROM in RTL designs.
EXAMPLE 5.13 Talking doll using a ROM
We wish to design a doll that speaks the message "Nice to meet you" whenever the doll's right
arm is moved. A block diagram of the system is shown in Figure 5.71. A vibration sensor in the
doll's right arm has an output v that is 1 when vibration is sensed. A processor block detects
the vibration and should then output a digitized version of the "Nice to meet you" message to
a digital-to-analog converter attached to a speaker. The "Nice to meet you" message will be the
prerecorded voice of a professional actress. Because that message will not change for the
lifetime of the doll product, we can store that message in a ROM.

Figure 5.71 Utilizing a ROM (4096x16) in a talking doll system, with a vibration sensor and
a speaker.

Figure 5.72 shows a high-level state machine segment that plays the message after detecting
vibration. The machine starts in state S, initializing the address counter a to 0, and waiting
for vibration to be sensed. When vibration is sensed, the machine proceeds to state T, which
reads the current location. The machine moves on to state U, which loads the digital-to-analog
converter with the read value from ROM, increments a, and proceeds back to T as long as a
hasn't reached 4095 (remember that the transition from U uses the value of a before the
increment, so compares to 4095, not to 4096).
Because this doll's message will never change, we choose to use a mask-programmed
ROM or an OTP ROM. We might utilize OTP ROM during prototyping or during initial sales of
the doll, and then produce mask-programmed ROM versions during high-volume production of
the doll.
EXAMPLE 5.14 Digital telephone answering machine using a flash memory
We are to design the outgoing announcement part of a telephone answering machine (e.g., "We're
not home right now, leave a message."). That announcement should be stored digitally, should be
recordable by the machine owner any number of times, and should be saved even if power is removed
from the answering machine. Recording begins immediately after the owner presses a record button,
which sets a signal rec to 1. Because we must be able to record the announcement, we cannot use
a mask-programmed ROM or OTP ROM. Because removing power should not cause the announcement
to be lost, we cannot use a RAM. Thus, we might choose an EEPROM or a flash memory.
We'll use a flash memory, shown in Figure 5.73. Notice the flash memory has the same interface
as a RAM, except that the flash memory has an extra input erase, which on this particular flash
memory clears the contents of the entire flash. While the flash memory is erasing itself, the
flash sets an output busy to 1, during which time we cannot write to the flash memory.

Figure 5.73 Utilizing a flash memory (4096x16) in a digital answering machine.
Figure 5.74 shows a high-level state machine segment for recording the announcement. The
state machine segment begins when the record button is pressed. State S activates the erase
of the flash memory (er=1), and then state T waits for the erasing to complete (bu'). Such
erasing should occur in just a few milliseconds, so we shouldn't miss any of the spoken
announcement. The state machine then transitions to state U, which copies a digitized sample
from the analog-to-digital converter to the flash memory, writing to the current address a.
State U also increments a. The next state (V) checks to see if the memory is filled with
samples by checking if a < 4096, returning to state U until the memory is filled.

Figure 5.74 State machine for storing digitized sound in a flash memory.
Notice that, unlike Examples 5.12 and 5.13, this state machine increments a before the state
that checks for the last address (state V), so V's transitions use 4096, not 4095. We show this
version just for variety. The version in Example 5.12 may be slightly better because that
version requires that a, and the comparator, only be 12 bits wide (to represent 0 to 4095)
rather than 13 bits wide (to represent 0 to 4096).
This state machine assumes that writes to the flash occur in one clock cycle. Some flash
memories require more time for writes, asserting their busy output until the write has
completed. For such a flash, we would need to add a state between states U and V, similar to
the state T between S and U. To prevent missing sound samples while waiting, we might want to
first save the entire sound sample in a 4096x16 RAM, and then copy the entire RAM contents to
the flash.
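The recording sequence of this example can be sketched as a cycle-by-cycle simulation. Everything here is a hypothetical model rather than the book's circuit: the Flash class, its erase_cycles parameter, the assumption that erased 16-bit words read as 0xFFFF, and the simplification that each loop iteration is one clock cycle with single-cycle writes.

```python
class Flash:
    """Hypothetical flash model: a full erase keeps busy at 1 for erase_cycles cycles."""

    def __init__(self, depth=4096, erase_cycles=3):
        self.words = [0] * depth
        self.erase_cycles = erase_cycles
        self.busy_left = 0

    @property
    def busy(self):
        return 1 if self.busy_left > 0 else 0

    def clock(self, erase=0, write=0, addr=0, data=0):
        """Advance one clock cycle with the given control inputs."""
        if self.busy_left > 0:
            self.busy_left -= 1                      # still erasing; inputs are ignored
        elif erase:
            self.words = [0xFFFF] * len(self.words)  # assumed erased value
            self.busy_left = self.erase_cycles
        elif write:
            self.words[addr] = data                  # assumed single-cycle write


def record(flash, samples):
    """Run states S, T, U, V until the flash is filled; return the cycle count."""
    state, a, cycles = "S", 0, 0
    source = iter(samples)
    while state != "DONE":
        if state == "S":                 # activate the erase
            flash.clock(erase=1)
            state = "T"
        elif state == "T":               # wait until busy returns to 0
            flash.clock()
            state = "T" if flash.busy else "U"
        elif state == "U":               # write one sample, then increment a
            flash.clock(write=1, addr=a, data=next(source))
            a += 1
            state = "V"
        elif state == "V":               # a is already incremented: compare with depth
            flash.clock()
            state = "U" if a < len(flash.words) else "DONE"
        cycles += 1
    return cycles
```

Because state V sees the incremented counter, its comparison uses the full depth (4096 for the memory in this example) rather than depth minus one.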
The Blurring of the Distinction between RAM and ROM
Notice that EEPROM and flash ROM blur the distinction between RAM and ROM. Many
modern EEPROM devices are writable just like a RAM, having nearly the same interface,
with the only difference being longer write times to an EEPROM than to a RAM. However,
the difference between those times is shrinking each year.
Further blurring the distinction are nonvolatile RAM (NVRAM) devices, which are
RAM devices that retain their contents even without power. Unlike ROM, NVRAM write
times are just as fast as regular RAM, typically one clock cycle. One type of NVRAM
simply includes an SRAM with a built-in battery, with the battery able to supply power to
the SRAM for perhaps ten years or more. Another type of NVRAM includes both an
SRAM and an EEPROM; the NVRAM controller automatically backs up the SRAM's
contents into the EEPROM, typically just at the time when power is being removed.
Furthermore, extensive research and development into new bit storage technologies are
leading to NVRAMs that are even closer to RAM in terms of performance and density
while being nonvolatile. One such technology is known as MAGRAM, short for magnetic
RAM, which uses magnetism to store bits, having access times similar to DRAM, but
without the need for refreshing, and with nonvolatility.
Thus, digital designers have a tremendous variety of memory types available to them,
with those types differing in their cost, performance, size, nonvolatility, ease of use, write
time, duration of data retention, and other factors.
5.7 QUEUES (FIFOs)
Sometimes our data storage needs specifically require that we read items in the same
order that we wrote them, and that reading removes the item from the list. For example,
a busy restaurant may maintain a waiting list of customers: the host writes customer
names to the rear of the list, but when a table becomes available, the host reads the next
customer's name from the front of the list and removes that name from the list. Thus,
the first customer written to the list is the first customer read from the list. A queue is
a list that is written at the rear of the list but read from the beginning of the list, with a read
also removing the read item from the list, as illustrated in Figure 5.75. The common term
for a queue in American English is a "line"; for example, you stand in a line at the grocery
store, with people entering the rear of the line, and being served from the front of the line.
In British English, the word queue is used directly in everyday language (which sometimes
confuses Americans who visit other English-speaking countries). Because the first item
written into the list will be the first item read out of the list, a queue is known as being
first-in first-out (FIFO). As such, sometimes queues are called FIFO queues, although that term
is redundant because a queue is by definition first-in first-out. The term FIFO itself is often
used to refer to a queue. The term buffer is also sometimes used. A write to a queue is
sometimes called a push or enqueue, and a read is sometimes called a pop or dequeue.

Figure 5.75 Conceptual view of a queue: items are written to the back (rear) of the queue and
read (and removed) from the front of the queue.
We can implement a queue using a memory, either a register file or a RAM, depending on the
queue size needed. When using a memory, the front and rear will move to different memory
locations as the queue is written and read, as illustrated in Figure 5.76. The figure shows an
initially empty eight-word queue with front and rear both set to memory address 0. The first
action on the queue is a write of item A, which goes to the rear (address 0), and the rear
increments to address 1. The next action is a write of item B, which goes to the rear
(address 1), and the rear increments to 2. The next action is a read, which comes from the
front (address 0) and thus reads out item A, and the front increments to 1.
Subsequent reads and writes continue likewise, except that when the rear or front reaches 7,
its next value should be 0, not 8. In other words, the memory can be thought of as a circle,
as shown in Figure 5.77.

Figure 5.76 Writing and reading a queue implemented in a memory causes the front (f) and
rear (r) to move.
Two conditions of a queue are of interest:
Empty: there are no items in the queue. This condition can be detected as front = rear,
as seen in the topmost queue of Figure 5.76.
Full: there is no more room to add items to the queue, meaning there are N items in a
queue of size N. This comes about when the rear wraps around and catches back up to the
front, meaning front = rear.

Figure 5.77 Implementing a queue in a memory treats the memory as a circle.
Unfortunately, notice that the conditions detecting the queue being empty and the
queue being full are the same: the front address equals the rear address. One way to tell
the two conditions apart is to keep track of whether a write or a read preceded the front
and rear addresses becoming equal.
In many uses of a queue, the circuit writing the queue operates independently from
the circuit reading the queue. Thus, a queue implemented with a memory may use a two-port
memory having separate read and write ports.
We can implement an 8-word queue using an 8-word two-port register file and additional
components, as depicted in Figure 5.78. A 3-bit up-counter maintains the front address, while
another 3-bit up-counter maintains the rear address. Notice that these counters will naturally
wrap around from 7 to 0, or from 0 to 7, as desired when treating the memory as a circle. An
equality comparator detects whether the front counter equals the rear counter. A controller
writes the write data to the register file and increments the rear counter during a write,
reads the read data from the register file and increments the front counter during a read, and
determines whether the queue is full or empty based on the equality comparison as well as
whether the previous operation was a write or a read. We omit further description of the
queue's controller, but it can be built by starting with an FSM.

Figure 5.78 Architecture of an 8-word 16-bit queue: an 8x16 register file (wdata, waddr, wr,
rdata, raddr, rd), 3-bit up-counters for rear and front (clr, inc), an equality comparator (eq),
and a controller producing the full and empty outputs.
A user of the queue should never read an empty queue or write a full queue;
depending on the controller design, such an action might just be ignored or might put the
queue into a misleading internal state (e.g., the front and rear addresses may cross over).
Most queues come with one or more additional control outputs that indicate whether
the queue is half full, or perhaps 80% full.
Queues are commonplace in digital systems. Some examples include:
A computer keyboard writes the pressed keys into a queue and meanwhile
requests that the computer read the queued keys. You might at some time have
typed faster than your computer was reading the keys, in which case your additional
keystrokes were ignored, and you may have even heard beeps each time
you pressed additional keys, indicating the queue was full.
A digital video camera may write recently captured video frames into a queue,
and concurrently may read those frames from the queue, compress them, and store
them on tape or another medium.
A computer printer may store print jobs in a queue while those jobs are waiting to
be printed.
A modem stores incoming data in a queue and requests a computer to read that
data. Likewise, the modem writes outgoing data received from the computer into
a queue and then sends that data out over the modem's outgoing medium.
A computer network router receives data packets from an input port and writes
those packets into a queue. Meanwhile, the router reads the packets from the
queue, analyzes the address information in the packet, and then sends the packet
along one of several output ports.
EXAMPLE 5.15 Using a queue
Show the internal state of an 8-word queue, and popped data values, after each of the
following sequences of pushes and pops, assuming an initially empty queue:
1. Push 9, 5, 8, 5, 7, 2, and 3
2. Pop
3. Push 6
4. Push 3
5. Push 4
6. Pop

Figure 5.79 Example pushes and pops of a queue. The pop in step 2 returns data 9; the queue
is full after step 4; step 5 is marked "ERROR! Pushing a full queue results in unknown state."

Figure 5.79 shows the queue's internal states. After the first sequence of seven pushes
(step 1), we see that the rear address points to address 7. The pop (step 2) reads from the
front address of 0, returning data of 9. The front address increments to 1. Note that although
the queue might still contain the value of 9 in address 0, that 9 is no longer accessible
during proper queue operation, and thus is essentially gone. The push of 6 (step 3) increments
the rear address, which wraps around from 7 to 0.
The push of 3 (step 4) increments the rear address to 1, which now equals the front address,
meaning the queue is now full. If a pop were to occur now, it would read the value 5. But
instead, a push of 4 occurs (step 5); this push should not have been performed, because the
queue is full. Thus, this push puts the queue into an erroneous state, and we cannot predict
the behavior of any subsequent pushes or pops.
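The steps of this example can be replayed with a small software model of the queue. The sketch below is an assumption-laden illustration, not the book's controller: the class name Queue8 is hypothetical, the full and empty tests use the "was the last operation a write" idea described earlier, and raising an exception on a bad push or pop is one possible error behavior chosen here for clarity.

```python
class Queue8:
    """Sketch of an 8-word queue: two wrapping address counters plus a last-operation flag."""

    SIZE = 8

    def __init__(self):
        self.mem = [None] * self.SIZE
        self.front = 0
        self.rear = 0
        self.last_was_write = False

    def empty(self):
        return self.front == self.rear and not self.last_was_write

    def full(self):
        return self.front == self.rear and self.last_was_write

    def push(self, item):
        if self.full():
            raise OverflowError("push on a full queue")
        self.mem[self.rear] = item
        self.rear = (self.rear + 1) % self.SIZE    # 3-bit counter wraps from 7 to 0
        self.last_was_write = True

    def pop(self):
        if self.empty():
            raise IndexError("pop on an empty queue")
        item = self.mem[self.front]
        self.front = (self.front + 1) % self.SIZE
        self.last_was_write = False
        return item
```

Running steps 1 through 4 on this model leaves front and rear both at address 1 with the queue reporting full, so the push in step 5 is rejected instead of corrupting the state.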
A queue could of course come with some error-tolerance behavior built in, perhaps
ignoring pushes when full, or perhaps returning some particular value (like 0) if popped
when empty.
5.8 HIERARCHY-A KEY DESIGN CONCEPT
Managing Complexity
Throughout this book, we have been utilizing a powerful design concept known as hierarchy.
Hierarchy in general is defined as an organization with a few "things" at the top,
and each thing possibly consisting of several other things. Perhaps the most widely
known type of hierarchy involves a country. At the top is a country, which consists of many
states or provinces, each of which in turn consists of many cities. A hierarchy involving a
country, provinces, and cities is shown in Figure 5.80. That figure shows all three levels
of the hierarchy: country, provinces, and cities.
Figure 5.81 shows the same country, but this time showing only the top two levels of
hierarchy: countries and provinces. Indeed, most maps of a country only show these top two
levels (possibly showing key cities in each province/state, but certainly not all the cities);
showing all the cities also makes the map far too detailed and cluttered. A map of a
province/state, however, might then show all the cities within that state. Thus, we see that
hierarchy plays an important role in understanding countries (or at least their maps).

Figure 5.80 Three-level hierarchy example: a country, made up of provinces, each made up of
cities.

Likewise, hierarchy plays an important role in digital design. In Chapter 2, we introduced the
most fundamental component in digital systems: the transistor. In Chapters 2 and 3, we
introduced several basic components composed from transistors, like AND gates, OR gates, and
NOT gates, and then some slightly more complex components composed from gates: multiplexers,
decoders, flip-flops, etc. In Chapter 4, we composed the basic components into a higher level
of components, datapath components, like registers, adders, ALUs, multipliers, etc. In
Chapter 5, we introduced components composed of datapath components, including controllers,
datapaths, processors (made up of controllers and datapaths), memories, and queues.

Figure 5.81 Hierarchy showing just the top two levels.
Use of hierarchy enables us to manage complex designs. Imagine trying to comprehend
the design of Figure 5.30 at the level of logic gates; that design likely consists of
several thousand logic gates. Humans can't comprehend several thousand things at once.
But they can comprehend a few dozen things. As the number of things grows beyond a
few dozen, we therefore group those things into a new thing, to manage the complexity.
However, hierarchy alone is not sufficient; we must also associate an understandable
meaning to the higher-level things we create, a task known as abstraction.
Abstraction
Hierarchy may not only involve grouping things into a larger thing, but may also involve
associating a higher-level behavior to that larger thing. So when we grouped transistors to
form an AND gate, we didn't just say that an AND gate was a group of transistors;
rather, we associated a specific behavior with the AND gate, with that behavior describing
the behavior of the group of transistors in an easily understandable way. Likewise, when
we grouped logic gates into a 32-bit adder, we didn't just say that an adder was a group
of logic gates; rather, we associated a specific understandable behavior with the adder: a
32-bit adder adds two 32-bit numbers.
Associating higher-level behavior with a component to hide the complex inner details
of that component is a process known as abstraction.
Abstraction frees a designer from having to remember, or even understand, the low-level
details of a component. Knowing that an adder adds two numbers, a designer can
use an adder in a design. The designer need not worry about whether the adder internally
is implemented using a carry-ripple design, or using some complicated design that is
perhaps faster but larger. Instead, the designer just needs to know the delay of the adder
and the size of the adder, which are further abstractions.
Composing a Larger Component from Smaller Versions of the Same Component
A common design task is to create a larger version of a component from smaller versions of
the same component. For example, suppose you have 3-input AND gates available to you, but you
need a 9-input AND gate. You could compose several 3-input AND gates to form a 9-input AND
gate, as shown in Figure 5.82. You could compose OR gates into a larger OR gate, and XOR
gates into larger XOR gates, similarly. Some compositions might require more than two levels:
composing an 8-input AND from 2-input ANDs requires four 2-input ANDs in the first level, two
2-input ANDs in the second level, and a 2-input AND in the third level.

Figure 5.82 Composing a 9-input AND gate from 3-input AND gates.

Some compositions might end up with extra inputs that must be hardwired to 0 or 1: an 8-input
AND built from 3-input ANDs would look similar to Figure 5.82, but with the bottom input of
the bottom AND gate hardwired to 1. After trying a few examples of composing AND gates into
larger ones, you can come up with a general rule to compose any size AND gates into a larger
gate: fill the first level with (the largest available) AND gates until the sum of their
inputs equals the desired number of inputs, then fill the second level similarly (feeding
first-level outputs to the second-level gates), until a level has just one gate (that's the
last level). Connect any unused AND gate inputs to 1. Composing NAND, NOR, or XNOR gates into
larger gates of the same kind would require a few more gates to maintain the same behavior.
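The general rule for composing AND gates can be checked with a short sketch. It is a simplification under stated assumptions: only one gate size is available (the rule above allows mixing sizes), and the function names and_tree_levels and wide_and are hypothetical.

```python
def and_tree_levels(num_inputs, gate_size):
    """Number of gates in each level when building a wide AND from gate_size-input ANDs."""
    levels = []
    signals = num_inputs
    while signals > 1:
        gates = -(-signals // gate_size)   # ceiling division: gates needed at this level
        levels.append(gates)
        signals = gates                    # each gate output feeds the next level
    return levels


def wide_and(bits, gate_size):
    """Evaluate the composed AND, tying any unused gate inputs to 1."""
    level = list(bits)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), gate_size):
            group = level[i:i + gate_size]
            group += [1] * (gate_size - len(group))   # unused inputs hardwired to 1
            next_level.append(int(all(group)))
        level = next_level
    return level[0]
```

With these definitions, and_tree_levels(9, 3) gives two levels of three gates and one gate, and and_tree_levels(8, 2) gives three levels of four, two, and one gates, matching the counts in the text.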
Multiplexers can also be composed together to form a larger multiplexer. For example, suppose you had 4x1 and 2x1 muxes available, but you needed an 8x1 mux. You could compose the smaller muxes into an 8x1 mux as shown in Figure 5.83. Notice that s2 selects among groups i0-i3 and i4-i7, while s1 and s0 select one input from the group. You can check that select line values pass the appropriate input through; for example, s2s1s0 = 000 passes i0, s2s1s0 = 100 passes i4, and s2s1s0 = 111 passes i7.
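The select-line behavior described above can be checked with a small behavioral model. The sketch below assumes the structure implied by the text (two 4x1 muxes sharing s1 and s0, feeding a 2x1 mux controlled by s2); the function names are illustrative only.

```python
def mux2(inputs, s0):
    """2x1 mux: pass inputs[0] when s0 is 0, inputs[1] when s0 is 1."""
    return inputs[s0]


def mux4(inputs, s1, s0):
    """4x1 mux: s1 s0 form a 2-bit index into the four inputs."""
    return inputs[2 * s1 + s0]


def mux8(inputs, s2, s1, s0):
    """8x1 mux composed from two 4x1 muxes and one 2x1 mux."""
    low_group = mux4(inputs[0:4], s1, s0)   # chooses among i0-i3
    high_group = mux4(inputs[4:8], s1, s0)  # chooses among i4-i7
    return mux2([low_group, high_group], s2)  # s2 picks the group


inputs = ["i0", "i1", "i2", "i3", "i4", "i5", "i6", "i7"]
```

With this model, `mux8(inputs, 1, 0, 0)` passes `i4`, in line with the check suggested in the text.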
One particularly commonly occurring composition problem is that of creating a larger memory from smaller ones. The larger memory may have wider words, may have more words, or both.
Figure 5.83 An 8x1 mux composed from 4x1 and 2x1 muxes.

For example, suppose you have available a large number of 1024x8 ROMs, but you want a 1024x32 ROM. Composing the smaller ROMs into the larger one is straightforward, and is shown in Figure 5.84. We'll need four 1024x8 ROMs to obtain 32 bits per word. We connect the 10 address inputs to all four ROMs. Likewise, we connect the enable input to all four ROMs. We group the four 8-bit outputs into our desired 32-bit output. Thus, each ROM stores one byte of the 32-bit word. Reading a location, say location 99, results in four simultaneous reads, one of the byte at location 99 of each ROM.
Figure 5.84 Composing a 1024x32 ROM from 1024x8 ROMs.
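A short behavioral sketch can make the word-widening idea concrete. It assumes a hypothetical `Rom` class, models a disabled ROM as returning `None`, and assumes the first ROM in the list supplies the most significant byte; none of these choices come from the text.

```python
class Rom:
    """Behavioral model of a small ROM with an enable input (hypothetical helper)."""

    def __init__(self, contents, width_bits):
        self.mask = (1 << width_bits) - 1
        self.contents = [value & self.mask for value in contents]

    def read(self, addr, en=1):
        # A disabled ROM drives nothing; None stands in for "no output".
        return self.contents[addr] if en else None


def read_1024x32(roms, addr, en=1):
    """Read one 32-bit word from four 1024x8 ROMs that share addr and en."""
    if not en:
        return None
    word = 0
    for rom in roms:  # roms[0] is assumed to hold the most significant byte
        word = (word << 8) | rom.read(addr, en)
    return word


def byte_rom(value_at_99):
    """Build a 1024x8 ROM that is all zeros except at location 99."""
    contents = [0] * 1024
    contents[99] = value_at_99
    return Rom(contents, width_bits=8)


roms = [byte_rom(0x12), byte_rom(0x34), byte_rom(0x56), byte_rom(0x78)]
word_at_99 = read_1024x32(roms, 99)
```

Reading location 99 performs four byte reads, one per ROM, and groups them into a single 32-bit word.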
As another example using ROM, suppose you again have 1024x8 ROMs available, but this time you need a 2048x8 ROM. So you have an extra address line because you have twice as many words to address. Figure 5.85 shows how to use two 1024x8 ROMs to create a 2048x8 ROM. The top ROM represents the top half of the memory (1024 words), and the bottom ROM the bottom half (1024 words). We use the 11th address line (a10) to enable either the top ROM or the bottom ROM; the other 10 bits represent the offset into the ROM. That 11th bit feeds into a 1x2 decoder, whose outputs feed into the ROM enables. Figure 5.86 uses a table of addresses to show how the 11th bit selects among the two smaller ROMs.
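The address split can be sketched as follows. This is a minimal model, assuming the top ROM holds addresses 0 to 1023 (a10 = 0) and the bottom ROM holds addresses 1024 to 2047 (a10 = 1); the helper names are hypothetical, and a disabled ROM is modeled as returning `None`.

```python
def decoder_1x2(i0):
    """1x2 decoder: exactly one of the two outputs is 1."""
    return (1, 0) if i0 == 0 else (0, 1)


def rom_read(contents, addr, en):
    """A ROM modeled as a list; a disabled ROM drives nothing (None)."""
    return contents[addr] if en else None


def read_2048x8(top, bottom, addr):
    """Read a byte from a 2048x8 ROM built from two 1024x8 ROMs."""
    a10 = (addr >> 10) & 1   # 11th address bit selects the ROM
    offset = addr & 0x3FF    # remaining 10 bits are the offset into that ROM
    en_top, en_bottom = decoder_1x2(a10)
    outputs = [rom_read(top, offset, en_top),
               rom_read(bottom, offset, en_bottom)]
    # Only one ROM is enabled, so the tied-together data lines carry its value.
    active = [value for value in outputs if value is not None]
    return active[0]


top = [i % 256 for i in range(1024)]
bottom = [255 - (i % 256) for i in range(1024)]
```

For instance, address 1029 has a10 = 1 and offset 5, so the read comes from location 5 of the bottom ROM.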
Actually, we could use any bit to select between the top ROM and bottom ROM. Designers commonly use the lowest-order bit (a0) to select. The top ROM would thus represent all evenly addressed words, the bottom ROM all oddly addressed words.
Finally, since only one ROM will be active at any time, we can tie together the output data lines to form our 8-bit output, as shown in Figure 5.85.
As a final example using ROM, suppose you needed a 4096x32 ROM, but had only 1024x8 ROMs available. In this situation, we need both to create more words, and wider words. The approach is straightforward: first, create a 4096x8 ROM by using 4 ROMs one on top of the other and by feeding the top two address lines to a 2x4 decoder to select the appropriate ROM, and then second, widen the ROM by adding 3 more ROMs to each row.
Most of the datapath components we introduced in Chapter 4 can be composed into larger versions of the same type of component.
Figure 5.85 Composing a 2048x8 ROM from 1024x8 ROMs.
Figure 5.86 When composing a 2048x8 ROM from two 1024x8 ROMs, we can use the highest address bit to choose among the two ROMs; the remaining address bits offset into the chosen ROM.
5.9 RTL DESIGN OPTIMIZATIONS AND TRADEOFFS (SEE SECTION 6.5)
Previous sections in this chapter described how to perform register-transfer level design to create processors consisting of a controller and a datapath. This section, which physically appears in the book as Section 6.5, describes how to create processors that are better optimized, or that trade off one feature for another (e.g., size for performance). One use of this book covers such RTL optimizations and tradeoffs immediately after introducing RTL design, meaning now. Another use introduces them later.
5.10 RTL DESIGN USING HARDWARE DESCRIPTION LANGUAGES (SEE SECTION 9.5)
This section, which physically appears in the book as Section 9.5, describes use of HDLs during RTL design. One use of this book describes such HDL use immediately after introducing RTL design (meaning now). Another use describes use of HDLs later.
5.11 PRODUCT PROFILE: CELL PHONE
A cell phone, short for cellular telephone and also known as a mobile phone, is a portable wireless telephone that can be used to make phone calls while moving about a city. Cell phones have made it possible to communicate with distant people nearly anytime and anywhere. Before cell phones, most telephones were tied to physical places like a home or an office. Some cities supported a radio-based mobile telephone system using a powerful central antenna somewhere in the city, perhaps atop a tall building. Because radio frequencies are scarce and thus carefully doled out by governments, such a radio telephone system could only use perhaps tens or a hundred different radio frequencies, and thus could not support large numbers of users. Those few users therefore paid a very high fee for the service, limiting such mobile telephone use to a few wealthy individuals and to key government officials. Those users had to be within a certain radius of the main antenna, measured in tens of miles, to receive service, and that service typically didn't work in another city.
Cells and Basestations
Cell phone popularity exploded in the 1990s, growing from a few million users to hundreds of millions of users in that decade (even though the first cell phone call was made way back in 1973, by Martin Cooper of Motorola, the inventor of the cell phone), and today it is hard for many people to remember life before cell phones. The basic technical idea behind cell phones divides a city into numerous smaller regions, known as cells (hence the term cell phone). Figure 5.87 shows a city divided into three cells. A typical city might actually be divided into dozens, hundreds, or even thousands of cells. Each cell has its own radio antenna and equipment in the center, known as a basestation. Each basestation can use dozens or hundreds of different radio frequencies. Each basestation antenna only needs to transmit radio signals powerful enough to reach the basestation's cell area. Thus, nonadjacent cells can all use the same frequencies, so each radio frequency allowed for mobile phones can be shared by more than one phone at one time. Hence, far more users can be supported, leading to reduced costs per user. Figure 5.87 illustrates that phone1 in cell A can use the same radio frequency as phone2 in cell C, because the radio signals from cell A don't reach cell C. Supporting more users means greatly reduced cost per user, and more basestations means service in more areas than just major cities.

Figure 5.87 Phone1 in cell A can use the same radio frequency as phone2 in cell C, increasing the number of possible mobile phone users in a city.
Figure 5.88(a) shows a typical basestation antenna. The basestation's equipment may be in a small building or commonly in a small box near the base of the antenna. The antenna shown actually supports antennas from two different cellular service providers: one set on the top, one set just under, on the same pole. Land for the poles is expensive, which is why providers share, or sometimes find existing tall structures on which to mount the antennas, like buildings, park light posts, and other interesting places (e.g., Figure 5.88(b)). Some providers try to disguise their antennas to make them more soothing to the eye, as in Figure 5.88(c); the entire tree in the picture is artificial.

Figure 5.88 Basestations found in various locations.

All the basestations of a service provider connect to a central switching office of a city. The switching office not only links the cellular phone system to the regular "landline" phone system, but also assigns phone calls to specific radio frequencies, and handles switching among cells of a phone moving between cells.
How Cellular Phone Calls Work
Suppose you are holding phone1 in cell A of Figure 5.87. When you turn on the cell phone, the phone listens for a signal from a basestation on a control frequency, which is a special radio frequency used for communicating commands (rather than voice data) between the basestation and cell phone. If the phone finds no such signal, the phone reports a "No Service" error. If the phone finds the signal from basestation A, the phone then transmits its own identification (ID) number to basestation A. Every cell phone has its own unique ID number. (Actually, there is a nonvolatile memory card inside each phone that has that ID number; a phone user can potentially switch cards among phones, or have multiple cards for the phone, switching cards to change phone numbers.) Basestation A communicates this ID number to the central switching office's computer, and thus the service provider's computer database now records that your phone is in cell A. Your phone intermittently sends a control signal to remind the switching office of the phone's presence.
If somebody then calls your cell phone's number, the call may come in over the regular phone system, which goes to the switching office. The switching office computer database indicates that your phone is in cell A. In one type of cell phone technology, the switching office computer assigns a specific radio frequency supported by basestation A to the call. Actually, the computer assigns two frequencies, one for talking, one for listening, so that talking and listening can occur simultaneously on a cell phone; let's call that frequency pair a channel. The computer then tells your phone to carry out the call over the assigned channel, and your phone rings. Of course, it could happen that there are so many phones already involved with calls in cell A that basestation A has no available frequencies; in that case, the caller may hear a message indicating that the user is unavailable.
Placing a call proceeds similarly, but your cell phone initiates the call, ultimately resulting in assigned radio frequencies again (or a "system busy" message if no frequencies are presently available).
Suppose that your phone is presently carrying out a call with basestation A, and that you are moving through cell A toward cell B in Figure 5.87. Basestation A will see your signal weakening, while basestation B will see your signal strengthening, and the two basestations transmit this information to the switching office. At some point the switching office computer will decide to switch your call from basestation A to basestation B. The computer assigns a new channel for the call in cell B (remember, adjacent cells use different sets of frequencies to avoid interference), and sends your phone a command (through basestation A, of course) to switch to a new channel. Your phone switches to the new channel and thus begins communicating with basestation B. Such switching may occur dozens of times while a car drives through a city during a phone call, and is transparent to the phone user. Sometimes the switching fails, perhaps if the new cell has no available frequencies, resulting in a "dropped" call.
Inside a Cell Phone
Basic Components
A cell phone requires sophisticated digital circuitry to carry out calls. Figure 5.89 shows the insides of a typical basic cell phone. The printed-circuit boards include several chips implementing digital circuits. One of those circuits performs analog-to-digital conversion of a voice (or other sound) to a signal stream of 0s and 1s, and another performs digital-to-analog conversion of a received digital stream back to an analog signal. Some of the circuits, typically software on a microprocessor, execute tasks that manage the various features of the phone, such as the menu system, address book, games, etc. Note that any data that you save on your cell phone (e.g., an address book, customized ring tones, game high score information, etc.) will likely be stored on a flash memory, whose nonvolatility means the data stays saved in memory even if the battery dies or is removed. Another important task involves responding to commands from the switching office. Another task carried out by the digital circuits is filtering. One type of filtering removes the carrier radio signal from the incoming radio frequency. Another type of filtering removes noise from the digitized audio stream coming from the microphone, before transmitting that stream on the outgoing radio frequency. Let's examine filtering in more detail.

Figure 5.89 Inside a cell phone: (a) handset, (b) battery and ID card on left, keypad and display in center, digital circuitry on a printed-circuit board on right, (c) the two sides of the printed-circuit board, showing several digital chip packages mounted on the board.
Filtering, and FIR Filters
Filtering is perhaps the most common task performed in digital signal processing. Digital signal processing operates on a stream of digital data that comes from digitizing an input signal, such as an audio, video, or radio signal. Such streams of data are found in countless electronic devices, such as CD players, cell phones, heart monitors, ultrasound machines, radios, engine controllers, etc. Filtering a data stream is the task of removing particular aspects of the input signal, and outputting a new signal without those aspects.
A common filtering goal is to remove noise from a signal. You've certainly heard noise in audio signals: it's that hissing sound that's so annoying on your stereo, cell phone, or cordless phone. You've also likely adjusted a filter to reduce that noise, when you adjusted the "treble" control of your stereo (though that filter may have been implemented using analog methods rather than digital). Noise can appear in any type of signal, not just audio. Noise might come from an imperfect transmitting device, an imperfect listening device (e.g., a cheap microphone), background noise (e.g., freeway sounds coming into your cell phone), electrical interference from other electric appliances, etc. Noise typically appears in a signal as random jumps from a smooth signal.
Another common filtering goal is to remove a carrier frequency from a signal. A carrier frequency is a signal added to a main signal for the purpose of transmitting that main signal. For example, a radio station might broadcast a radio signal at 102.7 MHz. 102.7 MHz is the carrier frequency. The carrier signal may be a sine wave of a particular frequency (e.g., 102.7 MHz) that is added to the main signal, where the main signal is the music signal itself. A receiving device locks on to the carrier frequency, and then filters out the carrier signal, leaving the main signal.
An FIR filter (usually pronounced by saying the letters "F" "I" "R"), short for "Finite Impulse Response," is a very general filter design that can be used for a huge variety of filtering goals. The basic idea of an FIR filter is very simple: multiply the present input value by a constant, and add that result to the previous input value times a constant, and add that result to the next earlier input value times a constant, and so on. A designer using an FIR filter achieves a particular filtering goal simply by choosing the FIR filter's constants. Mathematically, an FIR filter can be described as follows:
y(t) = c0×x(t) + c1×x(t-1) + c2×x(t-2) + c3×x(t-3) + c4×x(t-4)
t is the present time step, x is the input signal, and y is the output signal. Each term (e.g., c0×x(t)) is called a tap. So the above equation represents a 5-tap FIR filter.
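The FIR equation can be modeled in software as a filter that receives one sample per time step and remembers the most recent inputs, much as a chain of registers would. The sketch below is an illustrative model only; the class name `FirFilter` is hypothetical, and inputs before the first sample are assumed to be 0.

```python
from collections import deque


class FirFilter:
    """Streaming FIR filter: y(t) = c0*x(t) + c1*x(t-1) + ... (one term per tap)."""

    def __init__(self, constants):
        self.c = list(constants)
        # history[k] holds x(t-k); inputs before the first sample are taken as 0.
        self.history = deque([0.0] * len(self.c), maxlen=len(self.c))

    def step(self, x_t):
        """Accept the present input and return the present output."""
        self.history.appendleft(x_t)  # the oldest sample falls off the far end
        return sum(ck * xk for ck, xk in zip(self.c, self.history))


# Feeding a single 1 followed by 0s reads the constants back out, one per step.
fir = FirFilter([1, 2, 3, 4, 5])
impulse_response = [fir.step(x) for x in [1, 0, 0, 0, 0, 0]]
```

The response dies out after five steps because the filter has five taps, which is where the "finite" in the name comes from.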
5.11 Product Profile: Cell Phone 283
FIR some exampl es of lhe versalilil y of an FIR fi lter. Assume we have a 5.tap
I d
ter. or slaners, 10 Simpl y pass a signal lhrough lhe filter unchanged, we sel cO 10
,an we el cl=c2- c3- c4-0 "'0 I' f . .
I h - - -. " amp I y an IOpUI SIgnal, we can sel cO 10 a number
arger t an I, perhaps sell ing cO 10 2. To creale a moothino fil ler thai OUlputs the averaoe
of the present val ue and lh " . "
. I e pasl our IOpUI values. we can SImply sel all the conSlants 10
equl va enl lhat add 10 I, namel y, c!=c2=c3=c4=c5=0.2. The results of uch a filter
applied 10 a nOI sy IOpul Signal are shown in Figure 5.90. To smoolh and amplify. we can
sel all conSlalll S 10 equi val I h .
c!=c2=c3=c4= ' _ enl va ues I at add W omethlOg grealer than I. for example,
. c5-1, resultlOg 10 5x ampllfi call on. To creale a smoothing filter thai onl y
IOciudes lhe previous lWO rather lhan four inpul values, we simply sel c3 and c4 10 O. We
see that we can build alilhe above different fill ers j usl by changing the conSlanl values of
an FIR fi lter. The FIR fi ller is indeed quile versatil e.
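The effect of each choice of constants can be tried out with a small list-based model. This is a sketch under the assumption that inputs before the first sample are 0; the helper name `fir` and the test signal are made up for illustration.

```python
def fir(x, c):
    """Apply an FIR filter with constants c to the list of samples x."""
    return [sum(c[k] * x[t - k] for k in range(len(c)) if t - k >= 0)
            for t in range(len(x))]


x = [10] * 8  # a constant input signal, chosen only for illustration

passed = fir(x, [1, 0, 0, 0, 0])               # pass through unchanged
amplified = fir(x, [2, 0, 0, 0, 0])            # amplify by 2
averaged = fir(x, [0.2, 0.2, 0.2, 0.2, 0.2])   # average of the last five inputs
smooth_amp = fir(x, [1, 1, 1, 1, 1])           # smooth and amplify by 5
```

Note that the averaged output only reaches the input level once five samples have arrived; before that, the assumed zeros pull the average down.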
Figure 5.90 5-tap FIR filter with c0=c1=c2=c3=c4=0.2 applied to a noisy signal (plot legend: original, noisy, fir_avg_out). The original signal is a sine wave. The noisy signal has random jumps. The FIR output is much smoother than the noisy signal, approaching the original signal. Notice that the FIR output is slightly shifted to the right, meaning the output is slightly delayed in time (probably a tiny fraction of a second delayed). Such slight shifting is usually not important to a particular application.
That versatility extends even further. We can actually filter out a carrier frequency using an FIR filter, by setting the coefficients to different values, carefully chosen to filter out a particular frequency. Figure 5.91 shows a main signal, in1, that we want to transmit. We can add that to a carrier signal, in2, to obtain the composite signal, in_total. The signal in_total is the signal that would be transmitted by a radio station, for example, with in1 being the signal of the music, and in2 the carrier.

Figure 5.91 Adding a main signal, in1, to a carrier signal, in2, resulting in a composite signal in_total.

Now say a stereo receiver receives that composite signal, and needs to filter out the carrier signal, so the music signal can be sent to the stereo speakers. To determine how to filter out the carrier signal, look carefully at the samples (the small filled squares in Figure 5.91) of that carrier signal. Notice that the sampling rate is such that if we take any sample, and add it to a sample from three time steps back, we get 0. That's because for a positive point, three samples earlier was a negative point of the same magnitude. For a negative point, three samples earlier was a positive point of the same magnitude. And for a zero point, three samples earlier was also a zero point. Likewise, adding a carrier signal sample to a sample three steps later also adds to zero. So to filter out the carrier signal, we can add each sample to a sample three time steps back. Or we can add each sample to 1/2 times a sample three steps back, plus 1/2 times a sample three steps ahead. We can achieve this using a 7-tap FIR filter with the following seven coefficients: 0.5, 0, 0, 1, 0, 0, 0.5. Since that sums to 2, we can scale the coefficients to add to 1, as follows: 0.25, 0, 0, 0.5, 0, 0, 0.25. Applying such a 7-tap FIR filter to the composite signal results in the FIR output shown in Figure 5.92. The main signal is restored. We should point out that we chose the main signal such that this example would come out very nicely; other signals might not be restored so perfectly. But the example demonstrates the basic idea.
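The carrier-cancelling argument can be checked numerically. The sketch below assumes a carrier that repeats every 6 samples (so that a sample and the one three steps earlier cancel) and a much slower main signal that repeats every 60 samples; these periods, and the names `main` and `carrier`, are illustrative assumptions rather than the values used in the figures.

```python
import math


def fir(x, c):
    """Apply an FIR filter with constants c; inputs before t=0 are taken as 0."""
    return [sum(c[k] * x[t - k] for k in range(len(c)) if t - k >= 0)
            for t in range(len(x))]


N = 120
main = [math.sin(2 * math.pi * t / 60) for t in range(N)]    # slow main signal
carrier = [math.sin(2 * math.pi * t / 6) for t in range(N)]  # period of 6 samples
total = [m + c for m, c in zip(main, carrier)]               # composite signal

constants = [0.25, 0, 0, 0.5, 0, 0, 0.25]
carrier_only_out = fir(carrier, constants)
out = fir(total, constants)

# Once the filter has seen 7 samples (t >= 6), the carrier contributes nothing,
# and the output tracks the main signal delayed by three time steps.
residual = max(abs(v) for v in carrier_only_out[6:])
tracking_error = max(abs(out[t] - main[t - 3]) for t in range(6, N))
```

In this setup the main signal comes back slightly attenuated and delayed by three samples, consistent with the slight delay noted for the filtered output.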
Figure 5.92 Filtering out the carrier signal using a 7-tap FIR filter with constants 0.25, 0, 0, 0.5, 0, 0, 0.25 (plot legend: in_total, fir_out). The slight delay in the output signal typically poses no problem.
While 5-tap and 7-tap FIR filters can certainly be found in practice, many FIR filters may contain tens or hundreds of taps. FIR filters can certainly be implemented using software (and often are), but many applications require that the hundreds of multiplications and additions for every sample be executed faster than is possible in software, leading to custom digital circuit implementations. Example 5.8 illustrated the design of a circuit for an FIR filter.
Many types of filters exist other than FIR filters. Digital signal filtering is part of a larger field known as digital signal processing, or DSP. DSP has a rich mathematical foundation and is a field of study in itself. Advanced filtering methods are what make cell phone conversations as clear as they are today.
5.12 CHAPTER SUMMARY
In this chapter, we described (Section 5.1) that much digital design today involves designing processor-level components, and that design is done at what is called the register-transfer level (RTL). We introduced (Section 5.2) a four-step RTL design method for converting RTL behavior to a processor implementation, with that implementation consisting of a datapath controlled by a controller. The RTL design method made use of the datapath components defined in Chapter 4, and the controller design process defined in Chapter 3, which built on the combinational design process of Chapter 2. We provided several examples of RTL design (Section 5.3), while pointing out several pitfalls and good design practices, and discussing the characteristics of control- versus data-dominated designs. We discussed (Section 5.4) how to set a circuit's clock frequency based on the circuit's critical path. We demonstrated (Section 5.5) how a sequential program, like a C program, could conceptually be converted to gates using some straightforward transformations that transform the C into RTL behavior, which as we know can then be converted to gates using the four-step RTL design method. That demonstration should make it clear that a digital system's functionality can be implemented as either software on a microprocessor or a custom digital circuit (or even as both). The differences among software and custom circuit implementations are not related to what each can implement; they can both implement any functionality. The differences are instead related to design metrics like system performance, power consumption, size, cost, design time, and so on. Modern digital designers must therefore be comfortable migrating functionality between software on a microprocessor and custom digital circuits, in order to obtain the best overall implementation with respect to constraints on design metrics. We introduced (Section 5.6) several memory components commonly used in RTL design, including RAM and ROM components. We also introduced (Section 5.7) a queue component that can be useful during RTL design. We took a moment to discuss (Section 5.8) a general technique that we've been using throughout the book, hierarchy, which helps a designer to manage complexity.
In Chapters 1 through 5, we have emphasized straightforward design methods for increasingly complex systems, but we have not emphasized how to design those systems well. Improving on our designs will be the focus of the next chapter.
5.13 EXERCISES
Any problems noted with an asterisk (*) represent especially challenging problems.
SECTION 5.2: RTL DESIGN METHOD
5.1 (a) Create a high-level state machine that describes the following system behavior. The system has an 8-bit input A, a single-bit input d, and a 32-bit output S. On every clock cycle, if d=1, the system should add A to a running sum and output that sum on S. If d=0, the system should instead subtract. Ignore issues of overflow and underflow. Don't forget to include an initialization state. Hint: Declare and use an internal register to keep the sum.
(b) Add a 1-bit input rst to the system. When rst=1, the system should clear its sum back to 0.
5.2 Create a high-level state machine for a simple data encryption/decryption device. If a bit-input b is 1, the device stores the data from a 32-bit input I as what is known as an offset value. If b is 0 and another bit-input e is 1, then the device "encrypts" its input I by adding the stored offset value to I, and outputs this encrypted value over a 32-bit output J. If instead another bit-input d is 1, the device should "decrypt" the data on I by subtracting the offset value before outputting the decrypted value over J. Be sure to explicitly handle all possible combinations of the three input bits.
5.3 Create a high-level state machine for a digital bath-water controller. The system has a 3-bit input ratio indicating the desired ratio of cold water to hot water, and a bit input on indicating that the water should flow. The system has two 4-bit outputs hflow and cflow, controlling the hot water flow rate and the cold water flow rate. The sum of these two rates should equal 16. Your high-level state machine should determine the output values for hflow and cflow such that the ratio of hot water to cold water is as close as possible to the desired ratio, while the total flow is always 16. Hint: As there are only 8 possible ratios, a reasonable solution may use one state for each ratio.
5.4 Create a high-level state machine that initializes a 16x32 register file's contents to all 0s, beginning the initialization when an input rst is 1.
5.5 (a) Create a high-level state machine that adds each register of one 128x8 register file to the corresponding registers of another 128x8 register file, storing the results in a third 128x8 register file. The system should only begin the addition when a bit-input add is 1, and should not perform the addition again until it has finished adding (only adding again if add is 1).
(b) Extend this system to either add or subtract, using an additional bit-input op, where op = 1 means add, and op = 0 means subtract.
5.6 Design a high-level state machine for a 4-bit up-counter with count control input cnt, count clear input clr, and a terminal count output tc. Use the RTL design method of Table 5.1 to convert the high-level state machine to a controller and a datapath. Use a register and incrementer in the datapath, not a counter itself. Design the controller down to a state register and logic gates.
5.7 Compare the up-counter designed in Exercise 5.6 with the up-counter design shown in Figure 4.48.
5.8 Create a datapath for the high-level state machine in Figure 5.93.

Figure 5.93 Sample high-level state machine. Inputs: A, B, C (16 bits); go, rst (bit). Outputs: S (16 bits). Local registers: sum. Labels visible in the diagram include the conditions sum<5096 and (sum<5096)' and the action sum=sum+C.

5.9 Starting with the soda machine dispenser design described in Example 5.1, create a block diagram and high-level state machine for a soda machine dispenser that has a choice of two soda types, and that also provides change to the consumer. A coin detector provides the circuit with a 1-bit input c that becomes 1 for one clock cycle when a coin is detected, and an 8-bit input a indicating the coin's value in cents. Two 8-bit inputs s1 and s2 indicate the cost of the two soda choices. The user's soda selection is controlled by two buttons b1 and b2 that when pushed will output 1 for one clock cycle. If the user has inserted enough change for their selection, the circuit sets either output bit d1 or d2 to 1 for one clock cycle, causing the selected soda to be dispensed. The soda dispenser circuit should also set an output bit cr to 1 for one clock cycle if change is required, and should output the amount of change required using an 8-bit output ca. Use the RTL design method shown in Table 5.1 to convert the high-level state machine to a controller and a datapath. Design the datapath to structure, but design the controller to the point of an FSM only, as was done in Figure 5.26.
5.10 (a) Use the RTL design method of Table 5.1 to convert the high-level state machine in Figure 5.94 to a controller and a datapath. Design the datapath to structure, but design the controller to an FSM only, as was done in Figure 5.26.
(b) Design the controller's FSM down to structure.

Figure 5.94 High-level state machine of bus interface with bus wait signal. Inputs: start (bit), data (8 bits), addr (8 bits), w_wait (bit). Outputs: w_data (8 bits), w_addr (8 bits), w_wr (bit). Actions visible in the diagram include w_wr=1 and w_addr=addr.
5.11 Create an FSM that interfaces with the datapath in Figure 5.95. The FSM should use the datapath to compute the average value of the 16 32-bit elements of any array A. Array A is stored in a memory, with the first element at address 25, the second at the next address, and so on. Assume that putting a new value onto the address lines M_addr causes the memory to almost immediately output the read data on the M_data lines. Ignore the possibility of overflow.

Figure 5.95 Datapath for computing the average of 16 elements of an array.
5.12 Using the RTL design method shown in Table 5.1, create an RTL design of a reaction timer circuit that measures the time elapsed between the illumination of a light and the pressing of a button by a user. The reaction timer has three inputs, a clock input clk, a reset input rst, and a button input B, and three outputs, a light enable output len, a 10-bit reaction time output rtime, and a slow output indicating that the user was not fast enough. The reaction timer works as follows. On reset, the reaction timer waits for 10 seconds before illuminating the light by setting len to 1. The reaction timer then measures the length of time in milliseconds before the user presses the button B, outputting the time as a 12-bit binary number on rtime. If the user did not press the button within 2 seconds (2000 milliseconds), the reaction timer will set the output slow to 1 and output 2000 on rtime. Assume your clock input has a frequency of 1 kHz. Hint: This is a control-dominated RTL design problem. Design the datapath to structure, but design the controller to an FSM only, as was done in Figure 5.26.
5.13 Use the RTL design method shown in Table 5.1 to convert the high-level state machine in Figure 5.74 to a controller and a datapath. Design the datapath to structure, but design the controller to the point of an FSM only, as was done in Figure 5.26.
SECTION 5.3: RTL DESIGN EXAMPLES AND ISSUES
For the following problems, design the datapath to structure, but design the controller to an FSM only, as done in Figure 5.26.
5.14 Using the RTL design method shown in Table 5.1, create an RTL design that computes the sum of all positive numbers within a 512-word register file A consisting of 32-bit numbers stored in two's complement form.
5.15 Using the RTL design method shown in Table 5.1, create an RTL design that computes the sum of all positive numbers from a set of 16 separate 32-bit registers storing numbers in two's complement form. Make the design as fast as possible by performing as many computations concurrently as possible. Hint: This is a data-dominated design.
5.16 Using the RTL design method shown in Table 5.1, create an RTL design that outputs the maximum value found within a register file A consisting of 64 32-bit numbers.
5.17 Using the RTL design method shown in Table 5.1, create an RTL design that outputs a warning signal whenever the average temperature over the past four samples exceeds a user-defined value. The circuit has a 32-bit input CT indicating the current temperature reading, a 32-bit input WT indicating the user-specified temperature at which the warning should be enabled, and a button input clr that will disable the warning. When the average temperature exceeds the user-specified warning level, the circuit should assert the output W to enable the warning. The warning output should remain high until the clr button is pressed. Hint: You can use a right shift to implement the divide within your datapath.
5.18 Using the RTL design method shown in Table 5.1, create an RTL design for a digital filter that outputs the average of the current 32-bit input and the previous 32-bit sample. Hint: You can use a right shift to implement the divide within your datapath.
SECTION 5.4: DETERMINING CLOCK FREQUENCY
5.19 Assuming an inverter has a delay of 1 ns, all other gates have a delay of 2 ns, and wires have
a delay of 1 ns, determine the critical path for the full-adder circuit shown in Figure 4.31.
5.20 Assuming an inverter has a delay of 1 ns, all other gates have a delay of 2 ns, and wires have
a delay of 1 ns, determine the critical path for the 3x8 decoder of Figure 2.50.
5.21 Assuming an inverter has a delay of 1 ns, all other gates have a delay of 2 ns, and wires have
a delay of 1 ns, determine the critical path for a 4x1 multiplexer.
5.22 Assuming an inverter has a delay of 1 ns, and all other gates have a delay of 2 ns, determine
the critical path for an 8-bit carry-ripple adder:
(a) assuming wires have no delay.
(b) assuming wires have a delay of 1 ns.
5.23 (a) Convert the laser-based distance measurer's FSM, shown in Figure 5.21, to a state register
and logic.
(b) Assuming all gates have a delay of 2 ns and the 16-bit up-counter has a delay of 5 ns, and
wires have no delay, determine the critical path for the laser-based distance measurer.
(c) Calculate the corresponding maximum clock frequency for the circuit.
SECTION 5.5: BEHAVIORAL-LEVEL DESIGN: C TO GATES (OPTIONAL)
5.24 Convert the following C-like code, which calculates the greatest common divisor (GCD) of
the two 8-bit numbers a and b, into a high-level state machine.
Inputs: byte a, byte b, bit go
Outputs: byte gcd, bit done
GCD:
while(1) {
   while(!go);
   done = 0;
   while ( a != b ) {
      if( a > b ) {
         a = a - b;
      }
      else {
         b = b - a;
      }
   }
   gcd = a;
   done = 1;
}
5.25 Use the RTL design method shown in Table 5.1 to convert the high-level state machine you
created in Exercise 5.24 to a controller and a datapath. Design the datapath to structure, but
design the controller to the point of an FSM only.
5.26 Convert the following C-like code, which calculates the maximum difference between any two
numbers within an array A consisting of 256 8-bit values, into a high-level state machine.
Inputs: byte a[256], bit go
Outputs: byte max_diff, bit done
MAX_DIFF:
while(1) {
   while(!go);
   done = 0;
   i = 0;
   max = 0;
   min = 255; // largest 8-bit value
   while( i < 256 ) {
      if( a[i] < min ) {
         min = a[i];
      }
      if( a[i] > max ) {
         max = a[i];
      }
      i = i + 1;
   }
   max_diff = max - min;
   done = 1;
}
5.27 Use the RTL design method shown in Table 5.1 to convert the high-level state machine you
created in Exercise 5.26 to a controller and a datapath. Design the datapath to structure, but
design the controller to the point of an FSM only.
5.28 Convert the following C-like code, which calculates the number of times the value b is found
within an array A consisting of 256 8-bit values, into a high-level state machine.
Inputs: byte a[256], byte b, bit go
Outputs: byte freq, bit done
FREQUENCY:
while(1) {
   while(!go);
   done = 0;
   i = 0;
   freq = 0;
   while( i < 256 ) {
      if( a[i] == b ) {
         freq = freq + 1;
      }
      i = i + 1;
   }
   done = 1;
}
5.29 Use the RTL design method shown in Table 5.1 to convert the high-level state machine you
created in Exercise 5.28 to a controller and a datapath. Design the datapath to structure, but
design the controller to the point of an FSM only.
5.30 Develop a template for converting a do { } while loop of the following form to a high-level
state machine.
do {
   // do while statements
} while (cond);
5.31 * Convert the while ( a != b ) loop within the C code description of Exercise 5.24 into a
do { } while loop as described in Exercise 5.30. Using the do { } while loop template
you created in Exercise 5.30, convert the revised C code into a high-level state machine. Use
the RTL design method shown in Table 5.1 to convert the high-level state machine you created
in the previous problem to a controller and a datapath. Design the datapath to structure, but
design the controller to the point of an FSM only.
5.32 Develop a template for converting a for () loop of the following form to a high-level state
machine.
for (i = start; i < cond; i++) {
   // for statements
}
5.33 * Convert the while ( a != b ) loop within the C code description of Exercise 5.24 to a
for () loop as described in Exercise 5.32. Using the for () loop template you created in
Exercise 5.32, convert the revised C code into a high-level state machine. Use the RTL design
method shown in Table 5.1 to convert the high-level state machine you created in the previous
problem to a controller and a datapath. Design the datapath to structure, but design the
controller to the point of an FSM only.
5.34 * Convert the while ( i < 256 ) loop within the C code description of Exercise 5.26 to
a for () loop as described in Exercise 5.32. Using the for () loop template you created in
Exercise 5.32, convert the revised C-like code into a high-level state machine. Use the RTL
design method shown in Table 5.1 to convert the high-level state machine you created in the
previous problem to a controller and a datapath. Design the datapath to structure, but design
the controller to the point of an FSM only.
5.35 Compare the time required to execute the following computation using a custom circuit versus
using software. Assume a gate has a delay of 1 ns. Assume a microprocessor executes one
instruction every 5 ns. Assume that n = 10 and m = 5. Estimates are acceptable: you need not
design the circuit, or determine exactly how many software instructions will execute.
for (i = 0; i < n; i++) {
   s = 0;
   for (j = 0; j < m; j++) {
      s = s + c[i]*x[i + j];
   }
   y[i] = s;
}
SECTION 5.6: MEMORY COMPONENTS
5.36 Calculate the approximate number of DRAM bit storage cells that will fit on an IC with a
capacity of 10 million transistors.
5.37 Calculate the approximate number of SRAM bit storage cells that will fit on an IC with a
capacity of 10 million transistors.
5.38 Summarize the main differences between DRAM and SRAM memories.
5.39 Draw a complete logic internal structure for a 4x2 DRAM (four words, 2 bits each), clearly
labeling all internal components and connections.
5.40 Draw a complete logic internal structure for a 4x2 SRAM (four words, 2 bits each), clearly
labeling all internal components and connections.
5.41 * Design an SRAM memory cell with a reset input that when enabled will set the memory
cell's contents to 0.
SECTION: READ-ONLY MEMORY (ROM)
5.42 Summarize the main differences between EPROM and EEPROM memories.
5.43 Summarize the main differences between EEPROM and flash memories.
SECTION 5.7: QUEUES (FIFOS)
5.44 For an 8-word queue, show the queue's internal state and provide the value of popped data for
the following sequences of pushes and pops: (1) push A, B, C, D, E, (2) pop, (3) pop, (4) push
U, V, W, X, Y, (5) pop, (6) push Z, (7) pop, (8) pop, (9) pop.
5.45 Create an FSM describing the queue controller of Figure 5.78. Pay careful attention to
correctly setting the full and empty outputs.
5.46 Create an FSM describing the queue controller of Figure 5.78, but with error-preventing
behavior that ignores any pushes when the queue is full, and ignores pops of an empty queue
(outputting 0).
SECTION 5.8: HIERARCHY - A KEY DESIGN CONCEPT
5.47 Compose a 20-input AND gate from 2-input AND gates.
5.48 Compose a 16x1 mux from 2x1 muxes.
5.49 Compose a 4x16 decoder with enable from 2x4 decoders with enable.
5.50 Compose a 1024x8 RAM using only 512x8 RAMs.
5.51 Compose a 512x8 RAM using only 512x4 RAMs.
5.52 Compose a 1024x8 ROM using only 512x4 ROMs.
5.53 Compose a 2048x8 ROM using only 256x8 ROMs.
5.54 Compose a 1024x16 RAM using only 512x8 RAMs.
5.55 Compose a 1024x12 RAM using 512x8 and 512x4 RAMs.
5.56 Compose a MOx 12 RAM using only 128x4 RAMs.
5.57 * Write a program that takes a parameter N, and automatically builds an N-input AND gate
from 2-input AND gates. Your program merely need indicate how many 2-input AND gates
exist in each level, from which we could easily determine the connections.
Chi-Kai started college as an engineering major, and became a Computer Science major due to his
developing interests in algorithms and in networks. After graduating, he worked for a Silicon
Valley startup company that made chips for computer networking. His first task was to help
simulate those chips before the chips were built. For over 10 years now, he has worked on multiple
generations of networking devices that buffer, schedule, and switch ATM network cells and Internet
Protocol packets. "The chips required to implement networking devices are complex components
that must all work together almost perfectly to provide the building blocks of telecommunication
and data networks. Each generation of devices becomes successively more complex."
When asked what skills are necessary for his job, Chi-Kai says "More and more, breadth of one's
skill set matters more than depth. Being an effective chip engineer requires the ability to
understand chip architecture (the big picture), to design logic, to verify logic, and to bring up
the silicon in the lab. All these parts of the design cycle interplay more and more. To be truly
effective at one particular area requires hands-on knowledge of the others as well. Also, each
requires very different skills. For example, verification requires good software programming
ability, while bring up requires knowing how to use a logic analyzer: good hardware skills."
High-end chips, like those involved in networking, are quite costly, and require careful design.
"The software design process and the chip design process are fundamentally different. Software
can afford to have bugs because patches can be applied. Silicon is a different story. The one-time
expenses to spin a chip are on the order of $500,000. If there is a show-stopping bug, you
may need to spend another $500,000. This constraint means the verification approach taken is
quite different: effectively, there can be no bugs." At the same time, these chips must be
designed quickly to beat competitors to the market, making the job "extremely challenging and
exciting."
One of the biggest surprises Chi-Kai encountered in his job is the "incredible importance of good
communication skills." Chi-Kai has worked in teams ranging from 10 people to 30 people, and some
chips require teams of over 100 people. "Technically outstanding engineers are useless unless
they know how to collaborate with others and disseminate their knowledge. Chips are only getting
more complex; individual blocks of code in a given chip have the same complexity as an entire chip
only a few years ago. To architect, design, and implement logic in hardware requires the ability
to convey complexity." Furthermore, Chi-Kai points out that "just like any social entity, there
are politics involved. For example, people are worried about aspiration for promotion, financial
gain, and job security. In this greater context, the team still must work together to deliver a
chip." So, contrary to the conceptions many people have of engineers, engineers must have
excellent people skills, in addition to strong technical skills. Engineering is a social discipline.
6
Optimizations and Tradeoffs
6.1 INTRODUCTION
The previous chapters described how to design digital circuits using straightforward techniques.
This chapter will describe how to design better circuits. For our purposes, better
means circuits that are smaller, faster, or consume less power. Real world design may
involve additional criteria.
Figure 6.1 A circuit transformation that improves both size and delay, that is, an optimization:
(a) original circuit, F1 = wxy + wxy' (16 transistors, 2 gate-delays); (b) optimized circuit,
F2 = wx (4 transistors, 1 gate-delay); (c) plot of size (transistors) and delay (gate-delays)
of each circuit.
Consider the circuit for the equation involving F1 shown in Figure 6.1(a). The
circuit's size, assuming two transistors per gate input (and ignoring inverters for
simplicity), is 8 * 2 = 16 transistors. The circuit's delay, which is the longest path
from any input to the output, is two gate-delays. We could algebraically transform
the equation into that for F2, shown in Figure 6.1(b). F2 represents the same
function as F1, but requires only four transistors (instead of 16) and has a delay of
only one gate-delay (instead of two). The transformation improved both size and
delay, as shown in Figure 6.1(c). When we perform transformations that improve
all criteria of interest to us, we have performed an optimization.
Now consider the circuit for a different function, implementing the equation for G1
in Figure 6.2(a). The circuit's size (assuming 2 transistors per gate input) is 14 transistors,
and the circuit's delay is two gate-delays. We could algebraically transform the equation
into that shown for G2 in Figure 6.2(b), which results in a circuit having only 12 transistors.
However, the reduction in transistors comes at the expense of a longer delay of three
gate-delays, as shown in Figure 6.2(c). Which circuit is better, that for G1 or for G2? The
answer depends on whether the size or delay criterion is more important to us. When we
improve one criterion at the expense of another criterion of interest to us, we have
performed a tradeoff.
A tradeoff improves some criteria at the expense of other criteria of interest to us. An
optimization improves all criteria of interest to us, or improves some of those criteria
without worsening the others.
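The distinction between an optimization and a tradeoff can be written down as a small check. The following is a minimal Python sketch, not part of the original text; the function name classify and the dictionary layout are illustrative assumptions, and every criterion is assumed to be "smaller is better" (as with size and delay here).

```python
def classify(before, after):
    """Classify a circuit transformation.

    before/after map each criterion of interest (e.g. "size", "delay")
    to a value where smaller is better. Names are illustrative only.
    """
    improved = [k for k in before if after[k] < before[k]]
    worsened = [k for k in before if after[k] > before[k]]
    if improved and not worsened:
        return "optimization"   # some criteria improve, none worsen
    if improved and worsened:
        return "tradeoff"       # some criteria improve at the expense of others
    return "neither"

# Values taken from Figures 6.1 and 6.2 (transistors, gate-delays).
f1, f2 = {"size": 16, "delay": 2}, {"size": 4, "delay": 1}
g1, g2 = {"size": 14, "delay": 2}, {"size": 12, "delay": 3}
print(classify(f1, f2))
print(classify(g1, g2))
```

With the figures' numbers, the F1-to-F2 transformation is reported as an optimization and the G1-to-G2 transformation as a tradeoff.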
Figure 6.2 A circuit transformation that improves size but worsens delay, that is, a tradeoff:
(a) original circuit, G1 = wx + wy + z (14 transistors, 2 gate-delays); (b) transformed circuit,
G2 = w(x+y) + z (12 transistors, 3 gate-delays); (c) plot of size (transistors) and delay
(gate-delays) of each circuit.
You likely perform optimizations and tradeoffs every day. Perhaps you regularly
commute by car from one city to another via a particular route. You might be interested in
two criteria: commute time and safety. Other criteria, such as scenery along the route,
may not be of interest to you. If you choose a new route that improves both commute
time and safety, you have optimized your commute. If you instead choose a route that
improves safety at the expense of increased commute time, you have made a tradeoff (and
perhaps a wise one at that).
Figure 6.3 illustrates optimizations versus tradeoffs for three different starting designs, with
the criteria of delay and size, smaller being better for each criterion. Obviously, we prefer
optimizations over tradeoffs, since optimizations improve both criteria (or at least improve one
criterion without worsening another, as shown by the horizontal and vertical arrows on the
left side of the figure). But we can't always improve one criterion without worsening another.
For example, if a car designer wants to improve a car's fuel efficiency, the designer may have to
make the car smaller, a tradeoff among the criteria of fuel efficiency and comfort.
Figure 6.3 (a) Optimizations, versus (b) tradeoffs. (Both panels plot size against delay.)
Some general criteria commonly of interest to digital system designers include:
Performance: a measure of execution time for a computation on the system.
Size: a measure of the number of transistors, or silicon area, of a digital system.
Power: a measure of the energy consumed per second of a system, relating to both the
heat generated by the system and to the battery energy consumed by computations.
Dozens of other criteria exist.
Optimizations and tradeoffs can be made throughout nearly all stages of digital
design. This chapter describes some common optimizations and tradeoffs for some
common criteria, at various stages of digital design.
6.2 COMBINATIONAL LOGIC OPTIMIZATIONS AND TRADEOFFS
In Chapter 2, we described how to design combinational logic, namely, how to convert
desired combinational behavior into a circuit of gates. There are optimization and tradeoff
methods we can apply to make those circuits better.
Two-Level Size Optimization Using Algebraic Methods
In the 1970s/1980s, when transistors were costly (e.g., cents each), logic optimization meant
size minimization, which dominated digital design. Today's cheaper transistors (e.g., 0.0001
cents each) make optimizations of other criteria equally or more critical.
Implementing a Boolean function using only two levels of gates (a level of AND gates
followed by one OR gate) usually results in a circuit having minimum delay. Recall from
Chapter 2 that any Boolean equation can be written in sum-of-products form, simply by
"multiplying out" the equation; for example, xy(w+z) = xyw + xyz. Thus, any
Boolean function can be implemented using two levels of gates, simply by converting its
equation to sum-of-products form and then using AND gates for the products followed by
an OR gate for the sum.
A popular optimization is to minimize the number of transistors of a two-level logic
circuit implementation of a Boolean function. Such optimization is traditionally called
two-level logic optimization, or sometimes two-level logic minimization. We'll refer to it as
two-level logic size optimization, to distinguish such optimization from the increasingly
popular optimizations of performance and power, as well as other possible optimizations.
To optimize size, we need a method to determine the number of transistors for a
given circuit. We'll use a simple method for determining the number of transistors:
We'll assume every logic gate input requires two transistors. So a 3-input logic
gate (whether an AND, OR, NAND, or NOR) would require 3 * 2 = 6 transistors.
The circuits inside logic gates shown in Section 2.4 should clarify why we assume
two transistors per gate input.
We'll ignore inverters when determining the number of transistors, for simplicity.
We can view the problem of two-level logic size optimization algebraically as the
problem of minimizing the number of literals and terms of a Boolean equation that is in
sum-of-products form. The reason we can view the problem algebraically is because,
recall from Section 2.4, we can translate a sum-of-products Boolean equation directly to
a circuit using a level of AND gates followed by an OR gate. For example, the equation
F = wxy + wxy' from Figure 6.1(a) has six literals, w, x, y, w, x, and y', and two
terms, wxy and wxy', for a total of 6 + 2 = 8 literals and terms. Each literal and each
term translates approximately to a gate input in a circuit, as shown in Figure 6.1(a): the
literals translate to AND gate inputs, and the terms to OR gate inputs. The circuit thus has
3 + 3 + 2 = 8 gate inputs. With two transistors per gate input, the circuit has 8 * 2 = 16
transistors. We can minimize the number of literals and terms algebraically: F = wxy +
wxy' = wx(y+y') = wx, which has only two literals, w and x, resulting in 2 gate
inputs, or 2 * 2 = 4 transistors, as shown in Figure 6.1(b). (Note that a one-term equation
doesn't require an OR gate.)
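The counting method described above can be sketched in a few lines of Python. This is an illustrative sketch rather than anything from the original text: the function name sop_cost and its string format (single-letter variables, ' for complement, + between terms) are assumptions. It handles only sum-of-products equations, ignores inverters, and treats a one-term equation as needing no OR gate and a one-literal term as needing no AND gate.

```python
import re

TRANSISTORS_PER_GATE_INPUT = 2  # the chapter's simplifying assumption

def sop_cost(equation):
    """Estimate the size of a two-level circuit for a sum-of-products equation."""
    terms = [t for t in equation.replace(" ", "").split("+") if t]
    # Each literal is a variable letter optionally followed by a prime.
    literal_counts = [len(re.findall(r"[a-z]'?", t)) for t in terms]
    # A term with a single literal feeds the OR gate directly (no AND gate).
    and_inputs = sum(n for n in literal_counts if n > 1)
    # A one-term equation doesn't require an OR gate.
    or_inputs = len(terms) if len(terms) > 1 else 0
    gate_inputs = and_inputs + or_inputs
    return {
        "literals": sum(literal_counts),
        "terms": len(terms),
        "gate_inputs": gate_inputs,
        "transistors": gate_inputs * TRANSISTORS_PER_GATE_INPUT,
    }

print(sop_cost("wxy + wxy'"))   # F1 of Figure 6.1(a)
print(sop_cost("wx"))           # F2 of Figure 6.1(b)
print(sop_cost("wx + wy + z"))  # G1 of Figure 6.2(a)
```

For these three equations the sketch reports 16, 4, and 14 transistors, matching the sizes quoted for Figures 6.1 and 6.2.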
EXAMPLE 6.1 Two-level logic size optimization using algebraic methods
Minimize the number of literals and terms in a two-level implementation of the equation:
F = xyz + xyz' + x'y'z' + x'y'z
Let's minimize using algebraic transformations:
F = xy(z + z') + x'y'(z + z')
F = xy*1 + x'y'*1
F = xy + x'y'
There doesn't seem to be any further minimization we can perform. Thus, we've reduced the circuit
from 12 literals and 4 terms (meaning 12 + 4 = 16 gate inputs, or 32 transistors), down to only 4
literals and 2 terms (meaning 4 + 2 = 6 gate inputs, or 12 transistors).
The previous example showed the most common algebraic transformation used to simplify
a Boolean equation in sum-of-products form, a transformation that generally can be
written as:
ab + ab' = a(b+b') = a*1 = a
Let's call this transformation combining terms to eliminate a variable. More formally,
this transformation is known as the uniting theorem. In the previous example, we
applied this transformation twice, once with xy being a and z being b, and a second time
with x'y' being a and z being b.
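One way to gain confidence in an algebraic simplification is to compare the two equations on every input combination. The following Python sketch does that by brute force; it is an illustration, not the book's method, and the helper names (eval_sop, equivalent) and the equation string format (single-letter variables, ' for complement) are assumptions.

```python
import re
from itertools import product

def eval_sop(equation, values):
    """Evaluate a sum-of-products equation for one assignment of 0/1 values."""
    for term in equation.replace(" ", "").split("+"):
        literals = re.findall(r"[a-z]'?", term)
        # A literal is true when its variable is 1 (or 0 for a primed literal).
        if all((values[lit[0]] == 1) != lit.endswith("'") for lit in literals):
            return 1
    return 0

def equivalent(eq1, eq2):
    """Return True if both equations give the same output for every input."""
    variables = sorted(set(re.findall(r"[a-z]", eq1 + eq2)))
    for bits in product((0, 1), repeat=len(variables)):
        values = dict(zip(variables, bits))
        if eval_sop(eq1, values) != eval_sop(eq2, values):
            return False
    return True

print(equivalent("ab + ab'", "a"))                              # uniting theorem
print(equivalent("xyz + xyz' + x'y'z' + x'y'z", "xy + x'y'"))   # Example 6.1
```

Exhaustive comparison is only practical for a handful of variables, since the number of input combinations doubles with each added variable.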
Sometimes we need to duplicate a term in order to increase opportunities for combining
terms to eliminate a variable, as illustrated in the next example.
EXAMPLE 6.2 Reusing a term during two-level logic size optimization
Minimize the number of literals and terms in a two-level implementation of the equation:
F = x'y'z' + x'y'z + x'yz
You might notice two opportunities to combine terms to eliminate a variable:
1: x'y'z' + x'y'z = x'y'
2: x'y'z + x'yz = x'z
Notice that the term x'y'z appears in both opportunities, but that term only appears once in the
original equation. We'll therefore first replicate the term in the original equation (such
replication doesn't change the function, because a = a + a) so that we can use the term twice when
combining terms to eliminate a variable, as follows:
F = x'y'z' + x'y'z + x'yz
F = x'y'z' + x'y'z + x'y'z + x'yz
F = x'y'(z+z') + x'z(y'+y)
F = x'y' + x'z
After we have combined terms to eliminate a variable, the resulting term might also
be combinable with other terms to eliminate a variable, as shown in the following
example.
EXAMPLE 6.3 Repeatedly combining terms to eliminate a variable
Minimize the number of literals and terms in a two-level implementation of the equation:
G = xy'z' + xy'z + xyz + xyz'
We can combine the first two terms to eliminate a variable, and the last two terms also:
G = xy'(z'+z) + xy(z+z')
G = xy' + xy
We can combine the two remaining terms to eliminate a variable:
G = xy' + xy
G = x(y'+y)
G = x
In the previous examples, how did we "see" the opportunities to combine terms to
eliminate a variable? The examples' original equations happened to be written in a way
that made seeing the opportunities easy: terms that could be combined were side by
side. Suppose instead the equation in Example 6.1 had been written as:
F = x'y'z + xyz + xyz' + x'y'z'
That's the same function, but the terms appear in a different order. We might see that
the middle two terms can be combined:
F = x'y'z + xyz + xyz' + x'y'z'
F = x'y'z + xy(z+z') + x'y'z'
F = x'y'z + xy + x'y'z'
But then we might not see that the left and right terms can be combined. We therefore
might stop minimizing, thinking that we had obtained a fully minimized equation.
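A program does not depend on the order in which terms happen to be written: it can simply try every pair. The Python sketch below is an illustration under stated assumptions, not the book's procedure. Terms are encoded as strings over '0', '1', and '-' for variables x, y, z (so x'y'z is "001" and xy is "11-"), and the names combine, combine_all, and to_expr are hypothetical. It repeatedly applies "combining terms to eliminate a variable" until no pair can be combined. It reuses a term in several combinations, as in Example 6.2, but it does not try to pick the fewest terms, so its result is not guaranteed to be minimal.

```python
from itertools import combinations

def combine(t1, t2):
    """Combine two terms that differ in exactly one variable, else return None."""
    diff = [i for i, (a, b) in enumerate(zip(t1, t2)) if a != b]
    if len(diff) == 1 and "-" not in (t1[diff[0]], t2[diff[0]]):
        i = diff[0]
        return t1[:i] + "-" + t1[i + 1:]   # the differing variable is eliminated
    return None

def combine_all(terms):
    """Repeatedly combine terms, in any order, until nothing more combines."""
    terms = set(terms)
    while True:
        merged, used = set(), set()
        for t1, t2 in combinations(sorted(terms), 2):
            m = combine(t1, t2)
            if m is not None:
                merged.add(m)
                used.update((t1, t2))
        if not merged:
            return terms
        terms = merged | (terms - used)    # keep terms that combined with nothing

def to_expr(term, variables="xyz"):
    """Render an encoded term using the text's notation, e.g. "00-" as x'y'."""
    lits = [v + ("" if b == "1" else "'") for v, b in zip(variables, term) if b != "-"]
    return "".join(lits) or "1"

# The reordered equation from Example 6.1: x'y'z + xyz + xyz' + x'y'z'
result = combine_all(["001", "111", "110", "000"])
print(" + ".join(to_expr(t) for t in sorted(result)))
```

Applied to the reordered equation, the sketch finds both combinations and yields the two terms x'y' and xy, regardless of the order in which the original terms are listed.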
There is a visual method to help us see opportunities to combine terms to eliminate a
variable, a method we now describe.
A Visual Method for Two-Level Size Optimization-K-Maps
Karnaugh Maps, or K-maps for short, are a visual method intended to assist humans to
algebraically minimize Boolean equations having a few (two to four) variables. They actually
are not commonly used any longer in design practice, but nevertheless, they are a very
effective means for understanding the basic optimization methods underlying today's automated
tools. A K-map is essentially a graphical representation of a truth table, meaning a
K-map is yet another way to represent a function (the other ways including an equation,
truth table, and circuit). The idea underlying a K-map is to graphically place minterms
adjacent to one another if those minterms differ in one variable only, so that we can actually
"see" the opportunity for combining terms to eliminate a variable.
Three-Variable K-Maps
In a K-map, adjacent cells differ in exactly one variable. K-maps enable us to see
opportunities to combine terms to eliminate a variable.
Figure 6.4 shows a K-map for the equation:
F = x'y'z + xyz + xyz' + x'y'z'
which is the equation from Example 6.1 but with terms appearing in a different order. The map has
eight cells, one for each possible combination of variable values. Let's examine the cells in the
top row. The upper-left cell corresponds to xyz=000, meaning x'y'z'. The next cell to the right
corresponds to xyz=001, meaning x'y'z. The next cell to the right corresponds to xyz=011, meaning
x'yz. And the rightmost top cell corresponds to xyz=010, meaning x'yz'. Notice that the
ordering of those top cells is not in increasing binary order. Instead, the order is 000, 001, 011,
010, rather than 000, 001, 010, 011. The ordering is such that adjacent cells differ in exactly
one variable. For example, the cells for x'y'z (001) and x'yz (011) are adjacent, and differ in
exactly one variable, namely, y. Likewise, the cells for x'y'z' and xy'z' are adjacent, and differ
only in variable x. The map is also assumed to have its left and right edges adjacent, so the
rightmost top cell (010) is adjacent to the leftmost top cell (000); note those cells too differ
in exactly one variable. Adjacent means abutted either horizontally or vertically, but not
diagonally, because diagonal cells differ in more than one variable. Adjacent bottom row cells also
differ in exactly one variable. And cells in a column also differ in exactly one variable.
   F      yz
          00  01  11  10
   x=0     1   1   0   0
   x=1     0   0   1   1
Figure 6.4 Three-variable K-map. The upper-left cell corresponds to xyz=000, or x'y'z'. Notice
that the columns are not in binary order. Treat the left and right edges as adjacent too.
We can represent a Boolean function as a K-map by placing 1s in the cells corresponding
to the function's minterms. So for the equation F above, we place a 1 in cells
corresponding to minterms x'y'z, xyz, xyz', and x'y'z', as shown in Figure 6.4. We
place 0s in the remaining cells. Notice that a K-map is just another representation of a
truth table. Rather than showing the output for every possible combination of inputs using
a table, a K-map uses a graphical map. Therefore, a K-map is yet another representation
of a Boolean function, and in fact is another standard representation.
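The layout of a three-variable K-map can be sketched in Python. This is an illustrative sketch, not part of the original text; the names GRAY, kmap, and adjacent_ones are assumptions, and minterms are encoded as three-character strings of xyz values (x'y'z is "001"). Rows are x=0 and x=1, columns follow the 00, 01, 11, 10 order, and the left and right edges are treated as adjacent.

```python
GRAY = ["00", "01", "11", "10"]   # column order for yz: neighbors differ in one bit

def kmap(minterms):
    """Build a 2x4 grid of 0s and 1s for a three-variable function."""
    return [[1 if x + yz in minterms else 0 for yz in GRAY] for x in "01"]

def adjacent_ones(grid):
    """List pairs of adjacent cells that both hold a 1, as (row, column) pairs."""
    pairs = []
    for r in range(2):
        for c in range(4):
            right = (c + 1) % 4           # wraps around: column 10 is next to column 00
            if grid[r][c] and grid[r][right]:
                pairs.append(((r, c), (r, right)))
    for c in range(4):
        if grid[0][c] and grid[1][c]:     # vertical neighbors differ only in x
            pairs.append(((0, c), (1, c)))
    return pairs

# F = x'y'z + xyz + xyz' + x'y'z'
grid = kmap({"001", "111", "110", "000"})
for row in grid:
    print(row)
print(adjacent_ones(grid))
```

For F, the sketch finds one adjacent pair in the top-left of the map and one in the bottom-right, which are the two places where terms can be combined.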
The usefulness of a K-map for size minimization is that, because the map is designed
such that adjacent cells differ in exactly one variable, we know that two adjacent 1s
in a K-map indicate that we can combine the two minterms to eliminate a variable. In
other words, a K-map lets us easily see when we can combine two terms to eliminate a
variable. We indicate such combining by drawing a circle around two adjacent 1s, and
then we show the resulting term after the differing variable is removed. We illustrate in
the following example.
EXAMPLE 6.4 Two-level logic size optimization using a K-map
Minimize the number of literals and terms in a two-level implementation of the equation:
F = xyz + xyz' + x'y'z' + x'y'z
Note that this is the same equation as in Example 6.1. We create a K-map representing the
function, shown in Figure 6.5. We see adjacent 1s at the upper left of the map, so we
circle those 1s to yield the term x'y'; in other words, the circle is a shorthand notation for
x'y'z' + x'y'z = x'y'. Likewise, we see adjacent 1s at the bottom right of the map, so we draw a
circle representing xyz + xyz' = xy. Thus, F = x'y' + xy.
   F      yz
          00  01  11  10
   x=0    (1   1)  0   0      circle: x'y'
   x=1     0   0  (1   1)     circle: xy
Figure 6.5 Minimizing a three-variable function using a K-map.
Always draw the largest circles possible to cover the 1s in a K-map.
Recall from Example 6.3 that sometimes terms can be repeatedly combined to eliminate
a variable, resulting in even fewer terms and literals. We can redo that example
using a different order of simplifications as follows:
G = xy'z' + xy'z + xyz + xyz'
G = x(y'z' + y'z + yz + yz')
G = x(y'(z'+z) + y(z+z'))
G = x(y'+y)
G = x
Notice that the second line above ANDs x with the OR of all possible combinations
of variables y and z. Obviously, one of those combinations of y and z will be true for any
values of y and z, and thus the subexpression in parentheses will always evaluate to 1, as
we algebraically affirmed in the latter lines above.
K-maps also help us graphically see this situation. In addition to helping us see when we can
combine two minterms to eliminate a variable, K-maps give us a graphical way to see when we can
combine four minterms to eliminate two variables. We merely need to look for four adjacent cells,
where the cells form either a rectangle or a square (but not a shape like an "L"). Those four
cells will have one variable the same, and all possible combinations of the other two variables.
Figure 6.6 shows the earlier function G as a three-variable K-map. The map has four adjacent 1s
in the bottom row. The four minterms corresponding to those 1s are xy'z', xy'z, xyz, and xyz';
note that x is the same in all four minterms, while all four combinations of y and z appear in
those minterms. We draw a circle around the bottom four 1s to represent the simplification of G
shown in the equations above. The result is G = x. In other words, the circle is a shorthand
notation for the algebraic simplification of G shown in the five equations above.
   G      yz
          00  01  11  10
   x=0     0   0   0   0
   x=1    (1   1   1   1)     circle: x
Figure 6.6 Four adjacent 1s.
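The term that a circle stands for can be read off mechanically: keep the variables that have the same value in every circled cell, and drop the variables that take both values. The Python sketch below illustrates this. It is not from the original text; the names is_valid_group and group_term are assumptions, and cells are encoded as strings of xyz values (xy'z is "101"). The validity check rejects shapes such as an "L" by requiring the group to contain every combination of the variables that change.

```python
def is_valid_group(minterms, variables="xyz"):
    """A circle is valid if it holds all combinations of the variables that vary."""
    cells = set(minterms)
    varying = sum(1 for i in range(len(variables)) if len({m[i] for m in cells}) > 1)
    return len(cells) == 2 ** varying

def group_term(minterms, variables="xyz"):
    """Return the product term for a circled group of cells."""
    literals = []
    for i, v in enumerate(variables):
        values = {m[i] for m in minterms}
        if len(values) == 1:              # variable is the same in every cell: keep it
            literals.append(v if values == {"1"} else v + "'")
    return "".join(literals) or "1"       # nothing kept means the function is just 1

bottom_row = ["100", "101", "111", "110"]   # the four 1s of G
print(is_valid_group(bottom_row), group_term(bottom_row))
print(is_valid_group(["100", "101", "111"]))  # an "L"-like shape of three cells
```

For the bottom row of G the sketch reports a valid group whose term is x, matching the circle drawn in Figure 6.6.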
Note that we could have drawn circles around the left two 1s and the right two 1s of the K-map,
as shown in Figure 6.7, resulting in G = xy' + xy. Clearly, G can be further simplified to
x(y'+y) = x. Thus, we should always draw the biggest circle possible, in order to best minimize
the equation.
   G      yz
          00  01  11  10
   x=0     0   0   0   0
   x=1    (1   1) (1   1)     circles: xy' and xy
Figure 6.7 Nonoptimal circles.
As another example of four adjacent 1s, consider the equation:
H = x'y'z + x'yz + xy'z + xyz
Figure 6.8 shows the K-map for that equation's function. Circling the four adjacent
1s yields the minimized equation, H = z.
It 's OK 10 co\'er a
I more thcm ollce
to mi"imi:.e
mulliple terms.
Draw the fewest
ci,des possible. 10
mi"i", i:.e ,he
"umber of tenus.
6.2 Combinational Logic Optimizations and Tradeoffs 301
Sometimes, we need to draw circles that include the same 1 twice. That's okay. For example,
consider the equation:

I = x'y'z + xy'z' + xy'z + xyz + xyz'

Figure 6.9 shows the K-map for that equation's function. We can draw a circle around the
bottom four 1s to reduce those four minterms to just x. But that leaves the single 1 in the top
row, corresponding to minterm x'y'z. We have to include that minterm in the minimized
equation, since if we left that minterm out, we would be changing the function. We could
include the minterm itself, yielding I = x + x'y'z. But that's not minimized, because the
original equation included minterm xy'z, and xy'z + x'y'z = (x + x')y'z = y'z. On the K-map, we
draw a circle around that top 1 that also includes the 1 in the cell below. The minimized
function is thus I = x + y'z.
Figure 6.8 Four adjacent 1s.

Figure 6.9 Circling a 1 twice.
It's OK to include a 1 twice; that doesn't change the function. Think about it: the function
doesn't change if we duplicate a minterm (don't forget, a = a + a), and duplicating a minterm
can allow for more optimization. In other words:

I = x'y'z + xy'z' + xy'z + xyz + xyz'
  = x'y'z + xy'z + xy'z' + xy'z + xyz + xyz'
  = (x'y'z + xy'z) + (xy'z' + xy'z + xyz + xyz')
  = (y'z) + (x)

We duplicated a minterm, which resulted in better optimization.
On the other hand, there's no reason to circle 1s more than once if the 1s are already
included in a minimized term. For example, the K-map for the equation:

J = x'y'z' + x'y'z + xy'z + xyz

appears in Figure 6.10. There's no reason to draw the circle resulting in the term y'z. The
other two circles cover all the 1s, meaning those two circles' terms cause the equation to
output 1 for all the required input combinations. The third circle just results in an extra
term without changing the function. Thus, we not only want to draw the largest circles possible
to cover all the 1s, but we also want to draw the fewest circles.
We mentioned earlier that the left and right sides of a K-map are adjacent. Thus, we can
draw circles that wrap around the sides of a K-map. For example, the K-map for the equation:

K = xy'z' + xyz' + x'y'z

appears in Figure 6.11. The two cells in the corners with 1s are adjacent, since the left and
right sides of the map are adjacent, and we can thus draw one circle that covers both,
resulting in the term xz'.

Sometimes a 1 does not have any adjacent 1s. In that case, we simply circle the single 1,
resulting in a term that is a minterm. The term x'y'z in Figure 6.11 is an example of such a
term.
A circle in a three-variable K-map must involve one cell, two adjacent cells, four adjacent
cells, or eight adjacent cells. A circle cannot involve only three, five, six, or seven cells.
The reason is that the circle must represent algebraic transformations that eliminate variables
appearing in all possible combinations, since those variables can be factored out and then
combined to a 1. Three adjacent cells don't have all combinations of two variables; one
combination is missing. Thus, the circle in Figure 6.12 would not be valid, since it corresponds
to xy'z' + xy'z + xyz, which doesn't simplify down to one term. To cover that function, we would
need two circles, one around the left pair of 1s, the other around the right pair.
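The rule for legal circles can be stated as a small test: a group of cells is a legal circle when it holds 2^k cells, k variables take on every combination, and the remaining variables are identical in all cells. The sketch below is our own illustration, assuming cells are identified by minterm number with x as the most significant bit (so xy'z' is 4, xy'z is 5, and xyz is 7).

```python
def is_valid_circle(cells):
    """Check whether a set of minterm numbers forms a legal K-map circle."""
    cells = set(cells)
    size = len(cells)
    if size == 0 or size & (size - 1):      # size must be a power of two
        return False
    first = next(iter(cells))
    # Collect the bit positions in which any cell differs from the first cell.
    changing = 0
    for c in cells:
        changing |= c ^ first
    k = bin(changing).count("1")
    # k changing variables allow exactly 2**k combinations, so having 2**k
    # distinct cells means every combination of those variables appears.
    return size == 2 ** k

print(is_valid_circle({4, 5, 7}))     # the three-cell circle of Figure 6.12
print(is_valid_circle({4, 5}))        # left pair of 1s
print(is_valid_circle({5, 7}))        # right pair of 1s
print(is_valid_circle({4, 6}))        # wraps around the left and right sides
```

The three-cell group is rejected, while both pairs and the wrap-around pair are accepted. The same test works for four-variable maps, since it depends only on which variables change.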
If all the cells in a K-map have 1s, like for the function E in Figure 6.13, then we would
have eight adjacent 1s. We can draw a circle around those eight cells. Since that circle
represents the ORing of all possible combinations of the function's three variables, and since
obviously one of those combinations will be true for any combination of input values, the
equation would minimize to just E = 1.

Whenever in doubt as to whether a circle is valid, just remember that the circle represents a
shorthand for algebraic transformations that combine terms to eliminate a variable. A circle
must represent a set of terms for which all possible combinations of some variables appear
while other variables are identical in all terms. The changing variables can be eliminated,
resulting in a single term without those variables.

Figure 6.11 Sides are adjacent.

Figure 6.12 Invalid circle.

Figure 6.13 Eight adjacent 1s.
Four-Variable K-Maps
K-maps are also useful for minimizing four-variable Boolean functions. Figure 6.14 shows a
four-variable K-map for the following equation:

F = w'xy'z' + w'xy'z + w'x'yz + w'xyz + wxyz + wx'yz

Figure 6.14 Four-variable K-map.

Again, notice that every adjacent pair of cells differs by exactly one variable. The left and
right sides of the map are considered adjacent, and the top and bottom edges of the map are
also adjacent; note that the left and right cells differ by only one variable, as do the top
and bottom cells.

We cover the 1s in the map with the two circles shown in Figure 6.14, resulting in the terms
w'xy' and yz, so the minimized equation is F = w'xy' + yz.
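The adjacency property comes from labeling rows and columns in the order 00, 01, 11, 10. The following sketch, with names of our own choosing, checks that every horizontal and vertical neighbor pair (including the wrap-around pairs) differs in exactly one variable, and then lays out the on-set of F as a four-by-four map with rows wx and columns yz.

```python
GRAY = ["00", "01", "11", "10"]   # row (wx) and column (yz) ordering of the map

def differing_bits(a, b):
    """Count the positions in which two equal-length bit strings differ."""
    return sum(p != q for p, q in zip(a, b))

def neighbors_differ_by_one():
    """Check every cell against its right and lower neighbor, wrapping at the edges."""
    for r in range(4):
        for c in range(4):
            cell = GRAY[r] + GRAY[c]                  # bits of w, x, y, z
            right = GRAY[r] + GRAY[(c + 1) % 4]       # wraps left/right
            below = GRAY[(r + 1) % 4] + GRAY[c]       # wraps top/bottom
            if differing_bits(cell, right) != 1 or differing_bits(cell, below) != 1:
                return False
    return True

# On-set of F = w'xy'z' + w'xy'z + w'x'yz + w'xyz + wxyz + wx'yz as wxyz bit strings.
ON_SET = {"0100", "0101", "0011", "0111", "1111", "1011"}

def kmap_rows(on_set):
    """Return the map as four strings, one per wx row in Gray-code order."""
    return [" ".join("1" if wx + yz in on_set else "0" for yz in GRAY) for wx in GRAY]

print(neighbors_differ_by_one())
for row in kmap_rows(ON_SET):
    print(row)
```

In the printed map, the column of 1s under yz = 11 corresponds to the term yz, and the pair of 1s at the left of the wx = 01 row corresponds to w'xy'.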
A circle covering eight adjacent cells would represent all combinations of three variables, so
algebraic manipulation would eliminate all three variables and yield one term. For example,
the function in Figure 6.15 simplifies to a single term, z, as shown.

Figure 6.15 Eight adjacent cells.

Legal-sized circles in a four-variable K-map are one, two, four, eight, or sixteen adjacent
cells. Circling all sixteen cells yields a function that equals 1.

Larger K-Maps
K-maps for five and six variables have been proposed, but are rather cumbersome to use
effectively. Thus, we do not discuss them further.

K-maps for two variables also exist, as shown in Figure 6.16. However, they aren't
particularly useful, because two-variable functions are very easy to minimize algebraically.
Using a K-Map
Given any Boolean function of three or four variables, the following method summarizes how
to use a K-map to minimize the function:

1. Convert the function's equation into sum-of-minterms form.
2. Place a 1 in the appropriate K-map cell for each minterm.
3. Cover all the 1s by drawing the minimum number of largest circles such that every 1 is
   included at least once, and write the corresponding term.
4. OR all the resulting terms to create the minimized function.

Figure 6.16 Two-variable K-map.

The first step, converting to sum-of-minterms form, can be done algebraically, as was done in
Chapter 2. Alternatively, many people find it easier to combine steps 1 and 2 by converting the
function's equation to sum-of-products form (where each term is not necessarily a minterm),
and then filling in the 1s on the K-map corresponding to each term. For example, consider the
four-variable function:

F = w'xz + yz + w'xy'z'

The term w'xz corresponds to the two lightly shaded cells in Figure 6.17, so we put 1s in
those cells. The term yz corresponds to the entire dark-shaded column in the figure. The term
w'xy'z' corresponds to the single unshaded cell shown on the left with a 1.

Minimization would proceed by covering the 1s with circles and ORing all the terms. The
function in Figure 6.17 is identical to the function in Figure 6.14, for which we obtained the
minimized equation: F = w'xy' + yz.
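Filling in 1s term by term amounts to expanding each product term into the minterms it covers. The sketch below does that expansion for the example above; the parser and its names are hypothetical helpers written for this illustration, and minterms are numbered with w as the most significant bit.

```python
from itertools import product

VARS = "wxyz"

def parse_term(term):
    """Turn a product term such as "w'xz" into {variable: required value}."""
    literals = {}
    i = 0
    while i < len(term):
        var = term[i]
        complemented = i + 1 < len(term) and term[i + 1] == "'"
        literals[var] = 0 if complemented else 1
        i += 2 if complemented else 1
    return literals

def minterms_of(term):
    """Return the set of minterm numbers (K-map cells) that a product term covers."""
    literals = parse_term(term)
    covered = set()
    for bits in product((0, 1), repeat=len(VARS)):
        if all(bits[VARS.index(v)] == val for v, val in literals.items()):
            covered.add(int("".join(map(str, bits)), 2))
    return covered

def on_set(sum_of_products):
    """Union of the cells filled in for every term of a sum-of-products equation."""
    cells = set()
    for term in sum_of_products:
        cells |= minterms_of(term)
    return cells

print(sorted(on_set(["w'xz", "yz", "w'xy'z'"])))
print(on_set(["w'xz", "yz", "w'xy'z'"]) == on_set(["w'xy'", "yz"]))
```

The second line of output confirms that the original equation and the minimized equation F = w'xy' + yz place 1s in exactly the same cells.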
EXAMPLE 6.5 Two-level logic size optimization using a three-variable K-map

Minimize the following equation:

G = a + a'b'c' + b*(c' + bc')

Let's begin by converting the equation to sum-of-products:

G = a + a'b'c' + bc' + bc'

We place 1s in a three-variable K-map corresponding to each term, as in Figure 6.18. The
bottom row corresponds to the term a, the top left cell to term a'b'c', and the right column
to the term bc' (which appears twice in the equation).

We then cover the 1s using the two circles shown in Figure 6.19. ORing the resulting terms
yields the minimized equation G = a + c'.
EXAMPLE 6.6 Two-level logic size optimization using a four-variable K-map

Minimize the following equation:

H = a'b'(cd' + c'd') + ab'c'd' + ab'cd' + a'bd + a'bcd'

Converting to sum-of-products form yields:

H = a'b'cd' + a'b'c'd' + ab'c'd' + ab'cd' + a'bd + a'bcd'

We fill in the 1s corresponding to each term, resulting in the K-map shown in Figure 6.20. The
term a'bd corresponds to the two cells whose 1s are in italics. All the other terms are
minterms and thus correspond to one cell.

We cover the 1s using circles as shown. One "circle" covers the four corners, resulting in the
term b'd'. That circle may look strange, but remember that the top and bottom cells are
adjacent, and the left and right cells are adjacent. Another circle results in the term a'bd,
and a third circle in the term a'bc. The minimized two-level equation is therefore:

H = b'd' + a'bc + a'bd
Note the bolded 1 in Figure 6.20. We covered that 1 by drawing a circle that included the 1 to
the left, yielding the term a'bc. Alternatively, we could have drawn a circle that included the
1 above, yielding the term a'cd', resulting in the minimized equation:

H = b'd' + a'cd' + a'bd

Not only does that equation represent the same function as the previous equation, that
equation would also require the same number of transistors as the previous equation. Thus, we
see that there may be multiple minimized equations that are equally good.

Figure 6.17 w'xz and yz terms.

Figure 6.18 Terms on the K-map.

Figure 6.19 A cover.

Figure 6.20 K-map example.
Don't Care Input Combinations
Sometimes, we are guaranteed that certain input combinations of a Boolean function can
never appear. For those combinations, we don't care whether the function outputs a 1 or a 0,
because the function will never actually see those input values; the output for those inputs
just doesn't matter. As an intuitive example, if you became ruler of the world, would you live
in a palace or a castle? Your answer (the output) doesn't matter, because the input (you
becoming ruler of the world) simply won't happen.

Thus, when given a don't care input combination, we can choose whether to output a 1 or a 0
for each input combination, such that we obtain the best minimization possible. We can choose
whatever output yields the best minimization, because the output for those don't care input
combinations doesn't matter, as those combinations simply won't happen.

Algebraically, we can use don't care terms by introducing them into an equation during
algebraic minimization to create the opportunity to combine terms to eliminate a variable. As a
simple example, consider a function F = xy'z', for which we are for some reason guaranteed
that the terms x'y'z' and xy'z can each never evaluate to 1. We notice that adding the first
don't care term to the equation would result in xy'z' + x'y'z' = (x + x')y'z' = y'z'. Thus,
introducing that don't care term x'y'z' into the equation yields a minimization benefit.
However, introducing the second don't care term does not yield such a benefit, so we choose
not to introduce that term.
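One way to make "the output doesn't matter for don't cares" concrete is to compare a candidate equation against the required outputs on every input combination except the don't cares. The sketch below applies that check to the example above; the helper and function names are our own and are only meant as an illustration.

```python
from itertools import product

def agrees_on_care_inputs(candidate, on_set, dont_cares, nvars=3):
    """True if candidate outputs 1 on the on-set and 0 elsewhere, ignoring don't cares."""
    for bits in product((0, 1), repeat=nvars):
        if bits in dont_cares:
            continue                      # any output is acceptable here
        required = 1 if bits in on_set else 0
        if candidate(*bits) != required:
            return False
    return True

ON_SET = {(1, 0, 0)}                      # F = xy'z'
DONT_CARES = {(0, 0, 0), (1, 0, 1)}       # x'y'z' and xy'z never occur

def f_original(x, y, z):                  # xy'z'
    return int(x == 1 and y == 0 and z == 0)

def f_minimized(x, y, z):                 # y'z', which uses the don't care x'y'z'
    return int(y == 0 and z == 0)

def f_too_greedy(x, y, z):                # y' alone also covers x'y'z, a required 0
    return int(y == 0)

print(agrees_on_care_inputs(f_original, ON_SET, DONT_CARES))
print(agrees_on_care_inputs(f_minimized, ON_SET, DONT_CARES))
print(agrees_on_care_inputs(f_too_greedy, ON_SET, DONT_CARES))
```

The first two candidates are acceptable and the third is not, because expanding all the way to y' would output 1 for x'y'z, an input that can occur and must produce 0.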
In a K-map, don't care input combinations can be easily handled by placing an X in a K-map
for each don't care minterm. We don't have to cover the Xs with circles, but we can cover some
Xs if that helps us draw bigger circles while covering the 1s, meaning fewer literals will
appear in the term corresponding to the circle. For the above example, we would draw the K-map
shown in Figure 6.21, having one 1 corresponding to xy'z', when the function must output 1, and
having two Xs corresponding to x'y'z' and xy'z, when the function may output 1 if that helps us
minimize the function. Drawing a single circle results in the minimized equation F = y'z'. (Be
careful in this discussion not to confuse the uppercase X, corresponding to a don't care, with
the lowercase x, corresponding to a variable.)

Figure 6.21 Map with don't cares.

Remember, don't cares don't have to be covered. The cover in Figure 6.22 gives an example of a
wasteful use of don't cares. The circle covering the bottom X, yielding term xy', is not
needed. That term is not wrong, because we don't care whether the output is 1 or 0 when xy'z
evaluates to 1. But that term would result in a larger circuit, because the resulting equation
is F = y'z' + xy'. Since we don't care, why not make the output 0 when xy'z is 1, and thus
obtain a smaller circuit?

Figure 6.22 Wasteful use of X.
EXAMPLE 6.7 Two-level logic size minimization with don't cares on a K-map

Minimize the following function:

F = a'bc' + abc' + a'b'c

given that terms a'bc and abc are don't cares. Intuitively, those don't cares mean that bc can
never be 11.

We begin by creating the 3-variable K-map in Figure 6.23. We place 1s in the three cells for
the function's minterms. We then place Xs in the two cells for the don't cares. We can cover the
upper-left 1 using a circle that includes an X. Likewise, including the two Xs in a circle
covers the two 1s on the right with a bigger circle. The resulting minimized equation is
F = a'c + b.

Figure 6.23 Using don't cares.

Without don't cares, the equation would have minimized to F = a'b'c + bc'. Assuming two
transistors per gate input and ignoring inverters, the equation minimized without don't cares
would require (3 + 2 + 2) * 2 = 14 transistors (3 gate inputs for the first AND gate, 2 for the
second AND gate, and 2 for the OR gate, times 2 transistors per gate input). In contrast, the
equation minimized with don't cares requires only (2 + 0 + 2) * 2 = 8 transistors.

EXAMPLE 6.8 Don't care input combinations in a sliding switch example
Consider a sliding switch, shown in Figure 6.24, that can be in one of five positions, with
three outputs x, y, and z indicating the position in binary. So xyz can take on the values of
001, 010, 011, 100, and 101. The other values for xyz are not possible, namely, the values 000,
110, and 111 (or x'y'z', xyz', and xyz). We wish to design combinational logic, with x, y, and z
inputs, that outputs 1 if the switch is in position 2, 3, or 4, corresponding to xyz values of
010, 011, or 100.

Figure 6.24 Sliding switch example.

A Boolean equation describing the desired logic is: G = x'yz' + x'yz + xy'z'. We can minimize
the equation using a K-map, as shown in Figure 6.25. The minimized equation that results is:
G = xy'z' + x'y.

Figure 6.25 Without don't cares.

However, if we consider don't cares, we can obtain a simpler minimized equation. In
particular, we know that none of the three minterms x'y'z', xyz', and xyz can ever be true,
because the switch can only be in one of the above-stated five positions. So it doesn't matter
whether we output a 1 or a 0 for those three other minterms. We can include these don't care
input combinations as Xs on the K-map, as shown in Figure 6.26. When covering the 1s in the top
right, we can now draw a larger circle, resulting in the term y. When covering the 1 at the
bottom left, we can draw a larger circle also, resulting in the term z'. Although we ended up
covering all the Xs in this example, recall that we do not have to cover the Xs; we only use
them if they help us cover the 1s with larger circles. The minimized equation that results is:
G = y + z'.

Figure 6.26 With don't cares.

That minimized equation using don't cares looks a lot different than the minimized equation
without don't cares. But keep in mind the circuit still works the same. For example, if the
switch is in position 1, then xyz will be 001, so G = y + z' evaluates to 0, as desired.
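To see that the two minimized equations behave identically wherever the switch can actually be, we can tabulate both for all eight values of xyz. This is an illustrative sketch with names of our own choosing; it assumes x is the most significant bit of the position, as in the example.

```python
VALID_POSITIONS = [1, 2, 3, 4, 5]          # the only xyz values the switch produces

def g_without_dont_cares(x, y, z):         # G = xy'z' + x'y
    return int((x and not y and not z) or (not x and y))

def g_with_dont_cares(x, y, z):            # G = y + z'
    return int(y or not z)

def bits(position):
    """Split a position number into its (x, y, z) bits, x most significant."""
    return (position >> 2) & 1, (position >> 1) & 1, position & 1

for pos in range(8):
    x, y, z = bits(pos)
    tag = "valid" if pos in VALID_POSITIONS else "don't care"
    print(pos, g_without_dont_cares(x, y, z), g_with_dont_cares(x, y, z), tag)
```

The two columns match on positions 1 through 5. They differ only for 000, 110, and 111, where the equation using don't cares outputs 1 and the other outputs 0.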
Don't cares must be used with caution. We must balance the criteria of size with other
criteria, like reliable, error-tolerant, and safe circuits, when deciding whether to use don't
cares. We must ask ourselves: is it ever possible that the don't care input combination might
occur, even if in an error situation? And if it is possible, then do we really not care at all
what our circuit outputs in that situation? Often, we really do care, and will want to ensure
our circuit outputs a particular value. For example, in the sliding switch example above,
perhaps temporary values could appear at the xyz outputs as the switch is being moved. We might
therefore want to ensure we output 0 for the don't care values.

Several common situations lead to don't cares. Sometimes don't cares come from physical limits
on the inputs; a switch can't be in two positions at once, for example. If you've read Chapter
3, then you may realize that another common situation in which don't cares may appear is in
controller design, when a controller uses a state register that can represent more states than
the controller requires. For example, a controller with 17 states may use a 5-bit state
register, meaning that 15 of the 32 possible states of the state register would be unutilized.
Those 15 states could be treated as don't cares (although to be safe, we might actually want to
transition back to an initial state if we ever enter one of those 15 unused states due to noise
or some other error). If you've read Chapter 5, then you may realize that another common
situation where don't cares arise is in a controller controlling a datapath. If we aren't
reading or writing to a particular memory or register file in a given state, then we don't care
what address appears at the memory or register file during that state. Likewise, if a mux feeds
into a register and we aren't loading the register in a given state, then we really don't care
which mux data input passes through the mux during that state. If we aren't going to load the
output of an ALU into a register in a given state, then we really don't care what function the
ALU computes during that state.
Automating Two-Level Logic Size Optimization
Visual Use of K-Maps Is Rather Limited
Although the visual K-map method is helpful in two-level optimization of three- and
four-variable functions, the visual method is unmanageable for functions with many more
variables. One problem is that we can't effectively visualize maps beyond 5 or 6 variables.
Another problem is that humans make mistakes, and might not draw the biggest circle possible on
a K-map. Furthermore, the order in which a designer begins covering 1s may result in a function
that has more terms than would have been obtained using a different order. For example,
consider the function shown in the K-map of Figure 6.27(a). Starting from the left, a designer
might first draw the circle yielding the term y'z', then the circle yielding x'y', then the
circle yielding yz, and finally the circle yielding xy, for a total of four terms. The K-map in
Figure 6.27(b) shows an alternative cover. After drawing the circle yielding the term y'z', the
designer draws the circle yielding x'z, and then the circle yielding xy. The alternative cover
uses only three terms instead of four.

Figure 6.27 A cover is not necessarily optimal: (a) a four-term cover, and (b) a three-term
cover of the same function.
Concepts Underlying Automated Two-Level Size Optimization
Because of the above-mentioned problems, two-level logic size optimization is done primarily
using automated computer-based tools executing heuristic or exact algorithms. A heuristic is a
problem solving method that usually yields a good solution, which is close to the optimal, but
not necessarily optimal. An exact algorithm, or just algorithm, is a problem solving method
that yields the optimal solution. An optimal solution is as good or better than any other
possible solution, with respect to the criteria of interest to us.

We first define some concepts underlying heuristic and exact algorithms for two-level logic
size optimization. We will illustrate those concepts graphically on K-maps, but such
illustration is only intended to provide the reader with an intuition of the concepts;
automated tools do not use K-maps.
Recall that a function can be written as a sum-of-minterms equation. A minterm is a product
term that includes all the function's variables exactly once, in either true or complemented
form. The on-set of a function is the set of minterms that define when the function should
evaluate to 1 (i.e., when the function is "on"). For the function in Figure 6.28, the on-set
is: {x'y'z, xyz, xyz'}. The off-set of a function is all the remaining minterms. For the
function in Figure 6.28, the off-set is: {x'y'z', x'yz', x'yz, xy'z', xy'z}. Using compact
minterm notation (see Section 2.6), the on-set is {1,6,7}, and the off-set is {0,2,3,4,5}.

Figure 6.28 Implicants.
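The compact notation simply reads each minterm as a binary number, with an uncomplemented variable as 1 and a complemented variable as 0. A minimal sketch of that conversion follows; the function name is our own, and it assumes the first variable listed is the most significant bit.

```python
def minterm_number(minterm, variables="xyz"):
    """Convert a minterm such as "x'y'z" to its number (first variable is the MSB)."""
    number = 0
    for var in variables:
        pos = minterm.index(var)
        complemented = minterm[pos + 1:pos + 2] == "'"
        number = (number << 1) | (0 if complemented else 1)
    return number

on_set = {minterm_number(m) for m in ["x'y'z", "xyz", "xyz'"]}
off_set = set(range(2 ** 3)) - on_set      # every minterm not in the on-set

print(sorted(on_set))
print(sorted(off_set))
```

The off-set is computed here as the complement of the on-set, which matches the definition above when the function has no don't cares.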
An implicant is a product term that may include fewer than all the function's variables, but
is a term that only evaluates to 1 if the function should evaluate to 1. In other words, an
implicant of a function is a term that should evaluate to 1 for a particular set of variable
values only if at least one of the function's on-set minterms evaluates to 1 for those variable
values. For example, the function F = x'y'z + xyz' + xyz has four implicants: x'y'z, xyz', xyz,
and xy. Graphically, an implicant is any legal (but not necessarily the biggest possible)
circle on a K-map, as shown in Figure 6.28. All minterms are obviously implicants, but not all
implicants are minterms.

We say that the implicant xy covers minterms xyz' and xyz of function F. Graphically, an
implicant's circle encircles the 1s of the covered minterms. Intuitively, we know that we can
replace the covered minterms by the covering implicant and still obtain the same function. In
other words, we can replace xyz' + xyz by xy. A set of implicants that covers the on-set of a
function (and covers no other minterms) is known as a cover of the function. For the above
function, one function cover is x'y'z + xyz + xyz'; another cover is x'y'z + xy; yet another
cover is x'y'z + xyz + xyz' + xy.
Removing a variable from a term is known as expanding the term, which is the same as
expanding the size of a circle on a K-map. For example, for the function in Figure 6.28,
expanding the term xyz to the term xy (by eliminating z) results in an implicant of the
function. Expanding the term xyz' to xy also results in an implicant (the same one). But
expanding xyz to xz (by eliminating y) does not result in an implicant; xz covers minterm xy'z,
which is not in the function's on-set.
A prime implicant of a function is an implicant with the property that if any variable were
eliminated from the implicant, the result would be a term covering a minterm not in the
function's on-set. Graphically, a prime implicant corresponds to circles that are the largest
possible; enlarging the circle further would result in covering 0s, which changes the function.
In Figure 6.28, x'y'z and xy are prime implicants. Removing any variable from implicant x'y'z,
say z, would result in a term (x'y') that covers a minterm that is not in the on-set; x'y'
covers x'y'z', for example, which is not in the function's on-set. Likewise, removing x' or y'
from that term would cover a minterm not in the function's on-set. Removing any variable from
implicant xy, say y, would result in a term (x) that covers minterms not in the on-set. On the
other hand, xyz is not a prime implicant, because z can be removed from that implicant without
changing the function, since xy covers minterms xyz and xyz', both of which are in the on-set.
Likewise, xyz' is not a prime implicant, because z' can be removed. There is no reason to cover
a function with anything other than prime implicants, since a prime implicant achieves the same
function with fewer literals than nonprime implicants (which is why we always draw the biggest
circles possible in K-maps).
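The definitions of implicant and prime implicant translate directly into two small tests. In the sketch below, a term is written as a dictionary from variable name to required value (an encoding we chose for illustration, not one used by the text), and the on-set {1, 6, 7} is that of the function in Figure 6.28.

```python
from itertools import product

VARS = "xyz"
ON_SET = {1, 6, 7}                       # x'y'z, xyz', xyz (x is the most significant bit)

def covered(term):
    """Minterm numbers covered by a term given as {variable: 0 or 1}."""
    cells = set()
    for bits in product((0, 1), repeat=len(VARS)):
        if all(bits[VARS.index(v)] == val for v, val in term.items()):
            cells.add(int("".join(map(str, bits)), 2))
    return cells

def is_implicant(term, on_set=ON_SET):
    """A term is an implicant if it covers only minterms in the on-set."""
    return covered(term) <= on_set

def is_prime_implicant(term, on_set=ON_SET):
    """Prime: an implicant that stops being one when any single variable is removed."""
    if not is_implicant(term, on_set):
        return False
    for var in term:
        expanded = {v: val for v, val in term.items() if v != var}
        if is_implicant(expanded, on_set):
            return False
    return True

print(is_implicant({"x": 1, "z": 1}))                    # xz covers xy'z
print(is_prime_implicant({"x": 1, "y": 1}))              # xy
print(is_prime_implicant({"x": 1, "y": 1, "z": 1}))      # xyz can expand to xy
print(is_prime_implicant({"x": 0, "y": 0, "z": 1}))      # x'y'z
```

Checking only single-variable removals is enough, because removing more variables can only enlarge the set of covered minterms.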
An essential prime implicant is a prime implicant that is the only prime implicant that
covers a particular minterm in the function's on-set. Graphically, an essential prime implicant
is the only circle (the largest possible, of course, since the circle must represent a prime
implicant) that covers a particular 1. In Figure 6.28, x'y'z is an essential prime implicant,
as is xy, because each is the only prime implicant covering a particular 1. A nonessential
prime implicant is a prime implicant whose covered minterms are also covered by one or more
other prime implicants. Figure 6.29 shows a different function that has four prime implicants,
but only two of which are essential. x'y' is an essential prime implicant because it is the
only prime implicant that covers minterm x'y'z'. xy is an essential prime implicant because it
is the only prime implicant that covers minterm xyz'. y'z is a nonessential prime implicant
because both of its covered minterms are covered by other prime implicants (those other prime
implicants may or may not be essential prime implicants). Likewise, xz is not essential.

The importance of essential prime implicants is that we know that we must include all
essential prime implicants in a function's cover; otherwise, there would be some minterms that
could not be covered. We may or may not need to include nonessential prime implicants to
completely cover the function, but we must include all essential prime implicants.

Figure 6.29 Essential prime implicants.
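Finding essential prime implicants is a matter of counting, for each minterm, how many prime implicants cover it. The sketch below does this for the four prime implicants named for Figure 6.29; the minterm numbers each one covers follow from the terms themselves (x is the most significant bit), and the dictionary layout is our own choice.

```python
# Prime implicants of the function in Figure 6.29 and the minterm numbers each covers.
# For example, x'y' covers x'y'z' (0) and x'y'z (1).
PRIMES = {
    "x'y'": {0, 1},
    "y'z":  {1, 5},
    "xz":   {5, 7},
    "xy":   {6, 7},
}

def essential_primes(primes):
    """Return the primes that are the only cover for at least one minterm."""
    essential = set()
    all_minterms = set().union(*primes.values())
    for m in all_minterms:
        covering = [name for name, cells in primes.items() if m in cells]
        if len(covering) == 1:
            essential.add(covering[0])
    return essential

print(sorted(essential_primes(PRIMES)))
```

Minterm 0 is covered only by x'y' and minterm 6 only by xy, so those two are reported as essential; every minterm of y'z and xz is shared with another prime implicant.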
Given the notion of prime implicants and essential prime implicants, a simple approach for
two-level logic optimization is given in Table 6.1.

TABLE 6.1 Approach for automated two-level logic size optimization.

Step: Description

1. Determine prime implicants: For every minterm in the function's on-set, maximally expand
   the term (meaning eliminate literals from the term) such that the term still only covers
   minterms in the on-set (like drawing the biggest circle possible around each 1 in a K-map).
   Repeat for each minterm. If don't cares exist, use them to maximally expand minterms into
   prime implicants (like using X's to create the biggest circles for a given 1 in a K-map).

2. Add essential prime implicants to the function's cover: Find any minterms covered by only
   one prime implicant (i.e., by an essential prime implicant). Add those prime implicants to
   the cover, and mark the minterms covered by those implicants as already covered.

3. Cover remaining minterms with nonessential prime implicants: Cover the remaining minterms
   using the minimal number of remaining prime implicants.
The first two steps are exact. The last step is a bit tricky. How do we choose which prime
implicants to use to cover the remaining minterms? Recall the example of Figure 6.27, in which
the cover in Figure 6.27(a) used two prime implicants to cover the two 1s that would be left
after adding essential prime implicants, while the cover in Figure 6.27(b) used only one prime
implicant to cover those remaining two 1s. When there are only two possibilities, we can try
each possibility and pick the one with fewest prime implicants in the final cover. But what if
there were millions, or billions, of possibilities? We may not have enough compute time to try
all those possibilities. For large functions with hundreds of minterms and thousands of prime
implicants, there may indeed be millions of possible covers to consider in the last step.

If an approach tries all such possibilities, the approach is an exact algorithm. If an approach
just tries a few such possibilities, the overall two-level size optimization approach may be a
heuristic (unless the approach can guarantee that the ignored possibilities couldn't possibly
be part of an optimal solution).
We'll demonstrate the approach for automated two-level logic size optimization with the
following example.

EXAMPLE 6.9 Two-level logic size optimization with the approach of Table 6.1, illustrated on a
K-map

Figure 6.30 shows a K-map for the function from Figure 6.27, for which we saw that different
covers yielded different numbers of terms. The first step is to determine all prime implicants,
shown in the top part of the figure. For each 1, draw every possible circle involving adjacent
1s, ensuring that each circle is the largest possible.

The second step is to add essential prime implicants to the function's cover. Notice that the
1 corresponding to minterm x'yz (the top right 1) is covered only by one prime implicant,
namely, x'z. We know we'll need to use that prime implicant, so we'll include prime implicant
x'z in the cover. Also notice that the 1 corresponding to minterm xyz' (the bottom right 1) is
only covered by one prime implicant, namely, xz', so we'll include that prime implicant in the
cover too. We mark all the 1s covered by these essential prime implicants, noted by italicized
1s in the figure.

The last step is to cover the remaining 1s using the fewest number of prime implicants. There
is only one 1 uncovered, and that 1 is covered by two prime implicants. We can choose either
prime implicant for the cover; let's choose y'z'. Thus, the final cover is:

I = x'z + xz' + y'z'

This example uses a K-map merely to illustrate to the reader the steps occurring within an
automated tool; such a tool does not use K-maps internally, but rather other means of
representing the terms of a function.
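The three steps of Table 6.1 can be sketched in a few dozen lines for small functions. The code below is only an illustration under stated assumptions: the last step is done by exhaustive search (so it is exact but slow), a term is encoded as a (mask, value) pair of our own design, and the on-set {0, 1, 3, 4, 6} is the one implied by the final cover x'z + xz' + y'z' of this example (x is the most significant bit).

```python
from itertools import combinations

def expand_to_primes(on_set, nvars):
    """Step 1: find every prime implicant as a (mask, value) pair.

    mask has a 1 for each variable kept in the term; value holds those variables' bits.
    """
    def cells(mask, value):
        return {m for m in range(2 ** nvars) if m & mask == value}

    implicants = set()
    for mask in range(2 ** nvars):
        for m in on_set:
            if cells(mask, m & mask) <= on_set:
                implicants.add((mask, m & mask))
    primes = {}
    for mask, value in implicants:
        # Prime if dropping any one kept variable stops the term being an implicant.
        bigger = [(mask & ~(1 << b), value & ~(1 << b))
                  for b in range(nvars) if mask & (1 << b)]
        if not any(cube in implicants for cube in bigger):
            primes[(mask, value)] = cells(mask, value)
    return primes

def minimum_cover(on_set, nvars):
    primes = expand_to_primes(on_set, nvars)
    # Step 2: essential primes are the sole cover of some minterm.
    essential = set()
    for m in on_set:
        covering = [p for p, c in primes.items() if m in c]
        if len(covering) == 1:
            essential.add(covering[0])
    remaining = set(on_set) - set().union(*(primes[p] for p in essential))
    others = sorted(p for p in primes if p not in essential)
    # Step 3 (exhaustive): try 0, 1, 2, ... additional primes until all 1s are covered.
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            if remaining <= set().union(*(primes[p] for p in extra)):
                return essential | set(extra)

def term_text(cube, names="xyz"):
    """Write a (mask, value) pair in the x'y'z style used in the text."""
    mask, value = cube
    out = ""
    for i, name in enumerate(names):
        bit = 1 << (len(names) - 1 - i)
        if mask & bit:
            out += name if value & bit else name + "'"
    return out or "1"

cover = minimum_cover({0, 1, 3, 4, 6}, 3)
print(" + ".join(sorted(term_text(c) for c in cover)))
```

For this on-set the sketch selects the essential prime implicants x'z and xz' and one further prime implicant for the remaining 1. The exhaustive third step is what makes the sketch exact; replacing it with a search over only some combinations would turn it into a heuristic.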
Figure 6.30 Illustration of two-level optimization: (a) all prime implicants, (b) including
essential prime implicants in the cover, (c) covering remaining 1s.

Automated Two-Level Logic Size Optimization Using the Quine-McCluskey Method
The most well-known, and in fact the original, approach for automated two-level logic size optimization is the Quine-McCluskey method, sometimes called the tabular method.
The first step of this method finds all prime implicants. The step starts with the function's minterms; if we are minimizing a three-variable function, then we might call these three-variable terms. To find all the prime implicants, the method first compares each three-variable term with every other three-variable term, and if two terms are found that differ by only one variable, the method adds a new term (without the differing variable) to a new set of two-variable terms. For example, xyz' and xyz differ by one variable, z, resulting in a new term xy being added to the two-variable set. Once done comparing all three-variable terms, the method compares pairs of two-variable terms, looking for terms that differ by only one variable, resulting in a set of one-variable terms. The one-variable terms can then be compared for terms that differ by one variable, but if such terms are found, then the function evaluates simply to 1. Actually, not all terms in a set need to be compared; only those terms whose number of uncomplemented literals differs by one need to be compared. For example, x'y'z' and xyz' need not be compared, because the number of uncomplemented literals differs by two, not one, and thus the terms can't be simplified to a new term by eliminating a variable. If at any time in this step a term cannot be combined with any other term, we mark that term as a prime implicant. Thus, after this step, all marked terms represent all prime implicants. The method thus provides an approach for finding prime implicants that is more efficient than just maximally expanding every term.
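The first step can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular tool is written: terms are assumed to be strings over '0', '1', and '-' (a '-' marks an eliminated variable), and the names combine and prime_implicants are hypothetical. For brevity the sketch compares every pair of terms rather than only those whose number of uncomplemented literals differs by one.

```python
from itertools import combinations

def combine(a, b):
    """Merge two terms that differ in exactly one non-dash position, else None."""
    diff = [i for i, (p, q) in enumerate(zip(a, b)) if p != q]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def prime_implicants(minterms, num_vars):
    """Repeatedly combine terms; any term that never combines is a prime implicant."""
    terms = {format(m, f"0{num_vars}b") for m in minterms}
    primes = set()
    while terms:
        merged, used = set(), set()
        for a, b in combinations(sorted(terms), 2):
            c = combine(a, b)
            if c is not None:
                merged.add(c)
                used.update((a, b))
        primes |= terms - used   # terms that could not be combined are "marked"
        terms = merged           # continue with the next, smaller-term set
    return primes

# Three-variable function with 1s at minterms 0, 1, 3, 4, 7 (variable order x, y, z).
print(sorted(prime_implicants({0, 1, 3, 4, 7}, 3)))
```

With the variable order x, y, z, the pattern "-00" stands for y'z', "00-" for x'y', "0-1" for x'z, and "-11" for yz.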
The second step is to add all the essential prime implicants to the cover, and to mark as "covered" all minterms covered by those prime implicants.
The final step is to cover all remaining uncovered minterms by selecting the fewest remaining prime implicants to cover those minterms. Trying all the possibilities results in a version of the Quine-McCluskey method that is an exact algorithm. Trying just a subset may result in a heuristic.
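The second and final steps might look as follows in Python. This is a sketch under the same assumed term representation (strings over '0', '1', '-'); the function names are hypothetical, and the final step is the exact version, which simply tries subsets of the remaining prime implicants from smallest to largest.

```python
from itertools import combinations

def covers(term, minterm, num_vars):
    """True if the term (e.g. '0-1') covers the given minterm number."""
    bits = format(minterm, f"0{num_vars}b")
    return all(t in ("-", b) for t, b in zip(term, bits))

def minimum_cover(primes, minterms, num_vars):
    primes = sorted(primes)
    # Step 2: a prime implicant is essential if it alone covers some minterm.
    essential = set()
    for m in minterms:
        covering = [p for p in primes if covers(p, m, num_vars)]
        if len(covering) == 1:
            essential.add(covering[0])
    uncovered = {m for m in minterms
                 if not any(covers(p, m, num_vars) for p in essential)}
    # Step 3 (exact): try the fewest remaining prime implicants first.
    rest = [p for p in primes if p not in essential]
    for size in range(len(rest) + 1):
        for subset in combinations(rest, size):
            if all(any(covers(p, m, num_vars) for p in subset) for m in uncovered):
                return essential | set(subset)

cover = minimum_cover({"00-", "-00", "0-1", "-11"}, {0, 1, 3, 4, 7}, 3)
print(sorted(cover))
```

When several equally small covers exist, this sketch returns whichever it finds first; a different choice of the last prime implicant is just as valid.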
Methods That Enumerate All Minterms or Compute All Prime Implicants May Be Inefficient
The Quine-McCluskey method works reasonably for functions with perhaps tens of variables. However, for larger functions, just listing all the minterms could result in a huge amount of data. A function of 10 variables could have up to 2^10 minterms; that's 1024 minterms, which is fairly reasonable. But a function of 32 variables could have up to 2^32 minterms, or up to about four billion minterms. Representing those minterms in a table requires prohibitive computer memory. And comparing those minterms with other minterms could require on the order of (four billion)^2 computations, or about 16 quintillion computations (a quintillion is a million times a trillion). Even a computer performing 10 billion computations per second would require over a billion seconds to perform all those computations, or roughly 50 years. And for 64 variables, the numbers go up to 2^64 possible minterms, or about 18 quintillion minterms, and on the order of (18 quintillion)^2 computations, which is far beyond what any computer could perform. Functions with 100 inputs, which are not that uncommon, would require an absurd amount of memory and computation. Even computing all prime implicants, without first listing all minterms, is computationally prohibitive for many modern-sized functions.
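The growth is easy to check numerically. The following back-of-the-envelope sketch assumes, as a simplification, that comparing every minterm with every other takes about (number of minterms)^2 computations and that the computer performs 10 billion computations per second; the function name enumeration_cost is hypothetical.

```python
def enumeration_cost(num_vars, ops_per_second=10e9):
    """Worst-case minterm count, pairwise comparison count, and time in seconds."""
    minterms = 2 ** num_vars
    comparisons = minterms ** 2
    seconds = comparisons / ops_per_second
    return minterms, comparisons, seconds

for n in (10, 32, 64):
    minterms, comparisons, seconds = enumeration_cost(n)
    years = seconds / (365 * 24 * 3600)
    print(f"{n} variables: {minterms:.3g} minterms, "
          f"{comparisons:.3g} comparisons, {years:.3g} years")
```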
Iterative Heuristic for Two-Level Logic Size Optimization
Because enumerating all minterms of a function, or even just all prime implicants, is prohibitive in terms of computer memory and computation time for functions with many variables, most automated tools use methods that instead just iteratively transform the original function's equation, in an attempt to find improvements to the equation. Iterative improvement means repeatedly making small changes to an existing solution until we decide to stop, perhaps because we can't find a better solution, or perhaps because the tool has run for enough time. As an example of making small changes to an existing solution, consider the equation:
F = abcdefgh + abcdefgh' + jklmnop
Clearly, we can reduce this equation simply by combining the first two terms and removing variable h, resulting in F = abcdefg + jklmnop. However, enumerating the minterms, as required in the earlier-described size optimization methods, would have resulted in roughly 1000 minterms and then millions of computations to find the prime implicants, but such enumeration and computation are obviously not necessary to minimize this equation.
Modern automated logic optimization tools therefore don't try to enumerate all the minterms for functions with many variables. Instead, those tools start with a given sum-of-products equation of the function, like the description for F above. Those tools then try to transform the equation little by little into a better equation, meaning an equation with fewer terms and/or fewer literals. Those tools repeat, or iterate, until they find no further improvement or until some maximum time allocated for the tool's execution has expired.
Heuristics for such two-level logic optimization in modern tools can be quite complex. However, a simple heuristic that is reasonably effective uses repeated application of the expand operation. The expand operation means to remove a literal from a term and then check whether the new term is legal. Removing a literal makes that term cover more minterms, like drawing a bigger circle on a K-map, thus the name "expand." For example, consider the function F = x'z + xy'z + xyz. We might try to expand the term x'z by removing x', or by removing z. Note that expanding a term reduces the number of literals; the concept that expanding a term reduces the number of literals in a term may take a while for you to get used to. Thinking of K-map circles may help, as shown in Figure 6.31: the bigger the circle, the fewer the resulting literals. An expansion is legal if the new term covers only minterms in the function's on-set, or equivalently, does not cover a minterm in the function's off-set; in other words, an expansion is legal if the new term is still an implicant of the function. Figure 6.31(a) shows that expanding term x'z to z for the given function is legal, as the expanded term covers only 1s, whereas expanding x'z to x' is not legal, as the expanded term covers at least one 0. If an expansion is legal, we replace the original term by the expanded term, and we look for and remove any other term covered by the expanded term. In Figure 6.31(a), the expanded term z covers terms xy'z and xyz, so both those latter terms can be removed.
Figure 6.31 Expansions of term x'z in the function F = x'z + xy'z + xyz: (a) legal, (b) not legal (because the expanded term covers 0s).
Note that we illustrated the expand operation on a K-map merely to aid in understanding the intuition of the operation; K-maps are nowhere to be found in heuristic two-level logic size minimization tools.
As another example, for the earlier introduced function:
F = abcdefgh + abcdefgh' + jklmnop
We might start by trying to expand the first term, abcdefgh. One expansion of that term is bcdefgh (i.e., we removed the literal a). However, that term covers the term a'bcdefgh, which covers minterms that are not in the function's on-set, so that expansion is not legal. We might try other expansions, finding them not legal, until we come across the expansion to abcdefg (i.e., we removed the literal h). That term covers strictly abcdefgh and abcdefgh', both of which are clearly implicants because they appear in the original function, and thus the new term must also be an implicant. Therefore, we replace the first term by the expanded term:
F = abcdefg + abcdefgh' + jklmnop
and we also remove the second term, since that term is covered by the expanded term:
F = abcdefg + jklmnop
Thus, using just the expand operation, we have improved the equation.
EXAMPLE 6.10 Iterative heuristic two-level logic size optimization using expand
Minimize the following equation, which was also minimized in Example 6.4, using repeated application of the expand operation:
F = xyz + xyz' + x'y'z' + x'y'z
In other words, the on-set consists of the minterms {7, 6, 0, 1}, and so the off-set consists of the minterms {2, 3, 4, 5}.
Let's expand the terms from left to right, so we'll start with xyz. We can try to expand xyz to xy. Is that a legal expansion? xy covers minterms xyz' (minterm 6) and xyz (minterm 7), both in the on-set. Thus, the expansion is legal, so we replace xyz by xy, yielding the new equation:
F = xy + xyz' + x'y'z' + x'y'z
We also look for implicants covered by the new implicant xy. xyz' is covered by xy, so we eliminate xyz', yielding:
F = xy + x'y'z' + x'y'z
Let's continue trying to expand that first term. We can try expanding it from xy to x. The term x covers minterms xy'z' (minterm 4), xy'z (minterm 5), xyz' (minterm 6), and xyz (minterm 7). The term x thus covers minterms 4 and 5, which are not in the on-set, but instead in the off-set. Thus, that expansion is not legal. We can also try expanding xy to y, but we'll find again that the expansion is not legal.
We might then look at the next term, x'y'z'. Let's try to expand it to x'y'. That term covers minterms x'y'z' (minterm 0) and x'y'z (minterm 1), both in the on-set, so the expansion is legal. We thus replace the term by the expanded one:
F = xy + x'y' + x'y'z
We check for other terms covered by the expanded term, and find that x'y'z is covered by x'y', so we remove x'y'z, leaving:
F = xy + x'y'
We can try expanding the term x'y' further, but will find that both possible expansions (x' or y') are not legal. Thus, the above equation represents the minimized equation. Notice that this happens to be the same result as we obtained when we minimized the initial equation in Example 6.4.
Even though the heuristic based on expand happened to generate the optimally minimized equation in the previous example, there is no guarantee the results from the heuristic will always be optimal.
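The expand-based heuristic of Example 6.10 can be sketched in Python as follows. This is an illustration only: terms are assumed to be strings over '0', '1', and '-' in the variable order x, y, z, the on-set is given explicitly as a set of bit strings (which real tools avoid), and the names covered_minterms and expand_heuristic are hypothetical.

```python
from itertools import product

def covered_minterms(term):
    """All minterms (as bit strings) covered by a term such as '11-'."""
    choices = [("0", "1") if c == "-" else (c,) for c in term]
    return {"".join(bits) for bits in product(*choices)}

def expand_heuristic(terms, on_set):
    """Expand each term as far as legally possible, dropping terms it then covers."""
    result, pending = [], list(terms)
    while pending:
        term = pending.pop(0)
        changed = True
        while changed:
            changed = False
            for pos, literal in enumerate(term):
                if literal == "-":
                    continue
                candidate = term[:pos] + "-" + term[pos + 1:]
                # Legal only if the expanded term covers no off-set minterm.
                if covered_minterms(candidate) <= on_set:
                    term, changed = candidate, True
                    break
        cover = covered_minterms(term)
        pending = [t for t in pending if not covered_minterms(t) <= cover]
        result = [t for t in result if not covered_minterms(t) <= cover]
        result.append(term)
    return result

# F = xyz + xyz' + x'y'z' + x'y'z, i.e. on-set {7, 6, 0, 1}
terms = ["111", "110", "000", "001"]
print(expand_heuristic(terms, set(terms)))   # '11-' is xy, '00-' is x'y'
```

The sketch tries literals left to right rather than in random order, so it is deterministic; as noted above, nothing guarantees that the result is optimal in general.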
More advanced heuristics utilize additional operations beyond just the expand operation. One such operation is the reduce operation, which can be thought of as the opposite of expand. The reduce operation takes a term and tries to add a literal to the term, checking that the equation with the new term still covers the function. Adding a literal to a term is like reducing the size of a circle on a K-map. Adding a literal to a term reduces the number of minterms covered by the term, hence the name reduce. Another operation is irredundant, which tries to remove a term entirely, checking that the new equation still covers the function. If so, the removed term was "redundant," hence the name irredundant. Heuristics may iterate among the expand, reduce, irredundant, and other operations, such as in the following heuristic: Try 10 random expansion operations, then 5 random reduce operations, then 2 irredundant operations, and then repeat (iterate) the whole sequence until no improvement occurs from one iteration to the next. Modern two-level size optimization tools differ largely in their ordering of operations and their number of iterations.
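As an illustration of the irredundant operation, the sketch below removes any term whose removal leaves the function's on-set fully covered. It assumes the same string representation of terms ('0', '1', '-') and, for simplicity, takes the on-set to be whatever the given terms cover; the function names are hypothetical, and a real tool would check coverage without enumerating minterms.

```python
from itertools import product

def covered_minterms(term):
    """All minterms (as bit strings) covered by a term such as '1-1'."""
    choices = [("0", "1") if c == "-" else (c,) for c in term]
    return {"".join(bits) for bits in product(*choices)}

def irredundant(terms):
    """Drop each term whose minterms are all covered by the remaining terms."""
    on_set = set()
    for t in terms:
        on_set |= covered_minterms(t)
    kept = list(terms)
    for term in terms:
        trial = list(kept)
        trial.remove(term)
        still_covered = set()
        for t in trial:
            still_covered |= covered_minterms(t)
        if still_covered == on_set:   # the equation still covers the function
            kept = trial
    return kept

# F = x'y' + y'z + xz: the middle term y'z is redundant.
print(irredundant(["00-", "-01", "1-1"]))
```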
Recall that we said that modern heuristics don't enumerate all of a function's minterms, yet in the previous example we did enumerate all the minterms; actually, we were given the minterms in the initial equation. When we don't initially know the minterms, many advanced methods exist to efficiently represent a function's on-set and off-set without enumerating the minterms in those sets, and also to quickly check if a term covers terms in the off-set. Those methods are beyond the scope of the book, and are instead the subject of textbooks on digital design synthesis. But hopefully you now get the basic idea of heuristic two-level minimization.
One of the original tools that performed automated heuristics as well as exact two-level logic optimization was called Espresso, developed at the University of California, Berkeley. The algorithms and heuristics in Espresso formed the basis of many modern commercial logic optimization tools.
Multilevel Logic Optimization: Performance and Size Tradeoffs
We have thus far discussed two-level logic size optimization. However, in practice, we may not need the speed of two levels of logic. We may be willing to use three, four, or more levels of logic if those additional levels reduce the amount of required logic. As a simple example, consider the equation:
F1 = ab + acd + ace
This equation can't be minimized. The resulting two-level circuit is shown in Figure 6.32(a).
We could, however, algebraically manipulate the equation as follows:
F2 = ab + ac(d + e) = a(b + c(d + e))
That equation can be implemented with the circuit shown in Figure 6.32(b). The multilevel logic implementation results in fewer transistors, but at the expense of more gate-delays, as illustrated in Figure 6.32(c). The multilevel implementation thus represents a tradeoff compared to the two-level implementation.
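The tradeoff can be checked with a small Python sketch. The sketch assumes a simple cost model, two transistors per gate input and one gate-delay per level of AND/OR gates, with expressions written as nested tuples; the representation and the function names are hypothetical and only meant for illustration.

```python
from itertools import product

# Expressions as nested tuples: (operator, operand, operand, ...); leaves are variable names.
F1 = ("or", ("and", "a", "b"), ("and", "a", "c", "d"), ("and", "a", "c", "e"))
F2 = ("and", "a", ("or", "b", ("and", "c", ("or", "d", "e"))))

def evaluate(expr, env):
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    values = [evaluate(a, env) for a in args]
    return all(values) if op == "and" else any(values)

def gate_inputs(expr):
    """Total number of gate inputs in the circuit for the expression."""
    if isinstance(expr, str):
        return 0
    _, *args = expr
    return len(args) + sum(gate_inputs(a) for a in args)

def depth(expr):
    """Number of gate levels on the longest path (gate-delays)."""
    if isinstance(expr, str):
        return 0
    _, *args = expr
    return 1 + max(depth(a) for a in args)

def equivalent(e1, e2, names):
    """Compare the two expressions on every input combination."""
    return all(evaluate(e1, dict(zip(names, bits))) == evaluate(e2, dict(zip(names, bits)))
               for bits in product([False, True], repeat=len(names)))

for name, f in (("F1", F1), ("F2", F2)):
    print(name, 2 * gate_inputs(f), "transistors,", depth(f), "gate-delays")
print("equivalent:", equivalent(F1, F2, "abcde"))
```

Under this cost model the factored form needs fewer transistors but has a longer delay, while both expressions compute the same function.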
Figure 6.32 Using multilevel logic to trade off performance and size: (a) two-level circuit for F1 = ab + acd + ace, (b) multilevel circuit for F2 = a(b + c(d + e)) with fewer transistors (16 transistors, 4 gate-delays), (c) illustration of the size versus delay tradeoff. Numbers inside gates represent transistor counts.
Automated heuristics for multilevel logic optimization iteratively transform the initial function's equation, much like for two-level logic optimization, optimizing one of the criteria at the expense of another.
EXAMPLE 6.11 Multilevel logic optimization
Minimize the following function's circuit size, at the expense of perhaps slower performance, using algebraic manipulation. Plot the tradeoff of the initial and size-optimized circuits with respect to size and delay.
F1 = abcd + abcef
The circuit corresponding to this equation is shown in Figure 6.33(a). The circuit requires 22 transistors and has a delay of 2 gate-delays.
Figure 6.33 Multilevel logic to trade off performance and size: (a) two-level circuit (F1 = abcd + abcef; 22 transistors, 2 gate-delays), (b) multilevel circuit with fewer transistors (F2 = abc(d + ef); 18 transistors, 3 gate-delays), (c) tradeoff of size versus delay. Numbers inside gates represent transistor counts.
We can algebraically manipulate the equation by factoring out the abc term from the two terms, as follows:
F2 = abcd + abcef = abc(d + ef)
The circuit for that equation is shown in Figure 6.33(b). The circuit requires only 18 transistors, but has a longer delay of 3 gate-delays. The plot in Figure 6.33(c) shows the size and performance for each design.
EXAMPLE 6.12 Reducing noncritical path size with multilevel logic
Use multilevel logic to reduce the size of the circuit in Figure 6.34(a), without extending the circuit's delay. Note that the circuit initially has 26 transistors. Furthermore, the longest delay from any input to the output is three gate-delays. That delay occurs through the path shown by the dashed line in the figure. The longest path through a circuit is the circuit's critical path.
Figure 6.34 Multilevel optimization that reduces size without increasing delay, by altering a noncritical path: (a) original circuit (F1 = (a+b)c + dfg + efg; 26 transistors, 3 gate-delays), (b) new circuit with fewer transistors but the same delay (F2 = (a+b)c + (d+e)fg; 22 transistors, 3 gate-delays), (c) illustration of the size optimization with no tradeoff of delay.
The other paths through the circuit are only two gate-delays. Thus, if we reduce the size of the logic for the noncritical paths and extend those paths to three gate-delays, we would not have extended the overall delay of the circuit. We focus on the noncritical parts of the equation for F1 in Figure 6.34(a); the equation has its noncritical parts italicized. We can algebraically modify the noncritical parts by factoring out the term fg, resulting in the new equation and circuit shown in Figure 6.34(b). One of the modified paths is now also three gate-delays, so we now have two equally long critical paths, both having three gate-delays. The resulting circuit has only 22 transistors compared to 26 in the original circuit, yet still has the same delay of three gate-delays, as illustrated in Figure 6.34(c). Overall, we've performed a size optimization with no penalty in performance.
Generally, multilevel logic optimization uses factoring (e.g., abc + abd = ab(c + d)) to reduce the number of gates.
Multilevel logic optimization is probably more commonly used today than two-level logic optimization. Multilevel logic optimization is also extensively used by automatic tools that map circuits to FPGAs. FPGAs will be discussed in a later chapter.
6.3 SEQUENTIAL LOGIC OPTIMIZATIONS AND TRADEOFFS
State Reduction
In Chapter 3, we described the design of sequential logic, namely, of controllers. When creating the FSM, and converting the FSM to a state register and logic, we can apply some optimizations and tradeoffs.
State reduction, also known as state minimization, is an optimization that reduces the number of FSM states without changing the FSM's behavior. By reducing the number of states, we may reduce the size of the required state register that implements the FSM, thus reducing circuit size.
Reducing the number of states is possible when an FSM contains states that are equivalent to one another. For example, consider the FSM of Figure 6.35(a), having input x and output y. Examination reveals that states S2 and S3 appear to be the same as states S0 and S1. Regardless of whether we start in S0 or S2, the outputs will be identical. For example, if we start in S0 and the input sequence for four clock edges is 1, 1, 0, 0, the state sequence will be S0, S1, S1, S2, S2, so the output sequence will be 0, 1, 1, 0, 0. If instead we start in S2, the same input sequence will result in a state sequence of S2, S3, S3, S0, S0, so the output sequence will again be 0, 1, 1, 0, 0. In fact, if we tried all possible input sequences, we would find that the output sequence starting from state S0 would be identical to the output sequence starting from state S2. States S0 and S2 are thus equivalent. Likewise, states S1 and S3 are equivalent for the same reason. Thus, we can redraw the FSM as in Figure 6.35(b). The FSMs in Figure 6.35(a) and (b) have exactly the same behavior: for any sequence of inputs, the two FSMs provide exactly the same sequence of outputs. If we encapsulate the FSM as a box as in Figure 6.35(c), the outside world cannot distinguish between the two FSMs based on the outputs.
Figure 6.35 Eliminating redundant states: (a) original FSM, (b) equivalent FSM with fewer states, (c) the FSMs are indistinguishable from the outside, providing identical output behavior for any input sequence.
Two states are equivalent if:
they assign the same values to outputs, AND
for all possible sequences of inputs, the FSM outputs will be the same starting from either state.
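The definition can be tried out by simulation. The sketch below encodes the FSM of Figure 6.35(a) as two Python dictionaries, with the transitions reconstructed from the state sequences described in the text (an assumption, since the figure itself is not reproduced here), and compares the output sequences obtained from two different start states. Checking sequences only up to some length, as done here, is evidence of equivalence rather than a proof.

```python
from itertools import product

# (state, x) -> next state, reconstructed from the text's description of Figure 6.35(a)
NEXT = {
    ("S0", 0): "S0", ("S0", 1): "S1",
    ("S1", 0): "S2", ("S1", 1): "S1",
    ("S2", 0): "S2", ("S2", 1): "S3",
    ("S3", 0): "S0", ("S3", 1): "S3",
}
OUT = {"S0": 0, "S1": 1, "S2": 0, "S3": 1}   # value of y in each state

def outputs(start, inputs):
    """Output sequence seen when starting in `start` and applying `inputs`."""
    state, ys = start, [OUT[start]]
    for x in inputs:
        state = NEXT[(state, x)]
        ys.append(OUT[state])
    return ys

def same_outputs(s, t, max_len=6):
    """Compare two start states on every input sequence up to max_len edges."""
    return all(outputs(s, seq) == outputs(t, seq)
               for n in range(max_len + 1)
               for seq in product((0, 1), repeat=n))

print(outputs("S0", [1, 1, 0, 0]), outputs("S2", [1, 1, 0, 0]))
print(same_outputs("S0", "S2"), same_outputs("S0", "S1"))
```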
For large FSMs, visual inspection cannot guarantee that we've removed all redundant states; a more systematic approach is needed, which we now introduce.
Implication Tables
Intuitively, we know that two states cannot be equivalent if they produce different outputs for the same sequence of inputs. Consider the FSM in Figure 6.36, which is almost identical to the FSM in Figure 6.35 with a slight modification: in state S2, the FSM now outputs y=1 instead of y=0. States S0 and S2 therefore clearly are not equivalent, because they have different output values. States S1 and S3 produce the same output, but when we transition from either state to the corresponding next state, the output differs. For example, if the FSM is in state S1 and x becomes 0, the next state (S2) outputs y=1, but if the FSM had started in S3, the next state (S0) would output y=0. Thus, S1 and S3 cannot be equivalent, because the same input sequence results in a different output sequence.
Figure 6.36 A variant of the FSM in Figure 6.35. States S0 and S2 cannot be equivalent because they output different values, and states S1 and S3 can't be equivalent because they have nonequivalent next states for the same input values.
If two states' outputs are not equivalent, the two states clearly are not equivalent. Furthermore, if two states' next states are not equivalent for a given input value, then the two states are also not equivalent. Using these concepts of nonequivalent states, Table 6.2 describes an algorithm for reducing an FSM's number of states.
TABLE 6.2 Algorithm for state reduction.
Step | Description
1. Mark state pairs having different outputs as nonequivalent. | States having different outputs obviously cannot be equivalent.
2. For each unmarked state pair, write the next state pairs for the same input values. |
3. For each unmarked state pair, mark state pairs having nonequivalent next-state pairs as nonequivalent. Repeat this step until no change occurs, or until all states are marked. | States with nonequivalent next states for the same input values can't be equivalent. Each time through this step is called a pass.
4. Merge remaining state pairs. | Remaining state pairs must be equivalent.
When comparing all possible pairs of states by hand, using a graphical table ensures that we don't miss any pairs. Consider the FSM of Figure 6.35(a). The FSM has 4 states, therefore there are 4^2 = 16 possible state pairs. Figure 6.37(a) shows those possible pairs graphically in a table, with the states listed along the row and column headings. Each cell corresponds to a state pair. We can simplify table size by removing redundant cells (e.g., row S0, column S1 is the same as row S1, column S0) and removing meaningless cells along the diagonal of the table (state S0 is obviously equivalent to state S0). The reduced table is shown in Figure 6.37(b).
Figure 6.37 Table of state pairs: (a) original table comparing all pairs, (b) smaller table with only unique and relevant pairs, (c) table filled in by the state reduction algorithm.
Figure 6.37(c) steps through the state reduction algorithm of Table 6.2 for the FSM of Figure 6.35(a).
Step 1 involves looking at every table cell and marking that cell with a large "X" if the states for that cell have different outputs. We refer to such a cell as being marked. The first state pair (S1,S0) is not equivalent because S0 outputs y=0, while S1 outputs y=1. We then look at state pair (S2,S0), (S2,S1), and so on, and finally (S3,S2), marking state pairs having different outputs, resulting in the Xs shown in Figure 6.37(c).
Step 2 involves writing the next state pairs for each remaining unmarked cell. There are two unmarked cells:
(S2,S0) (circled in Figure 6.37(c)): When x=1, state S2's next state is S3, while state S0's next state is S1 (we see this by looking at the FSM in Figure 6.35(a)). Thus, we write "(S3,S1)" in that cell (the order doesn't matter), meaning that for states S2 and S0 to be equivalent, S3 and S1 must be equivalent. We then consider the case when input x=0, in which case the next states are S2 and S0, so we write "(S2,S0)" in that cell also.
(S3,S1): When x=0, the next states are S0 and S2, so we write (S0,S2) in the cell. For x=1, we write (S3,S1) in the cell.
Step 3 involves marking as nonequivalent any unmarked cells whose next state pairs are already marked as nonequivalent. Looking at cell (S2,S0), the next state pair (S3,S1) is not marked, nor is next state pair (S2,S0) (which happens to be the current cell), so we can't mark this cell. Likewise, for cell (S3,S1), the next state pair (S0,S2) is not marked, nor is the next state pair (S3,S1), so we can't mark this cell.
Because we made a pass through step 3 without any changes, we don't repeat step 3, and instead move on to step 4.
Step 4 involves declaring the unmarked state pairs as equivalent, so S2 and S0 are equivalent, and S3 and S1 are equivalent. To finalize step 4 of the algorithm, we combine the equivalent states in the FSM. After combining states S2 and S0, and combining states S3 and S1, we obtain the FSM in Figure 6.35(b).
The method we have just employed is known as the implication table method for state reduction.
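The four steps translate fairly directly into code. The following Python sketch is one possible rendering, not a reference implementation: the FSM is assumed to be given as a next-state dictionary and an output dictionary (here the FSM of Figure 6.35(a), with transitions reconstructed from the text), and the function name equivalent_pairs is hypothetical.

```python
from itertools import combinations

def equivalent_pairs(states, inputs, next_state, output):
    pairs = [frozenset(p) for p in combinations(states, 2)]
    # Step 1: mark pairs whose outputs differ.
    marked = {p for p in pairs if len({output[s] for s in p}) > 1}
    unmarked = [p for p in pairs if p not in marked]
    # Step 2: record the next state pairs of each unmarked pair, per input value.
    implied = {}
    for pair in unmarked:
        s, t = sorted(pair)
        implied[pair] = [frozenset((next_state[s, x], next_state[t, x]))
                         for x in inputs]
    # Step 3: mark pairs with a marked next state pair; repeat until no change.
    changed = True
    while changed:
        changed = False
        for pair in unmarked:
            if pair not in marked and any(n in marked for n in implied[pair]):
                marked.add(pair)
                changed = True
    # Step 4: the pairs that remain unmarked are equivalent.
    return {tuple(sorted(p)) for p in unmarked if p not in marked}

NEXT = {
    ("S0", 0): "S0", ("S0", 1): "S1",
    ("S1", 0): "S2", ("S1", 1): "S1",
    ("S2", 0): "S2", ("S2", 1): "S3",
    ("S3", 0): "S0", ("S3", 1): "S3",
}
OUT = {"S0": 0, "S1": 1, "S2": 0, "S3": 1}
print(equivalent_pairs(["S0", "S1", "S2", "S3"], (0, 1), NEXT, OUT))
```

A next state pair such as (S2,S2) becomes a one-element set in this sketch and can never be marked, which matches the observation that a state is always equivalent to itself.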
Naturally, not every FSM can have its number of states reduced. For example, let's use the implication table method on the FSM in Figure 6.36. With 4 states, the FSM's implication table will be the same size as the previous example, as shown in Figure 6.38(a). Step 1 marks state pairs with different outputs, shown in Figure 6.38(a). Step 2 lists, for each unmarked cell, the next state pairs for identical input values, as also shown in Figure 6.38(a).
In step 3's first pass, we first examine the cell for state pair (S2,S1). Naturally, the next state pair (S2,S2) is equivalent. The next state pair (S3,S1) is unmarked, so we cannot mark (S2,S1). We then examine the cell for pair (S3,S1), and find that the next pair (S0,S2) has its cell marked. This tells us that S3 and S1 cannot be equivalent (because they could transition to nonequivalent states for the same input value), so we mark the cell for (S3,S1). Similarly, we mark (S3,S2) since its first next state pair, (S0,S2), has its cell marked. Completing step 3's first pass results in the table of Figure 6.38(b).
Figure 6.38 Implication table for FSM in Figure 6.36: (a) table after initial setup and steps 1 and 2, (b) after step 3's first pass through the table, (c) after step 3's second and final pass through the table.
Because the table changed during the first pass (we marked two state pairs), we must make a second pass, because changes in the table may affect state pairs that we already looked at and left unmarked. In the second pass, we again look at state pair (S2,S1). Naturally, the next state pair (S2,S2) is equivalent. The next state pair (S3,S1), however, is now marked, and therefore we mark (S2,S1).
With all pairs in the table marked, as seen in Figure 6.38(c), we can conclude that no states in the FSM are equivalent, and thus we leave the FSM unchanged.
We now provide another example of state reduction.
EXAMPLE 6.13 Minimizing states in an FSM using an implication table
Consider the FSM in Figure 6.39(a). Unlike previous examples, this FSM has 5 states, resulting in more possible state pairs than in previous examples. The first task in minimizing the FSM's states is to construct an implication table so we can compare every state with each other as a state pair.
Figure 6.39 FSM needing state reduction: (a) original FSM, (b) implication table after steps 1 and 2.
In step 1 of our state reduction algorithm, we mark with an X the state pairs that are not equivalent because their outputs differ, as shown in Figure 6.39(b).
In step 2, we write in all the next state pairs for unmarked cells of the implication table, as shown in Figure 6.39(b). Since there are only two combinations of inputs (either x=0 or x=1), each unmarked cell will have two next state pairs.
In step 3's first pass, we mark each state pair if one of its next state pairs is marked. During our pass through the table, we will examine four state pairs. Starting with (S2,S1), we see that both of its next state pairs are unmarked. Looking at (S3,S0), we see that one of its next state pairs, (S3,S2), is marked, so we mark (S3,S0)'s cell. We also mark (S4,S0) because its next state pair (S4,S2) is marked. We leave (S4,S3) unmarked as both of its next state pairs are unmarked, thus completing the pass. Figure 6.40(a) reflects the results of our first pass through the implication table.
Because we marked new state pairs in the first pass, we conduct a second pass through step 3. During that pass, we find no new cells to mark, leaving the table unchanged. We thus move on to step 4.
In step 4, we declare the unmarked state pair (S2,S1) as equivalent, and the unmarked state pair (S4,S3) as equivalent. We combine states S2 and S1, and we combine states S4 and S3, resulting in the new FSM shown in Figure 6.40(b). Note that the two transitions with conditions x and x' from S0 could be replaced by one transition with no conditions.
Figure 6.40 Implication table and minimized FSM: (a) implication table after first pass, (b) minimized state machine with states S1 and S2 combined, and S3 and S4 combined.
In this example, by reducing the number of states from 5 down to 3, we have reduced the minimum state register size from 3 bits down to 2 bits, perhaps reducing circuit size.
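The minimum state register size for a given number of states is the smallest number of bits b with 2^b at least the number of states. A short Python sketch (the function name min_state_bits is hypothetical) makes it easy to see when removing states actually saves a register bit:

```python
def min_state_bits(num_states):
    """Smallest number of bits that can distinguish num_states states."""
    return max(1, (num_states - 1).bit_length())

for before, after in ((5, 3), (15, 12), (8, 4)):
    print(f"{before} -> {after} states: "
          f"{min_state_bits(before)} -> {min_state_bits(after)} bits")
```

Going from 5 states to 3 saves one bit, whereas going from 15 states to 12 saves none.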
Sometimes equivalent states may overlap. For example, assume that for some FSM with states {T0, T1, T2, T3, T4}, you find that state pairs (T0,T1), (T1,T2), and (T2,T0) are equivalent. How do you deal with the overlapping equivalencies? The answer is simple: the three states T0, T1, and T2 can be combined into a single state.
The implication table method is suitable for hand-optimizing small FSMs such as those introduced in the previous examples, but can quickly become unwieldy for FSMs with more states. Consider the 15-state FSM in Figure 6.41. A reduced implication table would require 14 rows and 14 columns, and 105 state pairs. With two combinations of inputs (namely, a = 0 or a = 1), each state pair would have two next state pairs, and, in the worst case, we would need to check 105*2 = 210 next state pairs during our first pass alone. What if the same FSM had four inputs (say, a, b, c, and d) instead of one? With four inputs, there would be 2^4 = 16 combinations of inputs (i.e., a'b'c'd', a'b'c'd, a'b'cd', ..., abcd) and up to 16 next state pairs in each cell in the implication table. If instead the FSM had, say, 100 states (a reasonable number), the implication table would have on the order of 100*100 = 10,000 state pairs.
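These counts follow from simple formulas: a reduced table for n states has n(n-1)/2 cells, and each cell can hold up to 2^k next state pairs for k inputs. The sketch below (with a hypothetical function name) evaluates both; note that the reduced table for 100 states has 4950 cells, about half of the 100*100 cells of the full table.

```python
from math import comb

def implication_table_size(num_states, num_inputs):
    """Cells in the reduced table, and worst-case next state pairs to check per pass."""
    pairs = comb(num_states, 2)
    next_pairs = pairs * 2 ** num_inputs
    return pairs, next_pairs

print(implication_table_size(15, 1))    # one input
print(implication_table_size(15, 4))    # four inputs
print(implication_table_size(100, 1))
```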
Figure 6.41 A 15-state FSM.
State reduction is therefore typically performed using automated tools. For smaller FSMs, the tools may implement the implication table method. For larger FSMs, the tools may need to resort to heuristics to avoid inordinately large table sizes and numbers of next state pairs.
Even when we reduce the number of states, we are not guaranteed that such state reduction actually reduces the size of the resulting logic. One reason is that reducing the states might not reduce the number of required state-register bits; reducing the states from 15 down to 12 does not reduce the minimum state register size, which is four bits in either case. Another reason is that, even if the state reduction reduces the state register size, the combinational logic size could possibly increase with a smaller state register due to the logic having to decode the state bits. Thus, automated state reduction tools may need to actually implement the combinational logic before and after state reduction, to determine if state reduction ultimately yields improvements for a particular FSM.
State Encoding
State encoding is the task of assigning a unique bit representation for each state in an FSM. Some state encodings may optimize the resulting controller circuit by reducing circuit size, or may trade off size and performance in the circuit. We now discuss several methods for state encoding.
Alternative Minimum-Bitwidth Binary Encodings
Previously, we assigned a unique binary encoding to each state in an FSM using the
fewest number of bits possible, representing a minimum-bitwidth binary encoding. If
there were four states, we used two bits. If there were five, six, seven, or eight states, we
used three bits. The encoding represented the state in the controller's state register. There
are many ways to map minimum-bitwidth binary encodings to a set of states. Say we are
given four states, A, B, C, and D. One encoding is A:00, B:01, C:10, D:11. Another
encoding is A:01, B:10, C:11, D:00. In fact, there are 4*3*2*1 = 4! = 24 possible
encodings into two bits (4 encoding choices for the first state, 3 for the next, 2
for the next, and 1 for the last state). For eight states, there are 8!, or over 40,000, possible
encodings into three bits. For N states, there are N! (N factorial) possible encodings, a
huge number for any N greater than 10 or so. One encoding may result in less
combinational logic than another encoding. Automated tools may try several different
encodings (but not all N! encodings) to reduce combinational logic in the controller.
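As a quick check of the counting argument above, the following minimal Python sketch enumerates every assignment of the four two-bit codes to states A, B, C, and D. The variable names are illustrative only and do not come from the text.

```python
from itertools import permutations
from math import factorial

states = ["A", "B", "C", "D"]
codes = ["00", "01", "10", "11"]

# Each permutation of the codes is one way to give every state a unique code.
encodings = [dict(zip(states, perm)) for perm in permutations(codes)]

print(len(encodings))   # number of two-bit encodings of four states
print(factorial(8))     # number of three-bit encodings of eight states
```

Both encodings named in the text (A:00, B:01, C:10, D:11 and A:01, B:10, C:11, D:00) appear in the enumerated list.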
EXAMPLE 6.14 Alternative binary encoding for three-cycles-high laser timer
In Example 3.7, we encoded states using a straightforward binary encoding, starting with
00, then 01, then 10, and then 11. The resulting design had 15 gate inputs (ignoring
inverters). We can try instead the alternative binary encoding shown in Figure 6.42.
Table 6.3 provides the state table for the new encoding, showing the differences from
the original encoding.
Figure 6.42 Laser timer state diagram with alternative binary state encoding.
From the state table, we obtain the following equations for the three combinational
logic outputs of a controller:
x = s1 + s0 (note from the table that x=1 if s1=1 or s0=1)
n1 = s1's0b' + s1's0b + s1s0b' + s1s0b
n1 = s1's0 + s1s0
n1 = s0
n0 = s1's0'b + s1's0b + s1's0b'
n0 = s1's0'b + s1's0b + s1's0b + s1's0b'
n0 = s1'b(s0' + s0) + s1's0(b + b')
n0 = s1'b + s1's0
The resulting circuit would have only 8 gate inputs: 2 for x, 0 for n1 (n1 is connected
to s0 directly with a wire), and 4 + 2 for n0. The 8 gate inputs is significantly less than
the 15 gate inputs needed for the binary encoding of Example 3.7. This encoding reduces
size without any increase in delay, thus representing an optimization.
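The minimized equations of this example can be checked against the intended behavior with a short script. The sketch below assumes the alternative encoding Off=00, On1=01, On2=11, On3=10 (read from the state table for this example); the function names are hypothetical.

```python
from itertools import product

# Assumed alternative encoding: Off=00, On1=01, On2=11, On3=10.
NEXT = {(0, 1): (1, 1), (1, 1): (1, 0), (1, 0): (0, 0)}

def table_row(s1, s0, b):
    """Behavioral row (x, n1, n0) of the three-cycles-high laser timer."""
    if (s1, s0) == (0, 0):          # Off: leave only when the button is pressed
        return (0, 0, 1) if b else (0, 0, 0)
    n1, n0 = NEXT[(s1, s0)]         # On1 -> On2 -> On3 -> Off, regardless of b
    return (1, n1, n0)

def equations(s1, s0, b):
    """Minimized equations from the example."""
    x = s1 | s0
    n1 = s0
    n0 = ((1 - s1) & b) | ((1 - s1) & s0)
    return (x, n1, n0)

mismatches = [r for r in product((0, 1), repeat=3) if table_row(*r) != equations(*r)]
print(mismatches)   # an empty list means the equations agree with the table
```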

TABLE 6.3 State table for laser timer controller with alternative encoding.
          Inputs       Outputs
         s1 s0 b      x  n1 n0
Off       0  0 0      0  0  0
          0  0 1      0  0  1
On1       0  1 0      1  1  1
          0  1 1      1  1  1
On2       1  1 0      1  1  0
          1  1 1      1  1  0
On3       1  0 0      1  0  0
          1  0 1      1  0  0
One-Hot Encoding
There is no requirement that we encode a set of states using the fewest number of bits.
For example, we could encode four states A, B, C, and D using three bits instead of just
two bits, such as A:000, B:011, C:110, D:111. Using more bits requires a larger state
register, but possibly less logic. A popular encoding scheme is called one-hot encoding,
wherein we use the same number of bits for encoding as there are states, and each bit
corresponds to exactly one state. For example, a one-hot encoding of four states A, B, C,
and D uses four bits, such as A:0001, B:0010, C:0100, D:1000. The main advantage of
one-hot encoding is speed: because the state can be detected from just one bit and thus
need not be decoded using an AND gate, the controller's next state and output logic may
involve fewer gates and/or gates with fewer inputs, resulting in a shorter delay.
EXAMPLE 6.15 One-hot encoding example
Consider the simple FSM of Figure 6.43, which repeatedly generates the output
sequence 0, 1, 1, 1, 0, 1, 1, 1, etc. A straightforward minimal binary encoding is
shown, which is then crossed out and replaced with a one-hot encoding.
Figure 6.43 FSM for given sequence. Inputs: none; Outputs: x.
The binary encoding results in the state table shown in Table 6.4. The resulting
equations are:
n1 = s1's0 + s1s0'
n0 = s0'
x = s1 + s0
TABLE 6.4 State table using binary encoding.
         Inputs     Outputs
         s1 s0     n1 n0  x
A         0  0      0  1  0
B         0  1      1  0  1
C         1  0      1  1  1
D         1  1      0  0  1
The one-hot encoding results in the state table shown in Table 6.5. The resulting
equations are:
n3 = s2
n2 = s1
n1 = s0
n0 = s3
x = s3 + s2 + s1
TABLE 6.5 State table using one-hot encoding.
           Inputs          Outputs
         s3 s2 s1 s0    n3 n2 n1 n0  x
A         0  0  0  1     0  0  1  0  0
B         0  0  1  0     0  1  0  0  1
C         0  1  0  0     1  0  0  0  1
D         1  0  0  0     0  0  0  1  1
Figure 6.44 shows the resulting circuits for each encoding. The binary encoding yields
more gates, but more importantly, requires two levels of logic. The one-hot encoding in
this example requires only one level of logic. Notice the logic to generate the next state
is just wires in this example (other examples may require some logic). Figure 6.44(c)
illustrates that the one-hot encoding has less delay, meaning we could use a faster clock
frequency for that circuit.
Figure 6.44 One-hot encoding can reduce delay: (a) minimum binary encoding, (b) one-hot
encoding, (c) though total sizes may be roughly equal (one-hot encoding uses fewer gates
but more flip-flops), one-hot yields a shorter critical path.
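To see that the two encodings of this example describe the same behavior, the following Python sketch steps both sets of equations for eight clock cycles, starting from state A (00 in binary, 0001 in one-hot). It is a cycle-level illustration under those assumptions, not a gate-level model, and the function names are invented for this sketch.

```python
def step_binary(s1, s0):
    """Binary encoding: returns (n1, n0, x) from the binary-encoding equations."""
    n1 = ((1 - s1) & s0) | (s1 & (1 - s0))
    n0 = 1 - s0
    x = s1 | s0
    return n1, n0, x

def step_one_hot(s3, s2, s1, s0):
    """One-hot encoding: the next state is just a rewiring of the state bits."""
    x = s3 | s2 | s1
    return (s2, s1, s0, s3), x      # (n3, n2, n1, n0) = (s2, s1, s0, s3)

def run(cycles=8):
    bin_state, hot_state = (0, 0), (0, 0, 0, 1)   # both start in state A
    bin_out, hot_out = [], []
    for _ in range(cycles):
        n1, n0, x = step_binary(*bin_state)
        bin_out.append(x)
        bin_state = (n1, n0)
        hot_state, x = step_one_hot(*hot_state)
        hot_out.append(x)
    return bin_out, hot_out

binary_trace, one_hot_trace = run()
print(binary_trace)
print(one_hot_trace)
```

Both traces repeat the pattern 0, 1, 1, 1; only the amount of logic between the state register and the outputs differs.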
EXAMPLE 6.16 Three-cycles-high laser timer using one-hot encoding
In Example 3.7, we encoded states using a straightforward binary encoding, starting with
00, then 01, then 10, and then 11. Here, we'll perform a one-hot encoding of the four
states, requiring four bits, as shown in Figure 6.45.
Figure 6.45 One-hot encoding of laser timer.
Table 6.6 shows a state table for the FSM of Figure 6.45, using the one-hot encoding of
the states. We don't show all possible rows, since the table would be too large.
The next step is to design the combinational logic. Deriving equations for each output
directly from the table (assuming all other input combinations are don't-cares), and
minimizing those equations algebraically, results in the following:
x = s3 + s2 + s1
n3 = s2
n2 = s1
n1 = s0*b
n0 = s0*b' + s3
This circuit would require 3+0+0+2+(2+2) = 9 gate inputs. Thus, the circuit has fewer
gate inputs than the original binary encoding's 15 gate inputs, but one must also consider
that a one-hot encoding uses more flip-flops.
TABLE 6.6 State table for laser timer controller with one-hot encoding.
              Inputs             Outputs
         s3 s2 s1 s0  b     x  n3 n2 n1 n0
Off       0  0  0  1  0     0   0  0  0  1
          0  0  0  1  1     0   0  0  1  0
On1       0  0  1  0  0     1   0  1  0  0
          0  0  1  0  1     1   0  1  0  0
On2       0  1  0  0  0     1   1  0  0  0
          0  1  0  0  1     1   1  0  0  0
On3       1  0  0  0  0     1   0  0  0  1
          1  0  0  0  1     1   0  0  0  1
More importantly, the circuit with one-hot encoding is slightly faster. The critical path for
that circuit is n0 = s0*b' + s3. The critical path for the circuit with regular binary
encoding is n0 = s1's0'b + s1s0'. The regular binary encoded circuit requires a 3-input
AND gate feeding into a 2-input OR gate, whereas the one-hot encoded circuit has a
2-input AND gate feeding into a 2-input OR gate. Because a 2-input AND gate actually
has slightly less delay than a 3-input AND gate, the one-hot encoded circuit has a shorter
critical path.
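The one-hot equations of this example can be exercised with a small cycle-by-cycle simulation. In the sketch below the state is the tuple (s3, s2, s1, s0), the controller is assumed to start in Off (0001), and the helper names are illustrative.

```python
def laser_step(state, b):
    """One clock cycle of the one-hot laser timer; returns (next_state, x)."""
    s3, s2, s1, s0 = state
    x = s3 | s2 | s1
    n3 = s2
    n2 = s1
    n1 = s0 & b
    n0 = (s0 & (1 - b)) | s3
    return (n3, n2, n1, n0), x

def simulate(button_samples):
    state = (0, 0, 0, 1)            # Off, one-hot code 0001 (assumed start state)
    xs = []
    for b in button_samples:
        state, x = laser_step(state, b)
        xs.append(x)
    return xs

trace = simulate([0, 1, 0, 0, 0, 0])
print(trace)   # x goes high for three cycles after the button press
```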
For examples with more states, the critical path reductions from one-hot encoding may be
even greater, and reductions in logic size may also be more pronounced. At some point,
of course, one-hot encoding results in too big of a state register. For example, an FSM
with 1000 states would require a 10-bit state register for a binary encoding, but would
require a 1000-bit state register for a one-hot encoding, which is probably too big to
consider. In such cases, we might consider encodings using a number of bits in between that
for a binary encoding and that for a one-hot encoding.
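The register sizes quoted above follow directly from the number of states. A minimal sketch, with hypothetical helper names:

```python
def binary_bits(num_states):
    """Minimum state register width for a binary encoding."""
    return max(1, (num_states - 1).bit_length())

def one_hot_bits(num_states):
    """A one-hot encoding uses one flip-flop per state."""
    return num_states

for n in (4, 12, 15, 1000):
    print(n, binary_bits(n), one_hot_bits(n))
```

For 1000 states this gives 10 bits versus 1000 bits, and it also shows that 12 and 15 states both need a four-bit register under a binary encoding.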
Output Encoding
Some problem descriptions require us to generate a particular sequence of values on a set
of outputs. For example, a problem might require us to repeatedly output the following
sequence on a pair of outputs x and y: 00, 11, 10, 01. We can capture the behavior using
the FSM with four states, A, B, C, and D, as shown in Figure 6.46. A straightforward
binary encoding for those states would be A:00, B:01, C:10, D:11, as shown in Figure 6.46.
If we design a controller for this system, we'll have a two-bit state register, logic to
determine the next state, and logic to generate the output from the present state. But might
it make more sense to use a state encoding that is identical to the output values in each
state? If we use such an encoding, then we will still have a two-bit state register, and we
will still have logic to generate the next state, but we won't have logic to generate the
output from the present state. Instead, each output will simply be connected by a wire to a
bit in the state register, thus reducing the required number of logic gates.
Figure 6.46 FSM for given sequence. Inputs: none; Outputs: x, y.
If an FSM has at least as many outputs as bits needed for a binary encoding, and if each
state has a unique output combination, then we may consider using a state's output
combination as the state's encoding. Such an encoding may reduce the amount of logic
required, by eliminating the need for logic to generate the outputs from the present state
encoding; that logic is reduced to just wires.
Output encoding requires that the system have at least as many outputs as it has bits
in a minimal binary encoding, otherwise the outputs can't represent enough encodings to
uniquely identify each state. Furthermore, we can't use output encoding if the desired
output sequence contains the same output values in two different states, since every
state's encoding must be unique. For example, if we wish to repeatedly generate the
sequence 00, 11, 01, 11, we cannot use output encoding, because if we did, then two
states would have the same encoding. Even in such a situation, though, we might try to
output encode as many states as possible.
EXAMPLE 6.17 Sequence generator using output encoding
Example 3.10 involved design of a sequence generator, in which we were to generate the
sequence 0001, 0011, 1100, 1000 on a set of four outputs, as shown in Figure 6.47. In that
example, we encoded the states using a two-bit binary encoding, with A being 00, B being
01, C being 10, and D being 11. In this example, we'll instead use output encoding. The
outputs have enough bits, four, whereas we need at least two bits to encode the four
states. The sequence also has a different output combination for each state. Thus, we can
consider output encoding for this example.
Figure 6.47 Sequence generator FSM. Inputs: none; Outputs: w, x, y, z.
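The two conditions for output encoding (enough output bits, and a unique output combination per state) are easy to test mechanically. The following sketch uses a hypothetical helper, can_output_encode, that takes one output string per state.

```python
def can_output_encode(outputs_per_state):
    """True if each state's output combination can double as its state code."""
    num_states = len(outputs_per_state)
    width = len(outputs_per_state[0])
    min_bits = max(1, (num_states - 1).bit_length())   # bits in a minimal binary encoding
    all_unique = len(set(outputs_per_state)) == num_states
    return width >= min_bits and all_unique

print(can_output_encode(["00", "11", "10", "01"]))          # the xy sequence
print(can_output_encode(["00", "11", "01", "11"]))          # 11 appears in two states
print(can_output_encode(["0001", "0011", "1100", "1000"]))  # the wxyz sequence
```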
Table 6.7 shows a partial state table for the sequence generator, using an output
encoding. Notice that the outputs themselves, w, x, y, and z, don't need to appear in the
table, since they will be the same as s3, s2, s1, and s0. We use a partial table to avoid
having to show all 16 rows, and we assume all rows not shown represent don't cares.
TABLE 6.7 Partial state table for sequence generator controller using output encoding.
           Inputs          Outputs
         s3 s2 s1 s0    n3 n2 n1 n0
A         0  0  0  1     0  0  1  1
B         0  0  1  1     1  1  0  0
C         1  1  0  0     1  0  0  0
D         1  0  0  0     0  0  0  1
From the table, we derive equations for each output:
n3 = s1 + s2
n2 = s1
n1 = s1's0
n0 = s1's0 + s3s2'
We obtained those equations by looking at all the 1s for a particular output, and visually
determining a minimal input equation that would generate those 1s and 0s for the other
shown column entries (all other output values, not shown, are don't cares).
Figure 6.48 shows the final circuit. Notice that there is no output logic; the outputs w, x,
y, and z connect directly to the state register.
Figure 6.48 Sequence generator controller with output encoding.
Compared to the circuit obtained in Example 3.10 using a binary encoding, the output
encoded circuit in Figure 6.48 actually appears to use more transistors. In other examples,
an output encoded circuit might use fewer transistors.
Whether one-hot encoding, binary encoding, output encoding, or some variation thereof
results in fewest transistors or a shorter critical path depends on the example itself. Thus,
modern tools may try a variety of different encodings for a given problem to see which
works best.
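As a sanity check on the output-encoded sequence generator, the sketch below steps its next-state equations starting from state A (0001). Because the outputs w, x, y, z are wired directly to s3, s2, s1, s0, the state itself is the output. The function name is illustrative.

```python
def seq_step(state):
    """Next-state logic of the output-encoded sequence generator."""
    s3, s2, s1, s0 = state
    n3 = s1 | s2
    n2 = s1
    n1 = (1 - s1) & s0
    n0 = ((1 - s1) & s0) | (s3 & (1 - s2))
    return (n3, n2, n1, n0)

state = (0, 0, 0, 1)                 # state A; wxyz = s3 s2 s1 s0
sequence = []
for _ in range(8):
    sequence.append("".join(str(bit) for bit in state))
    state = seq_step(state)
print(sequence)
```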
Moore versus Mealy FSMs
Basic Mealy Architecture
The FSMs described in this book have thus far all been a type of FSM known as a Moore
FSM. A Moore FSM is an FSM whose outputs are a function of the FSM's state. An
alternative type of FSM is a Mealy FSM. A Mealy FSM is an FSM whose outputs are a
function of the FSM's state and inputs. Sometimes a Mealy FSM results in fewer states
than a Moore FSM, representing an optimization. Sometimes those fewer states come at
the expense of timing issues that must be handled, representing a tradeoff.
Recall the standard controller architecture of Figure 3.48, reproduced in Figure 6.49.
The architecture shows one block of combinational logic, responsible for converting the
present state and external inputs into the next state and external outputs.
Figure 6.49 Standard controller architecture: general view.
Because a Moore FSM's outputs are solely a function of the present state (and not the
external inputs), we can refine the architecture to have two combinational logic blocks:
the next-state logic block converts the present state and external inputs into a next state,
and the output logic block converts the present state (but not the external inputs) into
external outputs, as shown in Figure 6.50(a).
In contrast, a Mealy FSM's outputs are a function of both the present state and the
external inputs. Thus, the output logic block for a Mealy FSM takes both the present state
and the external FSM inputs as input, rather than just the present state, as shown in
Figure 6.50(b). The next-state logic is the same as for a Moore FSM, taking as input both
the present state and the external FSM inputs.
Figure 6.50 Controller architectures for: (a) a Moore FSM, (b) a Mealy FSM.
Graphically, the FSM output assignments of a Mealy FSM would be listed with each
transition, rather than each state, because each transition represents a present state and a
particular input value. Figure 6.51 shows a two-state Mealy FSM with an input b and an
output x. When in state S0 and b=0, the FSM outputs x=0 and stays in state S0, as
indicated by the transition labeled "b' / x=0". When in state S0 and b=1, the FSM outputs
x=1 and goes to state S1. We use the "/" symbol to separate the transition's input
conditions from the output assignments; the "/" does not mean "divide" here. Because the
transition from S1 to S0 is taken no matter what the input value, we list the transition as
simply "/x=0", meaning there's no input condition, but there is an output assignment.
Figure 6.51 A Mealy FSM associates outputs with transitions. Inputs: b; Outputs: x.
Mealy FSMs May Have Fewer States
The minor difference between a Mealy and a Moore FSM, namely, that a Mealy FSM's
output is a function of the state and the current inputs, can lead to fewer states for some
behaviors when implemented as a Mealy machine. For example, consider the simple soda
dispenser controller FSM in Figure 6.52(a). Setting d=1 dispenses a soda. The FSM starts
in state Init, which sets d=0 and sets an output clear=1, which we assume clears a device
keeping count of the amount of money deposited into the soda dispenser machine. The
FSM transitions to state Wait, where the FSM waits to be informed, through the enough
input, that enough money has been deposited. Once enough money has been deposited,
the FSM transitions to state Disp, which dispenses a soda by setting output d=1, and the
FSM then returns to state Init. (Readers who have read Chapter 5 may notice this example
is a simplified version of Example 5.1; familiarity with that example is not required,
though, for the present discussion.)
Figure 6.52 FSMs for soda dispenser controller: (a) Moore FSM has actions in states, (b) Mealy
FSM has actions on transitions, resulting in this case in fewer states. Inputs: enough (bit);
Outputs: d, clear (bit). Each part includes a timing diagram of clk, enough, state, clear, and d.
Figure 6.52(b) shows a Mealy FSM for the same controller. The initial state Init has
no actions itself, but rather has a conditionless transition to state Wait that has the
initialization actions d=0 and clear=1. In state Wait, a transition with condition enough'
returns to state Wait without any actions listed. Another transition with condition enough
has the action d=1, and takes the FSM back to the Init state. Notice that the Mealy FSM
does not need the Disp state to set d=1; that action occurs on a transition. Thus, we were
able to create a Mealy FSM with fewer states than the Moore FSM.
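The difference between the two soda dispenser FSMs can be illustrated with a cycle-level sketch. The model below samples one value of enough per clock cycle and records d in that cycle; it ignores sub-cycle timing. The dictionary and function names are assumptions made for this sketch.

```python
# Moore FSM: outputs belong to states; three states are needed.
MOORE = {
    "Init": ({"d": 0, "clear": 1}, lambda enough: "Wait"),
    "Wait": ({"d": 0, "clear": 0}, lambda enough: "Disp" if enough else "Wait"),
    "Disp": ({"d": 1, "clear": 0}, lambda enough: "Init"),
}

def mealy_transition(state, enough):
    """Mealy FSM: outputs belong to transitions; only Init and Wait are needed."""
    if state == "Init":
        return {"d": 0, "clear": 1}, "Wait"
    if enough:
        return {"d": 1, "clear": 0}, "Init"
    return {"d": 0, "clear": 0}, "Wait"

def run_moore(inputs):
    state, ds = "Init", []
    for enough in inputs:
        outputs, next_fn = MOORE[state]
        ds.append(outputs["d"])
        state = next_fn(enough)
    return ds

def run_mealy(inputs):
    state, ds = "Init", []
    for enough in inputs:
        outputs, state = mealy_transition(state, enough)
        ds.append(outputs["d"])
    return ds

enough_per_cycle = [0, 0, 1, 0, 0]
moore_d = run_moore(enough_per_cycle)
mealy_d = run_mealy(enough_per_cycle)
print(moore_d)   # d rises one cycle after enough
print(mealy_d)   # d rises in the same cycle as enough
```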
The Mealy state diagram in Figure 6.52(b) uses a convention similar to the convention
we used for Moore FSMs (Section 3.4), namely, that any outputs not explicitly
assigned on a transition are implicitly assigned a 0. As with Moore FSMs, we still list an
assignment to 0 explicitly if the assignment is key to the FSM's behavior (such as the
assignment of d=0 in Figure 6.52(b)).
EXAMPLE 6.18 Beeping wristwatch FSM using a Mealy machine
Create an FSM for a wristwatch that can display one of four registers by setting two
outputs s1 and s0, which control a 4x1 multiplexer that passes one of the four registers
through. The four registers correspond to the watch's present time (s1s0=00), the alarm
setting (01), the date (10), and a stopwatch (11). The FSM should sequence to the next
register, in the order listed above, each time a button b is pressed (assume b is
synchronized with the clock so as to be high for only 1 clock cycle on each unique button
press). The FSM should set an output p to 1 each time the button is pressed, causing an
audible beep to sound.
Figure 6.53 FSM for a wristwatch with beeping behavior (p=1) when button is pressed
(b=1): (a) Mealy, (b) Moore. Inputs: b; Outputs: s1, s0, p.
Figure 6.53(a) shows a Mealy FSM describing the desired behavior. Notice that the
FSM easily captures the beeping behavior simply by setting p=1 on the transitions that
correspond to button presses. In the Moore FSM of Figure 6.53(b), we had to add an extra
state in between each pair of states in Figure 6.53(a), with each extra state having the
action p=1 and having a conditionless transition to the next state.
Notice that the Mealy FSM has fewer states than the Moore machine. A drawback is that
we aren't guaranteed a beep will last at least one clock cycle, due to timing issues that we
will now discuss.
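The Mealy wristwatch behavior can be sketched as a single transition function. The sketch below assumes, since the figure labels are not fully legible here, that a button-press transition outputs the current register's select value together with p=1 and then moves to the next register; the names REGISTERS and wristwatch_step are hypothetical.

```python
REGISTERS = ["00", "01", "10", "11"]   # time, alarm, date, stopwatch

def wristwatch_step(state, b):
    """Mealy transition: returns (s1s0, p, next_state); state indexes REGISTERS."""
    select = REGISTERS[state]          # assumption: outputs show the current register
    if b:
        return select, 1, (state + 1) % 4
    return select, 0, state

state, trace = 0, []
for b in [0, 1, 0, 1, 0]:
    select, p, state = wristwatch_step(state, b)
    trace.append((select, p))
print(trace)
```

Only four states are needed because the beep is attached to the transitions taken when b=1.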
Timing Issues with Mealy FSMs
Mealy FSM outputs are not synchronized with clock edges, but rather can change in
between clock edges if an input changes. For example, consider the timing diagram
shown in Figure 6.52(a) for a soda dispenser's Moore FSM. Note that the output d
becomes 1 not right after the input enough became 1, but rather on the first clock edge
after enough became 1. In contrast, the timing diagram for the Mealy FSM in Figure
6.52(b) shows that the output d becomes 1 right after the input enough becomes 1.
Moore outputs are synchronized with the clock; in particular, Moore outputs only change
on entering a new state, which means Moore outputs only change slightly after a rising
clock edge loads a new state into the state register. In contrast, Mealy outputs can change
not just on entering a new state, but also any time an input changes, because Mealy
outputs are a function of both the state and the inputs. We took advantage of this fact to
eliminate the Disp state from the soda dispenser's Mealy FSM in Figure 6.52(b). Notice,
however, in the timing diagram that the d output of the Mealy FSM does not stay 1 for a
complete clock cycle. If we are unsure as to whether d's high time is long enough, we
could include a Disp state in the Mealy FSM. That state would have a single transition,
with no condition and with action d=1, pointing back to state Init. In that case, d would
be 1 for longer than one clock cycle (but less than two cycles).
The Mealy FSM feature of outputs being a function of states and inputs, which
enables the reduction in number of states in some cases, also has an undesirable
characteristic: the outputs may glitch if the inputs glitch in between clock cycles. A designer
using a Mealy FSM should determine whether such glitching could pose a problem in a
particular circuit. One solution to the glitching is to insert flip-flops between an
asynchronous Mealy FSM's inputs and the FSM logic, or between the FSM logic and the outputs.
Such flip-flops make the Mealy FSM synchronous, and the outputs will change at
predictable intervals. Of course, such flip-flops introduce a one clock cycle delay.
Implementing a Mealy FSM
We create a controller implementing a Mealy FSM in nearly the identical way that we
created a controller for Moore FSMs in Section 3.4, using the method of Table 3.2. The
only difference is that when we create a state table, the FSM outputs' values for all the
rows of a particular state won't necessarily be identical. For example, Table 6.8 shows a
state table for the Mealy FSM of Figure 6.52(b). Notice that the output d should be 0 in
state Wait (s0=1) if enough=0, but should be 1 if enough=1. In contrast, in a Moore state
table, an output's values were identical within a given state. Given the state table of
Table 6.8, we would proceed to implement the combinational logic in the same manner as
described in Section 3.4.
TABLE 6.8 Mealy state table for soda dispenser.
           Inputs        Outputs
         s0 enough     n0  d  clear
Init      0    0        1  0    1
          0    1        1  0    1
Wait      1    0        1  0    0
          1    1        0  1    0
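As an illustration of that last step, the sketch below compares the soda dispenser's Mealy state table against one possible set of equations. The equations n0 = s0' + enough', d = s0*enough, and clear = s0' are derived here for illustration and are not given in the text.

```python
# Mealy state table for the soda dispenser: (s0, enough) -> (n0, d, clear)
TABLE_6_8 = {
    (0, 0): (1, 0, 1),
    (0, 1): (1, 0, 1),
    (1, 0): (1, 0, 0),
    (1, 1): (0, 1, 0),
}

def mealy_logic(s0, enough):
    """Candidate combinational logic (an assumption, derived by inspection)."""
    n0 = (1 - s0) | (1 - enough)
    d = s0 & enough          # d depends on an input as well as the state
    clear = 1 - s0
    return n0, d, clear

agrees = all(mealy_logic(*inputs) == outputs for inputs, outputs in TABLE_6_8.items())
print(agrees)
```

Note that d depends on both s0 and enough, which is exactly what makes this a Mealy rather than a Moore output.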
Combining Moore and Mealy FSMs
Designers often utilize FSMs that are a combination of Moore and Mealy types. Such a
combination allows the designer to specify some actions in states, and others on
transitions. Such a combination provides the reduced number of states advantage of a Mealy
FSM, yet avoids having to replicate a state's actions on every outgoing transition of a
state. This simplification is really just a convenience to a designer describing the FSM;
the underlying implementation will be the same as for the Mealy FSM having replicated
actions on a state's outgoing transitions.
EXAMPLE 6.19 Beeping wristwatch FSM using a combined Moore/Mealy machine
Figure 6.54 shows a combined Moore/Mealy FSM state diagram describing the
beeping wristwatch of Example 6.18. The FSM has the same number of states as
did the Mealy FSM in Figure 6.53(a), because the FSM still associates the beep
behavior p=1 with transitions, avoiding the need for extra states to describe the
beep. But the combined FSM state diagram is easier to comprehend than the
Mealy FSM state diagram, because the assignments to s1s0 are associated with
each state, and not duplicated on every outgoing transition.
Figure 6.54 Combining Moore and Mealy FSMs yields a simpler wristwatch FSM.
Inputs: b; Outputs: s1, s0, p.
6.4 DATAPATH COMPONENT TRADEOFFS
In Chapter 4, we created several components that are useful in datapaths. In that chapter,
we described the most basic, easy to understand versions of those components. In this
section, we describe methods to build faster or smaller versions of some of those
components.
Faster Adders
Adding two numbers is an extremely common operation in digital circuits, so it makes
sense for us to try to create an adder that is faster than a carry-ripple adder. Recall that a
carry-ripple adder requires that the carry bits ripple through all the full-adders before all
outputs are correct. The longest path through the circuit, shown in Figure 6.55, is known
as the circuit's critical path. Since each full-adder has a delay of two gate-delays,
a 4-bit carry-ripple adder has a delay of 4 * 2 = 8 gate-delays. A 32-bit carry-ripple
adder's delay is 32 * 2 = 64 gate-delays. That's rather slow, but the nice thing about a
carry-ripple adder is that it doesn't require very many gates. If a full-adder uses 5 gates,
then a 4-bit carry-ripple adder requires only 4 * 5 = 20 gates, and a 32-bit carry-ripple
adder would only require 32 * 5 = 160 gates.
Figure 6.55 4-bit carry-ripple adder, with the longest path (the critical path) shown.

We would like to create an adder whose delay is much closer to that of just a few gates,
perhaps about 5 or 6 gate-delays, at the expense of more gates.
Two-Level Logic Adder
One obvious way to create a faster adder at the expense of more gates is to use our
earlier-defined two-level combinational logic design process. An adder designed using
two levels of logic has a delay of only two gate-delays. That's certainly fast. But recall
from an earlier figure that building an N-bit adder using two levels of logic results in
excessively large circuits as N increases beyond 8 or so. To be sure you get this point, let's
restate the previous sentence slightly:
Building an N-bit adder using two levels of logic results in large circuits as N
increases beyond 8 or so.
For example, we estimated (in an earlier chapter) that a two-level 16-bit adder would
require about 2 million transistors, and that a two-level 32-bit adder would require about
100 billion transistors.
On the other hand, building a 4-bit adder using two levels of logic results in a big, but
reasonably sized adder of about 100 gates, as was shown earlier. We could build a larger
adder by cascading such fast 4-bit adders together. Say we wanted an 8-bit adder. We
could build this by cascading two fast 4-bit adders together, as shown in Figure 6.56. If
each 4-bit adder is built from two levels of logic, then each 4-bit adder has a delay of
2 gate-delays. The 4-bit adder on the right takes 2 gate-delays to generate the sum and
carry-out bits, after which the 4-bit adder on the left takes another 2 gate-delays to
generate its outputs, resulting in a total delay of 2 + 2 = 4 gate-delays. For a 32-bit adder
built from eight 4-bit adders, the delay would be 8 * 2 = 16 gate-delays, and the size
would be about 8 * 100 gates = 800 gates. That's much better than the 32 * 2 = 64
gate-delays of a carry-ripple adder, though the improved speed comes at the expense of
more gates than the 32 * 5 = 160 gates of the carry-ripple adder. Which design is better?
The answer depends on your requirements: the design using two-level logic 4-bit adders
is better if you require more speed and can afford the extra gates, whereas the design
using carry-ripple 4-bit adders is better if you don't need the speed or can't afford the
extra gates. It's a tradeoff.
Figure 6.56 8-bit adder built from two fast 4-bit adders.
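The delay and size figures in this comparison are simple products, which the following sketch reproduces. It uses the text's assumptions (a full-adder has a delay of 2 gate-delays and uses 5 gates; a two-level 4-bit adder has a delay of 2 gate-delays and uses about 100 gates); the function names are illustrative.

```python
def carry_ripple(bits, fa_delay=2, fa_gates=5):
    """Delay (gate-delays) and size (gates) of an N-bit carry-ripple adder."""
    return {"delay": bits * fa_delay, "gates": bits * fa_gates}

def cascaded_two_level(bits, block_bits=4, block_delay=2, block_gates=100):
    """Adder built by cascading two-level 4-bit adders; bits must be a multiple of 4."""
    blocks = bits // block_bits
    return {"delay": blocks * block_delay, "gates": blocks * block_gates}

print(carry_ripple(4), carry_ripple(32))
print(cascaded_two_level(8), cascaded_two_level(32))
```

For 32 bits, this gives 64 gate-delays and 160 gates for carry-ripple versus 16 gate-delays and about 800 gates for the cascaded design.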
Carry-Lookahead Adder
A carry-lookahead adder improves on the speed of a carry-ripple adder, but without using as
many gates as a two-level logic adder. The basic idea is to "look ahead" into lower stages to
determine whether a carry will be created in the present stage. This lookahead concept is
very elegant and generalizes to other problems. We will therefore spend some time
introducing the intuition underlying lookahead. Consider the addition of two 4-bit numbers
shown in Figure 6.57(b), with the carries in each column labeled c0, c1, c2, c3, and c4.
Figure 6.57 Adding two binary numbers by a naive inefficient carry-lookahead scheme: each
stage looks at all earlier bits and computes whether the carry-in bit to that stage would be
a 1. The longest delay is stage 3, which has 2 logic levels for the lookahead, and 2 for the
full-adder, for a total delay of only four gate-delays.
A Naive Inefficient Carry-Lookahead Scheme. One simple but not very efficient way
of carry-lookahead is as follows. Recall that the output equations for a full-adder having
inputs a, b, and c, and outputs co and s, are:
s = a xor b xor c
co = bc + ac + ab
So we know that the equations for c1, c2, and c3 in a 4-bit adder will be:
c1 = co0 = b0c0 + a0c0 + a0b0
c2 = co1 = b1c1 + a1c1 + a1b1
c3 = co2 = b2c2 + a2c2 + a2b2
In other words, the equation for the carry-in of a particular stage is the same as the
equation for the carry-out of the previous stage.
We can substitute the equation for c1 into c2's equation, resulting in:
c2 = b1c1 + a1c1 + a1b1
c2 = b1(b0c0 + a0c0 + a0b0) + a1(b0c0 + a0c0 + a0b0) + a1b1
c2 = b1b0c0 + b1a0c0 + b1a0b0 + a1b0c0 + a1a0c0 + a1a0b0 + a1b1
We can then substitute the equation for c2 into c3's equation, resulting in:
c3 = b2c2 + a2c2 + a2b2
c3 = b2(b1b0c0 + b1a0c0 + b1a0b0 + a1b0c0 + a1a0c0 +
a1a0b0 + a1b1) + a2(b1b0c0 + b1a0c0 + b1a0b0
+ a1b0c0 + a1a0c0 + a1a0b0 + a1b1) + a2b2
c3 = b2b1b0c0 + b2b1a0c0 + b2b1a0b0 + b2a1b0c0 +
b2a1a0c0 + b2a1a0b0 + b2a1b1 + a2b1b0c0
+ a2b1a0c0 + a2b1a0b0 + a2a1b0c0 + a2a1a0c0
+ a2a1a0b0 + a2a1b1 + a2b2
We'll omit the equation for c4, in order to save a few pages of paper.
We could create each stage with the needed inputs, and include a lookahead logic
component implementing the above equations, as shown in Figure 6.57(c). Notice that
there is no rippling of carry bits from stage to stage; each stage computes its own
carry-in bit by "looking ahead" to the values of the previous stages.
While the above demonstrates the basic idea of carry-lookahead, the scheme is not
very efficient. c1 requires 4 gates, c2 requires 8 gates, and c3 requires 16 gates, with
each gate requiring more inputs in each stage. If we count gate inputs, c1 requires 9 gate
inputs, c2 requires 27 gate inputs, and c3 requires 71 gate inputs. Building a larger
adder using this lookahead scheme would thus likely result in excessively large size.
While the presented scheme is therefore not practical, it serves to introduce the basic
idea of carry-lookahead: by having each stage look ahead at the inputs to the previous
stages and compute for itself whether that stage's carry-in bit should be 1, rather than
waiting for the carry-in bit to ripple from previous stages, we get a four-bit adder with a
delay of only 4 gate-delays.
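The gate and gate-input counts quoted for the naive scheme can be reproduced by expanding the carry equations mechanically. The sketch below represents each product term as a set of literals, substitutes stage by stage without any simplification (as the text does), and counts one AND gate per term plus one OR gate. The function name is hypothetical.

```python
def naive_lookahead_costs(stages=3):
    """Expand c1..c_stages into sum-of-products form and count gates and gate inputs."""
    terms = [frozenset({"c0"})]      # c0 treated as a single-literal term
    costs = {}
    for i in range(stages):
        a, b = f"a{i}", f"b{i}"
        # c(i+1) = b*c(i) + a*c(i) + a*b, with c(i) replaced by its product terms
        terms = [t | {b} for t in terms] + [t | {a} for t in terms] + [frozenset({a, b})]
        gates = len(terms) + 1                            # one AND per term, one OR
        gate_inputs = sum(len(t) for t in terms) + len(terms)
        costs[f"c{i + 1}"] = (gates, gate_inputs)
    return costs

costs = naive_lookahead_costs()
print(costs)
```

Calling naive_lookahead_costs(4) would show how quickly the cost grows for c4, which is why the text omits that equation.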
An Efficient Carry-Lookahead Scheme. A more efficient carry-lookahead scheme is
as follows. Consider again the addition of two 4-bit numbers A and B, shown in Figure
6.58(a). Suppose that we add each column's two operand bits (e.g., a0 + b0) using a
half-adder, ignoring the carry-in bit of that column. The resulting half-adder outputs
(carry-out and sum) give us some useful information about the carry for the next stage. In
particular:
If the addition of a0 with b0 results in a carry-out of 1, then we know for sure
that c1 will be 1, regardless of whether c0 is a 1 or 0. Why? Because considering
adding a0+b0+c0, then 1+1+0=10, and 1+1+1=11 (the "+" represents add
here, not OR); both cases generate a carry-out of 1. Recall that a half-adder
computes its carry-out as ab.
If the addition of a0 with b0 results in a sum of 1, then c1 will be 1 only if c0 is
1. In particular, considering a0+b0+c0, then 1+0+1=10 and 0+1+1=10. Recall
that a half-adder computes its sum as a xor b.
In other words, c1 will be 1 if a0b0=1, OR if a0 xor b0 = 1 AND c0=1. So
we get the following equations for the carry bits:
c1 = a0b0 + (a0 xor b0)c0
c2 = a1b1 + (a1 xor b1)c1
c3 = a2b2 + (a2 xor b2)c2
c4 = a3b3 + (a3 xor b3)c3
[Figure 6.58, parts (a) to (c). Part (a) notes that if a0b0 = 1 then c1 = 1 (call this G:
Generate), and if a0 xor b0 = 1 then c1 = 1 if c0 = 1 (call this P: Propagate). Part (c)
implements the following equations:]
c1 = G0 + P0c0
c2 = G1 + P1G0 + P1P0c0
c3 = G2 + P2G1 + P2P1G0 + P2P1P0c0
cout = G3 + P3G2 + P3P2G1 + P3P2P1G0 + P3P2P1P0c0
Figure 6.58 Adding two binary numbers using a fast carry-lookahead scheme: (a) idea of using
propagate and generate terms, (b) computing the propagate and generate terms and providing them
to the carry-lookahead logic, (c) using the propagate and generate terms to compute the carries
for each column. The correspondence between c1 in figures (b) and (c) is shown by two
circles connected by a line; similar correspondences exist for c2 and c3.
Let's include a half-adder in each stage to add the two operand bits for that column, as
shown in Figure 6.58(b). Each half-adder outputs a carry-out bit (which is ab) and a sum bit
(which is a xor b). Note in the figure that for a given column, we just need to XOR the
half-adder's sum output with the column's carry-in bit to compute that column's sum bit,
because the sum bit for a column is just a xor b xor c (see Section 4.3, page 188).
[Margin note: When a0b0=1, we know we should generate a 1 for c1. When a0 xor b0=1, we
know we should propagate the c0 value as the value of c1, meaning c1 should equal c0.]
Let's rename the carry output of the half-adder generate, symbolized as G, so G0
means a0b0, G1 means a1b1, G2 means a2b2, and G3 means a3b3. Let's also rename
the sum output of the half-adder as propagate, so P0 means a0 xor b0, P1 means
a1 xor b1, P2 means a2 xor b2, and P3 means a3 xor b3. In short:
Gi = aibi (generate)
Pi = ai xor bi (propagate)
When we perform carry-lookahead, rather than looking directly at the operand bits of
previous stages as we did in the naive lookahead scheme (e.g., stage 1 looking at a0 and
b0), let's look instead at the half-adder outputs of the previous stage (e.g., stage 1 looks
at G0 and P0). Why? Because the lookahead logic will turn out to be simpler than in the
naive lookahead scheme.
We can therefore rewrite our equations for each carry bit as follows:
c1 = G0 + P0c0
c2 = G1 + P1c1
c3 = G2 + P2c2
c4 = G3 + P3c3
Substituting like we did for the naive scheme, we get the following carry-lookahead
equations:
c1 = G0 + P0c0
c2 = G1 + P1c1 = G1 + P1(G0 + P0c0)
c2 = G1 + P1G0 + P1P0c0
c3 = G2 + P2c2 = G2 + P2(G1 + P1G0 + P1P0c0)
c3 = G2 + P2G1 + P2P1G0 + P2P1P0c0
c4 = G3 + P3G2 + P3P2G1 + P3P2P1G0 + P3P2P1P0c0
Remember, the P and G symbols represent simple terms: Gi = ai * bi, Pi = ai xor bi.
Figure 6.58(c) shows the circuits implementing the carry-lookahead equations for
computing each stage's carry.
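To make the equations concrete, here is a minimal Python sketch that evaluates them in software. The function name carry_lookahead_add and its parameters are our own choices for illustration, and bitwise operators merely stand in for gates; this is a behavioral check of the equations, not a gate-level design.

```python
def carry_lookahead_add(a, b, c0=0, n=4):
    """Add two n-bit numbers using the generate/propagate lookahead equations."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    g = [x & y for x, y in zip(a_bits, b_bits)]  # Gi = ai * bi
    p = [x ^ y for x, y in zip(a_bits, b_bits)]  # Pi = ai xor bi

    carries = [c0]
    for i in range(n):
        # c(i+1) = Gi + Pi*G(i-1) + ... + Pi...P1*G0 + Pi...P0*c0,
        # computed directly from G, P, and c0 (no rippling between stages).
        terms = []
        prefix = 1  # running product Pi * P(i-1) * ...
        for j in range(i, -1, -1):
            terms.append(prefix & g[j])
            prefix &= p[j]
        terms.append(prefix & c0)
        carries.append(int(any(terms)))

    total = 0
    for i in range(n):
        total += (p[i] ^ carries[i]) * 2 ** i  # si = Pi xor ci
    return total, carries[n]
```

For example, carry_lookahead_add(6, 3) yields the 4-bit sum 9 with a carry-out of 0, and every carry is a two-level AND/OR function of the G, P, and c0 signals, as in the equations above.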
Figure 6.59 shows a high-level view of the carry-lookahead adder's design from
Figure 6.58(b) and (c). The four blocks on the top are responsible for generating the sum,
propagate, and generate bits; let's call those "SPG blocks," and you'll recall from Figure
6.58(b) that each SPG block consists of just three gates. The 4-bit carry-lookahead logic
uses the propagate and generate bits to precompute the carry bits for high-order stages,
using only two levels of gates.
Figure 6.59 High-level view of a 4-bit carry-lookahead adder.
The complete 4-bit carry-lookahead adder requires only 26 gates (4*3=12 gates for
the nonlookahead logic, and then 2+3+4+5=14 gates for the lookahead logic).
The delay of this 4-bit adder is only 4 gate-delays: 1 gate through the half-adder, 2
gates through the carry-lookahead logic, and 1 to finally generate the sum bit (we can see
those gates in Figure 6.58(b) and (c)). An 8-bit adder built using the same carry-lookahead
scheme would still have a delay of only 4 gate-delays, but would require 68 gates
(8*3=24 gates for the nonlookahead logic, and 2+3+4+5+6+7+8+9=44 gates for the
lookahead logic). A 16-bit carry-lookahead adder would still have a delay of 4 gate-delays,
but would require 200 gates (16*3=48 gates for the nonlookahead logic, and
2+3+4+5+6+7+8+9+10+11+12+13+14+15+16+17=152 gates for the lookahead logic). A
32-bit carry-lookahead adder would have a delay of 4 gate-delays, but would require 656
gates (32*3=96 gates for the nonlookahead logic, and 152+18+19+20+21+22+23+24+25
+26+27+28+29+30+31+32+33=560 gates for the lookahead logic).
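The gate counts above follow a simple pattern, which the short sketch below reproduces. It assumes, as the text's arithmetic does, three gates per SPG block and i+1 gates (i AND gates plus one OR gate) in lookahead stage i; the function name cla_gate_count is ours.

```python
def cla_gate_count(n_bits):
    """Gate count of a single-level n-bit carry-lookahead adder."""
    spg_gates = 3 * n_bits  # three gates per SPG block
    # Lookahead stage i uses i AND gates plus one OR gate.
    lookahead_gates = sum(stage + 1 for stage in range(1, n_bits + 1))
    return spg_gates + lookahead_gates


for width in (4, 8, 16, 32):
    print(width, cla_gate_count(width))
```

The lookahead term grows as n(n+3)/2, so the gate count grows roughly with the square of the adder's width.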
Unfortunately, there are problems that make the size and delay of large carry-lookahead
adders less attractive. First, the above analysis counts gates, but not gate inputs, whereas
gate inputs better tell us the number of transistors needed. Notice in Figure 6.58 that the
gates keep getting larger in higher stages. For example, stage 3 has a 4-input OR gate and
4-input AND gate, while stage 4 has a 5-input OR gate and 5-input AND gate, as highlighted
in Figure 6.60. Stage 32 of a 32-bit carry-lookahead adder would have 33-input OR and AND
gates, along with other large gates. Since gates with more inputs need more
transistors, then in terms of transistors, the carry-lookahead design is actually quite large.
Furthermore, those huge gates would not have the same delay as a 2-input AND or OR
gate. Such huge gates are typically built using a tree of smaller gates, so we would have
more gate-delays.
Figure 6.60 Gate size problem.
Hierarchical Carry-Lookahead Adders. Building a 4-bit, or perhaps somewhat larger,
carry-lookahead adder using the previous section's method may be reasonable with respect to
gate sizes, but larger carry-lookahead adders begin to involve gates with too many inputs.
We can build a larger adder by connecting smaller adders in a carry-ripple manner. For
example, suppose we have 4-bit carry-lookahead adders available. We can build a 16-bit
adder by connecting four 4-bit carry-lookahead adders, as shown in Figure 6.61. If each
4-bit carry-lookahead adder had a 4-gate-delay, then the total delay of the 16-bit adder
would be 16 gate-delays (four adders at 4 gate-delays each). Compare this to the delay of a
16-bit carry-ripple adder: if each full-adder has a two gate-delay, then a 16-bit carry-ripple
adder would have a delay of 16*2 = 32 gate-delays. Thus, the 16-bit adder built from four
carry-lookahead adders connected in a carry-ripple manner is twice as fast as the 16-bit
carry-ripple adder.
(Actually, careful observation of Figure 6.55 reveals that the carry-out of a four-bit
carry-lookahead adder would be generated in three gate-delays rather than four, resulting in
even faster operation of the 16-bit adder built from four carry-lookahead adders, but for
simplicity, let's not look inside the components for such detailed timing analysis.) Sixteen
gate-delays is good, but can we do better? Can we avoid having to wait for the carries to
ripple from the lower-order 4-bit adders to the higher-order adders?
Figure 6.61 16-bit adder implemented using four 4-bit adders connected in a carry-ripple manner.
In fact, avoiding the rippling is exactly what we did in developing the 4-bit carry-lookahead
adder itself. Thus, we can repeat the same carry-lookahead process outside of the
4-bit adders, to quickly provide the carry-in value to the higher-order 4-bit adders. To
accomplish this, we add another 4-bit carry-lookahead logic block outside the four 4-bit
adders, as shown in Figure 6.62. The carry-lookahead logic block has exactly the same
internal design as was shown in Figure 6.58(c). Notice that the lookahead logic needs
propagate (P) and generate (G) signals from each adder block. Previously, each input block
output the P and G signals just by ANDing and XORing the block's ai and bi input bits.
However, in Figure 6.62, each block is a 4-bit carry-lookahead adder. We therefore must
modify the internal design of a 4-bit carry-lookahead adder to output P and G signals, so
that those adders can be used with a second-level carry-lookahead generator.
Figure 6.62 16-bit adder implemented using four CLA 4-bit adders and a second level of lookahead.
Thus, let's extend the 4-bit carry-lookahead logic of Figure 6.58 to output P and G
signals. The equations for the P and G outputs of a 4-bit carry-lookahead adder can be
written as follows:
P = P3P2P1P0
G = G3 + P3G2 + P3P2G1 + P3P2P1G0
To understand these equations, recall that propagate meant that the output carry for a
column should equal the input carry of the column (hence propagating the carry through
the column). For that to be the case for the carry-in and carry-out of a 4-bit adder, the first
stage of the adder must propagate its input carry to its output carry, the second stage
must propagate its input carry to its output carry, and so on for the third and fourth stages.
In other words, each internal propagate signal must be 1, hence the equation
P = P3P2P1P0.
Likewise, recall that generate meant that the output carry of a column should be a 1
(hence generating a carry of 1). Generate should thus be 1 if the first stage generates
a carry (G0) and all the higher stages propagate the carry through (P3P2P1), yielding
the term P3P2P1G0. Generate should also be a 1 if the second stage generates a carry
and all higher stages propagate the carry through, yielding the term P3P2G1. Likewise
for the third stage, whose term is P3G2. Finally, generate should be 1 if the fourth stage
generates a carry, represented as G3. ORing all four of these terms yields the equation
G = G3 + P3G2 + P3P2G1 + P3P2P1G0.
We then revise the 4-bit carry-lookahead logic of Figure 6.58(c) to include
two additional gates in stage four, one AND gate to compute P = P3P2P1P0, and one
OR gate to compute G = G3 + P3G2 + P3P2G1 + P3P2P1G0 (note that stage four
already has AND gates for each term, so we need only add an OR gate to OR the terms).
For conciseness, we omit a figure showing these two new gates.
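The two-level arrangement can be sketched behaviorally in Python as follows. The names block_pg, lookahead_carries, and add16 are hypothetical, and the sum inside each 4-bit block is computed with ordinary integer addition as a stand-in for the block's internal carry-lookahead adder; only the block-level P and G signals and the second-level lookahead follow the equations in the text.

```python
def block_pg(a, b):
    """Block-level propagate and generate signals for one 4-bit adder."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(4)]
    P = p[3] & p[2] & p[1] & p[0]
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    return P, G


def lookahead_carries(P, G, c0):
    """Second-level lookahead: the Figure 6.58(c) equations on block signals."""
    c1 = G[0] | (P[0] & c0)
    c2 = G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0)
    c3 = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & c0)
    c4 = (G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
          | (P[3] & P[2] & P[1] & G[0]) | (P[3] & P[2] & P[1] & P[0] & c0))
    return [c0, c1, c2, c3], c4


def add16(a, b, c0=0):
    """16-bit add from four 4-bit blocks plus a second level of lookahead."""
    a_nibbles = [(a >> (4 * k)) & 0xF for k in range(4)]
    b_nibbles = [(b >> (4 * k)) & 0xF for k in range(4)]
    signals = [block_pg(x, y) for x, y in zip(a_nibbles, b_nibbles)]
    P = [s[0] for s in signals]
    G = [s[1] for s in signals]
    carry_in, cout = lookahead_carries(P, G, c0)
    total = 0
    for k in range(4):
        # Stand-in for the 4-bit adder's own sum logic (an assumption of this sketch).
        nibble_sum = (a_nibbles[k] + b_nibbles[k] + carry_in[k]) & 0xF
        total += nibble_sum * 16 ** k
    return total, cout
```

Each block's carry-in comes straight from the second-level lookahead logic, so no block waits for a carry to ripple out of a lower-order block.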
We can introduce additional levels of 4-bit carry-lookahead generators to create even
larger adders. Figure 6.63 illustrates a high-level view of a 32-bit adder built using 32
SPG blocks and three levels of 4-bit carry-lookahead logic. Notice that the carry-lookahead
logic forms a tree. Total delay for the 32-bit adder is only two gate-delays for the
SPG blocks, and two gate-delays for each level of carry-lookahead (CLA) logic, for a
total of 2+2+2+2 = 8 gate-delays. (Actually, closer examination of gate delays within
each component would demonstrate that total delay of the 32-bit adder is actually less
than 8 gate-delays.) Carry-lookahead adders built from multiple levels of carry-lookahead
logic are known as multilevel or hierarchical carry-lookahead adders.
Figure 6.63 View of multilevel carry-lookahead, showing tree structure, which enables fast addition
with reasonable numbers and sizes of gates. Each level adds only two gate-delays.
In summary, the carry-lookahead approach results in faster additions of large binary
numbers (more than 8 bits or so) than a carry-ripple approach, at the expense of more gates.
However, by clever hierarchical design, the carry-lookahead gate size is kept reasonable.
Carry-Select Adders
Another way to build a larger adder from smaller adders is known as carry-select.
Consider building an 8-bit adder from 4-bit adders. A carry-select approach uses two 4-bit
adders for the high-order four bits, which we've labeled HI4_1 and HI4_0 in Figure 6.64.
HI4_1 assumes the carry-in will be 1, while HI4_0 assumes the carry-in will be 0, so both
generate stable output at the same time that LO4 generates stable output: after 4
gate-delays (assuming the 4-bit adder has a delay of four gate-delays). We use the LO4
carry-out value to select among HI4_1 or HI4_0, using a 5-bit-wide 2x1 multiplexer, hence
the term carry-select adder.
Figure 6.64 8-bit carry-select adder implemented using three 4-bit adders.
The delay of a 2x1 mux is 2 gate-delays, so the total delay of the 8-bit adder is 4
gate-delays for HI4_1 and HI4_0 to generate correct sum bits (LO4 executes in parallel),
plus 2 gate-delays for the mux (whose select line is ready after only 3 gate-delays), for a
total of 6 gate-delays. Compared with a carry-lookahead implementation using two 4-bit
adders, we've reduced the total delay from 7 gate-delays down to 6 gate-delays. The cost
is one extra 4-bit adder. If a 4-bit carry-lookahead adder requires 26 gates, then the design
with two 4-bit adders requires 2*26=52 gates, while the carry-select adder requires
3*26=78 gates, plus the gates for the 5-bit 2x1 mux.
We could also build a 16-bit carry-select adder using 4-bit carry-lookahead adders,
by using multiple levels of multiplexing. Each nibble (four bits) would have two 4-bit
adders, one assuming a carry-in of 1, the other 0. Nibble0's carry-out would
select, using a multiplexer, the appropriate adder for Nibble1. Nibble1's selected carry-out
would then select the appropriate adder for Nibble2. Nibble2's selected carry-out
would finally select the appropriate adder for Nibble3. The delay of such an adder would
be 6 gate-delays for Nibble1, plus 2 gate-delays for Nibble2's selection, plus 2 gate-delays
for Nibble3's selection, for a total of only 10 gate-delays. Cascading four 4-bit
adders would have required 4+4+4+4 = 16 gate-delays. The speedup of the carry-select
version over the cascaded version would be
16 / 10 = 1.6. Total size would be 7*26 = 182 gates,
plus the gates for the three 5-bit 2x1 muxes. That's
pretty efficient size for pretty good speed.
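The carry-select idea, along with the delay and size arithmetic above, can be captured in a short behavioral sketch. All names here are hypothetical, add4 uses integer addition as a stand-in for a 4-bit carry-lookahead adder, and the delay and gate figures simply reuse the text's assumptions (4 gate-delays and 26 gates per 4-bit adder, 2 gate-delays per mux, mux gates not counted).

```python
def add4(a, b, cin):
    """Stand-in for a 4-bit adder: returns (4-bit sum, carry-out)."""
    total = a + b + cin
    return total & 0xF, total >> 4


def carry_select_add16(a, b, cin=0):
    """16-bit add where each upper nibble computes both possible results."""
    total = 0
    carry = cin
    for k in range(4):
        na = (a >> (4 * k)) & 0xF
        nb = (b >> (4 * k)) & 0xF
        if k == 0:
            s, carry = add4(na, nb, carry)
        else:
            s0, c0 = add4(na, nb, 0)  # adder assuming carry-in of 0
            s1, c1 = add4(na, nb, 1)  # adder assuming carry-in of 1
            s, carry = (s1, c1) if carry else (s0, c0)  # 5-bit 2x1 mux
        total += s * 16 ** k
    return total, carry


def carry_select_delay(nibbles, adder_delay=4, mux_delay=2):
    """All adders run in parallel; each upper nibble adds one mux delay."""
    return adder_delay + mux_delay * (nibbles - 1)


def carry_select_gates(nibbles, adder_gates=26):
    """One adder for the low nibble, two for each upper nibble (muxes excluded)."""
    return (2 * nibbles - 1) * adder_gates
```

With two nibbles the helpers give 6 gate-delays and 78 gates; with four nibbles they give 10 gate-delays and 182 gates, matching the figures in the text.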
Figure 6.65 illustrates the tradeoffs among adder
designs. Carry-ripple is the smallest but has the
longest delay. Carry-lookahead is the fastest but has
the largest size. Carry-select is a compromise
between the two, involving some lookahead and
some rippling. The choice of the most appropriate
adder for a design depends on the speed and size
constraints of the design.
Figure 6.65 Adder tradeoffs. [Plot against delay of the carry-ripple, carry-select, multilevel
carry-lookahead, and carry-lookahead designs.]
Smaller Multiplier: Sequential (Shift-and-Add) Style
An array-style multiplier can be fast but may require a lot of gates for wide-bitwidth
multipliers, like 32-bit multipliers. In this section, we create a sequential multiplier
instead of a combinational one, in order to reduce the size of the multiplier. The idea of a
sequential multiplier is to keep a running sum of the partial products and compute each
partial product one at a time, rather than computing all the partial products at once and
summing them.
Figure 6.66 provides an example of 4-bit multiplication. Assume we start with a
running sum of 0000. Each step corresponds to a bit in the multiplier (the second
number). In step 1, we compute the partial product as 0110, which we add to the running
sum of 0000 to obtain 00110. In step 2, we compute the partial product as 0110, which
we add to the proper columns of the running sum of 00110 to obtain 010010. In step 3,
we compute the partial product as 0000, which we add to the proper columns of the
running sum. Likewise for step 4. The final running sum is 00010010, which is the
correct product of 0110 and 0011.
                   Step 1    Step 2    Step 3     Step 4
                   0110      0110      0110       0110
                 x 0011    x 0011    x 0011     x 0011
running sum        0000      00110     010010     0010010
partial product    0110      0110      0000       0000
new running sum    00110     010010    0010010    00010010
Figure 6.66 Multiplication done by generating a partial product for each bit in the multiplier (the
number on the bottom), accumulating the partial products in a running sum.
Computing each partial product is easy: we just AND the current multiplier bit
with every bit in the multiplicand to obtain the partial product. So if the current multiplier
bit is 1, the AND creates a copy of the multiplicand as the partial product. If the current
multiplier bit is 0, the AND creates 0 as the partial product.
We need to determine how to add each partial product to the proper columns of the
running sum. Notice that the partial product should be moved to the left by one bit
relative to the running sum after each step. We can look at this another way: the running
sum should be moved to the right by one bit after each step. Look at the multiplication
illustration in Figure 6.66, until you "see" how the running sum moves one bit to the right
relative to each partial product.
Therefore, we can compute the running sum by initializing an 8-bit register to 0. In
each step we add the partial product for the current multiplier bit to the leftmost four
bits of the running sum, and we shift the running sum one bit to the right, shifting in a 0
into the leftmost bit. So the running sum register should have a clear function, a parallel
load function, and a shift right function. A circuit showing the running sum register and
an adder to add each partial product to that register is shown in Figure 6.67.
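The add-then-shift procedure can be tried out with a small behavioral sketch. The function name sequential_multiply is ours, and one detail is an assumption of the sketch rather than something the text specifies: the adder's carry-out is kept as an extra bit before the right shift, so that products such as 15 * 15 do not overflow the leftmost four bits.

```python
def sequential_multiply(multiplicand, multiplier, n=4):
    """Shift-and-add multiplication of two n-bit numbers, one multiplier bit per step."""
    mask = 2 ** n - 1
    running_sum = 0  # models the 2n-bit running sum register, cleared at start
    for step in range(n):
        bit = (multiplier >> step) & 1
        # Partial product: AND the current multiplier bit with every multiplicand bit.
        partial = (multiplicand & mask) if bit else 0
        # Add the partial product to the leftmost n bits; the carry-out is kept
        # as an extra bit (an assumption of this sketch).
        total = running_sum + partial * 2 ** n
        # Shift the running sum one bit to the right.
        running_sum = total >> 1
    return running_sum


print(format(sequential_multiply(0b0110, 0b0011), "08b"))
```

Running the example with 0110 and 0011 reproduces the final running sum of Figure 6.66. The low-order bit discarded by each shift is always 0 here, because a partial product's lowest bit sits at least two positions above the register's lowest bit when it is added.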
[Figure 6.67 labels: multiplier, mrld, mr3, mr2, mr1, mr0, rsclear, rsshr, start, running sum
register (8), product.]
Figure 6.67 Internal design of a 4-bit by 4-bit sequential multiplier.
The last thing we need to figure out is how to control the circuit so that the circuit
does the right thing during each step; that's exactly what controllers are for. Figure
6.68 shows an FSM describing the desired controller behavior of our sequential
multiplier.
[Figure 6.68 labels: mdld, mrld, mr3, mr2, mr1, mr0, rsload, rsclear, rsshr, start.]
Figure 6.68 FSM describing the controller for the 4-bit multiplier.
In terms of performance, the sequential multiplier requires two cycles per bit, plus 1
cycle for initialization. So a 4-bit multiplier would require 9 cycles, while a 32-bit
multiplier would require 65 cycles. The longest register-to-register delay is from a register
through the adder to a register. If we built the adder as a carry-lookahead adder having
only 4 gate-delays, then the total delay for a 4-bit multiplication would be 9 cycles * 4
gate-delays/cycle = 36 gate-delays. The total delay for a 32-bit multiplication would be
65 cycles * 4 gate-delays/cycle = 260 gate-delays. While slow, notice that this multiplier's
size is quite good, requiring only an adder, a few registers, and a state-register and some
control logic for the controller. For a 32-bit multiplier, the size would be far smaller than
an array-style multiplier requiring 31 adders.
The multiplier's design can be further improved by using a shifter in the datapath, but
we omit details of that improved design.
6.5 RTL DESIGN OPTIMIZATIONS AND TRADEOFFS
In Chapter 5, we described the process of RTL design. While creating the datapath during
RTL design, there are several optimizations and tradeoffs that we might make to create
smaller or faster designs.
Pipelining
Microprocessors continue to become smaller, faster, and less expensive, and thus designers
use microprocessors whenever possible to implement desired digital system behavior. But
designers continue to choose to build their own digital circuits to implement desired
behavior of many digital systems, with the main reason for that choice being speed. One
method of obtaining speed from digital circuits is through the use of pipelining.
Pipelining means to break a large task down into a sequence of stages
such that data moves through the stages like parts move through a factory assembly line.
Each stage produces output used by the next stage, and all stages operate concurrently,
resulting in better performance than if data had to be fully processed by the task before
new data could begin being processed. An example of pipelining is washing dishes with a
friend, with you washing and your friend drying (Figure 6.69). You (the first stage) pick
up a dish (dish 1) and wash it, then hand it to your friend (the second stage). You pick up
the next dish (dish 2) and wash it concurrently to your friend drying dish 1. You then
wash dish 3 while your friend dries dish 2. Dishwashing this way is nearly twice as fast as
when washing and drying aren't done concurrently.
Figure 6.69 Applying pipelining to dishwashing: washing and drying dishes can be done
concurrently.
Consider a system with data inputs W, X, Y, and Z, that should repeatedly output the
sum S = W + X + Y + Z. We could implement the system using an adder tree as
shown in Figure 6.70(a). The clock period for this design must not be shorter than the
longest path between any pair of registers, known as the critical path. There are four
possible paths from any register output to any register input, and each path goes through two
adders. If each adder has a delay of 2 ns, then each path is 2+2 = 4 ns long. Thus, the
critical path is 4 ns, and so the fastest clock has a period of at least 4 ns, meaning a
frequency of no more than 1 / 4 ns = 250 MHz.
Figure 6.70 Nonpipelined versus pipelined datapaths: (a) four register-to-register paths of 4 ns each,
so longest path is 4 ns, meaning minimum clock period is 4 ns, or 1 / 4 ns = 250 MHz, (b) six
register-to-register paths of 2 ns each, so longest path is 2 ns, meaning minimum clock period of
2 ns, or 1 / 2 ns = 500 MHz.
Figure 6.70(b) shows a pipelined version of this design. We merely add registers
between the first and second row of adders. Since the purpose of these registers is
solely related to pipelining, they are known as pipeline registers, though their internal
design is the same as any other register. The computations between pipeline registers
are known as stages. By inserting those registers and thus creating a two-stage pipeline,
we've reduced the critical path from 4 ns down to only 2 ns, and so the fastest clock has
a period of at least 2 ns, meaning a frequency of no more than 1 / 2 ns = 500 MHz. In
other words, just by inserting those pipeline registers, we've doubled the performance
of our design!
Latency versus Throughput
The term "performance" needs to be refined due to the pipelining concept. Notice in
Figure 6.70(b) that the first result S(0) doesn't appear until after two cycles, whereas
the design in Figure 6.70(a) outputs the first result after only one cycle. That's because
data must now pass through an extra row of registers. The term latency refers to the delay
for new input data to result in new output data. Latency is one kind of performance.
Both designs in the figure have a latency of 4 ns. Figure 6.70(b) also shows that a new
value for S appears every 2 ns, versus every 4 ns for the design in Figure 6.70(a). The
term throughput refers to the rate at which new data can be input to the system, and
similarly, the rate at which new outputs appear from the system. The throughput of the
design in Figure 6.70(a) is 1 sample every 4 ns, while the throughput of the design in
Figure 6.70(b) is 1 sample every 2 ns. Thus, we can more precisely describe the
performance improvement of our pipelined design as having doubled the throughput of the
design.
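The relationship among stage delays, clock period, latency, and throughput can be summarized in a few lines of Python. This is only a sketch: the function name pipeline_metrics is ours, and it assumes register delays are ignored, as the text's numbers do.

```python
def pipeline_metrics(stage_delays_ns):
    """Return (clock period in ns, latency in ns, max clock frequency in MHz)."""
    period = max(stage_delays_ns)            # clock is limited by the slowest stage
    latency = period * len(stage_delays_ns)  # one cycle per stage
    freq_mhz = 1000 / period                 # one result per cycle
    return period, latency, freq_mhz


print(pipeline_metrics([4]))     # nonpipelined adder tree of Figure 6.70(a)
print(pipeline_metrics([2, 2]))  # two-stage pipeline of Figure 6.70(b)
```

Both designs show a latency of 4 ns, while the pipelined version halves the clock period and so doubles the rate at which results appear.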
EXAMPLE 6.20 Pipelined FIR filter
Recall the 100-tap FIR filter from Example 5.8. We estimated that implementation on a
microprocessor would require 4000 ns, while a custom digital circuit implementation would
require only 34 ns. That custom digital circuit utilized an adder tree, with seven levels of
adders: 50 additions, then 25, then 13 (roughly), then 7, then 4, then 2, then 1. The total
delay was 20 ns (for the multiplier) plus seven adder-delays (7*2ns=14ns), for a total delay
of 34 ns. We can further improve the throughput of that filter using pipelining. Noticing
that the multipliers' delay of 20 ns is roughly equal to the adder tree delay of 14 ns,
we might decide to insert pipeline registers (50 of them since there are 50 multipliers
feeding into 50 adders at the top of the adder tree) between the multipliers and adder tree,
resulting in dividing the computation task into two stages, as shown in Figure
6.71. Those pipeline registers shorten the critical path from 34 ns down to only 20 ns,
meaning we can clock the circuit faster and hence improve the throughput.
Figure 6.71 Pipelined FIR filter.
The throughput speedup of the unpipelined design was 4000/34, or about 117, while the
throughput speedup of the pipelined design is 4000/20 = 200. Quite a nice additional speedup
for just inserting some registers!
Although we could pipeline the adder tree also, that would not gain us higher throughput, since
the multiplier stage would still represent the critical path. We can't clock a pipelined system any
faster than the longest stage, since otherwise that stage would fail to load correct values into
the stage's output pipeline registers.
The latency of the nonpipelined design is one cycle of 34 ns, or 34 ns total. The latency of the
pipelined design is two cycles of 20 ns, or 40 ns total. Thus, we see that pipelining improves the
throughput at the expense of latency.
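The numbers in this example can be rederived with a short sketch. The helper names adder_tree_levels and fir_timing are hypothetical, and the default delays (20 ns multiplier, 2 ns adder, 4000 ns software) are simply the figures quoted in the example.

```python
import math


def adder_tree_levels(n_inputs):
    """Number of adder levels needed to sum n_inputs values pairwise."""
    levels = 0
    while n_inputs > 1:
        n_inputs = math.ceil(n_inputs / 2)  # each level halves the value count
        levels += 1
    return levels


def fir_timing(taps=100, mult_ns=20, adder_ns=2, software_ns=4000):
    """Clock periods, latency, and speedups for the FIR example's two designs."""
    tree_ns = adder_tree_levels(taps) * adder_ns
    unpipelined_period = mult_ns + tree_ns
    pipelined_period = max(mult_ns, tree_ns)  # limited by the slower stage
    return {
        "tree_ns": tree_ns,
        "unpipelined_period_ns": unpipelined_period,
        "pipelined_period_ns": pipelined_period,
        "pipelined_latency_ns": 2 * pipelined_period,
        "unpipelined_speedup": software_ns / unpipelined_period,
        "pipelined_speedup": software_ns / pipelined_period,
    }


print(fir_timing())
```

Because the pipelined period is the maximum of the two stage delays, shortening the 14 ns adder-tree stage further would leave the 20 ns clock period unchanged, which is the point made about pipelining the adder tree.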
Concurrency
A key reason for designing a custom digital circuit, rather than writing software that
executes on a microprocessor, is to achieve improved performance. A common method of
achieving performance in a custom digital circuit is through concurrency. Concurrency in
digital design means to divide a task into several independent subparts, and then to
execute those subparts simultaneously. As an analogy, if we have a stack of 200 dishes to
wash, we might divide the stack into 10 substacks of 20 dishes each, and then give 10 of
our neighbors each a substack. Those neighbors simultaneously go home, wash and dry
their respective substacks, and return to us their completed dishes. We would get a ten
times speedup in dishwashing (ignoring the time to divide the stack and move substacks
from home to home).
We have used concurrency in several examples already. For example, the FIR filter
datapath of Figure 5.38 had three multipliers executing concurrently.
Let's use concurrency to create a faster version of an earlier example.
EXAMPLE 6.21 Sum-of-absolute-difference component with concurrency
In Example 5.7, we designed a custom circuit for a sum-of-absolute-difference (SAD) component, and
we estimated that component to be three times faster than a software-on-microprocessor solution. We
can do even better. Notice that comparing one pair of corresponding pixels of two frames is
independent of comparing another pair. Thus, such comparisons are an ideal candidate for concurrency.
We first need to be able to read the pixels concurrently. We can do this by redesigning the
block memories A and B, which earlier were designed as 256-byte memories. Instead, let's design
them as 16-word memories, where each word is 16 bytes (the total is still 256 bytes). Thus, each
memory read corresponds to reading an entire pixel row of a 16x16 block. We can then
concurrently determine the differences among all 16 pairs of pixels from A and B. Figure 6.72 shows a
new datapath and controller FSM for a more concurrent SAD component.
[Figure 6.72 labels: controller states including AB_rd=1, S3 (sum_ld=1), and S4 (sad_reg_ld=1);
datapath inputs A0, B0, A1, B1, ..., A14, B14, A15, B15; output sad.]
Figure 6.72 SAD datapath using concurrency for speed, along with the controller FSM.
The datapath consists of 16 subtractors operating concurrently on the 16 pixels of a row,
followed by 16 absolute value components. The 16 resulting differences feed into an adder tree, whose
result gets added with the present sum for writing back into the sum register. The datapath
compares its counter i with 16, since there are 16 rows in a block, and so we must compute the
difference between rows 16 times. The controlling FSM loops 16 times to accumulate the
differences of each row, and then loads the final result into the register sad_reg, which connects to the
SAD component's output sad.
In Example 5.7, we estimated that a software solution would require about six cycles per pixel
pair comparison. Since there are 256 pixels in a 16x16 block, the software would require
256*6 = 1536 cycles to compare a pair of blocks. Our SAD circuit with concurrency instead requires only 1
cycle to compare each row of 16 pixels, which the circuit must do 16 times for a block, resulting in
only 16*1 = 16 cycles. Thus, the SAD circuit's speedup over software is 1536 / 16 = 96. In other
words, the relatively simple SAD circuit using concurrency runs nearly 100 times faster than a
software solution. That sort of speedup eventually translates to better quality digitized video from
whatever video appliance we are designing.
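The row-at-a-time computation and its cycle count can be mimicked in software as a rough model. The function name sad_concurrent and the constant names are ours; one loop iteration stands for one clock cycle in which the 16 subtractors, the absolute-value units, and the adder tree all operate on a row, and the six-cycles-per-pixel software figure is the example's own estimate.

```python
SOFTWARE_CYCLES_PER_PIXEL = 6  # estimate quoted from Example 5.7
PIXELS_PER_BLOCK = 16 * 16


def sad_concurrent(block_a, block_b):
    """Return (sum of absolute differences, modeled cycle count) for two 16x16 blocks."""
    total = 0
    cycles = 0
    for row_a, row_b in zip(block_a, block_b):
        # One memory word holds a whole row, so all 16 pixel pairs are
        # compared and summed in the same modeled cycle.
        total += sum(abs(x - y) for x, y in zip(row_a, row_b))
        cycles += 1
    return total, cycles


block_a = [[(r * 16 + c) % 256 for c in range(16)] for r in range(16)]
block_b = [[255 - v for v in row] for row in block_a]
sad, cycles = sad_concurrent(block_a, block_b)
software_cycles = PIXELS_PER_BLOCK * SOFTWARE_CYCLES_PER_PIXEL
print(sad, cycles, software_cycles / cycles)
```

The model counts only the 16 accumulation cycles, as the example does; cycles spent in other controller states are not included.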
Pipelining and concurrency can be combined to achieve even greater performance
improvements.
Component Allocation
When the same operation is used in two different states of a high-level state machine, we
can choose to either instantiate two functional units, one for each state, or one functional
unit, which will be shared among the two states. For example, Figure 6.73 shows a
portion of a state machine with two states A and B that each have a multiplication
operation. We can choose to use two distinct multipliers as shown in Figure 6.73(a) (we
assume the t variables represent registers). The figure also shows the control signals that
would be set in each state of the FSM controlling that datapath, with the t1 register being
loaded in the first state (t1ld=1), and the t4 register being loaded in the second state
(t4ld=1).
[Figure 6.73 labels: state A computes t1 = t2 * t3 and state B computes t4 = t5 * t6;
FSM outputs are A: (t1ld=1), B: (t4ld=1).]
Figure 6.73 Two different component allocations: (a) two multipliers, (b) one multiplier, (c) the one
multiplier allocation represents a tradeoff of smaller size for slightly more delay.
However, because a state machine can't be in two states at the same time, we
know that the FSM will perform only one multiplication at a time. So we can share one
multiplier among the two states. Because fast multipliers are big, such sharing could save
a lot of gates. A datapath with only one multiplier appears in Figure 6.73(b). In each state
of the state machine, the controller FSM would configure the multiplexer select lines to
pass the appropriate operands through to the multiplier, as well as loading the appropriate
destination register as before. So in the first state A, the FSM would set the select line for
the left multiplexer to 0 to let t2 pass through (sl=0) and would set the select line for
the right multiplexer to 0 to let t3 pass through (sr=0), in addition to setting t1ld=1 to
load the result of the multiplication into the t1 register. Likewise, the FSM in state B sets
the muxes to pass t5 and t6, and loads t4.
The terms "operator" and "operation" refer to behavior, like addition or multiplication.
The term "component" (aka "functional unit") refers to hardware, like an adder or a
multiplier.
Figure 6.73(c) illustrates that the one-multiplier design would have smaller size, at
the expense perhaps of slightly more delay due to the multiplexers.
A component library might consist of numerous different functional units that could
potentially implement a desired operation. For a multiplication, there may be several
multiplier components: MUL1 might be very fast but large, while MUL2 might be very
small but slow, and MUL3 may be somewhere in between. There may also be fast but
large adders, small but slow adders, and several choices in between. Furthermore, some
components might support multiple operations, like an adder/subtractor component, or an
ALU. Choosing a particular set of functional units to implement a set of operations is
known as component allocation. Automated RTL design tools consider dozens or
hundreds of possible component allocations to find the best ones that represent a good
tradeoff among size and performance.
Operator Binding
Given an allocation of components, we still have to choose which operations to map to
which components. For example, Figure 6.74 shows three multiplication operations, one
in state A, one in state B, and one in state C. Figure 6.74(a) shows one possible
component binding to two multipliers, resulting in two multiplexers. Figure 6.74(b) shows an
alternative binding to two multipliers, which results in only one multiplexer, since the
same operand (t3) is fed to the same multiplier MULA in two different states and thus
that multiplier's input doesn't require a mux. Thus, the second binding results in fewer
gates, with no performance loss, which is an optimization, as shown in Figure 6.74(c). Note
that binding not only maps operators to components, but also chooses which operand to
map to which component input. If we had mapped t3 to the left operand of MULA in
Figure 6.74(b), then MULA would have required two muxes rather than just one.
[Figure 6.74: datapaths for Binding 1 and Binding 2.]
Figure 6.74 Two different operator bindings: (a) binding 1 uses two muxes, (b) binding 2 uses only
one mux, (c) binding 2 represents an optimization compared to binding 1.
Mapping a given set of operations to a particular component allocation is known as
operator binding. Automated tools typically explore hundreds of different bindings for a
given component allocation.
Of course, the tasks of component allocation and operator binding are interdependent.
If we allocate only one component, then all operators must be bound to that
component. If we allocate two components, then we have some choices in binding. If we
allocate many components, then we have many more binding choices. Thus, some tools
will perform allocation and binding simultaneously, or the tools will iterate among the
two tasks. Together, component allocation and operator binding are sometimes referred to
as resource sharing.
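The effect of a binding on mux count can be sketched in a few lines of Python. This is only an illustration: the three operations (A: t1 = t2*t3, B: t4 = t5*t6, C: t7 = t8*t3) and the function name count_muxes are assumptions made for the sketch, not the exact contents of Figure 6.74. The rule applied is that a component input needs a mux only when more than one distinct source feeds it.

```python
def count_muxes(binding):
    """Count the multiplexers a binding needs.

    binding maps a component name to the list of (left, right) operand
    pairs routed to it, one pair per state that uses the component.
    An input needs a mux only if more than one distinct source feeds it.
    """
    muxes = 0
    for operand_pairs in binding.values():
        left_sources = {left for left, _ in operand_pairs}
        right_sources = {right for _, right in operand_pairs}
        muxes += (len(left_sources) > 1) + (len(right_sources) > 1)
    return muxes


# Hypothetical operations: A: t1 = t2*t3, B: t4 = t5*t6, C: t7 = t8*t3
binding_1 = {"MULA": [("t2", "t3"), ("t5", "t6")], "MULB": [("t8", "t3")]}
binding_2 = {"MULA": [("t2", "t3"), ("t8", "t3")], "MULB": [("t5", "t6")]}
# Same as binding 2, but with t3 mapped to the left input in state C
binding_2_swapped = {"MULA": [("t2", "t3"), ("t3", "t8")], "MULB": [("t5", "t6")]}

print(count_muxes(binding_1))          # 2
print(count_muxes(binding_2))          # 1
print(count_muxes(binding_2_swapped))  # 2
```

The swapped case shows why the choice of operand-to-input mapping matters as much as the choice of component.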
Operator Scheduling
Given a high-level state machine, we may introduce additional states to enable us to
create a smaller datapath. For example, consider the high-level state machine in Figure
6.75(a). The state machine has three states, with state B having two multiplications. Since
those two multiplications occur in the same state, and we know that each state will be a
single clock cycle, then we will need two multipliers (at least) in the datapath to support
the two simultaneous multiplications in state B. But what if we only have enough gates
for one multiplier? In that case, we can reschedule the operations so that there is at most
only one multiplication needed in any one state, as in Figure 6.75(b). Thus, when we
allocate components, we need only allocate one multiplier as shown, and as was also done in
Figure 6.73(b). The result is a smaller but slower design, as illustrated in Figure 6.75(c).
[Figure 6.75: (a) 3-state schedule in which one state performs both t1 = t2 * t3 and
t4 = t5 * t6; (b) 4-state schedule in which those two multiplications occur in separate
states; (c) size versus delay plot of the two schedules.]
Figure 6.75 Scheduling: (a) initial 3-state schedule requires two multipliers, (b) new 4-state
schedule requires only one multiplier, (c) new schedule trades off size for delay (extra state).
That scheduling example assumed that the computation of t4 could not be moved to state
A or state C, perhaps because those states already used a multiplier, or perhaps because
t5 and t6 were not ready yet in state A, and the new result in t4 was needed in state C.
Converting a computation from occurring concurrently in one state to occurring
across several states is known as serializing a computation.
Of course, the inverse rescheduling is also possible. Suppose we started with the
high-level state machine of Figure 6.75(b). If we have plenty of gates available and want
to improve our design's performance, we might reschedule the operations such that we
merge the operations of state B2 and B into the one state B, as in Figure 6.75(a). The
result is a faster but larger design, requiring two multipliers instead of one.
Generally, introducing or merging states, and assigning operations to those states, is
a task known as operator scheduling.
You may have noticed that operator scheduling is interdependent with component
allocation, which you may recall was interdependent with operator binding. Thus, the
tasks of scheduling, allocation, and binding are all interdependent. Modern tools may
combine the tasks somewhat, and/or may iterate among the tasks several times, in search
of good designs.
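Serializing a schedule can be illustrated with a minimal Python sketch. The function serialize, the string form of the operations, and the generated state names are assumptions for this sketch. It also ignores data dependencies between operations, which a real scheduler would have to respect.

```python
def serialize(states, max_mults):
    """Split any state holding more than max_mults multiplications.

    states is a list of (name, ops) pairs, each op a string such as
    "t1 = t2 * t3". Non-multiply ops stay in the first piece of a split
    state. Data dependencies are ignored in this simplified sketch.
    """
    schedule = []
    for name, ops in states:
        mults = [op for op in ops if "*" in op]
        others = [op for op in ops if "*" not in op]
        if len(mults) <= max_mults:
            schedule.append((name, ops))
            continue
        chunks = [mults[i:i + max_mults] for i in range(0, len(mults), max_mults)]
        for k, chunk in enumerate(chunks):
            piece_name = name if k == 0 else f"{name}{k + 1}"
            schedule.append((piece_name, (others if k == 0 else []) + chunk))
    return schedule


three_state = [
    ("A", ["(some operations)"]),
    ("B", ["t1 = t2 * t3", "t4 = t5 * t6"]),
    ("C", ["(some operations)"]),
]
four_state = serialize(three_state, max_mults=1)
for name, ops in four_state:
    print(name, ops)
```

With max_mults=1 the three-state schedule becomes a four-state schedule, so one multiplier suffices at the cost of one extra clock cycle.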
EXAMPLE 6.22 Smaller FIR filter using operator scheduling
Consider the 3-tap FIR filter of Example 5.8. That design had no controller, meaning the high-level
state machine actually had just one state containing all the datapath actions, as shown in Figure
6.76(a). We could reduce the size of the datapath by scheduling the operations across several states,
such that at most one multiplication and one addition occurs per state, as shown in Figure 6.76(b).
The first state loads the xt registers with samples. Note that the ordering of those actions next to a
state doesn't matter, since all the actions occur simultaneously. That state also clears a new register
named sum, which we had to introduce to keep track of the intermediate tap sums to be computed
in the later states. The second state computes the first tap of the filter result, the next state computes
the second tap, and the next state computes the third tap. The last state outputs the result, and then
the state machine returns to the first state again.
(a) One-state machine
Inputs: X (N bits)
Outputs: Y (N bits)
Local registers: xt0, xt1, xt2 (N bits)
S1: xt0 = X
    xt1 = xt0
    xt2 = xt1
    Y = xt0*c0 + xt1*c1 + xt2*c2

(b) Five-state machine
Inputs: X (N bits)
Outputs: Y (N bits)
Local registers: xt0, xt1, xt2, sum (N bits)
S1: sum = 0
    xt0 = X
    xt1 = xt0
    xt2 = xt1
S2: sum = sum + xt0*c0
S3: sum = sum + xt1*c1
S4: sum = sum + xt2*c2
S5: Y = sum

Figure 6.76 High-level state machine for 3-tap FIR filter: (a) original one-state machine, (b)
five-state machine with at most one add and one multiply per state. We ignore the writing
of the constant registers (c0, c1, c2) for simplicity in the example.
A datapath for this state machine is shown in Figure 6.77. The datapath requires only one
multiplier and one adder, because there is at most one multiplication and one addition in any given state
in Figure 6.76. The particular configuration of the multiplier, adder, and register in Figure 6.77 is
extremely common in signal processing circuits, and is generally known as a multiply-accumulate
(MAC) unit. The datapath multiplexes the inputs to the MAC unit.
Figure 6.77 Serial FIR filter datapath. The components in the dashed box comprise what is known
as a multiply-accumulate (MAC) component.
One further difference between this datapath and the concurrent datapath of Example 5.8 is
that this datapath has load lines on the xt registers and on yreg. The concurrent design loaded those
registers every clock cycle, but the serial design only loads those registers during particular states;
other states compute intermediate results.
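A behavioral sketch in Python may help confirm that the five-state schedule computes the same filter result as the direct 3-tap formula. The function names and the cycle counter are assumptions for illustration. The sketch models values only, not bit widths, and it abstracts away the exact register timing of the one-state design.

```python
def fir_reference(samples, c):
    """Direct 3-tap FIR: y[n] = c0*x[n] + c1*x[n-1] + c2*x[n-2], zeros before start."""
    padded = [0, 0] + list(samples)
    return [c[0] * padded[n + 2] + c[1] * padded[n + 1] + c[2] * padded[n]
            for n in range(len(samples))]


def fir_serial(samples, c):
    """Five-state schedule: one multiply and one add per state, as in a MAC unit."""
    xt0 = xt1 = xt2 = 0
    outputs, cycles = [], 0
    for x in samples:
        # S1: load the xt registers simultaneously and clear sum
        xt0, xt1, xt2, total = x, xt0, xt1, 0
        # S2, S3, S4: one multiply-accumulate per state
        for xt, coeff in ((xt0, c[0]), (xt1, c[1]), (xt2, c[2])):
            total = total + xt * coeff
        # S5: output the accumulated result
        outputs.append(total)
        cycles += 5
    return outputs, cycles


samples, coeffs = [1, 2, 3, 4], (2, 3, 4)
serial_out, serial_cycles = fir_serial(samples, coeffs)
print(fir_reference(samples, coeffs))  # [2, 7, 16, 25]
print(serial_out, serial_cycles)       # [2, 7, 16, 25] 20
```

The outputs match, while the serial version spends five cycles per sample.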
We estimated the performance of the concurrent design of Example 5.8 assuming 1 ns per gate,
2 ns per adder, and 20 ns per multiplier. The design had a critical path of 20 ns for the multiplier and
then 4 ns for two adders in series, for a total of 24 ns. That was also the time between new results
being taken in at the inputs and generated at the output: 24 ns. Using the more precise performance
measures of latency and throughput defined in Section 6.5, the concurrent design has a latency of 24
ns (delay from input to output), and a throughput of 1 sample every 24 ns. The serial design has a
critical path equal to the delay through a mux, multiplier, and adder. Assuming two gate-delays for the
mux, we obtain a delay of 2 ns + 20 ns + 2 ns, or 24 ns. The latency from input to output is five states,
meaning 5 * 24 ns = 120 ns. The throughput is 1 sample every 120 ns. Thus, the concurrent 3-tap FIR
filter has 120/24 = 5 times faster latency, as well as 5 times faster throughput, compared to the serial
FIR filter. Recall from Example 6.20 that a pipelined concurrent FIR filter has even faster throughput.
The performance difference between serial and concurrent becomes even more pronounced if
we look at an FIR filter with more taps. We estimated the latency of a concurrent 100-tap FIR filter in
Section 5.3, after Example 5.8, to be 34 ns (the delay is greater than that of the concurrent 3-tap filter
because the 100-tap filter needs an adder tree). The serial design would still have a 24 ns critical path, but
would require 102 states (1 to initialize, 100 to compute the taps, and 1 to output), for a latency of
102 * 24 ns = 2448 ns. Thus, the latency speedup of the concurrent design would be 2448/34 = 72.
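The latency figures above follow from simple arithmetic, reproduced here as a short Python sketch. The constant names are assumptions for the sketch. The delays and the 34 ns estimate for the concurrent 100-tap filter are taken from the example text.

```python
GATE_NS, ADDER_NS, MULT_NS = 1, 2, 20    # delays assumed in the example
MUX_NS = 2 * GATE_NS                     # two gate-delays for the mux

concurrent_3tap_ns = MULT_NS + 2 * ADDER_NS       # multiplier, then two adders in series
serial_path_ns = MUX_NS + MULT_NS + ADDER_NS      # mux, multiplier, adder
serial_3tap_ns = 5 * serial_path_ns               # five states per sample
serial_100tap_ns = 102 * serial_path_ns           # 1 init + 100 taps + 1 output
concurrent_100tap_ns = 34                         # estimate quoted from Section 5.3

print(concurrent_3tap_ns, serial_3tap_ns)         # 24 120
print(serial_3tap_ns / concurrent_3tap_ns)        # 5.0
print(serial_100tap_ns / concurrent_100tap_ns)    # 72.0
```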
We should also consider the size difference between the serial and concurrent designs. Let's
assume for illustrative purposes that an adder requires approximately 500 gates and a multiplier
requires 5000 gates. The serial design's one multiplier and one adder would thus require only 5500
gates. For a 3-tap FIR filter, the concurrent design's 3 multipliers and 2 adders would require
5000*3 + 500*2 = 16,000 gates. For a 100-tap FIR filter, the concurrent design's 100 multipliers
alone would require 100*5000 = 500,000 gates, roughly 100 times more gates than the serial design.
Intuitively, these numbers make sense. A concurrent design for 100 taps uses about 100 times
more gates (due to 100 multipliers instead of just 1) compared to a serial design, yet achieves
about 100 times better performance (due to computing 100 multiplications concurrently rather than
computing one multiplication at a time).
Depending on our performance needs and size constraints, we might consider designs in
between the two extremes of serial and concurrent, such as a design with two multipliers, which
would be roughly twice as big and twice as fast as the serial design, or ten multipliers, which would be
roughly ten times as big and ten times as fast as the serial design. Figure 6.78 illustrates tradeoffs
among serial and concurrent designs for an FIR filter.
[Figure 6.78: size versus delay plot showing concurrent FIR, compromises, and serial FIR.]
Figure 6.78 FIR design tradeoffs.
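The gate counts quoted in the example can be checked with a short Python sketch. The helper name datapath_gates is an assumption. Like the example, it counts only multipliers and adders and ignores registers, muxes, and the controller.

```python
ADDER_GATES, MULT_GATES = 500, 5000   # illustrative sizes from the example


def datapath_gates(multipliers, adders):
    """Gate estimate counting only multipliers and adders."""
    return multipliers * MULT_GATES + adders * ADDER_GATES


serial_gates = datapath_gates(1, 1)
concurrent_3tap_gates = datapath_gates(3, 2)
concurrent_100tap_mult_gates = datapath_gates(100, 0)

print(serial_gates)                  # 5500
print(concurrent_3tap_gates)         # 16000
print(concurrent_100tap_mult_gates)  # 500000
print(round(concurrent_100tap_mult_gates / serial_gates))  # 91
```

The last ratio comes out near 91, which is why the text speaks of roughly 100 times more gates.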
The above sections should have made it quite clear that RTL design presents an
enormous range of possible solutions to the designer. A single high-level state machine can be
implemented as any of a huge variety of possible implementations that differ
tremendously in their sizes and performance.
Moore versus Mealy High-Level State Machines
In the same way that we can create either a Moore or a Mealy FSM (see Section 6.3), we
can create Moore or Mealy high-level state machines. In the case of a high-level state
machine, a Moore type can only have actions associated with the states, while a Mealy
type can have actions associated with the transitions. As was the case with FSMs, a
Mealy type may result in fewer states. Mixing Moore and Mealy types is commonly done
in high-level state machines.
6.6 MORE ON OPTIMIZATIONS AND TRADEOFFS
Serial versus Concurrent Computation
Having seen in this chapter numerous examples of tradeoff techniques at various levels of
design, we can detect a common theme underlying some of those tradeoffs. The common
theme is that of serial versus concurrent computation. Serial means to perform tasks one
at a time. Concurrent means to perform tasks at the same time.
For example, in combinational logic design, we can reduce logic size by factoring
out terms. By factoring out terms, we are essentially serializing the computation, by
computing the factored out terms first, and then combining the results with other terms. In
datapath component design, we can improve an adder's speed by computing carries
concurrently, rather than waiting for the carry to ripple serially. In RTL design, we can
schedule operations across several states, serializing the operations to reduce size
compared to performing the operations in a single state. Example 6.21 and Example 6.22 both
illustrated serial versus concurrent computation tradeoffs, for an SAD circuit and an FIR
circuit, respectively.
Trading off between serial and concurrent computation is a fundamental concept
spanning all levels of digital design. As a general rule, a concurrent design is faster but
larger, while a serial design is smaller but slower.
Typically, numerous design options exist that span the range in between fully serial
and fully concurrent designs.
Optimizations and Tradeoffs at Higher versus Lower Levels of Design
As a general rule, the optimizations and tradeoffs made at the higher levels of design may
have a much greater impact on design criteria than the optimizations and tradeoffs made
at lower levels of design. For example, imagine wanting to drive to a city on the other
side of the country in as little time as possible. We could reduce time by reducing the
number of stops we make to eat, meaning we carry our own food in the car. We could
also reduce time by reducing stops for fuel, meaning we use a car with the longest driving
capacity per gas tank. Some people (not you, of course) might even consider driving
faster than the legal speed limit. But those are not the first things you typically think of
when trying to reduce driving time for a cross-country trip. The most important decision
is which route to take. One route might be 4000 miles long, while another route may be
only 2000 miles. The high-level decision of which route to take has far more impact than
all the lower-level decisions mentioned previously. Those lower-level decisions are only
really useful to us if we made the right high-level decision, and then if we still want to
reduce the time further.
In digital design, optimization/tradeoff decisions at the higher levels (e.g., RTL decisions)
may have a much larger impact than decisions at the lower levels (e.g., datapath component
decisions or multilevel logic decisions). For example, the RTL decision to build a serial or
concurrent FIR filter (Example 6.22) will have a far greater impact on circuit size and
performance than the datapath-component-level decision to use a carry-ripple or carry-lookahead
adder, or the combinational-logic-level decision to use two-level or multilevel logic. Those
lower-level decisions merely tune the size and performance of the higher-level decision.
Figure 6.79(a) illustrates this concept. An analogy might be a spotlight shining down on land,
illustrated in Figure 6.79(b): moving the spotlight left or right at high altitude (higher-level
decisions) has a larger impact on which land region (possible solutions) is illuminated than do
lower-altitude movements (lower-level decisions).
Figure 6.79 Higher- versus lower-level decisions: (a) higher-level decisions (denoted by the larger
two circles) focus the design into a region, while lower-level decisions tune within the region,
(b) spotlight analogy.
Algorithm Selection
When attempting to implement a system as a digital circuit, perhaps the highest-level
design decision, having therefore the most significant impact on design criteria like size,
performance, power, etc., is the selection of an algorithm. An algorithm is a set of steps
that solve a problem. The same problem can be solved by different algorithms.
Algorithms for the same problem, when implemented as a digital circuit, may result in
tremendously different performance and/or size. Some algorithms may simply be better
than others (optimization without much tradeoff), while other algorithms may represent
tradeoffs between performance, size, and other criteria. Selecting an algorithm for a
digital design problem is perhaps the highest level of design, and can have the biggest
impact on design criteria. For example, earlier examples showed various implementations
of an FIR filter. But there are many other algorithms for filtering very different from the
algorithm used in FIR. Some algorithms may provide higher-quality filtering at the
expense of more required computation; others may provide lower quality but need less
computation.
We illustrate algorithm selection using an example.
EXAMPLE 6.23 Data compression using different table lookup algorithms
We wish to compress data being sent over a long-distance computer network in order to achieve
faster communication by sending fewer bits. One method for such compression is to use short codes
for frequently appearing data values. For example, suppose each data item is 32 bits long. We might
analyze the data we expect to send and find the 256 most frequently appearing data values. We could
then assign a unique 8-bit code to each of those 256 values. When sending data over the network, we
first send a bit indicating whether we are about to send an encoded 8-bit data item or a raw 32-bit
data item. If the first bit is 1, that might mean encoded, and a 0 might mean raw. If all the data
items being sent happen to be among the top 256 most frequent ones, then we'd be sending 9 bits per
data item (1 bit indicating whether encoded, plus 8 bits of encoded data) rather than 32 bits per data
item, a compression of nearly 4x, which could translate to about 4 times faster communication.
We might design the encoder using a 256-word memory that stores the 256 most frequent values
in sorted order, from smallest to largest in binary. The code would then be the address of that word
in the memory. Figure 6.80 shows sample contents of such a memory, in hexadecimal. The contents
vary depending on the communicating applications we are considering.
One algorithm for searching a list of values in a memory is known as linear search. Starting at
address 0, we compare each memory word's contents with the data item we are looking for (known
as the key), incrementing the address and repeating until we find a match, at which point we treat the
address at which there was a match as the encoded value. If we get to address 255 and don't find a
match, we will transmit the raw data. The linear search algorithm is a slow way to search a sorted
list in memory. The algorithm requires 256 reads and compares for data items that aren't in the
memory, which may translate to 256 cycles. For data items that are in the memory, we would require
on average 128 reads.
[Figure 6.80: 256x32 memory, sample contents by address]
  0: 0x00000000
  1: 0x00000001
  2: 0x0000000F
  3: 0x000000FF
 96: 0x00000F0A
128: 0x0000FFAA
255: 0xFFFF0000
Figure 6.80 Searching a sorted memory for the key 0x00000F0A: linear search requires 97
reads/compares, binary search only 3.
A faster algorithm for searching for items in a memory is known as binary search. We
first sort the list and then store the list in the memory (we need only sort once). To look up an item,
we start in the middle of the memory, meaning address 128, and compare that address's contents with
the key. If the key is less than those contents, then we know that the key, if it exists in the memory,
must be somewhere between 0 and 127. So we go to the middle of that range, meaning address 64,
and again compare. If the key is less than the value there, we search 0 to 63; if greater, we search
65-127. So after each comparison, we decrease the remaining possible range of addresses in which
the key lies by one half. Halving 256 repeatedly can only be done 8 times: 256, 128, 64, 32, 16, 8,
4, 2, 1. In other words, after at most 8 comparisons, we've either seen the key, or shrunk the range
to 1, meaning the key can't be found in the memory. Binary search is 256/8 = 32 times faster than
linear search when the key does not exist in the memory, and roughly that much faster when the key
exists in the memory too. Yet binary search only requires a slightly smarter controller.
We see that the choice of the right algorithm makes a big difference in performance for
this example, a much bigger difference than determined by, say, the speed of the comparator
being used.
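The two lookup algorithms of this example can be compared with a small Python sketch that counts memory reads. The function names and the memory contents are assumptions: the memory below is a hypothetical sorted list chosen only so that address 96 holds 0x00000F0A, as in Figure 6.80. In this particular sketch, the midpoint rounds up so that a 256-word search begins at address 128, and a missing key costs at most 9 reads, close to the estimate of 8 given above.

```python
def linear_search(memory, key):
    """Return (address or None, number of reads), scanning from address 0."""
    for address, word in enumerate(memory):
        if word == key:
            return address, address + 1
    return None, len(memory)


def binary_search(memory, key):
    """Return (address or None, number of reads) for a sorted memory."""
    low, high, reads = 0, len(memory) - 1, 0
    while low <= high:
        middle = (low + high + 1) // 2   # rounds up: first read is address 128
        reads += 1
        if memory[middle] == key:
            return middle, reads
        if key < memory[middle]:
            high = middle - 1
        else:
            low = middle + 1
    return None, reads


# Hypothetical sorted contents, chosen only so that address 96 holds 0x00000F0A
memory = [0x0F0A - 96 + address for address in range(256)]

print(linear_search(memory, 0x0F0A))      # (96, 97)
print(binary_search(memory, 0x0F0A))      # (96, 3)
print(linear_search(memory, 0xFFFFFFFF))  # (None, 256)
```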
Power Optimization
Power is becoming an important design criterion, both in high-end computing and in
embedded computing. The unit of power is watts, which represents energy per second
(i.e., joules per second). In high-end computing, like desktop PCs, servers, or video-game
consoles, the chips inside a computer consume a lot of power, causing the chips to
become very hot. For example, a typical chip inside a PC may consume 60 watts. Think
about touching a 60 watt light bulb (but don't actually touch one) to understand how hot
that is. Designing low-power chips reduces the need for expensive chip cooling methods
beyond simple fans in high-end computing, and also reduces the electricity costs, which
can be quite significant for companies operating large numbers of computers.
In embedded computing, even simple cooling methods like fans are often not
available. For example, your cell phone does not have a fan (if it did, people might find their
tie or scarf getting stuck in that fan). Portable embedded devices might have chips that
run at only 1 watt or less.
Furthermore, portable devices typically get their energy from batteries, and thus low-power
chips are necessary to extend battery life, especially considering the fact that batteries are not
improving fast enough to keep pace with increasing power consumption. By some measures,
energy demand per chip is doubling about every three years (going along with Moore's
Law). Figure 6.81 plots such energy demands compared to battery energy densities improving
at their present rate of only about 8% per year. The increasing gap shown translates to shorter
battery lifetimes for a device like a cell phone, or translates to bigger batteries.
[Figure 6.81: plot of energy demand and battery energy density versus year, 2001 to 2007.]
Figure 6.81 Battery energy density is improving slower than the increasing
energy demands of digital chips.
The most popular IC technology today uses CMOS transistors, and the biggest contributor to
power consumption in CMOS is the switching of values from 0 to 1. The reason for this is
that wires aren't perfect, having capacitance (we don't put a capacitor there on purpose; it
is simply a result of the fact that wires aren't perfect conductors of electricity). Switching
the wire from 0 to 1 requires charging that capacitor. Switching from 1 back to 0 causes
that charge to be discharged to ground. That switching results in power being consumed.
This power is known as dynamic power, since this power comes from the changing of
signals (dynamic means changing). Dynamic power consumption of a CMOS wire is
proportional to the size of the capacitance (C) of the wire, multiplied by the voltage (V)
squared, multiplied by the frequency at which the wire switches (f), namely:
$P = k \cdot C \cdot V^2 \cdot f$    (equation for CMOS dynamic power consumption)
where k is some constant. To compute the dynamic power of a circuit, we would add up
the power computed by the above equation for every wire.
Looking at the above equation, one can clearly see that lowering the voltage will
cause the greatest reduction in dynamic power, because of the voltage having a quadratic
(squared) contribution to dynamic power. Low-level circuit designers seek to reduce
power by creating transistors that operate at the lowest voltage possible, to reduce the V
term, and that have the smallest wire capacitance possible, to reduce the C term. Digital
designers can therefore choose to utilize gates that operate with a lower voltage.
Unfortunately, lower voltage gates have a longer delay than higher voltage gates,
resulting in a tradeoff between power and performance.
Another way to reduce the dynamic power consumed by a circuit is to reduce the
circuit's clock frequency, which obviously reduces the f term for all the clock wires in the
circuit, as well as for the many other wires that change on each clock edge (like register
wires and the logic connected to those registers' outputs). But again, reducing the clock
frequency slows performance, resulting in a tradeoff between power and performance.
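The proportionality described above (capacitance times voltage squared times switching frequency, scaled by a constant k) can be explored with a brief Python sketch. The function name and the normalized values of 1.0 are assumptions chosen only to show relative changes, not real device parameters.

```python
def dynamic_power(k, capacitance, voltage, frequency):
    """Dynamic power of one wire: k * C * V^2 * f."""
    return k * capacitance * voltage ** 2 * frequency


baseline = dynamic_power(1.0, 1.0, 1.0, 1.0)        # normalized reference
half_voltage = dynamic_power(1.0, 1.0, 0.5, 1.0)    # V halved
half_frequency = dynamic_power(1.0, 1.0, 1.0, 0.5)  # f halved

print(half_voltage / baseline)    # 0.25
print(half_frequency / baseline)  # 0.5
```

Halving the voltage cuts dynamic power to a quarter, while halving the frequency only cuts it in half, which is the quadratic effect the text points out.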
The chief technical officer at a major chip design company told me in 2004 that, for
their company, "Power is enemy number one." The reason is that they had scaled their
voltage down nearly as low as possible, yet are putting more transistors on each IC every
year due to the shrinking of transistor sizes, meaning more wires switching. And
capacitance isn't decreasing at the same rate as transistor sizes. The result is that an IC
consumes more power as we put more transistors on the IC, which can result in problems
due to too much heat and due to fast battery energy consumption.
Clock Gating (Advanced Technique)
Assuming the C and V terms have been reduced to the extent possible using transistor-level
design techniques, power can be reduced further by reducing f, the frequency at which wires
switch. One method for reducing such power is known as clock gating. Clock gating is the
disabling of the clock signal in regions of the chip that we know are not computing anything
at a given time. Clock gating saves power because a significant percentage of the wires
switching in a chip are the wires that distribute the clock to all the registers and flip-flops;
perhaps 20%-30% of the power consumption is due to the clock signal switching
throughout the chip. Clock gating reduces f without slowing the clock frequency itself.
In clock gating, the clock is disabled by ANDing the clock signal with an
enable signal that is set by the state machine. Recall that a register with parallel load
internally reloads the same value from the register's flip-flops back into the flip-flops on a
rising clock edge. Preventing the clock edge from appearing keeps the same values in the
flip-flops, yielding the same net result: the register's contents don't change.
Clock gating is not something that digital designers typically do themselves. Rather,
modern synthesis tools may allow us to specify clock enable and disable using special
commands in each state. These tools must use extreme caution, because adding a gate on
a clock signal delays the clock signal, resulting in clock signals in different parts of the
circuit being slightly different from one another, an effect known as clock skew. The tools
must perform careful timing analysis to ensure that the clock skew does not change
overall circuit behavior. Furthermore, putting gates on a clock signal can reduce the
sharpness of the clock edges, and so must be done carefully, sometimes using special
gates. Nevertheless, the technique is widely used by low-power tools in practice.
We demonstrate clock gating with an example.
EXAMPLE 6.24 Serial FIR filter with clock gating to reduce power
We designed a serial FIR filter in Example 6.22. A five-state state machine controlled
the datapath. The state machine loaded the three xt registers only in the first state, state
S1, and loaded the yreg register only in the last state, state S5. Yet, the design routed
the clock signal to all four registers utilizing four wires, labeled n1-n4 in Figure
6.82(a). Notice from the timing diagram at the top of the figure that n1-n4 change
identically as the clock signal changes, and remember that every such change consumes
dynamic power.
Figure 6.82 Clock gating: (a) the clock signal switches every cycle on all the
heavily bolded wires, but the xt registers are only loaded in state S1 and the yreg
in state S5, so most of the clock switching is wasted; (b) gating the clock reduces
switching on the clock wires.
Figure 6.82(b) shows a design using clock gating. The controller gates the clock to the xt
registers by setting s1 to 0 in all states but S1. Likewise, the controller gates the clock to the yreg
register by setting s5 to 0 in all states but S5. Notice the significant decrease in signal switching on
the clock's wires n1-n4, shown at the bottom of Figure 6.82.
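The reduction in clock switching can be estimated with a small Python sketch that counts the rising clock edges each register sees during one pass through the five states. The names below are assumptions for illustration, and the model is deliberately simple: one rising edge per state, with no modeling of skew or of the gating logic's own power.

```python
STATES = ["S1", "S2", "S3", "S4", "S5"]
# State in which each register's clock is enabled when gating is used
LOAD_STATE = {"xt0": "S1", "xt1": "S1", "xt2": "S1", "yreg": "S5"}


def clock_edges(gated):
    """Rising clock edges seen by each register in one pass through the states."""
    edges = {}
    for register, load_state in LOAD_STATE.items():
        edges[register] = sum(
            1 for state in STATES if not gated or state == load_state
        )
    return edges


ungated_total = sum(clock_edges(gated=False).values())
gated_total = sum(clock_edges(gated=True).values())
print(ungated_total, gated_total)  # 20 4
```

Under these assumptions the four clock wires see 4 rising edges per pass instead of 20.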
Low-power gates on noncritical paths
Not all gates are equally fast. Engineers that build gates from transistors can make a gate
faster by increasing the size of the gate's transistors, or by operating the gate at a higher
voltage, or by any of several other means. Thus, one two-input AND gate might have a 1 ns
delay, while another two-input AND gate might have a 2 ns delay. The latter AND may
consume less power, due to its smaller size or lower voltage.
If we want to reduce the power consumed by a circuit, we can build the entire circuit using
low-power gates to achieve low power at the expense of slower performance, as illustrated in
Figure 6.83.
Alternatively, we can put low-power gates only on the noncritical paths, such that we
lengthen those paths to have delays no longer than the critical path, as shown in the
following example.
[Figure 6.83: plot comparing high-power gates, low-power gates on noncritical path, and
low-power gates.]
Figure 6.83 Using low-power gates.
EXAMPLE 6.25 Reducing noncritical path power with multilevel logic
In Example 6.12, we reduced the size of a noncritical path by using multilevel logic. In this
example, we instead reduce the power consumed by the noncritical path by using low-power gates.
Assume that normal gates have a delay of 1 ns and consume 1 nanowatt of power, and that
low-power gates have a delay of 2 ns and consume 0.5 nanowatts of power.
The left side of Figure 6.84 shows the same circuit from Example 6.12, having a critical path
of 3 gate-delays. Assume that all the gates are normal gates, meaning the critical path delay is 3 ns,
and the total power consumption is 5 nanowatts.
Figure 6.84 Using low-power gates on noncritical paths. Numbers inside a gate represent the gate's
delay in nanoseconds, and the gate's power consumption in nanowatts.
The bottom two AND gates lie on two noncritical paths having delays of only 2 ns. We can
thus replace those AND gates by low-power AND gates. The result is that the two paths' delays
lengthen to 3 ns, so become equal to the critical path delay, but not longer. The result is also that the
total power becomes only 4 nanowatts instead of 5 nanowatts (a 20% reduction).
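The arithmetic of this example can be reproduced with a short Python sketch. The tuple representation of a gate and the helper names are assumptions. The circuit is modeled only as five gates and one two-gate noncritical path, not as the actual netlist of Figure 6.84.

```python
NORMAL = (1, 1.0)      # (delay in ns, power in nW)
LOW_POWER = (2, 0.5)


def total_power(gates):
    """Sum of the power of all gates, in nanowatts."""
    return sum(power for _, power in gates)


def path_delay(gates):
    """Sum of the delays of the gates along one path, in nanoseconds."""
    return sum(delay for delay, _ in gates)


all_normal = [NORMAL] * 5
two_low_power = [NORMAL] * 3 + [LOW_POWER] * 2   # bottom two ANDs replaced

noncritical_before = path_delay([NORMAL, NORMAL])     # 2 ns
noncritical_after = path_delay([LOW_POWER, NORMAL])   # 3 ns, equals critical path

print(total_power(all_normal), total_power(two_low_power))  # 5.0 4.0
print(noncritical_before, noncritical_after)                # 2 3
```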
6.7 PRODUCT PROFILE: DIGITAL VIDEO PLAYER/RECORDER
Digital Video Overview
In the 1990s, the digitization of video became practical due to faster, smaller, and lower-power
digital circuits. Previously, video was largely captured, stored, and played using
analog methods. Digitized video works by sampling an analog video signal and transforming
the samples to digital values. Such digitization is similar to the audio digitization
example from Figure 1.1, but with some additional work.
A video is actually a series of quickly displayed still pictures, known as frames, as shown
in Figure 6.85(a). One second of video might consist of about 30 frames; the human eyes and
brain see such a rapid sequence of frames as a smooth, continuous video.
A digital display may be divided into several hundred thousand tiny "picture elements," or
pixels. A typical size might be about 720 across and 480 down. For each frame, a digitized
sample captures several values for each pixel, like the intensity of the red, blue, and green
components of the light at that pixel, converting analog measurements of those intensities
into digital numbers. The result is the representation of a digitized frame as a (large) series
of 0s and 1s, and the representation of a digitized video as a large series of digitized frames.
Digitized video can be transmitted, stored, replayed, and copied with much higher quality than
analog video. Furthermore, digitized video can be compressed, resulting possibly in higher
quality video than analog video transmitted or stored using the same medium.
Figure 6.85 Video: (a) is a series of pictures, or frames, with much interframe redundancy,
(b) can be constructed from I (intra) frames and P (predicted) frames, shown with relative bit
encoding sizes.
DVD-One Form of Digital Video Storage
Digital video discs (also known as digital versatile discs), or DVDs, store video in a
digital format. First sold in 1997, DVDs replaced the analog video technology known as
VHS tape. DVD players appear in home entertainment centers, personal computers, automobiles
(especially family-oriented vehicles), and even as stand-alone portable units. In
2001, consumer electronic companies introduced the first DVD recorder to market,
allowing individuals to record television shows to special recordable DVDs. The popularity
of DVDs compared to the previously popular analog-based VHS technology stems
from several advantages, including better quality video, no deterioration in video quality
over time, and the ability to jump directly to particular parts in a video without having to
sequentially forward or rewind.
DVDs store large amounts of data on a thin reflective layer of metal. Although the
metal layer within a DVD looks flat from our perspective, there are actually billions of
tiny pits on the metal layer that store the data. These pits, or lack of pits (called lands),
store the binary data on the DVD. Figure 6.86 shows how a DVD player reads the information
off a DVD. Using a very precise laser, the laser's light is focused onto the metal
layer within the DVD. The metal layer reflects the light onto an optical sensor that can
detect if the light is reflected off of a pit or a land. By detecting the different regions, the
optical sensor creates a stream of binary values as it reads the DVD.
Figure 6.86 How a DVD player reads a DVD. The DVD player's optical pickup element shines a
laser on the surface of the DVD. The DVD reflects the laser back to an optical sensor, and the
optical sensor uses the intensity of the reflected laser to output the sequence of 0s and 1s stored on
the DVD. A video decoder circuit converts the binary data to a sequence of frames that humans
interpret as a moving picture.
The DVD's binary data is organized into a series of tracks that spiral outward from the
center of the DVD. As the DVD player is reading the data, the laser and optical sensor must
slowly move outward from the center of the DVD to the outer edge. If a DVD is dual-layered,
the data on the disk's second layer is stored in a spiral that moves from the disk's outer
to inner edge. The motivation for the second layer's reverse spiral is to prevent the laser and
optical sensor from needing to reposition themselves to the center of the disk after focusing on the
second layer during a layer change. (You may have noticed a DVD pause momentarily at a
certain point in a movie during a layer change.)
A single-layer single-sided DVD can store 4.7 gigabytes of data (meaning 37.6 gigabits),
but that amount is not enough for a movie unless the data is compressed. Consider
a video with a resolution of 720 pixels by 480 pixels, using 24 bits of information per
pixel, and displayed at 30 frames per second. One frame would require 720*480*24 =
8,294,400 bits, or about 8 Mbits. One second of video, or 30 frames, would require
30*8,294,400 = 248,832,000 bits, or about 250 Mbits. A 100-minute movie would thus
require about 250 Mbits/sec * 100 min * 60 sec/min = 1500 Gbits. But a DVD can only
hold 37.6 Gbits. To store a movie, a DVD must store the video in a compressed format.
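The storage estimate above can be reproduced with a few lines of arithmetic. This is a minimal sketch; the constant names are illustrative, and it assumes decimal units (1 gigabyte = 10^9 bytes), as the 4.7 gigabyte = 37.6 gigabit conversion implies.

```python
WIDTH, HEIGHT = 720, 480          # pixels
BITS_PER_PIXEL = 24
FRAMES_PER_SECOND = 30
MOVIE_MINUTES = 100
DVD_CAPACITY_BITS = 4.7e9 * 8     # 4.7 gigabytes expressed in bits

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL
bits_per_second = bits_per_frame * FRAMES_PER_SECOND
movie_bits = bits_per_second * MOVIE_MINUTES * 60

# Ratio by which the movie must shrink to fit on the disc.
required_ratio = movie_bits / DVD_CAPACITY_BITS

print(bits_per_frame, bits_per_second, movie_bits / 1e9, round(required_ratio))
```

Without the intermediate rounding to 250 Mbits per second, the movie comes to just under 1500 Gbits, and the required compression ratio still rounds to about 40:1.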
A DVD is only one of many different digital video storage media. Digitized video may
be stored on any storage media capable of storing 0s and 1s in some form, such as on tape
(used in many digital video cameras), on a flash memory (used in digital cameras and cell
phones with video recording capability), on a CD, or on a computer hard drive. All such
media are typically still quite limited and thus require compression methods.
MPEG-2 Video Encoding: Sending Frame Differences Using I-, P-, and B-Frames
MPEG-2 video compression was defined and standardized by the Motion Picture Experts
Group in 1994 (as an improvement over the 1992 MPEG-1 standard), and is used in DVDs,
digital television, and numerous other digital video devices. MPEG-2 compression ratios
range from 30:1 to 100:1, or more. The compression ratio is determined by dividing the
number of bits of the digitized video before compression by the number of bits after compression.
So if a digitized video requires 400 gigabytes uncompressed but only 4 gigabytes
compressed, the compression ratio would be 400/4 = 100:1. Note that packing 1500 Gbits of
a movie into 37.6 Gbits would require a compression ratio of 1500 Gbits/37.6 Gbits = 40:1.
The key observation leading to MPEG-2's compression method is that typically very
little difference exists between two successive frames in a video; in other words, video
typically has much interframe redundancy. For example, a frame may consist of a person
standing in front of a mountain, as in Figure 6.85(a). The next frame (which represents
perhaps 1/30th of a second later) may be almost identical to the previous frame, except
that the person's mouth has opened slightly. The next frame may still be almost identical,
with the person's mouth opened slightly more. And so on.
Therefore, MPEG-2 does not merely encode each frame as a distinct picture. Instead,
to take advantage of the interframe redundancy, MPEG-2 may choose to encode each
frame as one of the following:
An I-frame, or Intracoded frame, is a complete picture.
A P-frame, or Predicted frame, is a frame that merely describes the difference
between the current frame and the previous frame. Thus, to derive the picture for
this frame, one must combine the P-frame with the previous frame.
For example, Figure 6.85(b) shows P-frames that contain only the differences from
the previous frame. A P-frame will obviously require fewer bits than an I-frame. Example
frame sizes might be about 8 Mbits for an I-frame, but only 2 Mbits for a P-frame. Thus,
instead of representing 30 frames as 30 complete pictures (30 I-frames), a compression
method might represent those frames using the following sequence of frames: I P P P P P
P P P P P P P P P I P P P P P P P P P P P P P P. The compression ratio in this example
would thus be 8 Mbits * 30 / (2 * 8 Mbits + 28 * 2 Mbits) = 240 / 72 = 3.3:1. Obviously,
a picture created by combining predicted frames with a previous frame won't be a perfect
representation of the original picture, especially if there is a lot of motion in the video.
MPEG-2 thus trades off some quality for compression.
To achieve even further reduction, MPEG-2 uses a third frame type:
A B-frame, or Bidirectional predicted frame, is a frame that can store differences
from previous and future frames.
B-frames can thus be even smaller than P-frames. An example B-frame size might be
just 1 Mbit.
EXAMPLE 6.26 Computing compression ratios involving I-, P-, and B-frames
Assume a 30-frame MPEG-2 sequence has the following frame sequence: I B B P B B P B B P B
B P B B I B B P B B P B B P B B P B B. Assume average frame sizes of 8 Mbits for I-frames,
2 Mbits for P-frames, and 1 Mbit for B-frames. Compute the compression ratio.
The compression ratio in this example would be 8 Mbits * 30 / (2 * 8 Mbits + 8 * 2 Mbits +
20 * 1 Mbits) = 240 / 52 = 4.6:1.
The example sequence of frames is in fact fairly typical for MPEG-2 video, with I-frames
occurring about every 12-15 frames.
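The two frame-sequence ratios computed so far can be checked with a small sketch. The function and variable names are illustrative; the frame sizes are the example averages from the text, and the uncompressed size assumes every frame would otherwise be stored as a full 8 Mbit picture.

```python
# Average frame sizes in Mbits, taken from the examples above.
FRAME_MBITS = {"I": 8, "P": 2, "B": 1}

def compression_ratio(sequence):
    """Ratio of all-I-frame size to the size of the given I/P/B sequence."""
    frames = sequence.split()
    uncompressed = len(frames) * FRAME_MBITS["I"]
    compressed = sum(FRAME_MBITS[frame] for frame in frames)
    return uncompressed / compressed

# Each 30-frame sequence is two identical 15-frame groups.
ip_only = "I P P P P P P P P P P P P P P " * 2
ipb = "I B B P B B P B B P B B P B B " * 2

print(round(compression_ratio(ip_only), 1))  # I- and P-frames only
print(round(compression_ratio(ipb), 1))      # I-, P-, and B-frames
```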
MPEG-2 video encoders may seek to create about 30 frames per second. With hundreds
of thousands of pixels per frame that must be compared with another frame,
MPEG-2 encoding requires a large amount of computation to determine which frames
should be I, P, and B, and what should be the values for the P- and B-frames. Furthermore,
much of that computation will consist of the same computation performed between
corresponding regions of two frames. Thus, many MPEG-2 encoders utilize custom
digital circuits to parallelize those computations at the expense of more hardware size.
For example, Example 6.21 built a sum-of-absolute-differences circuit using more parallelism
than in Example 5.9, at the expense of a larger circuit size. Such a circuit would be
useful in a video encoder needing to quickly determine whether a frame should be
encoded as a P- or B-frame, or instead should be encoded as an I-frame. Additional circuits
might compute the actual values of P- and B-frames.
Likewise, an MPEG-2 video decoder might use circuits to quickly recompose I-, P-,
and B-frames back into full picture frames, although decoding MPEG-2 video is easier
than encoding because the actual determination of P- and B-frame contents is only done
during encoding; decoding merely needs to combine P- and B-frames with their surrounding
frames.
Transforming to the Frequency Domain for Further Compression
DCT: Discrete Cosine Transform
We saw in the previous section that sending a frame (P or B) that is just the difference
from a previous or future frame can result in some compression. However, the compression
ratios achieved were only about 4:1. Recall earlier that a DVD needs perhaps a 40:1
compression ratio to store a full-length movie. Thus, further compression is needed.
MPEG-2 therefore further compresses each I-, P-, and B-frame individually. The compression
method involves applying what is known as a discrete cosine transform to 8x8
blocks of pixel values within each frame. The discrete cosine transform is also used in the
well-known JPEG standard for compressing still images, like those in a digital camera. The
discrete cosine transform, or DCT, transforms information from the spatial domain to the
frequency domain. (The DCT is similar to another popular technique known as the Fast
Fourier Transform, or FFT, also used for translating to the frequency domain.)
Translating to the frequency domain is a powerful concept, which is widely used in
digital signal processing. To understand this concept, consider wanting to digitally store the
analog signal shown in Figure 6.87, using the fewest bits possible. The signal is a 1 Hz
cosine wave with an amplitude of 10. To store the signal digitally, we could sample the
signal at frequent intervals, perhaps every millisecond, and record the measured signal value
as a binary number, perhaps 8 bits wide. One second would thus result in 1000 * 8 = 8000 bits.
On the other hand, we could just store the fact that the signal is a cosine wave with a
frequency of 1 Hz and an amplitude of 10. If we store each of those numbers as an 8-bit
value, then we only need to store 8 + 8 = 16 bits. Sixteen bits is far less than 8000 bits.
Figure 6.87 Digitizing signals by translating to the frequency domain. [Horizontal axis: time (s).]
Of course, not all signals that we want to digitize are simple cosine waves. But, and this is the
key idea underlying frequency domain representation, we can approximate any original signal
as a sum of cosine waves of different frequencies and amplitudes. If we break the original
signal into small regions, we obtain an even better approximation. For example, we might
approximate one region as the sum of a 1 Hz cosine wave of amplitude 5 plus a 2 Hz cosine
wave of amplitude 3. We might approximate another region as the sum of 50 different cosine
waves of different frequencies and amplitudes. The smaller the region we consider, and the
more different cosine wave frequencies we consider, the more accurate will be our
approximation to the real signal.
Rather than storing the actual frequencies along with the amplitudes of the cosine
waves, we could instead decide only to consider using particular frequencies, such as:
1 Hz, 2 Hz, 4 Hz, 8 Hz, 16 Hz, and so on. Then, we can simply send the amplitudes of
those particular cosine waves: (5, 3, 0, 0, 0, ...). Let's refer to these amplitudes as
coefficients.
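A minimal sketch of how a receiver could rebuild a signal region from such coefficients follows. The function name, the five-entry frequency list, and the one-millisecond sampling step are assumptions made for illustration; only the frequencies and the coefficients (5, 3, 0, 0, 0) come from the text.

```python
import math

# Frequencies agreed on in advance by sender and receiver (in Hz).
FREQUENCIES_HZ = [1, 2, 4, 8, 16]

def reconstruct(coefficients, t):
    """Value at time t of the sum of cosines with the given amplitudes."""
    return sum(amplitude * math.cos(2 * math.pi * freq * t)
               for amplitude, freq in zip(coefficients, FREQUENCIES_HZ))

coefficients = [5, 3, 0, 0, 0]   # only these five numbers need to be sent
samples = [reconstruct(coefficients, n / 1000) for n in range(1000)]

print(len(samples), round(samples[0], 3), round(reconstruct(coefficients, 0.5), 3))
```

Sending five coefficients instead of 1000 samples is where the savings come from; the cost is that the reconstruction is only as accurate as the chosen set of frequencies allows.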
The DCT in MPEG-2 converts an input 8x8 block, whose values represent pixel
intensities, to an 8x8 block representing the coefficients of predetermined "frequencies."
In the video domain, each frequency represents a different block pattern, with low frequency
being an almost constant pattern and high frequency being a changing pattern
(like a checkerboard). The DCT determines a set of coefficients such that adding the
predetermined patterns together with each pattern multiplied by its coefficient yields one
resulting pattern very similar to the original input block.
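The equation for a two-dimensional DCT applied to an 8x8 block of numbers is:
$$F(u, v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} D[x, y]\, \cos\left(\frac{\pi (2x + 1) u}{16}\right) \cos\left(\frac{\pi (2y + 1) v}{16}\right)$$
$$C(h) = \begin{cases} \frac{1}{\sqrt{2}}, & \text{if } h = 0 \\ 1, & \text{otherwise} \end{cases}$$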
The input is an 8x8 block, D[x, y]. The output is another 8x8 block, with F(u, v) computing
the coefficient at row u, column v for the output block.
An MPEG-2 encoder may utilize custom digital circuits for fast DCT computation.
Notice that computing each coefficient requires evaluating the rightmost term (let's call
that term the inner term) 64 times, and that must be done for each of the 64 coefficients,
meaning 64*64 = 4096 evaluations of the term. And that inner term itself requires several
multiplications. Furthermore, the DCT operates on 8x8 blocks, but in a 720x480 I-frame
there will be 5400 such blocks. Thus, the DCT for one I-frame could require 5400*4096
= 22 million computations of the inner term. And that encoding may have to occur at 30
frames per second. You can begin to see why an MPEG-2 encoder may need to use
custom digital circuits to compute the DCT quickly, using extensive parallelism and pipelining
to obtain the necessary performance.
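The workload estimate can be tallied directly. This is a rough count under the assumptions stated in the text (a straightforward DCT with no shared work between coefficients); the variable names are illustrative.

```python
BLOCK = 8                                            # the DCT works on 8x8 blocks

inner_terms_per_coefficient = BLOCK * BLOCK          # one per (x, y) pair
coefficients_per_block = BLOCK * BLOCK               # one per (u, v) pair
inner_terms_per_block = inner_terms_per_coefficient * coefficients_per_block

blocks_per_frame = (720 // BLOCK) * (480 // BLOCK)   # 90 blocks across, 60 down
inner_terms_per_frame = blocks_per_frame * inner_terms_per_block
inner_terms_per_second = inner_terms_per_frame * 30  # at 30 frames per second

print(inner_terms_per_block, blocks_per_frame,
      inner_terms_per_frame, inner_terms_per_second)
```

At 30 frames per second the count reaches several hundred million inner-term evaluations per second, which illustrates why parallel hardware is attractive here.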
The DCT computation can be sped up further by precomputing the cosine terms of
the inner term. Notice that the DCT computes two cosines based on the input values of u
and x and the input values of v and y. However, because the DCT operates on 8x8 blocks,
the variables u, v, x, and y only range in value from 0 to 7. Therefore, we can precompute
the 64 possible cosine values needed for the DCT computation and store those values in
an 8x8 table, which may be programmed into a ROM. We can then rewrite the DCT
transform as follows:
$$F(u, v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} D[x, y]\, \mathrm{cos}[x, u]\, \mathrm{cos}[y, v]$$
Using a ROM to store the precomputed cosine values speeds up the computation of
the inner term of the DCT.
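The table-lookup idea can be sketched in software as follows. This is an illustration rather than an encoder implementation: the names `COS`, `c`, and `dct_8x8` are hypothetical, and the 1/4 C(u) C(v) scaling is assumed from the commonly used 8x8 DCT definition. The list `COS` plays the role of the ROM, holding the 64 precomputed cosine values.

```python
import math

N = 8
# "ROM" contents: COS[x][u] = cos(pi * (2x + 1) * u / 16), computed once.
COS = [[math.cos(math.pi * (2 * x + 1) * u / 16) for u in range(N)]
       for x in range(N)]

def c(h):
    """Scaling factor: 1/sqrt(2) for index 0, otherwise 1."""
    return 1 / math.sqrt(2) if h == 0 else 1.0

def dct_8x8(block):
    """Two-dimensional DCT of an 8x8 block using table lookups for the cosines."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += block[x][y] * COS[x][u] * COS[y][v]  # inner term
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out

# A perfectly flat block has only a "constant pattern" component.
flat = [[8] * N for _ in range(N)]
coeffs = dct_8x8(flat)
print(round(coeffs[0][0], 6))
```

For the flat block, every coefficient other than the one at row 0, column 0 comes out as zero (up to floating-point error), which matches the description of low frequency as an almost constant pattern.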
Quantization
Translating to the frequency domain using the DCT does not directly perform compression;
we merely converted an input 8x8 block to an output 8x8 block. That output 8x8
block represents amplitudes of particular cosine wave frequencies. We can achieve compression
by rounding those amplitudes, such that we use fewer bits to represent the
amplitudes. For example, suppose we use 8 bits to represent the amplitude, meaning we
can represent amplitudes ranging from 0 to 255. Suppose we only represent even amplitudes,
meaning 0, 2, 4, ..., 254. In that case, we can drop the lowest order bit in the
representation of the amplitude, resulting in only 7 bits. The decoder would merely
append a 0 to the 7-bit number to obtain an 8-bit number again. For example, the 8-bit
number 00001111 would be compressed to the 7-bit number 0000111 with an implicit 0
in the eighth bit. The decoder would expand that 7-bit number back to the 8-bit number
00001110. Notice that the decoded number is slightly different than the original, being
14 rather than the original 15 (an example of why MPEG-2 compression loses some
image quality). We could take this rounding concept further, only representing amplitudes
that are multiples of 4 (thus dropping the two lowest order bits, yielding a 6-bit
representation), or are multiples of 8 (dropping the three lowest order bits, yielding a 5-bit
representation). 00001111 might be represented as 00001 with three implicit 0s, thus
decoded back to 00001000. The decoded number of 8 is different from the original
number 15 due to the rounding.
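Dropping and restoring low-order bits maps naturally onto shift operations. The following sketch uses hypothetical function names and reproduces the two worked values from the paragraph above.

```python
def quantize(amplitude, dropped_bits):
    """Discard the lowest-order bits of an amplitude."""
    return amplitude >> dropped_bits

def dequantize(code, dropped_bits):
    """Restore the original width by appending implicit 0 bits."""
    return code << dropped_bits

original = 0b00001111  # 15
for dropped in (1, 3):
    code = quantize(original, dropped)
    restored = dequantize(code, dropped)
    print(dropped, format(code, f"0{8 - dropped}b"), format(restored, "08b"), restored)
```

The more bits are dropped, the shorter the stored code and the larger the possible error after decoding.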
The rounding described above, achieved by dropping low order bits to achieve compression,
is known as quantization. Notice the tradeoff: more rounding yields more
compression, at the expense of accuracy. Fortunately, humans don't notice such rounding
in the high-frequency components of the picture; our vision just isn't that precise. We
also don't notice minor differences in the high-frequency components of sound; our
hearing isn't that precise. Think of a very high-pitched sound, so high it could perhaps
break glass. You probably couldn't tell the difference between two such high-pitched
sounds of slightly different frequencies; they are both just high. Likewise, our eyes can't
detect slight rounding of color values in a highly complex scene. So MPEG-2 applies
quantization more aggressively on the DCT output block's high-frequency coefficients
than on the low-frequency coefficients.
After quantization, the 64 values in the 8x8 block are treated as a list of 64 numbers.
Those 64 numbers are then run-length encoded. Run-length encoding is a compression
method that replaces consecutive occurrences of zeros by a number indicating the number
of consecutive zeros rather than representing those zeros themselves. For example, consider
wanting to represent the following 5 numbers: 0, 0, 0, 0, 24. If each value is 6 bits,
the 5 numbers require 5*6 = 30 bits. On the other hand, we could just send a pair of
numbers, the first indicating the number of leading zeros, the second indicating the
nonzero number. So 0, 0, 0, 0, 24 would be encoded as (4, 24): 4 leading zeros, followed
by the number 24. If each value is 6 bits, the run-length encoded version requires only
2*6 = 12 bits. Any sequence of numbers could similarly be replaced by a sequence of
number pairs, each pair replacing a sequence of zeros and a number. The sequence 0, 0,
0, 0, 24, 0, 0, 8, 0, 0, 0, 0, 0, 0, 16 could thus be replaced by three pairs: (4, 24), (2, 8),
(6, 16), reducing the number of bits from 15*6 = 90 down to 6*6 = 36 bits. Note that the
number of zeros at the beginning of the sequence or in between nonzero numbers may be
zero, and the last number may be zero. For example, the sequence 2, 0, 0, 63, 2, 0, 0, 0,
0, 0 could be encoded as (0, 2), (2, 63), (0, 2), (4, 0).
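The pairing scheme described above can be sketched as follows. The function names are illustrative, and the handling of trailing zeros (the final zero becomes the "number" of the last pair) is inferred from the (4, 0) pair in the last example rather than stated as a general rule in the text.

```python
def run_length_encode(numbers):
    """Encode a list as (zero_count, number) pairs."""
    pairs, zeros = [], 0
    for n in numbers:
        if n == 0:
            zeros += 1
        else:
            pairs.append((zeros, n))
            zeros = 0
    if zeros:
        # Trailing zeros: the last zero serves as the pair's number.
        pairs.append((zeros - 1, 0))
    return pairs

def run_length_decode(pairs):
    """Expand (zero_count, number) pairs back into the original list."""
    numbers = []
    for zeros, n in pairs:
        numbers.extend([0] * zeros)
        numbers.append(n)
    return numbers

sequence = [0, 0, 0, 0, 24, 0, 0, 8, 0, 0, 0, 0, 0, 0, 16]
pairs = run_length_encode(sequence)
print(pairs, len(sequence) * 6, len(pairs) * 2 * 6)
```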
Run-length encoding achieves good compression only if there are many 0s in the
sequence of numbers. Fortunately, the nature of the DCT leads to many 0 numbers (not
all cosine frequencies are needed to approximate a signal region, so those frequencies
will have 0 coefficients), especially after quantization (many coefficients are just small
numbers, which become 0 during quantization). Thus, applying run-length encoding
after quantization leads to further compression.
EXAMPLE 6.27 Computing compression ratios involving quantization and run-length encoding
Continuing Example 6.26, assume that the 30-frame MPEG-2 sequence has the same frame
sequence and average sizes as that example, but that each frame is further compressed by DCT
conversion to the frequency domain followed by quantization and run-length encoding. Assume the
DCT output block consists of 64 8-bit numbers, that quantization reduces the average number size
to 5-bit numbers, and that run-length encoding reduces the resulting number sequence size to 30%
of its size.
The compression ratio would be 8 Mbits * 30 / (5/8 * 0.30 * (2 * 8 Mbits + 8 * 2 Mbits +
20 * 1 Mbits)) = 240 / 9.7 = 25:1.
Huffman Coding
After run-length encoding, each block consists of a sequence of numbers. Some numbers
will occur in that sequence more frequently than others. Huffman coding is a method of
reducing the number of bits required to represent a set of values, by creating shorter encodings
for the frequently occurring values and longer encodings for the less frequently occurring values.
Huffman coding, a form of encoding known as entropy encoding, is another powerful
concept in digital data compression. Suppose you wish to represent an original sequence
of 16 numbers 0, 3, 3, 31, 0, 3, 5, 8, 9, 7, 15, 14, 3, 0, 3, 0. Assuming 5 bits per number,
a straightforward binary encoding would be: 00000 00011 00011 11111 00000
00011 00101, and so on, for a total of 16*5 = 80 bits. We can reduce this total by first
observing that there are only 9 unique symbols: 0, 3, 5, 7, 8, 9, 14, 15, and 31. We really
only need 4 bits to uniquely identify each symbol. We could thus assign the nine unique
symbols to 4-bit encodings using the following definitions: 0=0000, 3=0001, 5=0010,
7=0011, ..., 31=1001 (note that the encodings are no longer the binary number representations
of the original numbers). Thus, the original sequence of numbers (0, 3, 3, 31, 0, 3,
5, ...) would be encoded as 0000 0001 0001 1001 0000 0001 0010 etc., for a
total of 16*4 = 64 bits. The key observation here is that we can encode numbers using
any arbitrary unique bit patterns we desire, as long as the encoder and decoder are both
aware of the encoding definitions.
We can take this definitions concept a step further, by using encodings of different
lengths. Observing that 3 and 0 occur more frequently than the other numbers, we might
give 3 and 0 shorter encodings. So we might create the following encoding definitions:
0=00, 3=10, 5=010, 7=0110, 8=0111, 9=1100, 14=1101, 15=1110, 31=1111. How
these definitions were created is just beyond the scope of this discussion, though it's really
not hard to learn. Notice that the encodings are such that the shorter encodings do not
appear at the left of any of the longer encodings. For example, 00 does not appear at the left
of any of the longer encodings, like 010, 0110, 0111, etc. This feature allows the decoder
to know when it has reached the end of the code word: when the decoder has seen 00, it
knows it has found an encoded 0 (because no other encoding starts with 00); when it sees
10, it knows it has found a 3 (because no other encoding starts with 10). But when the
decoder sees 01, it must look at the next bit, and if it sees 010, it knows it has found a 5
(because no other encoding starts with 010). Using this variable-length encoding scheme,
the original sequence (0, 3, 3, 31, 0, 3, 5, ...) would be encoded as 00 10 10 1111 00
10 010 etc. We have inserted the spaces just for readability; the actual encoding would just
be 00101011110010010 etc. The total number of bits would be 4*2 (for the four 0s,
encoded with the two bits 00) + 5*2 (for the five 3s, encoded with the two bits 10) + 1*3
(for the one 5, encoded with the three bits 010) + 6*4 (for the six remaining numbers 31,
8, 9, 7, 15, and 14, each encoded as 4 bits), totaling 45 bits, much reduced from the original
80 bits required by the straightforward binary encoding.
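The encoding definitions above can be exercised with a short sketch. It does not construct the code (the text leaves that out of scope); it only applies the given table. The function names are illustrative, and the decoder relies on the property noted above that no encoding appears at the left of another.

```python
# Encoding definitions from the text.
CODES = {0: "00", 3: "10", 5: "010", 7: "0110", 8: "0111",
         9: "1100", 14: "1101", 15: "1110", 31: "1111"}

def encode(numbers):
    """Concatenate the variable-length encodings, with no separators."""
    return "".join(CODES[n] for n in numbers)

def decode(bits):
    """Read bits left to right, emitting a number whenever a code word completes."""
    by_code = {code: n for n, code in CODES.items()}
    numbers, current = [], ""
    for bit in bits:
        current += bit
        if current in by_code:
            numbers.append(by_code[current])
            current = ""
    return numbers

sequence = [0, 3, 3, 31, 0, 3, 5, 8, 9, 7, 15, 14, 3, 0, 3, 0]
bits = encode(sequence)
print(bits, len(bits), decode(bits) == sequence)
```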
Huffman coding achieves good compression when some numbers occur much more
frequently than other numbers in the sequence of numbers to be encoded. Fortunately,
this is indeed the case after DCT, quantization, and run-length tasks are performed on a
block of a frame. For example, there may be plenty of 0s, 1s, 2s, etc., and fewer occurrences
of higher numbers.
EXAMPLE 6.28 Computing compression ratios involving Huffman coding
Continuing Example 6.27, assume that pairs of numbers after quantization and run-length encoding
are Huffman coded, and that such encoding reduces the number of bits by 50%.
The compression ratio would thus be 240 / (0.50 * 9.7) = 50:1.
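The chain of ratios from Examples 6.26 through 6.28 can be recomputed in one place. This sketch uses the example figures from the text with illustrative variable names; because it does not round the intermediate 9.75 Mbits down to 9.7, its results come out slightly below the rounded 25:1 and 50:1 quoted above.

```python
uncompressed_mbits = 30 * 8                       # 30 frames, all stored as I-frames

# Example 6.26: 2 I-frames, 8 P-frames, 20 B-frames.
ipb_mbits = 2 * 8 + 8 * 2 + 20 * 1

# Example 6.27: quantization (8-bit to 5-bit numbers), then run-length encoding (30%).
quantized_rle_mbits = ipb_mbits * (5 / 8) * 0.30

# Example 6.28: Huffman coding halves the remaining bits.
huffman_mbits = quantized_rle_mbits * 0.50

for label, size in [("I/P/B frames", ipb_mbits),
                    ("+ quantization and run-length", quantized_rle_mbits),
                    ("+ Huffman coding", huffman_mbits)]:
    print(f"{label}: {uncompressed_mbits / size:.1f}:1")
```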
Summary
Summarizing MPEG-2 video encoding:
The use of I-, P-, and B-frames achieves compression by not resending redundant
information of successive frames, but rather just sending the differences.
DCT transforms 8x8 blocks of a frame to the frequency domain, which doesn't
achieve compression itself, but rather enables compression in the next steps.
Quantization achieves further compression by reducing the number of bits needed
to represent the DCT coefficients, through rounding.
Run-length encoding achieves further compression by replacing sequences of zero
coefficients by a number indicating the number of such zeros.
Huffman coding achieves further compression by encoding frequently occurring
coefficient numbers with shorter encodings than less frequently occurring coefficient
numbers.
The sequence of steps is shown graphically in Figure 6.88.
Figure 6.88 MPEG-2 video compression encoding overview. [Figure labels: uncompressed digital video; DCT; MPEG-2 video (compressed).]
Our example compression ratio calculations yielded a ratio of about 50:1. In fact, the
compression ratio can be varied by varying each of the above steps. We can use fewer
I-frames to achieve even more compression at the cost of degraded video quality, or
more I-frames for improved video quality at the cost of more bits. Likewise, we can vary
the amount of quantization to trade off quality and compression ratio. Because a typical
movie will have some slow-changing scenes and other rapidly changing scenes, and some
complex colored frames and other simpler frames, the compression ratio for different
parts of a video may actually vary. Notice the permeating presence of tradeoffs (primarily
between quality and compression ratio) throughout MPEG-2 encoding.
Figure 6.89 MPEG-2 video decoding overview. [Figure labels: MPEG-2 video (compressed); Huffman decoding; inverse quantization; inverse DCT; uncompressed digital video.]
An MPEG-2 decoder merely needs to apply the above steps in reverse, as illustrated
in Figure 6.89, to convert an MPEG-2 stream of bits back into a series of
pictures, or video.
Clearly, MPEG-2 encoding and decoding require a lot of computations performed at
speeds fast enough to create smooth-looking, good-quality video. Custom digital circuits
can help achieve those required speeds.
6.8 CHAPTER SUMMARY
In this chapter, we introduced (Section 6.1) the idea that sometimes we can improve a
particular design criteria without hurting other criteria (optimization), but usually we can
improve one criteria at the expense of another criteria (tradeoff). We described (Section
6.2) the problem of two-level size minimization, introducing K-maps as a visual method,
and then describing automated heuristics for two-level as well as multilevel logic size
minimization. We discussed (Section 6.3) methods for optimization and tradeoffs in
designing sequential logic, including state minimization, state encoding, and Moore
versus Mealy type FSMs. We highlighted (Section 6.4) several alternative methods for
implementing some datapath components, including a faster adder using carry-lookahead,
and a smaller multiplier using sequential multiplication. We described (Section 6.5)
methods for RTL optimizations and tradeoffs, including the powerful concepts of pipelining
and concurrency as means of achieving parallel execution, a key purpose of
custom digital design. We also described the RTL methods of component allocation,
operator binding, and operator scheduling. We briefly mentioned (Section 6.6) some
higher-level methods, including the general idea of serial versus concurrent computation,
and the selection of efficient algorithms. We also introduced some basic concepts of
power reduction, including clock gating, and using low-power gates.
As you can see from this chapter, there are many methods for improving our designs.
Yet, this chapter just scratched the surface of such methods. An entire multibillion-dollar-per-year
industry exists that specializes in making automated tools for converting behavioral
descriptions of desired system functionality into highly optimized circuit
implementations; that industry is known as Electronic Design Automation (EDA) or as
Computer-Aided Design (CAD). This chapter hopefully gave you enough exposure at
least to understand the basic idea behind circuit optimization at various levels of design
abstraction, ranging from the gate level up to the RTL level and beyond.
6.9 EXERCISES
SECTION 6.1: INTRODUCTION
6.1 Define the terms "optimization" and "tradeoff," and provide everyday examples of each.
SECTION 6.2: COMBINATIONAL LOGIC OPTIMIZATIONS AND TRADEOFFS
6.2 Perform two-level logic size optimization for the equation F(a,b,c) = ab'c + abc +
a'bc + abc' using (a) algebraic methods, (b) a K-map. Express the answers as sum-of-products.

6.3 Perform two-level logic size optimization for the equation F(a,b,c) = a + a'b'c + a'c using a
K-map. Express the answer as sum-of-products.
6...4 Perform [wo-Ievel louie s' ' "
b ' d ' 0 lze optllnlzall on for the equati on F (a bed) - a' be ' +
_ a e + a bd using a K-map. Express the answer as
two-level logic size opt imizati on for the equation F (a . b . e . d)
usmg a K-map. Express [he as sum-of-products.
ab +
6.6 Perform two-level logic size optimization for the equation F(a,b,c) = a'b'c + abc,
assuming that input combinations a'bc and ab'c can never occur (those two minterms represent
don't cares). Express the answer as sum-of-products.
6.7 Perform two-level logic size optimization for the equation F(a,b,c,d) = a'bc'd +
ab'cd', assuming that a and b can never both be 1 at the same time, and that c and d can
never both be 1 at the same time (i.e., there are don't cares).
6.8 Consider the equation F(a,b,c) = a'c + ac + a'b. Using a K-map, determine which
of the following terms are implicants (but not necessarily prime implicants) of the equation:
abc, ab, a'bc, a'c, c, bc, a'bc', a'b.
6.9 Repeat the previous problem, but this time determine which of the terms are prime implicants
of the function.
6.10 For the equation F(a,b,c) = a'c + ac + a'b, determine all prime implicants and all
essential prime implicants of the function.
6.11 For the equation F(a,b,c,d) = ab'c' + abc'd + abcd + a'bcd + a'bcd',
determine all prime implicants, and all essential prime implicants.
6.12 For the previous problem, use the heuristic method of Table 6.1 to obtain a two-level size opti-
mized equation expressed in sum-of-products form.
6.13 Use repeated application of the expand operation to heuristically minimize the equation
F(a,b,c) = a'b'c + a'bc + abc. Try expanding each term for each variable. Give
the minimized equation in sum-of-products form.
6.14 Use repeated application of the expand operation to heuristically minimize the equation
F(a,b,c,d,e) = abcde + abcde' + abcd'e'. Try expanding each term for each
variable.
6.15 Using algebraic methods, reduce the number of gate inputs for the following equation by cre-
ating a multilevel circuit: F(a,b,c,d,e,f,g) = abcde + abcd'e'fg +
abcd'e'f'g'. Assume only AND, OR, and NOT gates will be used. Draw the circuit for
the original equation and for the multilevel circuit, and clearly list the delay and number of
gate inputs for each circuit.
SECTION 6.3: SEQUENTIAL LOGIC OPTIMIZATIONS AND TRADEOFFS
6.16 Reduce the number of states for the FSM in Figure 6.90 by eliminating redundant states
by using an implication table.
Figure 6.90 FSM example.
6.17 Reduce the number of states for the FSM in Figure 6.91 by using an implication table.
Figure 6.91 Sequence detector for bit patterns "01" and "10". Inputs: x; Outputs: y.
6.18 Reduce the number of states for the FSM in Figure 6.92 by using an implication table.
Figure 6.92 FSM example.
6.19 Compare the logic size (as number of gate inputs) and the delay (as number of gate-
delays) of a straightforward 2-bit binary encoding of the FSM in Figure 6.93 with a
3-bit output encoding and with a one-hot encoding of the same FSM.
Figure 6.93 FSM example. Outputs: w, x, y; state output values wxy=100, wxy=010, wxy=001, wxy=000.
6.20 Compare the logic size (as number of gate inputs) and the delay (as number of gate-
delays) of a minimal bit width state encoding and an output encoding for the laser-based
distance measurer FSM shown in Figure 5.20.
6.21 Compare the logic size (as number of gate inputs) and the delay (as number of gate-
delays) of a minimum binary encoding (if not possible, indicate why), output encoding,
and one-hot encoding of the FSM in Figure 3.39.
6.22 Convert the Moore FSM for the code detector circuit shown in Figure 3.46 to the nearest
Mealy FSM equivalent.
6.23 Convert the following Moore FSM to the nearest Mealy FSM equivalent.
[FSM diagram. Inputs: s, r; Outputs: a, en; visible state actions: a=0, en=0]
6.24 Convert the following Mealy FSM to the nearest Moore equivalent.
[FSM diagram. Inputs: s, r; Outputs: u, y]
6.25 Convert the following Mealy FSM to the nearest Moore equivalent.
[FSM diagram. Inputs: g, r; Outputs: x, y, z; visible transition labels: g'r'/xyz=010, g/xyz=111]
SECTION 6.4: DATAPATH COMPONENT TRADEOFFS
6.26 Trace the execution of the 4-bit carry-lookahead adder shown in Figure 6.59 when a = 11 and
b = 7.
6.27 Trace the execution of the 4-bit carry-lookahead adder shown in Figure 6.59 when a = 5 and
b = 4.
6.28 Trace the execution of the 16-bit carry-lookahead adder shown in Figure 6.59 when a = 43690
and b = 21845. Do not trace internal behavior of the individual 4-bit carry-lookahead adders.
6.29 Design a 64-bit hierarchical carry-lookahead adder using 4-bit carry-lookahead adders. What
is the total delay through the 64-bit adder? How much faster is the carry-lookahead adder
compared to a 64-bit carry-ripple adder (compute as slower time/faster time)?
6.30 Design a 24-bit hierarchical carry-lookahead adder using 4-bit carry-lookahead adders.
6.31 Design a 16-bit carry-select adder using 4-bit ripple-carry adders.
SECTION 6.5: RTL DESIGN OPTIMIZATIONS AND TRADEOFFS
6.32 The adder tree shown in Figure 6.94 is used to compute the sum of eight inputs on every clock
cycle, where the sum is S = R + T + U + V + W + X + Y + Z.
(a) Design a pipelined version of the adder tree to maximize the speed at which we
can operate our clock input clk.
(b) Create a timing diagram.
Figure 6.94 Adder tree used to compute the sum of eight inputs every clock cycle.
6.33 Assume the delay of an adder is 3 ns. How fast can we execute the adder tree shown in
Figure 6.94 and how fast can we execute the pipelined adder tree designed in Exercise 6.32?
6.34 What are the latency and throughput of the pipelined adder tree you designed in
Exercise 6.32?
6.35 (a) Convert the following C-like code to a high-level state machine.
(b) Use the RTL design process shown in Table 5.1 to convert the high-level state machine for
the C code to a controller and a datapath. Design the datapath to structure, but design the
controller to the point of an FSM only.
(c) Redesign your datapath to allow for concurrency in which four multiplications and two
additions can be performed concurrently.
Inputs: byte a[256], b[256]
Outputs: byte sum, byte c[256]
MULT:
   int i = 0;
   int sum = 0;
   while (i < 256) {
      c[i] = a[i] * b[i];
      sum = sum + c[i];
      i++;
   }
6.36 Redesign the datapath and controller designed in Exercise 6.35 by allowing up to four concur-
rent additions and inserting pipeline registers into your datapath and updating the controller if
necessary. Assuming an adder has a delay of 3 ns and a multiplier has a delay of 20 ns, how
long will the circuit take to finish its computation?
6.37 (a) Convert the following C-like code to a high-level state machine.
(b) Use the RTL design process shown in Table 5.1 to convert the high-level state machine for
the C code to a controller and a datapath. Design the datapath to structure, but design the
controller to the point of an FSM only.
(c) Redesign your datapath to allow for concurrency in which three comparisons, three addi-
tions, and three multiplications can be performed concurrently.
Inputs: byte a[256], byte b[256], byte cy
Outputs: byte sumx, byte sumy, byte c[256]
MULT_OR_ADD:
   int i = 0;
   int sumx = 0;
   int sumy = 0;
   while (i < 256) {
      if (a[i] > 128) {
         c[i] = a[i] * b[i];
         sumx = sumx + c[i];
      }
      else {
         c[i] = a[i] * (b[i] + cy);
         sumy = sumy + c[i];
      }
      i++;
   }
6.38 Redesign the datapath and controller designed in Exercise 6.37 by allowing up to nine concur-
rent additions and inserting pipeline registers into your datapath and updating the controller if
necessary. Assuming a comparator has a delay of 4 ns, an adder has a delay of 3 ns, and a
multiplier has a delay of 20 ns, how long will the circuit take to finish its computation?
6.39 Given the high-level state machine in Figure 6.95, create two different designs: one design
optimized for maximum circuit speed and one design optimized for minimum circuit size. Be
sure to clearly indicate the component allocation, operator binding, and operator scheduling
used to design the two circuits.
Figure 6.95 High-level state machine for Exercise 6.39.
SECTION 6.6: MORE ON OPTIMIZATIONS AND TRADEOFFS
6.40 Trace through the execution of the binary search algorithm when searching for the number 86
in the following sorted list of 15 numbers: 1, 10, 25, 62, 74, 75, 80, 84, 85, 86, 87, 100, 106,
111, 121. How many comparisons were required to find the number using the binary search
and how many comparisons would have been required using a linear search?
6.41 Trace through the execution of the binary search algorithm when searching for the number 99
in the following list of 15 numbers: 1, 10, 25, 62, 74, 75, 80, 84, ..., 87, 99, 100, 106, 111,
121. How many comparisons were required to look for the number using the binary search
and how many comparisons are required using a linear search?
6A2 Trace through the execution of the binary search algorithm when searching for the number L I
in the list of numbers from the previous example. How many comparisons were required to find
the number using the binary search and how many comparisons are required using a linear
search?
6.43 Using the list of 15 numbers from Exercise 6.41, how many numbers could we find faster
using a linear search algorithm compared with the binary search algorithm?
SECTION 6.7: POWER OPTIMIZATION
6.44 Given the logic gates shown in Figure 6.96, optimize the following circuit by reducing power
consumption without increasing the circuit's delay.
Figure 6.96 Logic gate library. The 2/0.5 format means 2 ns delay/0.5 nW power.
[Circuit diagram; visible input labels: a, b, d]
6.45 Given the logic gates shown in Figure 6.96, optimize the following circuit by reducing power
consumption without increasing the circuit's delay.
[Circuit diagram; visible input label: b]
6.46 Given the logic gates shown in Figure 6.96, optimize the following circuit by reducing power
consumption without increasing the circuit's delay.
[Circuit diagram; visible input labels: a, b, h]
6.47 Given the logic gates shown in Figure 6.96, optimize the following circuit by reducing power
consumption without increasing the circuit's delay.
[Circuit diagram; visible input labels: a, b]
DESIGNER PROFILE
Smita has degrees in Electronics Engineering and in Computer Science, and has worked in the digital design field for nearly a decade. She spent a lot of time thinking about the choice of a college major. "What major should I invest my focus, energy, heart, and soul for what will be some of the most productive years of my life?" She chose engineering, for several reasons. "First, engineering is a career in itself; unlike some other majors, jobs specifically for engineering majors are out there. With engineering, I would learn the most valuable and universal of skills: problem solving. Second, engineers have many options, because engineers are highly valued for their problem solving skills by other professions, such as management consulting, marketing, and investment banking. And electrical and computer engineers can choose from a range of industries in which to work: telecommunications, image processing, medical devices, IC fabrication, and even banking. This was a phenomenal discovery for me!"
Smita continued her education by doing graduate studies in Computer Science, researching methods for automatically designing integrated circuits (ICs) or chips, "a fascinating field because it involves a mix of hardware and software skills and knowledge. I continued in this profession after school and worked for a company that develops Computer-Aided Design (CAD) software used by hardware designers who work with a type of chip called an FPGA (Field Programmable Gate Array). FPGAs can be used for an amazing variety of applications all the way from high-speed telecommunication chips to low-speed and low-cost chips that go into electronic toys and games. Our software saves designers many months or even years of time. In fact, without our software, it would be absolutely impossible for people to design most chips even if they had a decade or more to do it."
Smita (shown mountain climbing above) loves her work. "My work is intellectually stimulating and I have an opportunity to innovate, create, and actually build something really useful." She also enjoys the people aspect of her work. "I work in teams of dynamic people because most projects, hardware or software, are done in teams of 3-8 people these days. The people on my
team are also my friends and it's a lot of fun to work with them."
In her decade of work so far, Smita has taken on some management responsibilities. "As manager of one of the four products that my company develops, I play a variety of different roles. I work with my team of 7 software developers to determine what features to build in the product and how best to build those features. I work with the marketing and sales team to understand what the customers need and how best to message and position our product. Finally, I work with other groups that are involved in releasing a product: technical publications, application engineering, and product engineering. The diversity of my job makes it very interesting."
Smita enjoys the respect that engineers receive. "As an engineer, I am highly respected by customers, partner companies, and by our marketing and sales organizations because I have a deep understanding of our products. I really know my stuff since I built it and I get recognized for it." And regarding the pay: "I get compensated very well for my skills." She also likes the lifestyle: "I get in to work around 10 a.m. and leave around 7 p.m. I don't have early morning meetings unlike the folks in marketing and sales, and I can work from home once a week or more often if I wish. This is also a great career for women; I can take time off and return to my job without much penalty when I have children. I can tailor my work hours as I need as my children are growing up. Lastly, I realize that I can move from engineering to other functions such as marketing and sales, but not the other way around! That's a great benefit of being an engineer: more options."
Smita recommends engineering and computer science students focus on certain things while in college. "First, get a good understanding of both hardware and software. Systems are highly integrated today and there are very few companies that develop one without paying very close attention to the other. For instance, though I write software, I need to completely understand the hardware for which it will be used. My husband, on the other hand, designs telecommunication chips but works very closely with his software team, especially during the initial design stages when they decide what to implement in hardware versus software and how to design the hardware interface so that the software algorithms work efficiently."
"So, what do I mean by a good understanding of hardware and software? In software, I think it is most important to develop good software habits. Treat your program like a well-landscaped garden; you want it
beautiful and weed-free. Understand [...] well and know when one is more appropriate than the other. Organize your code, be disciplined, cross the t's and dot the i's, document diligently, have your code reviewed by friends, and finally, don't be afraid to throw away code and rewrite it if you discover a better way."
"In hardware, understand the basics of logic design and then make sure you also understand the capacitive, inductive, and resistive properties of circuits since these play a big role in designing the high-speed circuits of today."
"Other than these hardware and software skills, become adept at math and analysis. Learn to frame problems and break them down until you can solve them. Be experimental and try different tools and methods. Have a hypothesis and then go about proving or disproving it. If you haven't already, you will soon discover that engineering is not only fun, but also provides you with many fulfilling career opportunities, so stick with it and make the most of it!"
7
Physical Implementation
7.1 INTRODUCTION
A digital circuit design that we've created but just drawn out, perhaps with pencil on
paper or as a figure in this book, is just a drawing. Somehow, we must eventually imple-
ment that digital circuit design on a real physical device, so that the device can then be
placed in some electronic product to carry out the desired function. Nowadays, such a
device is usually some form of integrated circuit, or IC, also known as a computer chip,
or just chip. In other words, looking at Figure 7.1, how do we get from (a), the seat belt
warning light circuit we designed in Chapter 2, to (b), a physical implementation using
an IC?
In this chapter, we will describe several popular physical implementation
technologies for digital circuits.
Figure 7.1 How do we get from (a), the BeltWarn digital circuit design, to (b), a physical implementation?
7.2 MANUFACTURED IC TECHNOLOGIES
If we are willing to wait weeks or months for a physical implementation of our digital circuit
design, and if we are willing to spend tens of thousands of dollars to millions of dollars for
that physical implementation, then we might consider implementing our circuit using one of
several technologies that involve the manufacture of a custom or semicustom IC.
Full-Custom Integrated Circuits
One physical implementation technology is known as a full-custom IC. A full-custom IC is a
chip created specifically to implement the gates (actually, the transistors) of the desired
digital circuit design (Figure 7.2). We digital designers wouldn't usually build full-custom
ICs ourselves, but rather we would send our desired digital circuit design out to a group
or company that specializes in transforming digital designs into custom ICs. Engineers,
assisted by computer-aided design (CAD) tools, convert our desired digital circuit design
into a circuit of transistors, and then decide where to place each transistor on the surface
of the chip, how to orient each transistor (e.g., left to right, right to left, top to bottom,
etc.), how big to make each transistor, etc. All that information about how the transistors
should be laid out on a chip's surface is known as a layout. Then, the full-custom IC
engineers send that layout information to a special factory that specializes in fabricating
ICs, known as a fabrication plant, or fab for short. Fabricating an IC is often referred to
as a silicon spin.
(Margin note: According to one survey, only about 10% of 2002 digital circuits were
implemented as custom ICs.)
Figure 7.2 Full-custom IC design: the BeltWarn circuit becomes a custom layout, which a fab turns into an IC over a period of months.
Fabricating an IC is an extremely costly, delicate, error-prone process, utilizing
state-of-the-art photographic, laser, and chemical equipment that costs hundreds of
millions of dollars. The fabrication process may take many weeks or even months,
because transistors and wires are formed as layers on the surface of a chip, and each layer
may take hours or even days to form through chemical processes.
Implementing a digital circuit on a full-custom IC is a complex and expensive task.
Costs for setting up the fabrication of an IC, known as nonrecurring engineering (NRE)
costs, can easily exceed many millions of dollars for a full-custom IC. Furthermore, that
setup takes time, perhaps months, and that time may be costly to us too: the product for
which we are fabricating the chip may be losing market share to a competing product
already completed and being sold while we wait for our chip to be fabricated. Once we've
set up the details needed for fabrication, the fabrication process itself is less expensive.
But because we custom designed everything, the probability is high that we made a
mistake somewhere in the transistors or wiring. Therefore, after fabricating a full-custom
IC, we may find errors that necessitate refabricating the IC, known as a respin. Respin-
ning may happen two or three times, each time requiring weeks or months, thus costing
us even more. We ought to either be making millions of chips, or charging large amounts
of money per chip, to earn back the large NRE costs.
Needless to say, full-custom IC fabrication is not extremely common. Designers
choose to implement a digital circuit on a full-custom IC when they know they will
produce the chip in extremely high volumes, such as a mass-produced chip found inside
calculators or wristwatches, or a mass-produced microprocessor chip like a Pentium.
High volumes in the tens of millions or more are needed to offset the cost and time
needed to produce a custom IC. Alternatively, designers may choose to implement a
digital circuit on a custom IC if cost is not tightly constrained but maximum performance
is a must, as might be the case in military or space applications.
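The volume argument above can be made concrete with a little arithmetic. The sketch below is only an illustration: it assumes the NRE cost is spread evenly over every chip sold, and the dollar figures and the function name `cost_per_chip` are hypothetical values chosen for the example, not data from the text.

```python
def cost_per_chip(nre_cost, unit_cost, volume):
    """Cost of one chip when the one-time NRE cost is spread over `volume` chips."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return nre_cost / volume + unit_cost


# Hypothetical figures, for illustration only.
NRE_DOLLARS = 5_000_000   # one-time setup (NRE) cost
UNIT_DOLLARS = 2.0        # fabrication cost of each chip after setup

for volume in (10_000, 1_000_000, 50_000_000):
    per_chip = cost_per_chip(NRE_DOLLARS, UNIT_DOLLARS, volume)
    print(f"{volume:>11,} chips: ${per_chip:,.2f} per chip")
```

Under these assumed numbers, the NRE share dominates the per-chip cost at low volume and becomes small only at volumes in the tens of millions, which is the tradeoff the paragraph describes.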
Semicustom (Application-Specific) Integrated Circuits - ASICs
Because physical implementation on full-custom ICs is so costly and time-consuming,
semicustom technologies, known as Application-Specific Integrated Circuits, or ASICs,
evolved during the 1980s and 1990s to reduce the cost and time of fabricating a chip.
Two popular ASIC technologies are gate array and standard cell.
Gate Arrays
The hardest part of custom IC design is designing and fabricating the transistors that will
go onto the surface of the chip. Designing and fabricating the wires that connect those
transistors is somewhat simpler. Gate array ASIC technology utilizes a chip whose tran-
sistors are predesigned to form rows (arrays) of logic gates on the chip, as shown in
Figure 7.3. Gate arrays are sometimes referred to as sea-of-gates. To implement a desired
digital circuit on a gate array chip, we merely need to create the wires that connect those
gates. Creating the wires represents just the last steps of fabrication, and thus gate array
technology eliminates much of the time and cost of fabricating a chip for a particular
design. A gate array company predesigns and mass-produces the gate array chip, and then
customizes some of those chips for each client's circuit. The chip is somewhat custom-
ized, hence the term semicustom, and the customization is for a particular circuit
application, hence the term application-specific. Figure 7.3 illustrates how we might
implement our seat belt warning light circuit (Figure 7.3(a)) using a gate array chip
(Figure 7.3(b)). Figure 7.3(c) shows how we might map the desired 3-input AND gate to
two 2-input gate array AND gates, and the inverter to one of the gate array inverters. The
figure also shows how we might implement the desired wiring among the gate array's
pins, the gate array AND gates, and the gate array inverter. The remaining gates and pins
on the gate array chip would be unutilized. Fabricating these wires would result in the IC
being customized to our seat belt application (Figure 7.3(d)).
Figure 7.3 Gate array technology: (a) desired circuit, (b) gate array before wires are added, (c) gate array after wires are added, thus implementing the desired circuit, (d) fabricating the wires completes the IC. Note: real gate arrays have many thousands or millions of gates, not just a few.
We point out that the actual mapping of our desired digital circuit to a gate array
would typically be carried out by an automated tool. Designers rarely, if ever, carry out
that mapping manually, and in fact usually don't even see that mapping in any form; the
mapping is all done by tools, resulting in huge data files that can be processed by other
tools at a fab to control the fabrication process. We also point out that a typical gate array
chip may hold many thousands or millions of gates: the gate array shown in Figure 7.3,
having fewer than ten gates, is trivially small and is for illustration purposes only; gate
arrays with only 10 gates do not exist. Furthermore, we would typically not use gate
arrays unless our design contained thousands of gates or more. For designs with only a
few gates, we would instead use logic ICs: see Section 7.4.
Notice that our standard cell implementation places the cells such that wiring is
minimized, whereas the gate array implementation of Figure 7.4 required us to run the
wires to the pre-existing gates, resulting in longer wires. Thus, the standard cell
implementation may be faster than the gate array implementation, since shorter wires
typically have shorter delay.
Figure 7.6 Half-adder using standard cells (co = ab, s = a'b + ab').
Implementing Circuits Using Only NAND Gates
You may recall from Chapter 2 that CMOS transistors lend themselves more readily to
creating NAND and NOR gates rather than AND and OR. The stated underlying reason
was that pMOS transistors conduct 1s well but not 0s, while nMOS transistors conduct
0s well but not 1s. In any case, gate arrays typically contain plenty of NAND and/or
NOR gates, rather than AND and OR gates. And standard cell designs will also be more
efficient if implemented using NAND or NOR gates rather than AND and OR. Further-
more, creating a gate array is much easier using just one type of gate, like just NANDs,
or just NORs, rather than having to decide how many AND gates, OR gates, and NOT
gates to pre-instantiate in the arrays. Given the ready availability of NAND or NOR gates
in CMOS ASIC technologies, we therefore want a method for converting AND/OR cir-
cuits to NAND circuits or to NOR circuits.
Fortunately, converting any AND/OR circuit to a NAND-only circuit is possible
because NAND is a universal gate, as was mentioned in Section 2.8. A universal gate is
a logic gate type that can implement any Boolean function using gates of that one type
only. One way to understand NAND's universality is to recognize that we can implement
a NOT gate, an AND gate, and an OR gate by substituting each by an equivalent circuit
of NAND gates. Therefore any circuit of NOT, AND, and OR gates can be implemented
using NAND gates only.
To implement a NOT gate using NAND gates, we can substitute the NOT gate by a
two-input NAND gate with its two inputs tied together, as shown in Figure 7.7. The
truth table in the figure shows that the NAND gate with its inputs tied together acts
the same as an inverter. When the input x is 0, both inputs of the NAND gate are 0,
causing the NAND gate to output 1. When the input x is 1, both inputs of the NAND
gate are 1, causing the NAND gate to output 0.
   Input   NAND inputs   Output
     x       a    b         F
     0       0    0         1
     1       1    1         0
Figure 7.7 Implementing a NOT gate using a NAND gate.
Alternatively, we could simply connect x to one NAND input, and a 1 to the other
NAND input. Then if x is 0, the NAND outputs 1, and if x is 1, the NAND outputs 0,
achieving the desired NOT gate behavior.
To implement an AND gate using NAND gates, we can substitute the AND gate by a
NAND gate followed by a NOT gate (which we know to be a two-input NAND gate
with its inputs tied together), as shown in Figure 7.8. This works because given inputs
a, b, the first NAND computes (ab)', and then the NOT gate computes (ab)'' = ab,
which is AND.
Figure 7.8 Implementing an AND gate using NAND gates.
To implement an OR gate using NAND gates, we can substitute the OR gate by a NAND
gate with each input inverted, as shown in Figure 7.9. This works because given inputs
a, b, the circuit of NAND gates in Figure 7.9 computes (a'b')', which by DeMorgan's
Law is a'' + b'', which simplifies to a + b, which is OR.
Figure 7.9 Implementing an OR gate using NAND gates: F = (a'b')' = a'' + b'' = a + b.
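The three substitutions above can be checked exhaustively, since each gate has at most four input combinations. The following is a minimal sketch, assuming signals are modeled as the Python integers 0 and 1; the function names are illustrative and do not come from the text.

```python
from itertools import product


def nand(a, b):
    """2-input NAND on bits (0 or 1)."""
    return 1 - (a & b)


def not_tied(x):
    """NOT built from a NAND with both inputs tied together."""
    return nand(x, x)


def not_const(x):
    """NOT built from a NAND with one input tied to 1."""
    return nand(x, 1)


def and_nand(a, b):
    """AND = NAND followed by a NAND-based NOT: (ab)'' = ab."""
    return not_tied(nand(a, b))


def or_nand(a, b):
    """OR = NAND with each input inverted: (a'b')' = a + b."""
    return nand(not_tied(a), not_tied(b))


def check_all():
    """Compare every NAND-only circuit against Python's own bit operators."""
    for x in (0, 1):
        assert not_tied(x) == 1 - x
        assert not_const(x) == 1 - x
    for a, b in product((0, 1), repeat=2):
        assert and_nand(a, b) == (a & b)
        assert or_nand(a, b) == (a | b)
    return True


print("all NAND substitutions verified:", check_all())
```

Exhaustive checking is practical here only because the gates are tiny; it is a sanity check on the identities, not a substitute for the algebraic argument.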
When we replace a circuit originally consisting of AND/OR/NOT gates by a circuit
with NAND gates only using the above substitutions, we may find that certain signals get
double-inverted: the signal feeds into an inverter and then immediately feeds into
another inverter. Double-inverting a signal yields the original signal, so double
inversions can be replaced by just a wire, as shown in Figure 7.10. Such elimination
reduces the transistors needed without changing the circuit's function.
Figure 7.10 Double inversions can be eliminated.
EXAMPLE 7.3 Implementing a half-adder's sum circuit using NAND gates
Figure 7.11(a) shows the sum circuit for a half-adder (see Section 4.3), using AND, OR, and NOT
gates. We can implement that circuit using NAND gates only by substituting each gate with an
equivalent NAND circuit, as shown in Figure 7.11(b). After the substitutions, we note that there are
two signals that are double inverted. Eliminating the double inversions results in the circuit shown
in Figure 7.11(c).
Figure 7.11 Implementing a half-adder's sum circuit using NAND gates only: (a) original AND/OR/
NOT circuit, (b) circuit obtained after substituting an equivalent NAND circuit for each gate,
(c) circuit after eliminating double inversions.
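The effect of eliminating double inversions can be sketched in code for the sum equation s = a'b + ab'. This is an illustrative model only: the gate-by-gate structure below follows the substitution rules described in the text, but it is an assumption that it matches the figure's exact wiring, and the function names are hypothetical.

```python
from itertools import product


def nand(a, b):
    """2-input NAND on bits (0 or 1)."""
    return 1 - (a & b)


def inv(x):
    """NAND-based NOT gate (inputs tied together)."""
    return nand(x, x)


def sum_substituted(a, b):
    """s = a'b + ab' after gate-by-gate substitution: 9 NAND gates."""
    na, nb = inv(a), inv(b)            # two NOT gates
    t1 = inv(nand(na, b))              # AND(a', b) = NAND then NOT
    t2 = inv(nand(a, nb))              # AND(a, b') = NAND then NOT
    return nand(inv(t1), inv(t2))      # OR = NAND with inverted inputs


def sum_reduced(a, b):
    """Same function after removing the two double inversions: 5 NAND gates."""
    na, nb = inv(a), inv(b)
    return nand(nand(na, b), nand(a, nb))


for a, b in product((0, 1), repeat=2):
    # Both circuits must equal the half-adder sum, which is a XOR b.
    assert sum_substituted(a, b) == sum_reduced(a, b) == (a ^ b)
print("NAND-only sum circuits match a XOR b for all inputs")
```

In this model each eliminated double inversion removes two NAND-based inverters, which is why the gate count drops from nine to five.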
When converting AND/OR/NOT circuits by hand to NAND circuits, some people find it easier
to simply draw inversion bubbles rather than the NAND-based inverters, as shown in
Figure 7.12. Then, double inversion bubbles on a signal cancel. Any remaining isolated
inversion bubble becomes a NAND-based NOT gate. Thus, the circuit in Figure 7.12 would
end up identical to the circuit in Figure 7.11(c).
Figure 7.12 Drawing inverters as inversion bubbles during conversion to NAND.
If only NAND gates with a fixed number of inputs are available, such as 2-input NAND
gates only, we can first modify the AND/OR circuit to use only 2-input AND/OR gates (by
composing larger gates from smaller ones; see Section 5.8), before converting to NAND gates.
Implementing Circuits Using NOR Gates
Converting AND/OR/NOT circuits to NOR gate circuits is similar to converting to NAND
circuits, as a NOR gate is also a universal gate. The process of transforming a circuit into
NOR gates replaces each AND, OR, and NOT gate with equivalent NOR-based circuits,
as shown in Figure 7.13. We can replace a NOT gate with a two-input NOR gate with the
inputs tied together (or alternatively, by a two-input NOR gate with one input tied to 0).
We can replace an OR gate with a NOR gate followed by an inverter, yielding
(a+b)'' = a+b. We can substitute an AND gate with a NOR gate having inverted inputs,
yielding (a'+b')' = a''b'' = ab (notice the use of DeMorgan's Law).
Figure 7.13 NOR gate equivalencies.
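The NOR equivalencies can be checked the same way as the NAND ones. The sketch below again assumes bits modeled as the integers 0 and 1, with illustrative function names that are not taken from the text.

```python
from itertools import product


def nor(a, b):
    """2-input NOR on bits (0 or 1)."""
    return 1 - (a | b)


def not_nor_tied(x):
    """NOT built from a NOR with both inputs tied together."""
    return nor(x, x)


def not_nor_zero(x):
    """NOT built from a NOR with one input tied to 0."""
    return nor(x, 0)


def or_nor(a, b):
    """OR = NOR followed by a NOR-based NOT: (a+b)'' = a + b."""
    return not_nor_tied(nor(a, b))


def and_nor(a, b):
    """AND = NOR with each input inverted: (a'+b')' = ab."""
    return nor(not_nor_tied(a), not_nor_tied(b))


for x in (0, 1):
    assert not_nor_tied(x) == not_nor_zero(x) == 1 - x
for a, b in product((0, 1), repeat=2):
    assert or_nor(a, b) == (a | b)
    assert and_nor(a, b) == (a & b)
print("all NOR substitutions verified")
```

Note the symmetry with the NAND case: the constant used for the one-input-tied inverter is 0 for NOR and 1 for NAND, and the roles of AND and OR are swapped.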
EXAMPLE 7.4 Implementing a half-adder's sum circuit using NOR gates
Earlier, we demonstrated how to represent the half-adder's sum output with NAND gates; we can
just as easily implement the sum output using NOR gates. The half-adder's sum circuit is shown
again in Figure 7.14(a). We replace each NOT, AND, and OR gate by its equivalent NOR circuit in
Figure 7.14(b), using inversion bubbles instead of NOR-based NOT gates for convenience. We
eliminate double inversions, and replace stand-alone inversion bubbles by NOR-based NOT gates,
as shown in Figure 7.14(c).
Figure 7.14 Implementing an AND/OR/NOT circuit using NOR gates only: (a) original circuit, (b) circuit
obtained by substituting AND/OR/NOT gates by equivalent NOR circuits, using inversion bubbles
for ease of drawing, (c) final circuit after eliminating double inversions and replacing stand-alone
inversion bubbles by NOR-based NOT gates.
The half-adder's sum circuit was implemented with fewer NAND gates than NOR
gates. Depending on the original circuit, the reverse could be true. We saw that NAND
gates were well-suited for circuits in the sum-of-products form. NOR gates are best
used when a circuit is in product-of-sums form (a level of OR gates feeding into a
single AND gate).
Gate array and standard cell libraries typically include additional components,
beyond just NAND or NOR gates, that have efficient CMOS implementations. For
example, a popular such component is known as AND-OR-INVERT, or AOI for short.
Such a component has two 2-input AND gates (thus four inputs total), feeding into a
2-input NOR gate. That circuit can be efficiently designed using CMOS transistors. Thus,
we would want to utilize AOI components, and other similarly compact available compo-
nents in a library, as much as possible.
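As a small illustration of the AOI component just described, the sketch below models its logic function, F = (ab + cd)', on bits 0 and 1. The function name `aoi` and the tied-input observations are additions for illustration; they follow from the Boolean function itself rather than from any particular library's cell.

```python
from itertools import product


def aoi(a, b, c, d):
    """AND-OR-INVERT: two 2-input ANDs feeding a 2-input NOR, F = (ab + cd)'."""
    return 1 - ((a & b) | (c & d))


# Compare against the structural definition for all 16 input combinations.
for a, b, c, d in product((0, 1), repeat=4):
    and1, and2 = a & b, c & d
    assert aoi(a, b, c, d) == 1 - (and1 | and2)

# Tying inputs together shows the component subsumes simpler gates:
for a, b in product((0, 1), repeat=2):
    assert aoi(a, b, a, b) == 1 - (a & b)   # behaves as NAND
    assert aoi(a, a, b, b) == 1 - (a | b)   # behaves as NOR
print("AOI model verified")
```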
The task of converting a general logic circuit to a circuit using only components from
a particular technology's library (e.g., a particular gate array library or standard cell
library) is known as technology mapping. The task of determining where to place those
components on a chip is known as placement, and the task of connecting those compo-
nents by wires is known as routing. All three tasks, collectively known as physical
design, are typically done by automated tools today.
Implementi ng the seat belt warning light on a NOR-based gate array
Implement the Bel/Warn circuit of Fi gure 7.15(u) using the NOR-based gate array of Figure
7.15(a). Noticing that the gate array has only 2-Input NOR gates. we first conVert the Bel/Warn
circuit to usc AND/OR gates wi th 2 inputs only. as shown in Figure 7. 15(b). We then convert the
ANDI OR circuit to the NOR-only circuit in Figure 7.15(c). using the equivalencies in Figure
7.13, and using inversion bubbl es rather than NOR-based inverters. We then see a double inver-
sion on the wire from input S. so we eli minate those two inversions. Note that we do not
eliminate the double inversion between points 3 and 4 in Figure 7. 15(c). be ause the first in"er-
sian is part of a NOR gate-eliminating that first inversion would convert the OR 2.3.te to an
OR. defeating our goal of havi ng NOR gates onl y. After converting remaining inver-
sions to OR-based inverters. we map the circuit to the gate array's _-input lOR 2ates as in
Figure 7.15(d)-we numbered the OR gates of Figure 7 .15(c) and (d) to show the pon-
dence between the two circuits.
Figure 7.15 Implementing the BeltWarn circuit on a NOR-based gate array IC: (a) original gate
array, (b)-(c) converting the desired circuit to two-input NOR gates only, (d) final gate array
with wires.
7.3 PROGRAMMABLE IC TECHNOLOGY - FPGA
Manufactured IC technologies require at least a few weeks, and usually more like several
months, to convert a desired digital circuit design into a physical IC. What if we are
developing a circuit that we want to implement today? In that case, we can utilize one of
several programmable IC technologies. In a programmable IC technology, we implement
a desired circuit simply by writing a particular sequence of bits into a memory (or
number of memories) contained in the IC. Using a programmable IC technology has the
drawback of worse performance, size, and power compared to custom or semicustom IC
technologies. But we get our implementation today, and the benefits of that fact may
outweigh the drawbacks.
The most popular form of programmable IC technology is known as a Field-Programmable
Gate Array, or FPGA. An FPGA company prefabricates an FPGA chip, meaning that the chip
contains all the transistors and all wires that the chip will ever have. We buy those FPGA
chips, and then program the chip to implement our desired circuit. To program in this context
means simply to download a series of bits into the chip's memories, not to be confused with
writing high-level software programs like C or C++ code. Such programming occurs in the
field, meaning in our lab, or office, or home, as opposed to in a fabrication plant. Hence the
words "field-programmable" in the FPGA name. Furthermore, programming typically takes only
seconds, or perhaps minutes at most. Figure 7.16 shows some FPGA chips. The chip at the top,
with its front and back shown, measures about 3/4 inch on each side. The chip on the bottom
measures just over 1 inch on each side.
Figure 7.16 FPGA chips.
Field-programmable gate arrays (FPGAs) have no "gate arrays" inside them; the name is
there due to historical reasons.
The words "gate array" are there in the name because, when FPGAs first became
popular in the mid-1980s, they were marketed as an alternative to gate array technology,
which was very popular at that time. Thus, an FPGA was a semicustom IC (nearly
synonymous with gate array at that time) that could be programmed in the field instead of at a
fabrication plant. However, be forewarned that the internal design of an FPGA chip looks
nothing like a gate array; the naming is somewhat unfortunate.
The two basic types of components inside an FPGA are lookup tables and switch
matrices. Those components are replicated hundreds of times in regular patterns inside an
FPGA. We now describe each type of component.
Lookup Tables
The key idea underlying FPGAs is that a memory with N address lines can implement
any combinational function with N inputs.
A basic idea underlying FPGAs is that a memory can implement combinational logic.
More specifically, a 1-bit wide memory with N address lines, and hence 2^N words,
configured to read the word corresponding to the present address, can implement any Boolean
combinational function of N variables.
Recall that a memory configured to be read will output the contents of the word
corresponding to the present address at the memory's address lines. So if a 4x1 memory's
address lines a1a0 are 00, the memory will output the contents of word 0. If the address
lines are 01, the memory outputs the contents of word 1. Likewise, 10 reads word 2, and
11 reads word 3.
Implementing a Boolean function with a memory can therefore be done simply by
connecting the function's inputs to the memory address lines, and storing a 0 or 1 in each
memory word to match the desired function output for each combination of input values.
For example, consider the function F(x, y) = x'y' + xy. The truth table for the
function is shown in Figure 7.17(a). To implement the example function, we can connect x and
y to a 4x1 memory's address lines a1 and a0, respectively, and based on the truth table, we
store a 1 in word 0, a 0 in word 1, a 0 in word 2, and a 1 in word 3; in other words, we
store the truth table outputs in the memory. The memory then implements the desired
function, as shown in Figure 7.17(b). For example, when xy=00, we want the output to be 1.
Figure 7.17(c) shows that when xy=00, the memory's address lines will be 00, and thus the
memory will output the contents of word 0, which is the value 1, as desired.
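As a rough illustration of storing a truth table in a memory, the sketch below models the memory as a Python list. The helper names `make_lut` and `read_lut` are made up for this example; the address convention (first function input on the highest address line) follows the x-to-a1, y-to-a0 connection described above.

```python
def make_lut(func, n_inputs):
    """Return memory contents (a list of 2**n_inputs bits) for func."""
    words = []
    for addr in range(2 ** n_inputs):
        # The highest address bit is the first function input.
        bits = [(addr >> i) & 1 for i in reversed(range(n_inputs))]
        words.append(func(*bits))
    return words

def read_lut(words, *inputs):
    """Read the word addressed by the inputs (first input = top address bit)."""
    addr = 0
    for bit in inputs:
        addr = (addr << 1) | bit
    return words[addr]

def F(x, y):
    """F(x, y) = x'y' + xy."""
    return ((1 - x) & (1 - y)) | (x & y)

mem = make_lut(F, 2)
print(mem)                  # [1, 0, 0, 1]
print(read_lut(mem, 0, 0))  # 1
```

Once `mem` is filled, evaluating the function is only a memory read; the logic equation is no longer consulted.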
Figure 7.17 Implementing logic functions using a memory: (a) 2-input function truth table, (b)
corresponding memory contents and connections, (c) the proper output appears for the given input
values, (d) two functions with the same two inputs, (e) memory contents for the two functions.
A memory with M bits per word, rather than just 1 bit per word, can implement M
functions, as long as all those M functions have the same inputs. For example, consider
the two functions F(x, y) = x'y' + xy and G(x, y) = xy'. The truth table for
these two functions is shown in Figure 7.17(d). A 4x2 memory, which has 2 bits per
word, can implement those two functions, as shown in Figure 7.17(e).
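A small sketch of the two-function case, under the assumption that each memory word is stored as an (F, G) pair and that x drives a1 and y drives a0; the names `mem_4x2` and `read` are illustrative only.

```python
def F(x, y):
    """F(x, y) = x'y' + xy."""
    return ((1 - x) & (1 - y)) | (x & y)

def G(x, y):
    """G(x, y) = xy'."""
    return x & (1 - y)

# One 2-bit word per input combination: word address = (x << 1) | y.
mem_4x2 = [(F(x, y), G(x, y)) for x in (0, 1) for y in (0, 1)]

def read(mem, x, y):
    """Return the (F, G) word addressed by x (a1) and y (a0)."""
    return mem[(x << 1) | y]

print(mem_4x2)  # [(1, 0), (0, 0), (0, 1), (1, 0)]
```

Both outputs come from a single read because the two functions share the same address lines.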
A memory used to implement a combinational circuit is known (in FPGA
terminology) as a lookup table. When used as a lookup table, we typically refer to the memory
by the number of inputs (address lines) and the number of outputs (bits per word), rather
than by the number of words and the number of outputs. For example, we would refer to
an 8x2 memory being used as a lookup table as a "3-input 2-output lookup table," rather
than as an 8x2 lookup table.
From this point forward, we'll assume the memory is configured for read, and thus
we won't show the read line set to 1.
EXAMPLE 7.6 Implementing the seat belt warning light with a lookup table
Use a lookup table to implement the seat belt warning light circuit from Figure 7.1, whose
circuit appears in Figure 7.18(a) and whose equation is:
w = kps'
We generate the truth table for the function, as shown in Figure 7.18(b). Because the
circuit has three inputs, we know we'll need a 3-input 1-output lookup table (memory).
We connect the inputs to the memory's address lines, and store the truth table in the
memory, as shown in Figure 7.18(c), thus implementing the desired function. If the
3-input 1-output memory is an IC, then we are done implementing our design, and can
insert the IC into the electronic system with which the IC should interact.
Figure 7.18 Lookup table implementation.
You've just seen an example of a very simple programmable IC technology: a
memory. We can use a memory chip with N address lines and hence 2^N words, and with
M bits per word, to implement M different Boolean functions of the same N inputs. We can
purchase a memory chip before we need it for our design, and then we can "program" the
memory chip in our lab to implement a desired Boolean function.
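The seat belt warning light of Example 7.6 makes a compact illustration of "programming" a memory. The sketch below assumes k, p, and s are wired to address lines a2, a1, and a0 in that order (the figure's exact pin assignment is not reproduced here), and the names `lut` and `read` are hypothetical.

```python
def w(k, p, s):
    """Seat belt warning equation w = kps'."""
    return k & p & (1 - s)

# "Program" an 8x1 memory by storing the truth table, one bit per word.
lut = [w((a >> 2) & 1, (a >> 1) & 1, a & 1) for a in range(8)]

def read(lut, k, p, s):
    """Read the word addressed by k (a2), p (a1), s (a0)."""
    return lut[(k << 2) | (p << 1) | s]

print(lut)  # [0, 0, 0, 0, 0, 0, 1, 0]
```

Only word 6 (address 110, meaning k=1, p=1, s=0) holds a 1, which matches the single product term in the equation.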
Partitioning a Circuit among Lookup Tables
Unfortunately, using a memory to implement a Boolean function does not work well for
functions with numerous inputs. For example, while a 4-input function would need only a
16-word memory, a 16-input function would require a 64K-word memory; a 32-input
function would require a 4-billion-word memory. The needed memory size grows the
same as the size of the function's truth table, which we know grows as 2^N, where N is the
number of function inputs. In short, a truth table is not an efficient Boolean function
representation for functions with numerous inputs, and thus a lookup table is not an efficient
implementation for functions with numerous inputs.
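The growth in memory size can be made concrete with a one-line calculation; `lut_words` is a hypothetical helper name used only for this sketch.

```python
def lut_words(n_inputs):
    """Number of words in a single lookup table with n_inputs address lines."""
    return 2 ** n_inputs

for n in (4, 16, 32):
    print(n, lut_words(n))
# 4 16
# 16 65536
# 32 4294967296
```

The 16-input case gives 65,536 words (64K) and the 32-input case gives roughly 4.3 billion words, in line with the figures quoted above.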
Partitioning a function's circuit among multiple lookup tables can yield more
efficient implementations for larger functions. Consider the extended seat belt warning
circuit from Example 2.8. Let's extend the circuit even more by adding a "diagnostic"
input called d that forces the warning light to turn on when d=1; perhaps a
mechanic investigating a faulty warning light might want to force the warning light on to
isolate whether the light has blown out or to help determine whether a seat sensor has
failed. The extended circuit is shown in Figure 7.19(a). That circuit can't be mapped to a
3-input 1-output lookup table because the circuit has 5 inputs, but the circuit could be
mapped onto a 5-input 1-output lookup table. Alternatively, we could implement the
circuit by using a 3-input 1-output lookup table connected to another 3-input 1-output
lookup table, as shown in Figure 7.19(c). We do so by partitioning the original circuit into
two groups, such that the first group has 3 inputs and 1 output, and the second group has
3 inputs and 1 output, as circled in Figure 7.19(b). The first group's output, which we've
labeled as x, has the equation x = kps'. The second group's output has the equation
w = x + t + d. We would program the lookup tables to implement these functions,
as shown in Figure 7.19(c), thus implementing the desired circuit using two lookup tables.
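The partitioning can be sanity-checked by building the two 3-input tables and comparing the cascade against the full 5-input function. This is a sketch only: the helper names are made up, and the order in which signals are wired to each table's address lines is an assumption.

```python
from itertools import product

def build_lut3(func):
    """Store func's truth table in an 8-word, 1-bit memory."""
    return [func((a >> 2) & 1, (a >> 1) & 1, a & 1) for a in range(8)]

def read_lut3(lut, a2, a1, a0):
    return lut[(a2 << 2) | (a1 << 1) | a0]

lut_x = build_lut3(lambda k, p, s: k & p & (1 - s))  # x = kps'
lut_w = build_lut3(lambda x, t, d: x | t | d)        # w = x + t + d

def belt_warn_two_luts(k, p, s, t, d):
    x = read_lut3(lut_x, k, p, s)       # first group
    return read_lut3(lut_w, x, t, d)    # second group

def belt_warn_reference(k, p, s, t, d):
    return (k & p & (1 - s)) | t | d

mismatches = [v for v in product((0, 1), repeat=5)
              if belt_warn_two_luts(*v) != belt_warn_reference(*v)]
total_words = len(lut_x) + len(lut_w)
print(mismatches, total_words)  # [] 16
```

The empty mismatch list indicates that the two-table cascade matches the 5-input function on all 32 input combinations while using 16 words in total.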
Figure 7.19 Partitioning a circuit onto two lookup tables: (a) desired circuit, (b) circuit
partitioned into groups with at most 3 inputs and 1 output, (c) groups mapped to two 3-input
1-output lookup tables.
Notice that the implementation with two lookup tables has a total of 8 + 8 = 16
words, compared to 32 words that would have been present with a 5-input lookup table.
Thus, partitioning a circuit among small lookup tables can result in better efficiency than
using one larger lookup table.
This efficiency can be seen even more dramatically for examples with more
inputs. For example, the function F = abc + def + ghi, shown in Figure
7.20(a), has 9 inputs. Implementing the function on a single lookup table would
require a table with 2^9 = 512 words. However, we can partition the circuit into groups
such that each group has 3 inputs and 1 output: the first group would compute abc,
the second def, the third ghi, and the fourth would OR the outputs of the first three
groups to generate the output F. Each group could be implemented using a 3-input
1-output lookup table, meaning 8x1 memories. The resulting implementation would
have four such lookup tables, as shown in Figure 7.20(b). The total words for that
four-table implementation would be a mere 8 + 8 + 8 + 8 = 32 words, far less than
the 512 words required for a single 9-input lookup table. Figure 7.20(c) compares the
relative sizes of a 512-word and four 8-word memories. Notice the tremendous
reduction in size.
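The 9-input case can be sketched the same way. The table names `AND3` and `OR3` and the helper `read3` are hypothetical; the sketch assumes three AND tables feed one OR table as described above.

```python
from itertools import product

# 8x1 memories: AND3 holds a 1 only in word 7, OR3 holds a 0 only in word 0.
AND3 = [1 if a == 7 else 0 for a in range(8)]
OR3 = [0 if a == 0 else 1 for a in range(8)]

def read3(lut, x, y, z):
    return lut[(x << 2) | (y << 1) | z]

def f_partitioned(a, b, c, d, e, f, g, h, i):
    return read3(OR3,
                 read3(AND3, a, b, c),
                 read3(AND3, d, e, f),
                 read3(AND3, g, h, i))

def f_reference(a, b, c, d, e, f, g, h, i):
    return (a & b & c) | (d & e & f) | (g & h & i)

same = all(f_partitioned(*v) == f_reference(*v)
           for v in product((0, 1), repeat=9))
single_table_words = 2 ** 9
partitioned_words = 3 * len(AND3) + len(OR3)
print(same, single_table_words, partitioned_words)  # True 512 32
```

The comparison covers all 512 input combinations, so the 32-word partitioned version is functionally identical to the 512-word single table.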
Figure 7.20 Dividing a many-input circuit among smaller lookup tables reduces total lookup table
size: (a) 9-input circuit, (b) circuit mapped to 3-input 1-output lookup tables, (c) size savings
compared to 9-input 1-output lookup table.
Partitioning a function among small lookup tables is more efficient than
implementing a function on one large lookup table. But what is a "small" lookup table: a
table with 2 inputs, 3 inputs, 4 inputs, 7 inputs, or maybe even 10 inputs?
Researchers have conducted numerous studies on large numbers of typical circuits,
and found that 3-input or 4-input lookup tables seem to work best for most circuits.
Furthermore, researchers found that 2-output lookup tables also seem to work well
for most examples. Thus, we'll use 3-input 2-output lookup tables from this point
forward.
EXAMPLE 7.7 Partitioning a circuit among 3-input 2-output lookup tables
Implement the circuit shown in Figure 7.21(a) using 3-input 2-output lookup tables. We begin by
trying to partition the circuit into groups such that each group has at most 3 inputs and 2 outputs.
However, the 4-input AND gate prevents us from successfully performing such partitioning,
because whatever group that gate is in will have at least four inputs. To remedy this problem, we
decompose that gate into two smaller gates, while maintaining the same functionality, as shown in
Figure 7.21(b). We can then partition the circuit into two groups, each with 3 inputs and 1 output,
as shown in the figure; we've numbered the inputs to each group to make clear that each group has
three inputs. We then map those groups onto two 3-input 2-output lookup tables as shown in Figure
7.21(c). Notice that the first lookup table's D1 output is unused, and the second table's D0 output is
unused. The first table's D0 column implements t = abc. The second table's D1 column implements
F = td + e.
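A sketch of this mapping follows. It assumes the original circuit of Figure 7.21(a) computes F = abcd + e, which is inferred from the decomposition into t = abc and F = td + e and is not stated outright; the word layout (D1, D0) and the input ordering on each table's address lines are also assumptions.

```python
from itertools import product

def idx(a2, a1, a0):
    return (a2 << 2) | (a1 << 1) | a0

# Each 8x2 memory word is a (D1, D0) pair; unused columns are left at 0.
table1 = [(0, a & b & c) for a, b, c in product((0, 1), repeat=3)]    # D0 = t = abc
table2 = [((t & d) | e, 0) for t, d, e in product((0, 1), repeat=3)]  # D1 = F = td + e

def f_two_tables(a, b, c, d, e):
    t = table1[idx(a, b, c)][1]      # first table, D0 column
    return table2[idx(t, d, e)][0]   # second table, D1 column

def f_reference(a, b, c, d, e):
    return (a & b & c & d) | e       # assumed original function

same = all(f_two_tables(*v) == f_reference(*v)
           for v in product((0, 1), repeat=5))
print(same)  # True
```

Half of the bits in each table are never read, which previews the point made next about unused memory cells.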
Figure 7.21 Partitioning a circuit onto two lookup tables: (a) original circuit, (b) transformed
circuit that breaks the 4-input AND gate into two smaller gates and then shows the 3-input
1-output groupings, (c) mapping of each group to a lookup table, with the group's function
converted to programmed bits in the lookup table. Italicized bits are unused.
In the previous example, notice that we did not use one of the columns in the first
lookup table, and did not use one of the columns in the second lookup table either. Using
lookup tables sometimes results in unused memory cells. Using lookup tables also
sometimes results in unused lookup table words, as illustrated in the following example.
EXAMPLE 7.8 Mapping a 2x4 decoder to 3-input 2-output lookup tables
Let's implement a 2x4 decoder, without enable, using 3-input 2-output lookup tables. A 2x4
decoder has two inputs, i1 and i0, and four outputs, d0, d1, d2, and d3. A mapping is shown in
Figure 7.22. The equations for each output are d0 = i1'i0', d1 = i1'i0, d2 = i1i0', and
d3 = i1i0. The lookup tables implement those equations using the top halves of the tables' words;
the bottom halves are unused.
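The decoder mapping can be written out as two 8x2 tables. In this sketch the first table's word is taken to be (d0, d1) and the second table's word (d2, d3), with address line a2 tied to 0; those layout choices are assumptions that are consistent with the equations, not a reproduction of the figure.

```python
UNUSED = [(0, 0)] * 4   # words 4-7 are never addressed because a2 = 0

TABLE1 = [(1, 0), (0, 1), (0, 0), (0, 0)] + UNUSED   # (d0, d1)
TABLE2 = [(0, 0), (0, 0), (1, 0), (0, 1)] + UNUSED   # (d2, d3)

def decoder(i1, i0):
    """2x4 decoder without enable, read out of two lookup tables."""
    addr = (0 << 2) | (i1 << 1) | i0
    d0, d1 = TABLE1[addr]
    d2, d3 = TABLE2[addr]
    return d0, d1, d2, d3

for i1 in (0, 1):
    for i0 in (0, 1):
        print(i1, i0, decoder(i1, i0))
# 0 0 (1, 0, 0, 0)
# 0 1 (0, 1, 0, 0)
# 1 0 (0, 0, 1, 0)
# 1 1 (0, 0, 0, 1)
```

Exactly one output is 1 for each input combination, and the bottom four words of each table are never read.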
Figure 7.22 Mapping a 2x4 decoder to 3-input 2-output lookup tables: (a) desired circuit, (b)
mapping to two lookup tables. Italicized bits are unused.
An FPGA may come with tens, hundreds, or even thousands of lookup tables, and thus
can implement large amounts of combinational logic.
Programmable Interconnects (Switch Matrices)
In the previous examples, we have been creating customized connections between lookup
tables. However, the point of FPGAs is that the entire chip is prefabricated, including
the wires. FPGAs therefore come with programmable interconnects, sometimes called
switch matrices, which allow us to program the connections among lookup tables. Figure
7.23 shows a simple FPGA chip having six inputs (P0-P5), two 3-input 2-output lookup
tables, one 4-input 2-output switch matrix, and four outputs (P6-P9). All three of the left
lookup table's inputs come from the external inputs P1, P2, and P3; that lookup table's
inputs can't be changed. However, two of the right lookup table's inputs may come from
either the left lookup table's outputs, or from the external inputs P4 and P5. The switch
matrix determines which of those connections will be made.
Figure 7.23 A simple FPGA architecture: (a) an FPGA that includes a switch matrix, and (b) the
switch matrix's internals, showing two 4x1 muxes controlled by two 2-bit registers. Note: real
FPGAs have hundreds of lookup tables and switch matrices, not just a few.
The switch matrix's internal design appears on the right of Figure 7.23. It consists of
two 4x1 multiplexers. The top mux connects the switch matrix output o0 to one of the
matrix's four inputs. The bottom mux connects the output o1 to one of the matrix's four
inputs. A two-bit memory (which is actually a 2-bit register, but called a memory for
consistency with the memory inside a lookup table) holds the two bits that set each mux's
two select lines. Thus, we can program the desired connections simply by writing the
appropriate bits into those two 2-bit memories. Notice that each switch matrix output can
be configured independently of the other. In fact, we could even make the same input
appear at both outputs, though that's probably not useful in this FPGA design.
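A minimal sketch of such a switch matrix, modeling each 4x1 mux as an index into the tuple of inputs; the function names are hypothetical, and the 2-bit memories are represented simply as integers 0-3.

```python
def mux4(inputs, select):
    """4x1 multiplexer: pass the input chosen by a 2-bit select value."""
    return inputs[select]

def switch_matrix(m, sel_top, sel_bottom):
    """m = (m0, m1, m2, m3); returns the two outputs (o0, o1)."""
    return mux4(m, sel_top), mux4(m, sel_bottom)

signals = ("m0", "m1", "m2", "m3")
print(switch_matrix(signals, 0b10, 0b11))  # ('m2', 'm3')
print(switch_matrix(signals, 0b01, 0b01))  # ('m1', 'm1')
```

The second call shows the case mentioned above where the same input appears at both outputs, because the two select memories are programmed independently.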
We'll illustrate the use of the switch matrix with an example.
EXAMPLE 7.9 A 2x4 decoder on an FPGA with a switch matrix
We repeat Example 7.8 here using the FPGA shown in Figure 7.23(a). We can easily get the proper
inputs to the first lookup table as in Example 7.8, by connecting external input i1 and
external input i0 to FPGA inputs, as shown in Figure 7.24(a). To get the proper
inputs to the second lookup table, we first connect external input i1 and external input i0 to the
FPGA inputs that feed into the switch matrix. We then configure the switch matrix such that switch
matrix input m2 passes through to switch matrix output o0, which means that external
input i1 passes through to switch matrix output o0. We achieve that configuration by programming 10
into the top 2-bit register in the switch matrix, as shown in Figure 7.24(b). Likewise, we configure
the switch matrix such that switch matrix input m3 passes through to switch matrix output o1, meaning
external input i0 passes through to switch matrix output o1. We achieve that configuration by
programming 11 into the bottom 2-bit register in the switch matrix. Because the switch matrix outputs
connect to the right lookup table's inputs, we've successfully connected external inputs i1 and i0 to
the second lookup table's inputs, as desired. We program the two lookup tables as we did in
Example 7.8. Thus, external outputs d0-d3 can be found at the FPGA external pins as shown in
Figure 7.24(a).
Figure 7.24 Implementing a 2x4 decoder on the FPGA fabric having a switch matrix: (a) external
connections and programmed bits in the lookup tables and switch matrix, and (b) a look inside the
switch matrix, showing the programmed connections between the outputs and inputs. Italicized bits
in the lookup tables are unused.
EXAMPLE 7.10 Extended seat belt warning light on an FPGA
We are to implement the extended seat belt warning light system of Example 2.8 on the FPGA
in Figure 7.23. (Figure 7.19 showed how to partition a similar circuit into two groups, with
equations x = kps' and w = x + t + d. For this example, w = x + t.) We connect k, p,
and s to the FPGA pins going to the left lookup table, and we program that lookup table to
implement the function x = kps', as shown in Figure 7.25. We connect an output of the left lookup
table, representing x, to the right lookup table, by programming the switch matrix to connect m0 to
o0. We connect t to the right lookup table also, by connecting t to an input pin that feeds switch
matrix input m2, and then by configuring the switch matrix to connect m2 to o1. We then program
the right lookup table to implement the function x + t, as shown in Figure 7.25.
Figure 7.25 Implementing the extended seat belt warning light circuit on the FPGA fabric having a
switch matrix: (a) external connections and programmed bits, (b) a look inside the switch matrix,
showing the programmed connections. Italicized bits in the lookup tables are unused.
Notice that, in the previous two examples, we implemented two different circuits using
the same FPGA chip. To implement the two different circuits, we merely had to program
different bits into the lookup tables and switch matrices. That's the appeal of FPGAs:
they implement our circuit just by programming.
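The idea of one fixed fabric taking on two behaviors can be sketched with a single evaluation function whose only varying arguments are the programmed bits. The wiring below is an assumption made for illustration: the left table's D0 and D1 outputs feed switch matrix inputs m0 and m1, two external pins feed m2 and m3, and outputs o0 and o1 drive the right table's a1 and a0. All names are hypothetical.

```python
def fpga(left_in, right_a2, ext, left_lut, right_lut, sel_top, sel_bottom):
    """Evaluate the fabric once; returns (left D1, left D0, right D1, right D0)."""
    a2, a1, a0 = left_in
    l_d1, l_d0 = left_lut[(a2 << 2) | (a1 << 1) | a0]
    m = (l_d0, l_d1, ext[0], ext[1])     # assumed order of inputs m0..m3
    o0, o1 = m[sel_top], m[sel_bottom]   # the two 4x1 muxes
    r_d1, r_d0 = right_lut[(right_a2 << 2) | (o0 << 1) | o1]
    return l_d1, l_d0, r_d1, r_d0

UNUSED = [(0, 0)] * 4

# Program 1: 2x4 decoder (select memories hold 10 and 11).
DEC_LEFT = [(1, 0), (0, 1), (0, 0), (0, 0)] + UNUSED
DEC_RIGHT = [(0, 0), (0, 0), (1, 0), (0, 1)] + UNUSED

def decoder(i1, i0):
    return fpga((0, i1, i0), 0, (i1, i0), DEC_LEFT, DEC_RIGHT, 0b10, 0b11)

# Program 2: w = x + t with x = kps' (select memories hold 00 and 10).
BELT_LEFT = [(0, 1) if a == 6 else (0, 0) for a in range(8)]
BELT_RIGHT = [(0, 1) if a in (1, 2, 3) else (0, 0) for a in range(8)]

def belt_warn(k, p, s, t):
    return fpga((k, p, s), 0, (t, 0), BELT_LEFT, BELT_RIGHT, 0b00, 0b10)[3]

print(decoder(1, 0))          # (0, 0, 1, 0)
print(belt_warn(1, 1, 0, 0))  # 1
```

The function `fpga` never changes between the two programs; only the table contents and the select values differ, which is the point the paragraph above makes.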
Configurable Logic Block
In the previous section, the illustrated FPGAs were missing a critical element needed to
implement general circuits, namely, flip-flops. Without flip-flops, FPGAs could not
implement sequential circuits.
FPGAs may include a flip-flop with each output of a lookup table: two flip-flops in
the case of a 2-output lookup table. The lookup table and its flip-flops together are known
as a configurable logic block, or CLB. A simple CLB is shown in Figure 7.26. Each
configurable logic block has a 3-input 2-output lookup table, and has two outputs and two
flip-flops. Each flip-flop is loaded every clock cycle with the corresponding lookup table
output. Each output of the CLB can be configured to either come from the output's
flip-flop, or directly from the corresponding lookup table output. That configuration is done
by programming a 1-bit memory (which itself is a flip-flop, but we'll call it a memory to
avoid confusion), shown in Figure 7.26, that controls a 2x1 mux for each CLB output.
The output flip-flops enable us to implement sequential circuits, that is, circuits
having registers, on the FPGA.
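The behavior of a CLB can be sketched as a small class. The class name `CLB`, its methods, and the polarity of the output-select bit (1 selects the flip-flop, 0 selects the lookup table directly) are assumptions for illustration, not details taken from the figure.

```python
class CLB:
    """A 3-input 2-output lookup table with one flip-flop per output."""

    def __init__(self, lut, use_ff):
        self.lut = lut          # 8 words, each a (D1, D0) pair
        self.use_ff = use_ff    # per-output select bit; 1 = flip-flop (assumed polarity)
        self.ff = [0, 0]        # output flip-flops, initially 0

    def lut_out(self, a2, a1, a0):
        return self.lut[(a2 << 2) | (a1 << 1) | a0]

    def clock(self, a2, a1, a0):
        """Rising clock edge: each flip-flop loads its lookup table output."""
        self.ff = list(self.lut_out(a2, a1, a0))

    def outputs(self, a2, a1, a0):
        """Each 2x1 output mux picks the flip-flop or the direct LUT output."""
        d = self.lut_out(a2, a1, a0)
        return tuple(self.ff[i] if self.use_ff[i] else d[i] for i in range(2))

# Lookup table computing (a1', a0'), ignoring a2.
INV_LUT = [(1 - ((a >> 1) & 1), 1 - (a & 1)) for a in range(8)]

registered = CLB(INV_LUT, (1, 1))
registered.clock(0, 0, 1)               # flip-flops load (1, 0)
print(registered.outputs(0, 1, 0))      # (1, 0): stored values, not current inputs

combinational = CLB(INV_LUT, (0, 0))
print(combinational.outputs(0, 1, 0))   # (0, 1): direct lookup table outputs
```

The registered CLB keeps presenting the values captured at the last clock edge, which is what lets the FPGA hold state between cycles.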
Figure 7.26 An FPGA with configurable logic blocks, which contain flip-flops along with a lookup
table. We've put 0s in all the configuration memory bit cells in the figure.
EXAMPLE 7.11 Implementing a sequential circuit on an FPGA
We wish to implement the circuit shown in Figure 7.27(a) on the FPGA of Figure 7.26. We first
connect a and b to the left lookup table, and c and d to the right lookup table through the switch
matrix as shown in Figure 7.27(c). We program the left lookup table to output the functions a' and
b', as shown in Figure 7.27(b). Likewise, we program the right lookup table to output c and d. We
program all the configurable logic block outputs to connect to their flip-flops by programming 1s
into the CLB output configuration memories, as shown in Figure 7.27(c).
Figure 7.27 Implementing a sequential circuit on an FPGA: (a) desired sequential circuit,
(b) left CLB's lookup table program bits, (c) programmed FPGA. Unused bits are italicized.
Care should be taken to avoid confusing the output flip-flops themselves with the
CLB output configuration memories. The configuration memories store the bits that
program the FPGA to implement the desired circuit, while the
output flip-flops store the bits that the circuit loads during circuit operation.
The storage elements for the lookup tables, the CLB output configuration, and the
switch matrices are collectively known as an FPGA's configuration memory, although that
"memory" is composed of numerous smaller memories and even registers or flip-flops.
Overall FPGA Architecture
A commercial FPGA contains hundreds or even thousands of CLBs and switch matrices,
arranged in a regular pattern on the chip. A sample arrangement is shown in Figure 7.28.
CLBs connect with horizontal and vertical routing channels, which connect to switch
matrices. A sample connection of a CLB to the routing channels is shown for the top center
CLB. The routing channels consist of tens of wires, represented in the figure just as single
bolded wires.
Figure 7.28 FPGA architecture: a grid of CLBs and switch matrices.
CLBs and switch matrices in commercial FPGAs are more complex than described
in this chapter. For example, CLBs may contain two lookup tables, or direct connections
to adjacent CLBs to support carry chains. Switch matrices may contain more inputs and
outputs and more flexible switching options. Furthermore, commercial FPGAs may also
include large embedded RAM memories for data storage, and embedded multipliers or
multiply-accumulate units for fast multiplications.
Programming an FPGA
We haven't said anything yet about how we actually program the lookup tables, switch
matrix configuration memories, and CLB-output configuration memories; in particular,
how do we get the program bits into the configuration memories? The
configuration memories are all the lookup table memories, the switch matrix memories,
and the CLB-output configuration memories. Conceptually, programming is
enabled by the FPGA having all the configuration memory bit storage cells connected
as one big shift register. That shift register's bit storage cells are spread out across the
chip, so don't represent a traditional register whose bits are usually in one place, but
thinking of them as a shift register helps understand their connectivity. Actually,
storage cells connected as a shift register are typically referred to as a scan chain.
The FPGA will have an extra input pin for programming that serves as the shift input
for the shift register. Another extra input pin indicates that programming is taking
place. During programming, we shift in the bits necessary to implement our desired
circuit. Remember that the configuration memory cells only get written during
programming of the FPGA; during normal FPGA operation, those configuration memory
cells become read-only. Thus, one can conceive of FPGAs whose configuration
memories are made from programmable read-only memory technology (PROM, EPROM,
or EEPROM), although today most FPGAs use RAM and flip-flop components for
configuration memories. RAM and flip-flops are used probably because those
components need to be programmed quickly using the scan chain method, easily achieved
using RAM/flip-flop components, but not so easily using EPROM or EEPROM
components.
Automated tools that program FPGAs usually start with a file containing the bits
to be shifted into the FPGA scan chain; that file is known as a bit file. The tool that
creates the bit file obviously must know the number and purpose of every bit cell in
the FPGA scan chain, so such tools will generate a different bit file for different
FPGA devices.
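The scan chain idea can be sketched as a plain shift register. The names `shift_in` and `program`, the shift direction, and the sample bit string are all assumptions for illustration; a real device defines its own chain order and length.

```python
def shift_in(chain, bit):
    """One programming clock: cell 0 takes the input bit, every other cell
    takes the bit from the cell before it, and the last bit falls off."""
    return [bit] + chain[:-1]

def program(chain_length, bit_file):
    """Shift every bit of the bit file (a string of 0s and 1s) into the chain."""
    chain = [0] * chain_length
    for ch in bit_file:
        chain = shift_in(chain, int(ch))
    return chain

bits = "10110010"                 # hypothetical bit file contents
chain = program(len(bits), bits)
print(chain)                      # [0, 1, 0, 0, 1, 1, 0, 1]
```

In this model the first bit in the file travels farthest down the chain, so the chain ends up holding the file in reverse order. A tool that builds the bit file has to account for that ordering, which is one reason it must know the position and purpose of every cell.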
EXAMPLE 7.12 Programming an FPGA
This example demonstrates programming an FPGA for the FPGA and desired circuit shown
in Example 7.11. Figure 7.27 from Example 7.11 showed the required contents of the
configuration memory on the FPGA to implement the desired circuit. We replicate the contents in
Figure 7.29(a), this time illustrating the manner in which the FPGA has the configuration
memory bits connected as a scan chain. Figure 7.29(b) shows how that scan chain
conceptually forms a shift register. Figure 7.29(c) shows the contents of a bit file that could be
used to program the FPGA to implement the desired circuit. We created that bit file by
following the dashed line that represents the scan chain, placing 1s and 0s into the bit file as
we see them in the figure.
Figure 7.29 Programming an FPGA: (a) all configuration bit cells exist in a scan chain, (b) a scan
chain conceptually is a big shift register, (c) a bit file's contents would be shifted in during
programming; some relationships between the file's bits and configuration bit cells are shown.
How Many Gates Does an FPGA Implement?
We usually think of a digital circuit's size using the notion of "gates" to represent design
size. A design with 3000 gates is likely bigger than a design with 2000 gates. Of course,
whether that statement is true depends on the type of gates used in each design (e.g.,
because XOR gates are bigger than NAND gates, 2000 XOR gates may actually be
bigger than 3000 NAND gates), as well as the number of inputs to each gate (a 20-input
gate is bigger than a 2-input gate). Thus, a common method of indicating design size for
a circuit approximates the number of 2-input NAND gates that would be required to
implement the circuit. So when we say that a circuit consists of 3000 gates or 2000 gates,
we typically mean that if those circuits were implemented using 2-input NAND gates,
they would require 3000 2-input NAND gates and 2000 2-input NAND gates,
respectively.
FPGAs have lookup tables and switch matrices inside, not gates. FPGA sizes are
therefore typically reported by considering how large of a circuit made up of 2-input
NAND gates could be implemented using the FPGA architecture. FPGA vendors may
report FPGA size by saying a particular FPGA has a "density of 100,000 system gates" or
"100,000 typical gates." These numbers are approximations, and many people view such
reported numbers very skeptically (because sometimes companies like to exaggerate).
FPGA vendors might also describe FPGA size as the number of "logic blocks" or
"lookup tables," which is useful when comparing sizes of FPGAs having the same type
of logic blocks or lookup tables.
FPGA versus ASICs and Microprocessors
FPGAs are less efficient than ASICs in terms of delay, size, and power. For example, the
circuit of Figure 7.22(a) could be implemented with a delay of just one gate-delay in a
custom or semicustom IC technology. However, when mapped to the FPGA of Figure
7.26, that circuit will have a longer delay: the inputs must pass through the left CLB's
lookup table (which may have a delay of two gate-delays), through the left CLB's output
muxes (another two gate-delays), through the switch matrix (another two gate-delays),
through the right CLB's lookup table (another two gate-delays), and finally through the
right CLB's output muxes, resulting in a total of ten gate-delays. In terms of size, an
ASIC implementation of the circuit of Figure 7.22(a) would require about 20 transistors,
whereas the FPGA implementation using two CLBs and a switch matrix would require
several hundred transistors.
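The delay comparison above is simple arithmetic, sketched here. The stage names follow the text; the two gate-delays assigned to the final output muxes is inferred from the stated total of ten, and the list representation is only an illustrative assumption.

```python
# Gate-delay estimate for the two-CLB FPGA path described above.
FPGA_PATH = [
    ("left CLB lookup table", 2),
    ("left CLB output muxes", 2),
    ("switch matrix", 2),
    ("right CLB lookup table", 2),
    ("right CLB output muxes", 2),  # inferred from the total of ten
]
ASIC_GATE_DELAYS = 1  # one gate-delay in a custom or semicustom IC

def total_gate_delays(path):
    """Add up the gate-delays of every stage on the path."""
    return sum(delay for _stage, delay in path)

fpga_delay = total_gate_delays(FPGA_PATH)
slowdown = fpga_delay / ASIC_GATE_DELAYS
print(fpga_delay, slowdown)
```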
An FPGA implementation of a circuit will therefore be slower and bigger than an
ASIC implementation of the same circuit. Some studies have shown that FPGA implementations are
approximately 10 times slower, and 10-30 times bigger, than ASIC implementations of
the same circuit. Similarly, a circuit implemented on an FPGA may consume about 10
times more power than when implemented on an ASIC. But the advantage of being able
to program FPGAs immediately and for almost no cost, rather than having to wait weeks
or months while spending tens of thousands of dollars, often outweighs those
disadvantages.
Despite the performance, size, and power overhead compared to ASICs, FPGAs are
still much faster than software on a microprocessor for many tasks, in part because
FPGAs can effectively implement concurrency, pipelining, and bit-level operations. Thus,
FPGAs possess the programming flexibility of software on a microprocessor, yet
approach the performance of an ASIC, representing an excellent implementation option
for many designs.
7.4 OTHER TECHNOLOGIES
In this section, we describe other technologies for physically implementing digital
circuits. Some of those technologies are older technologies that are still useful for certain
situations. Others are newer technologies that are beginning to gain popularity.
Off-the-Shelf Logic (SSI) IC
Sometimes we need only implement a circuit having just a few gates. In these cases,
using an FPGA may be overkill, as FPGAs typically support thousands or millions of
gates. Likewise, using an ASIC would also be overkill. For cases where we only need a
few gates, we might instead use one or more off-the-shelf logic ICs. A logic IC typically
contains a few, perhaps ten or fewer, gates connected directly to the IC's pins, as shown in
Figure 7.30. The IC shown has four AND gates and 14 pins. One pin is for power to the IC
(known as VCC), another for ground (GND). The remaining pins connect to the four AND
gates in the IC, as shown in the figure. Different logic ICs have gate types other than
AND, such as OR, NAND, NOR, or NOT. To build a small circuit from these off-the-shelf
logic ICs, we would simply place the ICs on a board and connect the appropriate pins.
ICs with a few gates are known as Small-Scale Integration chips, or SSI chips.
7400 ICs
Figure 7.30 Example logic IC. (Pins I14 through I8 run along the top of the package and
I1 through I7 along the bottom; VCC and GND are the power and ground pins.)
The most popular off-the-shelf SSI ICs are known generally as 7400-series ICs. A 7400 IC
typically contains four to six logic gates, and about 14 pins. A particular 7400 IC is shown
in Figure 7.31. The IC measures about 1/2 inch across. The IC package shown has two
rows, or lines, of pins, and is thus known as a dual-inline package, or DIP.
Figure 7.31 7400-series IC.
7400 ICs first became available in the early 1960s. The original 7400 chip had four NAND
gates, and cost about $1000 each, in 1962. That's right: $1000. And that's in 1960s
dollars, when a U.S. engineer earned only about $10,000/year. The price dropped
significantly during that decade, thanks in large part to the use of huge numbers of the
devices by the U.S. Minuteman Missile and the Apollo rocket programs, and has continued
to drop since then due to cheaper transistors and huge volumes. Today, you can buy
7400-series ICs for tens of cents each.
Parts with different gates have different part numbers. Table 7.1 shows some commonly used
7400 parts from Fairchild's 74LS00 subfamily of the 7400 series. In addition to basic gates,
the table shows ICs with D flip-flops, full-adders, or a magnitude comparator. Parts also
exist for XOR, XNOR, buffers, decoders, multiplexers, up-counters, up-down-counters, and more.

TABLE 7.1: Commonly used 7400-series ICs.
Part     Description                                                       Pins
74LS00   Four 2-input NAND                                                 14
74LS02   Four 2-input NOR                                                  14
74LS04   Six inverters                                                     14
74LS08   Four 2-input AND                                                  14
74LS10   Three 3-input NAND                                                14
74LS11   Three 3-input AND                                                 14
74LS14   Six inverters (Schmitt trigger)                                   14
74LS20   Two 4-input NAND                                                  14
74LS27   Three 3-input NOR                                                 14
74LS30   One 8-input NAND                                                  14
74LS32   Four 2-input OR                                                   14
74LS74   Two D flip-flops, positive edge triggered, with preset and reset  14
74LS83   4-bit binary full-adder                                           16
74LS85   4-bit magnitude comparator                                        16
There are several different subfamilies of 7400-series parts; parts from a subfamily
can be used with other parts from the subfamily, but generally not with parts from other
subfamilies. The reason is that the voltage and current settings of a subfamily are designed
such that the ICs can be connected without worrying about adjusting the voltage and current
between ICs. The 74 series (e.g., 7400, 7402, etc.) is the basic subfamily, based on a type
of transistor known as TTL; designers using logic ICs today only use 74-series ICs if they
must integrate with old designs, and typically don't use the series for new designs. The
74LS subfamily (e.g., 74LS00, 74LS02) uses a special type of TTL technology known as
Schottky that results in lower power and slightly higher speed than the 74 series; the "L" in
the name means low-power, the "S" means Schottky. The 74HC subfamily uses high-speed
(denoted by the "H") CMOS (denoted by the "C") transistors. The 74F subfamily was
introduced by Fairchild, consisting of fast (hence the "F") advanced Schottky TTL logic.
Numerous other 7400 subfamilies exist, with new subfamilies still being introduced.
Furthermore, additional series of off-the-shelf SSI ICs also exist in addition to the 7400
series. Another popular series is the 4000 series of ICs, a CMOS series that evolved in the
1970s as a low-power alternative to the TTL-based 7400 series. More series exist too.
EXAMPLE 7.13 Seat belt warning implementation using off-the-shelf 7400 ICs
Using 74LS-series ICs shown in Table 7.1, physically implement the seat belt warning light circuit
of Figure 7.1, shown again in Figure 7.32(a). We could implement the inverter using a 74LS04. The
74LS08 has 2-input AND gates, and we need a 3-input AND gate. A simple solution is to decompose
the 3-input AND into two 2-input ANDs, as shown in Figure 7.32(b). The final implementation
is shown in Figure 7.32(c).
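The decomposition can be checked exhaustively with a small truth-table sketch. We assume the warning function is w = k*p*s' (the term used later in Example 7.14); the function names and the choice of which two inputs are ANDed first are illustrative assumptions, since the figure is not reproduced here.

```python
from itertools import product

def inv(a):
    """One inverter, e.g. one gate of a 74LS04."""
    return 1 - a

def and2(a, b):
    """One 2-input AND gate, e.g. one gate of a 74LS08."""
    return a & b

def warning_desired(k, p, s):
    """Reference behavior: w = k * p * s' as a single 3-input AND."""
    return 1 if (k and p and not s) else 0

def warning_two_ics(k, p, s):
    """Same function using one inverter and two 2-input ANDs."""
    s_n = inv(s)
    kp = and2(k, p)       # first 2-input AND (assumed grouping)
    return and2(kp, s_n)  # second 2-input AND

mismatches = [
    (k, p, s)
    for k, p, s in product((0, 1), repeat=3)
    if warning_desired(k, p, s) != warning_two_ics(k, p, s)
]
print(mismatches)
```

An empty list means the two circuits agree on all eight input combinations.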
Figure 7.32 Implementing the seat belt warning circuit with 74LS-series ICs: (a) desired
circuit, (b) circuit transformed to use 2-input AND gates, (c) circuit mapped to two 74LS ICs.
Additional connections not shown would be power to the I14 pin, and ground to the I7 pin, on
each IC.
Preferably, we would implement the circuit using one IC, to reduce board size, cost, and
power. Converting the circuit to use only one type of gate, like NAND gates only, or NOR gates
only, could result in just one IC. For example, if we could convert to 3-input NOR gates, we could
use the 74LS27 chip. We start by converting the circuit to NOR gates only, as in Figure 7.33(a). We
remove the double inversion, and replace the single inversions by 3-input NOR gates. The
implementation using a 74LS27 IC is shown in Figure 7.33(c).
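The NOR-only version follows from DeMorgan's law: k*p*s' = (k' + p' + s)'. A minimal sketch checks that three 3-input NOR gates reproduce the warning function. We assume w = k*p*s', and we assume each inverter is built by tying all three inputs of a NOR gate together; the actual pin wiring of Figure 7.33(c) may differ.

```python
from itertools import product

def nor3(a, b, c):
    """One 3-input NOR gate, e.g. one of the three gates in a 74LS27."""
    return 0 if (a or b or c) else 1

def warning_nor_only(k, p, s):
    """w = k * p * s' built from three 3-input NOR gates."""
    k_n = nor3(k, k, k)       # NOR used as an inverter (assumed wiring)
    p_n = nor3(p, p, p)       # NOR used as an inverter (assumed wiring)
    return nor3(k_n, p_n, s)  # (k' + p' + s)' = k * p * s'

nor_mismatches = [
    (k, p, s)
    for k, p, s in product((0, 1), repeat=3)
    if warning_nor_only(k, p, s) != (k & p & (1 - s))
]
print(nor_mismatches)
```

Note that s feeds the final NOR directly: its inversion and the NOR's input inversion cancel, which is the double inversion removed in the text.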
Figure 7.33 Implementing the seat belt warning circuit with one 74LS IC, namely, the 74LS27
consisting of three 3-input NOR gates: (a) desired circuit transformed to NOR gates with
inversion bubbles, (b) circuit with double inversions eliminated and single inversions
replaced by 3-input NOR gates, (c) circuit mapped to a 74LS27 chip. Additional connections
not shown would be power to the I14 pin, and ground to the I7 pin.

Simple Programmable Logic Device (SPLD)
A programmable logic device, or PLD, is an IC that can be configured to implement a
variety of logic functions, ranging from tens to thousands of gates. PLDs became popular
in the 1970s (thus predating FPGAs), as they could implement far more functionality in a
single IC than possible using SSI ICs.
A PLD device contains a prefabricated circuit with a set of external inputs feeding
into a large AND-OR circuit structure, with the special feature that the user can configure
(via "programming") which external inputs connect to the AND gates. For example,
Figure 7.34 shows a basic PLD with three inputs feeding into three AND gates followed
by an OR gate. The inputs feed into the AND gates in both true and complemented forms.
Each wire feeding into each AND gate passes through a programmable node, which can
either pass the node's input to the node's output, or disconnect the node's input from the
node's output. Thus, by programming the programmable nodes, we can program the PLD
to implement any 3-term function of three inputs.
The programmable node design varies among types of PLDs. Figure 7.35 shows two
types. The type shown in Figure 7.35(a) is based on a fuse. A fuse conducts like a wire,
unless we "blow" the fuse, meaning we pass a higher than normal current through the
fuse, causing the fuse to literally burn up and break. A blown fuse obviously does not
conduct electricity. The type shown in Figure 7.35(b) is based on memory and a
transistor; we program a 1 into the memory to cause the transistor to conduct, and a 0 to
cause the transistor to not conduct. We omit the details of how to program the fuses or
program the memories. Memory-based PLDs can usually be reprogrammed,
in contrast to fuse-based PLDs that can only be programmed once and are known as
one-time programmable (OTP) devices. Fuse-based PLDs are popular in electrically
noisy applications like space applications, since memories can have their contents
changed from radiation in space. They are also popular in applications demanding high
security, since malicious enemies can't reprogram the device. Memory-based devices are
more common, however, since they can be reprogrammed and thus reduce costs when we
make design changes. The memories used are almost always nonvolatile, meaning the
memories don't need power to retain their stored bits. (See Section 5.6 for more
information on nonvolatile memories.)

Figure 7.34 A basic example of a programmable logic device. (AND gates are wired-AND.)

Figure 7.35 Two types of programmable nodes: (a) fuse based, (b) memory based.
You might be wondering how those AND gates work when the programmable node
is programmed to disconnect an input: how does the AND gate treat an input with no
connection? As a 0, as a 1, or as something else? Actually, PLDs don't use normal AND
gates. Instead, they typically use what is known as "wired-AND." Explaining how
wired-AND works is beyond the scope of this book, and is instead the subject of a course on
transistor-level circuits. For our purposes, we can think of a wired-AND gate as an AND gate
that simply ignores unconnected inputs.
Real PLDs have more than just three inputs, three AND gates, and one output. PLD
structure drawings thus need a more concise way of drawing the circuits. A concise method
for drawing PLDs is shown in Figure 7.36. Such a drawing doesn't show the programmable
nodes, and simply utilizes an "x" to indicate a connection. In the drawing, wires
that cross each other are not connected unless an "x" exists at the crossing. Furthermore,
such a drawing uses a single wire to represent all the AND gate inputs, representing the
wired-AND. The figure shows how we would use such a drawing to indicate the connections
needed to generate the term I3*I2'. The "x" on the left represents I2' feeding
into the top AND gate. The "x" on the right indicates I3 feeding into the top AND gate.

Figure 7.36 Simplified PLD drawing.
EXAMPLE 7.14 Seat belt warning light using a simple PLD
We wish to implement the seat belt warning light system of Figure 7.1 using the PLD of
Figure 7.36. We can do so by programming the PLD as shown in Figure 7.37. We generate the
desired term kps' by programming connections for the top AND gate as shown. We want the
bottom two AND gates to output 0, so that the OR gate's output equals the top AND gate's
output. We can achieve 0s by ANDing an input with its complement; the result of a*a' is
always 0. The figure shows two ways of generating 0s, with the middle gate using just one
of the inputs and the bottom gate using all three inputs; the result is the same.
Figure 7.37 Seat belt warning system on a simple PLD.
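The programming in this example can be mimicked with a small software model of the basic PLD. This is only a sketch under stated assumptions: the wired-AND is modeled as an AND that ignores unconnected inputs (as the text suggests), the inputs are named k, p, and s rather than I1 to I3, and the middle gate is assumed to use input k, since the text says only that it uses "just one of the inputs."

```python
from itertools import product

def wired_and(connections, inputs):
    """AND together only the connected inputs; unconnected inputs are ignored.

    connections is a list of (name, complemented) pairs, one per "x" in the
    simplified PLD drawing. With no connections this model returns 1, which
    is a modeling choice, not a statement about real devices.
    """
    result = 1
    for name, complemented in connections:
        bit = inputs[name]
        result &= (1 - bit) if complemented else bit
    return result

def pld_output(and_rows, inputs):
    """OR together the outputs of all programmed AND rows."""
    return int(any(wired_and(row, inputs) for row in and_rows))

SEAT_BELT_PROGRAM = [
    [("k", False), ("p", False), ("s", True)],                 # top: k*p*s'
    [("k", False), ("k", True)],                               # middle: k*k' = 0
    [(n, c) for n in ("k", "p", "s") for c in (False, True)],  # bottom: all inputs, 0
]

results = {}
for k, p, s in product((0, 1), repeat=3):
    results[(k, p, s)] = pld_output(SEAT_BELT_PROGRAM, {"k": k, "p": p, "s": s})
print(results)
```

Only the input combination k=1, p=1, s=0 produces a 1, matching the term kps'; the middle and bottom rows never contribute.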
PLDs typically have more than just one output. Figure 7.38(a) shows a PLD with two
outputs instead of just one. Each output is an OR of up to three terms.
Many PLDs have a D flip-flop that stores each output's bit, and the PLD's output pin
can be programmed to connect from the OR gate output or the flip-flop output, known as
combinational or registered output, respectively. A PLD supporting combinational/registered
output is shown in Figure 7.38(b).
Figure 7.38 (a) PLD with two outputs, (b) PLD with programmable registered outputs.
Another extension is to allow the PLD output to be either the true or complemented
value of the OR gate or flip-flop output, using a 2x1 mux controlled by a programmable bit.
Yet another extension is for the output to feed back to the input array. One use of feedback
is to implement functions with more terms, achieved by feeding back the combinational
output value. Another very common use of feedback, achieved by feeding back the
registered output, is to implement a state register and control logic (i.e., a controller): the
AND array gets its inputs from the registered outputs and external inputs, and the OR gates
then generate the external outputs and the next values for the state register.
Some PLDs not only have a programmable AND array, but also have a programmable
OR array, meaning the OR gate can get its inputs from any of the AND gates.
SPLD versus PAL versus GAL versus PLA
Like so many names in the rapidly evolving field of high technology, names for PLDs are
a bit blurred and confusing. Originally (1970s), PLDs consisted of programmable AND
arrays and programmable OR arrays, and were known as programmable logic arrays, or
PLAs. In the mid-1970s, a company named AMD (Advanced Micro Devices, Inc.) developed
PLDs that instead had OR gates with fixed rather than programmable inputs, as in
Figure 7.38 and the other PLD figures we've shown, and referred to such devices as
Programmable Array Logic, or PALs ("PAL" is a registered trademark of AMD). PALs were
originally fuse-based and hence one-time-programmable. A company named Lattice
Semiconductor Corporation developed a PLD using a memory-based programming
approach rather than fuses, resulting in reprogrammability, and referred to such devices
as Generic Array Logic, or GAL (which is a registered trademark of Lattice Semiconductor
Corporation). As PLDs became more complex (as we'll discuss in the next
section), PLDs based on PAL or GAL architectures (PLA architectures seem to be pretty
rare) became known as Simple PLDs, or SPLDs, to contrast them with the more complex
PLD varieties. Today, numerous companies manufacture SPLDs, and often state that their
SPLD architecture is based on "PAL" or "PAL/GAL" architectures, with the distinction
between PAL and GAL not seemingly relevant in that context.
SPLDs typically support tens to hundreds of logic gates.
Complex Programmable Logic Device (CPLD)
As IC transistor densities grew in the 1980s, companies began to build PLDs to support
thousands of gates. However, the PLD architecture described in the previous section does
not scale well to thousands of gates: who needs one big huge circuit of two-level logic?
Instead, architectures evolved that consisted of numerous SPLDs on a single device,
connected using switch matrices (also known as programmable interconnect; see Section
7.3 for details on switch matrices). These devices today are known as Complex PLDs, or
CPLDs. CPLDs can typically implement designs with thousands of gates.
SPLDs versus CPLDs versus FPGAs
What's the difference among SPLDs, CPLDs, and FPGAs? In general, the term SPLD is used
for devices that support tens to hundreds of gates, CPLD for devices that support thousands
of gates, and FPGA for devices that support tens of thousands of gates to millions of gates.
Furthermore, today, SPLDs and CPLDs are almost always nonvolatile, meaning they
can store their program even after power is removed, whereas FPGAs are almost always
volatile, meaning they lose their program when power is removed, and must include
external circuitry that stores the program in nonvolatile memory and that programs the
FPGA from that memory on power up of the FPGA. FPGAs are likely volatile
because of the way they are programmed using a scan chain, which is easy using flip-flops
and RAM cells, but would be difficult using nonvolatile memory bits. However,
conceptually, any of SPLDs, CPLDs, and FPGAs could be made to be volatile or nonvolatile,
and one might anticipate that future FPGAs will include FPGAs that are nonvolatile.
FPGA-to-Structured-ASIC Flows
An interesting new technology that has evolved in the early 2000s is that of creating an
ASIC from an FPGA-based design. Many designers use FPGAs for ASIC prototyping.
They use automated tools to implement their circuit on FPGAs, and they then extensively
test the circuit in the circuit's environment, for example, in a prototype DVD player. The
FPGA-based prototype implementation may be larger, costlier, and more power-hungry
than an ASIC-based implementation, but can be very useful for detecting and correcting
problems in the circuit, as well as for demonstrating the eventual product. Once satisfied with
the circuit, designers might then use tools to reimplement the circuit on an
ASIC. The ASIC implementation traditionally did not utilize any information from the
FPGA implementation.
Implementing large circuits on ASICs is a difficult task, even with automated tools.
Nonrecurring engineering costs may exceed hundreds of thousands or even millions of
dollars, and fabricating the IC may take many weeks or months. Furthermore, any
problem with the fabricated ASIC may require a second fabrication cycle, requiring
additional weeks or months. Problems may arise in the ASIC that didn't appear in the FPGA
due to the completely new implementation of the circuit as an ASIC; perhaps timing
problems might arise, for example, due to the circuit being placed and routed in a
completely different fashion than was the case in the FPGA.
To ease the migration of a circuit from FPGA to ASIC, some FPGA vendors offer a
structured ASIC approach. In a structured ASIC approach, an automated tool converts the
FPGA implementation to an ASIC implementation, in contrast to converting the original
circuit to an ASIC implementation. In other words, a structured ASIC will reflect the lookup
table and switch matrix structure of the original FPGA. However, the structured ASIC will
not be programmable, and thus will have faster lookup tables and faster switch matrices,
since their contents will have been "hardwired" into the ASIC. The structured ASIC's
components can be preplaced, with only wires left to be completed to implement a particular
circuit. The result is less NRE cost (tens of thousands of dollars rather than hundreds of
thousands or millions) and less time-to-silicon (weeks rather than months), as well as less
chance of problems. The drawback is that the structured ASIC will be larger, slower, and
more power-hungry than a traditional ASIC, but still better than an FPGA.
The advent of ICs containing a billion transistors has led to ICs that contain what used to
exist on multiple ICs. Thus, a single IC may contain dozens or hundreds of microprocessors,
custom digital circuits, memories, buses, etc. An IC with numerous processors,
custom circuits, and memories is known as a System-on-a-Chip, or SOC.
While many SOCs are created by designers for a particular application (e.g., for a
particular DVD player), other SOCs are created to be used in a variety of different
applications. Such platform SOCs might contain processors and custom circuits specifically
for an application domain. For example, a platform SOC for video processing might
contain custom digital circuits having hardware optimized for high-speed low-power
video compression and decompression (known as codecs); such platforms often contain
codecs for a wide variety of protocols (e.g., MPEG 2, MPEG 4, H.264, etc.), since the
platform could be used in different products supporting different standards. An example
is the Nexperia platform from Philips. Furthermore, some platform SOCs contain FPGAs
in addition to one or more microprocessors and custom digital circuits on the IC. Examples
include the Virtex II Pro platform from Xilinx and the Excalibur platform from
Altera. Designers might utilize a platform SOC to prototype an ASIC, or to physically
implement a system in a final product.
7.5 IC TECHNOLOGY COMPARISONS
Relative Popularity of IC Technologies
In 2002 alone, nearly 80 billion ICs (of all types) were produced. (Source: IC Insights
McClean Report, 2003.)

We've described numerous technologies in this chapter. In this section, we'll give you some
idea of the relative popularity of some of those technologies. Table 7.2 provides the relative
percentage of designs that were physically implemented in various technologies in 2001,
based on a particular study. The table considers each new unique design only once, meaning
that it doesn't matter how many copies of the same design were manufactured. That table's
data does not include off-the-shelf SSI ICs or SPLDs (both represent only a tiny fraction of
the IC market from a total dollars perspective, and are thus often excluded from such
surveys). A different study describes 2002 IC revenues (as opposed to unique designs)
totaling $11 billion as follows: standard cell [...]%, full custom 20%, gate array 10%,
PLD/FPGA 17%, and other 5% (source: WSTS, IC Insights). Yet another study lists 2002
ASIC revenues at $10.9 billion, PLD/FPGA revenues at $[...] billion, and SOC revenues at
$7.6 billion (source: Business Communications Company, 2003). Numbers from different
studies vary; we provide these numbers just to give you a general feel for the popularity of
the various technologies.

TABLE 7.2: Sample % of new implementations in various technologies. Total is more than
100% due to overlap among categories.
Technology          %
[...]               [...]
Gate array          5%
System-on-a-Chip    30%
Full-custom         [...]
CPLD/FPGA           [...]
Source: Synopsys, DAC 2002 panel.
Some general trends seem to include the increasing popularity of FPGAs, the
increasing use of structured ASIC approaches, and the increasing appearance of
system-on-a-chip.
The tools used to map digital designs to physical implementations, collectively
known as Electronic Design Automation tools, or EDA, themselves form a market with
revenues of $3 billion in 2002, $3.6 billion in 2003, and predicted revenues of $[...] billion
in 2006 (source: Gartner Dataquest, 2004).
Tradeoffs among IC Technologies
Figure 7.39 illustrates the general tradeoffs among the key IC technologies described in this
chapter. Technologies toward the right can be more customized to a particular desired
circuit, and thus may have faster performance, higher density (smaller chip for a given circuit),
lower power, and larger chip capacity (more circuits on a single chip). But such customized
technologies will be more costly to design and will take more time to design. Technologies
toward the left are less customized to a particular desired circuit, and thus may be more
quickly available and have lower design cost, but at the expense of slower performance, less
density, higher power, and less chip capacity (fewer of our circuits on a single chip). More
generally, technologies toward the right allow for more optimization. Technologies to the
left yield less optimization, but yield easier design.
Figure 7.39 Tradeoffs among several IC technologies. (Technologies shown: PLD, FPGA,
gate array (semicustom), standard cell (semicustom), and full-custom. Toward the left:
quicker availability, lower design cost, easier design. Toward the right: faster performance,
higher density, lower power, larger chip capacity, more optimized.)
Furthermore, FPGAs and PLDs not only enable easier design, but may be reprogrammable,
a feature that enables changes to the circuit late in the design cycle, or even after
the circuit's IC has been deployed in a final product.
Choosing an IC technology for a particular design will therefore depend on the
constraints imposed on that design. If a design needs to get to market quickly, that constraint
favors PLD and FPGA technologies. If a design must be extremely fast, that constraint
favors semicustom or full-custom technology. If a design must consume very little power
or take up very little space, those constraints favor semicustom or full-custom technology.
If changes to the circuit are likely, that constraint favors PLD and FPGA technologies.
Choosing the best technology is a hard problem, requiring careful consideration of
numerous competing constraints.
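The rules of thumb above can be written down as a simple lookup, which also shows why competing constraints make the choice hard. This is only an illustrative sketch: the constraint names and the set-intersection approach are assumptions, not a method from the text.

```python
# Which technologies each constraint favors, paraphrased from the paragraph above.
FAVORED = {
    "time_to_market": {"PLD", "FPGA"},
    "speed": {"semicustom", "full-custom"},
    "low_power": {"semicustom", "full-custom"},
    "small_size": {"semicustom", "full-custom"},
    "late_changes": {"PLD", "FPGA"},
}

def favored_technologies(constraints):
    """Return the technologies favored by every listed constraint (may be empty)."""
    sets = [FAVORED[c] for c in constraints]
    return set.intersection(*sets) if sets else set()

quick_and_flexible = favored_technologies(["time_to_market", "late_changes"])
conflicting = favored_technologies(["time_to_market", "speed"])
print(quick_and_flexible, conflicting)
```

When the result is empty, no technology satisfies every constraint outright, and the designer must weigh the constraints against each other.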
IC Technologies versus Processor Varieties
IC technologies and processor varieties are orthogonal implementation features. Two
implementation features are orthogonal if we can select each independently (in mathematics,
orthogonal means forming a right angle). We know that there are several processor varieties
that can each implement a desired system function, including a custom processor, or a
programmable processor. Figure 7.40 illustrates that the choice of processor variety is
independent of the choice of IC technology. Point 1 illustrates the choice of implementing
desired system functionality using a custom processor circuit with a full-custom IC
technology. That choice results in a highly optimized design. Point 2 illustrates the choice of
implementing a custom processor circuit on an FPGA. While the circuit may be optimized,
the FPGA IC technology results in a less-optimized implementation (compared to
full-custom) but an easier design. Point 3 illustrates the choice of implementing desired system
functionality as software executing on a programmable processor, where the programmable
processor is implemented in standard cells. Point 4 illustrates the choice of implementing
software on a programmable processor, where the programmable processor is actually
implemented on an FPGA. While that concept may seem strange, a programmable processor is
just another circuit, so that circuit can be mapped to an FPGA just like any other circuit.
Programmable processors mapped to FPGAs are in fact becoming increasingly popular,
because a designer can choose how many processors to put on a single IC (perhaps the
designer wants 9 programmable processors on one IC), and because a designer can put
single-purpose processors alongside programmable processors, all without having to
fabricate a new IC.

Figure 7.40 IC technologies and processor varieties are orthogonal implementation features.
Four of the ten possible choices are shown: (1) custom processor in full-custom technology,
(2) custom processor on an FPGA, (3) programmable processor in standard cells,
(4) programmable processor on an FPGA.

Of course, programmable processors can often be purchased as off-the-shelf ICs, so
a designer using a programmable processor may not have to worry about the processor's
IC technology.
But sometimes designers must place a programmable processor within their own
IC, coexisting with other processors. When a programmable processor coexists on an IC
along with other processors (programmable or custom), that programmable processor is
often referred to as a core.
Our discussion of IC technologies and processor varieties has thus far assumed just
one type of each item (e.g., one type of FPGA). In reality, each type itself has many
varieties. For example, dozens of different types of FPGAs are available, varying in their size,
speed, power, cost, etc. Likewise, dozens of different types of programmable processors
are available, also varying in those features. And we know that we can create different
types of custom processors, varying also in their size, speed, power, etc. Thus, each point
in Figure 7.39 and Figure 7.40 is actually a large collection of points that spread out in
different directions on the plots, and may even overlap with other types. Furthermore,
other IC technologies as well as processor varieties exist and continue to evolve.
We also point out that a single IC may actually incorporate several different IC
technologies. So a single IC may have some circuits created using full-custom technology,
and other circuits created using ASIC or even FPGA technology. Likewise, a single
processor may have different parts implemented in different IC technologies. For example, a
common situation is for a programmable processor to have its datapath implemented in
full-custom technology, but its controller implemented in ASIC technology, the reason
being that the datapath is very regular, while the controller is mostly unstructured
combinational logic.
In summary, designers have a huge number of choices in choosing processor
varieties and IC technologies to implement their systems.
IC Technology Trend: Moore's Law
In a 2004 speech, an Intel vice-president suggested that we now think of transistors as
essentially free.
Understanding the trends of IC technologies requires knowledge of Moore's Law. Moore's Law
roughly states that IC capacity doubles every 18 months. Figure 7.41 plots such doubling,
beginning with about 10 million transistors per IC in 1997. The plot uses a logarithmic scale
for the y-axis; each tick mark represents 10 times more than the previous tick mark. The
growth rate is astounding: ICs increase from 10 million transistors in 1997 to over 10 billion
transistors in 2015. That means that the 2015 IC can hold 1000
times more transistors than the 1997 IC. In other words, the 2015 IC is as powerful as
about 1000 1997 ICs. This increasing capacity trend has also resulted in the cost per
transistor dropping at nearly the same astounding rate.

Figure 7.41 The trend of increasing transistors per IC. (The y-axis is logarithmic, with
tick marks at 10, 100, 1,000, 10,000, and 100,000.)
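The doubling arithmetic can be sketched directly. The function name and parameters are assumptions for illustration, and the calculation takes the 18-month doubling period literally, whereas the text states it only roughly.

```python
import math

def transistors(year, base_year=1997, base_count=10e6, months_per_doubling=18):
    """Projected transistors per IC, assuming a strict doubling period."""
    doublings = (year - base_year) * 12 / months_per_doubling
    return base_count * 2 ** doublings

count_2015 = transistors(2015)               # 12 doublings after 1997
growth = count_2015 / transistors(1997)      # growth factor over 18 years
doublings_for_1000x = math.log2(1000)        # doublings needed for a 1000x increase
print(count_2015, growth, doublings_for_1000x)
```

A strict 18-month doubling gives twelve doublings between 1997 and 2015, consistent with the "over 10 billion" figure; the round factor of 1000 quoted in the text corresponds to about ten doublings.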
The IC capacity trend has many implications. One implication is that digital
designers can create massively parallel designs that use huge numbers of functional units
and registers, to create high-performance systems not previously practical. The number of
required transistors for such designs might have been considered absurd just a decade
earlier. Another implication is that the size overhead of FPGAs compared to ASICs (about
10x) becomes less relevant, making FPGAs an increasingly popular choice in more
systems. Yet another implication is that designers increasingly need automated tools to help
build multimillion transistor circuits, and may increasingly wish to use RTL and
even higher levels of design (e.g., C-based design) as the method for describing circuits,
leaving the remaining design to tools.
At some point, Moore's Law must come to an end, because transistors cannot shrink
to an infinitely small size. When that end will occur has been a subject of debate for many
years. Some people have predicted Moore's Law will continue a couple decades into the
2000s.
7.6 PRODUCT PROFILE: GIANT VIDEO DISPLAY
In the late 1990s and 2000s, giant color video displays became popular at sport stadiums,
car dealerships, casinos, freeway billboards, and various other locations. Most such video
displays utilize a huge grid of light-emitting diodes (LEDs) driven by digital circuits.
A light-emitting diode (LED) is a semiconductor device that emits light when current
passes through the device. In contrast, a traditional "incandescent" light bulb emits light
when current passes through the bulb's internal filament, which is a high-resistance wire
that heats up and glows when current flows through the wire. The wire, however, doesn't
burn because it is enclosed in a vacuum or inert gas within the bulb. Because LED light
comes from a semiconductor material and not from a hot glowing filament in a bulb, LEDs
use less power, last longer, and can handle vibrations that would break a regular light bulb.
LEDs have long been used to display simple device status (e.g., on or off), text messages,
or even simple graphics. However, until recently, LEDs were only available in red, yellow,
and green colors, and were not very bright. Thus, earlier LED video displays were typically
small, used only a single color, and were designed for indoor use. However, with the
development of the blue LED in 1993, and the development of brighter LEDs, full-color LED
displays evolved that can display video in much the same way as a computer monitor or
television, even in sunny outdoor environments. In fact, LEDs, being a semiconductor
technology, have been improving at a rate similar to transistors (which also use
semiconductor technology). The improvement has followed what is known as Haitz's Law (the
LED equivalent of Moore's Law), stating that the LED "flux per package" doubles every 18-24
months, which has been the case for several decades. Due to this improvement, many people
predict that LEDs will replace incandescent light bulbs for home and office lighting. LEDs
have already begun to replace incandescent bulbs in traffic lights, as illustrated in
Figure 7.42.

Figure 7.42 LEDs are replacing incandescent bulbs in traffic lights, as well as other areas.
(Left: traffic light made from an incandescent light and red plastic cover. Right: traffic
light made from several hundred red LEDs.)
Figure 7.43(a) shows a large LED video display capable of displaying full-color
video on a 15 x 8 screen. Because each LED is relatively large (1/8th of an inch
wide, for example) in comparison to the pixels of a computer monitor, one has to stand
several feet away from the LED display to view the image without noticing the individual
LEDs. If we look closer at the LED display, as seen in Figure 7.43(b), we can see the
individual lines of the display. If we look even closer at the display, we can finally
see the individual LEDs within the display, as shown in Figure 7.43(c). That figure shows
the LEDs are clustered into groups of red, green, and blue LEDs; each cluster represents
one pixel. For the LED video display shown in Figure 7.43, each cluster of LEDs consists
of five LEDs: two red, two green, and one blue LED. Giant video displays are indeed
intended to be viewed from a distance, so most viewers don't see the individual LEDs.
Figure 7.43 LED video display: (a) a large LED display (about 10 yards wide), (b) a
closer view showing about 1 yard, (c) a very close view showing about 1 square inch, in
which the individual "pixels" can be seen, each pixel having 2 red LEDs (upper-left and
lower-right of pixel), 2 green LEDs (upper-right and lower-left of pixel), and 1 blue LED
(center of pixel).
Assume we want to create an LED video display capable of displaying a 720x480
pixel video, where each pixel simply consists of one red, one green, and one blue LED. If
each LED cluster has a width of just over 3/8 inch (10 millimeters) and a height of 3/8
inch, our display will be roughly 24 feet wide and 16 feet high. Furthermore, our display
will contain over one million individual LEDs, because 720 * 480 = 345,600 pixels, and
three LEDs per pixel results in 1,036,800 LEDs.
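
The arithmetic above can be checked with a short calculation. This is a minimal sketch, assuming for simplicity a 10 millimeter cluster pitch in both directions (the text gives 10 millimeters for the width and 3/8 inch for the height); the function and constant names are illustrative.

```python
MM_PER_FOOT = 304.8


def display_stats(pixels_wide=720, pixels_high=480,
                  leds_per_pixel=3, cluster_pitch_mm=10.0):
    """Return pixel count, LED count, and approximate size in feet."""
    pixels = pixels_wide * pixels_high
    leds = pixels * leds_per_pixel
    width_ft = pixels_wide * cluster_pitch_mm / MM_PER_FOOT
    height_ft = pixels_high * cluster_pitch_mm / MM_PER_FOOT
    return pixels, leds, width_ft, height_ft


if __name__ == "__main__":
    pixels, leds, width_ft, height_ft = display_stats()
    print(f"{pixels} pixels, {leds} LEDs")
    print(f"about {width_ft:.1f} feet wide by {height_ft:.1f} feet high")
```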
Controlling every LED using a single digital circuit would require millions of output
pins and miles of wire to connect all of the LEDs. Instead, as depicted in Figure 7.44, an
LED video display is constructed of smaller and smaller components. The LED display
consists of an array of smaller components called panels, shown in Figure 7.44(a). The
panels are large display components typically designed in a modular fashion such that
display manufacturers can easily create custom-size video displays and repair broken
components within a display simply by replacing individual panels. The LED display
panels are further divided into LED modules that control the physical LEDs, shown in
Figure 7.44(b). An LED module is the basic display component and, depending on the
design of the module, can control anywhere from a few hundred to a couple thousand
LEDs. For example, in designing a 720x480 pixel display, we may want to use an array
of 6x6 panels, where each panel consists of an array of 5x5 LED modules. Each LED
module would then need to control an array of 24x16 pixels, where each pixel is
composed of three LEDs.
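
The hierarchy in this example can be sketched as a coordinate mapping. The following is a minimal sketch, assuming zero-based pixel coordinates and the 6x6 panel, 5x5 module, 24x16 pixel arrangement from the example; the function name and the tuple layout it returns are hypothetical, not part of any real display controller.

```python
PANELS_PER_SIDE = 6        # 6x6 panels in the display
MODULES_PER_SIDE = 5       # 5x5 LED modules in each panel
MODULE_WIDTH = 24          # pixels per module row
MODULE_HEIGHT = 16         # pixel rows per module

DISPLAY_WIDTH = PANELS_PER_SIDE * MODULES_PER_SIDE * MODULE_WIDTH
DISPLAY_HEIGHT = PANELS_PER_SIDE * MODULES_PER_SIDE * MODULE_HEIGHT


def locate_pixel(x, y):
    """Map a display pixel (x, y) to (panel, module, local) (row, col) pairs."""
    if not (0 <= x < DISPLAY_WIDTH and 0 <= y < DISPLAY_HEIGHT):
        raise ValueError("pixel outside the display")
    module_col, local_x = divmod(x, MODULE_WIDTH)
    module_row, local_y = divmod(y, MODULE_HEIGHT)
    panel_col, col_in_panel = divmod(module_col, MODULES_PER_SIDE)
    panel_row, row_in_panel = divmod(module_row, MODULES_PER_SIDE)
    return ((panel_row, panel_col),
            (row_in_panel, col_in_panel),
            (local_y, local_x))


if __name__ == "__main__":
    print(DISPLAY_WIDTH, "x", DISPLAY_HEIGHT)
    print(locate_pixel(719, 479))
```

The same division is what lets each panel, and then each module, receive only its own portion of the video stream.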
The LED video display functions by dividing the incoming video stream into separate
streams for each panel. The panels further process the video stream by dividing the
incoming video stream into even smaller streams for the LED modules. Finally, the LED
modules display the video frames by controlling the LEDs to output the correct colors for
each pixel, or LED cluster.
LED Module
The LED module controls the individual LEDs within the video display by turning the
LEDs on and off at the proper times to create the final color images. Because each LED
module can consist of thousands of LEDs, directly controlling each LED would require
too many wires. Instead, as shown in Figure 7.45, the LEDs within the LED module are
connected in a matrix with a single control wire for each row and three control wires for
each column (one wire for each colored LED within the LED clusters). In the figure, the
LED module controller controls an array of 2x3 pixels, where each pixel consists of three
individual LEDs, for a total of 18 LEDs. But as shown, the controller uses only 9 wires to
control those 18 LEDs. The wire savings using this row and column approach becomes
even more significant for more pixels. An LED module with 24x16 pixels and three LEDs
per pixel would have 24*16*3 = 1152 LEDs, but the controller would require only 16
(one per row) plus 24*3 wires (three per column), for a total of only 88 wires.
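
The wire counts above follow from a simple formula: one wire per row plus one wire per LED color per column. A minimal sketch, with illustrative function names, comparing the matrix approach against one wire per LED:

```python
def matrix_wires(rows, columns, leds_per_pixel=3):
    """Wires for a row/column matrix: one per row, one per color per column."""
    return rows + columns * leds_per_pixel


def direct_wires(rows, columns, leds_per_pixel=3):
    """Wires if every LED had its own control wire."""
    return rows * columns * leds_per_pixel


if __name__ == "__main__":
    for rows, columns in ((3, 2), (16, 24)):
        print(f"{columns}x{rows} pixels: "
              f"{direct_wires(rows, columns)} LEDs, "
              f"{matrix_wires(rows, columns)} matrix wires")
```

The matrix count grows with the sum of rows and columns, while the direct count grows with their product, which is why the savings increase for larger modules.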
The largest LED display in 2004 was 135 feet wide by 26 feet tall, built using 10 large
FPGAs, 323 moderate-size FPGAs, 333 flash memories, and 3800 PLDs. (Source: Xcell
Journal, Winter 2004.)
Figure 7.44 LED video displays are designed hierarchically: (a) the LED display consists of
several larger panels, which can be composed to create different sized displays and which
can be individually replaced to repair broken panels, (b) each panel consists of several
smaller LED modules, responsible for controlling the individual pixels, and (c) each pixel
consists of a cluster of red, green, and blue LEDs.
The LED module controller displays a video image by sequentially scanning, or
enabling, each row and displaying the pixel values for each column within the video image.
Using this technique, only one row of LEDs is illuminated at any given time. However, the
LED module scans the rows fast enough such that the human eye perceives all rows as being
illuminated.

Figure 7.45 LED module circuit consisting of a matrix of red (R), green (G), and blue (B)
LEDs controlled by the LED module controller. R1/R2/R3 are rows 1 through 3, and C1/C2
are columns 1 and 2; thus the matrix is 2x3 pixels, or 6 pixels total, with 18 LEDs total
(3 LEDs per pixel).

The LED module must control the LEDs to create the desired color for each pixel.
Each pixel within a video frame is typically represented using an RGB color space. An
RGB (red/green/blue) color space is a method to create any color of light by adding
specific intensities, or brightnesses, of red, green, and blue colors. Each pixel within a video
frame may be represented as three 8-bit binary numbers, where each 8-bit number
specifies the intensity of the red, green, or blue colors. Thus, for each color, the LED module
must be able to provide 256 distinct brightness levels. However, an LED by itself only
supports two values: on and off, or full intensity and no intensity.
To support 256 brightness levels, the LED module controller uses pulse width
modulation. In pulse width modulation (also known as PWM), a controller drives a wire with
a 1 value for a specific percentage of a time period. The signal being 1 is known as a
pulse, the duration of the 1 is known as the pulse's width, and the percentage of the
period spent at 1 is known as the duty cycle. When that pulse drives an LED, a wider
pulse causes the LED to appear brighter to the human eye. Figure 7.46 illustrates how the
LED module controller uses pulse width modulation to support various brightness levels
for the LEDs. To illuminate an LED at full brightness, the controller drives the
LED with 1 for the entire period, as shown in Figure 7.46(a). To illuminate the LED at
half brightness, the controller uses a pulse with a 50% duty cycle, as shown in Figure
7.46(b). For 25% brightness, the controller sets the pulse to 1 for 25% of the period,
meaning a 25% duty cycle, as shown in Figure 7.46(c). For an LED video display, the
LED module controller divides the length of time each row is scanned into 255 time
segments, and controls the brightness of the LEDs by turning each LED on for 0 to 255 time
segments, thereby supporting 256 levels of intensity.
Figure 7.46 Pulse width modulation can be used to create various LED brightness levels:
(a) for full brightness, the LED is always on, (b) for half brightness, the LED is turned on
50% of the time, and (c) for quarter brightness, the LED is turned on 25% of the time.
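
The 255-segment scheme described above can be sketched in software. This is a minimal sketch, assuming the pulse is placed at the start of each period (the text does not say where within the period the pulse falls); the function names are illustrative, and a real controller would generate this signal in hardware rather than as a list.

```python
SEGMENTS_PER_PERIOD = 255  # time segments per row scan, as described in the text


def pwm_waveform(level, segments=SEGMENTS_PER_PERIOD):
    """Return one period as a list of 1s and 0s for a brightness level 0..segments."""
    if not 0 <= level <= segments:
        raise ValueError("brightness level out of range")
    # Assumption: the LED is on for the first `level` segments, then off.
    return [1] * level + [0] * (segments - level)


def duty_cycle(waveform):
    """Fraction of the period during which the signal is 1."""
    return sum(waveform) / len(waveform)


if __name__ == "__main__":
    for level in (255, 128, 64, 0):
        wave = pwm_waveform(level)
        print(f"level {level:3d}: duty cycle {duty_cycle(wave):.1%}")
```

Levels 0 through 255 give 256 distinct duty cycles, matching the 256 intensity levels of an 8-bit color component.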
Because an LED module controller must provide precisely timed signals at a fast
rate, custom processors are commonly used rather than just microprocessors. FPGAs are
a common choice for implementing those custom processor circuits in LED video
displays, for several reasons. First, FPGAs are fast enough to support the required scan
rates. Second, the circuit on the FPGAs can be easily changed, making it possible for the
display manufacturer to fix bugs in the circuit, and even upgrade the circuit, without
requiring the high cost of creating a new ASIC. Third, the displays themselves are fairly
large, expensive, and consume much power, and therefore the larger size, higher cost, and
greater power consumption of FPGAs compared to ASICs do not impact the overall
display's size, cost, and power too significantly.
7.7 CHAPTER SUMMARY
In this chapter, we discussed (Section 7.1) the idea that we must map our circuits to a
physical implementation so that those circuits can be inserted into a real system. We
introduced (Section 7.2) some technologies that require that a new chip be fabricated to
implement our circuit. Full-custom technology gives the most optimized implementation,
but is expensive and time-consuming to design. Semicustom technologies give very good
implementations while costing less and taking less time to design, through the
predesigning of the gates or cells that will be used on the IC. We described (Section 7.3) the
increasingly popular technology of FPGAs, and showed how a circuit could be mapped
onto a set of programmable lookup tables and switch matrices. We highlighted (Section
7.4) several other technologies, including off-the-shelf SSI/MSI ICs, and programmable
logic devices. We gave some data (Section 7.5) showing the relative popularity of the
technologies described in the chapter.
An interesting trend in physical implementation is the trend toward programmable
ICs (FPGAs in particular). Implementing functionality on an FPGA involves the task of
downloading a bitstream into the FPGA IC device. One might notice the similarity of that
task with the task of implementing functionality on a microprocessor, which also involves
downloading bits into an IC device. Thus, the difference between software on a
microprocessor and custom digital circuits continues to be blurred, especially when one considers
that modern FPGAs can also include one or several microprocessors within the same IC.
For more information on the blurring, see "The Softening of Hardware," F. Vahid, IEEE
Computer, April 200
7.8 EXERCISES
SECTION 7.2: IC TECHNOLOGIES
7.1 Explain why gate array IC technology has a shorter production time than full-custom IC
technology.
7.2 Explain why the use of NAND or NOR gates in a CMOS gate array circuit is
typically preferred over an AND/OR/NOT implementation of a circuit.
7.3 Draw a gate array IC having three rows, the first row having four 2-input AND gates, the
second row having four 2-input OR gates, and the third row having four NOT gates. Show
how to instantiate wires to the gate array to implement the function F(a, b, c) = abc +
a'b'c'.
7.4 Assume a standard cell library has a 2-input AND gate, a 2-input OR gate, and a NOT gate.
Use a drawing to show how to instantiate and place standard cells on an IC and wire them
together to implement the function in Exercise 7.3. Draw your cells the same size as the
gates in Exercise 7.3, and be sure your rows are of equal size.
7.5 Draw a gate array IC having three rows, the first row having four 2-input AND gates, the
second row having four 2-input OR gates, and the third row having four NOT gates. Show
how to instantiate wires to the gate array to implement the equation F(a, b, c, d) =
a'b + cd + c'.
7.6 Assume a standard cell library has a 2-input AND gate, a 2-input OR gate, and a NOT gate.
Use a drawing to show how to instantiate and place standard cells on an IC and wire them
together to implement the function in Exercise 7.5. Be sure to draw your cells the same size as
the gates in Exercise 7.5, and be sure your rows are of equal size.
7.7 Consider the implementations of a half-adder with a gate array in Figure 7.4 and with standard
cells in Figure 7.6. Assume each gate or cell (including inverters) has a delay of 1 ns. Also
assume that every inch of wire (for each inch in your drawing, not on an actual IC) has a delay
of 3 ns (wires are relatively slow in the era of tiny fast transistors). Determine the delay of the gate
array and the standard cell circuits.
7.8 For your solutions to Exercises 7.3 and 7.4, assume that each gate and cell has a delay of 1 ns,
and that every inch of wire (for each inch in your drawing, not on an actual IC) corresponds to
a delay of 3 ns. Determine the delay of the gate array and standard cell circuits.
7.9 Draw a circuit using AND, OR, and NOT gates for the following equation: F(a, b, c) =
... a'bc + abc'. Place inversion bubbles on that circuit to convert the circuit to:
(a) NAND gates only.
(b) NOR gates only.
7.10 Draw a circuit using AND, OR, and NOT gates for the following equation: F(a, b, c) =
abc + a + b + c. Place inversion bubbles on that circuit to convert the circuit to:
(a) NAND gates only.
(b) NOR gates only.
7.11 Draw a circuit using AND, OR, and NOT gates for the following equation: F(a, b, c, d) =
(ab + c)(a' + d) + c'. Convert the circuit to a circuit using:
(a) NAND gates only.
(b) NOR gates only.
7.12 Draw a circuit using AND, OR, and NOT gates for the following equation: F(w, x, y, z) =
(w + x)(y + z) + wy + xz. Convert the circuit to a circuit using:
(a) NAND gates only.
(b) NOR gates only.
7.13 Draw a circuit using AND, OR, and NOT gates for the following equation: F(a, b, c, d) =
(ab)(b' + c)(a'd + c'). Convert the circuit to a circuit using:
(a) NAND gates only.
(b) NOR gates only.
7.14 Create a template for converting a 3-input AND gate to a circuit using only 3-input NAND
gates.
7.15 Create a template for converting a 3-input OR gate to a circuit using only 3-input NAND gates.
7.16 Create a template for converting a NOT gate to a circuit using only 3-input NAND gates.
7.17 Assume a standard cell library consisting of 2-input and 3-input NAND gates with a delay of
1 ns each, 2-input and 3-input AND and OR gates with a delay of 1.8 ns each, and a NOT gate
with a delay of 1 ns. Compare the number of transistors and the delay of an implementation
using only AND/OR/NOT gates with an implementation using only NAND gates for the
function: F(a, b, c) = ab'c + a'b. For calculating the size of an implementation, assume each
gate input requires two transistors.
7.18 Assume a standard cell library consisting of 2-input AND and OR gates with a delay of 1 ns
each, 3-input AND and OR gates with a delay of 1.5 ns each, and a NOT gate with a delay of
1 ns. Compare the number of transistors and the delay of an implementation using only
2-input AND/OR gates and NOT gates with an implementation using only 3-input AND/OR
gates and NOT gates for the function: F(a, b, c) = abc + a'b'c + a'b'c'. For
calculating the size of an implementation, assume each gate input requires two transistors.
7.19 Assume a standard cell library consisting of 2-input NAND and NOR gates with a delay of
1 ns each, and 3-input NAND and NOR gates with a delay of 1.5 ns each. Compare the
number of transistors and the delay of an implementation using only 2-input NAND/NOR
gates with an implementation using only 3-input NAND/NOR gates for the function:
F(a, b, c) = a'bc + ab'c + abc'. For calculating the size of an implementation,
assume each gate input requires two transistors.
SECTION 7.3: PROGRAMMABLE IC TECHNOLOGY-FPGA
7.20 Show how to implement on a 3-input 2-output lookup table the function F(a, b, c) = a +
bc.
7.21 Show how to implement on two 3-input 2-output lookup tables the function F(a, b, c, d) = ab
+ cd. Assume you can connect the lookup tables in a custom manner (i.e., do not use a switch
matrix; directly connect your wires).
7.22 Show how to implement on two 3-input 2-output lookup tables the following function:
F(a, b, c, d) = a'bd + b'cd'. Assume the two lookup tables are connected in the
manner shown in Figure 7.47. You may not need to use every lookup table output.
7.23 Show how to implement on two 3-input 2-output lookup tables the following
functions: F(x, y, z) = x'y + xyz' and G(w, x, y, z) = w'x'y + w'xyz'. Assume the two
lookup tables are connected in the manner shown in Figure 7.47.

Figure 7.47 Two 3-input 2-output lookup tables implemented using 8x2 memory.

7.24 Show how to implement on two 3-input 2-output lookup tables the following functions:
F(a, b, c, d) = abc + d and G = a'. Implement both F and G with only two lookup tables
connected in the manner shown in Figure 7.47.
7.25 Implement a 2-bit comparator that compares two 2-bit numbers and has three outputs
indicating greater-than, less-than, and equal-to, using any number of 3-input 2-output lookup
tables and custom connections among the lookup tables.
7.26 Show how to implement a 4-bit carry-ripple adder using any number of 3-input 2-output
lookup tables and custom connections among the lookup tables. Hint: map one full-adder to
each lookup table.
7.27 Show how to implement a 4-bit carry-ripple adder using any number of 4-input 1-output
lookup tables and custom connections among the lookup tables.
7.28 Show how to implement a comparator that compares two 8-bit numbers and has a single
equal-to output, using any number of 4-input 1-output lookup tables and custom connections
among the lookup tables.
7.29 Show the bit file necessary to program the FPGA fabric in Figure 7.29 to implement the
function F(a, b, c, d) = ab + cd, where a, b, c, and d are external inputs.
7.30 Show the bit file necessary to program the FPGA fabric in Figure 7.29 to implement the
function F(a, b, c, d) = abcd, where a, b, c, and d are external inputs.
7.31 Show the bit file necessary to program the FPGA fabric in Figure 7.29 to implement the
function F(a, b, c, d) = a'b' + c'd, where a, b, c, and d are external inputs.
SECTION 7.4: OTHER TECHNOLOGIES
7.32 Use any combination of 7400 ICs listed in Table 7.1 to implement the function F(a, b, c, d)
= ab + cd.
7.33 Use any combination of 7400 ICs listed in Table 7.1 to implement the function F(a, b, c, d)
= abc + ab'c' + a'bd + a'b'd'.
7.34 By drawing Xs on the circuit, program the PLD of Figure 7.38(a) to implement a full-adder.
7.35 By drawing Xs on the circuit, program the PLD of Figure 7.38(a) to implement an
equality comparator. Assume the PLD has an additional I4 input.
7.36 *(a) Design a PLD device capable of supporting a 2-bit carry-ripple adder. By drawing Xs on
your PLD circuit, program the PLD to implement the 2-bit carry-ripple adder.
(b) Using a CPLD device consisting of several PLDs from Figure 7.38 and assuming you can
connect the PLDs in a custom manner, implement the 2-bit carry-ripple adder by drawing
Xs on the PLDs.
(c) Compare the size of your PLD and the CPLD by determining the gates required for both
designs (make sure you compare the number of gates within the PLD and CPLD and not
the number of gates used for your implementation).
SECTION 7.5: IC TECHNOLOGY COMPARISONS
7.37 For each of the system constraints below, choose the most appropriate technology from among
FPGA, standard cell, and full-custom IC technologies for implementing a given circuit. Justify
your answers.
(a) The system must exist as a physical prototype by next week.
(b) The system should be as small and low-power as possible. Short design time and low cost
are not priorities.
(c) The system should be reprogrammable even after the final product has been produced.
(d) The system should be as fast as possible and should consume as little power as possible,
subject to being completely implemented in just a few months.
(e) Only five copies of the system will be produced and we have no more than $1000 to spend
on all the ICs.
7.38 Which of the following implementations are not possible? (1) A custom processor on an
FPGA. (2) A custom processor on an ASIC. (3) A custom processor on a full-custom IC. (4)
A programmable processor on an FPGA. (5) A programmable processor on an ASIC. (6) A
programmable processor on a full-custom IC. Explain your answer.
Programmable
Processors
8.1 INTRODUCTION
Digital circuits designed to perform a single processing task, such as a seat belt warning
light, a pacemaker, or an FIR filter, are indeed a very common class of digital circuits. We
might refer to a circuit performing a single processing task as a single-purpose processor.
Single-purpose processors represent a class of digital circuits enabling tremendously fast
or power-efficient computation. However, another class of digital circuits, known as
programmable processors, is also extremely popular, as well as being more widely known.
The programmable processor is largely responsible for the computing revolution that has
taken place in the past several decades, leading to what many call the information age. A
programmable processor, also known as a general-purpose processor, is a digital circuit
whose particular processing task is stored in a memory, rather than being built into the
circuit itself. The representation of that processing task in the memory is known as a
program. Figure 8.1 illustrates single-purpose versus general-purpose processors. We could
create a custom digital circuit for a seat belt warning light system (Chapter 2) or an FIR
filter system (Chapter 5), or instead we could program a general-purpose processor circuit
to implement those systems.
Figure 8.1 Single-purpose versus general-purpose processors. (Panels: a 3-tap FIR filter
single-purpose processor; a general-purpose processor holding a 3-tap FIR filter program
or other programs.)
Some programmable processors, like the well-known Intel Pentium processor or
Sun's Sparc processor, are intended for use in desktop computers. Other programmable
processors, like ARM, MIPS, 8051, and PIC processors (which are widely known in the
design community but less known by the general public), are intended for embedded
systems, like cellular telephones, automobiles, video games, or even tennis shoes with
blinking lights. Some programmable processors, like the PowerPC, are intended for both
desktop and embedded domains.
A benefit of a programmable processor is that its circuit can be mass-produced and
then programmed to do almost anything. Thus, the same programmable desktop
processor can run Windows 98, Windows XP, Linux, or whatever new operating system
program comes about. Likewise, that same processor can run application programs like
word processors, spreadsheets, video games, web browsers, etc. Furthermore, the same
programmable embedded processor can be used in a cell phone, automobile, video game,
or tennis shoe by programming the processor for the desired processing task.
Mass-production results in low costs due to amortization of design costs (see "Why such cheap
calculators" for a discussion of amortization).
Of course, because programmable processors are mass-produced and then used for a
wide variety of applications, there aren't as many unique programmable processor
designs as there are single-purpose processor designs. It follows then that there are far
fewer programmable processor designers than there are single-purpose processor
designers. Nevertheless, even though you may never design a programmable processor as
part of your job, it is interesting and enlightening to understand how such a
programmable processor works. Some people argue that people who understand how a processor
works are even better software programmers. And technology trends have led to the
situation of designers being able to create semicustom processors ("application-specific"
processors) that have just the right architecture for one or a small number of applications,
making knowledge of programmable processor designs important. Finally, there are
indeed people who do design programmable processor architectures, and you never know
if you might end up being one of them.
In this chapter, we show how to design a simple programmable processor using our
previously-described digital design methods. Our purpose is mainly to demystify these
processors and to provide an intuition of how programmable processors work. We point out
that real mass-produced processors are designed using different methods, and their designs
can be much more complex than the design described in this chapter. Learning about those
processors' designs is the subject of many textbooks on computer architecture.
8.2 BASIC ARCHITECTURE
Basic Datapath
A programmable processor consists of two main parts: a datapath and a control unit.
We'll provide a general introduction to those two parts in this section, then we'll provide
a more detailed look at those parts in a subsequent section.
We can view processing generally as:
• Loading data, meaning reading the data on which we wish to work from some
input locations,
• Transforming that data, meaning performing some computations with that data
that result in new data, and
• Storing the new data, meaning writing the new data to some output locations.
For example, a seat belt warning system reads bit data from sensors representing
whether a seat belt is fastened and whether a person is sitting in a seat, transforms that
data by computing a new bit indicating whether to turn on a warning light, and writes that
new data to a warning light. An FIR filter reads data representing the most recent set of
input signal samples, transforms that data by performing multiplies and adds, and writes
new data to an output representing the filtered signal.
A data memory holds all the data that a programmable processor can access, as input data
or output data. For now, assume the words in that data memory are somehow connected to the
outside world (e.g., to the seat belt sensors or to the FIR input and output signals). To
process that data, a programmable processor needs to be able to load data from data
memory into one of several registers (typically a register file) within the processor, needs
to be able to feed data from some subset of registers through functional units that can
perform all possible transformation operations (typically an ALU) we might consider, with
results stored back into a register, and needs to be able to store data from a register back
into data memory. Therefore, we see the need for a programmable processor to include the
basic circuit shown in Figure 8.2, showing a data memory, register file, and ALU. That
circuit is known as the programmable processor's datapath.

Figure 8.2 Basic datapath of a programmable processor. (The data memory is somehow
connected to the outside world.)

The basic datapath shown in Figure 8.2 can perform the following possible datapath
operations in a given clock cycle:
• Load operation: This operation loads (reads) data from any location in the data
memory into any register in the register file. A load operation is illustrated in
Figure 8.3(a).
• ALU operation: This operation transforms register data by passing any two registers
through the ALU, configured for any of the ALU's supported operations, and
back into any register of the register file. An ALU operation is illustrated in Figure
8.3(b). Typical ALU operations include addition, subtraction, logical AND,
logical OR, etc.
• Store operation: This operation stores (writes) data from any register in the register
file to any data memory location. A store operation is illustrated in Figure 8.3(c).
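
The three datapath operations above can be modeled in software. The following is a minimal behavioral sketch, not the circuit of Figure 8.2: the class name, method names, memory size of 256 words, register file size of 16, and the particular set of ALU operations are all assumptions made for illustration. Each method call stands for one datapath operation.

```python
import operator


class BasicDatapath:
    """Behavioral model of a data memory, register file, and ALU."""

    # Assumed set of ALU operations; the text lists these as typical examples.
    ALU_OPERATIONS = {
        "add": operator.add,
        "sub": operator.sub,
        "and": operator.and_,
        "or": operator.or_,
    }

    def __init__(self, memory_words=256, register_count=16):
        self.memory = [0] * memory_words       # data memory (size assumed)
        self.registers = [0] * register_count  # register file (size assumed)

    def load(self, register, address):
        """Load operation: data memory location -> register."""
        self.registers[register] = self.memory[address]

    def alu(self, operation, dest, source_a, source_b):
        """ALU operation: two registers -> ALU -> register."""
        function = self.ALU_OPERATIONS[operation]
        self.registers[dest] = function(self.registers[source_a],
                                        self.registers[source_b])

    def store(self, address, register):
        """Store operation: register -> data memory location."""
        self.memory[address] = self.registers[register]


if __name__ == "__main__":
    datapath = BasicDatapath()
    datapath.memory[0], datapath.memory[1] = 5, 7
    # Adding two memory words takes four separate operations.
    datapath.load(0, 0)
    datapath.load(1, 1)
    datapath.alu("add", 2, 0, 1)
    datapath.store(2, 2)
    print(datapath.memory[2])
```

Because the ALU method reads only registers, the model has no way to add two memory words in a single operation, which mirrors the restriction that data must pass through the register file first.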
These possible datapath operations are illustrated in Figure 8.3. Each such operation
requires the appropriate setting of the control inputs of the data memory, mux, register
file, and ALU; those control inputs will be shown later. For now, just familiarize
yourself with the basic datapath's abilities. Notice that the datapath in Figure 8.2 cannot
directly operate on data memory locations with the ALU in one clock cycle, because the
data must first be read into the register file, which itself requires a clock cycle, before the
data can be operated on by the ALU. A datapath that requires all data to first pass through
the register file before that data can be transformed by the ALU is known as a load-store
architecture.

Figure 8.3 Basic datapath operations: (a) load (read), (b) ALU operation (transform), and
(c) store (write).

EXAMPLE 8.1 Understanding datapath operations
Which of the following are valid single-clock-cycle datapath operations for
the datapath of Figure 8.2?
1. Copy data from a data memory location into a register file location.
2. Read data from two data memory locations into two register file locations.
3. Add data from two data memory locations and store the result in a register
   file location.
4. Copy data from one register file location to another register file
   location.
5. Subtract data in a register file location from a data memory location,
   storing the result in a register file location.
(1) is a valid operation, known as a load operation. (2) is not a valid
operation. We cannot read more than one data memory location during a datapath
operation (for this datapath), and we cannot write to more than one register
file location during an operation. (3) is not a valid operation. Not only can
we not read from two data memory locations during one operation, but we cannot
feed the read values directly into the ALU to perform the add; we must first
perform operations that read the data items into register file locations.
(4) is a valid operation. We can configure the ALU operation to simply pass
one of its inputs through to the output (perhaps by adding 0) and store the
result in the register file. (5) is not a valid operation. We cannot feed a
read data memory location directly to the ALU; there is no such connection in
the datapath. Values read from data memory must be loaded into the register
file first.
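
The three datapath operations can also be modeled in a few lines of software.
The following Python sketch is only an illustration, not the book's circuit:
the class name BasicDatapath, the method names, and the memory and register
file sizes are assumptions chosen for the example. Each method call stands for
one single-clock-cycle datapath operation, so adding two data memory locations
takes two loads followed by one ALU operation, as item (3) of Example 8.1
requires.

```python
class BasicDatapath:
    """Toy model of a load-store datapath: data memory D, register file RF, ALU."""

    def __init__(self, num_words=256, num_regs=16):
        # Sizes are assumptions for this sketch.
        self.D = [0] * num_words   # data memory
        self.RF = [0] * num_regs   # register file

    def load(self, reg, addr):
        """Load operation: RF[reg] = D[addr]."""
        self.RF[reg] = self.D[addr]

    def alu(self, dest, src_a, src_b, op):
        """ALU operation: RF[dest] = RF[src_a] <op> RF[src_b]."""
        operations = {
            "add": lambda a, b: a + b,
            "sub": lambda a, b: a - b,
            "and": lambda a, b: a & b,
            "or": lambda a, b: a | b,
        }
        self.RF[dest] = operations[op](self.RF[src_a], self.RF[src_b])

    def store(self, addr, reg):
        """Store operation: D[addr] = RF[reg]."""
        self.D[addr] = self.RF[reg]


dp = BasicDatapath()
dp.D[4], dp.D[5] = 7, 5

# The ALU only sees register file data, so both operands are loaded first.
dp.load(0, 4)             # RF[0] = D[4]
dp.load(1, 5)             # RF[1] = D[5]
dp.alu(2, 0, 1, "add")    # RF[2] = RF[0] + RF[1]
print(dp.RF[2])
```

The model deliberately offers no method that feeds data memory directly into
the ALU, mirroring the missing connection in the datapath.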
Basic Control Unit
Suppose we want to use the basic datapath of Figure 8.2 to perform the simple
processing task of adding data memory locations 0 and 1 together, and writing
the result in data memory location 9; in other words, we want to compute
D[9] = D[0] + D[1]. We can achieve this processing task by "instructing" the
datapath to perform the following operations:
  load data memory location 0 to register file register R0 (i.e.,
  RF[0] = D[0]),
  load data memory location 1 to register file register R1 (i.e.,
  RF[1] = D[1]),
  perform an ALU operation that adds R0 and R1 and writes the result back into
  R2 (i.e., RF[2] = RF[0] + RF[1]), and
  store R2 into data memory location 9 (i.e., D[9] = RF[2]).
Note that we could have used any registers in the register file, rather than
R0, R1, and R2.
If D[0] contained the value 99 (in binary, of course), and D[1] contained the
value 102, then after carrying out the above operations, D[9] would contain
201.
You might think that having to instruct the datapath to perform four distinct
operations is a rather cumbersome way of adding two data items. If you could
build your own custom digital circuit to implement D[9] = D[0] + D[1], you
would likely just feed D[0] and D[1] through an adder whose output you would
connect to D[9], thus avoiding the four operations involving the register file
and ALU. We see the basic tradeoff of single-purpose versus programmable
processors: programmable processors have the drawback of computation overhead
because they have to be general, but they provide the benefits of a
mass-produced processor that can be programmed to do almost anything.
Somehow we need to describe the sequence of operations (RF[0] = D[0], then
RF[1] = D[1], then RF[2] = RF[0] + RF[1], then D[9] = RF[2]) that we desire to
execute on the datapath. Such descriptions of desired processor operations are
known as instructions, and a collection of instructions is known as a program.
We will store the desired program as words in another memory, called the
instruction memory. We'll describe how to represent those instructions later.
For now, assume that the four instructions are somehow stored in locations 0,
1, 2, and 3 of the instruction memory I, as shown in Figure 8.4.
Now is where the control unit plays a role. The control unit reads each
instruction from instruction memory, and then executes that instruction on the
datapath. To execute our simple program, the control unit would begin by
performing the following tasks, known as stages, to carry out the first
instruction:
Figure 8.4 The control unit in a programmable processor. Instruction memory I
holds 0: RF[0]=D[0], 1: RF[1]=D[1], 2: RF[2]=RF[0]+RF[1], 3: D[9]=RF[2].
1. Fetch: The control unit would start by reading I[0] into a local register,
   a task known as fetching. This stage requires one clock cycle.
2. Decode: The control unit would then determine what operation this
   instruction is requesting, a task known as decoding. This stage also
   requires one clock cycle.
3. Execute: Seeing that this instruction requests the datapath operation
   RF[0] = D[0], the control unit would set the control lines of the datapath
   to read D[0], pass the read data through the 2x1 mux in front of the
   register file, and write that data into RF[0]. The task of carrying out the
   operation is known as executing. Most operations are datapath operations
   (such as a load operation, ALU operation, or store operation), but not all
   operations require the datapath (an example is the jump instruction to be
   discussed later). This stage requires one clock cycle.
Thus, the basic stages the control unit carries out for that first instruction
are fetch, decode, and execute, requiring three clock cycles to complete just
that first instruction.
The local register in which the control unit stores the fetched instruction is
known as the instruction register, or IR, as shown in Figure 8.4. Notice that
the control unit needs to keep track of the location in instruction memory
from which to fetch the next instruction. Since the instruction locations are
usually in sequence, we can use a simple up-counter to keep track of the
current program instruction; such a counter is known as a program counter, or
PC for short. The processor starts with PC=0, so the instruction in I[0]
represents the first instruction of the program.
Figure 8.5 illustrates the three stages of executing the instruction
RF[0] = D[0] stored in I[0]. Assuming PC was previously initialized to 0,
Figure 8.5(a) shows the first stage fetching I[0]'s contents, the instruction
RF[0]=D[0], into IR. Figure 8.5(b) shows the second stage decoding the
instruction and thus determining that the instruction is a load instruction.
Figure 8.5(c) shows the controller executing the instruction by configuring
the datapath to read the value of D[0] and storing that value into RF[0]. If
D[0] contained 99, then RF[0] will contain 99 after completion of the execute
stage.

Figure 8.5 Three stages of processing one instruction: (a) fetch, (b) decode,
(c) execute.
After processing the instruction in I[0], the control unit would fetch the
instruction that is in I[1], decode that instruction, and execute that
instruction (thus executing RF[1] = D[1]), requiring another three cycles.
Next, the control unit would fetch the instruction that is in I[2], decode
that instruction, and execute that instruction (thus executing
RF[2] = RF[0] + RF[1]), requiring another three cycles. Finally, the control
unit would fetch the instruction that is in I[3], decode that instruction, and
execute that instruction (thus executing D[9] = RF[2]), requiring another
three cycles. The four instructions would require 4*3 = 12 cycles to run to
completion on the programmable processor.
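
The fetch, decode, and execute stages can be sketched as a simple software
loop that charges one clock cycle per stage. This is a rough illustration
under stated assumptions: instructions are written as Python tuples rather
than bits (the bit-level encoding comes later), the function name run is made
up, and the loop stops when PC runs past the last instruction, which is a
convenience of the sketch rather than a feature of the processor described
here.

```python
def run(program, D, num_regs=16):
    """Run a list of symbolic instructions; return the clock cycles used."""
    RF = [0] * num_regs
    PC = 0
    cycles = 0
    while PC < len(program):      # stopping condition is an assumption
        # Fetch: copy the current instruction into IR and increment PC.
        IR = program[PC]
        PC += 1
        cycles += 1

        # Decode: determine which operation the instruction requests.
        op, *operands = IR
        cycles += 1

        # Execute: carry out the requested datapath operation.
        if op == "load":          # RF[reg] = D[addr]
            reg, addr = operands
            RF[reg] = D[addr]
        elif op == "add":         # RF[dest] = RF[a] + RF[b]
            dest, a, b = operands
            RF[dest] = RF[a] + RF[b]
        elif op == "store":       # D[addr] = RF[reg]
            addr, reg = operands
            D[addr] = RF[reg]
        else:
            raise ValueError(f"unknown operation: {op}")
        cycles += 1
    return cycles


D = [0] * 256
D[0], D[1] = 99, 102
program = [
    ("load", 0, 0),       # RF[0] = D[0]
    ("load", 1, 1),       # RF[1] = D[1]
    ("add", 2, 0, 1),     # RF[2] = RF[0] + RF[1]
    ("store", 9, 2),      # D[9] = RF[2]
]
cycles = run(program, D)
print(D[9], cycles)
```

With D[0] = 99 and D[1] = 102, the sketch leaves 201 in D[9] and reports
4*3 = 12 cycles, matching the count worked out in the text.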
The control unit will require a controller, like those described in Chapter 3,
that in this case repeatedly performs the fetch, decode, and execute steps
(after having initialized PC to 0); note that a controller appears inside the
control unit in Figure 8.4. An FSM for that controller appears in Figure 8.6.
The controller increments the program counter after fetching each instruction
in state Fetch, so that the next fetch state will fetch the next instruction
(notice that PC gets incremented at the end of the fetch stage in
Figure 8.5(a)). We'll describe the actions of the Decode and Execute states
later.
Figure 8.6 Basic controller states. The Fetch state performs IR=I[PC] and
PC=PC+1.
Thus, the basic parts of the control unit include the program counter PC, the
instruction register IR, and a controller, as illustrated in Figure 8.4. In
previous chapters, our nonprogrammable processors consisted only of a
controller and a datapath. Notice that the programmable processor instead
contains a control unit, which itself consists of some registers and a
controller.
To summarize, the control unit processes each instruction in three stages:
1. first fetching the instruction by loading the current instruction into IR
   and incrementing the PC for the next fetch,
2. next decoding the instruction to determine its operation, and
3. finally executing the operation by setting the appropriate control lines
   for the datapath, if applicable. If the operation is a datapath operation,
   the operation may be one of three possible types:
   (a) loading a data memory location into a register file location,
   (b) transforming data using an ALU operation on register file locations and
       writing results back to a register file location, or
   (c) storing a register file location into a data memory location.
EXAMPLE 8.2 Creating a simple sequence of instructions
Create a set of instructions for the processor in Figure 8.4 to compute
D[3] = D[0] + D[1] + D[2]. Each instruction must represent a valid operation.
We might start with three operations that read the data memory locations into
register file locations:
0. R[3] = D[0]
1. R[4] = D[1]
2. R[2] = D[2]
Note that we intentionally chose arbitrary register locations, to make clear
that we can use any registers.
Next, we need to add the three values and store the result in a register file
location, say R[1]. In other words, we want to perform the following
operation: R[1] = R[2] + R[3] + R[4]. However, the datapath of Figure 8.4
cannot add three register file locations in a single operation, but rather can
only add two locations. Instead, we can describe the desired addition
computation by dividing the computation into two datapath operations:
3. R[1] = R[2] + R[3]
4. R[1] = R[1] + R[4]
Finally, we write the result into D[3]:
5. D[3] = R[1]
Thus, our program consists of the six instructions appearing above, which we
might store in instruction memory locations 0 through 5.
EXAMPLE 8.3 Evaluating the time to carry out a program
Determine the number of clock cycles required for the processor of Figure 8.4
to execute the six-instruction program of Example 8.2.
The processor requires 3 cycles to process each instruction: 1 cycle to fetch
the instruction, 1 cycle to decode the fetched instruction, and 1 cycle to
execute the instruction. At 3 cycles per instruction, the total cycles for
6 instructions is: 6 instr * 3 cycles/instr = 18 cycles.
8.3 A THREE-INSTRUCTION PROGRAMMABLE PROCESSOR
A First Instruction Set with Three Instructions
The way we represent instructions in the instruction memory, and the list of
allowable instructions, are known as a programmable processor's instruction
set. Let's assume that a processor uses 16-bit instructions, and that the
instruction memory I is 16 bits wide. Instruction sets typically reserve a
certain number of bits in the instruction to denote what operation to perform.
The remaining bits specify any additional information needed to perform the
operation, such as the source or destination registers. We define a simple
three-instruction instruction set, with the most significant (meaning
leftmost) 4 bits identifying the appropriate operation and the least
significant 12 bits containing register file and data memory addresses, as
follows:
Load instruction: 0000 r3r2r1r0 d7d6d5d4d3d2d1d0. This instruction specifies a
move of data from the data memory location whose address is specified by the
bits d7d6d5d4d3d2d1d0 into the register file register whose location is
specified by the bits r3r2r1r0. For example, the instruction
"0000 0000 00000000" specifies a move of data memory location 0, or D[0], into
register file location 0, or RF[0]; in other words, that instruction
represents the operation RF[0]=D[0]. Likewise, "0000 0001 00101010" specifies
RF[1]=D[42]. We've inserted spaces between some bits for ease of reading;
those spaces have no other significance and would not exist in the instruction
memory.

Store instruction: 0001 r3r2r1r0 d7d6d5d4d3d2d1d0. This instruction specifies
a move of data in the opposite direction as the instruction above, meaning a
move from the register file to the data memory. So "0001 0000 00001001"
specifies D[9]=RF[0].

Add instruction: 0010 ra3ra2ra1ra0 rb3rb2rb1rb0 rc3rc2rc1rc0. This instruction
specifies an addition of two register file registers specified by
rb3rb2rb1rb0 and rc3rc2rc1rc0, with the result stored in the register file
register specified by ra3ra2ra1ra0. For example, "0010 0010 0000 0001"
specifies the instruction RF[2]=RF[0]+RF[1]. Note that add is an ALU
operation.
None of these instructions modifies the contents of the instructions' source
operands. In other words, the load instruction copies the contents of the data
memory location to the specified register, but leaves the data memory location
itself unchanged. Likewise, the store instruction copies the specified
register to data memory, but leaves the register's contents unchanged. The add
instruction reads its b and c registers without changing them.
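
The bit layouts above can be checked with a little shifting and masking. The
following Python sketch packs the opcode and operand fields into a 16-bit word
and unpacks them again; the function names (encode_load, encode_store,
encode_add, fields, bits) are made up for this illustration and are not part
of the instruction set definition.

```python
OP_LOAD, OP_STORE, OP_ADD = 0b0000, 0b0001, 0b0010


def encode_load(ra, d):
    """0000 r3r2r1r0 d7..d0  ->  RF[ra] = D[d]."""
    return (OP_LOAD << 12) | (ra << 8) | d


def encode_store(ra, d):
    """0001 r3r2r1r0 d7..d0  ->  D[d] = RF[ra]."""
    return (OP_STORE << 12) | (ra << 8) | d


def encode_add(ra, rb, rc):
    """0010 ra rb rc  ->  RF[ra] = RF[rb] + RF[rc]."""
    return (OP_ADD << 12) | (ra << 8) | (rb << 4) | rc


def fields(word):
    """Split a 16-bit instruction word into its possible fields."""
    return {
        "op": (word >> 12) & 0xF,   # leftmost 4 bits
        "ra": (word >> 8) & 0xF,
        "rb": (word >> 4) & 0xF,
        "rc": word & 0xF,
        "d": word & 0xFF,           # overlaps rb and rc
    }


def bits(word):
    """Format a word as 16 binary digits, without the readability spaces."""
    return f"{word:016b}"


print(bits(encode_load(1, 42)))      # the RF[1]=D[42] example
print(bits(encode_store(0, 9)))      # the D[9]=RF[0] example
print(bits(encode_add(2, 0, 1)))     # the RF[2]=RF[0]+RF[1] example
```

Note that the d field and the rb/rc fields occupy the same eight low bits; the
opcode determines which interpretation applies.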
Using this instruction set, we would describe our earlier program that
computes D[9]=D[0]+D[1] as shown in Figure 8.7.
Notice that the first four bits of each instruction are a binary code that
indicates the instruction's operation. Those bits are known as the
instruction's operation code, or opcode for short. "0000" means a move from
data memory to register file, "0001" means a move from register file to data
memory, and "0010" means an add of two registers, based on the instruction set
defined in the bulleted list above. The remaining bits of the instruction
represent operands, which indicate what data to operate on.
Desired program              Instruction memory I
0: RF[0]=D[0]                0: 0000 0000 00000000
1: RF[1]=D[1]                1: 0000 0001 00000001
2: RF[2]=RF[0]+RF[1]         2: 0010 0010 0000 0001
3: D[9]=RF[2]                3: 0001 0010 00001001

Figure 8.7 A program that computes D[9]=D[0]+D[1], using a given instruction
set. The spaces inserted between the instruction memory's bits are for
readability and do not exist in the memory.
0: 0000 0000 00000101   // RF[0] = D[5]
1: 0000 0001 00000110   // RF[1] = D[6]
2: 0000 0010 00000111   // RF[2] = D[7]
3: 0010 0000 0000 0001  // RF[0] = RF[0] + RF[1]
                        // which is D[5]+D[6]
4: 0010 0000 0000 0010  // RF[0] = RF[0] + RF[2]
                        // now D[5]+D[6]+D[7]
5: 0001 0000 00000101   // D[5] = RF[0]

Figure 8.8 A program to compute D[5]=D[5]+D[6]+D[7] using the
three-instruction instruction set.
We could write a different program using the three-instruction instruction
set. For example, we could write a program that computes
D[5] = D[5] + D[6] + D[7]. We must perform that computation using instructions
chosen from the three-instruction instruction set. We might write the program
as shown in Figure 8.8. The number before the colon represents the
instruction's address in the instruction memory I. The text following the two
forward slashes (//) represents comments, and is not part of the instructions.
Note how that program ultimately computes the desired sum. This might be the
first time that you have had to think of computations in terms of low-level
programmable processor instructions. Thinking in terms of such register-level
operations can be difficult at first, but becomes easier as you see and
develop more programs at that level.
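
One way to convince yourself that the program of Figure 8.8 computes the
desired sum is to step through its machine code in software. The Python sketch
below is a minimal interpreter for the three-instruction instruction set; the
function name execute, the sample data values, and the wrapping of sums to
16 bits are assumptions of the sketch. Because the three-instruction set has
no jumps, the instructions are simply executed in address order.

```python
def execute(machine_code, D, num_regs=16):
    """Execute three-instruction machine code, given as bit strings, in order."""
    RF = [0] * num_regs
    for text in machine_code:
        word = int(text.replace(" ", ""), 2)   # drop the readability spaces
        op = (word >> 12) & 0xF
        ra = (word >> 8) & 0xF
        if op == 0b0000:                       # load:  RF[ra] = D[d]
            RF[ra] = D[word & 0xFF]
        elif op == 0b0001:                     # store: D[d] = RF[ra]
            D[word & 0xFF] = RF[ra]
        elif op == 0b0010:                     # add:   RF[ra] = RF[rb] + RF[rc]
            rb, rc = (word >> 4) & 0xF, word & 0xF
            RF[ra] = (RF[rb] + RF[rc]) & 0xFFFF   # 16-bit wrap is an assumption
        else:
            raise ValueError(f"unknown opcode in {text!r}")
    return RF


figure_8_8 = [
    "0000 0000 00000101",   # RF[0] = D[5]
    "0000 0001 00000110",   # RF[1] = D[6]
    "0000 0010 00000111",   # RF[2] = D[7]
    "0010 0000 0000 0001",  # RF[0] = RF[0] + RF[1]
    "0010 0000 0000 0010",  # RF[0] = RF[0] + RF[2]
    "0001 0000 00000101",   # D[5] = RF[0]
]

D = [0] * 256
D[5], D[6], D[7] = 10, 20, 30
RF = execute(figure_8_8, D)
print(D[5])
```

After the run, D[5] holds the sum of the original D[5], D[6], and D[7], while
D[6] and D[7] are unchanged.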
Machine Code versus Assembly Code
As you have seen, the instructions of a program exist in instruction memory as
0s and 1s. A program represented as 0s and 1s is known as machine code.
Writing and reading programs represented as 0s and 1s are tasks that humans
are not particularly good at. We humans can't understand those 0s and 1s
easily, and thus will likely make plenty of mistakes when writing such
programs. Thus, early computer programmers developed a tool, known as an
assembler (which itself is just another program), to help humans write other
programs. An assembler allows us to write instructions using mnemonics, or
symbols, that the assembler automatically translates to machine code. Thus, an
assembler may tell us that we can write instructions from our
three-instruction instruction set using the following mnemonics:
  Load instruction: MOV Ra, d specifies the operation RF[a]=D[d]. a must be
  0, 1, ..., or 15, so R0 means RF[0], R1 means RF[1], etc. d must be
  0, 1, ..., 255.
  Store instruction: MOV d, Ra specifies the operation D[d]=RF[a].
  Add instruction: ADD Ra, Rb, Rc specifies the operation
  RF[a]=RF[b]+RF[c].
COMPUTERS WITH BLINKING LIGHTS.
Big computers shown in the movies often have many rows of small blinking
lights. In the early days of computing, computer programmers programmed using
machine code, and they entered that code into the instruction memory by
flipping switches up and down to represent 0s and 1s. To enable debugging of
the program, as well as to show the computed data, those early computers used
rows of lights: on lights meant 1s, off lights meant 0s. Today, nobody in
their right mind would try writing or debugging a program by using machine
code. So computers today look like big boxes, with no rows of lights. But big
plain boxes don't make for interesting backdrops in movies, so movie makers
continue to use movie props with lots of blinking lights to represent
computers: lights that are useless, but entertaining.
Turning on a personal computer causes the operating system to load, a process
known as "booting" the computer. The computer executes instructions beginning
at address 0, which usually has an instruction that jumps to a built-in small
program that loads the operating system (the small program is often called the
basic input/output system, or BIOS). Most computing dictionaries state that
the term "boot" originates from the popular expression "to pull oneself up by
one's bootstraps," which means to pick yourself up without any help, though
obviously you can't do this by grabbing onto your own bootstraps and pulling;
hence the cleverness of the expression. Since the computer loads its own
operating system, the computer is in a sense picking itself up without any
help. The term bootstrap eventually got shortened to boot. A colleague of mine
who has been around computing a long time claims a different origin. One way
of loading a program into the instruction memory of early computers was to
create a ribbon with rows of holes. Each row might have enough room for, say,
16 holes, thus each row would represent a 16-bit machine instruction: a hole
meant a 0, no hole a 1 (or vice versa). A programmer would punch holes in the
ribbon to store the program on the ribbon (using a special hole-punching
machine), and then feed the ribbon into a computer's ribbon reader, which
would read the rows of 0s and 1s and load those 0s and 1s into the computer's
instruction memory. Those ribbons might have been several feet long, and
looked a lot like the straps of a boot, hence the term bootstrap, shortened to
boot. Whichever is the actual origin, we can be fairly sure the term "boot"
comes from the bootstraps on the boots we wear on our feet.
Using those mnemonics, we could rewrite the program D[9]=D[0]+D[1] as follows:
0: MOV R0, 0
1: MOV R1, 1
2: ADD R2, R0, R1
3: MOV 9, R2
That program is much easier to understand than the 0s and 1s in Figure 8.7. A
program written using mnemonics that will be translated to machine code by an
assembler is known as assembly code. Hardly anybody writes machine code
directly these days. An assembler would automatically translate the above
assembly program to the machine code shown in Figure 8.7.
You might be wondering how the assembler can distinguish between the load and
store instructions above, when the mnemonic for both instructions is the same,
"MOV." The assembler distinguishes those two types of instruction by looking
at the first character after the mnemonic "MOV": if the first character is an
"R," then that operand is a register, and thus that instruction must be a load
instruction. Otherwise, the first operand is a data memory address, and the
instruction must be a store instruction.
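
The translation an assembler performs for this small instruction set can be
sketched in Python. This is a toy illustration, not a description of any real
assembler: the function names assemble, reg, and addr are made up, and the
input is assumed to use exactly the "MOV Ra, d", "MOV d, Ra", and
"ADD Ra, Rb, Rc" forms listed above. As described in the text, the sketch
tells a load from a store by checking whether the first operand begins
with "R".

```python
def reg(token):
    """Parse a register operand such as 'R2' into its number (0..15)."""
    if not token.startswith("R"):
        raise ValueError(f"expected a register, got {token!r}")
    number = int(token[1:])
    if not 0 <= number <= 15:
        raise ValueError(f"register out of range: {token!r}")
    return number


def addr(token):
    """Parse a data memory address operand (0..255)."""
    number = int(token)
    if not 0 <= number <= 255:
        raise ValueError(f"address out of range: {token!r}")
    return number


def assemble(line):
    """Translate one assembly instruction into a 16-bit machine code string."""
    mnemonic, rest = line.split(None, 1)
    operands = [token.strip() for token in rest.split(",")]
    if mnemonic == "MOV":
        first, second = operands
        if first.startswith("R"):    # MOV Ra, d  -> load
            word = (0b0000 << 12) | (reg(first) << 8) | addr(second)
        else:                        # MOV d, Ra  -> store
            word = (0b0001 << 12) | (reg(second) << 8) | addr(first)
    elif mnemonic == "ADD":          # ADD Ra, Rb, Rc
        ra, rb, rc = (reg(token) for token in operands)
        word = (0b0010 << 12) | (ra << 8) | (rb << 4) | rc
    else:
        raise ValueError(f"unknown mnemonic: {mnemonic!r}")
    return f"{word:016b}"


program = ["MOV R0, 0", "MOV R1, 1", "ADD R2, R0, R1", "MOV 9, R2"]
machine_code = [assemble(line) for line in program]
for address, word in enumerate(machine_code):
    print(f"{address}: {word}")
```

Applied to the four-instruction assembly program above, the sketch produces
the same bit patterns as the instruction memory contents of Figure 8.7,
without the readability spaces.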
Control Unit and Datapath for the Three-Instruction Processor
From the definition of the three-instruction instruction set and an
understanding of the basic control unit and datapath architecture of a
programmable processor as shown in Figure 8.4, we can design a complete
digital circuit for a three-instruction programmable processor. The design
process is actually very similar to the RTL design process of Chapter 5.
We begin with a high-level state machine description of the system, shown in
Figure 8.9. Assume that op is shorthand for IR[15..12], meaning the leftmost
four bits of the instruction register. Likewise, assume that ra is shorthand
for IR[11..8], rb is shorthand for IR[7..4], rc is shorthand for IR[3..0], and
d is shorthand for IR[7..0].
Figure 8.9 High-level state machine description of a three-instruction
programmable processor.
Recall that the next step in the RTL design process was to create the
datapath. We already created the datapath in Figure 8.4, which we refine to
show every control signal from the controller, as shown in Figure 8.10. The
refined datapath has control signals for each read and write port of the
register file (see Chapter 4 for information on register files). The register
file has 16 registers because the instructions have only 4 bits with which to
address registers. The datapath has a control signal to the ALU called alu_s0;
we'll assume the simple ALU adds its inputs when alu_s0=1, and just passes
input A when alu_s0=0. The datapath has a select line for the 2x1 mux in front
of the register file's write data port. Finally, we have also included the
control signals for the data memory, which we assume has a single address
port, and thus can support only a read or a write, but not both
simultaneously. The data memory has 256 words, since the instructions have
only 8 bits with which to address the data memory.
The datapath is now able to carry out all of the load/store operations and
arithmetic operations that we need for the high-level state machine in
Figure 8.9. Thus, we can proceed to the third step of the RTL design process
of connecting the datapath with a controller. Figure 8.10 shows those
connections, as well as the connections of the controller to the PC and IR
registers in the control unit, and to the instruction memory I.

Figure 8.10 Refined datapath and control unit for the three-instruction
processor.
The last step of the RTL design process is to derive the controller's FSM. We
can do this straightforwardly by replacing the high-level actions of the state
machine in Figure 8.9 by Boolean operations on the controller's input and
output lines, as shown in Figure 8.11. (Remember that op, d, ra, rb, and rc
are shorthand notations for IR[15..12], IR[7..0], IR[11..8], IR[7..4], and
IR[3..0], respectively.) We could then finish the controller's design by
converting the FSM to a state register and combinational logic, using the
methods from Chapter 3. We would have thus designed a programmable processor.

Figure 8.11 FSM for the processor's controller. The actions of the three
execute states are:
  Load  (RF[ra]=D[d]):         D_addr=d, D_rd=1, RF_s=1, RF_W_addr=ra,
                               RF_W_wr=1
  Store (D[d]=RF[ra]):         D_addr=d, D_wr=1, RF_s=X, RF_Rp_addr=ra,
                               RF_Rp_rd=1
  Add   (RF[ra]=RF[rb]+RF[rc]): RF_Rp_addr=rb, RF_Rp_rd=1, RF_s=0,
                               RF_Rq_addr=rc, RF_Rq_rd=1, RF_W_addr=ra,
                               RF_W_wr=1, alu_s0=1
Let's trace through the controller's FSM behavior to see how a program would
execute on the three-instruction processor. As a reminder, remember that we
follow the FSM conventions that all transitions are implicitly ANDed with a
rising clock edge, and that any control signal not explicitly assigned a value
in a state is implicitly assigned a 0.
  The FSM initially starts in state Init, which sets PC_clr=1, causing the PC
  register to be cleared to 0.
  The FSM on the next clock cycle enters the Fetch state, in which the FSM
  reads the instruction memory at address 0 (because PC is 0) and loads the
  read value into IR; that read value will be the instruction that was stored
  in I[0]. At the same time, the FSM increments the PC's value.
  The FSM on the next clock cycle enters the Decode state, which has no
  actions but which branches on the next clock cycle to one of three states,
  Load, Store, or Add, depending on the value of the highest four bits of the
  IR register (the current instruction's opcode).
  In the Load state, the FSM sets the data memory address lines to the low
  eight bits of the IR and sets the data memory read enable to 1, sets the
  2x1 mux's select line to pass the data memory output to the register file,
  and sets the register file write address to IR[11..8] and the write enable
  to 1, causing whatever gets read from the data memory to be loaded into the
  appropriate register in the register file.
  Likewise, the Store and Add states set the control lines as needed for the
  store and add operations.
  Finally, the FSM returns to the Fetch state, and begins fetching the next
  instruction.
Notice that because the Store state does not write to the register file, the
value of the register file mux select line doesn't matter, so we've assigned
signal RF_s=X in that state, meaning the signal's value does not matter. Using
such don't care values (see Section 6.2) can help us to minimize logic in the
controller.
You may wonder why the Decode state is necessary when that state contains no
actions. Could we not have just had Decode's transitions originate instead
from state Fetch? Recall from Section 5.3 that register updates listed in a
state do not actually occur until the next clock edge, meaning that
transitions originating from a state use the previous register values. Thus,
we could not have originated Decode's transitions from the Fetch state,
because those transitions would have been using the old opcode in the
instruction register IR, not the new value read during the Fetch state.
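
The controller's behavior can be traced in software by treating each loop
iteration as one clock cycle in which the current state's actions occur and
the next state is chosen. The Python sketch below follows the state names used
in the text (Init, Fetch, Decode, Load, Store, Add); everything else is an
assumption of the sketch, including the function name simulate, the 16-bit
wrap of sums, and stopping after a fixed number of instructions (the FSM
described here simply keeps fetching).

```python
def simulate(I, D, num_instructions):
    """Return the sequence of controller states visited, one per clock cycle."""
    RF = [0] * 16
    PC = 0
    IR = 0
    state = "Init"
    trace = []
    executed = 0
    while executed < num_instructions:
        trace.append(state)
        if state == "Init":
            PC = 0                      # PC_clr=1
            state = "Fetch"
        elif state == "Fetch":
            IR = I[PC]                  # load IR from instruction memory
            PC += 1                     # increment PC for the next fetch
            state = "Decode"
        elif state == "Decode":
            op = (IR >> 12) & 0xF       # IR now holds the fetched opcode
            state = {0b0000: "Load", 0b0001: "Store", 0b0010: "Add"}[op]
        else:
            ra = (IR >> 8) & 0xF
            rb = (IR >> 4) & 0xF
            rc = IR & 0xF
            d = IR & 0xFF
            if state == "Load":
                RF[ra] = D[d]
            elif state == "Store":
                D[d] = RF[ra]
            else:                       # Add
                RF[ra] = (RF[rb] + RF[rc]) & 0xFFFF
            executed += 1
            state = "Fetch"
    return trace


# Machine code for D[9]=D[0]+D[1], written in hexadecimal for brevity.
I = [0x0000, 0x0101, 0x2201, 0x1209]
D = [0] * 256
D[0], D[1] = 99, 102
trace = simulate(I, D, num_instructions=4)
print(trace)
print(D[9])
```

The trace begins Init, Fetch, Decode, Load, and contains one Init cycle
followed by three cycles per instruction. Branching on the opcode happens in
Decode, one cycle after Fetch wrote IR, which is the software analogue of the
reason the Decode state is needed.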
8.4 A SIX-INSTRUCTION PROGRAMMABLE PROCESSOR
Extending the Instruction Set
Clearly, having only a three-instruction instruction set limits the behavior
of the programs that we can write. All we can do with those instructions is
add numbers. A real programmable processor will support many more
instructions, perhaps 100 or more, so that a wider variety of programs can be
written.
Let's extend our programmable processor's instruction set with a few more
instructions, in order to give you a slightly better idea of how a
programmable processor with a full instruction set would look.
We'll begin by introducing an instruction able to load a constant value into a
register file register. For example, suppose we wanted to compute
RF[0] = RF[1] + 5. The 5 is a constant. A constant is a value that is part of
our program, not something to be found in data memory. We need an instruction
that allows us to load a constant into a register, after which we could add
that register to RF[1] using the ADD instruction. Thus, we introduce a new
instruction with the following machine and assembly code representations:
Load-constant instruction: 0011 r3r2r1r0 c7c6c5c4c3c2c1c0 specifies that the
binary number represented by the bits c7c6c5c4c3c2c1c0 should be loaded into
the register specified by r3r2r1r0. The binary number being loaded is known as
a constant. The mnemonic for this instruction is:
  MOV Ra, #c: specifies the operation RF[a]=c
a can be 0, 1, ..., or 15. Assuming two's complement representation (see
Section 4.8), c can be -128, -127, ..., 0, ..., 126, 127. The "#" enables the
assembler to distinguish this instruction from a regular load instruction.
We continue by introducing an instruction for performing subtraction of two
registers, similar to addition of two registers, having the following machine
and assembly code representations:
Subtract instruction: 0100 ra3ra2ra1ra0 rb3rb2rb1rb0 rc3rc2rc1rc0 specifies
subtraction of two register file registers specified by rb3rb2rb1rb0 and
rc3rc2rc1rc0, with the result stored in the register file register specified
by ra3ra2ra1ra0. For example, "0100 0010 0000 0001" specifies the instruction
RF[2]=RF[0]-RF[1]. The mnemonic for this instruction is:
  SUB Ra, Rb, Rc: specifies the operation RF[a]=RF[b]-RF[c]
Let's also introduce an instruction that allows us to jump to other parts of a
program:
Jump-if-zero instruction: 0101 ra3ra2ra1ra0 o7o6o5o4o3o2o1o0 specifies that if
the contents of the register specified by ra3ra2ra1ra0 is 0, we should load
the PC with the current value of PC plus o7o6o5o4o3o2o1o0, which is an 8-bit
number in two's complement form representing a positive or negative offset
amount. The mnemonic is:
  JMPZ Ra, offset: specifies the operation PC = PC + offset if RF[a] is 0.
By using two's complement for the jump offset, which allows representation of
positive or negative numbers, the program can jump backwards in the program,
thus implementing a loop. With an 8-bit offset, the instruction can specify a
jump forward by 127 addresses, or backward by 128 addresses (-128 to +127).
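
The 8-bit two's complement field used by both the load-constant and
jump-if-zero instructions can be illustrated with a short Python sketch. The
helper names to_field, from_field, and encode_jmpz are made up for this
example; the opcode 0101 and the field positions follow the definitions above.

```python
def to_field(value):
    """Encode an integer in -128..127 as an 8-bit two's complement field."""
    if not -128 <= value <= 127:
        raise ValueError(f"{value} does not fit in 8 bits")
    return value & 0xFF


def from_field(field):
    """Decode an 8-bit two's complement field back to a signed integer."""
    return field - 256 if field & 0x80 else field


def encode_jmpz(ra, offset):
    """0101 ra3..ra0 o7..o0  ->  JMPZ Ra, offset."""
    return (0b0101 << 12) | (ra << 8) | to_field(offset)


for value in (127, 1, 0, -1, -128):
    field = to_field(value)
    print(f"{value:5d} -> {field:08b} -> {from_field(field)}")

print(f"{encode_jmpz(1, -2):016b}")   # JMPZ R1, -2 (a backward jump)
```

A field whose leftmost bit is 1 decodes to a negative number, which is how a
JMPZ instruction can name an earlier address and thereby form a loop.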
Table 8.1 summarizes the six-instruction instruction set. A programmable processor
typically comes with a databook that lists the processor's instructions and the meaning
of each instruction, using a format similar to the format of Table 8.1. Typical
programmable processors have dozens, even hundreds, of instructions.
TABLE 8.1 Six-instruction instruction set.
Instruction        Meaning
MOV Ra, d          RF[a] = D[d]
MOV d, Ra          D[d] = RF[a]
ADD Ra, Rb, Rc     RF[a] = RF[b] + RF[c]
MOV Ra, #C         RF[a] = C
SUB Ra, Rb, Rc     RF[a] = RF[b] - RF[c]
JMPZ Ra, offset    PC = PC + offset if RF[a] = 0
Extending the Control Unit and Datapath
The three new instructions require some extensions to our control unit and datapath
of Figure 8.10, with those extensions shown in Figure 8.12. First, the load-constant
instruction requires that the register file be able to load data from IR[7..0], in addition to
data from data memory or the ALU output. Thus, we widen the register file's multiplexer
from 2x1 to 3x1, add another mux control signal, and also create a new signal coming
from the controller labeled RF_W_data, which will connect with IR[7..0]; these changes
are highlighted by the dashed circle labeled "1" in Figure 8.12. Second, the subtract
instruction requires that we use an ALU capable of subtraction, so we add another ALU
control signal, highlighted by the dashed circle labeled "2" in the figure. Third, the
jump-if-zero instruction requires that we be able to detect if a register is zero, and that we
be able to add IR[7..0] to the PC. Thus, we insert a datapath component to detect if the
register file's Rp read port is all zeros (that component would just be a NOR gate), labeled
as dashed circle "3a" in the figure. We also upgrade the PC register so it can be loaded
with PC plus IR[7..0], labeled as "3b" in the figure. The adder used for this also subtracts
1 from the sum, to compensate for the fact that the Fetch state already added 1 to the PC.
s1 s0   ALU operation
0  0    pass A through
0  1    A+B
1  0    A-B
Figure 8.12 Control unit and datapath for the six-instruction processor.
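The three datapath extensions can be modeled behaviorally as follows. This is a minimal
sketch rather than the book's design: the Python function names are hypothetical, the
mux select encoding for ALU output and constant is taken from the state actions of
Figure 8.13, and the encoding for the data memory input is an assumption.

```python
MASK16 = 0xFFFF  # the datapath is 16 bits wide


def alu(s1, s0, a, b):
    """ALU with two control lines, following the operation table of Figure 8.12."""
    if (s1, s0) == (0, 0):
        return a & MASK16          # pass A through
    if (s1, s0) == (0, 1):
        return (a + b) & MASK16    # A+B
    if (s1, s0) == (1, 0):
        return (a - b) & MASK16    # A-B
    raise ValueError("unused ALU select value")


def rf_write_mux(s1, s0, alu_out, mem_data, constant):
    """3x1 mux in front of the register file's write port (extension 1)."""
    if (s1, s0) == (0, 0):
        return alu_out
    if (s1, s0) == (0, 1):
        return mem_data            # assumed encoding for the data memory input
    if (s1, s0) == (1, 0):
        return constant            # RF_W_data, which carries IR[7..0]
    raise ValueError("unused mux select value")


def rp_zero(rp_data):
    """Zero detector on read port Rp: 1 only when all 16 bits are 0 (extension 3a)."""
    return 1 if (rp_data & MASK16) == 0 else 0


def pc_jump_target(pc_after_fetch, offset):
    """PC adder (extension 3b): subtract 1 because Fetch already incremented the PC."""
    return pc_after_fetch + offset - 1


# A JMPZ stored at address 3 with offset 2: after Fetch the PC holds 4, and the
# adder produces 4 + 2 - 1 = 5.
print(pc_jump_target(4, 2))
```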
We also need to extend the FSM for the controller within the control unit to handle the
three additional instructions. Figure 8.13 shows the extended FSM. The Init and Fetch states
stay the same. We added three new transitions from the Decode state for the three new
instruction opcodes. We made a minor revision to the Load, Store, and Add states' actions
(the new actions are italicized) since the register file now has a mux with two select lines
instead of just one. Likewise, we revised the Add state actions to configure the ALU with
two control lines instead of one. We added four new states, Load-constant, Subtract,
Jump-if-zero, and Jump-if-zero-jmp, for the three new instructions. The new instruction
states perform the following functions on the datapath:
In the Load-constant state, we configure the register file mux to pass the
RF_W_data signal, and we configure the register file to write to the address
specified by ra (which is IR[11..8]).
In the Subtract state, we perform the same actions as in the Add state, except that
we configure the ALU for subtraction instead of addition.
In the Jump-if-zero state, we configure the register file to read the register
specified by ra onto read port Rp. If the value of the read register Rp is all 0s,
RF_Rp_zero will become 1 (and 0 otherwise). Thus, we include two transitions
from the Jump-if-zero state. One transition will be taken if RF_Rp_zero is
0, meaning the read register was not all 0s; that transition takes the FSM back
to the Fetch state, meaning no actual jump occurs. The other transition will be
taken if RF_Rp_zero is 1, meaning the read register was all 0s. That transition
goes to another state, Jump-if-zero-jmp, which should actually carry out the
jump. That state carries out the jump simply by setting the load line of the PC.
Figure 8.13 FSM for the controller of the six-instruction processor. State actions:
Store: D_addr=d, D_wr=1, RF_s1=X, RF_s0=X, RF_Rp_addr=ra, RF_Rp_rd=1
Add: RF_Rp_addr=rb, RF_Rp_rd=1, RF_s1=0, RF_s0=0, RF_Rq_addr=rc, RF_Rq_rd=1,
     RF_W_addr=ra, RF_W_wr=1, alu_s1=0, alu_s0=1
Load-constant: RF_s1=1, RF_s0=0, RF_W_addr=ra, RF_W_wr=1
Subtract: RF_Rp_addr=rb, RF_Rp_rd=1, RF_s1=0, RF_s0=0, RF_Rq_addr=rc, RF_Rq_rd=1,
     RF_W_addr=ra, RF_W_wr=1, alu_s1=1, alu_s0=0
Jump-if-zero: RF_Rp_addr=ra, RF_Rp_rd=1
Notice that with the addition of a jump-if-zero instruction, the processor may take up
to four cycles to complete an instruction. Namely, when the ra register of a jump-if-zero
instruction is all 0s, then an extra state is needed to load the PC with the address of the
instruction to which to jump.
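The cycle count noted above can be checked with a small sketch of the controller's state
sequence. The function name states_visited and the dictionary EXEC_STATE are hypothetical
names chosen for illustration, and the sketch assumes one clock cycle per state and
ignores the Init state.

```python
# Execute state entered from Decode for each opcode of the six-instruction set.
EXEC_STATE = {
    0b0000: "Load",
    0b0001: "Store",
    0b0010: "Add",
    0b0011: "Load-constant",
    0b0100: "Subtract",
    0b0101: "Jump-if-zero",
}


def states_visited(opcode, rf_rp_zero=0):
    """Return the controller states one instruction passes through, in order."""
    states = ["Fetch", "Decode", EXEC_STATE[opcode]]
    # Only a jump-if-zero whose register is all 0s needs the extra state
    # that loads the PC with the jump target.
    if opcode == 0b0101 and rf_rp_zero == 1:
        states.append("Jump-if-zero-jmp")
    return states


for opcode, zero in [(0b0100, 0), (0b0101, 0), (0b0101, 1)]:
    path = states_visited(opcode, zero)
    print(len(path), "cycles:", " -> ".join(path))
```

Every instruction takes three cycles except a taken jump, which takes four.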
8.5 EXAMPLE ASSEMBLY AND MACHINE PROGRAMS
Using the six-instruction instruction set of the previous section, we now provide an
example of assembly language programming of the six-instruction processor to perform a
particular task, and we show how the assembly code would be converted to machine code by
an assembler. An assembler would make use of the table shown in Table 8.2, which maps
instructions to opcodes.
TABLE 8.2 Instruction opcodes.
Instruction        Opcode
MOV Ra, d          0000
MOV d, Ra          0001
ADD Ra, Rb, Rc     0010
MOV Ra, #C         0011
SUB Ra, Rb, Rc     0100
JMPZ Ra, offset    0101
EXAMPLE 8.4 Assembly and machine programs for a simple program
Write a program that counts the number of words that are not equal to 0 in data memory
locations 4 and 5, and that stores the result in data memory location 9. Thus, the possible
results that would be in location 9 are zero, one, or two.
Using the instruction set of Table 8.2, we can write an assembly program as shown in Figure
8.14(a). The program maintains the count in register R0, which the program initializes to 0. The
program may need to add 1 to this register later, so the program loads the value 1 into register R1.
The program next loads data memory location 4 into register R2. The program then jumps to the
instruction labeled "lab1" if the value of R2 is zero. If R2 is not zero, the program will execute
an add instruction that adds one to register R0, and will then proceed to the instruction labeled
"lab1" since that instruction is the next instruction. The instruction labeled "lab1" loads data
memory location 5 into register R2. The program jumps to the instruction labeled "lab2" if R2 is
zero. If R2 is not zero, the program executes an add instruction that adds one to register R0, and
then proceeds to the next instruction, which is the instruction labeled "lab2." That instruction
stores the contents of register R0 to data memory location 9.
(a)                                                                  (b)
      MOV R0, #0;      // initialize result to 0                     0011 0000 00000000
      MOV R1, #1;      // constant 1 for incrementing result         0011 0001 00000001
      MOV R2, 4;       // get data memory location 4                 0000 0010 00000100
      JMPZ R2, lab1;   // if zero, skip next instruction             0101 0010 00000010
      ADD R0, R0, R1;  // not zero, so increment result              0010 0000 0000 0001
lab1: MOV R2, 5;       // get data memory location 5                 0000 0010 00000101
      JMPZ R2, lab2;   // if zero, skip next instruction             0101 0010 00000010
      ADD R0, R0, R1;  // not zero, so increment result              0010 0000 0000 0001
lab2: MOV 9, R0;       // store result in data memory location 9     0001 0000 00001001
Figure 8.14 A program to count the number of nonzero numbers in D[4] and D[5], storing the
result in D[9]: (a) assembly code, and (b) corresponding machine code generated by an assembler.
The spaces in the machine code's 16-bit instructions are there for your convenience as you read
this book; actual machine code has no such spaces.
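To see the machine code of Figure 8.14(b) at work, the following sketch interprets it
instruction by instruction. It is an instruction-level model only, under stated
assumptions: the name run_program is hypothetical, the constant of a load-constant
instruction is zero-extended, and execution simply stops when the PC runs past the last
instruction (the six-instruction processor itself has no halt instruction).

```python
PROGRAM = [
    0b0011_0000_00000000,  # MOV R0, #0
    0b0011_0001_00000001,  # MOV R1, #1
    0b0000_0010_00000100,  # MOV R2, 4
    0b0101_0010_00000010,  # JMPZ R2, lab1
    0b0010_0000_0000_0001,  # ADD R0, R0, R1
    0b0000_0010_00000101,  # lab1: MOV R2, 5
    0b0101_0010_00000010,  # JMPZ R2, lab2
    0b0010_0000_0000_0001,  # ADD R0, R0, R1
    0b0001_0000_00001001,  # lab2: MOV 9, R0
]


def run_program(program, data, max_steps=1000):
    """Execute six-instruction machine code; data is a list modeling data memory D."""
    rf = [0] * 16  # 16x16 register file
    pc = 0
    for _ in range(max_steps):
        if pc >= len(program):
            return data  # ran past the last instruction (assumed stopping rule)
        word = program[pc]
        opcode, ra = word >> 12, (word >> 8) & 0xF
        rb, rc, low8 = (word >> 4) & 0xF, word & 0xF, word & 0xFF
        next_pc = pc + 1
        if opcode == 0b0000:
            rf[ra] = data[low8]                      # RF[a] = D[d]
        elif opcode == 0b0001:
            data[low8] = rf[ra]                      # D[d] = RF[a]
        elif opcode == 0b0010:
            rf[ra] = (rf[rb] + rf[rc]) & 0xFFFF      # RF[a] = RF[b] + RF[c]
        elif opcode == 0b0011:
            rf[ra] = low8                            # RF[a] = C (zero-extended)
        elif opcode == 0b0100:
            rf[ra] = (rf[rb] - rf[rc]) & 0xFFFF      # RF[a] = RF[b] - RF[c]
        elif opcode == 0b0101:
            if rf[ra] == 0:                          # PC = PC + offset if RF[a] is 0
                offset = low8 - 256 if low8 & 0x80 else low8
                next_pc = pc + offset
        else:
            raise ValueError("unknown opcode")
        pc = next_pc
    raise RuntimeError("step limit reached")


memory = [0] * 256
memory[4], memory[5] = 7, 0
print(run_program(PROGRAM, memory)[9])
```

With D[4] nonzero and D[5] zero, the first jump is not taken and the second is, so the
count stored in D[9] is one.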
In writing the assembly program, we arbitrarily chose the registers that we used to hold the
result, the constant 1, and the data memory location copy. We could have used any registers for
those purposes. For example, we could have used register R7 to hold the result, meaning all
occurrences of R0 in the code would instead have been R7. Furthermore, in writing the assembly
program, we arbitrarily chose the labels "lab1" and "lab2." We could have picked other names for
those labels, such as "skip1" and "done," or "Fred" and "George." It's best, though, to use
descriptive labels that help people reading the assembly code to understand the program.
An assembler would automatically convert the assembly code to the machine code shown in
Figure 8.14(b). For each instruction, the assembler determines the specific instruction
type by looking at the mnemonic, as well as the operands if necessary, and then outputs the
appropriate opcode bits (four bits) for that instruction type, as defined in Table 8.2. For
example, the assembler would look at the first instruction "MOV R0, #0" and thus know from the
first three letters "MOV" that this is one of the data movement instructions; the assembler would
look at the operands, and seeing "R0" would know this is either a regular load or a load-constant
instruction; finally, the assembler would see the "#" and conclude this is a load-constant
instruction, thus outputting the opcode "0011" for a load-constant instruction, as shown in the
first machine instruction of the figure.
The assembler converts the operands to bits also, converting "R0" of the first instruction to
"0000" and "#0" to "00000000," as shown in the first machine instruction of the figure.
The JMPZ instruction requires some extra handling. The assembler recognizes this as a
jump-if-zero instruction, and thus outputs the opcode "0101." The assembler converts the first
operand, "R2," to "0010." The assembler then reaches the second operand, "lab1," and does not know
what to output, since the assembler doesn't yet know the address of the instruction labeled
"lab1," as the assembler hasn't reached that instruction yet in the program. To solve this problem,
many assemblers actually make two passes over the assembly code: during the first pass, the
assembler creates a table of all labels and their addresses, and then on the second pass the
assembler outputs machine code. Such an assembler would therefore know during the second pass that
the instruction labeled "lab1" is at an address two addresses beyond the first JMPZ instruction; in
particular, the lab1 instruction is at address 5, while the JMPZ instruction is at address 3
(assuming that the first instruction is at address 0, not 1). Thus, the assembler would output an
offset of 2 to jump forward 2 addresses. Notice that the labels "lab1" and "lab2" do not appear in
the machine code; they are merely a convenience construct that the assembler provides for the
programmer.
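The two-pass scheme described above can be sketched as a tiny assembler for the
six-instruction set. This is an illustrative sketch, not a real tool: the names assemble,
encode, and reg are hypothetical, and the accepted syntax is just what Figure 8.14(a)
uses (one instruction per line, optional "label:" prefix, "//" comments).

```python
def reg(token):
    """Convert a register name such as 'R2' to its number."""
    return int(token[1:])


def encode(mnemonic, ops, address, labels):
    """Second-pass helper: produce the 16-bit word for one instruction."""
    if mnemonic == "MOV":
        if ops[1].startswith("#"):  # MOV Ra, #C  (load constant)
            return (0b0011 << 12) | (reg(ops[0]) << 8) | (int(ops[1][1:]) & 0xFF)
        if ops[0].startswith("R"):  # MOV Ra, d   (load)
            return (0b0000 << 12) | (reg(ops[0]) << 8) | int(ops[1])
        return (0b0001 << 12) | (reg(ops[1]) << 8) | int(ops[0])  # MOV d, Ra (store)
    if mnemonic in ("ADD", "SUB"):
        opcode = 0b0010 if mnemonic == "ADD" else 0b0100
        return (opcode << 12) | (reg(ops[0]) << 8) | (reg(ops[1]) << 4) | reg(ops[2])
    if mnemonic == "JMPZ":
        offset = labels[ops[1]] - address  # label address relative to this JMPZ
        if not -128 <= offset <= 127:
            raise ValueError("jump offset does not fit in 8 bits")
        return (0b0101 << 12) | (reg(ops[0]) << 8) | (offset & 0xFF)
    raise ValueError("unknown mnemonic: " + mnemonic)


def assemble(lines):
    # First pass: strip comments and record the address of every label.
    instructions, labels = [], {}
    for line in lines:
        code = line.split("//")[0].strip().rstrip(";").strip()
        if ":" in code:
            label, code = code.split(":", 1)
            labels[label.strip()] = len(instructions)
            code = code.strip()
        if code:
            instructions.append(code)
    # Second pass: output machine code, now that all label addresses are known.
    words = []
    for address, code in enumerate(instructions):
        mnemonic, rest = code.split(None, 1)
        ops = [token.strip() for token in rest.split(",")]
        words.append(encode(mnemonic, ops, address, labels))
    return words


SOURCE = [
    "MOV R0, #0;      // initialize result to 0",
    "MOV R1, #1;      // constant 1 for incrementing result",
    "MOV R2, 4;       // get data memory location 4",
    "JMPZ R2, lab1;   // if zero, skip next instruction",
    "ADD R0, R0, R1;  // not zero, so increment result",
    "lab1: MOV R2, 5; // get data memory location 5",
    "JMPZ R2, lab2;   // if zero, skip next instruction",
    "ADD R0, R0, R1;  // not zero, so increment result",
    "lab2: MOV 9, R0; // store result in data memory location 9",
]

for word in assemble(SOURCE):
    print(format(word, "016b"))
```

Printing each word as 16 binary digits allows a direct comparison with Figure 8.14(b),
which shows the same bits with spaces inserted for readability.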
8.6 FURTHER EXTENSIONS TO THE PROGRAMMABLE PROCESSOR
Instruction Set Extensions
Extending the instruction set with further instructions would require similar types of
extensions and modifications to the control unit, datapath, and FSM. A programmable
processor might contain dozens more data movement instructions, which move data
between data memory and the register file, or between registers. For example, a processor
might have instructions for copying the contents of one register to another (e.g., MOV
R0, R1, which would copy R1's contents into R0), and would carry out that instruction
using a state that reads the source register, passes the read value through the ALU
unchanged, and writes the ALU output to the destination register. As another example, a
processor might have instructions that would use the contents of a register as the address
from which to read data memory, known as indirect addressing.
A programmable processor would also contain dozens of arithmetic/logic instructions,
which perform arithmetic and logic operations on registers in the register file. For
example, a processor might include not just add and subtract instructions, but also
increment, complement, decrement, AND, OR, XOR, shift left, shift right, and other
instructions that could be carried out by an ALU.
A programmable processor would furthermore contain several flow-of-control instructions,
which determine the next value of the PC. For example, a processor might include
not just a jump-if-zero instruction, but also an unconditional jump, an
indirect jump, and perhaps even jump-if-negative and similar such instructions. Furthermore,
a processor may include instructions that can jump farther than just a small offset
from the current PC, and perhaps even to an absolute address rather than an offset address.
Input/Output Extensions
Section 1.3 introduced a basic microprocessor having eight inputs I0, I1, ..., I7, and eight
outputs P0, P1, ..., P7. We can extend the basic programmable processor of Figure 8.12 to
implement such external inputs and outputs. One method for such an extension would utilize a
specially designed data memory. In that data memory, we might replace the last 16 words of
the memory by direct connections to the input and output pins, as illustrated in Figure 8.15.
The data memory stores locations 0 through 239 in a normal RAM. Location 240, however, is
actually a special word whose high 15 bits are all 0s, and whose lowest bit comes from a
flip-flop loaded every cycle with the value on external input pin I0. Thus, reading location 240
will result in either 00...01 (integer 1), or 00...00 (integer 0), depending on the value
appearing at I0. Likewise, location 241 is connected to pin I1, location 242 to I2, and so on,
with location 247 connected to I7. Locations 248 through 255 are connected to pins P0 through P7,
except the pins are connected to those locations' flip-flop outputs rather than inputs. For
example, writing to location 255 writes the flip-flop with either 0 or 1 (only the low-order bit
matters during the write), and that flip-flop drives external output pin P7.
Figure 8.15 Connecting to external pins. In the 256x16 data memory D, locations 0 through 239 are
RAM, locations 240 through 247 read input pins I0 through I7, and locations 248 through 255 drive
output pins P0 through P7.
Thus, an assembly-language programmer can read or write a microprocessor's
external data pins simply by reading or writing particular data memory locations.
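The memory map just described can be modeled with a small class. This is a behavioral
sketch under stated assumptions: the class name MappedDataMemory and its attributes are
hypothetical, writes to the input locations are simply ignored, and reading an output
location returns the value held in its flip-flop (the text does not specify either case).

```python
class MappedDataMemory:
    """256-word data memory whose last 16 words connect to external pins."""

    def __init__(self):
        self.ram = [0] * 240          # locations 0..239: normal RAM
        self.input_pins = [0] * 8     # I0..I7, seen at locations 240..247
        self.output_pins = [0] * 8    # P0..P7, driven by locations 248..255

    def read(self, addr):
        if 0 <= addr <= 239:
            return self.ram[addr]
        if 240 <= addr <= 247:
            # High 15 bits are all 0s; the lowest bit is the pin value.
            return self.input_pins[addr - 240] & 1
        if 248 <= addr <= 255:
            return self.output_pins[addr - 248]   # assumed read-back behavior
        raise IndexError("address outside the 256-word memory")

    def write(self, addr, value):
        if 0 <= addr <= 239:
            self.ram[addr] = value & 0xFFFF
        elif 240 <= addr <= 247:
            pass                                   # assumed: input locations ignore writes
        elif 248 <= addr <= 255:
            # Only the low-order bit matters during the write.
            self.output_pins[addr - 248] = value & 1
        else:
            raise IndexError("address outside the 256-word memory")


mem = MappedDataMemory()
mem.input_pins[0] = 1          # external circuitry drives pin I0 high
print(mem.read(240))           # the program sees integer 1 at location 240
mem.write(255, 1)              # the program writes location 255
print(mem.output_pins[7])      # pin P7 now carries the written bit
```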
EXAMPLE 8.5 Motion-in-the-dark detector in assembly language
Section 1.3 included an example, illustrated in Figure 1.13, that utilized a microprocessor to
implement a motion-in-the-dark detector. That section utilized C code to compute the expression
P0 = I0 && !I1. In this example, we show the underlying assembly code that would implement that
C expression. Assuming that the microprocessor's external pins I0..I7 and P0..P7 are mapped to
data memory locations as in Figure 8.15, we can program the expression in assembly as follows:
MOV R0, 240      // move D[240], which is the value at pin I0, into R0
MOV R1, 241      // move D[241], which is the value at pin I1, into R1
NOT R1, R1       // compute !I1, assuming existence of a complement instruction
AND R0, R0, R1   // compute I0 && !I1, assuming an AND instruction
MOV 248, R0      // move result to D[248], which is pin P0
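The five instructions can be traced on 16-bit words to confirm that a bitwise complement
and a bitwise AND yield the logical expression in the low-order bit. The sketch below is
only a hand trace written in Python; detector_output is a hypothetical name, and NOT and
AND are assumed to operate bitwise on all 16 bits.

```python
def detector_output(i0, i1):
    """Trace the assembly of Example 8.5 for pin values i0 and i1 (each 0 or 1)."""
    r0 = i0 & 1            # MOV R0, 240: high 15 bits are 0, low bit is I0
    r1 = i1 & 1            # MOV R1, 241: high 15 bits are 0, low bit is I1
    r1 = ~r1 & 0xFFFF      # NOT R1, R1: complements all 16 bits
    r0 = r0 & r1           # AND R0, R0, R1: R0's zero high bits mask the rest
    return r0 & 1          # MOV 248, R0: only the low-order bit reaches pin P0


for i0 in (0, 1):
    for i1 in (0, 1):
        print(i0, i1, detector_output(i0, i1))
```

Because the high 15 bits of R0 are always 0, the 1s that the complement places in the
high bits of R1 never survive the AND.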
Performance Extensions
One difference between real processors and the basic processor architecture in this
chapter is that many real processors are pipelined (see Section 6.5 for an introduction to
pipelining). The basic three-instruction architecture utilized a controller with three
stages: fetch, decode, and execute. By inserting appropriate pipeline registers throughout
the design and modifying the controller appropriately, we could pipeline the fetch,
decode, and execute stages. In other words, as the control unit decodes instruction 1, the
control unit could be simultaneously fetching instruction 2. Next, as the control unit
executes instruction 1, the control unit could be decoding instruction 2, and fetching
instruction 3. Thus, rather than processing one instruction every 3 cycles, the control unit
could be processing one instruction every cycle. Each instruction still takes 3 cycles to
process (3 cycle latency), but the pipelining results in single cycle throughput. The net
result would be that programs would execute up to three times faster.
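The latency and throughput figures can be checked with a short calculation. This is an
idealized sketch, assuming a pipeline that never stalls (no jumps, no hazards) and that
every instruction uses exactly one cycle per stage; the function names are hypothetical.

```python
def unpipelined_cycles(n_instructions, stages=3):
    """Each instruction finishes all stages before the next one starts."""
    return n_instructions * stages


def pipelined_cycles(n_instructions, stages=3):
    """The first instruction needs `stages` cycles; each later one finishes a cycle after."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)


def speedup(n_instructions, stages=3):
    return unpipelined_cycles(n_instructions, stages) / pipelined_cycles(n_instructions, stages)


for n in (1, 10, 1000):
    print(n, unpipelined_cycles(n), pipelined_cycles(n), round(speedup(n), 3))
```

For a single instruction there is no gain, since the latency is still 3 cycles; the
speedup approaches 3 only as the number of instructions grows.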
Another extension involves creating deeper pipelines. Thus, rather than just three
stages (fetch, decode, execute), we might break the stages down to stages of even finer
granularity (e.g., fetch, decode, read operands, execute, store results). Creating
finer-grained stages may shorten the longest register-to-register delay, which enables a faster
clock frequency. The net result would again be faster program execution.
Another extension involves having multiple ALUs in the datapath. The control unit
may then perform multiple ALU operations simultaneously in the datapath. One form of
this extension involves a processor whose instruction set uses instructions with multiple
opcodes and associated operands in a single instruction, known as a Very Long Instruction
Word (VLIW) processor. Another form uses a processor with a control unit that reads
in multiple instructions simultaneously and then assigns those instructions to execute
simultaneously on available ALUs, known as a superscalar processor. A high-end
desktop processor may support perhaps 5 simultaneous instructions, with 10
stages of pipelining. Thus, at any moment, such a processor may be in the middle of
processing 5*10 = 50 different instructions. Needless to say, modern processor architectures
can become quite complex.
This chapter described the basic idea of how a programmable processor's design
works and how the design could be extended to support a fuller instruction set. We leave
the role of describing a complete processor, as well as modern processor design techniques
for improved performance (such as pipelining, caching, etc.), to textbooks on
computer architecture.
8.7 CHAPTER SUMMARY
In this chapter, we stated (Section 8.1) that programmable processors are widely used for
implementing a system's desired functionality, due in part to their easy availability and
short design time (namely, writing software). We provided (Section 8.2) the basic
architecture of a programmable processor, consisting of a general-purpose datapath having a
register file and ALU; a control unit having a controller, PC, and IR; and memories for
storing the program and the data. The control unit would fetch the instruction from
program memory, decode the instruction, and then execute the instruction by configuring
the datapath to carry out the instruction's specified operation. We then designed (Section
8.3) a simple three-instruction programmable processor, and showed how a program
would be represented as 0s and 1s (machine code) in the processor's program memory.
We went further to design (Section 8.4) a six-instruction processor, and discussed how
further extensions could be made to add more instructions and hence achieve a more
reasonable processor architecture. We provided (Section 8.5) an example of assembly and
machine code for the six-instruction processor. We discussed a few extensions to the
programmable processor architecture (Section 8.6).
Programmable processors are typically produced in huge quantities (numbering in
the tens of millions, or even billions), and so tremendous attention is given to their
design. Readers should realize that the programmable processor designs in this chapter
are extremely simplistic and used for illustration purposes only. Yet, seeing even the
simplistic designs, you hopefully now have an understanding of the principle of how a
programmable processor works. Modern commercial processors are based on the same
principles: instructions are stored as machine code in program memory, control units
fetch, decode, and execute the instructions, and datapaths support the operations of the
instructions using register files and ALUs. Modern processors just do a much better job,
using concurrency, pipelining, and many other techniques to obtain high clock frequencies
and fast program execution.
8.8 EXERCISES
SECTION 8.2: BASIC ARCHITECTURE
8.1 If a processor's program counter is 20 bits wide, up to how many words can the processor's
instruction memory hold (ignoring any special tricks to expand the instruction memory size)?
8.2 Which of the following are legal single-cycle datapath operations for the datapath in Figure
8.2? Explain your answer.
(a) Copy data from a memory location into another memory location.
(b) Copy two register locations into two memory locations.
(c) Add data from a register file location and a memory location, storing the result in a
memory location.
8.3 Which of the following are legal single-cycle datapath operations for the datapath in Figure
8.2? Explain your answer.
(a) Copy data from a register file location into a memory location.
(b) Subtract data from two memory locations and store the result in another memory location.
(c) Add data from a register file location and a memory location, storing the result in the same
memory location.
8.4 Assume we are using a dual-port memory from which we can read two locations
simultaneously. Modify the datapath of the programmable processor of Figure 8.2 to support an
instruction that performs an ALU operation on any two memory locations and stores the
result in a register file location. Trace through the execution of this operation, as illustrated
in Figure 8.3.
8.5 Determine the operations required to instruct the datapath of Figure 8.2 to perform the
operation: D[8] = (D[4] + D[5]) - D[7], where D represents the data memory.
SECTION 8.3: A THREE-INSTRUCTION PROGRAMMABLE PROCESSOR
8.6 If a processor's instruction has 4 bits for the opcode, how many instructions can the
processor support?
8.7 What does the following assembly program, which uses the three-instruction instruction set of
this chapter, compute? MOV R5, 19; ADD R5, R5, R5; MOV 20, R5.
8.8 What does the following assembly program, which uses the three-instruction instruction set of
this chapter, compute? MOV R4, 20; MOV R9, 18; ADD R4, R4, R9; MOV R5, 30; ADD R9,
R4, R5; MOV 20, R9.
8.9 Using the three-instruction instruction set of this chapter, write an assembly program that
updates the data memory D as follows: D[0] = D[0] + D[1].
8.10 Using the three-instruction instruction set of this chapter, write an assembly program that
updates the data memory D as follows: D[4] = D[1]*2 + D[2].
8.11 Convert the following assembly program to machine code based on the three-instruction
instruction set of this chapter: MOV R5, 19; ADD R5, R5, R5; MOV 20, R5.
8.12 List the basic register/memory transfers and operations that occur during each clock cycle for
the following program, based on the three-instruction instruction set of this chapter: MOV R0,
1; MOV R1, 9; ADD R0, R0, R1.
SECTION 8.4: A SIX-INSTRUCTION PROGRAMMABLE PROCESSOR
8.13 List the basic register/memory transfers and operations that occur during each clock cycle for
the following program, based on the six-instruction instruction set of this chapter, assuming
that the content of D[9] is 0: MOV R6, #1; MOV R5, 9; JMPZ R5, label1; ADD R5, R5, R6;
label1: ADD R5, R5, R6. What is the value in R5 after the program completes?
8.14 Add a new instruction to the six-instruction instruction set of this chapter that performs a
bitwise AND of two registers and stores the result in a third register. Extend the datapath,
control unit, and the controller's FSM as needed.
8.15 Add a new instruction to the six-instruction instruction set of this chapter that performs an
unconditional jump (jumps always) to a location specified by a 12-bit offset. Extend the
datapath, control unit, and the controller's FSM as needed.
8.16 Add a new instruction to the six-instruction instruction set of this chapter that performs a
jump, if two registers are equal, to a location specified by a 12-bit offset. Extend the datapath,
control unit, and the controller's FSM as needed.
8.17 Using the six-instruction instruction set of this chapter, write an assembly program for the
following C code, which computes the sum of the first N numbers, where N is another name for
D[9]. Hint: Use a register to first store N.
i = 1;
sum = 0;
while (i != N) {
   sum = sum + i;
   i = i + 1;
}
8.18 Using the extended instruction set you designed in Exercise 8.16, write an assembly program
for the C code in Exercise 8.17.
SECTION 8.5: EXAMPLE ASSEMBLY AND MACHINE PROGRAMS
8.19 Define two new data movement instructions for the six-instruction instruction set of this
chapter. Extend the datapath, control unit, and the controller's FSM as needed.
8.20 Define two new arithmetic/logic instructions for the six-instruction instruction set of this
chapter. Extend the datapath, control unit, and the controller's FSM as needed.
8.21 Define two new flow-of-control instructions for the six-instruction instruction set of this
chapter. Extend the datapath, control unit, and the controller's FSM as needed.
8.22 Assuming that the microprocessor's external pins I0..I7 and P0..P7 are mapped to data
memory locations as in Figure 8.15 and an AND instruction has been added to the six-instruction
instruction set of this chapter, create an assembly program that will output 0 on P4 if all
eight inputs I0..I7 are 1s.
Carole grew up in a country where the best students went to engineering school, as
engineering was highly respected. "I was good in school, so engineering seemed like a natural
option. I was also very interested in building things, and very curious about how one builds new
things, so I was attracted to engineering at an early age, around 10 years of age."
Carole has worked at Intel for 15 years. She was one of the original architects of the popular
MMX (Multimedia Extension of the Intel Architecture) part of Pentium processors. "It was
fascinating to learn the algorithms used to compress video and audio, and to invent new
instructions for the Intel Architecture to run these applications efficiently. It is not always
easy for processor architects to quantify the benefits of new features, and to motivate the
expense in silicon area (or chip die size) for new instructions. In the case of multimedia
applications, the benefits are well understood: running a video clip at a few frames per second,
or running it in real time (about 30 frames per second) makes a huge, visible difference to
everyone." As is the case with so many engineers, she is very proud of what she accomplished:
"When the first Pentium processor with MMX came up, it was really rewarding to think that a small
piece of my mind was in all of these machines running video real time popping up everywhere."
Carole was also one of the architects on the Intel / Hewlett Packard team that defined the
Itanium computer architecture. "This was a unique opportunity to define a processor 'from
scratch.' Technically this was a very challenging project, and working with so many top notch
architects was very enriching. But I also learned what it takes to build something big, involving
a very large team, and two large companies. The two companies had different cultures, different
methodologies, and reconciling the differences was sometimes more challenging than solving
the technical problems. But this is all part of 'building things,' and this was a great lesson in
leadership."
What Carole likes most about her career is "the constant change. After 22 years as a computer
architect, I am still doing new things every day. Computer science is a work in progress, and it
offers new opportunities that one has to grab, and run with. This is where the fun is."
Asked to give some advice to students, Carole suggests two things:
"Stay at school as long as possible. Get a PhD if you can. To be able to adapt to constant
change, you will need a very robust and theoretical foundation. Only learning how to do things
is not enough; it will get you a job for 2 years, but then your skills will be obsolete."
"Be open for change. It is important to build an in-depth expertise in one area; in my case, it
is computer architecture. But one has to be ready to use this expertise in many different
projects, with different people, and more and more in different parts of the world. Fifteen years
ago multimedia applications were the focus of many computer architects. Today it is
bioinformatics and data mining. Change requires a lot of work to learn new domains, but not
adapting to change is not an option."
9
Hardware Description
Languages
9.1 INTRODUCTION¹
¹ Substantial content of this chapter was contributed by Roman Lysecky.
In this book, we have been drawing the circuits that we design. For example, in Chapter 2, we
designed an automatic door opener circuit and drew the circuit shown in Figure 9.1. A drawing
has more information than is really necessary to describe the circuit. In particular, the
drawing gives information about the location of the inputs and outputs: in the drawing of
Figure 9.1, the inputs are on the left, the output on the right, and the c input is on the top,
the h input in the middle, and the p input on the bottom. The drawing also gives information
about the size and location of the components in the circuit: the inverter is at the top, the OR
gate below the inverter, the AND gate on the right, and each component is about a half inch by a
half inch. The drawing gives information about the wires too: the wire from the inverter goes to
the right, then down, then to the right again, for example. However, all that information in the
drawing is really irrelevant, and has nothing to do with how the design will be physically
implemented. We had to draw the circuit somehow, so we chose to draw the circuit in the
manner shown in the figure. But we could have drawn the circuit many other ways too. A
drawing of a circuit is commonly referred to as a circuit schematic.
Figure 9.1 Drawn circuit (DoorOpener).
A problem with drawing all our circuits arises when we deal with larger circuits. Does
the schematic in Figure 9.2 mean anything to you? That schematic has just a couple dozen
components; what if there were a couple thousand components, as is quite common?
Drawing a large circuit would require tremendous effort on our part to figure out where to
place each component in the drawing, and how to route wires among the components. And
if a tool generated the circuit, the tool would have to spend compute time to figure out a
visually appealing way to draw the circuit (rather than a spaghetti-like mess), and such
computation is time-consuming and still may not result in a good drawing. Furthermore, the
files used to store such schematics would be very large, as those files would contain all
that extra information about the precise location and size of every component. All that extra
effort, file size, and time would be needed for something that is really not very useful:
humans can't comprehend circuit drawings of more than perhaps a hundred or so gates, so
what's the point of drawing such circuits? What we really want is a way to just describe the
circuit itself: what are the inputs and outputs, what components exist, and what are the
connections? Ideally, we would do this description in a textual language, so that we humans
could type such descriptions with a computer keyboard, just like we type email messages and
C programs.
Figure 9.2 Schematics become hard to read beyond a dozen or so components; the
graphical information becomes a nuisance rather than an aid.
We could therefore describe the circuit in Figure 9.3(a) using the textual language of
English as shown in Figure 9.3(b). We've given names to each gate in the circuit and to
the internal wires in Figure 9.3(a).
(b) We'll now describe a circuit whose name is DoorOpener.
    The external inputs are c, h, and p, which are bits.
    The external output is f, which is a bit.

    We assume you know the behavior of these components:
      An inverter, which has a bit input x, and bit output F.
      A 2-input OR gate, which has bit inputs x and y, and bit output F.
      A 2-input AND gate, which has bit inputs x and y, and bit output F.

    The circuit has internal wires n1 and n2, both bits.

    The DoorOpener circuit internally consists of:
      An inverter named Inv_1, whose input x connects to external input c, and whose output connects to n1.
      A 2-input OR gate named OR2_1, whose inputs connect to external inputs h and p, and whose output connects to n2.
      A 2-input AND gate named AND2_1, whose inputs connect to n1 and n2, and whose output connects to external output f.

    That's all.

Figure 9.3 Describing a circuit using a textual language rather than a graphical drawing: (a) schematic, (b) textual description in the English language.
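The English description is, in effect, a list of component instances and the wires that connect them. As a rough illustration only, the same information can be written as plain data and evaluated. The sketch below uses Python as a stand-in for a real HDL; the table layout and the helper name evaluate are hypothetical and are not part of any HDL or tool.

```python
# Behavior of the three component types that the description assumes are known.
GATES = {
    "Inv": lambda x: 1 - x,
    "OR2": lambda x, y: x | y,
    "AND2": lambda x, y: x & y,
}

# DoorOpener as an interconnection of components:
# (instance name, component type, input nets, output net).
DOOR_OPENER = [
    ("Inv_1", "Inv", ("c",), "n1"),
    ("OR2_1", "OR2", ("h", "p"), "n2"),
    ("AND2_1", "AND2", ("n1", "n2"), "f"),
]


def evaluate(netlist, inputs):
    """Evaluate instances in listed order (hypothetical helper).

    Assumes each instance appears after the instances that drive its inputs.
    """
    nets = dict(inputs)
    for _name, kind, ins, out in netlist:
        nets[out] = GATES[kind](*(nets[n] for n in ins))
    return nets


print(evaluate(DOOR_OPENER, {"c": 0, "h": 1, "p": 0})["f"])  # prints 1
```

The data carries exactly what the English text carries: names, component types, and connections, with no drawing coordinates.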
Of course, English is not a good language if you want to use a computer tool to read in the description: a computer tool requires a language with a precise syntax and precise meaning for every language construct. Computer-readable languages thus evolved in the 1970s and 1980s for describing hardware circuits. Such languages became known as hardware description languages, or HDLs. Hardware description languages not only enable us to describe the structural interconnections of components, but also include methods for us to describe the behavior of a component. Modern digital design relies heavily on the use of HDLs at all stages of design.

We'll provide a brief introduction to the most popular hardware description languages (VHDL, Verilog, and SystemC) in this chapter, but to really learn each language, one may want to consult textbooks specifically dedicated to each language. Each section of this chapter can be covered immediately after the corresponding earlier chapter (Section 9.2 after Chapter 2, Section 9.3 after Chapter 3, Section 9.4 after Chapter 4, and Section 9.5 after Chapter 5), or these sections may be covered all at once after completing those earlier chapters. Furthermore, each section has three parts, one for VHDL, one for Verilog, and one for SystemC. Each of those parts is independent of the other parts of the section, so a reader interested only in one of the HDLs, say Verilog, can read only the Verilog parts of each section, skipping the VHDL and SystemC parts. A reader interested in comparing the three HDLs may read the parts for all three HDLs. Doing so, you may notice that the HDLs have similar capabilities, differing primarily in their syntax. Thus, after learning one HDL thoroughly, a designer can likely learn the others quickly.
9.2 COMBINATIONAL LOGIC DESCRIPTION USING
HARDWARE DESCRIPTION LANGUAGES
Structure
This chapter's introduction sought to describe a circuit using a textual language. We now show how HDLs describe a circuit. The term structure is sometimes used to refer to a circuit, with structure meaning an interconnection of components.
VHDL
Figure 9.4(c) shows a VHDL description of the DoorOpener circuit of Figure 9.4(a). For convenience, we've also shown the English description in Figure 9.4(b), and the correspondence between the English description and the VHDL description.

The description begins with an entity declaration, which defines the design's name and the design's inputs and outputs, known as ports. An entity declaration says nothing about the internals of the design, just the design's name and interface. The description lists the port names and defines their type, which in this case is type std_logic. That type essentially means a bit, but isn't built into VHDL (the predefined bit type in VHDL is too limited, for reasons beyond our scope here). To use std_logic, we actually must include the statements "library ieee; use ieee.std_logic_1164.all;" at the top of the file.

The description continues with an architecture definition, which describes the internals of the design. We named the architecture Circuit, but we could have named it anything we wanted: DoorOpenerCircuit, DoorOpenerStructure, Structure, or even Fred, although we want a name that is helpful in understanding the architecture. The architecture starts by declaring what components the design will be using. Those components must be defined elsewhere, perhaps earlier in the description's file, or perhaps in another file. We'll discuss those components' definitions later; for now, assume they are somehow already defined. Each component declaration must define the inputs and outputs of each component, and those inputs and outputs must match the component's entity declaration (found elsewhere) exactly.
(a) [Schematic of the DoorOpener circuit: inverter Inv_1, OR gate OR2_1, and AND gate AND2_1.]

(b) [English description of the DoorOpener circuit, identical to Figure 9.3(b).]

(c)
   library ieee;
   use ieee.std_logic_1164.all;

   entity DoorOpener is
      port ( c, h, p: in std_logic;
             f: out std_logic
      );
   end DoorOpener;

   architecture Circuit of DoorOpener is
      component Inv
         port (x: in std_logic;
               F: out std_logic);
      end component;
      component OR2
         port (x, y: in std_logic;
               F: out std_logic);
      end component;
      component AND2
         port (x, y: in std_logic;
               F: out std_logic);
      end component;
      signal n1, n2: std_logic; -- internal wires
   begin
      Inv_1: Inv port map (x=>c, F=>n1);
      OR2_1: OR2 port map (x=>h, y=>p, F=>n2);
      AND2_1: AND2 port map (x=>n1, y=>n2, F=>f);
   end Circuit;

Figure 9.4 Describing a circuit using a textual language rather than a graphical drawing: (a) schematic, (b) textual description in the English language, (c) textual description in the VHDL language. Bolded words are reserved words in VHDL.
The description then includes a declaration of the design's internal signals, which are essentially internal wires. Next to that declaration, the description includes an example of a VHDL comment: "-- internal wires". Comments start with "--" followed by any text we want on the rest of the line. That text is ignored by VHDL tools, but is useful to us humans who must read the descriptions.

Finally, the description instantiates the circuit's components and defines those components' connections. For example, the description instantiates a component named Inv_1, which is a component of type Inv (which we declared earlier in the VHDL description), and indicates that Inv_1's input x connects to c, which is an external input. An alternate, more concise port map notation omits the port names. Using this notation, we could instantiate our inverter by writing "Inv_1: Inv port map (c, n1);". The order of the signals in the port map of Inv corresponds to the order of the ports in the component definition of Inv. We will use this alternate notation in subsequent examples.

The bold words in the description represent reserved words, also known as keywords, in VHDL. We cannot use reserved words for names of entities, architectures, signals, instantiated components, etc., as those words have special meaning that guides VHDL tools to understand our descriptions.

Summarizing, the VHDL structural description has an entity that describes the design's name, inputs, and outputs; a declaration of what components will be used; a declaration of internal signals; and finally, an instantiation of all components, along with their interconnections.

The entity that we've just defined could then be used as a component in another entity.
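The difference between the named port map notation (x=>c, F=>n1) and the positional notation (c, n1) can be pictured as two ways of building the same port-to-signal table. The following Python sketch is only an analogy, not VHDL semantics; the names INV_PORTS, bind_named, and bind_positional are hypothetical.

```python
# Port order as listed in the (assumed) component declaration of Inv.
INV_PORTS = ("x", "F")


def bind_named(**connections):
    """Named association: each port is paired with a signal explicitly."""
    return dict(connections)


def bind_positional(ports, *signals):
    """Positional association: signals are matched to ports by declaration order."""
    if len(signals) != len(ports):
        raise ValueError("number of signals must match number of ports")
    return dict(zip(ports, signals))


named = bind_named(x="c", F="n1")
positional = bind_positional(INV_PORTS, "c", "n1")
print(named == positional)  # prints True
```

The positional form is shorter, but it silently depends on the declared port order: swapping the two signals would connect c to the output port.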
Verilog
Figure 9.5(c) shows a Verilog description of the DoorOpener circuit of Figure 9.5(a). For convenience, we've also shown the English description in Figure 9.5(b), and the correspondence between the English description and the Verilog description.
(a) [Schematic of the DoorOpener circuit: inverter Inv_1, OR gate OR2_1, and AND gate AND2_1.]

(b) [English description of the DoorOpener circuit, identical to Figure 9.3(b).]

(c)
   module Inv(x, F);
      input x;
      output F;
      // details not shown
   endmodule

   module OR2(x, y, F);
      input x, y;
      output F;
      // details not shown
   endmodule

   module AND2(x, y, F);
      input x, y;
      output F;
      // details not shown
   endmodule

   module DoorOpener(c, h, p, f);
      input c, h, p;
      output f;
      wire n1, n2;
      Inv Inv_1(c, n1);
      OR2 OR2_1(h, p, n2);
      AND2 AND2_1(n1, n2, f);
   endmodule

Figure 9.5 Describing a circuit using a textual language rather than a graphical drawing: (a) schematic, (b) textual description in the English language, (c) textual description in the Verilog language. Bold words are reserved words in Verilog.
The description begins by defining modules for an inverter Inv, a 2-input OR gate OR2, and a 2-input AND gate AND2. We'll skip discussion of those modules, and begin our discussion with the definition of the fourth module, DoorOpener.

The description declares a module named DoorOpener. The module declaration defines a design's name and the names of that design's inputs and outputs, known as ports. The module declaration says nothing about the internals of the design or the ports, just the design's name and interface.

The description then defines the type of each port, assigning the types input and output in this example.

The description then includes a declaration of the design's internal wires, named n1 and n2.

Finally, the description instantiates the circuit's components and defines those components' connections. In the DoorOpener module, the description instantiates a component named Inv_1, which is a component of type Inv. The connections to the inputs and outputs of the instantiated components are specified in the order in which the components' modules declare the inputs and outputs. In the instantiation of Inv_1, the input c is connected to the input x of the Inv component. In Verilog, the module does not need to specify the interface of a component within the module instantiating the component. For example, the DoorOpener module does not include a declaration of which components it will instantiate or any information regarding those components. The components, of course, must be defined elsewhere, perhaps earlier in the same file as shown in Figure 9.5(c), or perhaps in another file. For reference purposes, the example shown here provides incomplete specifications for the Inv, AND2, and OR2 components in order to clearly show the ports and interface for these components. In place of specifying the internal behavior of these components, we simply included an example of a Verilog comment. Comments start with "//" and then any text we want on the rest of the line.

The bold words in the description represent reserved words, also known as keywords, in Verilog. We cannot use reserved words for names of modules, ports, wires, instantiated components, etc., as those words have special meaning that guides Verilog tools to understand our descriptions.

Summarizing, the Verilog structural description has a module that describes the design name, lists the module's inputs and outputs, and specifies the type for each input and output; a declaration of internal wires; and finally, an instantiation of all components, along with their interconnections.
SystemC
Figure 9.6(c) shows a SystemC description of the DoorOpener circuit of Figure 9.6(a). For convenience, we've also shown the English description in Figure 9.6(b), and the correspondence between the English description and the SystemC description. The SystemC language is built on top of the C++ programming language, but it is not necessary to be an expert C++ programmer to use SystemC. However, it is important to keep in mind that certain restrictions exist as a result, such as not using C++ keywords to name modules, ports, signals, etc.

(a) [Schematic of the DoorOpener circuit: inverter Inv_1, OR gate OR2_1, and AND gate AND2_1.]

(b) [English description of the DoorOpener circuit, identical to Figure 9.3(b).]

(c)
   #include "systemc.h"
   #include "inv.h"
   #include "or2.h"
   #include "and2.h"

   SC_MODULE(DoorOpener)
   {
      sc_in<sc_logic> c, h, p;
      sc_out<sc_logic> f;

      // internal wires
      sc_signal<sc_logic> n1, n2;

      // component declarations
      Inv Inv_1;
      OR2 OR2_1;
      AND2 AND2_1;

      // component instantiations
      SC_CTOR(DoorOpener): Inv_1("Inv_1"), OR2_1("OR2_1"), AND2_1("AND2_1")
      {
         Inv_1.x(c);
         Inv_1.F(n1);
         OR2_1.x(h);
         OR2_1.y(p);
         OR2_1.F(n2);
         AND2_1.x(n1);
         AND2_1.y(n2);
         AND2_1.F(f);
      }
   };

Figure 9.6 Describing a circuit using a textual language rather than a graphical drawing: (a) schematic, (b) textual description in the English language, (c) textual description in the SystemC language. Bold words are reserved words in SystemC.

Before defining the circuit behavior, we must include the statement '#include "systemc.h"' at the top of each SystemC file. The description begins with an SC_MODULE declaration, which defines the design's name, in this case DoorOpener. The module declaration says nothing about the internals of the design, just the design's name. Within the module description, the input and output ports of the design are specified, using the sc_in<> and sc_out<> statements respectively. The description lists the port names and defines their types, which in this case is type sc_logic, which specifies a single bit.

The description then includes a declaration of the design's internal signals, specified as sc_signal, which are essentially internal wires. Next to that declaration, the description includes an example of a SystemC comment: "// internal wires". Comments start with "//" and then consist of any text we want on the rest of the line.

The module then declares what components the design will be using. The SystemC module does not need to specify the interface of the component, but rather just the type of component as well as a unique name for each component within the design.

The module defines a constructor function SC_CTOR that is responsible for instantiating and connecting the components within our SystemC design. The constructor function takes as an argument the name of the current SystemC module, which in this case is DoorOpener. Following the SC_CTOR statement, after the colon, is a list of component instantiations. The SystemC module's instantiations are used to call the constructor functions of each component being instantiated. However, we point out that the connections between the individual components are not specified at this point. Instead, the statements within the constructor define the connections between the components. For example, the inverter Inv_1's input x is connected to c, which is an external input. In SystemC, the module does not need to specify the interface of a component within the module. The components, of course, must be completely defined elsewhere, perhaps earlier in the same file, or perhaps in another file. In our SystemC DoorOpener description, the descriptions for the Inv, AND2, and OR2 components are specified in other SystemC files. In order to use those components, we must include a statement at the beginning of the current file indicating where we can find this description. For example, our DoorOpener description includes the statement '#include "inv.h"', and the description of the component Inv can be found within this file.

The bolded words in the description represent reserved words, also known as keywords, in SystemC and C++. We cannot use reserved words for names of modules, ports, signals, instantiated components, etc., as those words have special meaning that guides SystemC and C++ tools to understand our descriptions.

Summarizing, the SystemC structural description has: a module that defines the design name; a list of inputs and outputs of the module specifying their types; a declaration of internal signals; a declaration of components providing the name for each component; a constructor function instantiating the module's components; and finally, the components' interconnections.
Combinational Behavior
HDLs typically support the ability to describe the internals of a design as behavior rather than as a circuit. This ability enables us to describe the bottom-level building-block components that we use in a design, such as the behavior of an AND gate or OR gate.
VHDL
Figure 9.7 contains a behavioral description of a 2-input OR gate, which you'll recall we used as a component in Figure 9.4(c). The description begins with the declarations necessary to use std_logic. It then declares the entity with the name OR2 as having two input ports x and y, and having output port F, all of type std_logic, which means bit. The description then defines an architecture named behavior for OR2. That architecture consists of a process, which is the VHDL construct that describes behavior. The process declaration here is "process(x, y)", which means the process should execute from beginning to end whenever there's a change on x or y; in other words, the process is sensitive to x and y. A process body (the part between the process's begin and end) can contain sequential statements, just like sequential statements in C, but with a different syntax. The process shown has only one such statement, assigning the value of "x or y" to F. "or" happens to be a built-in operator in VHDL, making the internal description of the OR gate simple.

   library ieee;
   use ieee.std_logic_1164.all;

   entity OR2 is
      port (x, y: in std_logic;
            F: out std_logic
      );
   end OR2;

   architecture behavior of OR2 is
   begin
      process (x, y)
      begin
         F <= x or y;
      end process;
   end behavior;

Figure 9.7 Behavioral VHDL description of an OR gate.

As another example of a behavioral description, let's revisit our DoorOpener example from Figure 9.4(c), for which we created an architecture having a structural description. We can alternatively create an architecture having a behavioral description; a VHDL entity may have multiple architecture descriptions for that same entity. Assuming the same entity declaration as in Figure 9.4(c), we show an alternative architecture definition in Figure 9.8. The behavior consists of a process that is sensitive to inputs c, h, and p. When the process executes (which is whenever c, h, or p changes), the process executes its one statement, which updates the value of f.

   architecture beh of DoorOpener is
   begin
      process (c, h, p)
      begin
         f <= not(c) and (h or p);
      end process;
   end beh;

Figure 9.8 Behavioral VHDL description of the DoorOpener design.

In designing the DoorOpener circuit, we might start with the behavioral description, and run a simulation to verify correct behavior. We might then create a structural description, and run simulation again to verify that the circuit has the same functionality as the behavior. In fact, tools exist that automatically convert such behavior to a circuit.

When writing a VHDL process describing a combinational circuit's behavior, care must be taken to include all the circuit's inputs in the process's sensitivity list. Omitting an input is not a VHDL error, but such omission results in different behavior than combinational behavior: with an input omitted, the output does not change when that input changes, meaning there must be some storage in the circuit.
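The effect of an incomplete sensitivity list can be made concrete with a toy event-driven model. The Python sketch below is not a VHDL simulator; the Process class and simulate function are hypothetical, and the only rule modeled is that a process body runs when a signal in its sensitivity list changes.

```python
class Process:
    """Toy model of an HDL process: the body runs only when a signal
    named in the sensitivity list changes value."""

    def __init__(self, sensitivity, body):
        self.sensitivity = set(sensitivity)
        self.body = body


def simulate(process, stimulus):
    """Apply (signal, value) changes one at a time and record F after each."""
    signals = {"x": 0, "y": 0, "F": 0}
    trace = []
    for name, value in stimulus:
        changed = signals[name] != value
        signals[name] = value
        if changed and name in process.sensitivity:
            process.body(signals)
        trace.append(signals["F"])
    return trace


def or_body(s):
    s["F"] = s["x"] | s["y"]


stimulus = [("x", 1), ("x", 0), ("y", 1), ("y", 0)]
full = simulate(Process(["x", "y"], or_body), stimulus)
missing_y = simulate(Process(["x"], or_body), stimulus)
print(full)       # prints [1, 0, 1, 0]
print(missing_y)  # prints [1, 0, 0, 0]
```

With y omitted, F stays at 0 after y rises to 1: the output is holding an old value rather than following its inputs, which is the storage the text warns about.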
Verilog
Figure 9.9 contains a behavioral description of a 2-input OR gate, which you'll recall we used as a component in Figure 9.5. The description begins by declaring the module named OR2 and specifying that the module has three ports named x, y, and F. The description then defines that the ports x and y are both inputs and the port F is an output. The description then defines the output F to be a reg output. In Verilog, all ports are by default assumed to be wires, which do not store values. Instead, wires can only create connections between components. If we want to assign a value to an output port, we must define the port to be a reg, which indicates the output port stores the value we assign to the port. The Verilog code for our design continues with an always procedure that defines a block of code that will be repeatedly executed whenever a change occurs on an input in the block's input list. The always procedure declaration is "always @(x or y)", which means the procedure should execute from beginning to end whenever there is a change on x or y; in other words, the procedure is sensitive to x and y. The always procedure's statements (the part between the procedure's begin and end statements) can contain sequential statements, just like sequential statements in C, but with a different syntax. The block shown has only one such statement, assigning the value of "x | y" to F, where | is a built-in Verilog operation to compute an OR.

   module OR2(x, y, F);
      input x, y;
      output F;
      reg F;

      always @(x or y)
      begin
         F <= x | y;
      end
   endmodule

Figure 9.9 Behavioral Verilog description of an OR gate.

As another example of a behavioral description, let's revisit our DoorOpener example from Figure 9.5(c), for which we created a structural Verilog description. We can alternatively create a behavioral description. Figure 9.10 presents a behavioral Verilog description of the DoorOpener circuit. The module declaration is similar to the structural description of Figure 9.5(c), but in the behavioral description we need to declare the output f as a reg. The behavior consists of an always procedure sensitive to inputs c, h, and p. When the procedure executes (which is whenever c, h, or p changes), the procedure executes a single statement that updates the value of f, by assigning the value "(~c) & (h | p)", where ~, &, and | perform the invert, AND, and OR operations, respectively.

   module DoorOpener(c, h, p, f);
      input c, h, p;
      output f;
      reg f;

      always @(c or h or p)
      begin
         f <= (~c) & (h | p);
      end
   endmodule

Figure 9.10 Behavioral Verilog description of the DoorOpener design.

In designing the DoorOpener circuit, we might start with the behavioral description, and run a simulation to verify correct behavior. We might then create a structural description, and run a simulation again to verify that the circuit has the same functionality as the behavior. In fact, tools exist that automatically convert such behavior to a circuit.
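The check described above, that a structural description has the same functionality as the behavioral one, can be sketched outside any HDL for a circuit this small. The Python functions below are hand-written stand-ins for the two DoorOpener descriptions, not output from any tool, and the helper name equivalent is hypothetical.

```python
from itertools import product


def door_opener_structural(c, h, p):
    """Mirrors the gate-level description: three instances and two wires."""
    n1 = 1 - c      # Inv_1
    n2 = h | p      # OR2_1
    return n1 & n2  # AND2_1


def door_opener_behavioral(c, h, p):
    """Mirrors the one-statement behavior f = (not c) and (h or p)."""
    return (1 - c) & (h | p)


def equivalent(f, g, n_inputs):
    """True if f and g agree on every combination of n_inputs bits."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=n_inputs))


print(equivalent(door_opener_structural, door_opener_behavioral, 3))  # prints True
```

Exhaustive comparison is practical here only because three inputs give eight combinations; the number of combinations doubles with each added input.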
SystemC
Figure 9.11 shows a SystemC behavioral description of a 2-input OR gate, which you'll recall we used as a component in Figure 9.6(c). The SystemC description declares the module with the name OR2, which has two input ports x and y and one output port F, all of type sc_logic, indicating each input and output is an individual bit. The module defines the constructor function SC_CTOR that consists of a process named comblogic defined as an SC_METHOD. SC_METHOD is one SystemC construct that describes behavior. The declaration here is "SC_METHOD(comblogic); sensitive << x << y;", which means the process will execute the circuit behavior described in the function comblogic whenever there is a change on x or y. In other words, the process is sensitive to x and y. The process body is defined in the function comblogic and is declared as "void comblogic()". The process function (the part between the open brace "{" and close brace "}") can contain sequential statements, just like sequential statements in C or C++, but sometimes requires different syntax. The process shown has only one such statement, writing the value of "x.read() | y.read()" to F, where | executes an OR operation. In SystemC, one can read the current value of an input port using the read() function and can write a value to an output port using the write() function. While we can use other methods of accessing the input and output ports, the read() and write() functions are recommended.

   #include "systemc.h"

   SC_MODULE(OR2)
   {
      sc_in<sc_logic> x, y;
      sc_out<sc_logic> F;

      SC_CTOR(OR2)
      {
         SC_METHOD(comblogic);
         sensitive << x << y;
      }

      void comblogic()
      {
         F.write(x.read() | y.read());
      }
   };

Figure 9.11 Behavioral SystemC description of an OR gate.

As another example of a behavioral description, let's revisit our DoorOpener example from Figure 9.6(c), for which we created a structural SystemC description. We can alternatively create a behavioral description. Figure 9.12 presents a behavioral SystemC description of the DoorOpener circuit. The module declaration is the same as the structural description of Figure 9.6(c). The behavior consists of a single process, named comblogic, that is sensitive to inputs c, h, and p. When the process executes (which is whenever c, h, or p changes), the process executes its one statement, which updates the value of f by assigning the value "(~c.read()) & (h.read() | p.read())", where ~ performs an invert operation, & performs an AND operation, and | performs an OR operation.

   #include "systemc.h"

   SC_MODULE(DoorOpener)
   {
      sc_in<sc_logic> c, h, p;
      sc_out<sc_logic> f;

      SC_CTOR(DoorOpener)
      {
         SC_METHOD(comblogic);
         sensitive << c << h << p;
      }

      void comblogic()
      {
         f.write((~c.read()) & (h.read() | p.read()));
      }
   };

Figure 9.12 Behavioral SystemC description of the DoorOpener design.

In designing the DoorOpener circuit, we might start with the behavioral description, and run a simulation to verify correct behavior. We might then create a structural description, and run simulation again to verify that the circuit has the same functionality as the behavior. In fact, tools exist that automatically convert such behavior to a circuit.

Testbenches
One of the main uses of an HDL is that of simulating a new design to ensure that the design is correct. To simulate a design, we need to set the design's inputs to certain values, and then check that the design's output values are what we expect them to be. A system that sets input values and checks output values is known as a testbench. We now show how to create an HDL testbench to test our DoorOpener circuit.
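Before looking at HDL syntax, the idea of a testbench (set inputs, then compare the output against an expected value) can be sketched in a few lines. The Python below is an illustration only; door_opener stands in for the design under test, and EXPECTED and run_testbench are hypothetical names. The expected values follow from f = (not c) and (h or p).

```python
from itertools import product


def door_opener(c, h, p):
    """Stand-in for the design under test: f = (not c) and (h or p)."""
    return (1 - c) & (h | p)


# Expected f for cases 0..7, where the case number read in binary gives c h p.
EXPECTED = [0, 1, 1, 1, 0, 0, 0, 0]


def run_testbench(design):
    """Apply all eight input combinations; return the case numbers that failed."""
    failures = []
    for case, (c, h, p) in enumerate(product((0, 1), repeat=3)):
        if design(c, h, p) != EXPECTED[case]:
            failures.append(case)
    return failures


print(run_testbench(door_opener))  # prints []
```

A faulty design shows up as a non-empty list; for example, a design that forgets the inverter fails every case in which c changes the answer.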
VHDL
Figure 9.13 shows a VHDL testbench for the DoorOpener design of Figure 9.4(c). Notice that the entity, named Testbench, has no ports; the entity is self-contained, requiring no inputs and generating no outputs. The architecture declares the component that we plan to test, namely the DoorOpener component. The architecture instantiates one instance of the DoorOpener component, which we named DoorOpener1. A single process in the architecture sets the inputs of the component and checks for correct output. This testbench tries all possible cases of the three inputs, of which there are eight cases. Many components have too many inputs to try all possible cases; in that situation, we might try border cases (e.g., all 0s, all 1s) and then some random cases.

   library ieee;
   use ieee.std_logic_1164.all;

   entity Testbench is
   end Testbench;

   architecture behavior of Testbench is
      component DoorOpener
         port ( c, h, p: in std_logic;
                f: out std_logic
         );
      end component;
      signal c, h, p, f: std_logic;
   begin
      DoorOpener1: DoorOpener port map (c, h, p, f);

      process
      begin
         -- case 0
         c <= '0'; h <= '0'; p <= '0';
         wait for 1 ns;
         assert (f='0') report "Case 0 failed";
         -- case 1
         c <= '0'; h <= '0'; p <= '1';
         wait for 1 ns;
         assert (f='1') report "Case 1 failed";
         -- (cases 2-6 omitted from figure)
         -- case 7
         c <= '1'; h <= '1'; p <= '1';
         wait for 1 ns;
         assert (f='0') report "Case 7 failed";
         wait; -- process does not wake up again
      end process;
   end behavior;

Figure 9.13 Behavioral VHDL description of DoorOpener testbench.

Each case sets the three inputs of the component to a particular input combination, and waits for those values to propagate through the component. We arbitrarily wait for 1 ns of simulated time, but could have picked any time, since we didn't actually create a time delay within the component. But we do have to wait for some time, as VHDL simulation is defined such that no signal is updated instantaneously, but rather after an infinitely small period of simulated time. After waiting, each case checks for the correct value on the output f, using an assert statement. If the condition of the assert statement evaluates to true, simulation proceeds to the next statement. But if the condition evaluates to false, the corresponding error message will be reported (and, depending on the assertion's severity level and the simulator's settings, the simulation may terminate).
Verilog
Figure 9.14 shows a Verilog testbench for the DoorOpener design of Figure 9.5(c). Notice
that the module, named Testbench, has no ports; the module is self-contained, requiring no
inputs and no outputs. The module first declares three registered signals c, h, and p and a
single wire f. The signals c, h, and p are declared as reg because we must assign values to
the signals that will be connected to the inputs of the design we are testing. However,
because we do not need to assign a value to the output we are monitoring, the signal f is
declared as a wire. The testbench then instantiates one instance of the DoorOpener
component, named DoorOpener1, and connects the inputs and outputs of the component to
our internal signals. The testbench then contains an initial procedure that defines a
block of code that will be executed only once when execution of the testbench begins.
The initial procedure sets the inputs of the DoorOpener component and displays the
resulting value of the component's output. This testbench tries all possible cases of
the three inputs, of which there are eight cases. Many components have too many
inputs to try all possible cases; in that situation, we might try border cases (e.g., all
0s, all 1s) and then some random cases.
module Testbench;
   reg c, h, p;
   wire f;
   DoorOpener DoorOpener1 (c, h, p, f);
   initial
   begin
      // case 0
      c <= 0; h <= 0; p <= 0;
      #1 $display("f = %b", f);
      // case 1
      c <= 0; h <= 0; p <= 1;
      #1 $display("f = %b", f);
      // (cases 2-6 omitted from figure)
      // case 7
      c <= 1; h <= 1; p <= 1;
      #1 $display("f = %b", f);
   end
endmodule
Figure 9.14 Behavioral Verilog description of DoorOpener testbench.
Each case sets the three inputs of the component to a particular input combination,
and waits for those values to propagate through the component. We arbitrarily
wait for 1 unit of simulated time using the delay control statement "#1", but we could
have picked any length of time, since we didn't actually create a time delay within
the component. The Verilog language does not define standard time units such as
nanoseconds, but instead simply defines time in terms of time units, which a designer
can use within a simulation environment. We do have to wait for some time, as the
assignments within the testbench are nonblocking statements that are not updated
until the current simulation time completes. After waiting, each case outputs the value
of the output f using a $display statement. The statement $display("f = %b", f)
outputs the value of f in binary. For example, if the value of f is 1, then the
display statement will output "f = 1". The display statement consists of a format
string followed by a comma-separated list of wires, registers, or ports. Within the
format string of our display statement, the %b indicates that the value of the signal
specified after the format string will be displayed in binary. After simulation has
completed, we can compare the values output during simulation to the expected values to
determine if our circuit is working correctly.
458 Hardwa re Description Languages
SystemC
Figure 9.15 shows a SystemC testbench for the DoorOpener design of Figure 9.6(c). Notice
that the module, Testbench, has three output ports, c_t, h_t, and p_t, and one input
port f_t. In SystemC, we design the testbench circuit as a separate module that connects to
the design we are testing. Therefore, for every input port on the circuit we are testing,
our testbench will have a corresponding output port. Likewise, for every output port on
the circuit we are testing, our testbench will have a corresponding input port. The
testbench module defines a single process named testbench_proc. The testbench process is
defined as an SC_THREAD, which is similar to an SC_METHOD process except that the
SC_THREAD allows us to use the wait() function within the process body to control the
timing behavior of the process. In contrast, SystemC does not allow us to use the wait()
function within an SC_METHOD process. The testbench process controls the inputs of
the circuit we are testing and checks for correct output. This testbench tries all
possible cases of the DoorOpener's three inputs, of which there are eight cases. Many
components have too many inputs to try all possible cases; in that situation, we might try
border cases (e.g., all 0s, all 1s) and then some random cases.
#include "systemc.h"
SC_MODULE(Testbench)
{
   sc_out<sc_logic> c_t, h_t, p_t;
   sc_in<sc_logic> f_t;
   SC_CTOR(Testbench)
   {
      SC_THREAD(testbench_proc);
   }
   void testbench_proc()
   {
      // case 0
      c_t.write(SC_LOGIC_0);
      h_t.write(SC_LOGIC_0);
      p_t.write(SC_LOGIC_0);
      wait(1, SC_NS);
      assert( f_t.read() == SC_LOGIC_0 );
      // case 1
      c_t.write(SC_LOGIC_0);
      h_t.write(SC_LOGIC_0);
      p_t.write(SC_LOGIC_1);
      wait(1, SC_NS);
      assert( f_t.read() == SC_LOGIC_1 );
      // (cases 2-6 omitted from figure)
      // case 7
      c_t.write(SC_LOGIC_1);
      h_t.write(SC_LOGIC_1);
      p_t.write(SC_LOGIC_1);
      wait(1, SC_NS);
      assert( f_t.read() == SC_LOGIC_0 );
   }
};
Figure 9.15 Behavioral SystemC description of DoorOpener testbench.
Each case sets the three inputs of the DoorOpener circuit to a particular input
combination, and waits for those values to propagate through the component. We
arbitrarily wait for 1 ns of simulated time, but could have picked any time, since we
didn't actually create a time delay within the component. But we do have to wait for
some time, as SystemC simulation is defined such that no signal or port is updated
instantaneously, but rather after an infinitely small period of simulated time. After
waiting, each case checks for the correct output by reading the port f_t using an assert
statement. If the condition of the assert statement evaluates to true, simulation proceeds
to the next statement. But if the condition is false, simulation will stop and
the corresponding error will be reported.
In SystemC, 0 and 1 are integer values and not logic values. Instead,
SystemC defines the values SC_LOGIC_0 and SC_LOGIC_1 that correspond to the logic
values of 0 and 1, respectively, which we used in our description.
9.3 SEQUENTIAL LOGIC DESCRIPTION USING
HARDWARE DESCRIPTION LANGUAGES
Register
The most basic component in sequential logic is a register. We now show how to model a
basic register in HDLs.
VHDL
Figure 9.16 shows a basic 4-bit register in VHDL. The register is identical to that
described in Figure 3.30. The entity defines the data input I and the data output Q, as
well as the clock input clk. The input I and output Q of this design correspond to 4-bit
values. Instead of using eight individual std_logic inputs and outputs, the entity's I and
Q ports are defined as std_logic_vector. A std_logic_vector is a vector, or array, of
multiple std_logic elements. For example, the type declaration
"std_logic_vector(3 downto 0)" defines a 4-bit vector of std_logic elements, where the
bits within the vector are numbered from 3 to 0. The downto statement defines the ordering
of the elements within the vector, indicating that element 3 is located in the leftmost
position. The statement I <= "1000" would thus assign the value '1' to position 3 of the
vector I and the value '0' to the remaining three positions. When assigning a value to a
std_logic_vector, the vector's value must be specified within double quotations. For
example, the decimal value 5 would be specified as a 4-bit std_logic_vector as "0101".
library ieee;
use ieee.std_logic_1164.all;
entity Reg4 is
   port ( I: in std_logic_vector(3 downto 0);
          Q: out std_logic_vector(3 downto 0);
          clk: in std_logic
   );
end Reg4;
architecture behavior of Reg4 is
begin
   process(clk)
   begin
      if (clk='1' and clk'event) then
         Q <= I;
      end if;
   end process;
end behavior;
Figure 9.16 Behavioral VHDL description of a 4-bit register.
The architecture describes the register behaviorally, using a process statement. The
process is sensitive to its clk input only. Because the register should only update its
output during a rising clock edge, the process need not execute if input I changes. If clk
changes, the process begins executing its statements. The first statement checks if the
process began executing due to a rising clock edge (0 to 1), as opposed to a falling edge
(1 to 0). The statement checks for a rising edge by checking if the clk input just changed
(clk'event) and that change was to a 1 (clk='1'). If the process began executing due
to a rising clock edge, then the process updates the register's contents using the statement
"Q <= I". For a falling clock edge, the process will begin executing, check the if statement
condition, and then reach the end of the process and hence stop executing, without updating
Q. Ideally, VHDL would have a way to begin executing a process only on a rising clock
edge, but a VHDL sensitivity list has no such feature.
In VHDL, output ports are a type of signal, and signals have memory in simulation.
Thus, assigning I to Q causes Q to retain the new value, even when the process stops
executing, thus implementing the storage part of the register.
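The rising-edge check described above can be mimicked in ordinary C++. The sketch below is a hypothetical software model, not an HDL description: the struct Reg4 and the method on_clk_change are names invented here, and the model assumes the method is called every time the clock changes, which is the role the sensitivity list plays in the process.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical software model of a 4-bit register that loads on rising edges.
struct Reg4 {
    std::uint8_t q = 0;     // stored value; persists between calls like a signal
    bool prev_clk = false;  // clock level seen at the previous call

    // Intended to be called whenever clk changes (assumption of this sketch).
    void on_clk_change(bool clk, std::uint8_t i) {
        bool rising = clk && !prev_clk;  // changed, and the change was to 1
        if (rising) {
            q = i & 0xF;                 // keep only the low 4 bits
        }
        prev_clk = clk;                  // falling edges fall through without updating q
    }
};

// Usage: load 5 on a rising edge, then show a falling edge leaves q alone.
std::uint8_t demo_reg4() {
    Reg4 r;
    r.on_clk_change(true, 0x5);   // rising edge: q becomes 5
    r.on_clk_change(false, 0xA);  // falling edge: q is unchanged
    assert(r.q == 0x5);
    return r.q;
}
```

Keeping the previous clock level in the model stands in for clk'event, since plain C++ has no built-in notion of a signal event.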
Verilog
Figure 9.17 shows a basic 4-bit register in Verilog. The register is identical to that
described in Figure 3.30. The module defines the data input I and the data output Q,
as well as the clock input clk. The input I and output Q of this design correspond to a
4-bit value. Instead of using eight individual inputs and outputs, the module's I and Q
ports are defined as vectors. For example, the type declaration "input [3:0] I" defines a
4-bit input vector where the bit positions within the vector are numbered from 3 to 0.
The [3:0] defines the ordering of the elements within the vector, indicating that
element 3 is located in the leftmost position. The statement "I <= 4'b1000" would thus
assign the value 1 to position 3 of the vector I and the value 0 to the remaining three
positions. When assigning a value to a vector, we must specify the number of bits within
the value we are assigning, the base in which we are specifying the value, and the value
itself. For example, the decimal value 5 would be specified as the 4-bit binary value
4'b0101.
module Reg4(I, Q, clk);
   input [3:0] I;
   input clk;
   output [3:0] Q;
   reg [3:0] Q;
   always @(posedge clk)
   begin
      Q <= I;
   end
endmodule
Figure 9.17 Behavioral Verilog description of a 4-bit register.
The module describes the register behaviorally, using an always procedure. The
procedure block is sensitive to the positive edge of the clk input, specified using the
posedge keyword. Because the module should only update its output during a rising clock
edge, the always procedure need not execute if I changes. On the positive edge of the
clock, the procedure updates the register's contents using the statement "Q <= I".
Because we defined the output Q as a reg, assigning I to Q causes Q to retain the new
value, even when the procedure is done executing, thus implementing the storage part of
the register.
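The bit-ordering convention in the examples above (leftmost character is position 3, and decimal 5 is written 0101) can be checked with the C++ standard library's std::bitset, which uses the same convention for its string form. The helper names from_bits and to_bits are hypothetical, introduced only for this sketch.

```cpp
#include <bitset>
#include <cassert>
#include <string>

// Parse a 4-character string of 0s and 1s; the leftmost character is bit 3.
std::bitset<4> from_bits(const std::string& s) {
    return std::bitset<4>(s);
}

// Write a small unsigned value as a 4-character bit string.
std::string to_bits(unsigned value) {
    return std::bitset<4>(value).to_string();
}

// Usage: mirrors the "1000" and "0101" examples from the text.
void demo_bit_ordering() {
    assert(from_bits("1000").test(3));          // position 3 holds the 1
    assert(!from_bits("1000").test(0));         // the remaining positions hold 0
    assert(from_bits("1000").to_ulong() == 8);  // so the value is 8
    assert(to_bits(5) == "0101");               // decimal 5 as a 4-bit string
}
```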
SystemC
Figure 9.18 shows a basic 4-bit register in SystemC. The register is identical to that
described in Figure 3.30. The module defines the data input I and the data output Q, as
well as the clock input clk. The input I and output Q of this design correspond to a 4-bit
value. Instead of using eight individual sc_logic inputs and outputs, the module's I and Q
ports are defined as sc_lv logic vectors. An sc_lv is a vector of multiple sc_logic
elements. For example, the type declaration "sc_lv<4>" defines a 4-bit vector of sc_logic
elements where the bit positions within the vector are numbered from 3 to 0. In SystemC,
the ordering of the elements within the vector is defined such that the leftmost position
is the most significant bit. For example, the statement "I <= "1000"" would thus assign
the value 1 to position 3 of the vector I and the value 0 to the remaining three
positions. When assigning a value to an sc_lv, the vector's value must be specified within
double quotations. For example, the decimal value 5 would be specified as a 4-bit sc_lv as
"0101". Notice that in defining the input port for I, we included a space between the two
closing angle brackets, > >, the space being required in SystemC.
#include "systemc.h"
SC_MODULE(Reg4)
{
   sc_in<sc_lv<4> > I;
   sc_out<sc_lv<4> > Q;
   sc_in<sc_logic> clk;
   SC_CTOR(Reg4)
   {
      SC_METHOD(seq_logic);
      sensitive_pos << clk;
   }
   void seq_logic()
   {
      Q.write(I.read());
   }
};
Figure 9.18 Behavioral SystemC description of a 4-bit register.
The module consists of a single process, named seq_logic, that is sensitive to the
positive edge of the clk input, specified using the sensitive_pos statement for defining
the sensitivity list. Because the module should only update its output during a rising
clock edge, the seq_logic process need not wake up if I changes. On the positive edge of
the clock, the process updates the register's contents using the statement
"Q.write(I.read())".
In SystemC, output ports are a type of signal, and signals have memory. Thus, assigning
I to Q causes Q to retain the new value, even when the process is done executing, thus
implementing the storage part of the register.
Oscillator
VHDL
The register presented in Figure 9.16 has a clock input. We thus need to define an
oscillator component that generates a clock signal. Figure 9.19 illustrates an oscillator
described in VHDL. The entity defines one output, clk. The architecture consists of a
process, but notice that the process does not have a sensitivity list. By default, such a
process executes its statements as if they were enclosed in an infinite loop. So the
process sets the clock to 0, sleeps until 10 ns of simulated time passes, sets the clock
to 1, sleeps another 10 ns of simulated time, goes back to the first statement in the
process that sets the clock to 0, and so on. The output waveform for such an oscillator
will be identical to the waveform shown in Figure 3.17.
library ieee;
use ieee.std_logic_1164.all;
entity Osc is
   port ( clk: out std_logic );
end Osc;
architecture behavior of Osc is
begin
   process
   begin
      clk <= '0';
      wait for 10 ns;
      clk <= '1';
      wait for 10 ns;
   end process;
end behavior;
Figure 9.19 VHDL oscillator description.
The wait for statement in VHDL tells the simulator the amount of simulated time that
the process should not execute. A process without a sensitivity list must have at least
one wait statement, otherwise the simulator will never finish simulating that process
(because the process is in an implicit infinite loop), and thus the simulator will never
get a chance to update outputs or to simulate other processes. On the other hand, a
process with a sensitivity list cannot include wait statements, because by definition,
the sensitivity list defines when the process should execute.
Verilog
The register presented in Figure 9.17 has a clock input. We thus need to define an
oscillator component that generates a clock signal. Figure 9.20 illustrates an oscillator
described in Verilog. The module defines one output, clk. The module consists of an
always procedure, but notice that the always procedure does not have a sensitivity list.
By default, such a procedure executes its statements as if they were enclosed in an
infinite loop. Assuming we are using a time scale of nanoseconds, the always procedure
sets the clock to 0, delays for 10 ns of simulated time, sets the clock to 1, delays for
another 10 ns of simulated time, goes back to the first statement in the procedure that
sets the clock to 0, and so on. The output waveform for such an oscillator will be
identical to the waveform shown in Figure 3.17.
module Osc(clk);
   output clk;
   reg clk;
   always
   begin
      clk <= 0;
      #10;
      clk <= 1;
      #10;
   end
endmodule
Figure 9.20 Verilog oscillator description.
The delay control statement, specified with the # character, tells the simulator the
amount of simulated time that the procedure should not execute. A procedure without a
sensitivity list must have at least one delay control statement, otherwise the simulator
will never finish simulating that procedure (because the procedure is in an implicit
infinite loop), and thus the simulator will never get the chance to update outputs or to
simulate other procedures. On the other hand, a procedure with a sensitivity list cannot
include delay control statements, because by definition the sensitivity list defines when
the procedure should awake.
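The waveform produced by such an oscillator can be previewed with a short C++ sketch that samples the clock once per time unit. This is an illustrative model only: the function clock_samples and its parameters are hypothetical, and the sketch assumes half_period is a positive whole number of time units.

```cpp
#include <cassert>
#include <vector>

// Sample an idealized clock at times 0, 1, ..., end_time - 1.
// The clock starts at 0 and toggles every half_period time units,
// matching "set to 0, wait, set to 1, wait, repeat".
// Assumes half_period > 0.
std::vector<int> clock_samples(int half_period, int end_time) {
    std::vector<int> samples;
    int clk = 0;
    int next_toggle = half_period;
    for (int t = 0; t < end_time; ++t) {
        if (t == next_toggle) {
            clk = 1 - clk;               // the waiting period has elapsed
            next_toggle += half_period;  // schedule the following toggle
        }
        samples.push_back(clk);
    }
    return samples;
}

// Usage: with a half period of 10, the clock is 0 during times 0-9,
// 1 during times 10-19, and 0 again from time 20.
void demo_clock() {
    std::vector<int> s = clock_samples(10, 40);
    assert(s[9] == 0 && s[10] == 1 && s[19] == 1 && s[20] == 0);
}
```

The loop advances time explicitly, which is the job the simulator does for the HDL descriptions; without that advance the loop body would run forever at a single instant, which is the situation the text warns about.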
SystemC
The register presented in Figure 9.18 has a clock input. We thus need to define an
oscillator component that generates a clock signal. Figure 9.21 illustrates an oscillator
described in SystemC. The module defines one output, clk. The module consists of a single
process, named seq_logic, implemented as an SC_THREAD. By default, an SC_THREAD process
is only executed once. In order to ensure the process executes continuously, we enclose
the statements within the process in an infinite loop, implemented using the statement
"while (true)". Thus, the loop will execute the statements included within the braces
forever. During execution, the process sets the clock to 0, suspends execution for 10 ns
of simulated time, sets the clock to 1, sleeps another 10 ns of simulated time, sets the
clock to 0, and so on. The output waveform for such an oscillator will be identical to the
waveform in Figure 3.17.
#include "systemc.h"
SC_MODULE(Osc)
{
   sc_out<sc_logic> clk;
   SC_CTOR(Osc)
   {
      SC_THREAD(seq_logic);
   }
   void seq_logic()
   {
      while (true) {
         clk.write(SC_LOGIC_0);
         wait(10, SC_NS);
         clk.write(SC_LOGIC_1);
         wait(10, SC_NS);
      }
   }
};
Figure 9.21 SystemC oscillator description.
The wait() function in SystemC tells the simulator the amount of simulated time that
the process should not execute. For example, the statement "wait(10, SC_NS);" will
suspend the execution of the process for 10 ns. An SC_THREAD process explicitly
implementing an infinite loop must have at least one wait statement, otherwise the
simulator will never finish simulating that process (because the process is in an
infinite loop), and thus the simulator cannot update outputs or simulate other processes.
Controllers
Recall that a common type of sequential circuit is a controller, which implements a
finite-state machine. The controller consists of a state register and combinational
logic.
VHDL
Figure 9.22 shows one way to model a controller in VHDL. The controller modeled is
described by the FSM shown in Figures 3.38 and 3.39. The VHDL entity, named LaserTimer,
defines the controller's inputs and outputs.
The VHDL architecture describes the behavior of the entity. The architecture consists of
two processes, one modeling the state register, the other modeling the combinational
logic, that form the standard controller architecture from Figure 3.47.
The first process describes the controller's state register. That process, named
statereg, is sensitive to inputs clk and rst. If the rst input is enabled, then the
process asynchronously sets the currentstate signal to the FSM's initial state, S_Off.
Otherwise, if the clock is rising, the process updates the state register with the next
state.
library ieee;
use ieee.std_logic_1164.all;
entity LaserTimer is
   port ( b: in std_logic;
          x: out std_logic;
          clk, rst: in std_logic
   );
end LaserTimer;
architecture behavior of LaserTimer is
   type statetype is
      (S_Off, S_On1, S_On2, S_On3);
   signal currentstate, nextstate:
      statetype;
begin
   statereg: process(clk, rst)
   begin
      if (rst='1') then -- initial state
         currentstate <= S_Off;
      elsif (clk='1' and clk'event) then
         currentstate <= nextstate;
      end if;
   end process;
   comblogic: process(currentstate, b)
   begin
      case currentstate is
         when S_Off =>
            x <= '0'; -- laser off
            if (b='0') then
               nextstate <= S_Off;
            else
               nextstate <= S_On1;
            end if;
         when S_On1 =>
            x <= '1'; -- laser on
            nextstate <= S_On2;
         when S_On2 =>
            x <= '1'; -- laser still on
            nextstate <= S_On3;
         when S_On3 =>
            x <= '1'; -- laser still on
            nextstate <= S_Off;
      end case;
   end process;
end behavior;
Figure 9.22 Behavioral VHDL description of the LaserTimer controller.
The currentstate and nextstate signals are defined as a user-defined type, named
statetype. The statetype is defined by the type statement and specifies the possible
values a signal of that type can represent. In specifying statetype, which represents the
states of an FSM, the type declaration consists of the names of all the states in our
controller, specifically S_Off, S_On1, S_On2, and S_On3.
The second process describes the controller's combinational logic. That process,
named comblogic, is sensitive to the inputs to the combinational logic of Figure 3.47,
namely, the external inputs (in this case, b), and the state register outputs
(currentstate). When either of those items changes, the process sets the FSM's outputs,
in this case x, with the appropriate value for the current state. The process also
determines what the next state should be, based on the current state and the values of
inputs (i.e., the conditions on the FSM transitions). The next state will be loaded into
the state register by the state register process on the next rising clock edge.
Notice that the architecture declares two signals, currentstate and nextstate. Signals
are visible across all processes in an architecture. The currentstate signal represents
the actual storage of the state register. The nextstate signal represents the value
coming from the combinational logic and going to the state register. Notice also that
the architecture declares those signals as type statetype, defined in the architecture as
a type whose value can be either S_Off, S_On1, S_On2, or S_On3.
Verilog
Figure 9.23 shows one way to model a controller in Verilog. The controller modeled is
described by the FSM shown in Figures 3.38 and 3.39. The Verilog module, named
LaserTimer, defines the controller's inputs and outputs.
The module consists of two procedures, one modeling the state register, the other
modeling the combinational logic, that together form the standard controller architecture
from Figure 3.47.
module LaserTimer(b, x, clk, rst);
   input b, clk, rst;
   output x;
   reg x;
   parameter S_Off = 2'b00,
             S_On1 = 2'b01,
             S_On2 = 2'b10,
             S_On3 = 2'b11;
   reg [1:0] currentstate;
   reg [1:0] nextstate;
   // state register procedure
   always @(posedge rst or posedge clk)
   begin
      if (rst==1) // initial state
         currentstate <= S_Off;
      else
         currentstate <= nextstate;
   end
   // combinational logic procedure
   always @(currentstate or b)
   begin
      case (currentstate)
         S_Off: begin
            x <= 0; // laser off
            if (b==0)
               nextstate <= S_Off;
            else
               nextstate <= S_On1;
         end
         S_On1: begin
            x <= 1; // laser on
            nextstate <= S_On2;
         end
         S_On2: begin
            x <= 1; // laser still on
            nextstate <= S_On3;
         end
         S_On3: begin
            x <= 1; // laser still on
            nextstate <= S_Off;
         end
      endcase
   end
endmodule
Figure 9.23 Behavioral Verilog description of the LaserTimer controller.
The state register procedure is sensitive to the positive edge of the rst input and the
positive edge of the clk input. The state register has an asynchronous reset signal, and
in order to model the asynchronous reset, the state register procedure must be sensitive
to the positive edge of the rst input. On the positive edge of the rst input, the
procedure will wake asynchronously and set the currentstate signal to the FSM's initial
state, S_Off. On the rising edge of the clock input, clk, if the reset input is not
enabled, the procedure updates the state register with the nextstate value determined by
the combinational logic procedure.
In Verilog, we must explicitly specify the size of the state registers as well as define
the values associated with each state within the FSM. Within the LaserTimer module we
declare four parameter values, namely, S_Off, S_On1, S_On2, and S_On3, which specify
the values assigned to each state within the FSM. For example, "S_Off = 2'b00"
defines the state name S_Off and assigns the 2-bit value "00" to this state. We can then
refer to this state throughout the module using S_Off instead of using specific bit
values. While not required to define a state machine, using parameters increases the
readability of our design and makes revisions to the FSM much easier. As the LaserTimer's
FSM has four states, we need a 2-bit state register, and we therefore declare the
currentstate and nextstate signals as 2-bit registers.
The second procedure is the combinational procedure implementing the control logic
of the FSM. That procedure is sensitive to the inputs to the combinational logic of
Figure 3.47, namely, the external inputs (in this case, b), and the state register
outputs (currentstate). When either of those items changes, the procedure sets the FSM's
outputs, in this case x, with the appropriate value for the current state. The procedure
also determines what the next state should be, based on the current state and the values
of inputs (i.e., the conditions on the FSM transitions). The next state will be loaded
into the state register by the state register procedure on the next positive clock edge.
Notice that the module declares two signals, currentstate and nextstate. Signals are
visible across all procedures in a module. The currentstate signal represents the actual
storage of the state register. The nextstate signal represents the value coming from the
combinational logic and going to the state register.
SystemC
Figure 9.24 shows one way to model a controller in SystemC. The controller modeled is
described by the FSM shown in Figure 3.38 and Figure 3.39. The module, named LaserTimer,
defines the controller's inputs and outputs.
The module consists of two processes, one modeling the state register, named
statereg, the other process modeling the combinational logic, named comblogic, that
together form the standard controller architecture from Figure 3.47.
The state register process is sensitive to the positive edge of the rst input and the
positive edge of the clk input. The state register has an asynchronous reset signal. In
order to model the asynchronous reset, the state register process is sensitive to the
positive edge of the rst input. On the positive edge of the rst input, the process will
wake asynchronously and set the currentstate signal to the FSM's initial state, S_Off. On
the rising edge of the clock input, clk, if the reset input is not enabled, the process
updates the state register with the nextstate value determined by the combinational
logic process.
The currentstate and nextstate signals are defined as a user-defined type, named
statetype. The statetype is defined by the enum statement and specifies the possible
values a signal of that type can represent. In specifying statetype, which represents the
states of an FSM, the enum declaration consists of the names of all the states in our
controller, specifically S_Off, S_On1, S_On2, and S_On3.
#include "systemc.h"
enum statetype { S_Off, S_On1, S_On2, S_On3 };
SC_MODULE(LaserTimer)
{
   sc_in<sc_logic> b, clk, rst;
   sc_out<sc_logic> x;
   sc_signal<statetype> currentstate, nextstate;
   SC_CTOR(LaserTimer) {
      SC_METHOD(statereg);
      sensitive_pos << rst << clk;
      SC_METHOD(comblogic);
      sensitive << currentstate << b;
   }
   void statereg() {
      if ( rst.read() == SC_LOGIC_1 )
         currentstate = S_Off; // initial state
      else
         currentstate = nextstate;
   }
   void comblogic()
   {
      switch (currentstate) {
         case S_Off:
            x.write(SC_LOGIC_0); // laser off
            if ( b.read() == SC_LOGIC_0 )
               nextstate = S_Off;
            else
               nextstate = S_On1;
            break;
         case S_On1:
            x.write(SC_LOGIC_1); // laser on
            nextstate = S_On2;
            break;
         case S_On2:
            x.write(SC_LOGIC_1); // laser still on
            nextstate = S_On3;
            break;
         case S_On3:
            x.write(SC_LOGIC_1); // laser still on
            nextstate = S_Off;
            break;
      }
   }
};
Figure 9.24 Behavioral SystemC description of the LaserTimer controller.
The second process, named comblogic, is sensitive to the inputs to the combinational
logic of Figure 3.47, namely, the external inputs, and the state register outputs. When
either of those items changes, the process sets the FSM's outputs, in this case x, with
the appropriate value for the current state. The process also determines what the next
state should be, based on the current state and the values of inputs (i.e., the
conditions on the FSM transitions). The next state will be loaded into the state register
by the state register process on the next rising clock edge. Within the first state, we
determine the next state depending on the value of input b by performing the comparison
"b.read() == SC_LOGIC_0". Note that the comparison for equality uses the syntax "==".
If we instead accidentally used the syntax "=", which is a valid statement, our design
would function incorrectly.
Notice that the module declares two sc_signals, currentstate and nextstate. Signals are
visible across all processes in a module. The currentstate signal represents the actual
storage of the state register. The nextstate signal represents the value coming from the
combinational logic and going to the state register. Notice also that the module declares
those signals as a type statetype, defined as a type whose value can be either S_Off,
S_On1, S_On2, or S_On3.
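The split between a state register and combinational logic can also be traced in plain C++, without any simulation library. The sketch below is a hypothetical software model of the same FSM: the struct, the member x held as a bool, and the helper clock_cycle (which settles the combinational logic, applies one rising clock edge, then settles again) are all assumptions made for illustration.

```cpp
#include <cassert>

enum StateType { S_Off, S_On1, S_On2, S_On3 };

// Hypothetical software model of the LaserTimer controller.
struct LaserTimerModel {
    StateType currentstate = S_Off;  // contents of the state register
    StateType nextstate = S_Off;     // value heading into the state register
    bool x = false;                  // FSM output: laser on or off

    // Combinational logic: sets x and nextstate from currentstate and b.
    void comblogic(bool b) {
        switch (currentstate) {
            case S_Off: x = false; nextstate = b ? S_On1 : S_Off; break;
            case S_On1: x = true;  nextstate = S_On2; break;
            case S_On2: x = true;  nextstate = S_On3; break;
            case S_On3: x = true;  nextstate = S_Off; break;
        }
    }

    // State register: reset wins, otherwise load the next state.
    void statereg(bool rst) {
        currentstate = rst ? S_Off : nextstate;
    }

    // One clock cycle; returns the output x seen after the edge.
    bool clock_cycle(bool b, bool rst = false) {
        comblogic(b);   // inputs settle before the edge
        statereg(rst);  // rising clock edge
        comblogic(b);   // outputs settle for the new state
        return x;
    }
};

// Usage: press the button for one cycle, then count cycles with the laser on.
int laser_on_cycles_after_press() {
    LaserTimerModel t;
    int on = 0;
    on += t.clock_cycle(true);
    for (int i = 0; i < 5; ++i) {
        on += t.clock_cycle(false);
    }
    assert(on == 3);  // S_On1, S_On2, S_On3
    return on;
}
```

Calling comblogic explicitly before and after the edge replaces the sensitivity list, which is what re-runs the combinational process in the HDL versions whenever currentstate or b changes.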
9.4 DATAPATH COMPONENT DESCRIPTION
USING HARDWARE DESCRIPTION LANGUAGES
Full-Adders
Recall that a full-adder is a combinational circuit that adds three bits (a, b, and ci) and
generates a sum (s) and a carry-out (co) bit. This section shows how to describe a
full-adder behaviorally in an HDL.
VHDL
Figure 9.25 shows a full-adder described behaviorally in VHDL. The full-adder design
corresponds to the full-adder described in Figure 4.31. The VHDL entity, named
FullAdder, defines the full-adder's three inputs a, b, and ci and two outputs s and co.
library ieee;
use ieee.std_logic_1164.all;
entity FullAdder is
   port ( a, b, ci: in std_logic;
          s, co: out std_logic
   );
end FullAdder;
architecture behavior of FullAdder is
begin
   process(a, b, ci)
   begin
      s <= a xor b xor ci;
      co <= (b and ci) or (a and ci) or (a and b);
   end process;
end behavior;
Figure 9.25 Behavioral VHDL description of a full-adder.
The architecture describes the behavior of the full-adder. The architecture consists of a
single process describing the combinational behavior of the full-adder. The process is
sensitive to all three inputs (a, b, and ci) of the full-adder. When any of the inputs
changes, the process executes its two statements, updating the values for the sum (s) and
carry-out (co).
Verilog
Figure 9.26 shows a full-adder described behaviorally in Verilog. The full-adder design
corresponds to the full-adder described in Figure 4.31. The Verilog module, named
FullAdder, defines the full-adder's three inputs a, b, and ci and two outputs s and co.
module FullAdder(a, b, ci, s, co);
   input a, b, ci;
   output s, co;
   reg s, co;
   always @(a or b or ci)
   begin
      s <= a ^ b ^ ci;
      co <= (b & ci) | (a & ci) | (a & b);
   end
endmodule
Figure 9.26 Behavioral Verilog description of a full-adder.
The module describes the behavior of the full-adder and consists of a single always
procedure describing the combinational behavior of the full-adder. The procedure is
sensitive to all three inputs (a, b, and ci) of the full-adder. When any of the inputs
changes, the procedure executes its two statements, updating the values for the sum (s)
and carry-out (co).
SystemC
Figure 9.27 shows a full-adder described behaviorally in SystemC. The full-adder design
corresponds to the full-adder described in Figure 4.31. The SystemC module, named
FullAdder, defines the full-adder's three inputs a, b, and ci and two outputs s and co.
#include "systemc.h"
SC_MODULE(FullAdder)
{
   sc_in<sc_logic> a, b, ci;
   sc_out<sc_logic> s, co;
   SC_CTOR(FullAdder)
   {
      SC_METHOD(comblogic);
      sensitive << a << b << ci;
   }
   void comblogic()
   {
      s.write(a.read() ^ b.read() ^ ci.read());
      co.write((b.read() & ci.read()) |
               (a.read() & ci.read()) |
               (a.read() & b.read()));
   }
};
Figure 9.27 Behavioral SystemC description of a full-adder.
The module describes the behavior of the full-adder and consists of a single process,
named comblogic, describing the combinational behavior of the full-adder. The process is
sensitive to all three inputs (a, b, and ci) of the full-adder. When any of the inputs
changes, the process executes its two statements, updating the values for the sum (s) and
carry-out (co).
Carry-Ripple Adders
We now show how to structurally describe a 4-bit carry-ripple adder using the full-adder we designed in the previous section.
VHDL
Figure 9.28 is a VHDL description of a 4-bit carry-ripple adder with a carry-in, as appeared in Figure 4.33. The VHDL entity, named CarryRippleAdder4, has two 4-bit inputs, a and b, and a carry-in input, ci. The carry-ripple adder outputs a 4-bit sum, s, and a final carry-out co.

library ieee;
use ieee.std_logic_1164.all;

entity CarryRippleAdder4 is
   port ( a: in std_logic_vector(3 downto 0);
          b: in std_logic_vector(3 downto 0);
          ci: in std_logic;
          s: out std_logic_vector(3 downto 0);
          co: out std_logic
   );
end CarryRippleAdder4;

architecture structure of CarryRippleAdder4 is
   component FullAdder
      port ( a, b, ci: in std_logic;
             s, co: out std_logic
      );
   end component;
   signal co1, co2, co3: std_logic;
begin
   FullAdder1: FullAdder
      port map (a(0), b(0), ci, s(0), co1);
   FullAdder2: FullAdder
      port map (a(1), b(1), co1, s(1), co2);
   FullAdder3: FullAdder
      port map (a(2), b(2), co2, s(2), co3);
   FullAdder4: FullAdder
      port map (a(3), b(3), co3, s(3), co);
end structure;
Figure 9.28 Structural VHDL description of a 4-bit carry-ripple adder.

The architecture structurally describes the carry-ripple adder composed of four full-adders. The architecture begins by declaring the component FullAdder,
which was described in the previous section. The design has three internal signals, co1, co2, and co3, that are used for internal connection between the full-adders. The architecture then instantiates four FullAdder components. In VHDL, each instantiated component must have a unique name. The four FullAdder components in this design are uniquely identified by the names FullAdder1, FullAdder2, FullAdder3, and FullAdder4.
In VHDL, the std_logic_vector type provides a convenient method of specifying ports or signals consisting of multiple bits. However, a design may need to access the individual bits of these vectors. The individual bits of a std_logic_vector can be accessed by specifying the desired bit position within parentheses after the vector's name. For example, to access bit 0 of the 4-bit input a of this design, one would use the syntax "a(0)". In defining the connections to the instantiated components in the carry-ripple adder, individual bits of the inputs a and b and output s are accessed using this syntax. The first full-adder, FullAdder1, connects bit 0 of the inputs a and b as well as the carry-ripple adder's carry-in, ci, to the full-adder's three inputs. The s output of FullAdder1 is connected to bit 0 of the 4-bit adder's sum output, s, represented as s(0). The design then connects the carry-out bit of FullAdder1 to the internal signal co1, which is subsequently connected to the carry-in input of the next full-adder, FullAdder2. The component connections of the remaining three full-adders are connected in a similar fashion, with the exception of the last full-adder in the carry-ripple chain. The carry-out from that last full-adder, FullAdder4, is connected to the carry-out output (co) of the carry-ripple adder.
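The carry chain described above can be mimicked in software to see how each full-adder's carry-out feeds the next stage. The following plain C++ sketch (not SystemC) assumes hypothetical names full_adder, Adder4Out, and carry_ripple_adder4; the local variable carry plays the role of the internal signals co1, co2, and co3.

```cpp
#include <cstdint>

// Illustrative plain C++ model; all names here are hypothetical.
struct FullAdderOut {
    bool s;   // sum
    bool co;  // carry-out
};

FullAdderOut full_adder(bool a, bool b, bool ci) {
    FullAdderOut out;
    out.s  = a ^ b ^ ci;
    out.co = (b & ci) | (a & ci) | (a & b);
    return out;
}

struct Adder4Out {
    std::uint8_t s;  // 4-bit sum held in the low four bits
    bool co;         // final carry-out
};

// Chains four full-adders: stage i adds bit i of a and b plus the
// carry produced by stage i-1 (stage 0 uses the external carry-in ci).
Adder4Out carry_ripple_adder4(std::uint8_t a, std::uint8_t b, bool ci) {
    bool carry = ci;
    std::uint8_t sum = 0;
    for (int i = 0; i < 4; ++i) {
        FullAdderOut fa = full_adder((a >> i) & 1, (b >> i) & 1, carry);
        if (fa.s) {
            sum |= static_cast<std::uint8_t>(1u << i);
        }
        carry = fa.co;  // corresponds to co1, co2, co3, and finally co
    }
    Adder4Out out;
    out.s = sum;
    out.co = carry;
    return out;
}
```

Unlike the structural HDL description, which instantiates four named components, the sketch reuses one function in a loop; the data flow between stages is the same.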
Verilog
Figure 9.29 is a Verilog description of a 4-bit carry-ripple adder with a carry-in, as appeared in Figure 4.33. The Verilog module, named CarryRippleAdder4, has two 4-bit inputs, a and b, and a carry-in input, ci. The carry-ripple adder outputs a 4-bit sum, s, and a final carry-out co.

module CarryRippleAdder4(a, b, ci, s, co);
   input [3:0] a;
   input [3:0] b;
   input ci;
   output [3:0] s;
   output co;

   wire co1, co2, co3;

   FullAdder FullAdder1(a[0], b[0], ci,  s[0], co1);
   FullAdder FullAdder2(a[1], b[1], co1, s[1], co2);
   FullAdder FullAdder3(a[2], b[2], co2, s[2], co3);
   FullAdder FullAdder4(a[3], b[3], co3, s[3], co);
endmodule
Figure 9.29 Structural Verilog description of a 4-bit carry-ripple adder.

The module structurally describes the carry-ripple adder composed of four full-adders. The design has three internal wires, co1, co2, and co3, that are used for internal connection between the full-adders. The module instantiates four FullAdder components. In Verilog, each instantiated component must have a unique name. The four FullAdder components in this design are uniquely identified by the names FullAdder1, FullAdder2, FullAdder3, and FullAdder4.
In Verilog, vectors provide a convenient method of specifying ports or signals consisting of multiple bits. However, a design may need to access the individual bits of these vectors. The individual bits of a vector can be accessed by specifying the desired bit position within brackets after the vector's name. For example, to access bit 0 of the 4-bit input a of this design, one would use the syntax "a[0]". In defining the connections to the
instantiated components in the carry-ripple adder, individual bits of the inputs a and b and output s are accessed using this syntax. The first full-adder, FullAdder1, connects bit 0 of the inputs a and b as well as the carry-ripple adder's carry-in, ci, to the full-adder's three inputs. The s output of FullAdder1 is connected to bit 0 of the 4-bit adder's sum output, s, represented as s[0]. The design then connects the carry-out bit of FullAdder1 to the internal signal co1, which is subsequently connected to the carry-in input of the next full-adder, FullAdder2. The component connections of the remaining three full-adders are connected in a similar fashion, with the exception of the last full-adder in the carry-ripple chain. The carry-out from the last full-adder, FullAdder4, is connected to the carry-out output (co) of the carry-ripple adder.
SystemC
Figure 9.30 is a SystemC description of a 4-bit carry-ripple adder with a carry-in, as appeared in Figure 4.33. The SystemC module, named CarryRippleAdder4, has two 4-bit inputs, a and b, and a carry-in input, ci. The carry-ripple adder outputs a 4-bit sum, s, and a final carry-out co.

#include "systemc.h"
#include "fulladder.h"

SC_MODULE(CarryRippleAdder4)
{
   sc_in<sc_logic> a[4];
   sc_in<sc_logic> b[4];
   sc_in<sc_logic> ci;
   sc_out<sc_logic> s[4];
   sc_out<sc_logic> co;

   sc_signal<sc_logic> co1, co2, co3;

   FullAdder FullAdder_1;
   FullAdder FullAdder_2;
   FullAdder FullAdder_3;
   FullAdder FullAdder_4;

   SC_CTOR(CarryRippleAdder4):
      FullAdder_1("FullAdder_1"),
      FullAdder_2("FullAdder_2"),
      FullAdder_3("FullAdder_3"),
      FullAdder_4("FullAdder_4")
   {
      FullAdder_1.a(a[0]); FullAdder_1.b(b[0]);
      FullAdder_1.ci(ci); FullAdder_1.s(s[0]);
      FullAdder_1.co(co1);

      FullAdder_2.a(a[1]); FullAdder_2.b(b[1]);
      FullAdder_2.ci(co1); FullAdder_2.s(s[1]);
      FullAdder_2.co(co2);

      FullAdder_3.a(a[2]); FullAdder_3.b(b[2]);
      FullAdder_3.ci(co2); FullAdder_3.s(s[2]);
      FullAdder_3.co(co3);

      FullAdder_4.a(a[3]); FullAdder_4.b(b[3]);
      FullAdder_4.ci(co3); FullAdder_4.s(s[3]);
      FullAdder_4.co(co);
   }
};
Figure 9.30 Structural SystemC description of a 4-bit carry-ripple adder.
The module structurally describes the carry-ripple adder composed of four full-adders. The design has three internal signals, co1, co2, and co3, that are used for internal connection between the full-adders. The module first instantiates four FullAdder components. In SystemC, each instantiated component must have a unique name. The four FullAdder components in this design are uniquely identified by the names FullAdder_1, FullAdder_2, FullAdder_3, and FullAdder_4.
Previously, we defined multiple-bit inputs as an input vector using the sc_lv type. However, SystemC does not support connecting individual bits within a signal or port of type sc_lv in a structural description. In our CarryRippleAdder4 design, we instead defined the inputs and output, a, b, and s, as arrays of sc_logic with four elements each, rather than using type sc_lv. The individual bits of the array can be accessed by specifying the desired bit position within brackets after the array's name. For example, to access bit 0 of the 4-element input array a of this design, one would use the syntax "a[0]". In defining the connections to the instantiated components in the carry-ripple adder, individual bits of the inputs a and b and output s are accessed using this syntax. The first full-adder, FullAdder_1, connects bit 0 of the inputs a and b as well as the carry-ripple adder's carry-in, ci, to the full-adder's three inputs. The s output of FullAdder_1 is connected to bit 0 of the 4-bit adder's sum output, s, represented as s[0]. The design then connects the carry-out bit of FullAdder_1 to the internal signal co1 that is subsequently connected to the carry-in input of the next full-adder, FullAdder_2. The component connections of the remaining three full-adders are connected in a similar fashion, with the exception of the last full-adder in the carry-ripple chain. The carry-out from the last full-adder, FullAdder_4, is connected to the carry-out output (co) of the carry-ripple adder.
Up-Counter
VHDL
Figure 9.31 is a VHDL description of a 4-bit up-counter, as appeared in Figure 4.48. The VHDL entity, named UpCounter, defines the counter's inputs and outputs, consisting of a clock input clk, a count enable control input cnt, the 4-bit count value C, and a terminal count output tc.
The UpCounter's architecture structurally describes the design consisting of three components, namely Reg4, Inc4, and AND4. Reg4 is a 4-bit parallel load register with a load control input ld. Inc4 is a 4-bit incrementer. AND4 is a four-input AND gate that will output 1 if and only if all four inputs are 1. The architecture further specifies two signals, tempC and incC, used as internal wires within the structural description.
library ieee;
use ieee.std_logic_1164.all;

entity UpCounter is
   port ( clk: in std_logic;
          cnt: in std_logic;
          C: out std_logic_vector(3 downto 0);
          tc: out std_logic
   );
end UpCounter;

architecture structure of UpCounter is
   component Reg4
      port ( I: in std_logic_vector(3 downto 0);
             Q: out std_logic_vector(3 downto 0);
             clk, ld: in std_logic
      );
   end component;
   component Inc4
      port ( a: in std_logic_vector(3 downto 0);
             s: out std_logic_vector(3 downto 0)
      );
   end component;
   component AND4
      port ( w, x, y, z: in std_logic;
             F: out std_logic
      );
   end component;
   signal tempC: std_logic_vector(3 downto 0);
   signal incC: std_logic_vector(3 downto 0);
begin
   Reg4_1: Reg4 port map (incC, tempC, clk, cnt);
   Inc4_1: Inc4 port map (tempC, incC);
   AND4_1: AND4 port map (tempC(3), tempC(2),
                          tempC(1), tempC(0), tc);

   outputC: process (tempC)
   begin
      C <= tempC;
   end process;
end structure;
Figure 9.31 Structural VHDL description of 4-bit up-counter.
The architecture instantiates each of the three components and specifies the connections between them. Reg4 is the only sequential component within the up-counter and thus the clk input only needs to be connected to the clock input of the register. We control the up-counter's counting by connecting the count enable input, cnt, to the load enable, ld, of the register. The output Q of Reg4_1 is connected to the internal signal tempC, which connects the register's output to both the Inc4_1 and AND4_1 components. Inc4_1 receives the current count from the tempC connection and outputs the incremented count on its output s, which is connected to the other internal signal incC. The incC signal connects the incremented count from Inc4_1 to the parallel load input I of Reg4_1. The current count is also connected to the four inputs of the AND4_1 component. The AND4_1's output F is then connected to the counter's terminal count output tc.
In the UpCounter design, we need to connect the output of the 4-bit register to the incrementer, the AND gate, and the counter's output port C. VHDL does not allow us to connect multiple signals or ports within the port map of an instantiated component. Therefore, the architecture uses the tempC signal to connect Reg4_1's output to both the AND4_1 and Inc4_1 components. We still need to connect the register's output to the output port C. The architecture makes this connection by specifying a process, named outputC, that is used to connect the output of the register to the output port C. The outputC process is sensitive to the signal tempC, previously used as an internal wire between the three components. Whenever tempC changes, which corresponds to a change in the up-counter's stored count, the outputC process assigns the new count to the output port C.
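To see how the three components cooperate, the up-counter's data flow can be modeled in a few lines of software. The following plain C++ sketch (not SystemC) assumes a hypothetical struct UpCounter4 whose register starts at 0, an assumption made here only so the example has a defined starting count; the helper names inc4 and and4 are likewise illustrative.

```cpp
#include <cstdint>

// Illustrative plain C++ model of the structural up-counter.
// UpCounter4, inc4, and and4 are hypothetical names; the initial
// register value of 0 is an assumption of this sketch.
struct UpCounter4 {
    std::uint8_t tempC = 0;  // register output Q (low four bits)

    // Inc4: 4-bit incrementer, wrapping from 15 back to 0.
    static std::uint8_t inc4(std::uint8_t a) {
        return static_cast<std::uint8_t>((a + 1) & 0xF);
    }

    // AND4: outputs true if and only if all four inputs are true.
    static bool and4(bool w, bool x, bool y, bool z) {
        return w && x && y && z;
    }

    // Output port C simply follows the register output tempC.
    std::uint8_t C() const { return tempC; }

    // Terminal count: AND of the four register output bits.
    bool tc() const {
        return and4((tempC >> 3) & 1, (tempC >> 2) & 1,
                    (tempC >> 1) & 1, tempC & 1);
    }

    // Rising clock edge: the register loads incC only when cnt
    // (wired to the register's load input ld) is asserted.
    void rising_edge(bool cnt) {
        std::uint8_t incC = inc4(tempC);
        if (cnt) {
            tempC = incC;
        }
    }
};
```

Calling rising_edge(true) fifteen times from the initial state should leave C() at 15 with tc() asserted; one further enabled edge wraps the count back to 0.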
Verilog
Figure 9.32 is a Verilog description of a 4-bit up-counter, as appeared in Figure 4.48. The Verilog module, named UpCounter, defines the counter's inputs and outputs, consisting of a clock input clk, a count enable control input cnt, the 4-bit count value C, and a terminal count output tc.

module Reg4(I, Q, clk, ld);
   input [3:0] I;
   input clk, ld;
   output [3:0] Q;
   // details not shown
endmodule

module Inc4(a, s);
   input [3:0] a;
   output [3:0] s;
   // details not shown
endmodule

module AND4(w, x, y, z, F);
   input w, x, y, z;
   output F;
   // details not shown
endmodule

module UpCounter(clk, cnt, C, tc);
   input clk, cnt;
   output [3:0] C;
   reg [3:0] C;
   output tc;

   wire [3:0] tempC;
   wire [3:0] incC;

   Reg4 Reg4_1(incC, tempC, clk, cnt);
   Inc4 Inc4_1(tempC, incC);
   AND4 AND4_1(tempC[3], tempC[2],
               tempC[1], tempC[0], tc);

   always @(tempC)
   begin
      C <= tempC;
   end
endmodule
Figure 9.32 Structural Verilog description of 4-bit up-counter.

The UpCounter's module structurally describes the design consisting of three components, namely Reg4, Inc4, and AND4. Reg4 is a 4-bit parallel load register with a load control input ld. Inc4 is a 4-bit incrementer. AND4 is a four-input AND gate that will output 1 if and only if all four inputs are 1. The module further specifies two 4-bit wires, tempC and incC, used as internal wires within the structural description.
The module instantiates each of the three components and specifies the connections between them. Reg4 is the only sequential component within the up-counter and thus the clk input only needs to be connected to the clock input of the register. We control the up-counter's counting by connecting the count enable input, cnt, to the load enable, ld, of the register. The output Q of Reg4_1 is
connected to the internal signal tempC, which connects the register's output to both the Inc4_1 and AND4_1 components. Inc4_1 receives the current count from the tempC connection and outputs the incremented count on its output s, which is connected to the other internal signal incC. The incC signal connects the incremented count from Inc4_1 to the parallel load input I of Reg4_1. The current count is also connected to the four inputs of the AND4_1 component. The AND4_1's output F is then connected to the counter's terminal count output tc.
In the UpCounter design, we need to connect the output of the 4-bit register to the incrementer, the AND gate, and the counter's output port C. Therefore, the module uses the tempC signal to connect Reg4_1's output to both the AND4_1 and Inc4_1 components. We still need to connect the register's output to the output port C. The module makes this connection by specifying an always procedure that is used to connect the output of the register to the output port C. The procedure is sensitive to the signal tempC, previously used as an internal wire between the three components. Whenever tempC changes, which corresponds to a change in the up-counter's stored count, the procedure assigns the new count to the output port C.
SystemC
Figure 9.33 is a SystemC description of a 4-bit up-counter, as appeared in Figure 4.48. The SystemC module, named UpCounter, defines the counter's inputs and outputs, consisting of a clock input clk, a count enable control input cnt, the 4-bit count value C, and a terminal count output tc.

#include "systemc.h"
#include "reg4.h"
#include "inc4.h"
#include "and4.h"

SC_MODULE(UpCounter)
{
   sc_in<sc_logic> clk, cnt;
   sc_out<sc_lv<4> > C;
   sc_out<sc_logic> tc;

   sc_signal<sc_lv<4> > tempC, incC;
   sc_signal<sc_logic> tempC_b[4];

   Reg4 Reg4_1;
   Inc4 Inc4_1;
   AND4 AND4_1;

   SC_CTOR(UpCounter) : Reg4_1("Reg4_1"),
                        Inc4_1("Inc4_1"),
                        AND4_1("AND4_1")
   {
      Reg4_1.I(incC); Reg4_1.Q(tempC);
      Reg4_1.clk(clk); Reg4_1.ld(cnt);

      Inc4_1.a(tempC); Inc4_1.s(incC);

      AND4_1.w(tempC_b[0]); AND4_1.x(tempC_b[1]);
      AND4_1.y(tempC_b[2]); AND4_1.z(tempC_b[3]);
      AND4_1.F(tc);

      SC_METHOD(comblogic);
      sensitive << tempC;
   }

   void comblogic()
   {
      tempC_b[0] = tempC.read()[0];
      tempC_b[1] = tempC.read()[1];
      tempC_b[2] = tempC.read()[2];
      tempC_b[3] = tempC.read()[3];
      C.write(tempC);
   }
};
Figure 9.33 Structural SystemC description of 4-bit up-counter.

The UpCounter's module structurally describes the design consisting of three components, namely, Reg4, Inc4, and AND4. Reg4 is a
4-bit parallel load register with a load control input ld. Inc4 is a 4-bit incrementer. AND4 is a four-input AND gate that will output 1 if and only if all four inputs are 1. The module further specifies two 4-bit signals, tempC and incC, used as internal wires within the structural description. Additionally, the module defines a four-element array of sc_logic signals, named tempC_b, used to access the individual bits within the 4-bit vector tempC.
The module first instantiates each of the three components and then specifies the connections between them. Reg4 is the only sequential component within the up-counter and thus the clk input only needs to be connected to the clock input of the register. We control the up-counter's counting by connecting the count enable input, cnt, to the load enable, ld, of the register. The output Q of Reg4_1 is connected to the internal signal tempC, which connects the register's output to Inc4_1. Inc4_1 receives the current count from the tempC connection and outputs the incremented count on its output s, which is connected to the internal signal incC. The incC signal connects the incremented count from Inc4_1 to the parallel load input I of Reg4_1. The current count is also connected to the four inputs of the AND4_1 component using the tempC_b array to access the individual bits. The AND4_1's output F is then connected to the counter's terminal count output tc.
In the UpCounter design, we need to connect the output of the 4-bit register to the incrementer, the AND gate, and the counter's output port C. Therefore, the module uses the tempC signal to connect Reg4_1's output to the Inc4_1 component and uses the tempC_b array to connect Reg4_1's output to the AND4_1 component. Thus, we still need to connect the register's output to the output port C and assign the individual bits of the register's output to the tempC_b array. The module makes these connections by defining a process, named comblogic, that is sensitive to the signal tempC. Whenever tempC changes, which corresponds to a change in the up-counter's stored count, the comblogic process assigns the new count to the output port C. Additionally, the process assigns the bits within vector tempC to the individual sc_logic signals within the tempC_b array. In order to access the individual bits of the vector signal tempC, we use the syntax "tempC.read()[0]".
9.5 RTL DESIGN USING HARDWARE DESCRIPTION LANGUAGES
We now show how to create RTL descriptions using HDLs. We will show HDL descriptions of the starting point of RTL design, namely, high-level state machines, and of the ending point of RTL design, namely, connected controllers and datapaths. RTL designers will commonly create a testbench to test the high-level state machine description, and then use that same testbench for the controller/datapath description, thus helping to verify that the designer created a correct controller/datapath implementation.
High-Level State Machine of the Laser-Based Distance Measurer
VHDL
Figures 9.34 and 9.35 present a VHDL description of a high-level state machine for the laser-based distance measurer shown in Figure 5.15. The entity, named LaserDistMeasurer, defines the inputs and outputs, including a user-pressed button input B, a laser sensor input S, a laser control output L, and a 16-bit output D for the distance measured.
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;

entity LaserDistMeasurer is
   port ( clk, rst: in std_logic;
          B, S: in std_logic;
          L: out std_logic;
          D: out unsigned(15 downto 0)
   );
end LaserDistMeasurer;

architecture behavior of LaserDistMeasurer is
   type statetype is (S0, S1, S2, S3, S4);
   signal state: statetype;
   signal Dctr: unsigned(15 downto 0);
   constant U_ZERO:
      unsigned(15 downto 0) := "0000000000000000";
   constant U_ONE: unsigned(0 downto 0) := "1";
begin
   statemachine: process (clk, rst)
   begin
      if (rst = '1') then
         L <= '0';
         D <= U_ZERO;
         Dctr <= U_ZERO;
         state <= S0; -- initial state
      elsif (clk = '1' and clk'event) then
         case state is
            when S0 =>
               L <= '0';      -- laser off
               D <= U_ZERO;   -- clear D
               state <= S1;
(continued in Figure 9.35)
Figure 9.34 Behavioral VHDL description of a high-level state machine of the laser-based distance measurer.
Instead of using a 16-bit std_logic_vector, we defined the output D as unsigned. For logic operations, an unsigned behaves the same as a std_logic_vector. However, we can also perform arithmetic operations on unsigned values. Whenever using unsigned, we must include the statement "use ieee.std_logic_arith.all;" at the top of our VHDL description. The use statement specifies which package we will use within our design. The package ieee.std_logic_arith defines the unsigned type as well as a set of operations and functions we can perform on unsigned values.
The entity also defines a clock input clk and reset input rst. We assume that the clock input is 300 MHz, as was assumed in the laser-based distance measurer design shown in Figure 5.19. We omit details of generating the 300 MHz clock (see Section 9.3 for an example of describing an oscillator).
The VHDL architecture describes the behavior of the entity. Instead of using two processes as shown in Figure 9.22, the architecture consists of a single process describing the behavior of our high-level state machine. The high-level state machine process, named statemachine, is sensitive to inputs clk and rst. If the rst is 1, then the process asynchronously sets the state signal to the state machine's initial state, S0, and initializes the outputs, L and D, and the internal counter Dctr, to their default values. The default values should correspond to the values assigned to the signals within the initial state of our high-level state machine. Notice that we defined U_ZERO as an unsigned constant, corresponding to the 16-bit unsigned value of zero. When the rst is not enabled, on the rising clock, the process evaluates the current state, assigns the appropriate outputs for the current state, determines the next state, and updates the state register signal, state. In our high-level state machine description, we only need a single state register signal to model the behavior of our state machine, instead of the two signals currentstate and nextstate we previously used in the controller design shown in Figure 9.22.

(continued from Figure 9.34)
            when S1 =>
               Dctr <= U_ZERO;   -- reset count
               if (B = '1') then
                  state <= S2;
               else
                  state <= S1;
               end if;
            when S2 =>
               L <= '1';         -- laser on
               state <= S3;
            when S3 =>
               L <= '0';         -- laser off
               Dctr <= Dctr + 1;
               if (S = '1') then
                  state <= S4;
               else
                  state <= S3;
               end if;
            when S4 =>
               D <= SHR(Dctr, U_ONE);   -- calculate D
               state <= S1;
         end case;
      end if;
   end process;
end behavior;
Figure 9.35 Behavioral VHDL description of a high-level state machine of the laser-based distance measurer (continued).

The high-level state machine for the laser-based distance measurer performs two arithmetic operations, addition and shifting. By using the unsigned type, to increment the counter signal Dctr in state S3, we use the syntax "Dctr <= Dctr + 1;". This statement will add one to the current value of Dctr and store the result in Dctr. In state S4, we calculate the distance, D, by dividing the value of Dctr by 2. However, we perform this division using a right-shift-by-one operation. To perform the shift and assign the value to the output D, we use the statement "D <= SHR(Dctr, U_ONE);". The function SHR(), defined within the ieee.std_logic_arith package, shifts the first parameter, Dctr, by the amount specified by the second parameter, U_ONE, where U_ONE is a constant we defined earlier in the architecture.
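The single-state-register style and the two arithmetic operations can be traced in software before simulating the HDL. The following plain C++ sketch (not SystemC) is only an illustration: the struct name LaserDistMeasurerModel and its reset() and clock() functions are hypothetical, and each call to clock() stands for one rising clock edge with rst not enabled.

```cpp
#include <cstdint>

// Illustrative plain C++ model of the high-level state machine.
// LaserDistMeasurerModel, reset(), and clock() are hypothetical names.
enum class State { S0, S1, S2, S3, S4 };

struct LaserDistMeasurerModel {
    State state = State::S0;
    bool L = false;          // laser control output
    std::uint16_t D = 0;     // distance output
    std::uint16_t Dctr = 0;  // internal counter

    // Asynchronous reset: default values match the initial state S0.
    void reset() {
        L = false;
        D = 0;
        Dctr = 0;
        state = State::S0;
    }

    // One rising clock edge: perform the current state's actions
    // and update the single state register.
    void clock(bool B, bool S) {
        switch (state) {
        case State::S0:
            L = false;  // laser off
            D = 0;      // clear D
            state = State::S1;
            break;
        case State::S1:
            Dctr = 0;   // reset count
            state = B ? State::S2 : State::S1;
            break;
        case State::S2:
            L = true;   // laser on
            state = State::S3;
            break;
        case State::S3:
            L = false;  // laser off
            Dctr = static_cast<std::uint16_t>(Dctr + 1);  // addition
            state = S ? State::S4 : State::S3;
            break;
        case State::S4:
            D = static_cast<std::uint16_t>(Dctr >> 1);    // divide by 2 via shift
            state = State::S1;
            break;
        }
    }
};
```

If the model spends ten clock edges in state S3 before the sensor input S is seen, Dctr reaches 10 and the next edge stores D = 5, the same right-shift-by-one result the HDL description computes.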
Verilog
Figures 9.36 and 9.37 present a Verilog description of a high-level state machine for the laser-based distance measurer shown in Figure 5.15. The module, named LaserDistMeasurer, defines the inputs and outputs, including a button input B, a
laser sensor input S, a laser control output L, and a 16-bit output D for the distance measured.
The module also defines a clock input clk and reset input rst. We assume that the clock input is 300 MHz, as was assumed in the laser-based distance measurer design shown in Figure 5.19. We omit details of generating the 300 MHz clock (see Section 9.3 for an example of describing an oscillator).
The Verilog module behaviorally describes the LaserDistMeasurer's high-level state machine. Instead of using two procedures as shown in Figure 9.23, the module consists of a single procedure describing the behavior of our high-level state machine. The high-level state machine procedure is sensitive to the positive edge of the reset input, rst, and the positive edge of the clock input, clk. If the rst is enabled, then the procedure asynchronously sets the state register, state, to the state machine's initial state, S0, and initializes the outputs, L and D, and the internal counter register, Dctr, to their default values. The default values should correspond to the values assigned to the signals within the initial state of our high-level state machine. When the rst is not enabled, on the rising clock, the procedure evaluates the current state, assigns the appropriate outputs for the current state, determines the next state, and updates the state register. In our high-level state machine description, we only need a single state register signal to model the behavior of our state machine, instead of the two register signals currentstate and nextstate we previously used in the controller design shown in Figure 9.23.

module LaserDistMeasurer(clk, rst, B, S, L, D);
   input clk, rst, B, S;
   output L;
   output [15:0] D;
   reg L;
   reg [15:0] D;

   parameter S0 = 3'b000,
             S1 = 3'b001,
             S2 = 3'b010,
             S3 = 3'b011,
             S4 = 3'b100;

   reg [2:0] state;
   reg [15:0] Dctr;

   always @(posedge rst or posedge clk)
   begin
      if (rst == 1) begin
         L <= 0;
         D <= 0;
         Dctr <= 0;
         state <= S0;   // initial state
      end
      else begin
         case (state)
            S0: begin
               L <= 0;       // laser off
               D <= 0;       // clear D
               state <= S1;
            end
            S1: begin
               Dctr <= 0;    // reset count
               if (B == 1)
                  state <= S2;
               else
                  state <= S1;
            end
            S2: begin
               L <= 1;       // laser on
               state <= S3;
            end
(continued in Figure 9.37)
Figure 9.36 Behavioral Verilog description of a high-level state machine of the laser-based distance measurer.
The high-level state machine for the laser-based distance measurer performs two arithmetic operations, addition and shifting. To increment the counter Dctr in state S3, we use the syntax "Dctr <= Dctr + 1;". This statement will add one to the current value of Dctr and store the result in Dctr. In state S4, we calculate the distance, D, by dividing the value of Dctr by 2. However, we perform this division using a right-shift-by-one operation. To perform the shift and assign the value to the output D, we use the statement "D <= Dctr >> 1;", where >> performs a right shift operation.
SystemC
Figures 9.38 and 9.39 present a SystemC description of a high-level state machine for the laser-based distance measurer shown in Figure 5.15. The module, named LaserDistMeasurer, defines the inputs and outputs, including a user-pressed button input B, a laser sensor input S, a laser control output L, and a 16-bit output D for the distance measured.
The module also defines a clock input clk and reset input rst. We assume that the clock input is 300 MHz, as was assumed in the laser-based distance measurer design shown in Figure 5.19. We omit details of generating the 300 MHz clock (see Section 9.3 for an example of describing an oscillator).
Figure 9.37 Behavioral Verilog description of a high-level state machine of the laser-based distance measurer (continued).
#include "systemc.h"
...
      SC_METHOD(statemachine);
      sensitive_pos << rst << clk;
...
      if (rst.read() == SC_LOGIC_1) {
         L.write(SC_LOGIC_0);
         D.write(0);
         Dctr = 0;
         state = S0;   // initial state
      }
      else {
         switch (state) {
            case S0:
               L.write(SC_LOGIC_0);   // laser off
               D.write(0);            // clear D
               state = S1;
               break;
            case S1:
               Dctr = 0;              // clear count
               if (B.read() == SC_LOGIC_1)
                  state = S2;
(continued in Figure 9.39)
Figure 9.38 Behavioral SystemC description of a high-level state machine of the laser-based distance measurer.
The SystemC module behaviorally describes the LaserDistMeasurer's high-level state machine. Instead of using two processes as shown in Figure 9.24, the module consists of a single process describing the behavior of our high-level state machine. The high-level state machine process, named statemachine, is sensitive to the positive edge of the reset input, rst, and the positive edge of the clock input, clk. If the rst is enabled, then the process asynchronously sets the state signal to the state machine's initial state, S0, and initializes the outputs, L and D, and the internal counter signal, Dctr, to their default values. The default values should correspond to the values assigned to the signals within the initial state of our high-level state machine. When the rst is not enabled, on the rising clock, the process evaluates the current state, assigns the appropriate outputs for the current state, determines the next state, and updates the state register signal, state. In our high-level state machine description, we only need a single state register signal to model the behavior of our state machine, instead of the two signals currentstate and nextstate we previously used in the controller design shown in Figure 9.24.

(continued from Figure 9.38)
               else
                  state = S1;
               break;
            case S2:
               L.write(SC_LOGIC_1);   // laser on
               state = S3;
               break;
            case S3:
               L.write(SC_LOGIC_0);   // laser off
               Dctr = Dctr.read() + 1;
               if (S.read() == SC_LOGIC_1)
                  state = S4;
               else
                  state = S3;
               break;
            case S4:
               D.write(Dctr.read() >> 1);   // calculate D
               state = S1;
               break;
         }
      }
   }
};
Figure 9.39 Behavioral SystemC description of a high-level state machine of the laser-based distance measurer (continued).
The high-level state machine for the laser-based distance measurer performs two arithmetic operations, addition and shifting. To increment the counter Dctr in state S3, we use the syntax "Dctr = Dctr.read() + 1;". This statement will add one to the current value of Dctr and store the result in Dctr. In state S4, we calculate the distance, D, by dividing the value of Dctr by 2. However, we perform this division using a right-shift-by-one operation. To perform the shift and assign the value to the output D, we use the statement "D.write(Dctr.read() >> 1);", where >> performs a right shift operation.
Controller and Datapath of the Laser-Based Distance Measurer
VHDL
Figure 9.40 is a VHDL description of the laser-based distance measurer shown in Figure
5.19. The entity, named LaserDistMeasurer, defines the inputs and outputs, including a
user-pressed button input B, a laser sensor input S, a laser control output L, and a 16-bit
output D for the distance measured. The entity also defines a 300 MHz clock input clk
and reset input rst for the design's controller.
Figure 9.40 Structural description of top-level VHDL description of laser-based distance
measurer.
The architecture structurally describes the connections of the controller and datapath
components. The architecture instantiates two components. LDM_Controller_1 is the
controller for the laser-based distance measurer and LDM_Datapath_1 is the datapath for
this design. The architecture connects the entity's clk, rst, B, and S inputs to the inputs
of LDM_Controller_1 and connects the controller's laser control output to the corresponding
output port L. Additionally, the four signals, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt,
connect the controller's four control signals to the four inputs of LDM_Datapath_1. The
LaserDistMeasurer datapath has a single output D, providing the distance measured, that is
connected to the output port D of the entity.
Figure 9.41 is a VHDL description of the LaserDistMeasurer's datapath component
shown in Figure 5.17. The entity, named LDM_Datapath, defines a clock input clk, four
control inputs Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt, and a 16-bit distance output D.
The architecture defines three components, a 16-bit up-counter, a 16-bit register,
and a 16-bit right shifter that shifts right by one position. UpCounter16 is a 16-bit
up-counter with a count control input cnt and a count clear input clr. Reg16 is a 16-bit
Figure 9.41 Structural VHDL description of the laser-based distance measurer's datapath.
library ieee;
use ieee.std_logic_1164.all;

entity LDM_Datapath is
   port ( clk: in std_logic;
          Dreg_clr, Dreg_ld: in std_logic;
          Dctr_clr, Dctr_cnt: in std_logic;
          D: out std_logic_vector(15 downto 0)
   );
end LDM_Datapath;

architecture structure of LDM_Datapath is
   component UpCounter16
      port ( clk: in std_logic;
             clr, cnt: in std_logic;
             C: out std_logic_vector(15 downto 0)
      );
   end component;
   component Reg16
      port ( I: in std_logic_vector(15 downto 0);
             Q: out std_logic_vector(15 downto 0);
             clk, clr, ld: in std_logic
      );
   end component;
   component ShiftRightOne16
      port ( I: in std_logic_vector(15 downto 0);
             S: out std_logic_vector(15 downto 0)
      );
   end component;
   signal tempC: std_logic_vector(15 downto 0);
   signal shiftC: std_logic_vector(15 downto 0);
begin
   Dctr: UpCounter16
      port map (clk, Dctr_clr, Dctr_cnt, tempC);
   ShiftRight: ShiftRightOne16
      port map (tempC, shiftC);
   Dreg: Reg16
      port map (shiftC, D, clk, Dreg_clr, Dreg_ld);
end structure;
parallel load register with a register load control signal ld and a register clear signal clr.
ShiftRightOne16 is a 16-bit right shifter that shifts the input I right by one position and
assigns the shifted value to the output S. The architecture instantiates an UpCounter16
component named Dctr, a Reg16 component named Dreg, and a ShiftRightOne16 component
named ShiftRight. Dctr's instantiation connects the datapath's Dctr_clr and Dctr_cnt
inputs to Dctr's clear and count control inputs. Dctr's count output C is then connected
to the architecture's internal signal tempC that connects the count value to the
ShiftRight shifter's input. The shifted count is then connected to the input of the Dreg
register using the internal signal shiftC. The instantiation of the Dreg register connects
the register's clear and load control inputs to the datapath's Dreg_clr and Dreg_ld input
ports. Finally, the register's data output Q is connected to LDM_Datapath's measured
distance output D.
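The counter, shifter, and register chain described above can be checked with a small cycle-level software model. The following Python sketch is only an illustration, not HDL. The class name `DatapathModel`, the choice that clear takes priority over count or load, and the synchronous treatment of the clear inputs are assumptions that the text does not state.

```python
MASK16 = 0xFFFF  # keep values within 16 bits


class DatapathModel:
    """Cycle-level sketch of LDM_Datapath: Dctr -> ShiftRight -> Dreg."""

    def __init__(self):
        self.dctr = 0  # contents of the up-counter (drives tempC)
        self.dreg = 0  # contents of the register (drives output D)

    def rising_edge(self, dreg_clr, dreg_ld, dctr_clr, dctr_cnt):
        """Apply one rising clock edge with the given control inputs."""
        shift_c = self.dctr >> 1  # the shifter is combinational: tempC / 2

        # Both storage elements sample their inputs on the same edge, so
        # compute the next values first and update them together.
        if dctr_clr:
            next_dctr = 0
        elif dctr_cnt:
            next_dctr = (self.dctr + 1) & MASK16
        else:
            next_dctr = self.dctr

        if dreg_clr:
            next_dreg = 0
        elif dreg_ld:
            next_dreg = shift_c
        else:
            next_dreg = self.dreg

        self.dctr, self.dreg = next_dctr, next_dreg
        return self.dreg


dp = DatapathModel()
dp.rising_edge(dreg_clr=1, dreg_ld=0, dctr_clr=1, dctr_cnt=0)  # clear both
for _ in range(10):
    dp.rising_edge(0, 0, 0, 1)                                 # count ten cycles
print(dp.rising_edge(0, 1, 0, 0))                              # load count / 2 into Dreg
```

After ten counted cycles the register loads the shifted count, so the model's output D holds half the cycle count, mirroring the divide-by-2 performed by the right shifter.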
Figure 9.42 and Figure 9.43 are the VHDL description of the laser-based distance
measurer's FSM controller described in Figure 5.21. The entity, named LDM_Controller,
defines a clock input clk, a reset signal rst, a user-pressed button input B, a laser sensor
input S, and five output control signals, L, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt. The
output L is used to turn the laser on and off, where if L is 1, the laser is on. The four
other output signals are used to control the RTL design's datapath components.
The VHDL architecture describes the behavior of the entity. Similar to the controller
design shown in Figure 9.22, the architecture consists of two processes, one modeling the
state register, the other modeling the combinational logic. The state register process,
named statereg, is sensitive to inputs clk and rst. If the rst is enabled, then the process
asynchronously sets the currentstate signal to the FSM's initial state, S0. Otherwise, if the
clock is rising, the process updates the state register with the next state.
library ieee;
use ieee.std_logic_1164.all;

entity LDM_Controller is
   port ( clk, rst: in std_logic;
          B, S: in std_logic;
          L: out std_logic;
          Dreg_clr, Dreg_ld: out std_logic;
          Dctr_clr, Dctr_cnt: out std_logic
   );
end LDM_Controller;

architecture behavior of LDM_Controller is
   type statetype is (S0, S1, S2, S3, S4);
   signal currentstate, nextstate: statetype;
begin
   statereg: process (clk, rst)
   begin
      if (rst='1') then
         currentstate <= S0; -- initial state
      elsif (clk='1' and clk'event) then
         currentstate <= nextstate;
      end if;
   end process;

   comblogic: process (currentstate, B, S)
   begin
      L <= '0';
      Dreg_clr <= '0';
      Dreg_ld <= '0';
      Dctr_clr <= '0';
      Dctr_cnt <= '0';
      case currentstate is
         when S0 =>
            L <= '0';          -- laser off
            Dreg_clr <= '1';   -- clear Dreg
            nextstate <= S1;
         when S1 =>
            Dctr_clr <= '1';   -- clear count
            if (B='1') then
               nextstate <= S2;
            else
               nextstate <= S1;
            end if;
(continued in Figure 9.43)
Figure 9.42 Behavioral VHDL description of laser-based distance measurer's controller.
The second process, named comblogic, is sensitive to the inputs to the combinational
logic of Figure 5.21, namely, the external inputs B and S, and the state register output
currentstate. When any of those items changes, the process sets the FSM's outputs, in this
case L, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt, with the appropriate value for the
current state. In the controller example of Figure 9.22, the FSM's output x was defined
within the case statement for all possible states. With five outputs that must be defined
in the LDM_Controller and five possible states, assigning the values to all outputs in each
state would be cumbersome. Furthermore, finding a mistake and making
corrections or modifications to the controller would become very difficult in a larger FSM
consisting of more states and having many more outputs. The comblogic process uses a
different approach in which a default value for the outputs is first assigned and only the
deviations from the defaults are assigned later. The comblogic process first assigns a
default value of 0 to all five outputs. The process then evaluates the current state and
assigns the values to the outputs only when the output should be 1. The process also
assigns the value 0 to several signals within the case statements; however, these
assignments are included only to clearly indicate the behavior of the controller (they are
redundant, but help make the description easier to understand).
(continued from Figure 9.42)
         when S2 =>
            L <= '1';          -- laser on
            nextstate <= S3;
         when S3 =>
            L <= '0';          -- laser off
            Dctr_cnt <= '1';   -- count up
            if (S='1') then
               nextstate <= S4;
            else
               nextstate <= S3;
            end if;
         when S4 =>
            Dreg_ld <= '1';    -- load Dreg
            Dctr_cnt <= '0';   -- stop counting
            nextstate <= S1;
      end case;
   end process;
end behavior;
Figure 9.43 Behavioral VHDL description of laser-based distance measurer's controller
(continued).
The process also determines what the next state should be, based on the current state
and the values of inputs B and S. The next state will be loaded into the state register by
the state register process on the next rising clock edge.
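The default-then-deviation style of the combinational process can be illustrated with a plain software function. The following Python sketch is a model for illustration, not HDL. The function name `comblogic_model` and the use of a dictionary for the outputs are assumptions; the states, outputs, and transitions follow the controller described in this section.

```python
def comblogic_model(currentstate, B, S):
    """Return (outputs, nextstate) for the controller's combinational logic."""
    # Step 1: give every output its default value of 0.
    out = {"L": 0, "Dreg_clr": 0, "Dreg_ld": 0, "Dctr_clr": 0, "Dctr_cnt": 0}

    # Step 2: per state, assign only the outputs that deviate from the default.
    if currentstate == "S0":
        out["Dreg_clr"] = 1                      # clear Dreg
        nextstate = "S1"
    elif currentstate == "S1":
        out["Dctr_clr"] = 1                      # clear count
        nextstate = "S2" if B else "S1"          # wait for the button
    elif currentstate == "S2":
        out["L"] = 1                             # laser on
        nextstate = "S3"
    elif currentstate == "S3":
        out["Dctr_cnt"] = 1                      # count up
        nextstate = "S4" if S else "S3"          # wait for the sensor
    elif currentstate == "S4":
        out["Dreg_ld"] = 1                       # load Dreg
        nextstate = "S1"
    else:
        raise ValueError("unknown state: %r" % currentstate)
    return out, nextstate


outputs, nxt = comblogic_model("S3", B=0, S=1)
print(outputs, nxt)
```

Each state mentions one output instead of five, so adding an output later means touching the default list and only the states where that output is 1.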
Verilog
Figure 9.44 is a Verilog description of the laser-based distance measurer shown in
Figure 5.19. The module, named LaserDistMeasurer, defines the inputs and outputs,
including a user-pressed button input B, a laser sensor input S, a laser control output L,
and a 16-bit output D for the distance measured. The module also defines a 300 MHz clock
input clk and reset input rst for the design's controller.
module LaserDistMeasurer(clk, rst, B, S, L, D);
   input clk, rst, B, S;
   output L;
   output [15:0] D;
   wire Dreg_clr, Dreg_ld;
   wire Dctr_clr, Dctr_cnt;

   LDM_Controller
      LDM_Controller_1(clk, rst, B, S, L,
                       Dreg_clr, Dreg_ld,
                       Dctr_clr, Dctr_cnt);
   LDM_Datapath
      LDM_Datapath_1(clk, Dreg_clr, Dreg_ld,
                     Dctr_clr, Dctr_cnt, D);
endmodule
Figure 9.44 Structural description of top-level Verilog description of laser-based distance
measurer.
The LaserDistMeasurer structurally describes the connections of the controller and
datapath components. The module instantiates two components. LDM_Controller_1 is
the controller for the laser-based distance measurer and LDM_Datapath_1 is the datapath
for this design. The module connects its clk, rst, B, and S inputs to the inputs of
LDM_Controller_1 and connects the controller's laser control output to the corresponding
output port L. Additionally, the four internal wires, Dreg_clr, Dreg_ld, Dctr_clr, and
Dctr_cnt, connect the controller's four control signals to the four inputs of
LDM_Datapath_1. The LaserDistMeasurer datapath has a single output D, providing the
distance measured, that is connected to the output port D of the module.
Figure 9.45 is a Verilog description of the LaserDistMeasurer's datapath component
shown in Figure 5.17. The module, named LDM_Datapath, defines a clock input clk, four
control inputs Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt, and a 16-bit distance output D.
The datapath consists of three components, a 16-bit up-counter, a 16-bit register, and a
16-bit right shifter that shifts right by one position. UpCounter16 is a 16-bit up-counter
with a count control input cnt and a count clear input clr. Reg16 is a 16-bit parallel load
register with a register load control signal ld and a register clear signal clr.
ShiftRightOne16 is a 16-bit right shifter that shifts the input I right by one position and
assigns the shifted value to the output S. The datapath module instantiates an UpCounter16
component named Dctr, a Reg16 component named Dreg, and a ShiftRightOne16 component
named ShiftRight. The module connects the datapath's Dctr_clr and Dctr_cnt inputs to
Dctr's clear and count control inputs, respectively. The counter's count output C is then
connected to the 16-bit internal wire tempC that connects the count value to the ShiftRight
shifter's input. The shifted count is then connected to the input of the Dreg register using
the internal 16-bit wire shiftC. The module connects the Dreg register's clear and load
control inputs to the datapath's Dreg_clr and Dreg_ld input ports. Finally, the register's
data output Q is connected to LDM_Datapath's measured distance output D.
module UpCounter16(clk, clr, cnt, C);
   input clk, clr, cnt;
   output [15:0] C;
   // details not shown
endmodule

module Reg16(I, Q, clk, clr, ld);
   input [15:0] I;
   input clk, clr, ld;
   output [15:0] Q;
   // details not shown
endmodule

module ShiftRightOne16(I, S);
   input [15:0] I;
   output [15:0] S;
   // details not shown
endmodule

module LDM_Datapath(clk, Dreg_clr, Dreg_ld,
                    Dctr_clr, Dctr_cnt, D);
   input clk;
   input Dreg_clr, Dreg_ld;
   input Dctr_clr, Dctr_cnt;
   output [15:0] D;
   wire [15:0] tempC, shiftC;

   UpCounter16 Dctr(clk, Dctr_clr, Dctr_cnt, tempC);
   ShiftRightOne16 ShiftRight(tempC, shiftC);
   Reg16 Dreg(shiftC, D, clk, Dreg_clr, Dreg_ld);
endmodule
Figure 9.45 Structural Verilog description of the laser-based distance measurer's datapath.
Figures 9.46 and 9.47 are the Verilog description of the laser-based distance measurer's
FSM controller described in Figure 5.21. The module, named LDM_Controller, defines a
clock input clk, a reset signal rst, a user-pressed button input B, a laser sensor input S, and
five output control signals, L, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt. The output L is
used to turn the laser on and off, where if L is 1, the laser is on. The four other output
signals are used to control the RTL design's datapath components.
Figure 9.46 Behavioral Verilog description of laser-based distance measurer's controller.
module LDM_Controller(clk, rst, B, S, L, Dreg_clr,
                      Dreg_ld, Dctr_clr,
                      Dctr_cnt);
   input clk, rst, B, S;
   output L;
   output Dreg_clr, Dreg_ld;
   output Dctr_clr, Dctr_cnt;
   reg L;
   reg Dreg_clr, Dreg_ld;
   reg Dctr_clr, Dctr_cnt;

   parameter S0 = 3'b000,
             S1 = 3'b001,
             S2 = 3'b010,
             S3 = 3'b011,
             S4 = 3'b100;

   reg [2:0] currentstate;
   reg [2:0] nextstate;

   always @(posedge rst or posedge clk)
   begin
      if (rst==1)
         currentstate <= S0; // initial state
      else
         currentstate <= nextstate;
   end

   always @(currentstate or B or S)
   begin
      L <= 0;
      Dreg_clr <= 0;
      Dreg_ld <= 0;
      Dctr_clr <= 0;
      Dctr_cnt <= 0;
      case (currentstate)
         S0: begin
            L <= 0;          // laser off
            Dreg_clr <= 1;   // clear Dreg
            nextstate <= S1;
         end
(continued in Figure 9.47)
The Verilog module behaviorally describes the LaserDistMeasurer's FSM. Similar to
the controller design shown in Figure 9.23, the module consists of two procedures, one
modeling the state register, the other modeling the FSM's control logic. The state register
procedure is sensitive to the positive edge of the reset input, rst, and the positive edge of
the clock input, clk. If the rst input is enabled, then the procedure asynchronously sets the
currentstate signal to the FSM's initial state, S0. Otherwise, on the rising edge of the
clock, the procedure updates the state register with the next state.
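The behavior of a state register with an asynchronous reset can be made concrete with a small event-driven software model. The following Python sketch is only an illustration, not HDL. The class name `StateRegister` and its `update` method are hypothetical; the method is meant to be called whenever clk or rst changes, mirroring a procedure that is sensitive to the positive edges of both signals.

```python
class StateRegister:
    """Sketch of a state register with asynchronous, active-high reset."""

    def __init__(self):
        self.currentstate = "S0"
        self._prev_clk = 0
        self._prev_rst = 0

    def update(self, clk, rst, nextstate):
        """Evaluate the register after clk and/or rst take new values."""
        rising_clk = clk == 1 and self._prev_clk == 0
        rising_rst = rst == 1 and self._prev_rst == 0
        # The body runs only on a positive edge of rst or clk.
        if rising_clk or rising_rst:
            if rst == 1:
                self.currentstate = "S0"       # reset wins, no clock needed
            else:
                self.currentstate = nextstate  # normal clocked update
        self._prev_clk, self._prev_rst = clk, rst
        return self.currentstate


reg = StateRegister()
print(reg.update(clk=1, rst=0, nextstate="S1"))  # rising clock edge loads nextstate
print(reg.update(clk=0, rst=0, nextstate="S2"))  # falling edge: state is held
print(reg.update(clk=0, rst=1, nextstate="S2"))  # reset takes effect without a clock edge
```

The last call shows the asynchronous part: the register returns to the initial state as soon as rst rises, even though the clock did not rise.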
The second procedure is sensitive to the inputs to the combinational logic of Figure
5.21, namely, the external inputs B and S, and the state register output currentstate. When
any of those items changes, the procedure sets the FSM's outputs, in this case L, Dreg_clr,
Dreg_ld, Dctr_clr, and Dctr_cnt, with the appropriate value for the current state. In the
controller example of Figure 9.22, the FSM's output x was defined within the case
statement for all possible states. With five outputs that must be defined in the
LDM_Controller and five possible states, assigning the values to all outputs in each state
would be cumbersome. Furthermore, finding a mistake and making corrections or
modifications to the controller would become very difficult in a larger FSM consisting of
more states and having many more outputs. Instead, the procedure uses a different
approach in which a default value for all the outputs is first assigned and only the
deviations from the defaults are assigned later. The procedure first assigns a default value
of 0 to all five outputs. The procedure then evaluates the current state and assigns the
values to the outputs only when the output should be 1. The procedure also assigns the
value 0 to several signals within the case statements; however, these assignments are
included only to clearly indicate the behavior of the controller (they are redundant, but
help make the description easier to understand).
The procedure also determines what the next state should be, based on the current
state and the values of inputs B and S. The next state will be loaded into the state register
by the state register procedure on the next positive clock edge.
(continued from Figure 9.46)
         S1: begin
            Dctr_clr <= 1;   // clear count
            if (B==1)
               nextstate <= S2;
            else
               nextstate <= S1;
         end
         S2: begin
            L <= 1;          // laser on
            nextstate <= S3;
         end
         S3: begin
            L <= 0;          // laser off
            Dctr_cnt <= 1;   // count up
            if (S==1)
               nextstate <= S4;
            else
               nextstate <= S3;
         end
         S4: begin
            Dreg_ld <= 1;    // load Dreg
            Dctr_cnt <= 0;   // stop counting
            nextstate <= S1;
         end
      endcase
   end
endmodule
Figure 9.47 Behavioral Verilog description of laser-based distance measurer's controller
(continued).
SystemC
Figure 9.48 is a SystemC description of the laser-based distance measurer shown in
Figure 5.19. The module, named LaserDistMeasurer, defines the inputs and outputs,
including a user-pressed button input B, a laser sensor input S, a laser control output L,
and a 16-bit output D for the distance measured. The module also defines a 300 MHz clock
input clk and reset input rst for the design's controller.
The LaserDistMeasurer structurally describes the connections of the controller and
datapath components. The module instantiates two components. LDM_Controller_1 is the
controller for the laser-based distance measurer and LDM_Datapath_1 is the datapath for
this design. The module connects its clk, rst, B, and S inputs to the inputs of
LDM_Controller_1 and connects the controller's laser control output to the corresponding
output port L. Additionally, the four internal signals, Dreg_clr, Dreg_ld, Dctr_clr, and
Dctr_cnt, connect the controller's four control signals to the four inputs of
LDM_Datapath_1. The LaserDistMeasurer datapath has a single output D, providing the
distance measured, that is connected to the output port D of the module.
#include "systemc.h"
#include "LDM_Controller.h"
#include "LDM_Datapath.h"

SC_MODULE(LaserDistMeasurer)
{
   sc_in<sc_logic> clk, rst;
   sc_in<sc_logic> B, S;
   sc_out<sc_logic> L;
   sc_out<sc_lv<16> > D;

   sc_signal<sc_logic> Dreg_clr, Dreg_ld;
   sc_signal<sc_logic> Dctr_clr, Dctr_cnt;

   LDM_Controller LDM_Controller_1;
   LDM_Datapath LDM_Datapath_1;

   SC_CTOR(LaserDistMeasurer) :
      LDM_Controller_1("LDM_Controller_1"),
      LDM_Datapath_1("LDM_Datapath_1")
   {
      LDM_Controller_1.clk(clk);
      LDM_Controller_1.rst(rst);
      LDM_Controller_1.B(B);
      LDM_Controller_1.S(S);
      LDM_Controller_1.L(L);
      LDM_Controller_1.Dreg_clr(Dreg_clr);
      LDM_Controller_1.Dreg_ld(Dreg_ld);
      LDM_Controller_1.Dctr_clr(Dctr_clr);
      LDM_Controller_1.Dctr_cnt(Dctr_cnt);

      LDM_Datapath_1.clk(clk);
      LDM_Datapath_1.Dreg_clr(Dreg_clr);
      LDM_Datapath_1.Dreg_ld(Dreg_ld);
      LDM_Datapath_1.Dctr_clr(Dctr_clr);
      LDM_Datapath_1.Dctr_cnt(Dctr_cnt);
      LDM_Datapath_1.D(D);
   }
};
Figure 9.48 Structural description of top-level SystemC description of laser-based distance
measurer.
Figure 9.49 is a SystemC description of the LaserDistMeasurer's datapath component
shown in Figure 5.17. The module, named LDM_Datapath, defines a clock input clk, four
control inputs Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt, and a 16-bit distance output D.
#include "systemc.h"
#include "upcounter16.h"
#include "reg16.h"
#include "shiftrightone16.h"

SC_MODULE(LDM_Datapath)
{
   sc_in<sc_logic> clk;
   sc_in<sc_logic> Dreg_clr, Dreg_ld;
   sc_in<sc_logic> Dctr_clr, Dctr_cnt;
   sc_out<sc_lv<16> > D;

   sc_signal<sc_lv<16> > tempC;
   sc_signal<sc_lv<16> > shiftC;

   UpCounter16 Dctr;
   Reg16 Dreg;
   ShiftRightOne16 ShiftRight;

   SC_CTOR(LDM_Datapath) :
      Dctr("Dctr"), Dreg("Dreg"),
      ShiftRight("ShiftRight")
   {
      Dctr.clk(clk);
      Dctr.clr(Dctr_clr);
      Dctr.cnt(Dctr_cnt);
      Dctr.C(tempC);

      ShiftRight.I(tempC);
      ShiftRight.S(shiftC);

      Dreg.I(shiftC);
      Dreg.Q(D);
      Dreg.clk(clk);
      Dreg.clr(Dreg_clr);
      Dreg.ld(Dreg_ld);
   }
};
Figure 9.49 Structural SystemC description of the laser-based distance measurer's datapath.
The datapath consists of three components, a 16-bit up-counter, a 16-bit register, and a
16-bit right shifter that shifts right by one position. UpCounter16 is a 16-bit up-counter
with a count control input cnt and a count clear input clr. Reg16 is a 16-bit parallel load
register with a register load control signal ld and a register clear signal clr.
ShiftRightOne16 is a 16-bit right shifter that shifts the input I right by one position and
assigns the shifted value to the output S. The datapath module instantiates an UpCounter16
component named Dctr, a Reg16 component named Dreg, and a ShiftRightOne16 component
named ShiftRight. The module connects the datapath's Dctr_clr and Dctr_cnt inputs to
Dctr's clear and count control inputs, respectively. The counter's count output C is then
connected to the 16-bit internal signal tempC that connects the count value to the
ShiftRight shifter's input. The shifted count value is then connected to the input of the
Dreg register using the internal signal shiftC. The module connects the Dreg register's
clear and load control inputs to the datapath's Dreg_clr and Dreg_ld input ports. Finally,
the register's data output Q is connected to LDM_Datapath's measured distance output D.
Figures 9.50 and 9.51 are the SystemC description of the laser-based distance measurer's
FSM controller described in Figure 5.21. The module, named LDM_Controller, has a clock
input clk, a reset signal rst, a user-pressed button input B, a laser sensor input S, and
five output control signals, L, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt. The output L is
used to turn the laser on and off, where if L is 1, the laser is on. The four other output
signals are used to control the RTL design's datapath components.
Figure 9.50 Behavioral SystemC description of laser-based distance measurer's controller.
#include "systemc.h"

enum statetype { S0, S1, S2, S3, S4 };

SC_MODULE(LDM_Controller)
{
   sc_in<sc_logic> clk, rst, B, S;
   sc_out<sc_logic> L;
   sc_out<sc_logic> Dreg_clr, Dreg_ld;
   sc_out<sc_logic> Dctr_clr, Dctr_cnt;

   sc_signal<statetype> currentstate, nextstate;

   SC_CTOR(LDM_Controller)
   {
      SC_METHOD(statereg);
      sensitive_pos << rst << clk;
      SC_METHOD(comblogic);
      sensitive << currentstate << B << S;
   }

   void statereg() {
      if ( rst.read() == SC_LOGIC_1 )
         currentstate = S0; // initial state
      else
         currentstate = nextstate;
   }

   void comblogic() {
      L.write(SC_LOGIC_0);
      Dreg_clr.write(SC_LOGIC_0);
      Dreg_ld.write(SC_LOGIC_0);
      Dctr_clr.write(SC_LOGIC_0);
      Dctr_cnt.write(SC_LOGIC_0);
      switch (currentstate) {
         case S0:
            L.write(SC_LOGIC_0);          // laser off
            Dreg_clr.write(SC_LOGIC_1);   // clear Dreg
            nextstate = S1;
            break;
(continued in Figure 9.51)
The SystemC module behaviorally describes the LaserDistMeasurer's FSM. Similar
to the controller design shown in Figure 9.24, the module consists of two processes, one
modeling the state register, the other modeling the FSM's control logic. The state register
process, named statereg, is sensitive to the positive edge of the reset input, rst, and the
positive edge of the clock input, clk. If the rst is enabled, then the process asynchronously
sets the currentstate to the FSM's initial state, S0. Otherwise, on the rising edge of the
clock, the process updates the state register with the nextstate.
The second process, named comblogic, is sensitive to the inputs to the combinational
logic of Figure 5.21, namely, the external inputs B and S, and the state register output
currentstate. When any of those signals changes, the process sets the FSM's outputs, in this
case L, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt, with the appropriate value for the
current state. In the controller example of Figure 9.24, the FSM's output x was defined
within the case statement for all possible states. With five outputs that we must define in
the LDM_Controller and five possible states, assigning the values to all outputs in each
state would be cumbersome. Furthermore, finding a mistake and making corrections or
modifications to the controller would become very difficult in a larger FSM consisting of
more states and having many more outputs. Instead, the process uses a different approach
in which a default value for all the outputs is first assigned and only the deviations from
the defaults are assigned later. The process first assigns a default value of 0 to all five
outputs. The process then evaluates the current state and assigns the values to the outputs
only when the output should be 1. The process also assigns the value 0 to several signals
within the case statements; however, these assignments are included only to clearly
indicate the behavior of the controller (they are redundant, but help make the description
easier to understand).
Figure 9.51 Behavioral SystemC description of laser-based distance measurer's controller
(continued).
(continued from Figure 9.50)
         case S1:
            Dctr_clr.write(SC_LOGIC_1);   // clear count
            if (B.read() == SC_LOGIC_1)
               nextstate = S2;
            else
               nextstate = S1;
            break;
         case S2:
            L.write(SC_LOGIC_1);          // laser on
            nextstate = S3;
            break;
         case S3:
            L.write(SC_LOGIC_0);          // laser off
            Dctr_cnt.write(SC_LOGIC_1);   // count up
            if (S.read() == SC_LOGIC_1)
               nextstate = S4;
            else
               nextstate = S3;
            break;
         case S4:
            Dreg_ld.write(SC_LOGIC_1);    // load Dreg
            Dctr_cnt.write(SC_LOGIC_0);   // stop counting
            nextstate = S1;
            break;
      }
   }
};
The process also determines what the next state should be, based on the current state
and the values of inputs B and S. The next state will be loaded into the state register by
the state register process on the next positive clock edge.
9.6 CHAPTER SUMMARY
In this chapter, we stated that hardware description languages (HDLs) are widely used in
modern digital design. We provided brief introductions to several widely used HDLs,
namely, VHDL, Verilog, and SystemC. We introduced those HDLs primarily through the
use of examples, illustrating how each HDL might be used to describe combinational
logic, sequential logic, datapath components, as well as RTL behavior and structure. To
become proficient at the use of HDLs, a more thorough study of a particular HDL might
be helpful. This chapter also illustrates the point that different HDLs have several
commonalities.
9.7 EXERCISES
The following exercises can be completed using any of the HDLs described in this
chapter.
SECTION 9.2: COMBINATIONAL LOGIC DESCRIPTION USING HARDWARE
DESCRIPTION LANGUAGES
9.1 Create a structural HDL description of the binary number to seven-segment display described
in Example 2.23, consisting of the simple logic gates INV, AND2, and OR2. Be sure to include
combinational behavioral descriptions of the simple logic gates.
9.2 Create combinational behavioral HDL descriptions for each of the following two-input logic
gates, where each logic gate has two inputs, a and b, and a single output F:
(a) NAND2,
(b) NOR2,
(c) XOR2,
(d) XNOR2.
9.3 (a) Create a combinational behavioral HDL description of the three 1s pattern detector of
Example 2.24.
(b) Create a testbench that checks that your description works properly.
9.4 (a) Create a combinational behavioral HDL description of the Number-of-1s counter shown
in Figure 2.41, by describing the combinational behavior of both outputs x and y in
sum-of-minterms form.
(b) Create a testbench that checks that your description works properly.
9.5 Create an HDL description of the 2x4 decoder shown in Figure 2.50, as:
(a) combinational behavior.
(b) structure.
(c) Create a testbench to test either description (the same testbench can test either
description).
9.6 Create an HDL description of the 4x1 multiplexer described in Figure 2.55, as:
(a) combinational behavior.
(b) structure.
(c) Create a testbench to test either description (the same testbench can test either
description).
9.7 Create a behavioral HDL description of a 2x1 multiplexor described in Figure 2.54. Then,
create an HDL description that combines three 2x1 multiplexors to create a 4x1 multiplexor
as shown in Figure 9.52.
9.8 Create a combinational behavioral HDL description of an 8-bit 4x1 multiplexor. Be sure to
specify the design input and output ports using a multiple-bit data type.
9.9 Clearly explain the difference between a structural HDL description and a behavioral HDL
description. Explain the benefits of using both kinds of descriptions.
9.10 Explain why a combinational behavioral HDL description must include all the combinational
circuit's inputs in a sensitivity list. In particular, explain why omitting an input actually
describes a sequential circuit.
Figure 9.52 4x1 multiplexor composed of three 2x1 multiplexors.
9.11 Create a behavioral HDL description of a 16x4 priority encoder. The priority encoder has
16 inputs, d15, d14, ..., d1, d0, and four outputs e3, e2, e1, e0. The priority encoder outputs a
4-bit binary number indicating which of the 16 inputs is a 1. If more than one input is a 1, the
priority encoder will output the binary number for the highest numbered input.
SECTION 9.3: SEQUENTIAL LOGIC DESCRIPTION USING HARDWARE DESCRIP-
TION LANGUAGES
9.12 (a) Create a behavioral HDL description of a 32-bit parallel load register.
(b) Create a testbench to test the description.
9.13 (a) Create a behavioral HDL description of the FSM controller for the improved code detector
described in Figure 3.46.
(b) Create a testbench to test the description.
9.14 (a) Create a behavioral HDL description of the button press synchronizer described in Figure
3.53.
(b) Create a testbench to test the description.
9.15 (a) Create a behavioral HDL description of the secure car key controller described in Figures
3.57 and 3.58.
(b) Create a testbench to test the description.
SECTION 9.4: DATAPATH COMPONENT DESCRIPTION USING HARDWARE
DESCRIPTION LANGUAGES
9.16 (a) Create a behavioral HDL description of an 8-bit parallel load register with register clear
input clr.
(b) Create a testbench to test the description.
9.17 (a) Create a behavioral HDL description of an 8-bit parallel load register with a clear low
input clr_l and a set high input. When the clr_l input is 1, the register contents
should be cleared to "00000000". When the set high input is 1, the register's contents
should be set to "11111111". If both inputs are 1, the clear low input has priority.
(b) Create a testbench to test the description.
9.18 Create a behavioral HDL description of an 8-bit register with two control inputs s0 and s1 with
the control behavior described in Figure 9.53.

   s1  s0  Operation
   0   0   Maintain present value
   0   1   Parallel load
   1   0   Shift right
   1   1   Rotate right

Figure 9.53 Operation table of the 8-bit register for Exercise 9.18.
9.19 Create a structural HDL description of a half-adder.
9.20 Create a structural HDL description of a 4-bit carry-ripple adder without a carry input. First
create a behavioral description of a full-adder, and then use the full-adder component in your
carry-ripple adder description.
9.21 Create a behavioral HDL description of the approximate Celsius-to-Fahrenheit converter
described in Figure 4.40.
9.22 Create a behavioral HDL description of an approximate Fahrenheit-to-Celsius converter using
the following approximation for the conversion: C = (F - 32) / 2.
9.23 (a) Create a behavioral HDL description of a 1-bit comparator.
(b) Create a structural description of a 4-bit comparator, using the 1-bit comparators.
9.24 Create a behavioral HDL description of a 32-bit equality comparator with three 8-bit inputs a,
b, and c.
9.25 Create a structural HDL description of the up-down-counter circuit described in Figure 4.55.
Be sure to first create a behavioral HDL description of each component used in your structural
HDL design.
9.26 Create a structural HDL description of a 4-bit down-counter with parallel load. Be sure to first
create a behavioral HDL description of each component used in your structural HDL design.
9.27 Create a structural HDL description of the RGB to CMYK converter described in Figure 4.68.
Be sure to first create a behavioral HDL description of each component used in your structural
HDL design.
9.28 Create a structural HDL description of a CMYK to RGB converter. Hint: Use the information
presented in Example 4.20 describing the RGB to CMYK converter to assist in designing the
CMYK to RGB converter.
9.29 Create a structural HDL description of a 4-bit adder/subtractor circuit. Be sure to first create a
behavioral HDL description of each component used in your structural HDL design.
SECTION 9.5: RTL DESIGN USING HARDWARE DESCRIPTION LANGUAGES
9.30 Create a behavioral HDL description of the high-level state machine for the simple bus
interface shown in Figure 5.24.
9.31 Create a structural HDL description of the controller/datapath for the simple bus interface
shown in Figure 5.26.
9.32 Create a behavioral HDL description of the high-level state machine for the sum-of-absolute-
differences component shown in Figure 5.29.
9.33 Create a structural HDL description of the controller/datapath design of the sum-of-absolute-
differences component shown in Figure 5.30.
9.34 Create an RTL design of a reaction timer circuit that measures the time elapsed between the
illumination of a light and the pressing of a button by the user. The reaction timer has three
inputs, a clock input clk, a reset input rst, and a button input B, and three outputs, a light
enable output len, a 10-bit reaction time output rtime, and a slow output indicating the user
was not fast enough. The reaction timer works as follows. On reset, the reaction timer waits
for 2 seconds before illuminating the light by setting len to 1. The reaction timer then
measures the length of time in milliseconds before the user presses the button B, outputting
the time as a 10-bit binary number on rtime. If the user did not press the button within
1 second (1000 milliseconds), the reaction timer will set the output slow to 1 and output 1000
on rtime. Assume a clock input of ... Hz. (a) Start by capturing the design using a high-
level state machine in an HDL. (b) Convert the high-level state machine to a controller/data-
path description in an HDL.
9.35 Starting from the C description shown in Figure 9.54, create an RTL design of a Greatest
Common Divisor (GCD) calculator that has inputs a and b, an enable input go, and a 16-bit
output D. When the go input is '1', the calculator will compute the greatest common divisor
and output the GCD on the output D. Start by capturing the design with a high-level state
machine in an HDL, and then create an HDL implementation with a datapath, a controller, and
their internal components.
GCD(uint a, uint b) // not quite C syntax
{
   while ( a != b ) {
      if ( a > b ) {
         a = a - b;
      }
      else {
         b = b - a;
      }
   }
   return(a);
}
Figure 9.54 C program description of a greatest common divisor calculator.
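As a quick software reference for checking an RTL implementation of Exercise 9.35, the loop of Figure 9.54 can be sketched in Python. The function name gcd_by_subtraction is an assumption made for this illustration, and the check for positive inputs is an added guard that the figure does not show (the loop would not terminate if one input were 0).

```python
def gcd_by_subtraction(a, b):
    """Mirror the loop of Figure 9.54: repeatedly subtract the smaller value from the larger."""
    if a <= 0 or b <= 0:
        # Added guard (not in the figure): the loop never ends if either input is 0.
        raise ValueError("both inputs must be positive")
    while a != b:
        if a > b:
            a = a - b
        else:
            b = b - a
    return a
```

Values produced by this sketch can serve as expected outputs in a testbench for the HDL design.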
A
Boolean Algebras
This appendix is reproduced with permission from the textbook "Introduction to Digital
Systems" by Ercegovac, Lang, and Moreno, ISBN 0471-52799-8, John Wiley and Sons
publishers, 1999.
Boolean algebras are an important class of algebras that has been studied and used exten-
sively for many purposes (see Section A.5). The switching algebra, used in the
description of switching expressions discussed in Section 2.4, is an instance (an element)
of the class of Boolean algebras. Consequently, theorems developed for Boolean algebras
are also applicable to switching algebra, so they can be used for the transformation of
switching expressions. Moreover, certain identities from Boolean algebra are the basis for
the graphical and tabular techniques used for the minimization of switching expressions.
In this appendix, we present the definition of Boolean algebras as well as theorems
that are useful for the transformation of Boolean expressions. We also show the relation-
ship between Boolean and switching algebras; in particular, we show that the switching
algebra satisfies the postulates of a Boolean algebra. We also sketch other examples of
Boolean algebras, which are helpful to further understand the properties of this class of
algebras.
A.1 BOOLEAN ALGEBRA
A Boolean algebra is a tuple {B, +, ·}, where
   B is a set of elements;
   + and · are binary operations applied over the elements of B,
satisfying the following postulates:
P1: If a, b ∈ B, then
    (i) a + b = b + a
    (ii) a · b = b · a
    That is, + and · are commutative.
P2: If a, b, c ∈ B, then
    (i) a + (b · c) = (a + b) · (a + c)
    (ii) a · (b + c) = (a · b) + (a · c)
P3: The set B has two distinct identity elements, denoted as 0 and 1, such that for
    every element a in B
    (i) 0 + a = a + 0 = a
    (ii) 1 · a = a · 1 = a
    The elements 0 and 1 are called the additive identity element and the multiplicative
    identity element, respectively. (These elements should not be confused with the
    integers 0 and 1.)
P4: For every element a ∈ B there exists an element a', called the complement of
    a, such that
    (i) a + a' = 1
    (ii) a · a' = 0
The symbols + and · should not be confused with the arithmetic addition and multi-
plication symbols. However, for convenience + and · are often called "plus" and "times,"
and the expressions a + b and a · b are called "sum" and "product," respectively. More-
over, + and · are also called "OR" and "AND," respectively.
The elements of the set B are called constants. Symbols representing arbitrary ele-
ments of B are variables. The symbols a, b, and c in the postulates above are variables,
whereas 0 and 1 are constants.
A precedence ordering is defined on the operators: · has precedence over +. There-
fore, parentheses can be eliminated from products. Moreover, whenever single symbols
are used for variables, the symbol · can be eliminated in products. For example,
a + (b · c) can be written as a + bc.
A.2 SWITCHING ALGEBRA
Switching algebra is an algebraic system used to describe switching functions by means
of switching expressions. In this sense, a switching algebra serves the same role for
switching functions as the ordinary algebra does for arithmetic functions.
The switching algebra consists of the set of two elements B = {0, 1}, and two operations AND
and OR defined as follows:

AND  0  1        OR  0  1
0    0  0        0   0  1
1    0  1        1   1  1

These operations are used to evaluate switching expressions, as indicated in Section ...
Theorem 1
The switching algebra is a Boolean algebra.
Proof We show that the switching algebra satisfies the postulates of a Boolean algebra.
P1: Commutativity of (+), (·). This is shown by inspection of the operation tables.
    The commutativity property holds if a table is symmetric about the main
    diagonal.
P2: Distributivity of (+) and (·). Shown by perfect induction, that is, by consid-
    ering all possible values for the elements a, b, and c. Consider the following
    table:

    abc   a + bc   (a + b)(a + c)
    000     0            0
    001     0            0
    010     0            0
    011     1            1
    100     1            1
    101     1            1
    110     1            1
    111     1            1

    Because a + bc = (a + b)(a + c) for all cases, P2(i) is satisfied. A similar
    proof shows that P2(ii) is also satisfied.
P3: Existence of additive and multiplicative identity elements. From the operation
    tables
       0 + 1 = 1 + 0 = 1
    Therefore, 0 is the additive identity. Similarly
       0 · 1 = 1 · 0 = 0
    so that 1 is the multiplicative identity.
P4: Existence of the complement. By perfect induction:

    a   a'   a + a'   a · a'
    1   0      1        0
    0   1      1        0

    Consequently, 1 is the complement of 0 and 0 is the complement of 1.
Because all postulates are satisfied, the switching algebra is a Boolean algebra. As a
result, all theorems true for Boolean algebras are also true for the switching algebra.
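The perfect induction used in this proof is easy to mechanize. The following is a minimal sketch, assuming the set B = {0, 1} with OR, AND, and complement modeled by Python's integer operators; the helper names OR, AND, NOT, and postulates_hold are chosen for this illustration only.

```python
from itertools import product

B = (0, 1)  # the two elements of the switching algebra


def OR(a, b):
    return a | b


def AND(a, b):
    return a & b


def NOT(a):
    return 1 - a


def postulates_hold():
    """Check P1 to P4 by trying every combination of values (perfect induction)."""
    for a, b, c in product(B, repeat=3):
        # P1: commutativity
        if OR(a, b) != OR(b, a) or AND(a, b) != AND(b, a):
            return False
        # P2: each operation distributes over the other
        if OR(a, AND(b, c)) != AND(OR(a, b), OR(a, c)):
            return False
        if AND(a, OR(b, c)) != OR(AND(a, b), AND(a, c)):
            return False
    for a in B:
        # P3: 0 is the additive identity, 1 is the multiplicative identity
        if OR(0, a) != a or OR(a, 0) != a or AND(1, a) != a or AND(a, 1) != a:
            return False
        # P4: every element has a complement
        if OR(a, NOT(a)) != 1 or AND(a, NOT(a)) != 0:
            return False
    return True
```

Because B has only two elements, the exhaustive check covers every case, which is exactly what the tables in the proof do by hand.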
A.3 IMPORTANT THEOREMS IN BOOLEAN ALGEBRA
We now present some important theorems in Boolean algebra; these theorems can be
applied to the transformation of switching expressions.
Theorem 2 Principle of Duality
Every algebraic identity deducible from the postulates of a Boolean algebra remains valid
if
   the operations + and · are interchanged throughout; and
   the identity elements 0 and 1 are also interchanged throughout.
Proof The proof follows at once from the fact that for each of the postulates there is
another one (the dual) that is obtained by interchanging + and · as well as 0 and 1.
This theorem is useful because it reduces the number of different theorems that must
be proven: every theorem has its dual.
Theorem 3
Every element in B has a unique complement.
Proof Let a ∈ B; let us assume that a1' and a2' are both complements of a. Then,
using the postulates, we can perform the following transformations:
   a1' = a1' · 1               by P3(ii) (identity)
       = a1' · (a + a2')       by hypothesis (a2' is the complement of a)
       = a1' · a + a1' · a2'   by P2(ii) (distributivity)
       = a · a1' + a1' · a2'   by P1(ii) (commutativity)
       = 0 + a1' · a2'         by hypothesis (a1' is the complement of a)
       = a1' · a2'             by P3(i) (identity)
Changing the index 1 for 2 and vice versa, and repeating all steps for a2', we get
   a2' = a2' · a1'
       = a1' · a2'             by P1(ii)
and therefore a2' = a1'.
The uniqueness of the complement of an element allows considering ' as a unary
operation called complementation.
Theorem 4
For any a ∈ B:
1. a + 1 = 1
2. a · 0 = 0
Proof Using the postulates, we can perform the following transformations:
Case (1):
   a + 1 = 1 · (a + 1)          by P3(ii)
         = (a + a') · (a + 1)   by P4(i)
         = a + (a' · 1)         by P2(i)
         = a + a'               by P3(ii)
         = 1                    by P4(i)
Case (2):
   a · 0 = 0 + (a · 0)          by P3(i)
         = (a · a') + (a · 0)   by P4(ii)
         = a · (a' + 0)         by P2(ii)
         = a · a'               by P3(i)
         = 0                    by P4(ii)
Case (2) can also be proven by means of case (1) and the principle of duality.
Theorem 5
The complement of the element 1 is 0, and vice versa. That is,
1. 0' = 1
2. 1' = 0
Proof By Theorem 4,
   0 + 1 = 1 and
   0 · 1 = 0
Because, by Theorem 3, the complement of an element is unique, Theorem 5 follows.
Theorem 6 Idempotent Law
For every a ∈ B
1. a + a = a
2. a · a = a
Proof
(1):
   a + a = (a + a) · 1          by P3(ii)
         = (a + a) · (a + a')   by P4(i)
         = a + (a · a')         by P2(i)
         = a + 0                by P4(ii)
         = a                    by P3(i)
(2): duality
Theorem 7 Involution Law
For every a ∈ B,
   (a')' = a
Proof From the definition of complement, (a')' and a are both complements of a'. But,
by Theorem 3, the complement of an element is unique, which proves the theorem.
Theorem 8 Absorption Law
For every pair of elements a, b ∈ B,
1. a + a · b = a
2. a · (a + b) = a
Proof
(1):
   a + ab = a · 1 + ab          by P3(ii)
          = a(1 + b)            by P2(ii)
          = a(b + 1)            by P1(i)
          = a · 1               by Theorem 4 (1)
          = a                   by P3(ii)
(2): duality
Theorem 9
For every pair of elements a, b ∈ B,
1. a + a'b = a + b
2. a(a' + b) = ab
Proof
(1):
   a + a'b = (a + a')(a + b)    by P2(i)
           = 1 · (a + b)        by P4(i)
           = a + b              by P3(ii)
(2): duality
Theorem 10
In a Boolean algebra, each of the binary operations (+) and (·) is associative. That is, for
every a, b, c ∈ B,
1. a + (b + c) = (a + b) + c
2. a(bc) = (ab)c
The proof of this theorem is quite lengthy. The interested reader should consult the
further readings suggested at the end of this appendix.
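For the two-element switching algebra, each of the identities in Theorems 4 through 10 can be confirmed by enumerating all input values. The sketch below is only an illustration under that assumption: it checks the identities on B = {0, 1} and is not a substitute for the algebraic proofs, which hold for every Boolean algebra. The dictionary IDENTITIES and the function failing_identities are names invented for this example.

```python
from itertools import product

# Each entry maps a label to a predicate over (a, b, c) taken from {0, 1}.
# OR is modeled by |, AND by &, and the complement of x by 1 - x.
IDENTITIES = {
    "Theorem 4(1): a + 1 = 1": lambda a, b, c: (a | 1) == 1,
    "Theorem 4(2): a . 0 = 0": lambda a, b, c: (a & 0) == 0,
    "Theorem 6(1): a + a = a": lambda a, b, c: (a | a) == a,
    "Theorem 6(2): a . a = a": lambda a, b, c: (a & a) == a,
    "Theorem 7: (a')' = a": lambda a, b, c: 1 - (1 - a) == a,
    "Theorem 8(1): a + ab = a": lambda a, b, c: (a | (a & b)) == a,
    "Theorem 8(2): a(a + b) = a": lambda a, b, c: (a & (a | b)) == a,
    "Theorem 9(1): a + a'b = a + b": lambda a, b, c: (a | ((1 - a) & b)) == (a | b),
    "Theorem 9(2): a(a' + b) = ab": lambda a, b, c: (a & ((1 - a) | b)) == (a & b),
    "Theorem 10(1): a + (b + c) = (a + b) + c": lambda a, b, c: (a | (b | c)) == ((a | b) | c),
    "Theorem 10(2): a(bc) = (ab)c": lambda a, b, c: (a & (b & c)) == ((a & b) & c),
}


def failing_identities():
    """Return the labels of identities that fail for some assignment of a, b, c."""
    assignments = list(product((0, 1), repeat=3))
    return [
        label
        for label, holds in IDENTITIES.items()
        if not all(holds(a, b, c) for a, b, c in assignments)
    ]
```

An empty result means every listed identity holds for all eight assignments.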
B
Additional Topics in Binary Number Systems
B.3 FIXED POINT ARITHMETIC
If we fix the binary point of a real number in a certain position in the number (e.g., after
the 4th bit), we can add or subtract binary real numbers by treating the numbers as integers
and adding or subtracting normally. In the resulting sum or difference, we maintain the
binary point's position. For example, assume we are working with 8-bit numbers with
half of the bits used to represent the fractional part of the number. If we wanted to add
1001.0010 (9.125) and 0011.1111 (3.9375), we can simply add the two numbers as
if they were integers. The sum, shown in Figure B.4, can be converted back to a real
number by maintaining the binary point's position within the sum. Converting the sum to
decimal verifies that the calculation was correct: 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0 + 0*2^-1 +
0*2^-2 + 0*2^-3 + 1*2^-4 = 8 + 4 + 1 + 0.0625 = 13.0625.

    1001.0010
  + 0011.1111
  -----------
    1101.0001
Figure B.4 Adding two fixed point numbers.
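The addition above can be reproduced with ordinary integer arithmetic. The following is a minimal sketch, assuming both operands are given as binary strings with the binary point in the same position; the helper names parse_fixed, format_fixed, fixed_add, and fixed_value are hypothetical and exist only for this illustration.

```python
def parse_fixed(bits):
    """Split a string such as '1001.0010' into (integer value, fraction bit count)."""
    whole, _, frac = bits.partition(".")
    return int(whole + frac, 2), len(frac)


def format_fixed(value, frac_bits, total_bits):
    """Render an integer as binary text with the binary point put back in place."""
    digits = format(value, "0{}b".format(total_bits))
    if frac_bits == 0:
        return digits
    return digits[:-frac_bits] + "." + digits[-frac_bits:]


def fixed_add(x, y):
    """Add two fixed point strings by treating them as integers."""
    x_value, x_frac = parse_fixed(x)
    y_value, y_frac = parse_fixed(y)
    if x_frac != y_frac or len(x) != len(y):
        raise ValueError("operands must use the same fixed point layout")
    total_bits = len(x) - 1 if "." in x else len(x)
    # The binary point is ignored during the addition and restored afterwards.
    return format_fixed(x_value + y_value, x_frac, total_bits)


def fixed_value(bits):
    """Convert a fixed point string to its real value."""
    value, frac_bits = parse_fixed(bits)
    return value / 2 ** frac_bits
```

This sketch does not model overflow: if the sum needs more bits than the operands have, the result string simply grows by one digit.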
Multiplying binary real numbers is also straightforward and does not require that the
binary point be fixed. We first multiply the two numbers as if they were integers. Second, we
place a binary point in the product such that the precision of the product is the sum of the
precisions of the multiplicand and multiplier (the two numbers being multiplied), just like what is
done when we multiply two decimal numbers together. Figure B.5 shows how we might
multiply the binary numbers 01.10 (1.5) and 11.01 (3.25) using the partial product method
described in Section 4.7. After we calculate the product of the two numbers, we place a
binary point in the appropriate location. Both the multiplier and multiplicand feature two
bits of precision, therefore the product must have four bits of precision, and we insert a
binary point to reflect this. Converting the product to decimal verifies that the calculation
was correct: 0*2^3 + 1*2^2 + 0*2^1 + 0*2^0 + 1*2^-1 + 1*2^-2 + 1*2^-3 + 0*2^-4 =
4 + 0.5 + 0.25 + 0.125 = 4.875.

       01.10
     x 11.01
     -------
        0110
       0000
      0110
   + 0110
   ---------
    100.1110
Figure B.5 Multiplying two fixed point numbers.
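The rule that the product's precision is the sum of the operands' precisions can be sketched as follows. The function name fixed_multiply is an assumption for this example, and the result is padded to the combined width of the two operands, so a leading zero may appear that a hand calculation would omit.

```python
def fixed_multiply(x, y):
    """Multiply two binary real strings, e.g. '01.10' and '11.01'."""
    x_whole, _, x_frac = x.partition(".")
    y_whole, _, y_frac = y.partition(".")
    # Step 1: multiply as if the operands were integers.
    product = int(x_whole + x_frac, 2) * int(y_whole + y_frac, 2)
    # Step 2: the product has as many fraction bits as both operands combined.
    frac_bits = len(x_frac) + len(y_frac)
    total_bits = len(x_whole) + len(x_frac) + len(y_whole) + len(y_frac)
    digits = format(product, "0{}b".format(total_bits))
    if frac_bits == 0:
        return digits
    return digits[:-frac_bits] + "." + digits[-frac_bits:]
```

For the operands of Figure B.5, the integer product is 0110 times 1101, and the four fraction bits come from two bits of precision in each operand.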
The previous example was convenient in that we never had to add four 1s together in a
column when we summed up the partial products. To make the calculations simpler and to
allow for the partial product summation to be implemented using full-adders, which can only
add three 1s at a time, we add the partial products incrementally instead of all at once. For
example, let's multiply 1110.1 (14.5) by 0111.1 (7.5). As seen in Figure B.6, we
begin by generating partial products as we did earlier. However, we add partial products
immediately into partial product sums, labeled pps in the figure. Eventually, we find that
the product is 01101100.11, which corresponds to the correct answer, 108.75. You may
want to try adding the five partial products together at once instead of using the
intermediate partial product sums to see why this method is useful.

        1110.1      multiplicand
      x 0111.1      multiplier
      --------
         11101      partial product 1 (pp1)
      + 11101       pp2
      --------
       1010111      pps1 = pp1 + pp2
     + 11101        pp3
     ---------
      11001011      pps2 = pps1 + pp3
    + 11101         pp4
    ----------
     110110011      pps3 = pps2 + pp4
   + 00000          pp5
   -----------
   01101100.11      product = pps3 + pp5
Figure B.6 Multiplying two fixed point numbers using intermediate partial products.
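The incremental summation can be sketched with integers, where the multiplicand 1110.1 is treated as 11101 (29) and the multiplier 0111.1 as 01111 (15). The function name partial_product_sums is hypothetical; it returns the running sum after each partial product is added, so the entries after the first correspond to pps1, pps2, pps3, and the final product.

```python
def partial_product_sums(multiplicand, multiplier, multiplier_bits):
    """Return the running sums obtained by adding one partial product at a time."""
    running = 0
    sums = []
    for i in range(multiplier_bits):
        bit = (multiplier >> i) & 1
        # Partial product i is the multiplicand shifted left i places, or 0.
        partial = (multiplicand << i) if bit else 0
        running += partial
        sums.append(running)
    return sums
```

Each step adds only two numbers, which is what allows the summation to be built from rows of full-adders. Dividing the final integer by 2^2 restores the two fraction bits of the product.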
Before proceeding to binary real number division, we will introduce binary integer
division, which was not discussed in previous chapters.
We can use the familiar process of long division to divide two binary integers. For example,
consider the binary division of 101100 (44) by 10 (2). The full calculation is shown in Figure
B.7. Notice how the procedure is exactly the same as decimal long division except that the
numbers are now in binary.
[Figure B.7 Dividing two binary integers using long division: dividend 101100, divisor 10,
quotient 10110, remainder 0.]
Dividing binary real numbers, like multiplication, also does not require that the binary point
be fixed. However, to simplify the calculation, we shift both the dividend and divisor's binary
point right until the divisor no longer has a fractional part. For example, consider the division
of 1.01₂ (1.25) by 0.1₂ (0.5). The divisor, 0.1₂, has one digit in its fractional part, therefore
we shift the dividend and divisor binary points right by one digit, changing our problem to
10.1₂ divided by 1₂. We now treat the numbers as integers (ignoring the binary point) and can
divide them using the long division approach. Trivially, 101₂ / 1₂ is 101₂. We then restore the
binary point to where it was in the dividend, giving us the answer 10.1₂, or 2.5.
Why does shifting the binary point not change the answer? In general, shifting the
radix point right by one digit is the same as multiplying the number by its base. For
binary numbers, shifting the binary point right is equivalent to multiplying the number by
2. Dividing two numbers will give you the ratio of the two numbers to each other. Multi-
plying both numbers by the same number (by means of shifting the binary point) will
not affect that ratio, since doing so is equivalent to multiplying the ratio by 1.
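The shifting argument can be checked with exact rational arithmetic. The sketch below assumes operands written as binary strings and uses Python's fractions module so that no rounding occurs; shift_point_right, binary_value, and divide_binary_reals are names made up for this illustration.

```python
from fractions import Fraction


def shift_point_right(bits, digits):
    """Move the binary point right by `digits` places, padding with zeros if needed."""
    whole, _, frac = bits.partition(".")
    frac = frac.ljust(digits, "0")
    whole, frac = whole + frac[:digits], frac[digits:]
    return whole + ("." + frac if frac else "")


def binary_value(bits):
    """Exact value of a binary real string such as '10.1'."""
    whole, _, frac = bits.partition(".")
    return Fraction(int(whole + frac, 2), 2 ** len(frac))


def divide_binary_reals(dividend, divisor):
    """Shift both operands until the divisor is an integer, then divide."""
    shift = len(divisor.partition(".")[2])
    shifted_dividend = shift_point_right(dividend, shift)
    shifted_divisor = shift_point_right(divisor, shift)
    quotient = binary_value(shifted_dividend) / binary_value(shifted_divisor)
    return shifted_dividend, shifted_divisor, quotient
```

Since both operands are multiplied by the same power of 2, the quotient of the shifted operands equals the quotient of the original operands.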
Fixed point numbers are simple to work with, but are limited in the range of numbers
that they can represent. For a fixed number of bits, increasing the precision of a number
comes at the expense of the range of whole numbers that we can use, and vice versa.
Fixed point numbers are suitable for a variety of applications, such as a digital
thermometer, but more demanding applications need greater flexibility and range in their
real numbers.
B.4 FLOATING POINT REPRESENTATION
When working with decimal numbers, we often represent very large or very small
numbers by using scientific notation. Rather than writing a googol as a 1 with a hundred
zeroes after it, we could write 1.0*10^100. Instead of 299,792,458 m/s, we could write the
speed of light as 3.0*10^8 m/s, 2.998*10^8, or even 299.8*10^6.
If such notation could be translated into binary, we would be able to store a much
greater range of numbers than if the binary point were fixed. What features of this nota-
tion need to be captured in a binary representation?
First is the whole and fractional part of the number being multiplied by a power of 10, which
is called the mantissa (or significand), as shown in Figure B.8. We do not need to store the
whole part of the number if we make sure the number is in a certain form. We call a number
written in scientific notation normalized if the whole part of the number is greater than 0 but
less than the base. In the previous speed of light examples, 3.0*10^8 and 2.998*10^8 are
normalized since 3 and 2, respectively, are greater than zero but less than 10. The number
299.8*10^6, on the other hand, is not normalized. If a binary real number is normalized, then
the whole part of the mantissa can only be a 1. To save bits, we can assume that the whole
part of the significand is 1 and store only the fractional part.
[Figure B.8 Parts of a number in scientific notation: in +3.0 * 10^8, + is the sign, 3.0 is the
mantissa, 10 is the base, and 8 is the exponent.]
Second is the base (sometimes referred to as the radix) and the exponent by which
the mantissa is multiplied, shown in Figure B.8. Calling 10 the base is no accident: the
number is the same as the base of the entire number. In binary, the base is naturally 2.
Knowing this, we do not need to store the 2. We can simply assume that 2 is the base and
store the exponent.
Third, we must capture the sign of the number.
The IEEE 754-1985 Standard
The Institute of Electrical and Electronics Engineers (IEEE) 754-1985 standard specifies a
way in which the three values described above can be represented in a 32-bit or a 64-bit
binary number, referred to as single and double precision, respectively. Though there are
other ways to represent real numbers, the IEEE standard is by far the most widely used.
We refer to these numbers as floating point numbers.
The IEEE standard assigns a certain range of bits for each of the three values. For 32-bit
numbers, the first (most significant) bit specifies the sign, followed by 8 bits for the
exponent, and the remaining 23 bits are used for the mantissa. This arrangement is pictured
in Figure B.9.
[Figure B.9 Bit arrangement in a 32-bit floating point number: bit 31 is the sign, bits 30 to 23
are the exponent, and bits 22 to 0 are the mantissa.]
The sign bit is set to 0 if the number is positive, and the bit is set to 1 if the number is
negative. The mantissa bits are set to the fractional part of the mantissa in the original
number. For example, if the mantissa is 1.1011, we would store 1011 followed by 19
zeroes in bits 22 to 0. As part of the standard, we add 127 to the exponent we store in the
exponent bits. Therefore, if a floating point number's exponent is 3, we would store 130 in
the exponent bits. If the exponent is -30, we would store 97 in the exponent bits. The
adjusted number is called a biased exponent. Exponent bits containing all 0s or all 1s have
special meanings and cannot be used. Under these conditions, the range of biased exponents
we can write in the exponent bits is 1 to 254, meaning the range of unbiased exponents is
-126 to 127. Why don't we simply store the exponent as a signed, two's complement number
(discussed in Section 4.8)? Because it turns out that biasing the exponent results in simpler
circuitry for comparing the magnitude (absolute value) of two floating point numbers.
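The three fields can be inspected on a real machine. The sketch below uses Python's standard struct module to obtain the 32-bit pattern of a number and then separates the sign, biased exponent, and stored mantissa bits with shifts and masks; the function name float32_fields is an assumption for this example.

```python
import struct


def float32_fields(x):
    """Return (sign, biased exponent, stored mantissa bits) of x as a 32-bit float."""
    # Pack as a big-endian single precision value, then reread the bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # bit 31
    biased_exponent = (bits >> 23) & 0xFF   # bits 30 to 23
    mantissa = bits & 0x7FFFFF        # bits 22 to 0
    return sign, biased_exponent, mantissa
```

For instance, 8.0 is 1.0 * 2^3, so its biased exponent is 3 + 127 = 130, and a number with exponent -30 stores 97, matching the values given in the text.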
The IEEE standard defines certain special values if the contents of the exponent bits
are uniform. When the exponent bits are all 0s, two possibilities occur:
1. If the mantissa bits are all 0s, then the entire number evaluates to zero.
2. If the mantissa bits are nonzero, then the number is not normalized. That is, the
   whole part of the mantissa is a binary zero and not a one.
When the exponent bits are all 1s, two possibilities occur:
1. If the mantissa bits are all 0s, then the entire number evaluates to + or - infinity,
   depending on the sign bit.
2. If the mantissa bits are nonzero, then the entire "number" is classified as not a
   number (NaN).
There are also specific classes of NaNs, beyond the scope of this appendix, that are
used in computations involving NaNs.
With this information, we can convert decimal real numbers to floating point num-
bers. Assuming the decimal number to be converted is not a special value in floating point
notation, Table B.2 describes how to perform the conversion.
TABLE B.2 Method for converting real decimal numbers to floating point

   Step                                     Description
1  Convert the number from base 10          Use the method described in Section B.2.
   to base 2.
2  Convert the number to                    Initially multiply the number by 2^0. Adjust the binary point
   normalized scientific notation.          and exponent so that the whole part of the number is 1₂.
3  Fill in the bit fields.                  Set the sign, biased exponent, and mantissa bits
                                            appropriately.
EXAMPLE B.2 Converting decimal real numbers to floating point
Convert the following numbers from decimal to IEEE 754 32-bit floating point: 9.5, infinity, and
-52406.25 * 10^-2.
Let's follow the procedure in Table B.2 to convert 9.5 to floating point. In step 1, we convert
9.5 to binary. Using the subtraction method, we find that 9.5 is 1001.1 in binary. To convert the
number to scientific notation per step 2, we multiply the number by 2^0, giving 1001.1 * 2^0 (for
readability purposes, we write the 2^x part in base 10). To normalize the number, we must shift the
binary point left by three digits. In order to not change the value of the number after moving the
binary point, we change the 2's exponent to 3. After step 2, our number becomes 1.0011 * 2^3.
In step 3, we put everything together into the properly formatted sequence of bits. The sign bit
is set to 0, indicating a positive number. The exponent bits are 3 + 127 = 130 (we must bias the
exponent) in binary, and the mantissa bits are set to 0011 followed by zeroes, which is the
fractional part of the mantissa. Remember that the 1 to the left of the binary point is implied
since the number is normalized. The properly encoded number is shown in Figure B.10.

Step 1: Convert to binary
   9.5 (base 10) <=> 1001.1 (base 2)
Step 2: Convert to normalized scientific notation
   1001.1 <=> 1001.1 * 2^0 <=> 1.0011 * 2^3
   (To normalize, move binary point 3 digits left and add 3 to exponent.)
Step 3: Fill in the bit fields
   0      10000010      00110000000000000000000
   sign   exponent      mantissa
          (biased)
Figure B.10 Representing 9.5 as a 32-bit floating point number, most significant bit first.

Now let's convert infinity to a floating point number. Since infinity is a special value, we
cannot employ the method we used to convert 9.5 to floating point. Rather, we fill in the
three bit fields with special values indicating that the number is infinity. From the discussion
of special values above, we know that the exponent bits should be all 1s and the mantissa
bits should be all 0s. The sign bit should be 0 since infinity is positive. Therefore, the
equivalent floating point number is 0 11111111 00000000000000000000000.
Converting -52406.25 * 10^-2 to floating point is straightforward using the method in
Table B.2. For step 1, we convert the number to binary. Recall that we represent the
sign of the number using a single bit and not using two's complement representation, so we
only need to convert 52406.25 * 10^-2 to binary and set the sign bit to indicate that the number
is negative. The number 52406.25 * 10^-2 evaluates to 524.0625. Using the subtraction or divide-
by-2 method, we know that 524 is 1000001100 in binary. The fractional part, 0.0625, is con-
veniently 2^-4. Thus 524.0625 is 1000001100.0001 in binary. In step 2, we write the number
in scientific notation: 1000001100.0001 * 2^0. We must also normalize the number by
shifting the binary point left by 9 digits and compensating for this shift in the exponent:
1.0000011000001 * 2^9. Finally, we combine the sign (1 since the original number is nega-
tive), biased exponent (9 + 127 = 136), and fractional part of the mantissa into a floating point
number: 1 10001000 00000110000010000000000.
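The three steps of Table B.2 can be sketched in software. The function below is a simplified illustration, not a complete IEEE 754 encoder: it assumes a nonzero, finite input whose exponent falls in the normalized range, and it truncates rather than rounds any mantissa bits beyond the 23 that are stored. The names encode_float32 and reference_bits are invented for this example; reference_bits uses the standard struct module only as a cross-check.

```python
import math
import struct


def encode_float32(x):
    """Encode a nonzero finite number as 'sign exponent mantissa' bit fields."""
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("special values are not handled by this sketch")
    sign = 1 if x < 0 else 0
    # Steps 1 and 2: math.frexp gives abs(x) = m * 2**e with 0.5 <= m < 1,
    # so doubling m and decrementing e yields the normalized form 1.f * 2**exponent.
    m, e = math.frexp(abs(x))
    mantissa, exponent = m * 2, e - 1
    biased = exponent + 127
    if not 1 <= biased <= 254:
        raise ValueError("exponent outside the normalized range")
    # Step 3: keep only the fractional part of the mantissa (23 bits, truncated).
    fraction = int((mantissa - 1) * 2 ** 23)
    return "{} {:08b} {:023b}".format(sign, biased, fraction)


def reference_bits(x):
    """Bit fields produced by the machine's own single precision conversion."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    text = format(bits, "032b")
    return "{} {} {}".format(text[0], text[1:9], text[9:])
```

For values that fit exactly in 23 mantissa bits, such as 9.5 and -524.0625, the sketch and the machine conversion agree bit for bit.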
EXAMPLE B.3 Converting floating point numbers to decimal
Convert the number 11001011101010100000000000000000 from IEEE 754 32-bit floating
point to decimal.
To perform this conversion, we first split the number into its sign, exponent, and mantissa
parts: 1 10010111 01010100000000000000000. We can immediately see from the sign bit
that the number is negative.
Next, we convert the 8-bit exponent and 23-bit mantissa from binary to decimal. We find that
10010111 is 151. We unbias the exponent by subtracting 127 from 151, giving an unbiased expo-
nent of 24. Recall that the mantissa in the pattern of bits represents the fractional part of the
mantissa and is stored without the leading 1 from the whole part of the mantissa (assuming the
original number was normalized). Restoring the 1 and adding a binary point gives us the number
1.01010100000000000000000, which is the same number as 1.010101. By applying weights to
each digit, we see that 1.010101 = 1*2^0 + 0*2^-1 + 1*2^-2 + 0*2^-3 + 1*2^-4 + 0*2^-5 + 1*2^-6 =
1.328125.
With the original sign, exponent, and mantissa extracted, we can combine them into a single
number: -1.328125 * 2^24. We can multiply the number out to -22,282,240, which is equivalent
to -2.228224 * 10^7.
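The decoding steps of this example can be sketched as follows, assuming a normalized number (exponent bits neither all 0s nor all 1s). The function names decode_float32 and machine_value are hypothetical; machine_value relies on the standard struct module to confirm the hand calculation.

```python
import struct


def decode_float32(bits):
    """Decode a 32-bit pattern given as text; spaces between fields are allowed."""
    bits = bits.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected exactly 32 bits")
    sign = -1 if bits[0] == "1" else 1
    biased = int(bits[1:9], 2)
    if biased in (0, 255):
        raise ValueError("special values are not handled by this sketch")
    # Restore the implied leading 1 in front of the 23 stored fraction bits.
    mantissa = 1 + int(bits[9:], 2) / 2 ** 23
    return sign * mantissa * 2.0 ** (biased - 127)


def machine_value(bits):
    """Interpret the same pattern with the machine's single precision format."""
    raw = int(bits.replace(" ", ""), 2).to_bytes(4, "big")
    return struct.unpack(">f", raw)[0]
```

For the pattern of Example B.3, the stored fraction 0101010 followed by zeroes gives a mantissa of 1.328125, and the biased exponent 151 gives 2^24.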
The format for double precision (64-bit) floating point numbers is similar, with three fields
having a defined number of bits. The first, most significant bit represents the sign of the
number. The next 11 bits hold the biased exponent, and the remaining 52 bits hold the
fractional part of the mantissa. Additionally, we add 1023 to the exponent instead of 127 to
form the biased exponent. This arrangement is shown in Figure B.11.
[Figure B.11 Bit arrangement in a 64-bit floating point number: bit 63 is the sign, bits 62 to 52
are the exponent, and bits 51 to 0 are the mantissa.]
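The same field extraction works for double precision once the field widths and the bias are changed. This is a brief sketch using the standard struct module; the name float64_fields is an assumption for this example.

```python
import struct


def float64_fields(x):
    """Return (sign, biased exponent, stored mantissa bits) of x as a 64-bit float."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                          # bit 63
    biased_exponent = (bits >> 52) & 0x7FF     # bits 62 to 52 (11 bits)
    mantissa = bits & ((1 << 52) - 1)          # bits 51 to 0 (52 bits)
    return sign, biased_exponent, mantissa
```

With 9.5 = 1.0011 * 2^3, the biased exponent is 3 + 1023 = 1026 and the stored mantissa is 0011 followed by 48 zeroes.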
Floating Point Arithmetic
Floating point arithmetic is beyond the scope of this text, but we'll provide a brief over-
view of the concept.
Floating point addition and subtraction must be performed by first aligning the two
floating point numbers so that their exponents are equal. For example, consider adding
the two decimal numbers 2.52*10^2 + 1.44*10^4. Since the exponents differ, we can change
2.52*10^2 to 0.0252*10^4. Adding 0.0252*10^4 and 1.44*10^4 gives us the answer
1.4652*10^4. Similarly, we could have changed 1.44*10^4 to 144*10^2. Adding 144*10^2 and
2.52*10^2 gives us the sum 146.52*10^2, which is the same number as our first set of calcu-
lations. An analogous situation occurs when we work with floating point numbers.
Typically, hardware that performs floating point arithmetic, often referred to as a floating
point unit, will adjust the mantissa of the number with the smaller exponent before
adding or subtracting the mantissas (with their implied 1s restored) together and pre-
serving the common exponent. Notice that before the addition or subtraction is
performed, the exponents of the two numbers are compared. This comparison is facili-
tated through the use of the sign bit and the biased exponent as opposed to representing
the exponent in two's complement form.
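The alignment step can be sketched for decimal scientific notation using Python's decimal module, which keeps the digits exact. The function name add_scientific is hypothetical; it aligns the operand with the smaller exponent to the larger exponent and then adds the mantissas. A real floating point unit performs the analogous steps in binary and also renormalizes and rounds the result, which this sketch omits.

```python
from decimal import Decimal


def add_scientific(m1, e1, m2, e2):
    """Add m1*10**e1 and m2*10**e2; return (mantissa, exponent) at the larger exponent."""
    m1, m2 = Decimal(m1), Decimal(m2)
    if e1 < e2:
        # Shift the decimal point of the first mantissa left by e2 - e1 places.
        m1 = m1.scaleb(e1 - e2)
        e1 = e2
    else:
        # Shift the decimal point of the second mantissa left by e1 - e2 places.
        m2 = m2.scaleb(e2 - e1)
    return m1 + m2, e1
```

Passing the mantissas as strings avoids introducing binary rounding error before the Decimal values are created.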
Multiplication and division in floating point require no such alignments. Like in
decimal multiplication and division of numbers in scientific notation, we multiply or
divide the mantissas and add or subtract the two exponents, depending on the operation.
When multiplying, we add exponents. For example, let's multiply 6.44*10^7 by 5.0*10^-3.
Instead of trying to multiply 64,400,000 by 0.005, we multiply the two mantissas together
and add the exponents. 6.44*5.0 is 32.2 and 7+(-3) is 4. Thus the answer is 32.2*10^4.
When dividing, we subtract the exponent of the divisor from the dividend's exponent.
For example, let's divide 31.5*10^-4 (dividend) by 2.0*10^-12 (divisor). Dividing 31.5 by
2.0 gives us 15.75. Subtracting the divisor's exponent from the dividend's gives us
-4-(-12)=8. Thus the answer is 15.75*10^8. Floating point division defines results for
several boundary cases such as dividing by 0, which evaluates to positive or negative
infinity, depending on the sign of the dividend. Dividing a nonzero number by infinity is
defined as 0; otherwise, dividing by infinity is ...
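Multiplication and division in scientific notation can be sketched in the same style, again with Python's decimal module for exact digits. The function names multiply_scientific and divide_scientific are assumptions for this example, and the results are left unnormalized, just as in the worked examples above.

```python
from decimal import Decimal


def multiply_scientific(m1, e1, m2, e2):
    """Multiply m1*10**e1 by m2*10**e2: multiply mantissas, add exponents."""
    return Decimal(m1) * Decimal(m2), e1 + e2


def divide_scientific(m1, e1, m2, e2):
    """Divide m1*10**e1 by m2*10**e2: divide mantissas, subtract exponents."""
    return Decimal(m1) / Decimal(m2), e1 - e2
```

No alignment of exponents is needed here, which is the key difference from addition and subtraction.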
B.5 EXERCISES
SECTION B.2: REAL NUMBER REPRESENTATION
1. Convert the following numbers from decimal to binary:
(a) 1.5
(b) 3.125
(c) 8.25
(d) 7.75
2. Convert the following numbers from decimal to binary:
(a) 9.375
(b) 2.4375
(c) 5.65625
(d) 15.5703125
SECTION B.3: FIXED POINT ARITHMETIC
3. Add the following two unsigned binary numbers using binary addition and convert the result to
decimal:
(a) 10111.001 + 1010.110
(b) 01101.100 + 10100.101
(c) 10110.1 + 110.011
(d) 1101.111 + 10011.0111
SECTION B.4: FLOATING POINT REPRESENTATION
4. Convert the following decimal numbers to 32-bit floating point:
(a) -50.208
(b) 10'
(c) - 24.55 1.152 10'"
(d) 0
5. Convert the following 32-bit floating point numbers to decimal:
(a) 01001100010110110101100001011000
(b) 01001100010110110101001000000000
(c) 01111111111000110000000000000000
(d) 01001101000110101000101000000000
C
Extended RTL Design Example
C.l INTRODUCTION
In Chapter 5, we performed RTL design of a soda di spenser processor. We ,tuned with a
high-level state machine, created the datnpath's structure, and then described tile on-
troll er using a finite-state machine. We did not further design the controll er to s!nleturc.
as such deSign was the subject of Chapter 3. and we did not wish to clutter hnptcr S"S
RTL design discussion with too many details of previously learned material. In thi s
appendix, we'll complete the RTL design by designing the controll er's F M down to a
state register and gates, resulting in a complete custom-processor impl ementation of u
controll er and a datapath. We' ll then trace through the behavior of the complete imple-
mentati on. The purpose of demonstrating thi s complete design is to give the reader a clcar
understanding of how the controller and datapath work together.
The block symbol for the soda dispenser processor appears in Figure C.1. Recall that the soda dispenser features three inputs, c, s, and a. The 8-bit input s represents the cost of each bottle of soda. The 1-bit input c is 1 for one clock cycle when a coin is inserted. Additionally, the value on 8-bit input a indicates the value of the coin that was inserted. The soda dispenser features one output, d, used to indicate when soda should be dispensed. The 1-bit output d is 1 for one clock cycle after the value of the coins inserted into the soda dispenser is greater than or equal to s. The soda dispenser does not give change.
Figure C.1 Soda dispenser block symbol.
In Chapter 5, we developed the high-level state machine seen in Figure C.2. We subsequently decomposed the high-level state machine into a controller (represented behaviorally as an FSM) and datapath, shown in Figure C.3. The datapath supports the data operations necessitated by the high-level state machine, including resetting the value of tot (tot = 0 in the Init state), comparing if tot is less than s (for the transitions from the Wait state), and adding tot and a (in the Add state). The controller FSM is similar to the high-level state machine, but is modified to control the datapath and accept status input from the datapath (i.e., tot_lt_s) rather than performing data operations directly. The controller and datapath are shown in Figure C.3.
Figure C.2 Soda dispenser high-level state machine. Inputs: c (bit), a (8 bits), s (8 bits). Outputs: d (bit). Local registers: tot (8 bits).
Figure C.3 Soda dispenser: (a) controller (described behaviorally) and (b) datapath (structure). Controller inputs: c, tot_lt_s (bit). Controller outputs: d, tot_ld, tot_clr (bit).
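The behavior captured by the high-level state machine can be sketched in software. The Python model below is only an illustration under stated assumptions: the class name `SodaDispenserHLSM` and its `step` method are hypothetical, each call to `step` stands for one clock cycle, and the register tot is updated at the end of the cycle. It also assumes that the coin value a is still present during the Add cycle that follows the cycle in which c is 1.

```python
class SodaDispenserHLSM:
    """Behavioral sketch of the soda dispenser high-level state machine."""

    def __init__(self):
        self.state = "Init"
        self.tot = 0  # local 8-bit register

    def step(self, c, a, s):
        """Advance one clock cycle and return the output d for that cycle."""
        d = 0
        next_state, next_tot = self.state, self.tot
        if self.state == "Init":
            next_tot = 0                      # tot = 0
            next_state = "Wait"
        elif self.state == "Wait":
            if c:
                next_state = "Add"            # a coin was inserted
            elif self.tot < s:
                next_state = "Wait"           # not enough money yet
            else:
                next_state = "Disp"           # tot >= s
        elif self.state == "Add":
            next_tot = (self.tot + a) & 0xFF  # tot = tot + a, kept to 8 bits
            next_state = "Wait"
        else:  # Disp
            d = 1
            next_state = "Init"
        self.state, self.tot = next_state, next_tot
        return d


if __name__ == "__main__":
    m = SodaDispenserHLSM()
    m.step(0, 0, 60)                  # Init
    for coin in (25, 25, 10):
        m.step(1, coin, 60)           # Wait, coin detected
        m.step(0, coin, 60)           # Add
    print(m.tot, m.state)
```

A model like this is useful mainly as a reference: the controller and datapath implementation should produce the same sequence of d values for the same sequence of inputs.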
C.2 DESIGNING THE SODA DISPENSER CONTROLLER
Using the controller design process introduced in Chapter 3, we can complete the design of the controller. The five steps are as follows:
Capture the FSM. The FSM for the soda dispenser's controller was created as a step of the RTL design method. The controller's FSM is shown in Figure C.3(a).
Capture the Architecture. As indicated by the controller's FSM, the state machine's architecture requires at least 2 inputs (c and tot_lt_s) and 3 outputs (d, tot_ld, and tot_clr). Additionally, we will use two bits to represent the controller's states, which adds an additional two inputs (the current state s1s0) and two outputs (the next state n1n0) to the controller architecture. The corresponding controller architecture is shown in Figure C.4.
Figure C.4 Standard controller architecture for the soda dispenser.
Encode the States. A straightforward encoding of the four states is Init: 00, Wait: 01, Add: 10, and Disp: 11.
Create the State Table. From the controller architecture developed in an earlier step, we know that the state table must account for 4 inputs (c, tot_lt_s, s1, and s0) and 5 outputs (d, tot_ld, tot_clr, n1, and n0). With 4 inputs, the state table will include 2^4 = 16 rows (Figure C.5).

              Inputs            |          Outputs
        s1  s0  c   tot_lt_s    |  d   tot_ld  tot_clr  n1  n0
Init     0   0  0      0        |  0     0        1      0   1
         0   0  0      1        |  0     0        1      0   1
         0   0  1      0        |  0     0        1      0   1
         0   0  1      1        |  0     0        1      0   1
Wait     0   1  0      0        |  0     0        0      1   1
         0   1  0      1        |  0     0        0      0   1
         0   1  1      0        |  0     0        0      1   0
         0   1  1      1        |  0     0        0      1   0
Add      1   0  0      0        |  0     1        0      0   1
         1   0  0      1        |  0     1        0      0   1
         1   0  1      0        |  0     1        0      0   1
         1   0  1      1        |  0     1        0      0   1
Disp     1   1  0      0        |  1     0        0      0   0
         1   1  0      1        |  1     0        0      0   0
         1   1  1      0        |  1     0        0      0   0
         1   1  1      1        |  1     0        0      0   0

Figure C.5 The soda dispenser controller's state table.

By examining the outputs specified in the controller FSM, duplicated for convenience in Figure C.6, we fill in the d, tot_ld, and tot_clr columns in the state table. For example, in Figure C.6, we see that when the controller FSM is in the Init state, d=0, tot_clr=1, and tot_ld is implicitly 0. Thus, for rows in the state table that correspond to the Init state (namely, the four rows where s1s0=00, since we chose "00" as the encoding for the Init state), we set the d column to 0, the tot_clr column to 1, and the tot_ld column to 0.

Figure C.6 Soda dispenser controller FSM with state encodings. Inputs: c, tot_lt_s (bit). Outputs: d, tot_ld, tot_clr (bit).

We fill in the next state columns, n1 and n0, based on the transitions specified in the controller FSM and the state encoding we chose in an earlier step. For example, consider the Wait state. As indicated in Figure C.6, the FSM transitions to the Add state when c=1. Thus, for rows where s1s0c=011 (s1s0=01 corresponds to the Wait state), we set the n1 column to 1 and the n0 column to 0 (n1n0=10 corresponds to the Add state). When c=0, the FSM transitions to the Disp state if tot_lt_s=0, or remains in the Wait state if tot_lt_s=1. We represent the transition from Wait to Disp in the state table by setting n1 to 1 and n0 to 1 (Disp) in the row where s1s0=01, c=0, and tot_lt_s=0. Similarly, we represent the transition from Wait back to Wait by writing n1n0=01 where s1s0=01, c=0, and tot_lt_s=1. We then examine the remaining transitions in a similar way, filling in the appropriate values for n1 and n0 until all transitions are accounted for. The completed state table is shown in Figure C.5.
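The table-filling procedure described above is mechanical enough to express as a short program. The following Python sketch derives the 16 rows from the state encoding, the per-state outputs, and the transitions of the controller FSM; the names `ENCODING`, `OUTPUTS`, `next_state`, and `state_table` are hypothetical and serve only this illustration.

```python
from itertools import product

# State encodings (s1, s0) chosen in the "Encode the States" step.
ENCODING = {"Init": (0, 0), "Wait": (0, 1), "Add": (1, 0), "Disp": (1, 1)}

# Controller outputs (d, tot_ld, tot_clr) for each state.
OUTPUTS = {"Init": (0, 0, 1), "Wait": (0, 0, 0), "Add": (0, 1, 0), "Disp": (1, 0, 0)}


def next_state(state, c, tot_lt_s):
    """Transitions of the controller FSM."""
    if state == "Init":
        return "Wait"
    if state == "Wait":
        if c:
            return "Add"
        return "Wait" if tot_lt_s else "Disp"
    if state == "Add":
        return "Wait"
    return "Init"  # Disp


def state_table():
    """Return rows (s1, s0, c, tot_lt_s, d, tot_ld, tot_clr, n1, n0)."""
    decode = {code: name for name, code in ENCODING.items()}
    rows = []
    for s1, s0, c, lt in product((0, 1), repeat=4):
        state = decode[(s1, s0)]
        n1, n0 = ENCODING[next_state(state, c, lt)]
        rows.append((s1, s0, c, lt) + OUTPUTS[state] + (n1, n0))
    return rows


if __name__ == "__main__":
    print("s1 s0 c tot_lt_s | d tot_ld tot_clr n1 n0")
    for r in state_table():
        print(*r[:4], "|", *r[4:])
```

Because `product` varies the last input fastest, the rows come out in the same order as a conventional truth table, with s1 as the most significant input.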
Implement the Combinational Logic. For each of the state table's outputs, we write the corresponding Boolean equation. From the state table we obtain the following equations:
d = s1s0
tot_ld = s1s0'
tot_clr = s1's0'
n1 = s1's0c'tot_lt_s' + s1's0c
n0 = s1's0' + s1's0c' + s1s0'
n0 = s0' + s1's0c'
Note that the first four equations derived from the state table are already minimized. The fifth equation, corresponding to n0, can be minimized to s0' + s1's0c' through algebraic methods or by using a K-map as shown in Figure C.7. K-maps are discussed in Section 6.2.
Figure C.7 K-map for the initial equation for n0.
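The equations can be checked exhaustively against the state table, since there are only 16 input combinations. The Python sketch below is one way to do that; the function names `table_row`, `equations`, and `check` are hypothetical, and `table_row` simply restates the rows of Figure C.5 state by state.

```python
from itertools import product


def table_row(s1, s0, c, lt):
    """Outputs (d, tot_ld, tot_clr, n1, n0) read from the state table."""
    if (s1, s0) == (0, 0):                # Init: always go to Wait
        return (0, 0, 1, 0, 1)
    if (s1, s0) == (0, 1):                # Wait
        if c:
            return (0, 0, 0, 1, 0)        # go to Add
        return (0, 0, 0, 0, 1) if lt else (0, 0, 0, 1, 1)
    if (s1, s0) == (1, 0):                # Add: always go to Wait
        return (0, 1, 0, 0, 1)
    return (1, 0, 0, 0, 0)                # Disp: always go to Init


def equations(s1, s0, c, lt):
    """Evaluate the Boolean equations; returns n0 in both its forms."""
    ns1, ns0, nc, nlt = 1 - s1, 1 - s0, 1 - c, 1 - lt
    d = s1 & s0
    tot_ld = s1 & ns0
    tot_clr = ns1 & ns0
    n1 = (ns1 & s0 & nc & nlt) | (ns1 & s0 & c)
    n0_full = (ns1 & ns0) | (ns1 & s0 & nc) | (s1 & ns0)
    n0_min = ns0 | (ns1 & s0 & nc)
    return d, tot_ld, tot_clr, n1, n0_full, n0_min


def check():
    """True if the equations match the table and both n0 forms agree."""
    for bits in product((0, 1), repeat=4):
        d, ld, clr, n1, n0_full, n0_min = equations(*bits)
        if (d, ld, clr, n1, n0_full) != table_row(*bits):
            return False
        if n0_full != n0_min:
            return False
    return True


if __name__ == "__main__":
    print("equations match state table:", check())
```

The same exhaustive comparison confirms that the minimized form of n0 is equivalent to the form read directly from the table.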
Using techniques discussed earlier in the book, we convert the above equations into an equivalent gate-level circuit. This conversion is straightforward. The final sequential controller circuit and the datapath for the soda dispenser are shown in Figure C.8.
Figure C.8 Final implementation of the soda machine controller (left) with datapath.
C.3 UNDERSTANDING THE BEHAVIOR OF THE SODA DISPENSER CONTROLLER AND DATAPATH
In this section, we will look closely at how the controller and datapath we designed for the soda dispenser interact to form a working implementation of our initial high-level state machine.
Figure C.9 illustrates the behavior of the soda dispenser controller and datapath, including initialization and how the soda dispenser behaves when the user inserts a quarter into the system. The 5 clock cycles shown are labeled 1 through 5 in the figure. We'll assume that the cost of a soda can is 60 cents and that the soda dispenser's controller is in the Init state during the first clock cycle. Let's examine what occurs in each clock cycle:
Initially, in clock cycle 1, the controller is in the Init state, shown in Figure C.9(b). When in state Init, the controller sets d to 0, tot_ld to 0, and tot_clr to 1. Additionally, the controller sets the next state signals n1n0 to 01, corresponding to the Wait state. In the datapath, the values of tot and tot+a are unknown, denoted by "??". Notice that even though the controller set tot_clr to 1 during this clock cycle, the tot register will not be cleared immediately (asynchronously). Rather, tot will be cleared shortly after the next rising clock edge, a synchronous behavior. Finally, notice that the price of the soda, s, is set to 60 cents and the coin input signals, c and a, are initially 0 and 0, respectively.
Figure C.9(c) shows the soda dispenser in clock cycle 2. The controller is now in the Wait state. Accordingly, the controller sets d, tot_ld, and tot_clr to 0. The value of tot is cleared, and shortly afterwards, two signals, tot_lt_s and tot+a, take a known value. The datapath's comparator sets tot_lt_s to 1 since the total, 0, is less than the price of soda, 60. The datapath's adder sets intermediate signal tot+a to 0 since tot and a are now known. The next state signals remain set to 01 (Wait) since c is 0 and tot_lt_s is 1.
Figure C.9(d) shows the soda dispenser in clock cycle 3. During the third clock cycle, the user inserts a quarter into the soda dispenser, as indicated by c becoming 1 and a becoming 25. Shortly after a changes, the adder's output tot+a changes to 25, the sum of tot and a. Since c is 1, the controller sets the next state to 10 (Add). The values of d, tot_ld, and tot_clr remain the same since the controller's state has not changed since the previous (2nd) clock cycle.
In clock cycle 4, shown in Figure C.9(e), the controller is in the Add state and sets tot_ld to 1 while keeping d and tot_clr at 0. As was the case with tot_clr during the Init state, tot will not be updated until the next clock cycle. The controller will unconditionally return to state Wait, setting n1n0 to 01 (Wait).
Figure C.9 Soda dispenser operation from initialization to inserting a quarter: (a) timing diagram, (b)-(e) signal values during clock cycles 1-4.
In clock cycle 5, shown in Figure C.10, the controller sets d, tot_ld, and tot_clr to 0 since the controller is in the Wait state. The tot register loads the value of tot+a, storing 25. Shortly afterwards, tot+a changes to 50 to reflect the new value of tot; however, 50 is not loaded into tot, as tot will only perform a load synchronous to the rising edge of the clock signal.
The addition procedure demonstrated in clock cycles 3 through 5 is repeated for each coin inserted until enough change has been inserted to cover the cost of a soda, as indicated by input signal s.
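The cycle-by-cycle walk-through above can be reproduced with a small cycle-accurate simulation. The Python sketch below is an illustration under stated assumptions, not the book's implementation: the function names `controller` and `simulate` are hypothetical, `None` stands for the unknown value "??", and each loop iteration records the settled signal values of one clock cycle before applying the rising clock edge.

```python
INIT, WAIT, ADD, DISP = "00", "01", "10", "11"  # s1s0 encodings from Section C.2


def controller(state, c, tot_lt_s):
    """Combinational controller logic: returns (d, tot_ld, tot_clr, next_state)."""
    if state == INIT:
        return 0, 0, 1, WAIT
    if state == WAIT:
        if c:
            return 0, 0, 0, ADD
        return 0, 0, 0, (WAIT if tot_lt_s else DISP)
    if state == ADD:
        return 0, 1, 0, WAIT
    return 1, 0, 0, INIT  # DISP


def simulate(inputs, s=60):
    """Run one clock cycle per (c, a) pair and record the settled signals."""
    state, tot = INIT, None  # None models the unknown value "??"
    trace = []
    for cycle, (c, a) in enumerate(inputs, start=1):
        # Datapath combinational outputs (unknown while tot is unknown).
        tot_plus_a = None if tot is None else (tot + a) & 0xFF
        tot_lt_s = None if tot is None else int(tot < s)
        d, tot_ld, tot_clr, nxt = controller(state, c, tot_lt_s)
        trace.append({"cycle": cycle, "state": state, "next": nxt, "tot": tot,
                      "tot_plus_a": tot_plus_a, "d": d,
                      "tot_ld": tot_ld, "tot_clr": tot_clr})
        # Rising clock edge: registers update synchronously.
        if tot_clr:
            tot = 0
        elif tot_ld:
            tot = tot_plus_a
        state = nxt
    return trace


if __name__ == "__main__":
    # Cycles 1-5: idle, idle, quarter inserted (c=1, a=25), then a stays at 25.
    for row in simulate([(0, 0), (0, 0), (1, 25), (0, 25), (0, 25)]):
        print(row)
```

Recording the signals before updating the registers mirrors the synchronous behavior stressed in the text: tot_clr and tot_ld asserted in one cycle only change tot in the following cycle.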
Figure C.10 Operation of the controller and datapath: clock cycle 5 from Figure C.9(a).
Figure C.11 details the behavior of the soda dispenser when the user has inserted enough change into the machine to merit a soda being dispensed. In the timing diagram shown in Figure C.11(a), we duplicate clock cycle 5 from Figure C.9(a) as a point of reference. During the next few dozen clock cycles, we assume that the user has inserted a nickel followed by a quarter. As a result, the register tot will contain the value 55 (25 + 5 + 25 cents). Let's examine the behavior of the soda dispenser when the user inserts a dime into the machine:
In Figure C.11(b), corresponding to clock cycle 100, the soda dispenser's controller is in the Wait state. Assuming the user inserts a dime into the soda dispenser, the c input will become high for one clock cycle and the a input will change to 10, the value of a dime. Shortly after a changes, the intermediate signal tot+a changes to 65 (55+10). With c asserted, the next state signals n1n0 become 10 (Add).
In clock cycle 101, shown in Figure C.11(c), the controller is in the Add state and asserts tot_ld. The register tot will not load a new total until the rising edge of the next clock cycle. The controller unconditionally sets the next state to 01 (Wait).
Figure C.11(d) shows the status of the soda dispenser in clock cycle 102, where the controller is in the Wait state. As indicated by the arrows in Figure C.11(a), tot_ld being asserted on the rising edge of the clock causes tot to load the value on its input, which is 65. Shortly after tot loads a new value, the comparator's output tot_lt_s changes from 1 to 0 to reflect the fact that tot (65) is not less than s (60). Since the controller is in the Wait state, and since both c and tot_lt_s are 0, the controller sets the next state signals to 11 (Disp). Notice that prior to the next state signals settling on the Disp state, the next state was Wait for a brief period of time. Depending on the time required for signals to propagate through the datapath and controller, certain signals may initially contain unexpected values, but signals will eventually settle to their expected values. We can avoid any problems associated with this period of uncertainty by selecting a clock period that is long enough to allow our circuit's intermediate signals to settle into a stable state and stay stable long enough to comply with any setup times required by our circuit's sequential components.
In Figure C.11(e), the controller is in the Disp state. The controller asserts d, indicating to some outside component that a soda should be dispensed. The controller will unconditionally transition to the Init state, where the initialization procedure shown in Figure C.9 is repeated (partially shown in clock cycle 104 of Figure C.11(a)).
We see that the controller and datapath work together to implement the behavior of the original high-level state machine.
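The dispensing sequence can also be traced using the gate-level equations of Section C.2 in place of a behavioral description of the controller. The Python sketch below is a hedged illustration: the function names `controller_logic` and `run` are hypothetical, the simulation starts in the Wait state with tot already holding 55, and it assumes the input a stays at 10 after the dime is inserted.

```python
def controller_logic(s1, s0, c, lt):
    """Controller equations from Section C.2: returns (d, tot_ld, tot_clr, n1, n0)."""
    ns1, ns0, nc, nlt = 1 - s1, 1 - s0, 1 - c, 1 - lt
    d = s1 & s0
    tot_ld = s1 & ns0
    tot_clr = ns1 & ns0
    n1 = (ns1 & s0 & nc & nlt) | (ns1 & s0 & c)
    n0 = ns0 | (ns1 & s0 & nc)  # minimized form of n0
    return d, tot_ld, tot_clr, n1, n0


def run(s1, s0, tot, inputs, s=60):
    """Simulate one cycle per (c, a) pair.

    Each log entry is (state, next state, tot, tot+a, tot_lt_s, d).
    """
    log = []
    for c, a in inputs:
        lt = int(tot < s)                  # comparator output tot_lt_s
        tot_plus_a = (tot + a) & 0xFF      # 8-bit adder output
        d, ld, clr, n1, n0 = controller_logic(s1, s0, c, lt)
        log.append((f"{s1}{s0}", f"{n1}{n0}", tot, tot_plus_a, lt, d))
        # Rising clock edge: update the tot register and the state register.
        if clr:
            tot = 0
        elif ld:
            tot = tot_plus_a
        s1, s0 = n1, n0
    return log


if __name__ == "__main__":
    # Clock cycles 100-104: a dime is inserted in cycle 100.
    cycles = [(1, 10), (0, 10), (0, 10), (0, 10), (0, 10)]
    for n, entry in enumerate(run(0, 1, 55, cycles), start=100):
        print(n, entry)
```

This sketch records only settled values, so the brief period in cycle 102 during which the next state still reads Wait does not appear; modeling that would require propagation delays, which are outside the scope of this illustration.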
Figure C.11 Soda dispenser operation when sufficient change has been inserted: (a) timing diagram, (b)-(e) signal values during clock cycles 100-103.
I nd ex
=.
SdlSpla) , tatement.
6 HC II microprocessor. 2 1
subserie- ICs.
subseries ICs.
,ubseries ICs.
series ICs.
series ICs.
8051 microprocessor. 21. 422
A
Abe." component (AL-extender). 203
Abo\e-mirror display (example):
with 16 32bi l regi sters. 20-+-207
"ith 16,32 regi ster file. 208
\\ Ith parallel-load registers. 155- 156
with shift registers. 159. 160
with up-counters. 183
Absorption Law. 501
Abstraction (in RTL design). 276
Access time (RAM). 263
Active-high input. 136
ACllve-Iol> input. 137
Actuator. 9
Adaptive cruise comro!. 237
Adder(,). 165- 173. 197
building a SUblI3clor using. 197-200
carry-lookahead. 33.\-342
carry-ripple. 166-173.339-340.468-471
carry-;elect.
creallng faster. 333-343
deSIgn examples using. 171-173
.+-bit carry-ri pple. 169-171
full-. 168-169
h.lf-. 167-168
"bu. 165-166
t"'o-Ievel logIC. 334
Adder tree. 215
add ,",tructlon. 434
Addlllve Identu} elcment. 497
Addll1ve '<lund. 211
Addre" (for reg"ter). 205
"'L-extender. ;ee Anthmellcflogic extender
Algebr""1
of logic, 504
of sets. 504
switching, 496. 497
Algebraic methods, in two-level logic size optimi zation.
296-298
Algorithms:
Espresso tool in. 3 15
exact. 308
selection of. 356-357
for state reduct ion. 3 19
Algorithmic state machines (ASMs), 233
Ahemalive minimum-bidwidth binary encoding, 323-324
ALUs. see Arithmetic-logic units
always procedure, 453-454
Amperes, 3 1
Analog ci rcui ts. 5
Analog phenomena, encoding of. 9
Analog signals, 4
Anal og-la-digital converter, 9
AND gates. 43-44, 404-407
AND operator, 38-40
Appli cat ion-Specific Integrated Circuits (ASICs).
38G-388
cell arrays. 383
FPGAs vs., 40 1
gate arrays, 38 1-382
implementing. using NOR gates. 386-388
impl ementing, using onl y AND gates, 384-386
standard cell s. 382-383
structured. 383. 408
Architecture. 447
Arithmetic:
fixed point. 508-509
Roati ng point. 5 13
Arithmeti c/logic extender (AL-extender). 202-203
Arithmeti c/logic instructi ons. 439
Arithmeti c-logic units (ALUs). 20 1-203
multi -function calculator using. 203
operati on. 423-424
ARM microprocessor. 2 1. 422
Arrays. Sec also Field programmable gate arrays (FPGAs)
cell ,383
gate, 38 1-382. 389
programmable logic. 407
ASCII. 10
ASICs, see Appli cati on-Spec ifi c Integrated Circuits
ASMs (a lgorithmi c state machines). 233
Assembler programs, 430
Assembl y code. 431
assert (term). 137
assert statement s, 456, 458-459
Assoc iative propert y. 50. 50 I
Asychronous circuits. 102
Asychronous inputs. 133
Asychronous reset inputs, 135
Asychronous set inputs, 135
Atria (of heart), 138, 139
Audio, digit ized. 6-8
Audi o recordi ng, 5-7
Automation
with Quine- McCluskey method, 3 11 -3 12
of two-level logic size optimi zation, 308-3 15
B
Bardeen. John. 33
Basestati ons (cell phones), 279-28 1
Base ten. 11 - 12
Basic input/output system (BIOS), 431
Ba ic SR latch. 97-99
Beamforrners. 210-213
princi pl e of. 210-2 11
in ultrasound machines. 2 12-2 15
Behavioral-level design. 254-258
Bell . Alexander Graham. 8
Bell Laboratori es. 33
Bell Telephone. 8
BeltWam circuit . 387-388
B-frames. see Bi directional predi cted frames
Biased exponent. 5 10
Bi di rectional predicted frames ( B-frames), 363-364. 369
Binary numbers. 11- 17.505
Binary number systems. 05-5 13
fi xed point ari thmetic in. 50 -509
Roating point represcntation in. 509-513
real number represent ati on in. 505-507
Binary poi nt. 506
Binary rcprc cnl3li ons. 4
Binary sear h. 357
BIOS (basic input/output ,y tem). 431
Bit. 4
Bit file, 399
Bit storage. 96. ec also <pecific types. e.g.: R Intches
Bit wise opcrntion. I
Blinking li ght- (10 computcrs).
Block symbol. 152
Board game,. computcn/cd. 157
Boole. George. ),
BOOlean algebra, 38, 47-55 496-504
e.valualing expressions in '48-49
hterals in. 50 .
operators in. 38-39, 48-49
product terms in, 50
Properties in. 50-55
sum-of-products in, 50
S
W
llchmg, 497-498
terminology, 49-50
theorems in. 498-503
Variables in, 49
Boolean functions, 55-{i7
canonical form, 63-{i5
ClrCUlls fO.r 56
and circuits, 65
conversIon of. 58-{j()
defined, 55
Index
equations for representing, 56
truth tables for representing, 56-58, 62-{i3
BOOlean logIC gates, see Gate(s)
?perators, see Operator(s)
Boollng ' computers, 43 1
Brattain. Walter, 33
Buffers, 206, 272
Bus (i n register files). 206
Bus interface, 238-241
Bus protOCol, 239
Button press synchronizer (example). 123-124
Button sensor. 10
c
C (program language), 19-20, 254-258.388
C++ (program language), 254. 258. 388
Calculators, 200
Calculus. propositional. 504
Cameras. digital. 22-23
CAN (controller area network). 160
Canonical form (Boolean functions). 63-65
Capture (step in combinational logic desiga). 67-{i9 ..
Carry-lookahead adders. 334-342
efficient example. 336-339
half-adders in. 337-339
hierarchical .
inefficient example. 335-336
Carry-ripple adders. 166-173
in dntapath component description. -171
8-bit. 173
-l-bit. 169-17_
fulladders. 168-169
half-adders. 167-168
and hierarchical arry-lookahead adders.
Carry-ripple style magnitude comparator. 17 -I 0
ClUT)-sclc t adders.
528 Index
Cas.ell e '"pes. 5- 6
Cell arrays. 383
Ce ll s (cell phone region<). 279
Cell s. standard (ASIC), 382- 383
ellul ar telephones. 7. 279- 284
components of. 28 1- 284
voice qua lity on. 25 1
Ccb ius, 175
han ncb (in transducers). 2 10
Checkerboard. comput cri led (exampl e). 156- 158
Chips. Sili con chi ps
CincxI componenl (AL-cxlcndcr).

analog. 5
asychronous. 102
and Boolean functi on" 56
building. using gmcs. 44-l7
clock divider. 187
combinati onal. 30, 65. 85. 95
crit ical path in. 252- 25-1
defi ned. 22
digit al. 4-5.2 1- 22. 38-10. 2 13- 2 15
integrated. 33-35
mathemati cal formali sms in design. 130
and notati on simplifi cati on. 69- 72
paniti oning. among lookup tables. 390-394
sense amplifi er. 26 1
sequential. 30. 85- 86. 95
simplifying drawings of. 130
state of, 95
synchronous. 102
CLBs. see Configurabl c logic bl ocks
Clear inputs. 134
Clock di vider. 187
Clock frequency, 103.25 1- 254
Clock gating, 358-360
Clock signal. 102- 105
Clock skew. 359
CMOS transistors. 35-37, 41. 42. 357-358
CMY color space. 192- 194
CMYK color space. 194
Codecs.409
Code detector (example). 11 7- 11 8, 129- 130
Color pace convener-- RGB to CMYK (example). 192-
194
Combinati onal circui ts, 30. 85
multiple-output , 65
output of, 95
Combinational logic descripti on:
gate behavior in. 452-455
structure in. 447-452
test benches in, 455-459
us ing hardware languages. 447-459
Combinati onal logic design, 67-72, 168- 169
Combinati onal logic optimi zati on, 296-317
multil evel logic optimizati on, 3 15-3 17
two- level logic-size optimi zati on. 296-3 15
Combining le nns 10 eliminate a vari able, 297
Combl ogic process, 455, 464-466
Communi cati on:
serial , 160
wireless. 161
Commutative propeny, 50. 498
Comparator(s). 177- 181
equalit y, 177- 178
exampl e using, 180-18 1
magnitude, 178- 180
Compensating wei ght scale (example), 173
Complement (s).4 . 194-1 97, 497
defined, 195
existence of, 499
unique, 499
Compl ementati on. 499
Complement propeny. 51
Compl exit y, managing (RTL design), 275
Complex programmable logic devi ce (CPLD), 407-408
FPGAs vs .. 407-408
SPLDs vs .. 407-40
Component all ocation, 349- 350
Compre sion. 7
and computation of ratios in video, 364, 367, 368
in digital video. 363- 369
quantization in, 366-367, 369
and transforming to frequency domain. 364-366
Computers, 4
with blinking lights, 430
booting. 43 1
Computerized board games (example), 156-158
Computer monitors. 192
Concurrency (i n RTL design), 348-349
Concurrent computat ion, 354-355
Conductors, 36
Configurable logic blocks (CLBs)
grid of. in FPGAs, 398- 399
output configuration memory in, 399
as programmable ICs, 396-398
Configuration (in RTL design), 245
Configuration memory, 398, 399
Congestion, 204
Constants, 434, 497
Constructor functions, 451-452
Control-dominated design, 247
Control input, 3 1, 32. 150. See also Gate(s)
Controll er(,). III. 11 9- 130. 135- 140
behavi or of. in soda machine di spenser exampl e, 519-
525
common pitfalls with, 128- 129
connecti on of dot apath to. in RTL design. 236
defined. III
deri vati on of FSM for. 237. 238
de ign exampl es using, 116-117, 120-1 21. 123- 127
design of. in soda machine di spenser example. 516-5 18
design process for, 120, 126
and implementation of FSMs. 122
initial state of. 135- 136
in laser-based di stance measurer example, 480-491
in LED module, 4 14-4 16
negative logic in. 136-137
output glitches in. 136
in pacemakers. 138- 140
in equential logic description. 463-466
tandard architecture for, 119
Controll er area network (CA ), 160
Control unit. 424-428
in six- instruction programmable processors, 435-437
for three-instructi on programmable processors, 432-434
Conversion(s). 58
among Boolean functions. 58- 60
from any base to any ot her base, 15- 16, 60
from binary to decimal, 12
from circui ts to equati ons. 58- 59
from circuits to truth tables, 60
decimal to binary, 13- 15
from equations to truth tabl es. 59
as step in combi national logic design, 67- 69, 72
from truth tables to circui ts, 60
from truth tables to equations, 60
Convener(s):
analog-to-digital. 9
digital -to-analog, 9
of FSMs to circui ts, see Controller(s)
RGB to CMYK (example), 192-194
Core, 41 I
Cosine waves, 364-366
Counters, 181 - 188
down, 18 1, 183
exampl es usi ng, 183, 184, 186- 188
N-bit, 18 1
parallel load, 185-187
as timers, 187
up, 181-183
up/down, 184
Cover (term). 309
CPLD. sec Complex programmable logic device
Critical path (in circuits), 252-254, 317, 333
rui e control, adapt;'c, 237
Crystals. pielOClectnc, 210
CUlT'<nt (teon), I
Currentstate signal , 46 66
Custom digital circult<, .1 - 22
Cyc\c, clock, 10
o
0313 communi Ali n. 161
Data-dominated design, 247- 250
defined, 247
example using, 248- 250
Data input, 150
Data memory, 423
Data movement in\ lructiOIl\, 439
Datapath, 423 24
Index 529
COntroller to, In RTL 236
of, In RTL de'ign, 2 236
In laser-based di stance measurer (example), 480-49 1
for programmable procco". 422 24
:n Six-instruction progrnmmable proce,IO"', 435 37
r" soda machine di ' penICr (example), 519- 525
or three-instruction programmable proce"or , 431-4
Datapath component description:
and carry-ripple adde"" 468-47 1
and full -adders, 467-468
up-Counters in, 471-475
usi ng hardware lunguages in, 467-475
Datapath components, 151
and faster adders tradeoff, 333- 343
and smaller multipliers tradeoff, 343- 345
Datapath operntions, 423-424
OCT. see Discrete cosine lransfonn
Detr. sec Local registers
Debugging, 33
Decimal point, 506
Decimal to binary conversion:
di vide-by-2 method, 14-15
subtraction method, 13-14
Declaration(s):
enum, 465
process, 452-453
type, 463
Decoders, 77-79, 395
Decoding stage, 426-427
Decrement (in counters), 181
Decrementer, 183
Deep Blue (computer), 157
Delay (i n gates), 85
Delay circuits. 213-214
DeMorgan's law, 52, 502, 503
DemUltiplexers, 85
Dequeue, 272
h
b
530 Index
Dc,igner proli le,. 29. 9-1. 22-1. 293. 377- 378.444
logic. 67-72
and circuli notal ion". 69- 72
,Iep' in. 67- 69. 72
DC!'Iign proce ..... :
ror cOlll roll er>. 120. 126
for 163
Detector 17-19.21
Dcterior:lIion. 6
D nipli op,. 103- 109
edge-Iri ggered. 10 107
-I-bil. 109
and It!vcl-!"cn:, iti vc D latch. 103- 1 ().l
Di gital camcm .... 22- 23
Di gi lal circuit,. -1--5. 21- 22. 38--10. 213-2 15
Digital filter. 2-l 8. See ;l lso Finite impulse rc"pon'\c
fihers (FIR)
Digil:ll phenomena. encoding of. 9- 10
Digilal , ignal procc;<i ng/proccsso" (DSP). 213. 28-1
Digiwl +-7
Digilal sound recorder (exampk). 26+-265
Di gi lal ,yslems. 4. 17- 18
Digital telephone an!o.wcring machine (exampl e). 270-27 1
Di gital thermometer converter (example). 175
Digital -Io-analog converter. 9
vi deo. 2-l-l
Di gital video di scs (DVDs). 36 1-363
Di gilal video player/recorders. 36 1- 370
compression in. 363-369
di screle cosine Iransrorm in. 36+-367. 369
and DVDs. 36 1-363
and hurrman coding. 367-369
MPEG-2 encoding and. 363-366. 369-370
Di giti zed audio. 6-8
Digililcd pictun.:s. 8
Di giti zed video.
DIP, see Dual Inline Package switch
DIP-switch-based calculator (examples):
adding. 171 - 172
adding/sublracling. 191-192. 198
multi-runction wi thout using ALUs. 20 1
using ALU. 203
Di screle cosine transrorm (DCT). 364-367. 369
Di screte transistors. 33
Di spl ay Slalemenls. 457
Di sp stale. 330
Di stance measurer. laser-b3sed. see Laser-based
distance measurer
Di stribuli ve propeny. 50. 498
Di vide- by-2 melhod. 14-15.505
Di vide-by-n melhod. 15- 16
D lalch. 103- 106
maSler. 105- 106
,"rvanl. 105- 106
Don'l care inpul combinati ons. 305- 307
Down-counlers. 181. 183
dowTlto \;\ tatcment. 459
Drain (OUlpUI ). 35. 36
DRAM. see Dynami c random access memory
Driver>. 206
DSP. ,ee Digilal signal processing/proces.ors
Dual lnline PaCkage (DIP) swilch. 171- 172.402
Dualil Y. principle or. 499
Dual -poned regi'ler filc, 208
DVD<. sec Di gi lal video di,cs
Dymunic microphone. 5
Dynami c power. 358
Dynami c random access memory (DRAM). 262-263.271
E
EchoDelay circuilS. 214
Economy or scale. 200
EDA (eleclroni c design automalion). 409
Edge-triggered D liip-nop . 10-1--107
defined. 105
musler/servanl design. 105- 106
EEPROM. sec Eleclri cally erasable PROM
8-bi l carry-ripple adders, 173
Electri call y erasable PROM (EEPROM). 268-269. 27 1
Electronul.gncti sm. 5
Eleclronics. 31
Electronic design automalion (EDA). 409
Eleclroni c focusing (or sound). 21 I. 2 12
Embedded syslems. 4
Enable (decoders). 77
Enable inpul. 101
Encoders. 85-86
Encoding. 9- 13
of anal7,g phenomena. 9
or digilal phenomena. 9- 10
emropy. 368
huITman. 367-369
minimum-bilwidth binary, 323-324
MPEG-2.363-366
of numbers. 1(}-13
one-hOI. 324-326
OUlpUI. 327-328
run-Ienglh. 367. 369
in sequenlial logic opl imi zat ion. 323- 328
E lAC (compuler). 33
Enqueue. 272
emily declaration. 447
Entropy encoding. 368
enum declaration, 465
cnum stiltcmcnl. -t65
EPRO I. see Erasable PROM
Equali lY comparator. 177- 178
Equalions. 56
Equivalenl slales. 318
Erasable PROM (EPROM). 267-26
Espresso (heurislic (001). 315
E semial prime implicanl. 309-3 10
Exacl algorilhm. 308
Excalibur plalrorm (All era). 409
Execuling slage. 426-427
Exi tence:
or addili ve idenlit y elemenl. 498
or complemenl . 499
of mulliplicalive idenlilY elemenl. 497
Expanding (Ierm). 309
Expand operal ion. 3 13
Exponenl. biased. 510
F
Fabricalion planl (rab). 380
Fahrenheil. 175
Falling edge-Iriggered flip-fl ops. 107
Fanoul. 204
Fa. I Fourier Transrorm (FFT). 364
Feedback. 96-97
Felchng slage, 426-427
FFT (Fasl Fouri er Transrorm). 364
Field programmable gale arrays (FPGAs). 377. 388-401
archlleclure or. 398-40 I
AS ICs vs .. 40 1
configurable logic blocks with, 396-398
CPLDs vs .. 407-408
lookup tables wilh. 389-394
microprocessors vs .. 40 I
programming or. 399-400
SPLDs vs .. 407-408
swilch malrices wi lh. 394-396
FIFO (firsl-in firsl-oul). 272
FIFO queues. 272
Fillering (in digi lal signal processing). 282
Finile impul se response fill ers (FIR). 282-284
wilh clock galing. 359-360
example using. 248-250
and pipelining. 347
using operalor scheduling. 352-354
Finile inducli on, 503
Finile-slate machines (FSMs), 11 3- 119. 128-130
behavior in, 11 8- 1 19
comroll er archi lecture ror. 11 9
convening circuillo, 126- 127
wi lh data (FSMD). 230
defined. 11 4
Indo
derivulion of
d . . Or Comroller '>7 '18
,,-<ample, u' '"g. Ils-i ' ls' i -,
enly Iype. 32 _ 3 . - 110
loore Iype. 32 ). J
n.ondclcnnini lilic. 128
'linplirying n 101' ,
FIR fiI : 11 5-1 16. 130
ICrs. sec FlOlie 1I1l.pul
Firsl-in firsl -oul (FIFO) 'e rc'llOl"C lille"
Firsl-i n fi"l - . . 272
Fi"'l . , OUI (FiFO) queue,. 272
. pll (slale redUCllon) 1 0 )11
llrilhmeli .508- 50<) --
Hush memory. 269 .
F1lghl 'lIIend'''" c 11 b
Ii ,a - Ullon (cxnmple) 10K
Ip- 0ps. 96-111.130-135
cl ock signnl, '". 102- 10)
D. 103- 109
and D latche.. 103- 104
and reedback in bil Slomgc 96-97
lK.131 ' .
IDlche. vs.. 107
behavior in, 131- 134
and r:glSlers in bil Slomge. 109-1 II
resel mpul ' in. 134-135
sel inpUIS in. 135
SR. 108. 131
and SR Inlches. 97- 101
T. 131
F1oOling-poin! ari lhmelic. 513
poin! numbers. 510
poinl rcpresen!a!ion. 509- 513
Fioallng poinl unit. 513
Flops, 108. Sec also F1ip-lIop,
Flow-Of-conlrol inslruclion" 440
Focusll1 g (of sound). 21 1, 212
4-bil carry-ripple adders, 169-172
4-bH D liip-Hops. 109. Sec also Regisler(s)
FPGAs. see Field programmable gale arrays
Frames. 241.361.363-364.369
Frequency:
cl ock. 103. 251-254
sound waves. 210
FSMs. see Finile-slale machines
Full-adders. 16&-169.467-468
Full-cuslom ICs. 379-380
Fuse-based programmable ROM. sec One-lime
programmable (CYrP) ROM
G
GAL (generic array logic). 407
Games, compulerized board. 157
Gale(s), 35. 36.41-44.73-76
A D.43-44
building circuilS u ing.44-47
53 1
t
532 Index
('Olilirllled)
and combinational behavior.
with. 85
and FPGA,. 400--l01
10\\ -power. on noncriti cal paths. 360
NA D.73-75
NOR. 73-75
NOT. 42
number of possible. 76
OR. 42-13
unhersal. 75
XNOR. 74. 75
XOR. 74. 75
Gate arrays. 3 1-382.389. See also Fi eld programmable
gate arrays (FPGAs)
Gating. clock. 35 -360
General-purpose processors. -1.21 . See also Programmabl e
processors
Generate (i n carry-lookahcad adders). 338. 340-34 1
Generator(s):
I Hz pulse generator (example). 183. 186-187
sequence generator (example). 124- 125. 327-328
Generator. sequence. see Sequence generator
Generic array logic (GAL). 407
Generic variables. 503
GHz (gigahenz). 103
Giant video display (product profil e). 4 12-4 16
Gigahenz (GHz). 103
Glitcheslgli tching. 100. 136
Google. II
H
Haitz's law. 413
Half adders, 167-168
in carry-lookahead scheme, 337-339
implementing on a gate array (example), 382
implementing sum circuit using NAND gates (example), 385
implementing sum circuit using NOR gates (example), 386-387
implementing using standard cells (example), 383-384
Hardware description languages (HDLs), 446-447
Hardware languages:
in combinational logic description, 447-459
in datapath component description, 467-475
in register-transfer level (RTL) design, 475-491
in sequential logic description, 459-466
HDLs, see Hardware description languages
HDTV (high-definition TV), 94
Heart, human, 138
Hertz (Hz), 103
Heuristics, 308, 313-315
Espresso tool in, 315
iterative, 312
Hexadecimal numbers (hex), 16-17
Hierarchical carry-lookahead adders, 339-342
Hierarchy (in RTL design), 275-278
High-definition TV (HDTV), 94
High impedance, 239
High-level state machine(s), 229-233
in laser-based distance measurer (example), 475-480
and Moore vs. Mealy, 354
Highway speed measuring system (example), 187-188
Hold time (in flip-flop inputs), 131, 132
Huffman coding, 367-369
Hz (hertz), 103
I
ICs, see Integrated circuits
Idempotent Law. 52. 500
Identity comparator, see N-bit equality comparator
Identity elements, 497
Identity property, 50
I-frames, see Intracoded frames
If- then-else statements. 255-256
If-then statements, 255
Impedance. high. 239
Implementation(s):
physical , see Physical implementation
as step in combinational logic design. 67-69, 72
two-level logic. 67
Implicant (term), 309
Implication tables, 318-322
Improvement, iterative, 312
Increment (counters). 181
Incrementer, 182-183
Inductance, 188
Induction:
finite, 503
perfect, 498
Inductive loop, 188
Initial state (controller), 135-136
Init state. 330
Input(s):
active-high, 136
active-low, 137
asynchronous. 133, 135
clear, 134
in combinational logic description, 450
conditions, 114
control, 150
data. 150
enable. 101
reset, 134-135
synchronous, 134-135
Input/output extensions (programmable processors), 440
Instantiation (in RTL design), 234
Instructions, 425-428. See also specific instructions
arithmetic/logic, 439
data movement, 439
flow-of-control, 440
Instruction memory, 425
Instruction register (IR), 426
Instruction set:
programmable processors, 434-435
m processors. 428-431
Instruction set extensions (programmable processors), 428, 439-440
Insulators. 36
In-system programmable EPROMs. 268
Integrated circuits (ICs), 33-35
full-custom, 379-380
semicustom (ASICs), 380-388
Integrated circuit (IC) technology(-ies), 379-412
CPLDs as, 407-408
FPGA as, 388-401
FPGA-to-ASIC conversion as, 408
manufactured, 379-388
and Moore's Law, 412
off-the-shelf SSI ICs as, 401-404
and processor varieties, 410-411
programmable, 388
relative popularity of, 409
SOCs as. 408-409
SPLDs as. 404-407
tradeoffs among. 409-410
Intel, 21
Intracoded frames (I-frames). 363-364, 369
Inverse. 48
Inverters. 42
Involution Law, 52, 501
IR (instruction register), 426
Irredundant operations, 315
Iterate (term), 313
Iterative improvement, 312
J
Java (program language). 254. 258
JK flip-flops, 131
jump-if-zero instruction, 435-437
K
Keys, secure car (example), 116-117, 125-126
Keyboards, computer, 71
Kilohertz, 210
K-maps:
four-vari able. 302-303
three-variable, 298-299
and two-level logic size optimization, 298-306
L
Lands (on DVDs), 362
Laser(s):
for surgery. 112
in three-cycles-high laser timer (example), 111-112, 115, 120-122, 324, 326
Laser-based distance measurer (example), 230-238
connecting the datapath to a controller in, 236
datapath in, 234-236, 480-491
derivation of controller's FSM in, 237, 238
high-level state machine in, 475-480
Latches, 97-101
basic SR, 97-99
flip-flops vs., 107
level-sensitive D, 103-106
level-sensitive SR, 99-101
Latency (in pipeline registers), 347
Layout (of transistors on chips), 380
LCoS (Liquid Crystal on Silicon) chip, 94
LED, see Light-emitting diode
Level-sensitive D latch, 103-104
Level-sensitive SR latch, 99-101
Lights, blinking, 430
Light-emitting diode (LED), 171-172, 412-416
Light sensor, 10
Light sequencer (example), 184
Linear search, 356
Liquid Crystal on Silicon (LCoS) chip, 94
Literals, 50, 296-298
load-constant instruction, 434-435, 437
Loading (data), 151
load instruction, 428-431, 434
Load operation, 423-424
Load/shift registers. 160-163
Load-store architecture. 424
Local registers (Dctr), 232-233
Logic:
next-state. 329
output. 329
Logic block, configurable (CLB), 396-39
Logic gates, see Gate(s)
Logic IC, 40_
Lookahead (in computer games), 157
Lookup tables, 389-394
examples using, 392-394
partitioning a circuit among, 390-394
Low-power gates, 360
LT 1000 ventilator, 2, 3
M
MAC (multiply-accumulate) unit, 353
code.
Magnetic RAM (MAGRAM), 271
Magnitude comparators, 178-180
MAGRAM (magnetic RAM), 271
Mantissa, 510
Manufactured integrated circuits (ICs), 379-388
ASICs, 380-388
full-custom ICs, 379-380
Mark II (computer), 33
Mars Climate Orbiter, 175
Mask-programmed ROM, 266
Master latch, 105-106
Maxterm. M
Mealy FSMs, 328-333
example using, 331
high-level state machines, 354
with Moore FSMs, 332-333
timing issues in, 331-332
Mean time between failures (MTBF), 134
Medium-scale integration (MSI), 34
Megahertz (MHz), 103, 210
Memory, 111. See also Sequential circuits
configuration, 398
data, 423
flash, 269
instruction, 425
MxN, 258
nonvolatile, 265
random access (RAM), 259-265
read-only (ROM), 265-271
in RTL design, 258-271
volatile, 265
Metastability, 131-134
Metastable state, 132
Meucci, Antonio. 8
MHz (megahertz), 103, 210
Microphones, 5, 210
Microprocessors:
defined, 18
digital circuits in, 4-5
FPGAs vs., 401
software in, 18-21
Minimum-bitwidth binary encoding, 323-324
Minterm, 63, 308
MIPS microprocessor, 21, 422
Mnemonic instructions, 430
Module:
in combinational logic description, 450
in LEDs, 414-416
SC. 450-452
Monitor(s):
RGB, 192
in ultrasound machines. 213
Moore. Gordon. 34
Moore FSMs. 328-333
high-level state machines, 354
with Mealy FSMs, 332-333
Moore's Law, 34, 35, 412
MOS (term), 37
Motion-in-the-dark detector application, 17-19, 21, 440
Motion sensor, 9
Motorola, 21
MP3 format, 7
MPEG-1, 363
MPEG-2 encoding, 363-366, 369-370
MSI (medium-scale integration), 34
MTBF (mean time between failures), 134
Multifunction registers, 160-163
Multilevel carry-lookahead adders, 342
Multilevel logic, 360
Multilevel logic optimization, 315-317
Multiple bit storage, 109-111
Multiple-output combinational circuits, 65
Multiplexers (muxes), 79-83
internal design of, 79-80
N-bit Mx1, 81-82
Multiplicative identity element, 497
Multipliers:
in beam formers, 215
in binary numbers, 189-190
sequential, 343-345
Multiply-accumulate (MAC) unit, 353
Multi-ported register file, 208
Muxes, see Multiplexers
MxN memory, 258
MxN register file. 204
N
NAND gates, 73-75, 384-386
Nanosecond (ns), 100
Nanowatts, 360
N-bit adders, 165-166
N-bit arithmetic-logic units, 201
N-bit barrel shifters, 176
N-bit counters, 181
N-bit equality comparator, 177-178
N-bit magnitude comparators, 178-180
N-bit registers, 151
N-bit shifters, 174
N-bit subtractors, 190-191
Negative edge-triggered flip-flops, 107
Negative logic, 136-137
Negative numbers, representing, 194-197
Network router, 92
New Year's Eve countdown display (example), 186
Nexperia platform (Philips), 409
Next-state logic, 329
nextstate signal, 463-466
nMOS transistors, 35, 36, 42-44, 73
Noncritical paths, 360
Nondeterministic FSM, 128
Non-ideal behavior (in flip-flops), 131-134
Nonrecurring engineering (NRE), 200, 380
Nonvolatile memory, 265
Nonvolatile RAM (NVRAM), 271
NOR gates, 73-75, 386-388
Normalized numbers, 510
Notation(s):
in Boolean algebra, 48-49
simplifying circuit, 69-70
simplifying for FSMs, 115-116, 130
NOT gates, 42
NOT operator, 38-40
NRE, see Nonrecurring engineering
ns (nanosecond), 100
Null elements, 52
Numbers:
binary, 11-17
encoding of, 10-13
hexadecimal, 16-17
octal, 17
representing negative, 194-197
subtractors for positive, 190-191
NVRAM (nonvolatile RAM), 271
NxN multipliers, 189
O
Octal numbers. 17
Off-set, 308
Off-the-shelf logic (SSI) IC.
Ohm's Law, 31
1 Hz pulse (example), 183, 186-187
One-hot encoding, 324-326
One's complement, 196
One-time programmable (OTP) ROM, 267, 405
On-set, 308
Opcode, 429
Operands, 429
Operation(s):
bitwise, 201
expand, 313
irredundant, 315
reduce, 315
Operation code, 429
Operator(s):
AND, 38-40
in Boolean algebra, 38-39, 48-49
NOT, 38-40
OR, 38-40
Operator binding, 350-351
Operator scheduling, 351-354
Opticom system, 188
Optimal solution, 308
Optimization(s), 294-296. See also Tradeoff(s)
and algorithm selection, 356
combinational logic, 296-317
criteria for, 294, 295
at higher vs. lower design levels, 355
multilevel logic, 315-317
power, 357-360
RTL design, 345-354
sequential logic, 317-333
two-level logic size, 296-315
OR gates, 42-43
OR operator, 38-40
Orthogonal implementation features, 410-411
Oscillation, 99-100
Oscillators:
defined, 102
quartz, 102
in sequential logic description, 461-463
OTP ROM, see One-time programmable ROM
OutDelay, 213-214
Output(s), 31, 32
in combinational logic description, 450
reading, 246
reg, 453, 454, 460
Output encoding, 327-328
Output glitches, 136
Output logic, 329
Overclocking (in PCs), 253
Overflow detection, 198-200
P
Pacemakers, 137-139
PAL (programmable array logic), 407
Parallel load counters, 185-187
registers, 151-152, 160-161
rdluUomng. _3 -
PC (program counter), 426
PCI, see Peripheral component interface
Pentium microprocessors, 21
Perfect induction, 498
Performance (in digital systems), 295
Performance extensions (programmable processors), 441
Period (clock signal), 103
Peripherals
Peripheral component interface (PCI), 141
P-frames, see Predicted frames
Physical design, 387
Physical implementation, 379-417
alternative technologies for, 401-409
comparing technologies for, 409-412
of giant video display, 412-416
and manufactured IC technologies, 379-388
and programmable IC technologies, 388-401
PIC microprocessor. 21. 422
Pictures, digitized, 8
Piezoelectric crystals, 210
Pipeline registers. 346
Pipelining, 345-347
Pixels, 192, 361
Placement (in chip components), 387
PLAs (programmable logic arrays), 407
Platform SOCs, 408-409
PLD. see Programmable logic device
PMOS transistors. 37. 42-44. 73
Pop (in queues). 272
Port(s):
in combinational logic description, 447
read, 205
write, 205
Positive edge-triggered flip-flops, 107
Positive numbers, subtractors for, 190-194
Power:
in digital systems, 295
dynamic, 358
Power optimization, 357-360
PowerPC programmable processor, 422
Precharging (RAM bit storage), 261
Predicted frames (P-frames). 363-364, 369
Preset (asynchronous set). 135
Prime (term). 48
Prime implicant. 309
Printers. 192- 194
Priority encoders, 86
Process declaration, 452-453
Processor(s):
defined, 225
digital signal, 213
single-purpose, 421
superscalar, 441
Very Large Instruction Word (VLIW), 441
Product. 48
Product-of-maxterms form, 64
Product profiles:
cell phones, 279-284
digital video player/recorders, 361-370
giant video display, 412-416
pacemakers, 137-139
ultrasound machines, 209-216
Product term. 50
Program, 421, 425
Programmable array logic (PAL), 407
Program counter (PC), 426
Programmable integrated circuit (IC) technology, see Field programmable gate arrays (FPGAs)
Programmable interconnects, 394-396
Programmable logic arrays (PLAs). 407
Programmable logic device (PLD). 404-407
Programmable processors, 421-442
control unit for, 424-428
datapath for, 422-424
input/output extensions to, 440
instruction set extensions to, 439-440
performance extensions to, 441
six-instruction, 434-439
three-instruction, 428-434
Programmable ROM, 267
Programmers (ROM), 267
Programming languages, 254-258
PROM, see Programmable ROM
Propagate (in carry-lookahead adders), 338, 340-341
Propagation, 104
Propositional calculus, 504
Pulse width modulation (PWM), 415
Push (in queues), 272
PWM (pulse width modulation), 415
Q
Quantization (in video compression), 366-367, 369
Quartz, 102
Quartz oscillators, 102
Queues, 271-272
Queuing, 271-274
Quine-McCluskey method, 311-312
QWERTY keyboard, 71
R
Race condition, 100
Radix, 510
Random access memory (RAMs):
bit storage in, 260-261
dynamic (DRAM), 262-263
example using, 264-265
in RTL design, 259-265, 271
static, 261-262
read() function, 455
Reading (data), 151
Read-Only Memory (ROMs):
examples using, 269-271
in RTL design, 265-271
types of, 266-269
Read-only memory programming, 265
Read port, 205
Read time, 263
Real numbers, 505-507
Recording, audio, 5-7
Reduce operation, 315
Register(s), 109-111, 151-165
design process for, 163
examples using, 152-160, 164-165
local (Dctr), 232-233
multifunction, 160-163
in multiple bit storage, 109-111
N-bit, 151
parallel load, 151-152
with parallel load and shift, 160-163
pipeline, 346
rotate, 159-160
in sequential logic description, 459-461
shift, 158, 159
updating of, 245-246
Registered data outputs, 246-247
Register files, 204-208
dual-ported, 208
multi-ported, 208
MxN, 204
single-ported, 208
Register-transfer level (RTL) components, 151
Register-transfer level (RTL) design, 225-285
abstraction in, 276
behavioral-level, 254-258
clock frequency, determination of, 251-254
component allocation in, 349-350
concurrency in, 348-349
connection of data path to controller, 236
controller's FSM, derivation of, 237, 238
data-dominated, 247-248
data path, creation of, 234-236
examples of, 238-244, 248-250, 269-271, 279-284
hierarchy in, 275-278
high-level state machine, creation of, 229-233
managing complexity in, 275
memory components in, 258-271
method, 226-238
operator binding in, 350-351
operator scheduling in, 351-354
optimizations and tradeoffs in, 345-354
pipelining in, 345-347
pitfalls in, 245-246
queuing in, 271-274
RAMs in, 259-265, 271
and registered data outputs, 246-247
ROMs in, 265-271
scope of, 225-226
using hardware languages in, 475-491
using programming languages in, 254-258
reg output, 453, 454, 460
Relays, 32
Reset inputs, 134- 135
Resetting, 98
ReSistance, 31
Resource sharing, 351
Respin (in IC fabrication), 380
Reverse engineering, 1?6
RGB color space, 192-194
RGB monitors, 192
Rising edge-triggered flip-flops, 107
Rolling over, 181
ROMs, see Read-Only Memory
Rotate registers, 159-160
Routing (in chips), 387
RS-6000 SP processor, 157
RTL components, 151
RTL design, see Register-transfer level design
Run-length encoding, 367
S
SAD, see Sum-of-absolute-differences
Sampling, 6
Scale, compensating weight (example), 173
Scan chain, 399
Scan converter, 213
SC_CTOR statement, 451-452, 454
Scheduling. operator. 351-354'
Schematic, 445
Schematic capture tool (use in circuits), 84
SC_1Il0 statement, 451 '
SC_METHOD. 454-455, 458
SC_module. 450-452
sc_outo statement, 451
sc_signal statement, 451
SC_THREAD testbench process, 458, 462-463
Search(es):
binary. 257
linear, 356
Seat belt warning light (example):
extended, on an FPGA, 395-396
implementing, with a lookup table, 390
using NOR-based gate array, 3 -3
using off-the-shelf 7400 ICs, 403-404
using simple PLD, 406
Second pass (state reduction), 3_1, 321
Secure car key (example), 116-117, 125-126
Selectors, 79. See also Multiplexers (muxes)
Semiconductors, 36
Semicustom ICs, see Application Specific Integrated Circuits (ASICs)
Sense amplifier, 261
Sensitive processes. 453. 455
Sensitivity lists, 461-462
Sensor(s), 9-10
button, 10
light, 10
traffic light, 188
Sequence generator (example), 124-125, 327-328
Sequencer, light (example), 184
Sequential circuits, 30, 85-86, 95, 126-127. See also Finite-state machines (FSMs)
controllers, 111, 119-130, 135-140
converting to FSM (example), 126-127
flip-flops, 96-111, 130-135
Sequential logic description:
controllers in, 463-466
oscillators in, 461-463
registers in, 459-461
using hardware languages in, 459-466
Sequential logic optimization, 317-333
and Moore vs. Mealy FSMs, 328-333
state encoding as, 323-328
state reduction as, 317-323
Sequential multipliers, 343-345
Serial communication, 160
Serial computation, 354-355
Serializing (in computations), 352
Servant D latch, 105-106
Set inputs, synchronous/asynchronous, 135
Setting (in latches), 99
Setup time (in flip-flop inputs), 131, 132
Shannon. Claude. 40
Shifters, 173-176
barrel, 176
examples using, 175
simple, 174
Shift registers, 158, 159
Shockley, William, 33
SHR() function, 477
Signal(s), 448
currentstate, 463-466
digital, 4-7
nextstate, 463-466
state, 476
Signal processor, 213
Sign bit, 196
Signed-magnitude, 194
Significand, 510
Silicon (element), 37
Silicon chips, 33-35. See also Integrated circuits (ICs)
and economy of scale, 200
fabrication of, 380
Silicon Valley (California), 37
Simple programmable logic device (SPLD), 404-407
CPLDs vs., 407-408
FPGAs vs., 407-408
Simulation (in circuits), 84
Simulator, 84
Single-ported register file, 208
Single-purpose processor, 421
Six-instruction programmable processors, 434-439
control unit in, 435-437
datapath in, 435-437
instruction set in, 434-435
Size (in digital systems), 295
Small-scale integration, see SSI
SOC, see System-on-a-chip
Soda machine dispenser (example), 227-229, 515-525
controller, design of, 516-518
understanding behavior of controller and datapath, 519-525
Software, 18
Solid-state transistors, 33
Sound, 210-212
Sound generation circuits, 213-214
Sound waves, 210, 212
Source input, 31, 32, 35, 36
SPG blocks, 339
Spin (in IC fabrication), 380
SPLD, see Simple programmable logic device
Spurious values, 171
SRAM, see Static random access memory
SR flip-flops, 108, 131
SR latches, 97-101
basic, 97-99
level-sensitive, 99-101
SSI (small-scale integration), 34, 401-402
Stages:
pipeline registers, 346
programmable processors, 425-428
Standard architecture (for controllers), 119
Standard cells, 382-383
Standard representation, 62
State(s):
of circuits, 85-86, 95, 111
equivalency between, 318
State diagram, 114
State encoding:
alternative minimum-bitwidth binary, 323-324
one-hot, 324-326
output, 327-328
in sequential logic optimization, 323-328
Statements. See also specific statements
assert, 456, 458-459
display, 457
2
State minimization, 317-323
State reduction:
algorithm for, 319
example for, using implication table, 321-322
implication tables, 318-320
in sequential logic optimization, 317-323
steps in, 320-321
State signal, 476
Statetype, 463-466
Static random access memory (SRAM), 261-262, 271
Steering (of sound), 211, 212
Stereo speaker, 210
store instruction, 429-431, 434
Store operations, 423-424
Structure (in combinational logic description), 447-452
Structured ASICs, 383, 408
Subsetting (in program languages), 258
subtract instruction, 435-437
Subtraction (using addition), 195-196
Subtraction method, 13-14, 505
Subtractor(s), 190-200
detecting overflow in, 198-200
examples using, 191-194, 198
for positive numbers, 190-194
using adder to build a, 197-200
Sum, 48
Summation circuits, 214-215
Sum-of-absolute-differences (SAD):
with concurrency (example), 348-349
design example, 241-244
examples using C code, 254-258
Sum-of-minterms form, 63-65
Sum-of-products, 50
Superscalar processor, 441
Switch(es), 31-35
and discrete transistors, 33
Dual Inline Package, 171-172
and integrated circuits, 33-34
relays in, 32
sliding (example), 306-307
and vacuum tubes, 32-33
Switching algebra, 496, 497
Switch matrices, 394-396, 398-399. See also Programmable interconnects
Synchronizer, button press (example), 123-124
Synchronous circuit, 102
Synchronous clear, 164
Synchronous clearing, 184
Synchronous reset inputs, 134-135
Synchronous set, 164
Synchronous set inputs, 135
Systems:
17- 19. 21
digital, 17-18
embedded, 4
SystemC, 450-452, 454-455, 458-463, 465-466, 468, 470-471, 474-475, 488-491
System-on-a-chip (SOC), 408-409
T
Tables, implication, 318-319
Tabular method, 311-312
Talking doll (example), 269-271
Tap (as mathematical term), 282
Technology mapping, 387
Telephones, 8
Temperature averager (example), 175
Temperature history display (example), 109-111, 154-155
Terms:
combining, to eliminate a variable, 297
product, 50
Terminal count (counter output), 181
Testbenches, 455-459
T flip-flops, 131
Three-cycles-high laser timer (example):
alternative binary encoding for, 324
controller for
first design, poorly done, 111-112
FSM for, 115
using one-hot encoding, 326
3-D images (ultrasound), 216
Three-instruction programmable processors, 428-434
control unit for, 432-434
datapath for, 431-433
first instruction set in, 428-431
Three-state driver, 206
Throughput (in pipeline registers), 347
Timer(s):
as counter type, 187-188
three-cycles-high laser, 111-112, 115, 120-122
Timing analysis, 254
Timing diagrams, 20
Timing issues, with Mealy FSMs, 331-332
Tradeoff(s): See also Optimization(s)
and algorithm selection, 356
among IC technologies, 409-410
datapath component, 333-345
defined, 29-
at higher vs. lower design levels, 355
in RTL design, 345-354
between serial and concurrent computation, 354-355
Traffic light sensors, 188
Transducers, 9, 210
Transformation operations, 423-424
Transistors:
CMOS. 35-37. 1.42
discrete. 33
nMOS. 35. 36. 73
pMOS. 37. 42-44. 73
Transitions. 11 4
Transparent latch, see Level-sensitive SR latch
Truth table(s). 42
and Boolean functions, 56-58
as Boolean function standard representation. 62- 63
defined. 56
Tubes. vacuum. 32-33
Two-level logic adders. 334
Two-level logic implementations. 67
Two-level logic size optimization, 296-315
automation of, 308-315
and don't care inputs, 305-307
and K-maps. 298-306
usi ng algebraic methods. 296-298
Two's complement, 196-197
building a subtractor using adders and, 197-200
defined, 196
detecting overflow using, 199-200
type declaration. 463
type statement. 463
Typewriters. 71
U
Ultrasound (term), 210
Ultrasound imaging, 210
Ultrasound machines, 209-216
beamformer in, 210-213
digital circuits in, 213-215
future challenges with, 216
monitor in, 213
scan converter in, 213
signal processor in, 213
transducer in, 210
Unique complement, 499
Uniting theorem. 297
Universal gates, 75, 384
Universal Serial Bus (USB), 161
Upcounters, 181-183, 471-475
Up/down counters. 184
USB (Universal Serial Bus), 161
use statement, 476
V
Vacuum tubes, 32-33
Variable(s), 49
combining terms to eliminate a, 297
generic, 503
Verilog (hardware description language), 254, 258, 449-450, 453-454, 457, 460, 462, 464-465, 467, 469-470, 473-474, 477-479, 484-487
Very Large Instruction Word (VLIW) processor, 441
Very-large scale integration (VLSI), 34
VHDL (hardware description language), 254, 258, 447-449, 452-453, 456, 459-461, 463-464, 467-469, 471-473, 475-477, 480-484
Video, digitized, 8, 244
Video compression (examples):
using C code, 254-256
using sum-of-absolute differences (SAD) design, 241-244
Video display, giant, 412-416
Virtex II Pro platform (Xilinx), 409
VLIW (Very Large Instruction Word) processor, 441
VLSI (very-large scale integration), 34
Volatile memory, 265
Voltage. 31
W
wait for statement, 461
wait() function, 458, 462
Wait state, 330
Watt (unit), 357
Waves, cosine, 364-366
Waveform (of inputs), 84
Weight sampler (example), 153-154
Western Union, 8
While loop statements, 256
Wireless communication, 161
Wire signal, 457
wires output, 450, 453
Word (data item), 258
Wrapping around (counters), 181
Wristwatch, beeping (example):
using combined Moore/Mealy machine, 333
using Mealy machine, 331
write() function, 455
Write port, 205
X
XNOR gates, 74. 75
XOR gates. 74, 75
