IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-II: ANALOG AND DIGITAL SIGNAL PROCESSING, VOL. 43, NO. 2, FEBRUARY 1996

Area-Efficient Multipliers for Digital Signal Processing Applications


Sunder S. Kidambi, Fayez El-Guibaly, Senior Member, IEEE, and Andreas Antoniou, Fellow, IEEE
Manuscript received December 13, 1993; revised May 16, 1995. This work was supported by the Natural Sciences and Engineering Research Council of Canada and Micronet, Networks of Centers of Excellence Program. This paper was recommended by Associate Editor F. Kurdahi. S. S. Kidambi is with Analog Devices Inc., Ray Stata Technology Center, Wilmington, MA 01887 USA. F. El-Guibaly and A. Antoniou are with the Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC, Canada V8W 3P6. Publisher Item Identifier S 1057-7130(96)01502-9.

Abstract- An area-efficient parallel sign-magnitude multiplier that receives two N-bit numbers and produces an N-bit product, referred to as a truncated multiplier, is described. The quantization of the product to N bits is achieved by omitting about half the adder cells needed to add the partial products but, in order to keep the quantization error to a minimum, probabilistic biases are obtained and are then fed to the inputs of the retained adder cells. The truncated multiplier requires approximately 50% of the area of a standard parallel multiplier. The paper then shows that this design strategy can also be applied to the design of two's-complement multipliers. The paper concludes with the application of the truncated multiplier to the implementation of a digital filter, and it is shown that the signal-to-noise ratio of the digital filter using a truncated multiplier is better than that using a standard multiplier.

I. INTRODUCTION

THE dominant factors in the design of multipliers for digital filters and other digital signal processing (DSP) applications are the chip area required and the speed of operation. Among the many classes of multipliers, array multipliers and multipliers based on the modified Booth algorithm have been popular. In array multipliers, multiplication is effected by adding all the partial products generated by an array of AND-gate cells. On the other hand, in multipliers based on the modified Booth algorithm, three-bit strings of the multiplier are scanned and appropriate operations are carried out on the multiplicand [1], [2]. In the past few years, significant reduction in the chip area as well as an associated increase in the speed of operation of these multipliers have been achieved through increased device density [3]-[8] by taking advantage of advancements in VLSI technology. Unfortunately, this approach has reached a stage of diminishing returns and further improvements in the design of parallel multipliers will occur only if major breakthroughs are achieved in the technology.

In this paper, we explore an alternative approach to the design of area-efficient multipliers for DSP applications, which is independent of technology. Very often in these applications fixed-point arithmetic is used and typically N-bit signals are multiplied by N-bit coefficients. Since a uniform word length is required, the 2N-bit products obtained must be quantized to N bits by eliminating the N least-significant bits (LSBs) by some quantization scheme (e.g., truncation or rounding). Close examination of the standard parallel multiplier reveals that approximately 50% of the chip area is needed for the generation of the N LSBs which are eventually removed by the quantization process. One could reduce the area by approximately 50% by omitting about half the adder cells, but significant error in the required product would be introduced. Our design strategy is to omit about half the adder cells, as suggested, but introduce appropriate biases to the retained adder cells based on a probabilistic estimation. In this way an area-efficient parallel multiplier, referred to as a truncated multiplier, is obtained in which the error involved is kept to a minimum. An estimation of the error statistics of the biased truncated multiplier reveals that the variance of the error is less than that of a standard multiplier whose results are truncated or rounded to the N most significant bits.

Section II of the paper deals with the underlying principles involved in the design of the truncated multiplier. Section III deals with a probabilistic estimation of the error in the product of the truncated multiplier. Section V extends the principles involved to the design of two's-complement multipliers. The paper concludes with the application of the truncated multiplier to the design of a second-order digital filter. It is shown that the signal-to-noise ratio of the digital filter using a truncated multiplier is better than that using a standard multiplier.
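The quantization step described in the Introduction can be made concrete with a short sketch. The Python below is illustrative only: the function name `quantize_product` and the use of unsigned integer magnitudes (with the sign of a sign-magnitude number assumed to be handled separately) are assumptions made here, not part of the paper. It forms the exact 2N-bit product and then keeps the N most significant bits, either by truncation or by rounding.

```python
def quantize_product(a: int, b: int, n: int, mode: str = "truncate") -> int:
    """Multiply two unsigned n-bit magnitudes and keep the n MSBs of the product.

    The result is expressed in units of the retained LSB (weight Q = 2**-n
    when the operands are read as fractions).
    """
    if not (0 <= a < (1 << n) and 0 <= b < (1 << n)):
        raise ValueError("operands must be unsigned n-bit integers")
    full = a * b                       # exact 2n-bit product
    if mode == "round":
        full += 1 << (n - 1)           # add half of the retained LSB before dropping bits
    elif mode != "truncate":
        raise ValueError("mode must be 'truncate' or 'round'")
    return full >> n                   # discard the n least-significant bits


if __name__ == "__main__":
    n = 8
    for a, b in [(200, 100), (3, 128), (255, 255)]:
        print(a, b, quantize_product(a, b, n), quantize_product(a, b, n, "round"))
```

For N = 8, the product 3 × 128 = 384 has discarded bits equal to exactly half of the retained LSB, so truncation returns 1 while rounding returns 2. In both cases the full 2N-bit product is computed first, which is the cost that the truncated multiplier sets out to avoid.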

II. AREA-EFFICIENT PARALLEL MULTIPLIER


Two N-bit numbers to be multiplied, A and B, and their product P can be represented as

$$A = \sum_{i=0}^{N-1} a_i 2^{i-N} \tag{1}$$

$$B = \sum_{i=0}^{N-1} b_i 2^{i-N} \tag{2}$$

and

$$P = \sum_{i=0}^{2N-1} p_i 2^{i-2N} \tag{3}$$

respectively, where $a_i, b_i, p_i \in \{0, 1\}$. In the standard N × N parallel multiplier, the $N^2$ bit products are generated simultaneously and are then added by an array of full adders, as shown in Fig. 1 for an 8 × 8 multiplier. The partial sums propagate diagonally in the southeast direction along lines of equal binary weight. The carry signals propagate downwards along increasing binary weight directions. The delay through the parallel multiplier is due to the carry propagation through the adder array and the carry-ripple adder at the bottom of Fig. 1(a). In practice, the carry-ripple adder is replaced by a carry look-ahead adder [2].

Fig. 1. An 8 × 8 multiplication using parallel multiplier where A, HA, and FA are the AND, half-adder, and full-adder cells, respectively. (a) Multiplier block diagram. (b) Details of AHA cell. (c) Details of AFA cell.

The product can be represented by the sum of two segments, $P_h$ and $P_l$, representing the most and the least significant segments, respectively, i.e.,

$$P = P_h + P_l \tag{4}$$

$$\phantom{P} = A_h B_h + A_h B_l + A_l B_h + A_l B_l \tag{5}$$

where $P_h$ and $P_l$ are N bits long each and N is assumed to be even. $A_h$ and $A_l$ are the most and least significant parts of A, while $B_h$ and $B_l$ are those of B. The most and least significant segments of P are given by

$$P_h = \sum_{i=N}^{2N-1} p_i 2^{i-2N} \tag{6}$$

and

$$P_l = \sum_{i=0}^{N-1} p_i 2^{i-2N} \tag{7}$$

respectively. Fig. 2 shows the various sections of a multiplier generating $P_h$ and $P_l$. The shaded region contains cells that produce $P_l$. If the N LSBs of the product are truncated, segment $P_l$ is discarded and, in effect, approximately half of the adder cells and, in turn, about 50% of the chip area are not used effectively.

Fig. 2. Sections in the parallel multiplier generating the four terms of P. The shaded region represents the cells that generate discarded results due to truncation.

In the light of the above discussion, we now attempt to develop a truncated multiplier that generates a product whose value is approximately equal to $P_h$ without using the cells in the shaded part of Fig. 2. Fig. 3 shows the details of an 8 × 8 truncated multiplier. The reason why a 1 is introduced at the rightmost half adder will be explained in Section III. Comparison of Figs. 1 and 3 reveals that the truncated multiplier yields a product which is not exactly equal to $P_h$. To remedy this, we perform a probabilistic analysis of the resulting multiplication error and determine the expected value of the error, which can be used as a bias term to improve the accuracy of the multiplier.

Fig. 3. A truncated 8 × 8 multiplier.

1057-7130/96$05.00 © 1996 IEEE

III. ERROR ANALYSIS

From Fig. 1 it can be seen that all the bit products are generated by performing an AND operation on the bits of the two operands. Let $p_m$ be the probability that any multiplicand or multiplier bit is 1. The probability $p(i, j)$ that the bit product at column i and row j is 1 is equal to $p_m^2$. It must be mentioned that i increases from right to left and j increases from top to bottom, as can be noted from Fig. 1. Let $p_s(i, j)$ denote the probability that the sum bit of the adder at the ith column and jth row is 1 and, similarly, let $p_c(i, j)$ denote the probability that the carry bit is 1. For a parallel multiplier we have

$$\begin{aligned}
p_c(i, j) &= p(i, j)\, p(i+1, j-1) \\
p_s(i, j) &= p(i, j)\, q(i+1, j-1) + q(i, j)\, p(i+1, j-1)
\end{aligned} \tag{8}$$

for 0 ≤ i ≤ N - 2, j = 1, with $q(i, j) = 1 - p(i, j)$, and

$$\begin{aligned}
p_c(i, j) ={}& p(i, j)\, p_s(i+1, j-1)\, p_c(i, j-1) + p(i, j)\, p_s(i+1, j-1)\, q_c(i, j-1) \\
&+ p(i, j)\, q_s(i+1, j-1)\, p_c(i, j-1) + q(i, j)\, p_s(i+1, j-1)\, p_c(i, j-1) \\
p_s(i, j) ={}& p(i, j)\, p_s(i+1, j-1)\, p_c(i, j-1) + p(i, j)\, q_s(i+1, j-1)\, q_c(i, j-1) \\
&+ q(i, j)\, q_s(i+1, j-1)\, p_c(i, j-1) + q(i, j)\, p_s(i+1, j-1)\, q_c(i, j-1)
\end{aligned} \tag{9}$$

for 0 ≤ i ≤ N - 2, 2 ≤ j ≤ N - 1, with $q_c(i, j) = 1 - p_c(i, j)$ and $q_s(i, j) = 1 - p_s(i, j)$.

It can be seen from Fig. 1 that the expected error in the product of the truncated multiplier is equal to the value of the output carry bits from the cells at position (i, j) where i + j = N - 1, 0 ≤ i ≤ N - 2, and 1 ≤ j ≤ N - 1. The weight of these bits is equal to $2^{-N}$. It is thus evident that the expected error $\varepsilon_1$ lies in the range $0 \le \varepsilon_1 \le (N-1)Q$, where $Q = 2^{-N}$. If we now calculate the expected value of the error, we can bias the result generated to yield a product with minimum error. For brevity, we shall call the cells at position (i, j) such that i + j = N - 1 the diagonal cells. The probability that any output carry bit at the diagonal is 1 is equal to its expected value, i.e.,

$$E[c_d(i, j)] = p_c(i, j) \tag{10}$$

where $c_d(i, j)$ is any carry output bit at the diagonal. Therefore, the expected value of the error in the product of the truncated multiplier is given by

$$\varepsilon_1 = Q \sum_{(i, j)} p_c(i, j) \tag{11}$$

for (i, j) belonging to the diagonal cells. The variance of any output carry bit at the diagonal is given by

$$\begin{aligned}
V^2[c_d(i, j)] &= E\{(c_d(i, j) - E[c_d(i, j)])^2\} \\
&= E[c_d^2(i, j)] - E^2[c_d(i, j)] \\
&= p_c(i, j)\,[1 - p_c(i, j)] \\
&= p_c(i, j)\, q_c(i, j).
\end{aligned} \tag{12}$$

Since each output carry bit at the diagonal is an independent random variable, the variance of the error $\sigma_1^2$ can be written as

$$\sigma_1^2 = Q^2 \sum_{(i, j)} V^2[c_d(i, j)]. \tag{13}$$

Fig. 4. Error statistics of the truncated multiplier. (a) Variation of ε with N. (b) Variation of σ² with N.
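The recursion in (8) and (9) and the statistics in (10)-(13) are straightforward to evaluate numerically. The following Python is a minimal sketch under the same independence assumption used in the text. The function names, the dictionary layout, and the convention that row 0 behaves as a cell with sum probability p(i, 0) and carry probability 0 are choices made here for illustration. It returns the normalized quantities ε = ε_1/Q and σ² = σ_1²/Q².

```python
def diagonal_carry_probabilities(n, p_m=0.5):
    """Return p_c(i, j) for the diagonal cells i + j = n - 1, following (8) and (9)."""
    p = p_m * p_m            # probability that a bit product a_i AND b_j equals 1
    q = 1.0 - p
    # Row 0 holds bit products only: its "sum" output is the bit product itself
    # and it produces no carry. With a zero carry input, the full-adder formulas
    # of (9) reduce to the half-adder formulas of (8) for row j = 1.
    ps = {(i, 0): p for i in range(n)}
    pc = {(i, 0): 0.0 for i in range(n)}
    for j in range(1, n):
        for i in range(n - j):                 # only cells with i + j <= n - 1 are needed
            s_in = ps[(i + 1, j - 1)]          # sum bit from cell (i + 1, j - 1)
            c_in = pc[(i, j - 1)]              # carry bit from cell (i, j - 1)
            qs, qc = 1.0 - s_in, 1.0 - c_in
            # Carry is 1 when at least two of the three inputs are 1.
            pc[(i, j)] = (p * s_in * c_in + p * s_in * qc
                          + p * qs * c_in + q * s_in * c_in)
            # Sum is 1 when an odd number of the three inputs are 1.
            ps[(i, j)] = (p * s_in * c_in + p * qs * qc
                          + q * qs * c_in + q * s_in * qc)
    return [pc[(i, n - 1 - i)] for i in range(n - 1)]


def error_statistics(n, p_m=0.5):
    """Return (eps, var): expected error and variance in units of Q and Q**2."""
    diag = diagonal_carry_probabilities(n, p_m)
    eps = sum(diag)                             # (11) divided by Q
    var = sum(c * (1.0 - c) for c in diag)      # (13) divided by Q**2
    return eps, var


if __name__ == "__main__":
    for n in (8, 16, 32):
        print(n, error_statistics(n))
```

With p_m = 0.5, `error_statistics(8)` should give ε close to the value of 1.25 quoted in the text for the 8 × 8 multiplier, and ε always lies between 0 and N - 1 because there are N - 1 diagonal cells.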

Fig. 4(a) and (b) shows the variation of ε and σ², where $\varepsilon = \varepsilon_1/Q$ and $\sigma^2 = \sigma_1^2/Q^2$, with the size of the multiplier. On the basis of (11), an expected value of the error can be obtained that can be used to bias the truncated multiplier. Let us now calculate the expected value and variance of the error of the biased truncated multiplier. It can be seen from Fig. 4(a) that ε, for many commonly used multipliers, can be written as ε = I + 0.25, where I is the integer part of ε. It is evident that the bias to the truncated multiplier can only be equal to IQ since 0.25Q corresponds to the (N + 1)th and (N + 2)th bits. Since a sign-magnitude multiplier is considered, the error in the biased truncated multiplier can only be either -0.25Q or 0.25Q with equal probability. Thus, the expected or mean value of the error is zero and the variance is given by $Q^2/16$. Note that the variance of the error for a standard N × N multiplier is $Q^2/3$ if the result is truncated or $Q^2/12$ if the result is rounded to N bits [9].
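The form ε = I + 0.25 can be checked by brute force for small N. The sketch below is a simplified model rather than the circuit of Fig. 3, and it rests on assumptions made here: operands are taken as unsigned N-bit magnitudes, all operand pairs are weighted equally, and the unbiased truncated multiplier is modelled as the sum of the bit products a_i b_j with i + j ≥ N. The function names are hypothetical. The error is measured in units of Q against the N most significant bits of the exact product.

```python
def truncated_error(a: int, b: int, n: int) -> int:
    """Error, in units of Q, of the unbiased truncated product for one operand pair."""
    retained = 0
    for i in range(n):
        for j in range(n):
            # Keep only bit products whose weight falls in the upper half P_h.
            if i + j >= n and (a >> i) & 1 and (b >> j) & 1:
                retained += 1 << (i + j)
    exact_msbs = (a * b) >> n          # N MSBs of the exact 2N-bit product
    return exact_msbs - (retained >> n)


def mean_error(n: int) -> float:
    """Average of truncated_error over all pairs of n-bit operands."""
    size = 1 << n
    total = sum(truncated_error(a, b, n) for a in range(size) for b in range(size))
    return total / float(size * size)


if __name__ == "__main__":
    eps = mean_error(8)
    print("mean error / Q:", eps, "integer bias I:", int(eps))
```

For N = 8 the mean comes out at about 1.26, so the integer part I = 1 is the bias that can be applied, in line with the single 1 fed to the rightmost half adder. The loop visits all 2^(2N) operand pairs, so it is only practical for small N.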

It should be mentioned that the value of I can be added at the adder cells present at the diagonal of the truncated multiplier, which entails no extra hardware for biasing. For an 8 × 8 multiplier, ε = 1.25 and, thus, a 1 is added at the rightmost half-adder cell of the truncated multiplier, as shown in Fig. 3. Adding more than one bit can be accomplished by replacing the half-adder cells by full-adder cells on the diagonal.

IV. REDUCTION OF AREA AND SPEED CONSIDERATIONS

Let $A_a$, $A_h$, and $A_f$ be the areas of an AND gate, a half adder, and a full adder, respectively. The area of a standard N × N parallel multiplier is given by

$$A_{fm} = N^2 A_a + (N-1) A_h + (N-1)^2 A_f. \tag{14}$$

On the other hand, the maximum area of a truncated multiplier is given by

$$A_{tm} = \frac{N(N-1)}{2}\,(A_a + A_f). \tag{15}$$

If we let $A_h = \xi A_f$ and $A_a = \phi A_f$ where 0 < ξ < 0.5 and 0 < φ < 0.1, then the ratio of the areas of the truncated multiplier and the full multiplier can be written as

$$\frac{A_{tm}}{A_{fm}} = \frac{0.5\,N(N-1)(1+\phi)}{\phi N^2 + [N^2 + (\xi-2)N - \xi + 1]}. \tag{16}$$

Assuming ξ = 0.45 and φ = 0.09, we obtain $A_{tm}/A_{fm}$ = 0.52, 0.51, and 0.5 for N = 8, 16, and 32, respectively.

The speed of the truncated multiplier can be increased by using the concept of nonadditive multiplicative modules (NMM) [2]. It can be seen from Fig. 2 that partial products $A_h B_l$ and $A_l B_h$ contribute to the same segment of the final result. Therefore, these partial products can be obtained independently of each other and their final results added together.

V. TWO'S-COMPLEMENT MULTIPLICATION

Two's-complement numbers can be handled in a way similar to the sign-magnitude multipliers by using the multiplication scheme mentioned in [10]. The multiplication of two numbers in two's-complement representation is written as

[Equations (17)-(19) are illegible in the source.]

From the above equation it can be seen that the multiplication of two numbers can be written in a form involving only positive partial products. Fig. 5 shows the parallel multiplier array for the two's-complement multiplication. Evidently, a truncated two's-complement multiplier can be readily designed by a structure similar to that in Fig. 3. An error analysis can be carried out using the approach in Section III. Fig. 6(a) and (b) shows the variation of ε and σ² with the size of the multiplier for a truncated two's-complement multiplier.

Fig. 5. An 8 × 8 two's-complement multiplier where ND is a NAND gate cell. (a) Multiplier block diagram. (b) Details of NFA cell.

VI. APPLICATION OF THE TRUNCATED MULTIPLIER IN DIGITAL FILTERS

In this section, we design a digital filter first using the biased truncated sign-magnitude multiplier and then using the full parallel sign-magnitude multiplier. To this end, we design a second-order Butterworth wave digital filter based on the GIC configuration, shown in Fig. 7 [9]. The multiplier values $m_1$ and $m_2$ are found to be equal to -0.414 213 6 and -0.414 213 6, respectively. The output power spectral density (PSD) of the filter is given by

$$S_o(e^{j\omega}) = \left( |H_0(e^{j\omega})|^2 + |H_E(e^{j\omega})|^2 \right) S_e(e^{j\omega}) \tag{20}$$

where $H_0(e^{j\omega})$ and $H_E(e^{j\omega})$ are the transfer functions from the filter input and point E to the output, respectively, and $S_e(e^{j\omega})$ is the variance of the error of the multiplier used. The size of the multiplier was chosen to be 16 × 16. In the case of a standard 16 × 16 multiplier, the output was rounded to the 16 most-significant bits. Fig. 8 shows the variation of the output PSD using standard and biased truncated multipliers. As can be seen from the figure, the performance of the biased truncated multiplier is better than that of the standard multiplier. The area of the truncated multiplier, as mentioned earlier, is about half that of the standard multiplier.

Fig. 6. Error statistics of the truncated two's-complement multiplier. (a) Variation of ε with N. (b) Variation of σ² with N.

Fig. 7. A low-pass GIC wave digital filter.

Fig. 8. Variation of the output noise PSD of a low-pass GIC wave digital filter using standard and truncated multipliers.

VII. CONCLUSION

A sign-magnitude multiplier that receives two N-bit numbers and produces an N-bit product, referred to as a truncated multiplier, has been developed. The quantization of the product to N bits is achieved by omitting about half the adder cells but, in order to keep the quantization error to a minimum, probabilistic biases are generated and fed to the inputs of the retained adder cells. The truncated multiplier requires approximately 50% of the area of a standard parallel multiplier. This design strategy has been extended to the design of two's-complement multipliers. The use of truncated multipliers in the implementation of a second-order digital filter has been examined and it was found that the signal-to-noise ratio is better than that achieved with standard multipliers.

REFERENCES

[1] S. Waser and M. J. Flynn, Introduction to Arithmetic for Digital Systems Designers. New York: CBS College Publ., 1982.
[2] J. J. F. Cavanagh, Digital Computer Arithmetic. New York: McGraw-Hill, 1984.
[3] M. Hatamian and G. L. Cash, "A 70-MHz 8-bit × 8-bit parallel pipelined multiplier in 2.5-μm CMOS," IEEE J. Solid-State Circuits, vol. SC-21, pp. 505-513, Aug. 1986.
[4] T. G. Noll, D. Schmitt-Landsiedel, and G. Enders, "A pipelined 330-MHz multiplier," IEEE J. Solid-State Circuits, vol. SC-21, pp. 411-416, June 1986.
[5] D. A. Henlin, M. T. Fertsch, M. Mazin, and E. D. Lewis, "A 16 bit × 16 bit pipelined multiplier macrocell," IEEE J. Solid-State Circuits, vol. SC-20, pp. 542-547, Apr. 1985.
[6] C. P. Lerouge, P. Girard, and J. S. Colardelle, "A fast 16 bit NMOS parallel multiplier," IEEE J. Solid-State Circuits, vol. SC-19, pp. 338-342, June 1984.
[7] D. G. Crawley and G. A. J. Amaratunga, "8 × 8 bit pipelined Dadda multiplier in CMOS," IEE Proc., vol. 135, Pt. G, pp. 231-240, Dec. 1988.
[8] S. Nakamura and K.-Y. Chu, "A single chip parallel multiplier by MOS technology," IEEE Trans. Comput., vol. 37, pp. 274-282, Mar. 1988.
[9] A. Antoniou, Digital Filters: Analysis, Design, and Applications, 2nd ed. New York: McGraw-Hill, 1993.
[10] C. R. Baugh and B. A. Wooley, "A two's complement parallel array multiplication algorithm," IEEE Trans. Comput., vol. C-22, pp. 1045-1047, Dec. 1973.


Sunder S. Kidambi was born in Madras, India, on September 5, 1962. He received the B.E. and M.E. degrees in electronics and communications engineering from Anna University, Madras, India, in 1985 and 1987, respectively. He received the Ph.D. degree from the University of Victoria, Victoria, Canada, in 1992. He was a postdoctoral fellow at Concordia University, Montreal, Canada, from August 1992 to January 1995. He is presently with Analog Devices Inc. as a systems design engineer. His research interests include design of one- and multidimensional digital filters, VLSI design of digital filters, high-speed digital multipliers, multirate signal processing, and digital communication systems design.

Fayez El-Guibaly (S'75-M'79-SM'83) was born in Egypt in 1949. He received the B.Sc. degree in electronics in 1972, the B.Sc. degree in mathematics in 1974, and the Ph.D. degree in electronics in 1979, all from the University of British Columbia. He joined the University of Victoria in 1984, where he is now a professor of computer engineering. His research interests include VLSI system design for digital signal processing and digital communications. Dr. El-Guibaly consults with several industries such as PMC-Sierra, the Canadian Space Agency, and Microtel Teltech.

Andreas Antoniou (M'69-SM'79-F'82) received the B.Sc. (Eng.) and Ph.D. degrees in electrical engineering from London University in 1963 and 1966, respectively. From 1966 to 1969, he was Senior Scientific Officer at the Post Office Research Department, London, and from 1969 to 1970, he was a member of the Scientific Staff at the R&D Laboratories of Northern Electric Company Ltd., Ottawa, Ontario, Canada. From 1970 to 1983, he was with the Department of Electrical and Computer Engineering, Concordia University, Montreal, Quebec, Canada, as Professor from June 1973, and as Chairman from December 1977. He served as founding Chairman of the Department of Electrical and Computer Engineering, University of Victoria, Victoria, British Columbia, Canada, from July 1, 1983 to June 30, 1990, and is now Professor with the same department. His teaching and research interests are in the areas of electronics, network synthesis, digital system design, active and digital filters, and digital signal processing. He has published extensively in these areas. One of his papers on gyrator circuits was awarded the Ambrose Fleming Premium by the Institution of Electrical Engineers, UK. He is the author of Digital Filters: Analysis, Design, and Applications (McGraw-Hill) and the co-author with W.-S. Lu of Two-Dimensional Digital Filters (Marcel Dekker). Dr. Antoniou is a member of the Association of Professional Engineers and Geoscientists of B.C. and a Fellow of the Institution of Electrical Engineers. He was elected Fellow of the IEEE for contributions to active and digital filters, and to electrical engineering education. He served as Associate Editor, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS from June 1983 to May 1985, and as Editor from June 1985 to May 1987. He is currently serving as a member of the Board of Governors of the Circuits and Systems Society.
