Pipelining
Reza Hashemian, Senior Member IEEE
Northern Illinois University
Dept of Electrical Engineering
Reza@ceet.niu.edu
Fig. 2 - The Wallace structure for a 16-bit multiplier and the number of MAs in each layer.
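$$
\begin{aligned}
m_i &= a_i \oplus b_i \\
O_i &= a_i \cap b_i \\
n_i &= \overline{a_i \oplus b_i} = \overline{m_i} \\
I_i &= a_i \cup b_i
\end{aligned}
\tag{3}
$$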
where $m_i$ and $O_i$ are the primary sum and carry terms for an input carry of 0, and $n_i$ and $I_i$ are the primary sum and carry terms for an input carry of 1, respectively. Note that all partial sum and carry terms are available within one gate delay, because all the sums and carries are generated simultaneously.

Fig. 4(a) - The Wallace tree structure for a 32-bit multiplier.

Now, to get the final value of the sum term, $S_i$, as well as of the carry flag, $C_i$, we need to use the preceding carry flag, $C_{i-1}$, to make the selection, as depicted in Eq. (4).
$$
\begin{aligned}
S_i &= m_i \overline{C_{i-1}} + n_i C_{i-1} \\
C_i &= O_i \overline{C_{i-1}} + I_i C_{i-1}
\end{aligned}
\tag{4}
$$
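The selection in Eqs. (3) and (4) can be checked with a small software model. The sketch below is only an illustration, not the circuit described in the paper: the function names `partial_adder` and `carry_select_add` are hypothetical, the sum term for an input carry of 1 is taken as the complement of the sum term for an input carry of 0, and the carries are resolved one bit at a time rather than by a parallel carry generation network.

```python
def partial_adder(a, b, width=64):
    """Eq. (3): per-bit partial sums and carries for carry-in 0 and carry-in 1."""
    mask = (1 << width) - 1
    m = (a ^ b) & mask   # m_i: sum bit when the incoming carry is 0
    o = (a & b) & mask   # O_i: carry bit when the incoming carry is 0
    n = ~m & mask        # n_i: sum bit when the incoming carry is 1 (complement of m_i)
    i = (a | b) & mask   # I_i: carry bit when the incoming carry is 1
    return m, o, n, i


def carry_select_add(a, b, width=64, carry_in=0):
    """Eq. (4): select each sum and carry bit using the preceding carry flag.

    This loop walks from the LSB upward, so it models only the logic of
    Eq. (4), not the gate-level timing of any hardware implementation.
    """
    m, o, n, i = partial_adder(a, b, width)
    c_prev = carry_in
    total = 0
    for k in range(width):
        if c_prev:
            s_k = (n >> k) & 1
            c_k = (i >> k) & 1
        else:
            s_k = (m >> k) & 1
            c_k = (o >> k) & 1
        total |= s_k << k
        c_prev = c_k
    return total, c_prev  # (sum modulo 2**width, carry out)


if __name__ == "__main__":
    print(carry_select_add(5, 3, width=8))
    print(carry_select_add(255, 1, width=8))
```

Because the two candidate sum bits are complements of each other, the selection of the sum bit is equivalent to XOR-ing $m_i$ with the incoming carry.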
Equation (4) describes a sequential carry-flag scheme: it starts from the LSB, and the carries are generated one after another in ascending bit order. Figure 5 shows a block diagram of a 64-bit adder. As indicated, the adder consists of three circuit blocks: the Partial Adder, the Carry Generation Block, and the Sum Block. The Partial Adder generates the single-bit partial sums and partial carries for input carries 0 and 1, as given in Eq. (3). The Carry Generation Block (channel) generates all carry flags, from $C_1$ to $C_{63}$. Finally, with the carry flags available, the final sum term for each bit is selected within the Sum Block. Because $n_i$ is the complement of $m_i$, the selection in Eq. (4) reduces to $S_i = m_i \oplus C_{i-1}$, so the Sum Block is simply a 64-bit array of XOR gates, one for each sum bit.

Fig. 4(b) - Number of Wallace adders for a 32-bit multiplier; XA indicates the number of mixed adders.

Table 1

Data Size (n)   Log(n/2)/Log(3/2)   # Layers in the Wallace tree
4               1.71                2
8               3.42                4
16              5.13                6
32              6.84                8
64              8.55                10
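The two numeric columns of Table 1 can be reproduced with a short script. This is a sketch under a stated assumption: the layer count is obtained by the usual Wallace reduction, in which each layer groups the current rows in threes, turns every full group into two rows, and passes leftover rows through unchanged, until two rows remain. The function names are hypothetical.

```python
import math


def estimated_layers(n):
    """Estimate from Table 1: log(n/2) / log(3/2)."""
    return math.log(n / 2) / math.log(3 / 2)


def wallace_layers(n):
    """Count reduction layers, assuming 3-to-2 row reduction per layer."""
    rows, layers = n, 0
    while rows > 2:
        # Each full group of three rows becomes two; leftover rows pass through.
        rows = 2 * (rows // 3) + rows % 3
        layers += 1
    return layers


if __name__ == "__main__":
    for n in (4, 8, 16, 32, 64):
        print(n, round(estimated_layers(n), 2), wallace_layers(n))
```

The logarithmic expression is a lower estimate; the integer layer count is larger because leftover rows are not reduced in the layer where they occur.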
Table 2

Carry flags             Delay
C1                      1 gate delay
C2 and C3               2 gate delays
C4, C5, ... and C7      3 gate delays
C8, C9, ... and C15     4 gate delays
C16, C17, ... and C31   5 gate delays
C32, C33, ... and C63   6 gate delays

Table 3 provides the gate counts for the carry generation block, in total and per carry flag, for different operand sizes.

Table 3

Operand Size   Carry Generation Block Gate Counts   Gate Counts Per Carry Flag
2              2                                    1
4              6                                    1.5
8              18                                   2.25
16             50                                   3.125
32             130                                  4.0625
64             322                                  5.03125

[1] N. R. Scott, Computer Number Systems and Arithmetic. Englewood Cliffs, NJ: Prentice-Hall, 1985, pp. 54-57.
[2] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs. New York: Oxford University Press, 2000, pp. 114-123.
[3] C. S. Wallace, "A Suggestion for a Fast Multiplier," IEEE Trans. Electronic Computers, vol. 13, pp. 14-17, 1964.
[4] L. Dadda, "On Parallel Digital Multipliers," Alta Frequenza, vol. 45, pp. 574-580, 1976.
[5] E. E. Swartzlander, "Parallel Counters," IEEE Trans. Computers, vol. 22, no. 11, pp. 1021-1024, 1973.
[6] R. Hashemian, "A New Design for High Speed and High-Density Carry Select Adders," 43rd Midwest Symposium on Circuits and Systems, Lansing, Michigan, August 8-11, 2000.
[7] J. O. Bedrij, "Carry-Select Adder," IRE Trans. Electronic Computers, vol. 11, pp. 340-346, 1962.
[8] R. H. Nigaglioni and E. E. Swartzlander, "Variable Spanning Tree Adder," Proc. Asilomar Conf. Signals, Systems, and Computers, 1995, pp. 586-590.
[9] B. Gilchrist, J. H. Pomerene, and S. Y. Wong, "Fast carry logic for digital computers," IRE Trans. El. Comp., vol. EC-4, no. 4, pp. 133-136, Dec. 1955.
[10] H. L. Garner, "A survey of some recent contributions to computer arithmetic," IEEE Trans. Comput., vol. C-25, pp. 1277-1282, 1976.
[11] N. H. E. Weste and K. Eshraghian, Principles of CMOS VLSI Design. Addison-Wesley, 1993.
[12] J. Rose, A. El Gamal, and A. Sangiovanni-Vincentelli, "Architecture of Field-Programmable Gate Arrays," Proc. IEEE, vol. 81, no. 7, pp. 1013-1029, July 1993.