International Journal of Industrial Engineering & Technology (IJIET), ISSN 2277-4769, Vol. 3, Issue 3, Aug 2013, 75-80, TJPRC Pvt. Ltd.
AN ADVANCED VLSI ARCHITECTURE OF PARALLEL MULTIPLIER BASED ON HIGHER ORDER MODIFIED BOOTH ALGORITHM
MEERA BABY1, GADDAM VINAY2, YOGARAJ A3 & ANUROOP R V4
1 Department of Electronics and Communication Engineering, Veltech University, Chennai, Tamil Nadu, India
2,3,4 Assistant Professor, Department of Electronics and Communication Engineering, Veltech University, Chennai, Tamil Nadu, India
ABSTRACT
Recent research on the fabrication of DSP systems and other high-performance systems shows that performance is often lost to high area consumption and delay. The proposed method improves performance by applying the Booth multiplication algorithm to a parallel multiplier. This paper culminates with a comprehensive design example of a parallel multiplier. A parallel multiplier and accumulator (MAC) is frequently used in digital signal processing and video/graphics applications, where it provides high-speed multiplication and multiplication with accumulative addition. This paper presents a combined process of multiplication and accumulation based on radix-4 Booth encoding, and investigates how to implement the parallel MAC with the smallest possible delay.
KEYWORDS: Radix-4 Based Booth Multiplier, Computer Arithmetic, Digital Signal Processing (DSP), Multiplier and Accumulator (MAC), Adders
INTRODUCTION
Multipliers play an important part in today's digital signal processing (DSP) systems. They are used in implementations of recursive and transversal filters, discrete Fourier transforms, correlation, and range measurement, and in most of these cases a multiplier unit designed for the specific purpose is sufficient. Advances in technology have allowed many researchers to design multipliers that offer both high speed and regularity of layout, making them suitable for VLSI implementation. In Booth encoding, each encoded group of multiplier bits represents a multiple of the multiplicand to be added to the final result. In this paper, we study the various parallel MAC architectures and then implement a design of a parallel MAC based on a radix-4 Booth encoder.

A digital multiplier is a fundamental component of general-purpose microprocessors and DSPs [1], [2]. Compared with many other arithmetic operations, multiplication is time consuming and power hungry. Enhancing performance and reducing power dissipation are therefore the most important design challenges for all applications in which the multiplier unit dominates system performance and power dissipation. One of the most effective ways to increase the speed of a multiplier is to reduce the number of partial products, which can be achieved with a higher-radix Booth encoder.
GENERAL MAC STRUCTURE

In this section, we discuss the basic MAC operation. A multiplier operation can be divided into three steps. The first is Booth encoding, which generates the partial products. The second is the adder array, or partial product compression. The third is the final addition, which produces the multiplication result. If the multiplication is extended to accumulate the multiplied results, the MAC consists of four steps: it multiplies the input multiplier X by the input multiplicand Y, and then adds the current multiplication result to the previous multiplication result Z in the accumulation step.
Figure 1: MAC Architecture

The hardware architecture of the proposed MAC, which satisfies the aforementioned equations, is shown in Figure 1. The n-bit MAC inputs, X and Y, are converted into a group of partial products by passing them through the Booth encoders, which are based on the radix-4 Booth algorithm. Accumulation is then carried out together with the addition of the partial products in the CSA (carry-save adder) and accumulator. As a result, the n-bit values S, C, and Z are generated. These three values are fed back and used for the accumulation operation. P[2n-1:n] is generated by adding S and C in the final adder, which can be a CLA (carry look-ahead adder), an RCA (ripple-carry adder), or a Kogge-Stone adder. Finally, combining P[2n-1:n] with P[n-1:0] gives the final result.
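The carry-save idea behind this structure can be illustrated with a small behavioural model. This is a minimal sketch, not the paper's hardware: the function names `carry_save_add` and `mac`, the default width `n = 16`, and the choice to feed a whole product (rather than individual Booth partial products) into the compressor are assumptions made for brevity. The running total is kept as a sum word S and a carry word C, and the carry-propagating addition S + C is performed only once, at the end, as the final adder would.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """3:2 compression: reduce three operands to a sum word and a carry word.

    No carry propagates between bit positions; a + b + c == sum + carry.
    """
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (b & c) | (a & c)) << 1
    return sum_word, carry_word


def mac(pairs, n: int = 16) -> int:
    """Accumulate x*y over all pairs, keeping the total in carry-save form.

    Hypothetical behavioural model: operands are n-bit signed integers and
    the accumulator is 2n bits wide (two's complement, wraps on overflow).
    """
    mask = (1 << (2 * n)) - 1
    s, c = 0, 0                              # carry-save accumulator (S, C)
    for x, y in pairs:
        product = (x * y) & mask             # two's complement, 2n bits
        s, c = carry_save_add(s, c, product)
        s, c = s & mask, c & mask
    total = (s + c) & mask                   # single final carry-propagate add
    if total >= 1 << (2 * n - 1):            # reinterpret as a signed value
        total -= 1 << (2 * n)
    return total


print(mac([(3, 4), (5, 6)]))        # 3*4 + 5*6
print(mac([(-10, -13), (2, -3)]))   # (-10)*(-13) + 2*(-3)
```

Deferring the carry-propagating addition is what keeps the accumulation loop short: each pass through `carry_save_add` costs one full-adder delay regardless of the word width.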
BOOTH ENCODERS
Motivation: The main bottleneck in the speed of multiplication is the addition of the partial products. The more bits the multiplier and multiplicand have, the more partial products there are and the longer the delay in calculating the product. The critical path of the multiplier depends on the number of partial products. In the radix-2 Booth algorithm, multiplying two n-bit numbers gives n partial products to add. Radix-4 Booth multiplication reduces this to n/2 partial products when multiplying two n-bit numbers if n is even, or (n+1)/2 if n is odd. Halving the number of partial products in this way can speed up the multiplier by a factor of roughly 2.

Radix-4 Booth Algorithm

One of the ways to multiply signed numbers was invented by Booth. Consider a multiplicand M, n bits wide, represented as $M_{n-1} M_{n-2} \ldots M_2 M_1 M_0$, and a multiplier R, also n bits wide, represented as $R_{n-1} R_{n-2} \ldots R_2 R_1 R_0$. Both are signed (two's complement) binary numbers. As per Booth's algorithm,

$$M \times R = M \times \left\{ S_{n-1} \cdot 2^{n-1} + S_{n-2} \cdot 2^{n-2} + \cdots + S_2 \cdot 2^2 + S_1 \cdot 2^1 + S_0 \cdot 2^0 \right\} \qquad (1)$$

where each $S_k$, for $0 \le k \le n-1$, is a value that depends on R and is found as explained in the following steps. First, append a 0 to R at the LSB end; we call this bit Z. Then form groups of t bits from the multiplier R, where t = 3 for the radix-4 Booth algorithm. Each group is denoted $C_k$, for $0 \le k \le N-1$, where N = n/2 if n is even and N = (n+1)/2 if n is odd. The groups are formed as follows:
$C_k = (R_{2k+1} R_{2k} R_{2k-1})$, for $1 \le k \le N-2$
$C_k = (R_{2k+1} R_{2k} Z)$, for k = 0, with Z = 0
$C_k = (R_{2k+1} R_{2k} R_{2k-1})$, if n is even and k = N-1
$C_k = (R_{2k} R_{2k} R_{2k-1})$, if n is odd and k = N-1

Note that in the last case the sign bit is repeated, i.e. $R_{2k}$ appears in two of the bits of $C_k$. The multiplier R has to be sign-extended by one bit if $C_{N-1}$ would otherwise contain fewer than t = 3 bits. This always happens when n is odd. Each group $C_k$ is mapped to a digit $S_k$ according to Table 1.

Table 1

| C_k | S_k |
|-----|-----|
| 000 | 0   |
| 001 | +1  |
| 010 | +1  |
| 011 | +2  |
| 100 | -2  |
| 101 | -1  |
| 110 | -1  |
| 111 | 0   |

The number of partial products decreases from n to N. All the radix-2 partial products that are multiplied by $2^x$, where x is odd, disappear. That means that instead of shifting each successive partial product by 1 bit (i.e. multiplying it by $2^1$) before summing them for the final product, we shift it by 2 bits (i.e. multiplying it by $2^2$). The final product equation now becomes

$$M \times R = pp_{N-1} \cdot 2^{2(N-1)} + pp_{N-2} \cdot 2^{2(N-2)} + \cdots + pp_1 \cdot 2^2 + pp_0 \cdot 2^0 \qquad (2)$$

where $pp_{N-1} = M \cdot S_{N-1}$, $pp_{N-2} = M \cdot S_{N-2}$, ..., $pp_1 = M \cdot S_1$, $pp_0 = M \cdot S_0$. The $pp_k$ are called partial products. The full radix-4 Booth multiplier equation can be written as

$$M \times R = M \cdot S_{N-1} \cdot 2^{2(N-1)} + M \cdot S_{N-2} \cdot 2^{2(N-2)} + \cdots + M \cdot S_1 \cdot 2^2 + M \cdot S_0 \cdot 2^0 \qquad (3)$$
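The grouping rule, the digit table, and the product equation (3) can be sketched in software as follows. This is an illustrative model under stated assumptions rather than the Verilog design of the paper: the names `BOOTH_DIGIT`, `booth_radix4_digits`, and `booth_multiply` are hypothetical, and the sketch relies on Python integers behaving as sign-extended two's complement values under `>>` and `&`, which supplies the repeated sign bit for odd n automatically.

```python
# Digit table: 3-bit group C_k -> Booth digit S_k
BOOTH_DIGIT = {
    0b000: 0, 0b001: +1, 0b010: +1, 0b011: +2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}


def booth_radix4_digits(r: int, n: int) -> list[int]:
    """Return the digits S_0 .. S_{N-1} of the n-bit signed multiplier r."""
    if not -(1 << (n - 1)) <= r < (1 << (n - 1)):
        raise ValueError("r does not fit in n bits")
    num_groups = (n + 1) // 2          # N = n/2 (n even) or (n+1)/2 (n odd)
    extended = r << 1                  # append Z = 0 at the LSB end
    digits = []
    for k in range(num_groups):
        # Overlapping group (R_{2k+1}, R_{2k}, R_{2k-1}); bits above R_{n-1}
        # repeat the sign bit because >> is arithmetic on Python ints.
        group = (extended >> (2 * k)) & 0b111
        digits.append(BOOTH_DIGIT[group])
    return digits


def booth_multiply(m: int, r: int, n: int) -> int:
    """Sum the partial products pp_k = M * S_k, each shifted left by 2k bits."""
    return sum((m * s) << (2 * k)
               for k, s in enumerate(booth_radix4_digits(r, n)))


print(booth_radix4_digits(-13, 5))
print(booth_multiply(-10, -13, 5))
```

Each digit is one of 0, ±1, ±2, so every partial product is obtainable from M by at most a one-bit shift and a negation, which is why the recoding needs no extra adders.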
Note that all the terms containing multiplication by $2^x$, where x is odd, have disappeared. When the partial products are added, each partial product is therefore shifted by 2 bits instead of 1 bit.

The following example shows the working of the algorithm: M = 10110 (-10), R = 10011 (-13), n = 5 bits. Since n is odd, N = (n+1)/2 = 3, so the partial products are

pp0 = M*S0
pp1 = M*S1
pp2 = M*S2

The multiplier is written as R = Sign 1 0 0 1 1 Z. Substituting 1 for the sign bit and 0 for Z gives R = 1100110, so

C0 = 110, so S0 = -1
C1 = 001, so S1 = +1
C2 = 110, so S2 = -1

$$M \times R = pp_2 \cdot 2^4 + pp_1 \cdot 2^2 + pp_0 \cdot 2^0$$

$$M \times R = 10110\,(-10) \times (-1) \times 2^4 + 10110\,(-10) \times (+1) \times 2^2 + 10110\,(-10) \times (-1) \times 2^0$$

Shifting, sign extension, and adding of the partial products gives

0000001010 (+10)
1111011000 (-40)
0010100000 (+160)
0010000010 (+130)

The answer is +130, which is correct because M = -10 and R = -13. Note that each partial product is shifted 2 places, and that the radix-4 algorithm reduces the number of partial products to about half (here 3 instead of 5).

EXPERIMENTAL RESULTS

The 16×16-bit parallel MAC based on Booth encoding (i.e. radix-4 Booth encoding) was designed in Verilog, and the functionality of the algorithms was verified with XILINX ISE 9.2i on a Spartan-3E XC3S500E device with the CP132 package and -5 speed grade [11].

Table 2: Comparison

| Approach   | Area | Delay     |
|------------|------|-----------|
| Radix-4    | 15%  | 10.881 ns |
| Radix-2    | 17%  | 28.6 ns   |
| CSA Serial | 26%  | 33.55 ns  |
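As a quick reading of Table 2, the delay ratios can be worked out directly from the reported figures. This is a back-of-envelope calculation that assumes the three delays were measured under the same conditions, which the text does not state explicitly:

$$\frac{28.6\ \text{ns}}{10.881\ \text{ns}} \approx 2.63, \qquad \frac{33.55\ \text{ns}}{10.881\ \text{ns}} \approx 3.08$$

On these figures, the radix-4 design is roughly 2.6 times faster than the radix-2 design and about 3.1 times faster than the CSA serial design.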
CONCLUSIONS
The primary objective of this paper has been to present a new type of parallel MAC using a radix-4 multiplier, to reduce the implementation to practice, and to show through simulation and design that this algorithm is competitive with other, more commonly used algorithms in high-performance implementations. Secondarily, this paper has shown that a parallel MAC based on a radix-4 Booth encoder has a higher speed of operation and can therefore be used in high-performance systems. This work can be applied in DSP applications, numerical coprocessors, calculators (pocket, graphic, etc.), filtering, and modulation and demodulation. The summation networks and the partial product generation logic account for much of the delay and most of the area of a MAC. Future work should therefore develop further techniques to improve the performance of the MAC and to minimize its area consumption.
REFERENCES
1. Young-Ho Seo and Dong-Wook Kim, A New VLSI Architecture of Parallel Multiplier Accumulator Based on Radix-2 Modified Booth Algorithm, IEEE Trans. on VLSI Systems, Vol. 18, No. 2, Feb. 2010.
2. P. Asadee, A High-Speed Multiplication Algorithm Using Modified Partial Product Reduction Tree, International Journal of Electrical and Electronics Engineering, 4:4, 2010.
3. Yajuan He and Chip-Hong Chang, A New Redundant Binary Booth Encoding for Fast 2n-bit Multiplier Design, IEEE Transactions on Circuits and Systems, Vol. 56, No. 6, June 2009.
4. Marc Hunger and Daniel Marienfeld, New Self Checking Booth Multiplier, Int. J. Appl. Math. Comput. Sci., Vol. 18, No. 3, 319-328, 2008.
5. N. Jiang and D. Harris, Parallelized Radix-2 Scalable Montgomery Multiplier, submitted to IFIP Intl. Conf. on VLSI, 2007.
6. Y.N. Ching, Low-Power High-Speed Multipliers, IEEE Transactions on Computers, Vol. 54, No. 3, pp. 355-361, 2005.
7. D. Harris, R. Krishnamurthy, M. Anders, S. Mathew, and S. Hsu, An Improved Unified Scalable Radix-2 Montgomery Multiplier, Proc. 17th IEEE Symp. Computer Arithmetic, pp. 172-178, 2005.
8. A. Efthymiou, W. Suntiamorntut, J. Garside, and L. Brackenbury, An Asynchronous, Iterative Implementation of the Original Booth Multiplication Algorithm, Proc. Int. Symp. on Advanced Research in Asynchronous Circuits and Systems, pp. 207-215, IEEE Computer Society Press, Apr. 2004.
9. M. Sheplie, High Performance Array Multiplier, IEEE Transactions on Very Large Scale Integration Systems, Vol. 12, No. 3, pp. 320-325, 2004.
