Multiplier design based on ancient Indian Vedic Mathematics
Honey Durga Tiwari, Ganzorig Gankhuyag, Chan Mo Kim, Yong Beom Cho
Dept. of Electronics Engineering
Konkuk University
Seoul, South Korea
honeyndt@konkuk.ac.kr

Abstract-Vedic mathematics is the name given to the ancient Indian system of mathematics that was rediscovered in the early twentieth century from ancient Indian scriptures (Vedas). It mainly deals with Vedic mathematical formulae and their application to various branches of mathematics. Algorithms based on conventional mathematics can be simplified and even optimized by the use of Vedic Sutras. These methods and ideas can be directly applied to trigonometry, plane and spherical geometry, conics, calculus (both differential and integral), and applied mathematics of various kinds. In this paper a new multiplier and square architecture is proposed, based on an algorithm of ancient Indian Vedic Mathematics, for low power and high speed applications. It is based on generating all partial products and their sums in one step. The design implementation on an ALTERA Cyclone-II FPGA shows that the proposed Vedic multiplier and square are faster than the array multiplier and the Booth multiplier.

Keywords-Vedic Mathematics; Multiplier; Array Multiplier; Square Architecture.

I. INTRODUCTION

Digital multipliers [1], [2] are the core components of all digital signal processors (DSPs), and the speed of a DSP is largely determined by the speed of its multipliers [3]. They are indispensable in the implementation of computation systems realizing many important functions such as fast Fourier transforms (FFTs) and multiply accumulate (MAC). Multiplication can be implemented using several algorithms such as array, Booth, carry save, modified Booth and Wallace tree. A number of interesting parallel and serial-parallel multiplier architectures have been proposed based on the aforesaid algorithms, which improve the cost-throughput efficiency.

In an array multiplier, multiplication of two binary numbers can be obtained with one micro-operation by using a combinational circuit that forms the product bits all at once. This makes it a fast way of multiplying two numbers, since the only delay is the time for the signals to propagate through the gates that form the multiplication array. However, an array multiplier requires a large number of gates and for this reason it is less economical [2]. The other aspect of improving multiplier efficiency is the arrangement of adders. As far as the arrangement of adders is concerned, there are two methods: a carry save array (CSA) method and a Wallace tree method [3].

In the CSA method, bits are processed one by one to supply a carry signal to an adder located at a one bit higher position. This is in fact very similar to manual calculation; the layout corresponds to the logic and is regular, and hence the layout design is easy. The CSA method has its own limitation: since the execution time depends upon the number of bits of the multiplier, there is some difficulty in achieving high speed operation [3].

In the Wallace tree method, three bit signals are passed to a one bit full adder ("3W"), which is called a three input Wallace tree circuit. The sum output is supplied to the next stage full adder of the same bit position, and the carry output is supplied to the next stage full adder located at a one bit higher position. In the Wallace tree method, the circuit layout is not easy, although the speed of operation is high, since the circuit is quite irregular.

Another improvement in the multiplier is obtained by reducing the number of partial products generated. The Booth recoding multiplier is one such multiplier; it scans three bits at a time to reduce the number of partial products [4]. These three bits are the two bits from the present pair and a third bit from the high order bit of the adjacent lower order pair. After examining each triplet of bits, the triplets are converted by Booth logic into a set of five control signals used by the adder cells in the array to control the operations performed by the adder cells.

The method of Booth recoding reduces the number of adders, and hence the delay required to produce the partial sums, by examining three bits at a time. The high performance of the Booth multiplier comes with the drawback of power consumption. The reason for this is the large number of adder cells (15 cells for 8 rows, i.e., 120 core cells) that consume power [4, 5, 6, 7]. The conclusion is that the current methodology of multiplication leads to higher power consumption and reduced efficiency.

This paper proposes a multiplier and square architecture that addresses the aforesaid problems by adopting the sutra of Vedic Mathematics called Urdhva Tiryakbhyam (Vertically and Crosswise) [8, 9, 10]. It can be shown that the design is highly efficient in terms of silicon area/speed. Hence it is a preferred choice for DSP algorithms like FFT and DCT, used in image processing standards such as MPEG codecs.
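
As an aside on the three-bit Booth scan described in this section, the following is a minimal software sketch of radix-4 Booth recoding, not the circuit of [4] or [7]. The function names `booth_radix4_digits` and `booth_multiply` are hypothetical, and the sketch assumes a two's-complement multiplier with an even bit width. Each triplet (two bits of the present pair plus the high order bit of the lower pair) becomes one signed digit, so only half as many partial products are summed.

```python
def booth_radix4_digits(multiplier, width):
    """Recode a two's-complement multiplier into radix-4 Booth digits.

    Assumption for this sketch: `width` is even and `multiplier` fits in
    `width` bits. Digit i has weight 4**i and lies in {-2, -1, 0, 1, 2}.
    """
    if width <= 0 or width % 2:
        raise ValueError("width must be a positive even number")
    if not -(1 << (width - 1)) <= multiplier < (1 << (width - 1)):
        raise ValueError("multiplier does not fit in the given width")
    bits = multiplier & ((1 << width) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of bit 0
    for i in range(0, width, 2):
        low = (bits >> i) & 1         # lower bit of the present pair
        high = (bits >> (i + 1)) & 1  # upper bit of the present pair
        digits.append(low + prev - 2 * high)
        prev = high                   # becomes the third bit of the next triplet
    return digits


def booth_multiply(multiplicand, multiplier, width):
    """Sum one partial product per Booth digit (width / 2 of them)."""
    digits = booth_radix4_digits(multiplier, width)
    return sum(d * multiplicand * 4**i for i, d in enumerate(digits))
```

For a 4-bit multiplier this yields two partial products instead of four; for example, `booth_radix4_digits(6, 4)` gives the digits -2 and 2, since -2 + 2·4 = 6.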

Our thanks go to IDEC, IITA/ETRI SoC Industry Promotion Center and Seoul R&BD Program for providing funds for the research work. We also thank Korea Ministry of Knowledge Economy for supporting this project under "System IC 2010" project.

978-1-4244-2599-0/08/$25.00 ©2008 IEEE II-65 2008 International SoC Design Conference

Authorized licensed use limited to: Shri Ramswaroop Memorial Col of Eng and Management. Downloaded on March 20,2010 at 10:56:49 EDT from IEEE Xplore. Restrictions apply.
II. VEDIC FORMULAE

A. Vedic Sutras
The word ‘Vedic’ is derived from the word ‘veda’ which means the store-house of all knowledge. Vedic mathematics is mainly based on 16 Sutras (or aphorisms) dealing with various branches of mathematics like arithmetic, algebra, geometry etc. [8]. These Sutras along with their brief meanings are listed below alphabetically.

1. (Anurupye) Shunyamanyat – If one is in ratio, the other is zero
2. Chalana-Kalanabyham – Differences and Similarities
3. Ekadhikina Purvena – By one more than the previous one
4. Ekanyunena Purvena – By one less than the previous one
5. Gunakasamuchyah – The factors of the sum is equal to the sum of the factors
6. Gunitasamuchyah – The product of the sum is equal to the sum of the product
7. Nikhilam Navatashcaramam Dashatah – All from 9 and the last from 10
8. Paraavartya Yojayet – Transpose and adjust
9. Puranapuranabyham – By the completion or noncompletion
10. Sankalana-vyavakalanabhyam – By addition and by subtraction
11. Shesanyankena Charamena – The remainders by the last digit
12. Shunyam Saamyasamuccaye – When the sum is the same that sum is zero
13. Sopaantyadvayamantyam – The ultimate and twice the penultimate
14. Urdhva-tiryakbyham – Vertically and crosswise
15. Vyashtisamanstih – Part and Whole
16. Yaavadunam – Whatever the extent of its deficiency

These formulae constitute a diverse field of study. The proposed design uses only the Urdhva-tiryakbyham method; hence a detailed description of the other formulae is beyond the scope of this paper.

B. Urdhva Tiryakbhyam Sutra
Urdhva tiryakbhyam Sutra is a general multiplication formula applicable to all cases of multiplication. It literally means “Vertically and Crosswise”. To illustrate this multiplication scheme, let us consider the multiplication of two decimal numbers (5498 × 2314). The conventional methods already known to us will require 16 multiplications and 15 additions.

An alternative method of multiplication using Urdhva tiryakbhyam Sutra is shown in Fig. 1. The numbers to be multiplied are written on two consecutive sides of the square as shown in the figure. The square is divided into rows and columns where each row/column corresponds to one of the digits of either the multiplier or the multiplicand. Thus, each digit of the multiplier has a small box in common with each digit of the multiplicand. These small boxes are partitioned into two halves by the crosswise lines. Each digit of the multiplier is then independently multiplied with every digit of the multiplicand and the two-digit product is written in the common box. All the digits lying on a crosswise dotted line are added to the previous carry. The least significant digit of the obtained number acts as the result digit and the rest as the carry for the next step. The carry for the first step (i.e., the dotted line on the extreme right side) is taken to be zero.

Figure 1. Alternative way of multiplication by Urdhva tiryakbhyam Sutra.

C. Urdhva Tiryakbhyam Sutra for binary number system
In this section we extend this Sutra to the binary number system. To illustrate the multiplication algorithm, let us consider the multiplication of two binary numbers a3a2a1a0 and b3b2b1b0. As the result of this multiplication would be more than 4 bits, we express it as …r3r2r1r0. The line diagram for multiplication of two 4-bit numbers is shown in Fig. 2, which is nothing but the mapping of Fig. 1 to the binary system. For the sake of simplicity, each bit is represented by a circle. The least significant bit r0 is obtained by multiplying the least significant bits of the multiplicand and the multiplier. The process is followed according to the steps shown in Fig. 2. As in the previous case, the bits on both ends of each line are multiplied and added to the carry from the previous step. This generates one of the bits of the result (rn) and a carry (say cn). This carry is added in the next step and hence the process goes on. If there is more than one line in a step, all the results are added to the previous carry. In each step, the least significant bit acts as the result bit and all the other bits act as the carry. For example, if in some intermediate step we get 110, then 0 will act as the result bit and 11 as the carry (referred to as cn in this text). It should be clearly noted that cn may be a multi-bit number. Thus we get the following expressions:

r0 = a0b0; (1)
c1r1 = a1b0 + a0b1; (2)
c2r2 = c1 + a2b0 + a1b1 + a0b2; (3)
c3r3 = c2 + a3b0 + a2b1 + a1b2 + a0b3; (4)

c4r4 = c3 + a3b1 + a2b2 + a1b3; (5)
c5r5 = c4 + a3b2 + a2b3; (6)
c6r6 = c5 + a3b3 (7)

with c6r6r5r4r3r2r1r0 being the final product.

Figure 2. Line diagram for multiplication of two 4-bit numbers.

Hence this is the general mathematical formula applicable to all cases of multiplication. The hardware design for this algorithm will be very similar to that of the famous array multiplier, where an array of adders is required to arrive at the final product. All the partial products are calculated in parallel and the delay associated is mainly the time taken by the carry to propagate through the adders which form the multiplication array. Clearly, this is not an efficient algorithm for the multiplication of large numbers, as a lot of propagation delay is involved in such cases. To deal with this problem, we now discuss Nikhilam Sutra, which presents an efficient method of multiplying two large numbers.

D. Nikhilam Sutra
Nikhilam Sutra literally means “all from 9 and last from 10”. Although it is applicable to all cases of multiplication, it is more efficient when the numbers involved are large. It finds the complement of the large number from its nearest base and performs the multiplication operation on the complement; hence the larger the original number, the lower the complexity of the multiplication. We first illustrate this Sutra by considering the multiplication of two decimal numbers (96 × 93) where the chosen base is 100, which is nearest to and greater than both these numbers.

Figure 3. Line diagram for multiplication of two 4-bit numbers.

As shown in Fig. 3, we write the multiplier and the multiplicand in two rows followed by the differences of each of them from the chosen base, i.e., their complements. We can now write two columns of numbers, one consisting of the numbers to be multiplied (Column 1) and the other consisting of their complements (Column 2). The product also consists of two parts which are demarcated by a vertical line for the purpose of illustration. The right hand side (RHS) of the product can be obtained by simply multiplying the numbers of Column 2 (7 × 4 = 28). The left hand side (LHS) of the product can be found by cross subtracting the second number of Column 2 from the first number of Column 1 or vice versa, i.e., 96 - 7 = 89 or 93 - 4 = 89. The final result is obtained by concatenating LHS and RHS (Answer = 8928).

After this illustration, we now discuss the operational principle of Nikhilam Sutra by taking the case of multiplication of two n-digit numbers a and e having complements â = 10^n - a and ê = 10^n - e respectively. The required product ‘p’ is defined as:

p = ae; (8)

which can be reframed by adding and subtracting 10^{2n} + 10^n(a + e) on the right hand side as:

p = ae + 10^{2n} - 10^{2n} + 10^n(a + e) - 10^n(a + e) (9)

The above terms can be grouped as follows:

p = {10^n(a + e) - 10^{2n}} + {10^{2n} - 10^n(a + e) + ae}
  = 10^n{(a + e) - 10^n} + {(10^n - a)(10^n - e)}
  = 10^n{a - ê} + {âê} = 10^n{e - â} + {âê} (10)

From (10), the expressions of LHS and RHS can be deduced, which come out to be:

LHS = {a - ê} = {e - â}; (11)
RHS = {âê}; (12)

Hence the multiplication of two n-digit numbers is reduced to the multiplication of their complements. To take full advantage of this reduction, it should be ensured that the numbers obtained after taking the complements are smaller than the original numbers. This condition is satisfied if both the original numbers are greater than 10^n/2, i.e., a > 10^n/2 and e > 10^n/2. This is the reason why it is said that the Nikhilam Sutra is more efficient in the multiplication of large numbers than of smaller ones.

An important point to note here is the number of digits required in the RHS of the product. From (10), it is clear that the RHS should have n digits irrespective of the number of digits in the product âê. We illustrate this point by considering a special case of the multiplication of two 2-digit numbers in which the RHS comes out to be a single digit (99 × 97). As shown in Fig. 4, the LHS of the product comes out to be 99 - 3 = 97 - 1 = 96 and the RHS comes out to be 3 × 1 = 3. As n = 2 in this case, we need to prepend a leading zero to the RHS, making it 03. The final result thus comes out to be 9603. On the other hand, if the number of digits in the RHS had been three, the most significant digit would be carried over to the LHS.
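
The two sutras discussed in this section can be checked with a short software sketch. It is only an illustration under stated assumptions, not the hardware design of the paper: `to_digits`, `from_digits`, `urdhva_multiply` and `nikhilam_parts` are hypothetical helper names, digits are stored least significant first, and both operands are assumed to have the same number of digits. The first routine follows the column-by-column scheme of expressions (1) to (7) for any base; the second follows expressions (11) and (12), including the rule that the RHS keeps exactly n digits.

```python
def to_digits(value, n, base):
    """Split a non-negative integer into n digits, least significant first."""
    return [(value // base**i) % base for i in range(n)]


def from_digits(digits, base):
    """Inverse of to_digits."""
    return sum(d * base**i for i, d in enumerate(digits))


def urdhva_multiply(a, b, base=2):
    """Vertically-and-crosswise product of two equal-length digit lists.

    Step k adds every cross product a[i] * b[j] with i + j == k to the
    carry of the previous step; the low digit is the result digit and
    the rest is the (possibly multi-digit) carry.
    """
    n = len(a)
    if len(b) != n:
        raise ValueError("operands must have the same number of digits")
    result, carry = [], 0
    for k in range(2 * n - 1):
        column = carry + sum(a[i] * b[k - i] for i in range(n) if 0 <= k - i < n)
        carry, digit = divmod(column, base)
        result.append(digit)
    result.append(carry)  # top digit, c6 in the 4-bit case
    return result


def nikhilam_parts(a, e, n):
    """Return (LHS, RHS) of the product a * e with respect to base 10**n."""
    base = 10**n
    a_hat, e_hat = base - a, base - e  # complements from the base
    lhs = a - e_hat                    # equals e - a_hat
    rhs = a_hat * e_hat
    carry, rhs = divmod(rhs, base)     # RHS keeps exactly n digits
    return lhs + carry, rhs


if __name__ == "__main__":
    bits = urdhva_multiply(to_digits(13, 4, 2), to_digits(11, 4, 2))
    print(from_digits(bits, 2))        # 13 * 11
    lhs, rhs = nikhilam_parts(99, 97, 2)
    print(f"{lhs}{rhs:02d}")           # LHS followed by the zero-padded RHS
```

With the values used in the text, `nikhilam_parts(96, 93, 2)` returns (89, 28) and `nikhilam_parts(99, 97, 2)` returns (96, 3), which is written as 9603 once the RHS is padded to two digits. Calling `urdhva_multiply` with `base=10` reproduces the decimal scheme of Fig. 1.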

Figure 4. Line diagram for multiplication of two 4-bit numbers.

III. IMPLEMENTATION AND RESULTS
The above mentioned method was used to implement 2, 3, 4 and 8 bit multipliers. Table 1 shows the number of calculations required for various bit lengths.

TABLE I. COMPUTATIONAL COMPLEXITY OF CONVENTIONAL ARRAY MULTIPLIER AND VEDIC MULTIPLIER

                   Number of calculations
Input Bit      Conventional         Vedic
Length          M       A         M       A
2               4       2         4       1
3               9       7         9       5
4              16      15        16       9
8              64      77        64      53

M: Number of multiplications, A: Number of additions

The 8 bit multiplier was implemented on an ALTERA Cyclone-II FPGA and it utilized 231 combinational functions. The worst case propagation delay in this case was found to be 27 ns, giving an operating frequency of 37 MHz. To compare it with other implementations, the design was synthesized on XILINX:SPARTAN:S30VQ100:-4. Table 2 shows the synthesis results for various implementations. The proposed Vedic multiplier is faster than the array multiplier and the Booth multiplier. In both the methods suggested earlier, the multiplication of individual bits is done in parallel and the addition follows the multiplication process. The use of a carry look ahead adder will make the addition process and carry generation faster, thereby reducing the associated delay.

TABLE II. COMPUTATIONAL COMPLEXITY OF CONVENTIONAL ARRAY MULTIPLIER AND VEDIC MULTIPLIER

Device                                  Array   Booth   Vedic
XILINX:SPARTAN:S30VQ100:-4    FMAP       150     283     201
                              HMAP        10      49      26
                              DELAY       43     124      12

IV. CONCLUSION
A new reduced-bit multiplication algorithm based on a formula of ancient Indian Vedic mathematics has been proposed. Both the Vedic multiplication formulae, Urdhva tiryakbhyam and Nikhilam, have been investigated in detail. Urdhva tiryakbhyam, being a general mathematical formula, is equally applicable to all cases of multiplication. A multiplier architecture based on this Sutra has been developed and is seen to be similar to the popular array multiplier, where an array of adders is required to arrive at the final product. Due to its structure, it suffers from a high carry propagation delay in the case of multiplication of large numbers. This problem has been solved by introducing Nikhilam Sutra, which reduces the multiplication of two large numbers to the multiplication of two small numbers. The framework of the proposed algorithm is taken from this Sutra and is further optimized by the use of some general arithmetic operations such as expansion and bit shifting to take full advantage of bit-reduction in multiplication. The computational efficiency of the algorithm has been illustrated by reducing a general 4 × 4-bit multiplication to a single 2 × 2-bit multiplication operation. The FPGA implementation result shows that the delay and the area required in the proposed design are far less than in the conventional Booth and array multiplier designs, making it efficient for use in various DSP applications.

REFERENCES
[1] K. Hwang, Computer Arithmetic: Principles, Architecture And Design. New York: John Wiley & Sons, 1979.
[2] M. M. Mano, Computer System Architecture. Englewood Cliffs, NJ: Prentice-Hall, 1982.
[3] Gensuke Goto, “High Speed Digital Parallel Multiplier”, United States Patent 5,465,226, November 7, 1995.
[4] A. D. Booth, “A Signed Binary Multiplication Technique”, Qrt. J. Mech. App. Math., vol. 4, no. 2, pp. 236–240, 1951.
[5] G. Goto, “High Speed Digital Parallel Multiplier”, U.S. Patent 5,465,226, Nov. 7, 1995.
[6] L. Ciminiera and A. Valenzano, “Low Cost Serial Multiplier for High Speed Specialised Processors”, IEE Proc., vol. 135, no. 5, pp. 259–265, Sept. 1988.
[7] Tam Anh Chu, “Booth Multiplier with Low Power High Performance Input Circuitry”, US Patent 6,393,454 B1, May 21, 2002.
[8] Jagadguru Swami Sri Bharath, Krsna Tirathji, “Vedic Mathematics or Sixteen Simple Sutras From The Vedas”, Motilal Banarsidas, Varanasi (India), 1986.
[9] A. P. Nicholas, K. R. Williams, J. Pickles, “Application of Urdhava Sutra”, Spiritual Study Group, Roorkee (India), 1984.
[10] A. P. Nicholas, K. R. Williams, J. Pickles, “Lectures on Vedic Mathematics”, Spiritual Study Group, Roorkee (India), 1982.
[11] B. K. Tirtha, Vedic Mathematics. Delhi: Motilal Banarsidass Publishers, 1965.
[12] H. Thapliyal and M. B. Srinivas, “High Speed Efficient N × N Bit Parallel Hierarchical Overlay Multiplier Architecture Based on Ancient Indian Vedic Mathematics”, Enformatika Trans., vol. 2, pp. 225-228, 1 Dec. 2004.
[13] H. Thapliyal, M. B. Srinivas and H. R. Arabnia, “Design And Analysis of a VLSI Based High Performance Low Power Parallel Square Architecture”, in Proc. Int. Conf. Algo. Math. Comp. Sc., Las Vegas, June 2005, pp. 72–76.
