A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by the customer or designer after manufacturing, hence "field-programmable". The FPGA configuration is generally specified using a hardware description language (HDL), similar to that used for an application-specific integrated circuit (ASIC); circuit diagrams were previously used to specify the configuration, as they were for ASICs, but this is increasingly rare. FPGAs can be used to implement any logical function that an ASIC could perform. The ability to update the functionality after shipping, partial reconfiguration of a portion of the design, and the low non-recurring engineering costs relative to an ASIC design (notwithstanding the generally higher unit cost) offer advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like many (changeable) logic gates that can be inter-wired in (many) different configurations. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

In addition to digital functions, some FPGAs have analog features. The most common analog feature is programmable slew rate and drive strength on each output pin, allowing the engineer to set slow rates on lightly loaded pins that would otherwise ring unacceptably, and to set stronger, faster rates on heavily loaded pins on high-speed channels that would otherwise run too slowly. [3][4] Another relatively common analog feature is differential comparators on input pins designed to be connected to differential signaling channels. A few "mixed-signal FPGAs" have integrated peripheral analog-to-digital converters (ADCs) and digital-to-analog converters (DACs) with analog signal conditioning blocks, allowing them to operate as a system-on-a-chip.
Such devices blur the line between an FPGA, which carries digital ones and zeros on its internal programmable interconnect fabric, and a field-programmable analog array (FPAA), which carries analog values on its internal programmable interconnect fabric.
Virtex family

The Virtex series of FPGAs has integrated features that include FIFO and ECC logic, DSP blocks, PCI Express controllers, Ethernet MAC blocks, and high-speed transceivers. In addition to FPGA logic, the Virtex series includes embedded fixed-function hardware for commonly used functions such as multipliers, memories, serial transceivers, and microprocessor cores. These capabilities are used in applications such as wired and wireless infrastructure equipment, advanced medical equipment, test and measurement, and defense systems.

Some Virtex family members are available in radiation-hardened packages, specifically to operate in space, where harmful streams of high-energy particles can play havoc with semiconductors. The Virtex-5QV FPGA was designed to be 100 times more resistant to radiation than previous radiation-resistant models and offers a ten-fold increase in performance. However, characterization and test data were not yet available for the Virtex-5QV on the Xilinx Radiation Test Consortium website as of November 2011.

Xilinx's most recently announced Virtex, the Virtex-7 family, is based on a 28 nm design and is reported to deliver a two-fold system performance improvement at 50 percent lower power compared to previous-generation Virtex-6 devices. In addition, Virtex-7 doubles the memory bandwidth compared to previous-generation Virtex FPGAs, with 1866 Mb/s memory interfacing performance and over two million logic cells. In 2011, Xilinx began shipping sample quantities of the Virtex-7 2000T FPGA, which packages four smaller FPGAs into a single chip by placing them on a special silicon communications pad called an interposer, delivering 6.8 billion transistors in a single large chip. The interposer provides 10,000 data pathways between the individual FPGAs, roughly 10 to 100 times more than would usually be available on a board, to create a single FPGA.
The Virtex-6 family is built on a 40 nm process for compute-intensive electronic systems, and the company claims it consumes 15 percent less power and has 15 percent improved performance over competing 40 nm FPGAs. The Virtex-5 LX and LXT are intended for logic-intensive applications, and the Virtex-5 SXT is for DSP applications. With the Virtex-5, Xilinx changed the logic fabric from four-input LUTs to six-input LUTs. With the increasing complexity of the combinational logic functions performed by SoCs, the percentage of combinational paths requiring multiple four-input LUTs had become a performance and routing bottleneck. The new six-input LUT represented a tradeoff: better handling of increasingly complex combinational functions, at the expense of a reduction in the absolute number of LUTs per device. The Virtex-5 series is a 65 nm design fabricated in a 1.0 V, triple-oxide process technology. Legacy Virtex devices (Virtex, Virtex-II, Virtex-II Pro, Virtex-4) are still available but are not recommended for use in new designs.
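The four-input versus six-input LUT trade-off can be illustrated in software. The following is a minimal Python sketch (not vendor code, and the function names are my own): an N-input LUT is nothing more than a 2^N-entry truth table addressed by the input bits, which is why a wide function that fits in one 6-LUT may need two cascaded 4-LUTs plus routing between them.

```python
# Sketch: an N-input LUT is a 2^N-entry truth table addressed by its inputs.
def make_lut(n_inputs, func):
    """Build a lookup table for an n_inputs-wide boolean function."""
    table = [func(tuple((i >> b) & 1 for b in range(n_inputs))) & 1
             for i in range(2 ** n_inputs)]
    def lut(*bits):
        index = sum(bit << pos for pos, bit in enumerate(bits))
        return table[index]
    return lut

# A 6-input AND fits in a single 6-LUT; with 4-input LUTs the same function
# needs two LUTs in series (a 4-input stage feeding a second stage).
and6 = make_lut(6, lambda bits: all(bits))
print(and6(1, 1, 1, 1, 1, 1))  # 1
print(and6(1, 0, 1, 1, 1, 1))  # 0
```

Any boolean function of up to six variables maps onto one such table, which is exactly what lets synthesis tools pack arbitrary combinational logic into the fabric.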
SHARPENING
Sharpening is one of the most impressive transformations you can apply to an image, since it seems to bring out image detail that was not there before. What it actually does, however, is emphasize edges in the image and make them easier for the eye to pick out -- while the visual effect is to make the image seem sharper, no new details are actually created. Paradoxically, the first step in sharpening an image is to blur it slightly. Next, the original image and the blurred version are compared one pixel at a time. If a pixel is brighter than the blurred version, it is lightened further; if a pixel is darker than the blurred version, it is darkened. The result is to increase the contrast between each pixel and its neighbors. The nature of the sharpening is influenced by the blurring radius used and the extent to which the differences between each pixel and its neighbors are exaggerated.
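The blur-compare-exaggerate procedure above can be sketched in a few lines of Python. This is an illustrative sketch on a one-dimensional row of pixels, not Picture Window's actual implementation; the parameter names `radius`, `amount`, and `threshold` are my own.

```python
def box_blur(pixels, radius):
    """Blur by averaging each pixel with its neighbors within `radius`."""
    n = len(pixels)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(pixels[lo:hi]) / (hi - lo))
    return out

def unsharp(pixels, radius=1, amount=1.0, threshold=0):
    """Sharpen: push each pixel away from its blurred value."""
    blurred = box_blur(pixels, radius)
    result = []
    for p, b in zip(pixels, blurred):
        diff = p - b
        # Threshold: leave nearly-flat regions (e.g. sky) untouched.
        if abs(diff) <= threshold:
            result.append(p)
        else:
            result.append(min(255, max(0, round(p + amount * diff))))
    return result

edge = [50, 50, 50, 200, 200, 200]   # a soft edge in a row of pixels
print(unsharp(edge))                  # contrast across the edge increases
```

Running this on the step edge darkens the pixel just before the step and lightens the pixel just after it, which is exactly the local-contrast boost described above.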
UNSHARP MASKING

Unsharp masking is the most powerful sharpening method Picture Window supports; however, it is a little more complicated to use. When you select Unsharp Masking, the Sharpen dialog box expands to add two additional sliders for Radius and Threshold. The Radius slider lets you control the amount of blurring. Generally you should set the radius to correspond to the degree to which the original image is blurred: the blurrier the image, the higher the radius you need to select. Choosing too large a radius creates a sort of ghosting effect around the edges of objects; if the radius is too small, the sharpening effect is minimized. The Threshold setting lets you restrict the sharpening action to only those pixels whose difference from their neighbors exceeds a specified threshold value. The idea behind setting the threshold is to select a value that still brings out edge detail without creating unwanted texture in smooth areas like clouds or clear blue skies. In the image detail below, you can see how Unsharp Mask with a threshold of zero sharpens the tree silhouette, but also brings out the film grain and scanning noise in the sky area. Increasing the threshold to 20 leaves the sky mostly untouched but still makes the tree stand out against its background.
Unsharp masking (USM) is an image manipulation technique, often available in digital image processing software. The "unsharp" of the name derives from the fact that the technique uses a blurred, or "unsharp," positive to create a "mask" of the original image. [1] The unsharp mask is then combined with the negative, creating the illusion that the resulting image is sharper than the original. From a signal-processing standpoint, an unsharp mask is generally a linear or nonlinear filter that amplifies high-frequency components.
Smoothing an Image

Smoothing is often used to reduce noise within an image or to produce a less pixelated image. Most smoothing methods are based on low-pass filters; see Low Pass Filtering for more information. Smoothing is also usually based on a single value representing a neighborhood of the image, such as the average value or the middle (median) value. The following examples show how to smooth using average and median values:

Smoothing with Average Values
Smoothing with Median Values
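As a minimal illustrative sketch (function names are mine), here are the two representative values most smoothing methods are built on, computed for a single 3x3 neighborhood containing one noisy outlier. The contrast between the two results previews why median-based smoothing handles impulse noise better than averaging.

```python
def average_value(pixels):
    """The mean of a neighborhood -- the basis of low-pass averaging."""
    return sum(pixels) / len(pixels)

def median_value(pixels):
    """The middle value of a sorted neighborhood."""
    ordered = sorted(pixels)
    mid = len(ordered) // 2
    if len(ordered) % 2:                          # odd count: middle element
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even: average the middle two

# A 3x3 neighborhood with one noisy outlier (255):
window = [10, 12, 11, 13, 255, 12, 10, 11, 12]
print(average_value(window))   # pulled well above the true local level
print(median_value(window))    # 12 -- the outlier is ignored
```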
Vedic multiplication

NIKHILAM SUTRA PRELUDE TO MULTIPLICATION

To fulfill my second objective, in this column I will illustrate multiplication of two numbers using a sutra from Vedic Math called All from Nine and the Last from Ten (Sanskrit: Nikhilam Navatashcaramam Dashatah). I will choose a special case to illustrate this, but the method can be expanded to any multiplication. The sutra basically means: start from the leftmost digit and subtract 9 from each of the digits, but subtract 10 from the last digit.

Example 1: Let us choose the number 6. This has only one digit, so it is also the last digit. Applying the Nikhilam sutra, we subtract 10 from 6 to get -4.

Example 2: Given the number 87, the first digit is 8 and the last digit is 7. Using the sutra: subtract 9 from 8 to get -1; subtract 10 from the last digit 7 to get -3. So on application of the Nikhilam sutra we get -13.

NIKHILAM APPLICATION: MULTIPLICATION - SPECIAL CASE

In the following examples I will take two numbers and illustrate how to multiply them very quickly using the Nikhilam sutra. Even though this technique works for any pair of numbers, we will look at the special case when the numbers are near a base such as 10, 100, 1000, etc. We start with a simple example.

Example 3: To multiply 8 and 7. Apply the Nikhilam sutra, All from Nine and the Last from Ten, to the number 8 to get -2 (since there is only one digit, subtract from 10), and to the number 7 to get -3. Now write the following:

    8   -2
    7   -3
    ________

Multiply (-2) and (-3) to get 6 and write it down on the right:

    8   -2
    7   -3
    ______6_

Next we cross-add: add 8 and -3 to get 5, or add 7 and -2 to get 5. Note that either operation gives the same answer, 5. We find the solution by combining the numbers found by the above operations:

    8   -2
    7   -3   X
    __5___6_

So the answer is 56. One interesting observation: the origin of the multiplication sign can be traced to the above cross-adding. Now you may be wondering: I knew the answer all along, big deal. Well, I used a baby problem as an illustration. Such multiplication can also be done for two- and higher-digit numbers.

Example 4: To multiply 92 and 89. Apply the Nikhilam sutra to both numbers and write the results side by side:

    92   -08
    89   -11
    __________

Multiply (-08) and (-11) to get 88:

    92   -08
    89   -11
    ______88__

Now we cross-add: either add 92 and -11 to get 81, or add 89 and -08 to get 81. Both operations give the same answer, 81, which is written below to get the solution:

    92   -08
    89   -11   X
    __81___88_

So the answer to multiplying 92 by 89 is 8188. Again, this technique works very well if the numbers to be multiplied are near a base. Upon slight modification, it also works very well for any pair of numbers.

Homework For Fun: Try the Nikhilam sutra to multiply: (i) 85 x 98, (ii) 995 x 988, (iii) bonus problem 105 x 93. Send answers to vedicmath@hotmail.com. All correct answers will be acknowledged.
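The cross-add-and-multiply procedure in the examples above reduces to a two-line formula, sketched here in Python (the function name is my own, not part of the sutra): for numbers near a base B, the left part of the answer is one number plus the other's deficiency, and the right part is the product of the two deficiencies.

```python
# Sketch of the Nikhilam shortcut for numbers near a base (10, 100, 1000, ...).
def nikhilam(a, b, base):
    """Multiply a and b using their deficiencies from `base`."""
    da = a - base          # "all from nine and the last from ten"
    db = b - base          # yields the (signed) deficiency directly
    left = a + db          # cross-add: the same as b + da
    assert left == b + da
    right = da * db        # product of the deficiencies
    # The right part occupies as many digit positions as the base has zeros;
    # any overflow or negative value there carries into the left part.
    return left * base + right

print(nikhilam(8, 7, 10))       # 56
print(nikhilam(92, 89, 100))    # 8188
print(nikhilam(105, 93, 100))   # 9765 -- works above the base too
```

Note that for 105 x 93 the deficiencies have opposite signs (+5 and -7), so the right part is negative and borrows from the left part; the formula handles this automatically.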
1. Introduction
High-speed arithmetic operations are very important in many signal processing applications. The speed of a digital signal processor (DSP) is largely determined by the speed of its multipliers. In fact, the multipliers are among the most important parts of all digital signal processors; they are essential in realizing many important functions such as fast Fourier transforms and convolutions. Since a processor spends a considerable amount of time performing multiplication, an improvement in multiplication speed can greatly improve system performance. Multiplication can be implemented using many algorithms, such as the array, Booth, carry-save, and Wallace tree algorithms.
The computational time required by the array multiplier is comparatively low because the partial products are computed independently in parallel. The delay associated with the array multiplier is the time taken by the signals to propagate through the gates that form the multiplication array.
The arrangement of adders is another way of improving multiplication speed. There are two methods for this: the carry save array (CSA) method and the Wallace tree method. In the CSA method, bits are processed one by one to supply a carry signal to an adder located at a one-bit-higher position. The CSA method has its own limitations, since the execution time depends on the number of bits of the multiplier. In the Wallace tree method, three bit signals are passed to a one-bit full adder; the sum is supplied to the next-stage full adder of the same bit position, and the carry output is supplied to the next-stage full adder located at a one-bit-higher position. In this method, the circuit layout is not easy.
The Booth algorithm reduces the number of partial products. However, large Booth arrays are required for high-speed multiplication and exponential operations, which in turn require large partial sum and partial carry registers. Multiplication of two n-bit operands using a radix-4 Booth recoding multiplier requires approximately n/(2m) clock cycles to generate the least significant half of the final product, where m is the number of Booth recoded adder stages. Thus, a large propagation delay is associated with this case. The modified Booth encoded Wallace tree multiplier uses the modified Booth algorithm to reduce the partial products, and faster additions are performed using the Wallace tree.
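As an illustrative software sketch (not the paper's implementation), radix-4 Booth recoding can be modeled in Python: the multiplier is scanned in overlapping three-bit groups, giving digits in {-2, -1, 0, 1, 2}, so each recoded digit replaces two multiplier bits and only about n/2 partial products are needed.

```python
def booth_radix4_digits(y, n_bits):
    """Recode an n_bits-wide multiplier y into radix-4 Booth digits (LSB first)."""
    digits = []
    y_prev = 0                          # implicit bit y[-1] = 0
    for i in range(0, n_bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        # Digit = -2*y[2i+1] + y[2i] + y[2i-1], the standard recoding rule.
        digits.append(-2 * b1 + b0 + y_prev)
        y_prev = b1
    return digits

def booth_multiply(x, y, n_bits=8):
    """Accumulate x * digit * 4^i -- one add (or subtract) per recoded digit."""
    return sum(d * x * (4 ** i)
               for i, d in enumerate(booth_radix4_digits(y, n_bits)))

print(booth_multiply(13, 11))   # 143
```

The sketch assumes the multiplier fits in n_bits with a zero top bit (unsigned values below 2^(n_bits-1)); in hardware, the negative digits become two's-complement subtractions of the shifted multiplicand.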
This paper proposes a novel fast multiplier adopting the sutra of ancient Indian Vedic mathematics called Urdhva Tiryakbhyam. The proposed multiplier is faster than existing multipliers reported previously.
2. FPGA Architecture
This section describes Xilinx field-programmable gate arrays based on the architecture of the Virtex-II. All Xilinx FPGAs contain the same basic resources: slices (grouped into configurable logic blocks, or CLBs), IOBs, and programmable interconnect. The other resources include memory, multipliers, global clock buffers, and boundary scan logic. The architecture of the Virtex-II is shown in [Figure 1]. The slices contain combinational logic and register resources. Each Virtex-II CLB contains four slices. The structure of a single slice is shown in [Figure 2]. Local routing provides feedback between slices in the same CLB, and it provides routing to neighboring CLBs. A switch matrix provides access to general routing resources. The major parts of a slice include two look-up tables (LUTs), two sequential elements, and carry logic. The LUTs are known as the F LUT and the G LUT. The sequential elements can be programmed to be either registers or latches. The combinational logic is stored in the LUTs. The input path of the IOB element contains two DDR registers. The output path contains two DDR registers and two 3-state enable DDR registers. There are separate clocks and clock enables for input and output, whereas the set and reset pins are shared.
Implementation with an FPGA has to follow certain steps, as shown in [Figure 3].
3. Urdhva Tiryakbhyam
Urdhva Tiryakbhyam is a multiplication sutra (formula) from Vedic mathematics, an ancient Indian system of mathematics. Vedic mathematics was rediscovered by Jagadguru Swami Sri Bharati Krishna Tirthaji Maharaja, who found the basis of the system written in the form of sutras in an appendix of the Atharvaveda. The method is illustrated in [Figure 4].

Figure 4: Multiplication by Urdhva Tiryakbhyam.
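The vertically-and-crosswise rule illustrated in [Figure 4] can be sketched in software: column i of the result is the sum of all cross products a[j]*b[k] with j + k = i, and the columns are independent of one another, which is what makes the hardware parallelism possible. A minimal Python sketch (the digit ordering and function name are my own choices):

```python
def urdhva_multiply(a_digits, b_digits, base=10):
    """Multiply two digit lists (least significant digit first)."""
    n, m = len(a_digits), len(b_digits)
    columns = [0] * (n + m)
    for j in range(n):
        for k in range(m):
            columns[j + k] += a_digits[j] * b_digits[k]   # crosswise products
    # Propagate carries between columns to get the final digits.
    carry = 0
    result = []
    for c in columns:
        c += carry
        result.append(c % base)
        carry = c // base
    return result

# 46 x 73 = 3358; digits are stored least significant first.
print(urdhva_multiply([6, 4], [3, 7]))   # [8, 5, 3, 3]
```

Because every column sum can be formed at the same time, a hardware implementation computes all the partial products in parallel and only the final carry propagation is sequential.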
16-bit (4-digit) BCD Vedic Multiplier
The Vedic multiplier to be implemented on the FPGA Virtex-4 board for our consideration is the 16-bit (4-digit) multiplier. Here the inputs are grouped into 4 digits, each of 4 bits, represented in BCD format. The main reason for choosing BCD format for the representation of numbers is that the design can then also be used as a fixed-point fractional multiplier. The Vedic multiplier designed here is used solely for the multiplication of the pixel values, which are in BCD, with the coefficient values. The primary module of this Vedic multiplier is the 4-bit parallel array multiplier. It takes two 4-bit inputs, which are multiplied to give an 8-bit product. The product obtained is in binary format, so we require a binary-to-BCD converter to convert the obtained product to BCD form. The 4-bit multiplier is used to carry out simultaneous multiplication of each of the cross digits, as shown in the Urdhva Tiryakbhyam sutra. This parallel computation of each of the products results in faster execution of the multiplication. Once all the individual products have been computed, they are added as prescribed by the sutra to obtain the final product. To add the partial products, we use a BCD addition module. The product computed will thus consist of 32 bits, i.e., an 8-digit BCD number.
[Figure: array of partial-product bits for the multiplier]
BRAUN MULTIPLIER A binary multiplier is an electronic circuit used in digital electronics, such as a computer, to multiply two binary numbers. It is built using binary adders. A variety of computer arithmetic techniques can be used to implement a digital multiplier. Most techniques involve computing a set of partial products, and then summing the partial products together. This process is similar to the method taught to primary schoolchildren for conducting long multiplication on base-10 integers, but has been modified here for application to a base-2 (binary) numeral system.
An array multiplier is a digital combinational circuit that is used for the multiplication of two binary numbers by employing an array of full adders and half adders.
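A software sketch of the array-multiplier idea (illustrative only, not the project's HDL): in hardware, AND gates form one shifted partial-product row per multiplier bit, and the adder array sums the rows. Modeling each row addition as one `+=` mirrors one row of adders.

```python
def array_multiply(a, b, n_bits=4):
    """Multiply two unsigned n_bits numbers the way an adder array does."""
    rows = []
    for i in range(n_bits):
        b_bit = (b >> i) & 1
        # Row i is (a AND b_bit) shifted left by i -- one row per multiplier bit.
        rows.append((a if b_bit else 0) << i)
    product = 0
    for row in rows:
        product += row   # each += stands for one row of full/half adders
    return product

print(array_multiply(13, 11, n_bits=4))   # 143
```

Because all rows exist simultaneously, the hardware delay is set by how fast the adder array can reduce them, as noted in the array-multiplier discussion above.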
Mean Filter

One of the simplest linear filters is implemented by a local averaging operation where the value of each pixel is replaced by the average of all the values in the local neighborhood:

    h[i,j] = (1/M) * sum of f[k,l] over all [k,l] in the neighborhood N,

where M is the number of pixels in the neighborhood N.
Compare this with Equation 4.6. Now if g[i,j] = 1/9 for every [i,j] in the convolution mask, the convolution operation in Equation 4.6 reduces to the local averaging operation shown above. This result shows that a mean filter can be implemented as a convolution operation with equal weights in the convolution mask (see Figure 4.6). In fact, we will see later that many image processing operations can be implemented using convolution.
The size of the neighborhood N controls the amount of filtering. A larger neighborhood, corresponding to a larger convolution mask, will result in a greater degree of filtering. As a trade-off for greater amounts of noise reduction, larger filters also result in a loss of image detail. The results of mean filters of various sizes are shown in Figure 4.7.
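The following is a minimal Python sketch of the 3 x 3 mean filter as a convolution with equal weights of 1/9, matching the g[i,j] = 1/9 mask discussed above. Border handling is simplified here to leaving edge pixels unchanged, which is one assumption of this sketch rather than something prescribed by the text.

```python
def mean_filter_3x3(image):
    """Replace each interior pixel with the mean of its 3x3 neighborhood."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            total = sum(image[i + di][j + dj]
                        for di in (-1, 0, 1) for dj in (-1, 0, 1))
            out[i][j] = total / 9.0        # equal weights: each g[i,j] = 1/9
    return out

flat = [[90] * 5 for _ in range(5)]
flat[2][2] = 180                    # one noisy pixel in a flat region
smoothed = mean_filter_3x3(flat)
print(smoothed[2][2])               # 100.0 -- the spike is averaged down
```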
When designing linear smoothing filters, the filter weights should be chosen so that the filter has a single peak, called the main lobe, and symmetry in the vertical and horizontal directions. A typical pattern of weights for a 3 x 3 smoothing filter is

    1/16   1/8   1/16
    1/8    1/4   1/8
    1/16   1/8   1/16
Linear smoothing filters remove high-frequency components, and the sharp detail in the image is lost. For example, step changes will be blurred into gradual changes, and the ability to accurately localize a change will be sacrificed. A spatially varying filter can adjust the weights so that more smoothing is done in a relatively uniform area of the image, and little smoothing is done across sharp changes in the image. The results of a linear smoothing filter using the mask shown above are shown in Figure 4.8.
Median Filter

The main problem with local averaging operations is that they tend to blur sharp discontinuities in intensity values in an image. An alternative approach is to replace each pixel value with the median of the gray values in the local neighborhood. Filters using this technique are called median filters. Median filters are very effective in removing salt-and-pepper and impulse noise while retaining image details, because they do not depend on values which are significantly different from typical values in the neighborhood. Median filters work on successive image windows in a fashion similar to linear filters; however, the process is no longer a weighted sum. For example, take a 3 x 3 window and compute the median of the pixels in each window centered around [i,j]:

1. Sort the pixels into ascending order by gray level.
2. Select the value of the middle pixel as the new value for pixel [i,j].

This process is illustrated in Figure 4.9. In general, an odd-size neighborhood is used for calculating the median. However, if the number of pixels is even, the median is taken as the average of the middle two pixels after sorting. The results of various sizes of median filters are shown in Figure 4.10.
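The sort-and-pick-the-middle procedure for the median filter can be sketched directly in Python (borders are left unchanged here for simplicity; that choice is mine, not from the text):

```python
def median_filter_3x3(image):
    """Replace each interior pixel with the median of its 3x3 neighborhood."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            window = sorted(image[i + di][j + dj]
                            for di in (-1, 0, 1) for dj in (-1, 0, 1))
            out[i][j] = window[4]          # middle of 9 sorted values
    return out

noisy = [[20] * 5 for _ in range(5)]
noisy[2][2] = 255                   # salt noise: one bright impulse
print(median_filter_3x3(noisy)[2][2])   # 20 -- the impulse is removed entirely
```

Unlike the mean filter, the impulse vanishes without smearing into its neighbors, which is the detail-preserving behavior described above.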
Block diagram for Image Sharpening and Smoothing
The main objective of our project is to carry out the image sharpening and smoothing operations on a grayscale image. The following are the prerequisites to meet this purpose:

1. Image acquisition using MATLAB.
2. Storing the image pixel values into a text file.
3. Reading these pixels into block RAM on the FPGA.
4. Obtaining a sub-image (3*3 window) starting from the first pixel.
5. Performing the filtering operations on each sub-window.
6. Replacing the centre pixel of the sub-image window with the filtered value.
7. Restoring these values into block RAM.
8. Saving the block RAM values to a text file and displaying the image using these values.
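The flow above can be prototyped in software before moving to HDL. The following is a hedged Python sketch of that round trip: the file names and helper functions are illustrative placeholders, and reading/writing a text file here stands in for the MATLAB export and block-RAM storage described above.

```python
def load_pixels(path, rows, cols):
    """Read whitespace-separated pixel values (as written out from MATLAB)."""
    with open(path) as f:
        values = [int(v) for v in f.read().split()]
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]

def filter_image(image, window_filter):
    """Slide a 3x3 window and replace each interior centre pixel."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            window = [image[i + di][j + dj]
                      for di in (-1, 0, 1) for dj in (-1, 0, 1)]
            out[i][j] = window_filter(window)   # e.g. median or mean
    return out

def save_pixels(path, image):
    """Write the filtered values back out as a text file for display."""
    with open(path, "w") as f:
        for row in image:
            f.write(" ".join(str(round(v)) for v in row) + "\n")

# Example round trip with a median filter on a tiny 3x3 image:
median = lambda w: sorted(w)[len(w) // 2]
save_pixels("pixels_in.txt", [[10, 10, 10], [10, 99, 10], [10, 10, 10]])
img = load_pixels("pixels_in.txt", 3, 3)
save_pixels("pixels_out.txt", filter_image(img, median))
```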