FFT Implementation on FPGA

Chapter 1
INTRODUCTION
The Discrete Fourier Transform (DFT) plays an important role in the analysis,
design and implementation of discrete-time signal-processing algorithms and
systems. It is used to convert samples in the time domain to the frequency
domain. The Fast Fourier Transform (FFT) is simply a fast (computationally
efficient) way to calculate the DFT. The wide usage of DFTs in digital signal
processing applications is the motivation for implementing FFTs. Almost every
branch of engineering and science uses Fourier methods. The words
"frequency," "period," "phase," and "spectrum" are important parts of an
engineer's vocabulary.
The Discrete Fourier Transform is used to produce a frequency analysis of
discrete non-periodic signals. The FFT is another method of achieving the
same result, but with less overhead involved in the calculations. Transforms
basically convert a function from one domain to another with no loss of
information. The Fourier transform converts a function from the time (or
spatial) domain to the frequency domain. The DFT is identical to samples of
the Fourier transform at equally spaced frequencies. Consequently,
computation of the N-point DFT corresponds to the computation of N samples of
the Fourier transform at the N equally spaced frequencies
$\omega_k = 2\pi k/N$. Considering the input x[n] to be complex, N complex
multiplications and (N-1) complex additions are required to compute each
value of the DFT, if computed directly from the formula given as

$$X(k) = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi nk/N}, \qquad k = 0, 1, \ldots, N-1$$

To compute all N values therefore requires a total of $N^2$ complex
multiplications and N(N-1) complex additions. Each complex multiplication
requires four real multiplications and two real additions, and each complex
addition requires two real additions. Therefore a total of $4N^2$ real
multiplications and N(4N-2) real additions are required. Besides these
multiplications and additions, there must be provision for storing the N
complex input values and the N output values. In contrast, the radix-2
Decimation-in-Time FFT algorithm reduces the number of complex
multiplications and additions to $(N/2)\log_2 N$ and $N\log_2 N$,
respectively, to compute the DFT of a given complex x[n]. A large number of
FFT algorithms have been developed over the years, notably the Radix-2,
Radix-4, Split-Radix, Fast Hartley Transform (FHT), Quick Fourier Transform
(QFT), and Decimation-in-Time-Frequency (DITF) algorithms. Among these,
Radix-2, Radix-4 and FHT are of prime concern in this project. In the RAD-2
algorithm the DFT computation is initially split into two summations, one
over the first N/2 data points and the other over the next N/2 data points,
whereas a four-way split is used in RAD-4. The FHT computes the Discrete
Hartley Transform (DHT), whose kernel is real, unlike the complex exponential
kernel of the DFT.
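The operation counts quoted above can be tabulated with a few lines of Python. This is only a sketch for checking the arithmetic; the function names `direct_dft_ops` and `radix2_fft_ops` are illustrative and not part of the project.

```python
def direct_dft_ops(n):
    """Operation counts for evaluating the N-point DFT sum directly."""
    complex_mults = n * n
    complex_adds = n * (n - 1)
    # One complex multiply = 4 real multiplies + 2 real adds;
    # one complex add = 2 real adds.
    real_mults = 4 * complex_mults
    real_adds = 2 * complex_mults + 2 * complex_adds
    return complex_mults, complex_adds, real_mults, real_adds


def radix2_fft_ops(n):
    """Complex multiply/add counts for a radix-2 FFT (n a power of two)."""
    stages = n.bit_length() - 1  # log2(n) when n is a power of two
    if n < 2 or (1 << stages) != n:
        raise ValueError("n must be a power of two and at least 2")
    return (n // 2) * stages, n * stages


for n in (8, 64, 1024):
    direct = direct_dft_ops(n)
    fft = radix2_fft_ops(n)
    print(f"N={n}: direct {direct[0]} mults / {direct[1]} adds, "
          f"radix-2 {fft[0]} mults / {fft[1]} adds")
```

For N = 1024 the direct sum needs 1,048,576 complex multiplications, while the radix-2 count is 5,120.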
The objectives of this project are:
1. Familiarisation with the various FFT algorithms and a comparative study of
them on the basis of effective speed and area involved.
2. Familiarisation with FPGAs and the various steps involved in the
implementation of a given algorithm.
Chapter 2
FAST FOURIER TRANSFORM
The first major breakthrough in the implementation of Fast Fourier Transform
(FFT) algorithms was the Cooley-Tukey algorithm, developed in the mid-1960s,
which reduced the complexity of a Discrete Fourier Transform from $O(N^2)$ to
$O(N \log N)$. At that time, this was a substantial saving for even the
simplest of applications. Since then, a large number of FFT algorithms have
been developed. The Cooley-Tukey algorithm became known as the Radix-2
algorithm and was shortly followed by the Radix-3, Radix-4, and Mixed Radix
algorithms. Further research led to the Fast Hartley Transform (FHT) and the
Split Radix (SRFFT) algorithms. Recently, two new algorithms have emerged:
the Quick Fourier Transform (QFT) and the Decimation-In-Time-Frequency
(DITF) algorithm. In this project we provide a comparison of three
contemporary FFT algorithms. The criteria used are the operation count,
memory usage and computation time. We chose the following algorithms for our
analysis: Radix-2 (RAD2), Radix-4 (RAD4) and FHT.
The relationship between a finite sequence {x(n)} in the time domain and its
representation {X(k)} in the frequency domain is given by the following
Discrete Fourier Transform pair.

$$X(k) = \mathrm{DFT}(x(n)) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad 0 \le k \le N-1$$

$$x(n) = \mathrm{IDFT}(X(k)) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, W_N^{-nk}, \qquad 0 \le n \le N-1$$

$$W_N^{nk} = e^{-j \frac{2\pi nk}{N}}$$
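As a reference point for the fast algorithms that follow, the transform pair can be evaluated directly from its definition. The sketch below assumes the usual sign convention $W_N = e^{-j2\pi/N}$; the helper names `twiddle`, `dft` and `idft` are illustrative only.

```python
import cmath


def twiddle(n_points, exponent):
    """Return W_N^exponent with W_N = exp(-j*2*pi/N)."""
    return cmath.exp(-2j * cmath.pi * exponent / n_points)


def dft(x):
    """Direct O(N^2) evaluation of X(k) = sum_n x(n) W_N^(nk)."""
    n_points = len(x)
    return [
        sum(x[n] * twiddle(n_points, n * k) for n in range(n_points))
        for k in range(n_points)
    ]


def idft(spectrum):
    """Direct evaluation of x(n) = (1/N) sum_k X(k) W_N^(-nk)."""
    n_points = len(spectrum)
    return [
        sum(spectrum[k] * twiddle(n_points, -n * k) for k in range(n_points))
        / n_points
        for n in range(n_points)
    ]
```

Applying `idft(dft(x))` returns the original samples up to floating-point rounding, which is a convenient check when a faster implementation is tested against this one.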
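When computing the DFT, we find that some coefficients are computed
repeatedly. In addition, $W_N^{nk}$ has the following properties.

Periodicity:
$$W_N^{nk} = W_N^{(n+N)k} = W_N^{n(k+N)}$$

Symmetry:
$$W_N^{-nk} = \left(W_N^{nk}\right)^{*} = W_N^{n(N-k)}$$

Reducibility:
$$W_N^{nk} = W_{mN}^{mnk}, \qquad W_N^{nk} = W_{N/m}^{nk/m}$$

Special values:
$$W_N^{0} = 1, \qquad W_N^{N/2} = -1, \qquad W_N^{(k+N/2)} = -W_N^{k}$$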
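The twiddle-factor identities that FFT algorithms rely on are easy to confirm numerically. The sketch below checks the standard periodicity, conjugate-symmetry, reducibility and half-period identities for one arbitrary choice of N, n, k and m; the values and the helper names `W` and `close` are assumptions made for illustration.

```python
import cmath


def W(n_points, exponent):
    """Twiddle factor W_N^exponent = exp(-j*2*pi*exponent/N)."""
    return cmath.exp(-2j * cmath.pi * exponent / n_points)


def close(a, b, tol=1e-12):
    """Compare two complex numbers up to rounding error."""
    return abs(a - b) < tol


N, n, k, m = 16, 3, 5, 2  # arbitrary test values, m divides N

# Periodicity in both indices.
periodic = (close(W(N, n * k), W(N, (n + N) * k))
            and close(W(N, n * k), W(N, n * (k + N))))

# Negating the exponent conjugates the factor.
symmetric = (close(W(N, -n * k), W(N, n * k).conjugate())
             and close(W(N, -n * k), W(N, n * (N - k))))

# Scaling N and the exponent together leaves the factor unchanged.
reducible = (close(W(N, n * k), W(m * N, m * n * k))
             and close(W(N, n * k), W(N // m, n * k / m)))

# Half a period flips the sign.
special = (close(W(N, 0), 1)
           and close(W(N, N // 2), -1)
           and close(W(N, k + N // 2), -W(N, k)))

print(periodic, symmetric, reducible, special)
```

The half-period identity is the one that lets a butterfly produce two outputs from a single complex multiplication.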
2.1 Review of FFT algorithms: The basic principle behind most Radix-based
FFT algorithms is to exploit the symmetry properties of the complex
exponential that is the cornerstone of the Discrete Fourier Transform (DFT).
These algorithms divide the problem into similar sub-problems (butterfly
computations) and achieve a reduction in computational complexity. All Radix
algorithms are similar in structure, differing only in the core computation
of the butterflies. The FHT differs from the other algorithms in that it uses
a real kernel, as opposed to the complex exponential kernel used by the Radix
algorithms.
2.2 Types of FFTs
2.2.1 Radix-2 Decimation in Frequency Algorithm: The RAD2 DIF
algorithm is obtained by using the divide-and-conquer approach to the DFT
problem. The DFT computation is initially split into two summations, one of
which involves the sum over the first N/2 data points and the other over the
next N/2 data points, resulting in

$$X(k) = \sum_{n=0}^{N/2-1} x(n)\, W_N^{nk} + \sum_{n=N/2}^{N-1} x(n)\, W_N^{nk}$$
The above equation can be simplified, using $W_N^{kN/2} = (-1)^k$, to

$$X(k) = \sum_{n=0}^{N/2-1} \left\{ x(n) + (-1)^k\, x\!\left(n + \frac{N}{2}\right) \right\} W_N^{nk}$$
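Considering the even- and odd-numbered frequency samples separately gives the
equations

$$X(2k) = \sum_{n=0}^{N/2-1} \left\{ x(n) + x\!\left(n + \frac{N}{2}\right) \right\} W_{N/2}^{nk}$$

$$X(2k+1) = \sum_{n=0}^{N/2-1} \left\{ \left[ x(n) - x\!\left(n + \frac{N}{2}\right) \right] W_N^{n} \right\} W_{N/2}^{nk}$$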
The same computational procedure can be repeated through decimation of the
N/2-point DFTs X(2k) and X(2k+1). The entire process involves
$v = \log_2 N$ stages, with each stage involving N/2 butterflies. Thus the
RAD2 algorithm involves $(N/2)\log_2 N$ complex multiplications and
$N\log_2 N$ complex additions. Observe that the output of the whole process
is out of order and requires a bit-reversal operation to place the frequency
samples in the correct order.
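The decimation-in-frequency procedure can be sketched in software before committing it to hardware. The following is a minimal in-place floating-point model, assuming $W_N = e^{-j2\pi/N}$ and a power-of-two length; it is not the fixed-point design of this project, and the names `bit_reverse` and `fft_radix2_dif` are illustrative.

```python
import cmath


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def fft_radix2_dif(x):
    """Radix-2 decimation-in-frequency FFT (length must be a power of two)."""
    n_points = len(x)
    stages = n_points.bit_length() - 1
    if n_points < 1 or (1 << stages) != n_points:
        raise ValueError("length must be a power of two")
    data = [complex(v) for v in x]
    span = n_points // 2  # distance between the two butterfly inputs
    while span >= 1:
        for start in range(0, n_points, 2 * span):
            for offset in range(span):
                # Twiddle factor W_(2*span)^offset for this butterfly.
                w = cmath.exp(-2j * cmath.pi * offset / (2 * span))
                a = data[start + offset]
                b = data[start + offset + span]
                data[start + offset] = a + b              # feeds X(2k)
                data[start + offset + span] = (a - b) * w  # feeds X(2k+1)
        span //= 2
    # The in-place stages leave the spectrum in bit-reversed order.
    return [data[bit_reverse(k, stages)] for k in range(n_points)]
```

Each pass of the `while` loop corresponds to one stage of N/2 butterflies, and the final list comprehension is the bit-reversal step.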
2.2.2 Radix-4 Algorithm: The RAD4 algorithm is very similar to the RAD2
algorithm in concept. Instead of dividing the DFT computation into halves as
in RAD2, a four-way split is used. The N-point input sequence is split into
four subsequences, x(4n), x(4n+1), x(4n+2) and x(4n+3), where
n = 0, 1, ..., N/4-1.

$$X(k) = \sum_{n=0}^{N/4-1} x(n)\, W_N^{nk} + \sum_{n=N/4}^{N/2-1} x(n)\, W_N^{nk} + \sum_{n=N/2}^{3N/4-1} x(n)\, W_N^{nk} + \sum_{n=3N/4}^{N-1} x(n)\, W_N^{nk}$$

Setting

$$F(l,q) = \sum_{m=0}^{N/4-1} x(l,m)\, W_{N/4}^{mq}$$

$$X(p,q) = X\!\left(\frac{N}{4}\,p + q\right)$$

and

$$x(l,m) = x(4m + l),$$

where l, p = 0, 1, 2, 3 and m, q = 0, 1, ..., N/4-1, the DFT can be written
as

$$X(p,q) = \sum_{l=0}^{3} \left[ W_N^{lq}\, F(l,q) \right] W_4^{lp}$$

The decimation process is similar to the RAD2 algorithm, and uses
$v = \log_4 N$ stages, where each stage has N/4 butterflies. The RAD4
butterfly involves 8 complex additions and 3 complex multiplications, or a
total of 34 floating point operations. Thus, the total number of floating
point operations involved in the RAD4 computation of an N-point DFT is
$4.25\,N\log_2 N$, which is 15% less than the corresponding value for the
RAD2 algorithm.
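The floating-point operation totals follow from the per-butterfly costs. The sketch below works through the arithmetic, assuming 6 real operations per complex multiplication (4 multiplies, 2 adds), 2 per complex addition, and a radix-2 butterfly of one complex multiplication and two complex additions; the function names are illustrative.

```python
FLOPS_PER_COMPLEX_MULT = 6  # 4 real multiplies + 2 real adds
FLOPS_PER_COMPLEX_ADD = 2   # 2 real adds


def radix2_flops(n):
    """Total real operations for a radix-2 FFT; n must be a power of two."""
    stages = n.bit_length() - 1                      # log2(n)
    butterfly = (1 * FLOPS_PER_COMPLEX_MULT
                 + 2 * FLOPS_PER_COMPLEX_ADD)        # 10 per butterfly
    return (n // 2) * stages * butterfly             # 5 * n * log2(n)


def radix4_flops(n):
    """Total real operations for a radix-4 FFT; n must be a power of four."""
    stages = (n.bit_length() - 1) // 2               # log4(n)
    butterfly = (3 * FLOPS_PER_COMPLEX_MULT
                 + 8 * FLOPS_PER_COMPLEX_ADD)        # 34 per butterfly
    return (n // 4) * stages * butterfly             # 4.25 * n * log2(n)


for n in (16, 256, 4096):
    r2, r4 = radix2_flops(n), radix4_flops(n)
    print(f"N={n}: radix-2 {r2}, radix-4 {r4}, ratio {r4 / r2:.2f}")
```

The ratio is 0.85 for every power-of-four length, which is the 15% saving quoted in the text.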
2.2.3 Fast Hartley Transform: The main difference between the DFT
computations previously discussed and the Discrete Hartley Transform (DHT)
is the core kernel. For the DHT, the kernel is real, unlike the complex
exponential kernel of the DFT. The DHT coefficient is expressed in terms of
the input data points as

$$X(k) = \sum_{n=0}^{N-1} x(n) \left\{ \cos\frac{2\pi nk}{N} + \sin\frac{2\pi nk}{N} \right\}$$
This results in the replacement of complex multiplications in a DFT by real
multiplications in a DHT. For complex data, each complex multiplication in
the summation requires four real multiplications and two real additions using
the DFT. For the DHT, this computation involves only two real multiplications
and one real addition. There exists an inexpensive mapping of coefficients
from the Hartley domain to the Fourier domain, which is required to convert
the output of a DHT to the traditional DFT coefficients. For real input data
and the kernel $W_N = e^{-j2\pi/N}$, the following equations relate the DFT
coefficients to the DHT coefficients for an N-point DFT computation, with the
index N-k taken modulo N.

$$\mathrm{Re}(\mathrm{DFT}(k)) = \frac{\mathrm{DHT}(k) + \mathrm{DHT}(N-k)}{2}$$

$$\mathrm{Im}(\mathrm{DFT}(k)) = \frac{\mathrm{DHT}(N-k) - \mathrm{DHT}(k)}{2}$$
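The mapping from Hartley to Fourier coefficients can be checked with a short direct computation. This sketch evaluates the DHT from its definition (not with a fast algorithm) for real input and assumes the DFT sign convention $W_N = e^{-j2\pi/N}$, which fixes the sign of the imaginary part; `dht` and `dft_from_dht` are illustrative names.

```python
import math


def dht(x):
    """Direct O(N^2) Discrete Hartley Transform of a real sequence."""
    n_points = len(x)
    result = []
    for k in range(n_points):
        total = 0.0
        for n in range(n_points):
            angle = 2 * math.pi * n * k / n_points
            total += x[n] * (math.cos(angle) + math.sin(angle))  # cas kernel
        result.append(total)
    return result


def dft_from_dht(hartley):
    """Recover DFT coefficients of a real sequence from its DHT."""
    n_points = len(hartley)
    spectrum = []
    for k in range(n_points):
        mirrored = hartley[(n_points - k) % n_points]  # DHT(N-k), index mod N
        real = (hartley[k] + mirrored) / 2
        # Sign follows the assumed DFT kernel exp(-j*2*pi*n*k/N).
        imag = (mirrored - hartley[k]) / 2
        spectrum.append(complex(real, imag))
    return spectrum
```

For x = [1, 2, 3, 4] the DHT is [10, -4, -2, 0], and the mapping gives the DFT values 10, -2+2j, -2 and -2-2j.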
Chapter 3
FPGA IMPLEMENTATION OF FFTs

3.1 Basic Concept of FPGA: Field Programmable Gate Arrays (FPGAs) are
one of the fastest growing segments of the semiconductor industry. They were
first introduced in 1985, and since then they have quickly gained widespread
acceptance as an excellent technology for implementing moderately large
digital circuits in low production volumes. FPGAs are programmable devices
that can be directly configured by the end user without the use of an
integrated circuit fabrication facility. They offer the designer the benefits
of custom hardware while avoiding the high development costs and
manufacturing time of a custom chip. Figure 3.1 shows a conceptual diagram of
a typical FPGA.
Field Programmable Gate Arrays are so called because, rather than having a
structure similar to a PAL or other programmable device, they are structured
very much like a gate array ASIC (Application Specific Integrated Circuit).
One of the earliest programmable devices was the programmable array logic
(PAL), which is one kind of Programmable Logic Device (PLD). PLDs are
programmable devices that can be configured for a wide variety of
applications. They enable faster implementation and emulation of circuit
designs on hardware. The flexibility provided by these devices through the
presence of reconfigurable elements has increased their popularity. There are
two major types of PLDs: Field Programmable Gate Arrays (FPGAs) and Complex
Programmable Logic Devices (CPLDs). Among the various possible FPGA
architectures, lookup-table (LUT) based architectures have been the most
popular. A LUT-based FPGA consists of an array of programmable logic blocks
(PLBs) together with programmable interconnections. The number of gates in an
FPGA can be as high as several million.
Fig.3.1 Conceptual block diagram of FPGA

3.2 Field Programmable Devices: FPDs, or Field Programmable Devices, is a
general term for all devices that can be programmed (and possibly
reprogrammed) after fabrication. Several standard approaches to
programmability are used in the industry in the form of PROMs, PLAs, PALs,
CPLDs, and FPGAs. These approaches vary significantly in their complexity
(and consequently their cost) and in the applications for which they are best
suited. Among the largest and fastest growing FPDs are Field-Programmable
Gate Arrays (FPGAs). Although there are many types of FPGAs, all
architectures include logic blocks, I/O blocks, and programmable routing,
which are arranged in a regular pattern. FPGAs provide narrow logic
resources; in other words, their logic blocks are generally small and
uncommitted. One advantage of an FPGA over other types of FPDs is that it
generally has a much higher logic capacity and offers a higher ratio of
flip-flops to logic. A higher ratio of flip-flops to logic is important
because flip-flops are often the limiting factor in designs. FPGAs are the
most common form of FPD offered by programmable logic vendors. One such
vendor, Xilinx, offers several different "families" of FPGAs that target
different design sizes, design speeds, and cost requirements. Some of the
more popular devices include the XC4000, the Spartan series, and the
Virtex-II series.
Connection blocks facilitate connectivity between logic block pins
and the routing channels. Each input pin can be programmed to connect to one
or more of the tracks in a channel using either a multiplexer or multiple
transistors (see Figure 3.2). Output pins, on the other hand, are connected to
tracks using tri-state buffers. The number of tracks that a pin can connect to is
called its connection block flexibility. Switch blocks reside at the intersections
of horizontal and vertical routing channels. They provide programmable
switches used to connect tracks from both the vertical and horizontal channels
incident to the switch. The number of outgoing tracks that each ingoing track
can connect to is called its switch block flexibility. An FPGA generally consists
of a two-dimensional array of logic blocks that can be connected by general
interconnection resources.
Fig 3.2 Internal Structure of Control Logic Block

The interconnect comprises segments of wire, where the segments may be of
various lengths. The interconnect resources include programmable switches
that serve to connect the logic blocks to one another or one wire segment to
another. Logic circuits are implemented in the FPGA by partitioning the logic
into individual logic blocks and then interconnecting the blocks as required
via the switches. The structure and content of a logic block are called its
architecture. There are different kinds of logic block architecture
available, and logic blocks can be built using look-up tables (Xilinx),
multiplexers (Actel) or even PALs (Altera). The structure and content of the
interconnect resources in an FPGA are called its routing architecture. The
routing architecture consists of wire segments and programmable switches.
There are many different ways to design the structure of a routing
architecture: some FPGAs offer many simple connections between blocks, and
others provide fewer, but more complex, routes.
Each manufacturer has a distinct name for its basic block, but the
fundamental unit is the LUT. Altera calls its block a Logic Element (LE),
while Xilinx FPGAs have configurable logic blocks (CLBs) organized in an
array.
3.3 Basic Building Blocks: Xilinx user-programmable gate arrays include two
major configurable elements: configurable logic blocks (CLBs) and
input/output blocks (IOBs).
CLBs provide the functional elements for constructing the user's logic.
IOBs provide the interface between the package pins and internal signal lines.
Programmable interconnect resources provide routing paths to connect the
inputs and outputs of these configurable elements to the appropriate networks.
Customized configuration is established by programming internal static memory
cells that determine the logic functions and internal connections implemented in
the FPGA. Configurable Logic Blocks implement most of the logic in an FPGA.
3.4 Steps involved in the implementation of VHDL code on FPGA:
Computer-aided design (CAD) is a very important aspect of FPGA technology.
It allows users to convert a circuit description represented in a hardware
description language (HDL) or as a schematic to a bitstream that can be
uploaded to an FPGA for programming.
Examples of CAD software for FPGAs are Xilinx Alliance and Foundation,
Altera Quartus II and Max+Plus II, and Actel Libero. Implementing a logic
design with an FPGA usually consists of the following steps (depicted in
Fig. 3.4, which follows):
1. Enter a description of the logic circuit using a hardware description
language (HDL) such as VHDL or Verilog. The design can also be drawn using a
schematic editor.
2. Use a logic synthesizer program to transform the HDL or schematic into a
netlist. The netlist is just a description of the various logic gates in the
design and how they are interconnected.
3. Use the implementation tools to map the logic gates and interconnections
into the FPGA. The FPGA consists of many configurable logic blocks, which can
be further decomposed into look-up tables that perform logic operations. The
mapping tool collects the netlist gates into groups that fit into the LUTs,
and then the place & route tool assigns the gate collections to specific CLBs
while opening or closing the switches in the routing matrices to connect the
gates together.
4. Once the implementation phase is complete, a program extracts the state of
the switches in the routing matrices and generates a bitstream in which the
ones and zeroes correspond to open or closed switches. The bitstream is
downloaded into a physical FPGA chip. The electronic switches in the FPGA
open or close in response to the binary bits in the bitstream. Upon
completion of the downloading, the FPGA will perform the operations specified
by the HDL code or schematic.
Fig. 3.4 shows the design flow of the FPGA.
Fig.3.4 Design Flow Of FPGA
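The idea of collecting netlist gates into a look-up table can be illustrated with a toy model. The sketch below is not how the Xilinx tools work internally; the netlist format, the gate set and the function names are all hypothetical, chosen only to show that a 4-input LUT stores the 16-entry truth table of whatever gate group is mapped onto it.

```python
from itertools import product

# Hypothetical netlist: (output net, gate type, input nets), listed so that
# every gate appears after the gates that drive its inputs.
NETLIST = [
    ("n1", "AND", ("a", "b")),
    ("n2", "XOR", ("c", "d")),
    ("y", "OR", ("n1", "n2")),
]

GATES = {
    "AND": lambda p, q: p & q,
    "OR": lambda p, q: p | q,
    "XOR": lambda p, q: p ^ q,
}


def evaluate(netlist, inputs):
    """Compute every net value for one assignment of the primary inputs."""
    nets = dict(inputs)
    for output, gate, gate_inputs in netlist:
        nets[output] = GATES[gate](*(nets[name] for name in gate_inputs))
    return nets


def lut_contents(netlist, input_names, output_name):
    """Truth table that a LUT would store for this group of gates."""
    table = []
    for bits in product((0, 1), repeat=len(input_names)):
        nets = evaluate(netlist, zip(input_names, bits))
        table.append(nets[output_name])
    return table


def lut_lookup(table, bits):
    """Read the LUT: the input bits form the address (first input is MSB)."""
    address = 0
    for bit in bits:
        address = (address << 1) | bit
    return table[address]


TABLE = lut_contents(NETLIST, ("a", "b", "c", "d"), "y")
print(TABLE)
```

The 16 stored bits play the role of the configuration data for that LUT; in a real device they would be one small part of the bitstream, alongside the routing switch settings.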
Chapter 4
IMPLEMENTATION TOOLS
4.1 XILINX ISE

Xilinx ISE (Integrated Software Environment) is a software tool produced
by Xilinx for synthesis and analysis of HDL designs. It enables the developer
to synthesize ("compile") designs, perform timing analysis, examine RTL
diagrams, simulate a design's reaction to different stimuli, and configure
the target device with the programmer. Xilinx ISE consists of a set of
programs to create (capture), simulate and implement digital designs in an
FPGA or CPLD target device. All the tools use a graphical user interface
(GUI) that allows all programs to be executed from toolbars, menus or icons.
Fig.4.1 Simulation in Xilinx ISE
4.2 SPARTAN-6 FPGA:

The Xilinx Spartan-6 FPGA family delivers an optimal balance of low risk,
low cost, low power, and performance for cost-sensitive applications. These
FPGAs use a proven low-power 45 nm process technology. The Spartan-6 series
also offers advanced power management technology, up to 150k logic cells,
integrated PCI Express blocks, advanced memory support, 250 MHz DSP slices,
and 3.2 Gbps low-power transceivers.
Fig 4.2 SPARTAN 6 Development Board
Chapter 5
CONCLUSION
In this project we set out to implement a fixed-point FFT on an FPGA in
collaboration with NeST Technologies. The purpose of this project is to
develop an understanding of the underlying principles of FFT implementation,
with the FPGA as an alternative to general-purpose or special-purpose
microprocessors. Implementing a fixed-point FFT algorithm on an FPGA appears
to be easy and simple compared with other techniques. It is also possible to
implement other applications or algorithms using this approach in fields such
as signal or image processing, communication systems, and electronic circuit
design.
