
Copyright Mentor Graphics Corporation.

All rights reserved.

This document contains information that is proprietary to Mentor Graphics Corporation. The original recipient of this document may duplicate this document in whole or in part for internal business purposes only, provided that this entire notice appears in all copies. In duplicating any part of this document, the recipient agrees to make every reasonable effort to prevent the unauthorized use and distribution of the proprietary information.
Trademarks that appear in Mentor Graphics product publications that are not owned by Mentor Graphics are trademarks of their respective owners.

FIR Filter Walkthrough


The purpose of this walkthrough is to refresh the knowledge you gained in the Catapult C Synthesis training. A few advanced exercises are provided at the end to expand your knowledge beyond the training.

About the FIR Filter


A FIR (Finite Impulse Response) filter is a common filter used in DSP design. In this case, the algorithm is implemented with a bank of registers, called regs in the C code. Each time the function is called, the data in regs is shifted by one register and new data is shifted in. Each element in regs is multiplied by a different constant, held in coeffs, and the results are summed together. The variable regs is declared static, which means that the value stored in regs is maintained between calls to the FIR filter function.

Looking at the C code, there are a few things to notice:
1. If the MAC loop in the design is left rolled, then the hardware will consume an input every NUM_TAPS cycles, at best. You can see this by looking at where input is read.
2. Catapult C needs at least one FSM state to implement a rolled loop, and input is read only once, in the last iteration of the MAC loop.
3. An output will also be produced, at best, every NUM_TAPS cycles. The advantage of this architecture is that the hardware requires only one multiplier and one adder to perform all of the multiply and add operations. The disadvantage is that large muxes are needed to access regs and coeffs.
4. If the loop is unrolled and the entire design is pipelined, then the result is a standard FIR filter. An input is consumed and an output is generated every cycle.
5. The C code is parameterizable because it uses a #define for the number of taps. The design also contains a #pragma unroll y before the for loop. This defines yes as the default value for the UNROLL directive for that loop.
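The walkthrough refers to the C code without reproducing it. The following is a minimal sketch of what a function matching the description above could look like; it is not the actual fir_filter.cpp. The function signature, the typedefs data_t and acc_t, the plain integer types, and the tap count of 8 are all assumptions made for illustration. Only the names regs, coeffs, input, temp, NUM_TAPS, the MAC loop, and the unroll pragma come from the text.

```cpp
#include <cstdint>

#define NUM_TAPS 8                 // assumed tap count; the text only says it is a #define

typedef std::int8_t  data_t;       // assumed sample/coefficient type (hypothetical)
typedef std::int32_t acc_t;        // assumed accumulator type (hypothetical)

// Sketch of a FIR filter with a single shift-and-accumulate loop.
void fir_filter(const data_t *input, const data_t coeffs[NUM_TAPS], acc_t *output) {
    static data_t regs[NUM_TAPS];  // static: contents persist between calls, start at zero
    acc_t temp = 0;

    // The text places "#pragma unroll y" here and names this loop MAC.
    // Both are left as comments so an ordinary C++ compiler is unaffected.
    for (int i = NUM_TAPS - 1; i >= 0; i--) {
        if (i == 0)
            regs[i] = *input;      // input is read once, in the last iteration
        else
            regs[i] = regs[i - 1]; // shift the register bank by one position
        temp += coeffs[i] * regs[i];
    }
    *output = temp;
}
```

Feeding a single 1 followed by zeros should return the coefficients one per call, which is a quick way to check the shift direction in a C++ testbench before synthesis.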

Exercise 1 - Setting up Catapult C Synthesis


Do the following:
1. Invoke Catapult C.
2. Choose Tools > Set Options to open the Catapult C Synthesis Options window. This is how you modify the default settings for Catapult C system options.
3. In the Catapult C Synthesis Options window, choose your output language (under Output). Explore the other sections of this window to familiarize yourself with their options.
4. It is a good idea to have your input directory restored on startup and to have the tool automatically save your options between sessions. Select the General section to set these options. These options are very useful, especially on Windows systems. The settings will be stored either in the system registry on Windows operating systems or in your home directory on UNIX/Linux operating systems.
5. Click the OK button to apply your changes to the current session and close the options window.
6. If you want to preserve your modified settings for future Catapult C sessions, you must save the settings to disk. Choose the Tools > Save Options menu item and your options are saved in a .catapultc directory in your home directory on UNIX/Linux systems, or in the system registry on Windows systems. The best way to save your options is to enable the Save Settings on Exit option (under General options). This will ensure continuity of the default options across Catapult C sessions.

NOTE: The default option settings for Project Initialization are only read when a new project is started. If you modify these settings, you must start a new project in order for your changes to take effect.

Exercise 2 - Analysis

Do the following:
1. Set your Working Directory.
2. Add the fir_filter.cpp input file.
3. Click on Setup Design in the Design Bar and choose your target technology and clock speed. Any technology will do, but Virtex-E and 50 MHz is a good choice for this example.
4. Click on Generate RTL in the Design Bar. This will generate all of the output files.
5. Now examine the results. Look at the DataPath schematic, reports, etc., and see what they say.

In the RTL and the schematic you should be able to get a general feel for the structure that was built. You will find one multiplier that feeds into an accumulator. If you look back in the source code, you'll see the line of C source code where the multiply and accumulate (in this case a +=) comes from. You'll also see an FSM and some control logic in the schematic. If you click on the DataPath schematic, the FSM and control logic will be filtered out. It's important to understand the RTL that will result from the C code, at least at a high level. Examine the schematic and compare it to the C code until you understand how the tool generated that schematic.

Exercise 3 - Unrolling and Pipelining


Now we're going to unroll and pipeline the entire design.
1. Click on Setup Design in the Design Bar and select Interface.
2. Disable the Start and Done flags. Start and Done are not typically used if the whole design is pipelined.
3. Click OK.
4. Click on Architectural Constraints.
5. Click on the MAC loop in the design.
6. Select Unroll for this loop.
7. Select the fir_filter_main process and select Pipeline. The default is an initiation interval of one.
8. Click OK.
9. Click on Generate RTL.

This time you'll see 8 multipliers and a tree of adders that the multipliers feed into. You might want to refer back to the class for more information about pipelining (remember, this is loop pipelining, not just adding registers to the design). Study the generated architecture. How does this compare to the first solution? Mostly, the design is more parallel. For example, the loop has been unrolled, so now all of the multiplies can be scheduled in parallel. We've also used pipelining, resulting in a design that uses all of the hardware all of the time. You'll notice that this design doesn't have an FSM.
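To make the unrolled structure concrete, the sketch below writes out by hand roughly what a fully unrolled 8-tap loop amounts to: eight independent multiplies feeding a balanced adder tree. Catapult C performs this transformation itself from the UNROLL directive; you would not write the source this way. The function name fir_unrolled, the typedefs, and the integer types are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

typedef std::int8_t  data_t;   // assumed sample/coefficient type (hypothetical)
typedef std::int32_t acc_t;    // assumed accumulator type (hypothetical)

// Hand-unrolled 8-tap FIR: illustration only, not generated by the tool.
acc_t fir_unrolled(data_t input, const data_t c[8]) {
    static data_t r[8];        // register bank, persists between calls

    // Shift: with no loop, every register moves in the same step.
    r[7] = r[6]; r[6] = r[5]; r[5] = r[4]; r[4] = r[3];
    r[3] = r[2]; r[2] = r[1]; r[1] = r[0]; r[0] = input;

    // Eight products with no dependencies between them,
    // so each can map to its own multiplier.
    acc_t p0 = c[0] * r[0], p1 = c[1] * r[1], p2 = c[2] * r[2], p3 = c[3] * r[3];
    acc_t p4 = c[4] * r[4], p5 = c[5] * r[5], p6 = c[6] * r[6], p7 = c[7] * r[7];

    // Balanced adder tree: three levels instead of a chain of seven adds.
    acc_t s0 = p0 + p1, s1 = p2 + p3, s2 = p4 + p5, s3 = p6 + p7;
    acc_t t0 = s0 + s1, t1 = s2 + s3;
    return t0 + t1;
}
```

Because nothing in this form needs to be sequenced across loop iterations, there is no per-iteration control to generate, which is consistent with the FSM disappearing from the schematic.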

Advanced Exercises
This section covers some other things to try with this FIR design. These are more difficult exercises.
1. To get more experience with the Gantt chart and memories, try mapping the internal register to a single-port memory. Leave the inner loop rolled and then go look at it in the Gantt chart. How can you re-write the source code to get a faster design in this case?
2. If the loop is left rolled, Catapult C can't determine the optimal bitwidth for the temp variable. Variables that are used as accumulators, like temp, can't have their bitwidths accurately analyzed. What happens when the loop is unrolled? How are the adder bitwidths different? Update the bitwidth for temp to get the optimal result when the loop is left rolled.
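As a starting point for the second exercise, the helper below computes a worst-case accumulator width for a sum of products. It assumes signed two's-complement data and coefficients; the operand widths used in fir_filter.cpp are not stated in this walkthrough, so the 8-bit figures in the usage note are an assumption. The function names ceil_log2 and accumulator_bits are hypothetical.

```cpp
// Smallest k such that 2^k >= n (for n >= 1).
unsigned ceil_log2(unsigned n) {
    unsigned bits = 0;
    while ((1u << bits) < n)
        ++bits;
    return bits;
}

// Worst-case width of a sum of num_taps products of signed operands.
// A data_bits x coeff_bits signed product fits in data_bits + coeff_bits bits,
// and adding num_taps such values grows the result by ceil(log2(num_taps)) bits.
unsigned accumulator_bits(unsigned data_bits, unsigned coeff_bits, unsigned num_taps) {
    return data_bits + coeff_bits + ceil_log2(num_taps);
}
```

For example, with 8-bit data, 8-bit coefficients, and 8 taps this bound gives 19 bits, much narrower than a 32-bit int. Compare that figure with the adder widths you see once the loop is unrolled.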
