I. INTRODUCTION
Finite impulse response (FIR) digital filters are widely used
as a basic tool in many digital signal processing (DSP) and
communication applications. The complexity of an FIR filter
is largely dominated by the multiplication of input samples
with filter coefficients. Fortunately, the filter coefficients
are constants for a given filter, so the multiplications can be
implemented by a network of adders, subtractors, and hardwired
shifts, where the number of adders and subtractors is minimized
by a constant multiplication scheme. In a transposed direct-form
FIR filter, the most recent input sample at any given clock
period is multiplied with all the filter coefficients. A set of
intermediate results is generated in this case and shared across
all the multiplications in order to minimize the total number of
additions/subtractions using multiple constant multiplication
(MCM) techniques (see Fig. 1(a)). Each such intermediate result
in an MCM process corresponds to one of the common subexpressions
(CS) of the set of constants to be multiplied.
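As a minimal sketch of the shift-and-add idea described above, the following Python snippet multiplies one input sample by the two example coefficients labeled in Fig. 1, C0 = 10(101)0(1001) and C1 = (101)00(1001)0, sharing the common subexpressions 101 and 1001. The function name `mcm_products` and the particular ordering of the additions are illustrative assumptions, not the authors' implementation.

```python
def mcm_products(x: int) -> tuple[int, int]:
    """Multiply x by C0 = 0b1010101001 and C1 = 0b1010010010 using shifts and adds only."""
    # Common subexpressions, computed once and shared by both coefficients.
    t5 = (x << 2) + x        # 101b  * x
    t9 = (x << 3) + x        # 1001b * x
    # C0 = 10(101)0(1001): leading 1 at bit 9, 101 at bits 7..5, 1001 at bits 3..0.
    p0 = (x << 9) + (t5 << 5) + t9
    # C1 = (101)00(1001)0: 101 at bits 9..7, 1001 at bits 4..1.
    p1 = (t5 << 7) + (t9 << 1)
    return p0, p1
```

In this sketch five additions are used in total (two for the shared terms, two for C0, one for C1), against seven if each coefficient were expanded bit by bit (four for C0, three for C1).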
A great deal of research has been done to develop effective
algorithms to identify the optimal set of non-redundant
subexpressions to achieve the minimum number of logic operators
and the minimum logic depth of the MCM [1]-[7]. Irrespective
of differences in methodology and the level of optimality,
in all these works, after the common subexpression terms
are determined and the ADD/SUB network of non-redundant
subexpressions (or terms) is formed, the product value
corresponding to each of the coefficients is computed by an adder

Yu Pan and Pramod Kumar Meher are with the Institute for Infocomm
Research, 1 Fusionopolis Way, Singapore 138632, e-mail: {ypan,
pkmeher}@i2r.a-star.edu.sg.
Copyright (c) 2013 IEEE. Personal use of this material is permitted.
However, permission to use this material for any other purposes must be
obtained from the IEEE by sending an email to pubs-permissions@ieee.org.
[Figure 1 diagram labels: C0 = 10(101)0(1001), C1 = (101)00(1001)0, MCM Term Network, REG, Adder Tree0, Adder Tree1, Structural Add/Sub.]
Fig. 1. Composition of the MCM block. (a) MCM and common subexpressions. (b) Term network and adder-trees for each coefficient.
Fig. 2. (a) An example adder-tree with delays and signs on each input term.
(b) An internal schedule with minimum delay.
B. Cost model
[Figure diagram labels: panels (a) and (b). Legend: term node, tree node, structural node; an edge label x/[yb] gives value/[bit width], e.g. 15457/[14b].]
Fig. 5. [Diagram labels: edge positions E0 to E14 and nodes V0 to V6, arranged in Layer0 to Layer3.]
We group and define the set of all the binary variables $t_{i.j}$
associated with a particular edge position $E_j$ as
$ETSet_j = \{t_{i.j} \mid t_{i.j} \in TSet_i \text{ and } t_{i.j} \text{ is associated with } E_j\}$.
The constraint is that at most one input term is scheduled
on $E_j$. For each edge position $E_j$, we have
$$sumETSet_j = \sum_{t_{i.j} \in ETSet_j} t_{i.j} \le 1 \qquad (2)$$
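The edge-occupancy constraint (2) can be checked on a candidate assignment with a few lines of Python. This is only a sketch: the dictionary layout `t[(i, j)]` (term index i, edge index j) and the function names are hypothetical choices made for illustration.

```python
def edge_sums(t: dict[tuple[int, int], int]) -> dict[int, int]:
    """Return sumETSet_j for every edge j, given binary variables t[(i, j)]."""
    sums: dict[int, int] = {}
    for (_term, edge), value in t.items():
        sums[edge] = sums.get(edge, 0) + value
    return sums


def satisfies_constraint_2(t: dict[tuple[int, int], int]) -> bool:
    """True if at most one input term is scheduled on each edge position."""
    return all(total <= 1 for total in edge_sums(t).values())
```

For instance, `{(0, 0): 1, (1, 0): 0, (1, 1): 1}` places term 0 on edge 0 and term 1 on edge 1 and is accepted, whereas `{(0, 0): 1, (1, 0): 1}` puts two terms on edge 0 and is rejected.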
Fig. 7. [Diagram labels: example coefficient Coeff = 1010010; panels (a) and (b).]
So far, the above constraints suffice to assign the input terms
correctly to form adder-trees of various topologies. To model
the cost of the adder-tree, we further need to model the ADD/SUB
type of the operators, the bit widths of the edges, and the cost
of the operators.
snj = isnj
Fig. 6. Relationship of operator type and the output sign with the signs of
input operands.
isnj =
edges. Given that absolute shifts of all input terms are known
constants for a coefficient, absolute shifts of intermediate terms
on its possible adder-trees can be obtained by a top-down
propagation process.
Absolute shifts are modeled in the same way as signs.
On each edge $E_j$, we first define an integer variable
$iash_j$ for the initial absolute shift imposed by input terms.
Let $AbsSh(T_i)$ denote the constant absolute shift value of
input term $T_i$; we have
$$iash_j = \sum_{t_{i.j} \in ETSet_j} AbsSh(T_i) \cdot t_{i.j} \qquad (9)$$
For each $E_j \in$ 0th layer, we have
$$ash_j = iash_j \qquad (10)$$
For each $E_j \notin$ 0th layer, we have
$$ash_j = \min(ash_{jl}, ash_{jr}) + iash_j \qquad (11)$$
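A small Python sketch may help to see how the absolute shifts of (9)-(11) propagate through one fixed adder-tree. It rests on two assumptions that are not stated in the text: an edge that carries an input term is represented as an integer leaf holding $AbsSh(T_i)$, and an edge produced by an operator has no input term scheduled on it, so its $iash_j$ is 0 and (11) reduces to the minimum over its two operand edges. The nested-tuple tree encoding and the name `abs_shift` are hypothetical.

```python
from typing import Union

# A tree is either an int (absolute shift of an input term) or a pair of subtrees.
Tree = Union[int, tuple]


def abs_shift(node: Tree) -> int:
    """Absolute shift ash_j of the edge at the root of `node`."""
    if isinstance(node, int):
        # Edge carrying an input term: ash_j = iash_j = AbsSh(T_i).
        return node
    left, right = node
    # Operator output edge with iash_j = 0: ash_j = min(ash_jl, ash_jr).
    return min(abs_shift(left), abs_shift(right))
```

With input terms at absolute shifts 6, 4 and 1 (the set bits of the example coefficient 1010010), `abs_shift(((6, 4), 1))` evaluates the inner operator to 4 and the output edge to 1.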
(12)
The max function with 0 ensures that the result value for a
NULL operator is 0. For the last-layer edge $E_j$, i.e., the output
edge of the adder-tree, we simply have
$$sh_j = ash_j \qquad (13)$$
$s \le b$; $s + M z \ge a$; $s + M (1 - z) \ge b$
$s \ge b$; $s \le a + M z$; $s \le b + M (1 - z)$
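The two groups of inequalities above have the shape of the standard big-M linearization of $s = \min(a, b)$ and $s = \max(a, b)$ with one binary variable $z$. The brute-force check below is a sketch under that reading; the bounds $s \le a$ and $s \ge a$ are assumed from the standard formulation because only the bounds on $b$ are legible in the text, and the function names and the search range are illustrative.

```python
def min_candidates(a: int, b: int, M: int, search: range) -> list[int]:
    """Values s admitted by the big-M constraints for s = min(a, b)."""
    return [
        s for s in search
        if any(
            s <= a and s <= b                      # s is no larger than either operand
            and s + M * z >= a and s + M * (1 - z) >= b  # s equals a (z=0) or b (z=1)
            for z in (0, 1)
        )
    ]


def max_candidates(a: int, b: int, M: int, search: range) -> list[int]:
    """Values s admitted by the big-M constraints for s = max(a, b)."""
    return [
        s for s in search
        if any(
            s >= a and s >= b                      # s is no smaller than either operand
            and s <= a + M * z and s <= b + M * (1 - z)  # s equals a (z=0) or b (z=1)
            for z in (0, 1)
        )
    ]
```

Provided $M$ is at least $|a - b|$, each list contains exactly one value, namely the minimum or the maximum of the two operands.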
For each $E_j \notin$ 0th layer, we have
vj = ivj
if knj is 1
if kaj is 1
if ksj is 1
(15)
(16)
if z is 1
= unrestricted
otherwise (z is 0)
(17)
a+M z 0
pk = 1
=0
otherwise
if spsr is 1
if snsr is 1 (20)
structural ops
$$(1 - z_k) + z_{k+1} + \ldots + z_{S-1} - M (1 - p_k) \le 0$$
$$(1 - z_k) + z_{k+1} + \ldots + z_{S-1} + p_k > 0$$
where $M$ is a positive constant greater than the number of $z$
variables involved in the inequalities. Finally, $s = bits(a)$ is
expressed as
$$s = \sum_{k=0}^{S-1} (k + 1) \cdot p_k$$
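The pair of inequalities can be checked by enumeration. The sketch below assumes that $z_0, \ldots, z_{S-1}$ are the binary digits of $a$ (least significant first), which is an interpretation rather than something the surviving text states, and it uses $M = S + 1$. For every $a$ it enumerates all binary vectors $p$ and keeps those satisfying both inequalities for every $k$; the function names are hypothetical.

```python
from itertools import product


def leading_one_solutions(a: int, S: int) -> list[tuple[int, ...]]:
    """All binary vectors p that satisfy both inequalities for every k."""
    M = S + 1  # any constant larger than the number of z variables
    z = [(a >> k) & 1 for k in range(S)]  # z[k] is bit k of a (assumed encoding)
    solutions = []
    for p in product((0, 1), repeat=S):
        feasible = True
        for k in range(S):
            e = (1 - z[k]) + sum(z[k + 1:])
            # First inequality: e - M(1 - p_k) <= 0; second: e + p_k > 0.
            if e - M * (1 - p[k]) > 0 or e + p[k] <= 0:
                feasible = False
                break
        if feasible:
            solutions.append(p)
    return solutions


def bits_from_indicators(p: tuple[int, ...]) -> int:
    """s = sum over k of (k + 1) * p_k."""
    return sum((k + 1) * pk for k, pk in enumerate(p))
```

Under these assumptions the constraints admit exactly one vector $p$ for each $a$, with $p_k = 1$ only at the most significant set bit, so the weighted sum returns the bit width of $a$ (and 0 for $a = 0$).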
4) Cost of adder-tree operators: Let $F$ and $I$ be the
constant costs of a 1-bit full adder and an inverter,
respectively. The cost $c_j$ of each operator that produces
edge $E_j$ is described as follows, according to the cost model
described in Section II-B:
cj = 0
if knj is 1
if kaj is 1
if ksj is 1
(18)
if shjl > 0
=0
otherwise
(19)
shjl sjl 0
The logic depth of the entire FIR filter is the largest of
the logic depths of its coefficients.
B. Resource Minimization Algorithm of FIR filter
TABLE I
AREA AND POWER CONSUMPTION IMPROVEMENT ON MCM BLOCK.

Filter   Logic    Cell Area (sq. um)
Order    Depth    Conv.     MIP.      Improv.
16       3        15764     13393     15.04%
24       3        23065     21003     8.94%
32       3        28897     26909     6.88%
40       3        36400     33281     8.57%
48       4        41271     38399     6.96%
56       4        49642     46837     5.65%
64       4        57026     52925     7.19%
Avg                                   8.46%
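The improvement column of Table I can be recomputed from the two area columns. The sketch below assumes that the improvement is the relative area reduction (Conv. - MIP.) / Conv. and that the Avg entry is the mean of the per-order percentages; both assumptions reproduce the listed values to two decimals. Variable and function names are illustrative.

```python
# Cell areas in sq. um for filter orders 16, 24, 32, 40, 48, 56, 64 (Table I).
conv_area = [15764, 23065, 28897, 36400, 41271, 49642, 57026]
mip_area = [13393, 21003, 26909, 33281, 38399, 46837, 52925]


def improvement_percent(conv: int, mip: int) -> float:
    """Relative area reduction of the MIP result over the conventional one, in percent."""
    return 100.0 * (conv - mip) / conv


improvements = [improvement_percent(c, m) for c, m in zip(conv_area, mip_area)]
average_improvement = sum(improvements) / len(improvements)
```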