
REAL-TIME DIGITAL SYSTEMS

Improve your root-mean calculations
By Brian Neunaber
Digital Systems Architect of Software and Firmware
QSC Audio Products

Real-time digital systems often require the calculation of a root-mean, such as a root-mean square (RMS) level or average magnitude of a complex signal. While averaging can be efficiently implemented by most microprocessors, the square root may not be--especially with low-cost hardware. If the processor doesn't implement a fast square root function, it must be implemented in software; although this yields accurate results, it may not be efficient.

One common method for computing the square root is Newton's method, which iteratively converges on a solution using an initial estimate. Since we're computing the square root of a slowly varying average value, the previous root-mean value makes a good estimate. Furthermore, we can combine the iterative Newton's method with a first-order recursive averager, resulting in a super-efficient method for computing the root-mean of a signal.

In this article, I'll develop and present three efficient recursive algorithms for computing the root-mean, illustrating each method with signal flow diagrams and example code. To some degree, each of these methods trades hardware complexity for error. I'll compare the computational performance and error of each method and suggest suitable hardware for each implementation.

Root-Mean
The root-mean is computed as the square root of the average over time of its input. This average may be recursive or non-recursive, and I'll briefly review the general case for both.

Non-recursive average
The non-recursive average, or moving average, is the weighted sum of N inputs: the current input and N-1 previous inputs. In digital filtering terminology, this is called a finite impulse response, or FIR filter (Equation 1):

    y(n) = a0*x(n) + a1*x(n-1) + ... + a(N-1)*x(n-N+1)

The most common use of the moving average typically sets the weights such that an = 1/N. If we were to plot these weights versus time, we would see the "window" of the input signal that is averaged at a given point in time. This 1/N window is called a rectangular window because its shape is an N-by-1/N rectangle.

There is a trick for computing the 1/N average so that all N samples need not be weighted and summed with each output calculation. Since the weights don't change, you can simply add the newest weighted input and subtract the Nth weighted input from the previous sum (Equation 2):

    y(n) = y(n-1) + (1/N)*[x(n) - x(n-N)]

While this technique is computationally efficient, it requires storage and circular-buffer management of N samples.

Of course, many other window shapes are commonly used. Typically, these window shapes resemble, or are a variation of, a raised cosine between -π/2 and π/2. These windows weight the samples in the centre more than the samples near the edges. Generally speaking, you should only use one of these windows when there is a specific need to, such as applying a specific filter to the signal. The disadvantage of these windows is that computational complexity and storage requirements increase with N.

Recursive average
The recursive average is the weighted sum of the input, N previous inputs, and M previous outputs (Equation 3):

    y(n) = a0*x(n) + a1*x(n-1) + ... + aN*x(n-N) + b1*y(n-1) + ... + bM*y(n-M)

The simplest of these in terms of computational complexity and storage (while still being useful) is the first-order recursive average. In this case, the average is computed as the weighted sum of the current input and the previous output. The first-order recursive average also lends itself to an optimisation when combined with the computation of the square root, which we'll discuss shortly.

In contrast to the non-recursive average, the recursive average's window is a decaying exponential (Figure 1). Technically, the recursive average has an infinite window, since it never decays all the way to zero. In digital filtering terminology, this is known as an infinite impulse response, or IIR filter.

From Figure 1, we see that more recent samples are weighted more heavily than older samples, allowing us to somewhat arbitrarily define an averaging time for the recursive average. For the first-order case, we define the averaging time as the time at which the impulse response has decayed to a factor of 1/e, or approximately 37%, of its initial value. An equivalent definition is the time at which the step response reaches 1-(1/e), or approximately 63%, of its final value. Other definitions are possible but will not be covered here. The weighting of the sum determines this averaging time; to ensure unity gain, the sum of the weights must equal one. As a consequence, only one coefficient needs to be specified to describe the averaging time.
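The running-sum bookkeeping of Equation 2 can be sketched in a few lines of C++. This helper class is ours, not the article's; it simply shows that, with a circular buffer of the N stored samples, each update costs one add and one subtract rather than N multiply-accumulates.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of the Equation 2 trick: an N-point moving average
// with all weights equal to 1/N (a rectangular window).
class MovingAverage
{
public:
    explicit MovingAverage(std::size_t N)
        : buffer(N, 0.0), sum(0.0), index(0) {}

    double Update(double x)
    {
        sum += x - buffer[index];       // add newest, subtract oldest sample
        buffer[index] = x;              // circular-buffer management of N samples
        index = (index + 1) % buffer.size();
        return sum / buffer.size();     // divide by N (the 1/N weighting)
    }

private:
    std::vector<double> buffer;  // the N stored samples
    double sum;                  // running sum of the current window
    std::size_t index;           // circular write position
};
```

Once the buffer has filled, the output equals the exact mean of the last N inputs, at constant cost per sample.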

 eetindia.com | February 2006 | EE Times-India


Therefore, for first-order recursive averaging, we compute the mean level as (Equation 4):

    m(n) = m(n-1) + a*[x(n) - m(n-1)]

where x(n) is the input, m(n) is the mean value, and a is the averaging coefficient. The averaging coefficient is defined as (Equation 5):

    a = 1 - e^(-1/(t*fS))

where t is the averaging time, and fS is the sampling frequency. The root-mean may then be calculated by taking the square root of Equation 4 (Equation 6, where y(n) is the root-mean):

    y(n) = sqrt(m(n))

Efficient computation methods
Googling "fast square root" will get you a plethora of information and code snippets on implementing fast square-root algorithms. While these methods may work just fine, they don't take into account the application in which the square root is required. Oftentimes, you may not need exact precision to the last bit, or the algorithm itself can be manipulated to optimise the computation of the square root. I present a few basic approaches here.

Only calculate it when you need it
Probably the simplest optimisation is to calculate the square root only when you absolutely need it. Although this may seem obvious, it can be easily overlooked when computing the root-mean on every input sample. When you don't need an output value for every input sample, it makes more sense to compute the square root only when you read the output value. One example of an application where this technique can be used is RMS metering of a signal. A meter value that is displayed visually may only require an update every 50 to 100ms, which may be far less often than the input signal is sampled. Keep in mind, however, that the recursive averaging should still be computed at the Nyquist rate.

Logarithms
Recall that (Equation 7):

    log(sqrt(m)) = (1/2)*log(m)

If you'll be computing the logarithm of a square root, it's far less computationally expensive to simply halve the logarithm instead. A common example of this optimisation is the calculation of an RMS level in dB, in which case Equation 6 may be simplified as follows (Equation 8):

    ydB(n) = 20*log10( sqrt(m(n)) ) = 10*log10( m(n) )

Newton's Method
Newton's Method (also called the Newton-Raphson Method) is a well-known iterative method for estimating the root of an equation.1 Newton's Method can be quite efficient when you have a reasonable estimate of the result. Furthermore, if accuracy to the last bit is not required, the number of iterations can be fixed to keep the algorithm deterministic. We may approximate the root of f(x) by iteratively calculating (Equation 9):

    y(n) = y(n-1) - f(y(n-1)) / f'(y(n-1))

If we wish to find sqrt(m), then we need to find the root of the equation f(y) = y^2 - m. Substituting f(y) into Equation 9, we get (Equation 10):

    y(n) = y(n-1) - [y(n-1)^2 - m] / [2*y(n-1)]

Rearranging Equation 10, we get (Equation 11):

    y(n) = (1/2)*[ y(n-1) + m(n)/y(n-1) ]

where y(n) is the approximation of the square root of m(n). Equation 11 requires a divide operation, which may be inconvenient for some processors. As an alternative, we can calculate the reciprocal square root 1/sqrt(m) and multiply the result by m to get sqrt(m). Again using Newton's Method, we find that we may iteratively calculate the reciprocal square root as (Equation 12):

    yr(n) = yr(n-1) * (1/2)*[ 3 - yr(n-1)^2 * m ]

and calculate the square root as (Equation 13):

    y(n) = yr(n) * m(n)

Although Newton's Method for the reciprocal square root eliminates the divide operation, it can be problematic for fixed-point processors. Assuming that m(n) is a positive integer greater than 1, yr(n) will be a positive number less than one--beyond the range of representation for integer numbers. Implementation must be accomplished using floating-point or mixed integer/fractional number representation.

Root-mean using Newton's Method
A subtle difference between Equations 10 and 11 is that m becomes m(n), meaning that we're attempting to find the square root of a moving target. However, since m(n) is a mean value, or slowly varying, it can be viewed as nearly constant between iterations. Since y(n) will also be slowly varying, y(n-1) will be a good approximation to y(n) and require fewer iterations--one, we hope--to achieve a good estimate.

To calculate the root-mean, one may simply apply Newton's Method for calculating the square root to the mean value. As long as the averaging time is long compared to the sample period (t >> 1/fS), one iteration of the square-root calculation should suffice for reasonable accuracy. This seems simple enough, but we can actually improve the computational efficiency, which will be discussed in one of the following sections.

Using reciprocal square root
Unlike the iterative square-root method, however, the iterative reciprocal square-root requires no divide. This implementation is best suited for floating-point processing, which can efficiently handle numbers both greater and less than one. We present this implementation as a signal flow diagram in Figure 2. The averaging coefficient, a, is defined by Equation 5, and z^-1 represents a unit sample delay.
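The two Newton recursions above are small enough to sketch directly. The function names below are ours, not the article's; each performs one iteration of the corresponding equation, and both converge rapidly when started from a reasonable estimate, which is exactly the property the root-mean algorithms exploit.

```cpp
#include <cmath>

// One Newton step toward sqrt(m), per Equations 10-11:
// y(n) = 0.5 * ( y(n-1) + m / y(n-1) ).  Requires a divide.
double NewtonSqrtStep(double m, double yPrev)
{
    return 0.5 * (yPrev + m / yPrev);
}

// One Newton step toward 1/sqrt(m), per Equation 12:
// yr(n) = yr(n-1) * 0.5 * ( 3 - yr(n-1)^2 * m ).  No divide needed;
// sqrt(m) is then recovered as yr * m (Equation 13).
double RecipSqrtStep(double m, double yrPrev)
{
    return yrPrev * 0.5 * (3.0 - yrPrev * yrPrev * m);
}
```

Starting near the true value, a single step of either form is already accurate, which is why one iteration per sample suffices once these are wrapped around a slowly varying mean.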

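The "only calculate it when you need it" optimisation can also be sketched concretely. This meter class is our own illustration (not from the article): the recursive mean-square average runs at the full sample rate, but the square root is deferred to the readout, which for a visual meter might happen only every 50 to 100ms.

```cpp
#include <cmath>

// Hypothetical RMS meter: cheap per-sample averaging, sqrt only on read.
class LazyRmsMeter
{
public:
    LazyRmsMeter(double fs, double avgTime)
        : mean(0.0),
          coeff(1.0 - std::exp(-1.0 / (fs * avgTime))) {}  // Equation 5

    // Called once per input sample (at the Nyquist rate): no square root.
    void Process(double x) { mean += coeff * (x * x - mean); }

    // Called only when the display updates: one sqrt per readout.
    double ReadRms() const { return std::sqrt(mean); }

private:
    double mean;   // running mean square, m(n) of Equation 4
    double coeff;  // averaging coefficient a
};
```

At 48kHz with a 100ms display rate, this replaces 48,000 square roots per second with 10.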


A code listing for a C++ class that implements the computation in Figure 2 is presented in Listing 1. In this example class, initialisation is performed in the class constructor, and each call to CalcRootMean() performs one iteration of averaging and square-root computation.

Listing 1. C++ class that computes the root-mean using Newton's Method for the reciprocal square root

#include <cmath>

static const double Fs = 48000.0;    // sample rate
static double AvgTime = 0.1;         // averaging time

class RecipRootMean
{
public:
    double Mean;
    double RecipRoot;    // reciprocal square-root estimate
    double AvgCoeff;

    RecipRootMean()
    {
        AvgCoeff = 1.0 - exp( -1.0 / (Fs * AvgTime) );
        Mean = 0.0;
        RecipRoot = 1.0e-10;    // 1 > initial value > 0
    }
    ~RecipRootMean() {}

    double CalcRootMean(double x)
    {
        Mean += AvgCoeff * (x - Mean);
        RecipRoot *= 0.5 * ( 3.0 - (RecipRoot * RecipRoot * Mean) );
        return RecipRoot * Mean;
    }
};

Using direct square root
Let's go back and take a closer look at Equation 11. Newton's method converges on the solution as quickly as possible without oscillating around it, but if we slow this rate of convergence, the iterative equation will converge on the square root of the average of its inputs. Adding the averaging coefficient results in the following root-mean equation (Equation 14):

    y(n) = y(n-1) + (a/2)*[ x(n)/y(n-1) - y(n-1) ]

where a is defined by Equation 5. Now y(n) converges to the square root of the average of x(n). An equivalent signal-flow representation of Equation 14 is presented in Figure 3. Here, an additional y(n-1) term is summed so that only one averaging coefficient is required. Note that x(n) and y(n-1) must be greater than zero.

A code listing for a C++ class that implements the computation shown in Figure 3 is presented in Listing 2. As in the previous example, initialisation is performed in the class constructor, and each call to CalcRootMean() performs one iteration of averaging/square-root computation.

Listing 2. C++ class that implements the floating-point version of Figure 3

#include <cmath>

static const double Fs = 48000.0;    // sample rate
static double AvgTime = 0.1;         // averaging time

class NewtonRootMean
{
public:
    double RootMean;
    double AvgCoeff;

    NewtonRootMean()
    {
        RootMean = 1.0;    // > 0 or divide will fail
        AvgCoeff = 0.5 * ( 1.0 - exp( -1.0 / (Fs * AvgTime) ) );
    }
    ~NewtonRootMean() {}

    double CalcRootMean(double x)
    {
        RootMean += AvgCoeff * ( ( x / RootMean ) - RootMean );
        return RootMean;
    }
};

With some care, Figure 3 may also be implemented in fixed-point arithmetic as shown in Listing 3. In this example, scaling is implemented to ensure valid results. When sufficient word size is present, x is scaled by nAvgCoeff prior to division to maximise the precision of the result.

Listing 3. C++ class that implements the fixed-point version of Figure 3

#include <cmath>

static const double Fs = 48000.0;    // sample rate
static double AvgTime = 0.1;         // averaging time
static const unsigned int sknNumIntBits = 32;    // # bits in int
static const unsigned int sknPrecisionBits = sknNumIntBits / 2;
static const double skScaleFactor = pow(2.0, (double)sknPrecisionBits);
static const unsigned int sknRoundOffset =
    (unsigned int)floor( 0.5 * skScaleFactor );

class IntNewtonRootMean
{
public:
    unsigned int nRootMean;
    unsigned int nScaledRootMean;
    unsigned int nAvgCoeff;
    unsigned int nMaxVal;

    IntNewtonRootMean()
    {
        nRootMean = 1;    // > 0 or divide will fail
        nScaledRootMean = 0;
        double AvgCoeff = 0.5 * ( 1.0 - exp( -1.0 / (Fs * AvgTime) ) );
        nAvgCoeff = (unsigned int)floor( ( skScaleFactor * AvgCoeff ) + 0.5 );
        nMaxVal = (unsigned int)floor( ( skScaleFactor / AvgCoeff ) + 0.5 );
    }
    ~IntNewtonRootMean() {}

    unsigned int CalcRootMean(unsigned int x)
    {
        if ( x < nMaxVal )
        {
            nScaledRootMean += ( ( nAvgCoeff * x ) / nRootMean )
                             - ( nAvgCoeff * nRootMean );
        }
        else
        {
            nScaledRootMean += nAvgCoeff * ( ( x / nRootMean ) - nRootMean );
        }
        nRootMean = ( nScaledRootMean + sknRoundOffset ) >> sknPrecisionBits;
        return nRootMean;
    }
};
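As a quick sanity check on the Listing 2 recursion, the update can be driven with a constant input and watched as it settles. The condensed step function and the driver below are our own sketch, not part of the article; they confirm that the slowed Newton iteration of Equation 14 converges on the square root of the (constant) input.

```cpp
#include <cmath>

// One update of Equation 14, with the factor of 1/2 folded into the
// averaging coefficient exactly as Listing 2 does.
double NewtonRootMeanStep(double x, double y, double a)
{
    return y + a * ((x / y) - y);
}

// Drive the recursion with a constant input x for a given number of
// samples and return the settled output, which approaches sqrt(x).
double SettledRootMean(double x, int samples, double fs, double avgTime)
{
    const double a = 0.5 * (1.0 - std::exp(-1.0 / (fs * avgTime)));
    double y = 1.0;    // must start > 0 or the divide fails
    for (int n = 0; n < samples; ++n)
        y = NewtonRootMeanStep(x, y, a);
    return y;
}
```

With x = 9 held for one second at 48kHz and a 10ms averaging time, the settled output is 3.0 to within rounding; the averaging time only sets how quickly it gets there.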



Divide-free RMS using normalisation
Now we'll look at the special case of computing an RMS value on fixed-point hardware that does not have a fast divide operation, which is typical for low-cost embedded processors. Although many of these processors can perform division, they do so one bit at a time, requiring at least one cycle for each bit of word length. Furthermore, care must be taken to ensure that the RMS calculation is implemented with sufficient numerical precision. With fixed-point hardware, the square of a value requires twice the number of bits to retain the original data's precision.

With this in mind, we manipulate Equation 14 into the following (Equation 15):

    y(n) = y(n-1) + (a/2)*[ x(n)^2 - y(n-1)^2 ] / y(n-1)

Although the expression x(n)^2 - y(n-1)^2 must be calculated with double precision, this implementation lends itself to a significant optimisation. Note that a/2y(n-1) acts like a level-dependent averaging coefficient. If a slight time-dependent variance in the averaging time can be tolerated--which is often the case--1/y(n-1) can be grossly approximated. On a floating-point processor, shifting the averaging coefficient to the left by the negative of the exponent approximates the divide operation. This process is commonly referred to as normalisation. Some fixed-point DSPs can perform normalisation by counting the leading bits of the accumulator and shifting the accumulator by that number of bits.2 In both cases, the averaging coefficient will be truncated to the nearest power of two, so the coefficient must be multiplied by 3/2 to round the result. This implementation is shown in Equation 16.

Figure 4 is the signal-flow diagram that represents Equation 16. Just as in Figure 3, x(n) and y(n-1) must be greater than zero. A sample code listing that implements Figure 4 is shown in Listing 4. This assembly-language implementation is for the Freescale (formerly Motorola) DSP563xx 24-bit fixed-point processor.

Listing 4. Freescale DSP563xx assembly implementation of divide-free RMS using normalisation

; r4:   address of output bits 24-47 [y_msw(n)]
; r4+1: address of output bits 0-23  [y_lsw(n)]
; x0:   input [x(n)]

FS        equ 48000.0                  ;sampling rate in Hz
AVG_TIME  equ 0.1                      ;averaging time in seconds
AVG_COEFF equ @XPN(-1.0/(FS*AVG_TIME)) ;calculate avg_coeff

RMS
    move  #>AVG_COEFF,x1     ;load avg_coeff
    move  y:(r4)+,a          ;get y_msw(n-1)
    move  y:(r4),a0          ;get y_lsw(n-1)
    clb   a,b                ;b=number of leading bits in y(n-1)
    mpy   x0,x0,a  a,x0      ;a=x(n)^2, x0=y_msw(n-1)
    mac   -x0,x0,a x0,y1     ;a=x(n)^2-y_msw(n-1)^2, y1=y_msw(n-1)
    normf b1,a               ;normalise x(n)^2-y_msw(n-1)^2 by y_msw(n-1)
    move  a,x0               ;x0=[x(n)^2-y_msw(n-1)^2]norm(y_msw(n-1))
    mpy   x1,x0,a  y:(r4),y0 ;a=AVG_COEFF*[x(n)^2-y_msw(n-1)^2]norm(y_msw(n-1)), y0=y_lsw(n-1)
    add   y,a                ;a=y(n-1)+avg_coeff*[x(n)^2-y_msw(n-1)^2]norm(y_msw(n-1))
    move  a0,y:(r4)-         ;save y_lsw(n)
    move  a,y:(r4)           ;save y_msw(n)
    rts

Of course, this method can be implemented even without fast normalisation. You can implement a loop to shift x(n)^2 - y(n-1)^2 to the left for each leading bit in y(n-1). This will be slower but can be implemented with even the simplest of processors.

Higher Order Averaging
Higher-order recursive averaging may be accomplished by inserting additional averaging filters before the iterative square root. These filters may simply be one or more cascaded first-order recursive sections. First-order sections have the advantage of producing no overshoot in the step response. In addition, there is only one coefficient to adjust, and quantisation effects (primarily of concern for fixed-point implementation) are far less than those of higher-order filters.

The implementer should be aware that cascading first-order sections changes the definition of averaging time. A simple but gross approximation that maintains the earlier definition of step response is to simply divide the averaging time of each first-order section by the total number of sections. However, it is the implementer's responsibility to verify that this approximation is suitable for the application.

Second-order sections may also be used, if you want (for example) a Bessel-Thomson filter response. If second-order sections are used, it's best to choose an odd-order composite response, since the averaging square-root filter approximates the final first-order filter with Q=0.5. Care must be taken to minimise the overshoot of this averaging filter. Adjusting the averaging time of this filter in real time will prove more difficult, since there are a number of coefficients that must be adjusted in unison to ensure stability.

Results
Three methods of calculating the RMS level are compared in Figure 5. The averaging time is set to 100ms, and the input is one



second of 1/f noise with a 48kHz sampling frequency. The first trace is the true RMS value calculated using Equation 6. The second trace is the RMS calculation using Equation 14. The third trace is the no-divide calculation of Equation 16. The fourth trace is the RMS value using the reciprocal square-root method of Equation 13.

For the most part, the four traces line up nicely. All three approximations appear to converge at the same rate as the true RMS value. As expected, the largest deviation from the true RMS value is the approximation of Equation 16. This approximation will have the greatest error during large changes in the level of the input signal, although this error is temporary: the optimised approximation will converge upon the true RMS value when the level of the input signal is constant.

The errors between the three approximations and the true RMS value are shown in Figure 6. The error of the RMS approximation using Equation 14 slowly decreases until it is below 1E-7, which is sufficient for 24-bit accuracy. The optimised approximation of Equation 16 is substantially worse, at about 1E-4, but still good enough for many applications. The approximation that uses the reciprocal square root is "in the noise"--less than 1E-9. For highly critical floating-point applications, this is the efficient method of choice.

As you would expect, the errors discussed above will be worse with shorter averaging times and better with longer averaging times. Table 1 summarises the approximate error versus averaging time of these three methods, along with suitable hardware architecture requirements.

Suitable for average reader
By combining recursive averaging with Newton's method for calculating the square root, you'll gain a very efficient method for computing the root-mean. Although the three methods I presented here are developed for different hardware and each, to some degree, trades off hardware capabilities for error, most of you should find one of these methods suitable for your application.

Endnotes:
1. D. G. Zill. Calculus with Analytic Geometry, 2nd ed., PWS-Kent, Boston, pp. 170-176, 1988.
2. Motorola. DSP56300 Family Manual, Rev. 3, Motorola Literature Distribution, Denver, 2000.
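The shift-based approximation behind Equation 16 can also be modelled in floating point for experimentation. This sketch is entirely ours (the article's actual implementation is the DSP563xx assembly of Listing 4): 1/y(n-1) is replaced by a power of two derived from y's binary exponent, and the a/2 coefficient is multiplied by 3/2 to round the power-of-two truncation, as described in the text.

```cpp
#include <cmath>

// Rough model of the divide-free update: y(n) = y(n-1) +
// (3/2)*(a/2)*[x^2 - y(n-1)^2] * 2^(-e), where 2^(-e) is a power-of-two
// stand-in for 1/y(n-1) (the "normalisation" shift).
double DivideFreeRmsStep(double x, double yPrev, double a)
{
    int e;
    std::frexp(yPrev, &e);                   // yPrev = f * 2^e, 0.5 <= f < 1
    double recipApprox = std::ldexp(1.0, -e);  // 2^(-e) approximates 1/yPrev
    return yPrev + (0.75 * a) * (x * x - yPrev * yPrev) * recipApprox;
}
```

Because the correction term is proportional to x^2 - y^2, the fixed point of the recursion is still exactly y = |x| for a constant input; the power-of-two approximation only perturbs the effective averaging time, which matches the article's observation that the Equation 16 error is temporary.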

