Escolar Documentos
Profissional Documentos
Cultura Documentos
Dynamic range compression, despite being one of the most widely used audio effects, is
still poorly understood, and there is little formal knowledge and analysis of compressor design
techniques. In this tutorial we describe several different approaches to digital dynamic range
compressor design. Digital implementations of several classic analog approaches are given, as
well as designs from recent literature, and new approaches that address possible issues. Several
design techniques are analyzed and compared, including RMS and peak–based approaches,
feedforward and feedback designs, and linear and log domain level detection. We explain what
makes the designs sound different and provide metrics to analyze their quality. Finally, we
provide recommendations for high performance compressor design.
0 2 PRINCIPLE OF OPERATION
-5 Knee Width
The signal entering the compressor is split in two copies.
yG (dB)
2.2 The Gain Computer Alternate definitions for the time constant are often used
The gain computer is the compressor stage that generates [11]. For instance, if one considers rise time for the step
the control voltage. The control voltage determines the gain response to go from 10% to 90% of the final value
reduction to be applied to the signal. This stage involves the 0.1 = 1 − ατ1 fs , 0.9 = 1 − ατ2 fs → τ2 − τ1 = τ ln 9
Threshold T, Ratio R, and Knee Width W parameters. These
(8)
define the static input-to-output characteristic of compres-
sion. Once the signal level exceeds the threshold value, it In analog detectors the filter is typically implemented as
is attenuated according to the ratio. a simple series resistor capacitor circuit, with τ = RC. Eqs.
The compression ratio is defined as the reciprocal of the (5) and (7) may then be found by digital simulation using
slope of the line segment above the threshold, that is: a step invariant transform. The advantage of this approach
over other digital simulation methods like the bilinear trans-
xG − T
R= f or xG > T (2) form is that we preserve the analog filter topology with the
yG − T capacitor’s voltage as the state variable. Therefore, we will
the static compression characteristic is described by the not experience any clicks and pops once we start varying
following relationship: the filter coefficients over time.
xG xG ≤ T 2.3.1 RMS Detector
yG = (3)
T + (x G − T )/R x G > T Level detection may be based on a measurement of the
In order to smooth the transition between compression Root Mean Squared (RMS) value of the input signal [6–7],
and no compression at the threshold point, we can soften which is defined as
the compressor’s knee. The width W of the knee (in deci- 1
M/2−1
RA _ RA
+
C RR CA
CR RR
Fig. 3. Output of the peak detector circuit for different release 2.3.3 Level Corrected Peak Detectors
time constants. Attack Time = 50 ms.
Fig. 4 depicts an alternative peak detector where the sub-
circuits for attack and release are completely decoupled.
The input signal is first fed through a peak detector with
positive and completely blocks when reverse biased. This instantaneous attack. The result is a maximally fast peak es-
significantly simplifies the calculation [12]. timation that is then buffered and smoothed by a first-order
low-pass filter. The advantage of this is that the detector
d VC max(Vin − VC , 0) VC
= − (11) does not suffer from the level differences caused by differ-
dt R AC RRC ent time constants exhibited by the standard peak detector
The capacitor is charged through resistor R A according circuit.
to a positive voltage across the diode but continually dis- Using the derivations for the analog peak detector, a
charged though R R . Taking α A as the attack coefficient and digital implementation of the decoupled peak detector is
α R as the release coefficient, calculated according to Eq. straightforward.
(7) from the attack and release times τ A = R A C and τ R = y1 [n] = max (x L [n], α R y1 [n − 1])
R R C), we can simulate the ideal analog peak detector with: (14)
y L [n] = α A y L [n − 1] + (1 − α A )y1 [n]
y L [n] = α R y L [n − 1] However the attack envelope is now also impressed upon
+ (1 − α A ) max (x L [n] − yL [n − 1], 0) (12) the release envelope. A good estimate of what the actual re-
lease time would be is given by simply adding the respective
Although used in many analog compressors, and some dig- attack time constant to the release time constant.
ital designs [14], this circuit has a few problems. When In a solely digital environment, we can efficiently fix
x L [n] ≥y L [n−1], the step response is this by adding a branch to the difference equation of the
n−1 standard peak detector.
y[n] = (1 − α A ) (α R + α A − 1)m
α y [n−1]+(1−α A )x L [n] x L [n] > y L [n − 1]
yL [n]= A L
m=0
α R y L [n − 1] x L [n] ≤ yL [n − 1]
1 − αA τR
→ ≈ (13) (15)
2 − αR − αA τR + τA
The added branch ensures that the state variable is only
where we used the series expansion of the exponential func-
discharged during the release phase. Such a peak detector
tion. Eq. (13) implies that we will get a correct peak estimate
was described in [10, 15].
only when the release time constant is considerably longer
Fig. 5 depicts the output of the decoupled and branching
than the attack time constant. Another side effect is that
peak detector circuits, Eqs. (14) and (15) respectively, for
the attack time also gets slightly scaled by the release time:
different release time constants. The branching peak detec-
there will be a faster attack time than expected when we use
tor produces the intended release time constant, whereas the
a fast release time. Both problems are illustrated in Fig. 3.
decoupled peak detector produces a measured time constant
In order to accomplish a program-dependent release be-
of approximately τ R +τ A . Ideally then, the decoupled peak
havior (auto release), some analog compressors use a com-
bination of two release time constants in their peak detec-
5
tors. One such design is found in the famous SSL Stereo www.ka-electronics.com/images/SSL/ssl_82E27.pdf
0.6 the input to the side-chain is taken after the gain has been
applied. This was traditionally used in early compressors
0.5
and had the benefit that the side-chain could rectify possible
0.4 inaccuracies of the gain stage.
0.3 The feedforward topology has the side-chain input be-
0.2
fore the gain is applied. This means that the side-chain
circuit responsible for calculating the gain reduction, the
0.1
gain computer, will be fed with the input signal. Therefore
0.0 it will have to be accurate over the whole of the signal’s
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
Time (sec) dynamic range as opposed to a feedback type compressor
where it will have to be accurate over a reduced dynamic
Fig. 5. Output of the decoupled and branching peak detector
circuits for different release time constants. The branching peak range since the side-chain is fed with the compressors out-
detector produces the intended release time constant, whereas the put. This bears no implications to a digital design but it
decoupled peak detector produces a measured time constant of should be taken under consideration if designing an ana-
approximately τ A +τ R . log feedforward compressor. Most modern compressors are
based on the feedforward design.
By combining Eqs. (1) and (3) we can calculate the con-
trol voltage either from the input or from the output of the
detector should be used with τ R replaced by τ R +τ A . In- compressor, leading to the two topologies of feedforward
stead, the attack time is often added to the release time and feedback compression.
for the branching peak detector. In which case, neither Both feedforward and the feedback compressors can be
peak detector displays accurate time constants, but the re- implemented by subtracting the threshold from the signal
lease trajectories are roughly in agreement and it guarantees level (which is either derived from the input or from the
that the release time can never be shorter than the attack output of the compressor), applying halfwave rectification
time. and multiplying the result by the slope variable (which is
different for feedforward and feedback compressors).
The feedback design has a few limitations, such as the
2.3.4 Smooth Peak Detectors inability to allow a look-ahead function or to work as a
All the analog designs only make use of the full release perfect limiter due to the infinite negative amplification
time if the input returns back to zero after a peak. However, needed. From Eqs. (1) and (3), when, x L >T, the control
when the signal settles on an intermediate plateau instead, voltage for the feedback compressor is calculated as:
the release envelope will simply stop at this point and the cd B [n] = (1 − R)(yG [n − 1] − T ) (18)
release time is much shorter than we would expect. The
sudden stop creates a discontinuity in the envelope out- where we have assumed a hard knee and no attack or release.
put. We can adjust the branching peak detector in order A limiter (with a ratio of ∞: 1) would need infinite negative
to ensure that it will always make use of the full release amplification to calculate the control voltage. This is why
time. feedback compressors are not capable of perfect limiting.
In contrast, the control voltage of the feedforward com-
pressor, for x L >T, is:
α A y L [n − 1] + (1 − α A )x L [n] x L [n] > y L [n − 1]
yL [n] = α R y L [n − 1] + (1 − α R )x L [n] x L [n] ≤ y L [n − 1] (16)
cd B [n] = (1/R − 1)(x G [n] − T ) (19)
Here, since the slope for a limiter approaches –1, imple-
This peak detector, also used in [1, 15], now simply menting limiting is not a problem using a feedforward
switches coefficients between the attack and the release compressor. The feedforward compressor is also able to
phase. smoothly go into over-compression (with r <0); the slope
The same modification can be introduced to the release variable simply becomes smaller than –1.
mechanism of the decoupled peak detector, Eq. (14), if
a continuous release mechanism is desired. So instead of 3.1.1 An Alternate Digital Feedback Compressor
having the capacitor discharge toward ground by the release
As an alternative, we can apply the previous gain value to
resistor, it can release to the input signal, so:
the current input sample in order to estimate the current out-
put sample. Since the compressor’s gain is likely to change
y1 [n] = max(x L [n], α R y1 [n − 1] + (1 − α R )x L [n]) only by small amounts from sample to sample, the estima-
(17) tion is a fairly good one. Since the side-chain is now fed by
y L [n] = α A y L [n − 1] + (1 − α A )y1 [n] the input signal, we could instead feed in an arbitrary signal
Level _
_ c dB[n] Gain
Abs dB lin
Detector Computer
Gain Computer
τR M
x dB[n] y dB[n] τA T R W
(b)
τA τR Sidechain
c dB[n]
_
Level _
Gain
Abs dB lin
Gain Computer z -1 _ Detector Computer
x dB[n] y dB[n] M
T R W
(c)
Sidechain
Gain Computer
c dB[n]
_ Level _
_ Gain
Abs dB lin
Computer Detector
z -1
M
T R W τA τR
Fig. 6. Static compressor block diagrams showing operation on
decibel level representations of signals. Feedforward design on Fig. 7. Block Diagrams of the Compressor Configuration. (a)
top, feedback design in the middle, and alternate feedback design the return-to-zero detector; (b) the return-to-threshold detector;
on bottom. (c) the log domain detector.
in the side-chain, in order to achieve side-chaining opera- the detector in the linear domain but bias it at the threshold
tion (e.g., to implement a ducker or a de-esser). A similar level. That is, subtract the threshold from the signal before
technique has been used in analog compressors such as the it enters the detector and add the threshold back in again
SSL Bus Compressor (by using a slave VCA that mirrors after the signal has left it.
the main VCA’s action) and proposed for digital designs
[10]. x L [n] = |x[n]| − 10T /20
Fig. 6 presents the block diagrams for the feedforward, x G [n] = 20 log10 (yL [n] + 10T /20 ) (21)
feedback and alternate feedback compressors. cd B [n] = yG [n] − x G [n]
3.2 Detector Placement This turns the return-to-zero detector into a return-to-
There are various choices for the placement of the de- threshold type, and the envelope will smoothly fade out
tector circuit inside the compressor’s circuitry, as depicted once the signal falls below the threshold. However, this
in Fig. 7 for feedforward compressors. In [8, 12, 15-17], solution becomes problematic once we start using a soft
the suggested position for the detector is within the linear knee characteristic. Since the threshold becomes a smooth
domain and before the decibel converter and gain computer. transition instead of a hard boundary, we do not have a fixed
That is, value with which to bias the detector.
A similar approach is to place the detector in the linear
x L [n] = |x[n]| domain but after the gain computer [18–19]. In this case
x G [n] = 20 log10 y L [n] (20) the detector circuit works directly on the control voltage.
cd B [n] = yG [n] − x G [n] x G [n] = 20 log10 |x[n]|
However, the detector circuit then works on the full dy- x L [n] = 10 yG [n]/20 (22)
namic range of the input signal, while the gain computer c[n] = y L [n]
only starts to operate once the signal exceeds the thresh-
old (the control voltage will be zero for a signal below the For each of Eqs. (20), (21), and (22), since the release
threshold). The result is that we again experience a discon- envelope discharges exponentially toward zero in the linear
tinuity in the release envelope when the input signal falls domain, this is equivalent to a linear discharge in the log
below the threshold. It also generates a lag in the attack (or decibel) domain. Although the release rate in terms
trajectory since the detector needs some time to charge up of decibels per time is now constant, it also means that
to the threshold level even if the input signal attacks instan- the release time will be longer for heavier and shorter for
taneously. lighter compression. Unfortunately, this is not what the ear
A way to overcome this and achieve a smooth release perceives as a smooth release trajectory and the return-to-
trajectory and avoid the envelope discontinuity is by leaving zero detector in the linear domain is therefore mostly used
1.0
0.8
0.9
Signal Level 0.7
0.6 0.8
0.25 0.30 0.35
0.5 Input
Decoupled
0.4 Branching
Decoupled, smooth
0.3 Branching, Smooth
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
Time (ms)
Fig. 8. Output of the various peak detector circuits. Attack Time = 50 ms, release time = 100 ms. Inset, a close-up of the decoupled
smoothed and branching smoothed peak detectors as they respond to a sudden level change at 300 ms.
20
THD (%)
6.5 0.0
Effective Compression Ratio
Linear domain
6.0 Compressor Design Biased, linear domain
Branching Log domain
5.5 Decoupled -0.2
Gain (dB)
Branching Smooth
5.0 Decoupled Smooth
4.5 -0.4
4.0
3.5
-0.6
3.0
2.5 0.0 0.2 0.4 0.6 0.8 1.0
Time (sec)
2.0
Fig. 11. Compressor envelope trajectories in the linear domain
for the linear domain return-to-zero detector (solid), linear do-
4 8 12 16 20 24 28 32 36 main return-to-threshold detector (dotted), and log domain detec-
Modulation Frequency (Hz)
tor (dashed).
Fig. 10. Effective compression ratio as a function of modulation
frequency. The input signal and compressor parameters were set
as in Fig. 9.
of the compressed and uncompressed signals. Although it
is highly dependent on the envelope estimation method, it
has the added benefit that it does not require an artificial
compression ratio [24–25], Re f f . This is found by measur-
test signal.
ing the modulation of a signal at the output given a certain
Table 1 provides the FES measures for compressors ap-
modulation at the input. Using the method described in
plied to four signals. Only minor differences can be seen
[24], an amplitude modulated sine wave,
between the FES for the various log domain approaches.
x[n] = (1 + m cos(2πf m n)) cos(2πf c n), (24) The slightly worse performance of the smooth compressors
may be due to the fact that the signal is always in attack
is applied at the input. The spectrum consists of a carrier or release phase, whereas the nonsmooth compressors have
and two side bands. The difference between the amplitude times where the envelope is completely unchanged. More
of the side bands and the amplitude of the carrier is Si . noticeable, however, is the difference in FES for different
Compression is applied, and the difference between the level detector placements.
amplitude of the side bands and the amplitude of the car- The log domain detectors clearly outperform linear do-
rier of the compressed signal is found, So . The effective main detectors on all signals except vocals (where there is
compression ratio is then given by Si /So . only a minimal difference in FES).
The effective compression ratio as a function of the mod-
ulation frequency is given in Fig. 10. The decoupled designs
generally perform closer to the target compression ratio of 5 CONCLUSION
7:1. This is primarily due to the lack of full use of the re- Based on the analysis and design choices presented, we
lease time in the branching compressors. However, the non- can propose a compressor configuration that serves the
smooth decoupled design actually outperforms the smooth goals of ease of modification and avoidance of artifacts.
decoupled design for high modulation frequencies, since Feedforward compressors are preferred since they are
the rapid decay of the compression ratio during release more stable and predictable than the feedback type ones,
prevents the low amplitude modulator from being overly and high dynamic range problems do not occur with digi-
compressed. tal designs. The detector is placed in the log domain after
the gain computer, since this generates a smooth envelope,
4.2 Comparison of Side-Chain Configurations has no attack lag, and allows for easy implementation of
Fig. 11 compares the three different detector options in variable knee width. For the compressor to have smooth
terms of the linear gain envelope they produce. The attack performance on a wide variety of signals, with minimal ar-
lag and the release discontinuity of the linear domain return- tifacts and minimal modification of timbral characteristics,
to-zero detector can be clearly seen. The linear domain the smooth, decoupled peak detector should be used. Alter-
return-to-threshold and log domain detector curves look nately, the smooth, branching peak detector could be used
similar, though the former features faster attack and slower in order to have more detailed knowledge of the effect of the
release behavior than the latter. time constants, although this may yield discontinuities in
One further metric from prior work on compressors for the slope of the gain curve when switching between attack
improved speech intelligibility can be used. The Fidelity of and release phases.
the Envelope Shape (FES) is intended to measure how well The analysis in this paper was limited to the design of
the envelope of the signal is maintained [25]. It is based on standard compressors. Their uses and recommended param-
a measure of the correlation between the envelopes (in dB) eter settings are discussed in detail in other texts. Variations
Table 1. Fidelity of the Envelope Shape for compression applied to four acoustic signals. In each
case, a strong, fast acting compressor was applied; threshold –40 dB, ratio 10:1, attack time 1 ms,
release time 40 ms, knee 20 dB.
and more advanced designs, such as side-chain filtering or [12] J. S. Abel and D. P. Berners, “On Peak-Detecting
multiband compressors, were also beyond the scope of this and RMS Feedback and Feedforward Compressors,” pre-
paper. Design of an intelligent compressor, where the signal sented at the 115th Convention of the Audio Engineering
is analyzed in order to adapt the parameter settings to the Society (2003 Oct.), convention paper 5914.
audio content, represents an interesting future direction and [13] D. Berners, “Analysis of Dynamic Range Control
will be presented in related work by the authors [26]. (DRC) Devices,” Universal Audio WebZine, vol. 4 (Septem-
ber 2006).
[14] P. Dutilleux and U. Zölzer, “Nonlinear Processing,
5 REFERENCES Chap. 5,” in DAFX - Digital Audio Effects, U. Zoelzer, Ed.
[1] P. Dutilleux, et al., “Nonlinear Processing, Chap. 4,” (1st ed: Wiley, John & Sons, 2002).
in Dafx:Digital Audio Effects, U. Zoelzer, Ed. (2nd ed:
[15] U. Zolzer, Digital Audio Signal Processing, 2nd ed.
Wiley, John & Sons, 2011), p. 554.
(John Wiley and Sons, Ltd., 2008).
[2] J. O. Smith, Introduction to Digital Filters with Audio
[16] P. Hämäläinen, “Smoothing of the Control Signal
Applications (Booksurge Llc, 2007).
Without Clipped Output in Digital Peak Limiters,” in In-
[3] R. Izhaki, Mixing Audio: Concepts, Practices and
ternational Conference on Digital Audio Effects (DAFx)
Tools (Focal Press, 2008).
(Hamburg, Germany , 2002), pp. 195–198.
[4] B. Lachaise and L. Daudet, “Inverting Dynamics
[17] S. J. Orfanidis, Introduction to Signal Processing
Compression with Minimal Side Information,” in Inter-
(Orfanidis [prev. Prentice Hall], 2010).
national Conference on Digital Audio Effects (DAFx)
(Helsinki, 2008). [18] J. Bitzer and D. Schmidt, “Parameter Estimation
[5] B. Rudolf, “1176 Revision History,” Mix Magazine of Dynamic Range Compressors: Models, Procedures and
Online (June 1 2000). Test Signals,” presented at the 120th Convention of the
[6] D. Barchiesi and J. D. Reiss, “Reverse Engineering Audio Engineering Society (2006 May), convention paper
the Mix,” J. Audio Eng. Soc., vol. 58, pp. 563–576 (2010 6849.
Jul./Aug.). [19] L. Lu, “A Digital Realization of Audio Dy-
[7] M. Bosi, “Low-Cost/High-Quality Digital Dynamic namic Range Control,” in Fourth International Conference
Range Processor,” presented at the 91st Convention of the on Signal Processing Proceedings (IEEE ICSP) (1998),
Audio Engineering Society (1991 Oct.), convention paper pp. 1424–1427.
3133. [20] Sonnox, “Dynamics Plug-In Manual,” Sonnox
[8] R. J. Cassidy, “Level Detection Tunings and Tech- Oxford Plug-ins, April 1st 2007.
niques for the Dynamic Range Compression of Audio Sig- [21] S. Wei and W. Xu, “FPGA Implementation of Gain
nals,” presented at the 117th Convention of the Audio En- Calculation Using a Polynomial Expression for Audio Sig-
gineering Society (2004 Oct.), convention paper 6235. nal Level Dynamic Compression,” J. Acoust. Soc. Jpn. (E),
[9] F. Floru, “Attack and Release Time Constants in vol. 29, pp. 372–377 (2008).
RMS-Based Feedback Compressors,” J. Audio Eng. Soc., [22] S. Wei and K. Shimizu, “Dynamic Range Com-
vol. 47, pp. 788–804 (1999 Oct.). pression Characteristics Using an Interpolating Polynomial
[10] G. W. McNally, “Dynamic Range Control of Digital for Digital Audio Systems,” IEICE Trans. Fundamentals,
Audio Signals,” J. Audio Eng. Soc., vol. 32, pp. 316–327 vol. E88-A, pp. 586–589 (2005).
(1984 May). [23] H. Fastl and E. Zwicker, Psychoacoustics: Facts
[11] B. E. Lobdell and J. B. Allen, “A Model of the VU and Models, 3rd ed. (Springer, 2007).
(Volume-Unit) Meter, with Speech Applications,” J. Acous. [24] J. Verschuure, et al., “Compression and its Effect on
Soc. Am., vol. 121, pp. 279–285 (2007). the Speech Signal,” Ear Hear, vol. 17, pp. 162–175 (1996).
[25] M. A. Stone and B. C. J. Moore, “Effect of the [26] D. Giannoulis, M. Massberg and J. D.
Speed of a Single-Channel Dynamic Range Compressor on Reiss, “Parameter Automation in a Dynamic Range
Intelligibility in a Competing Speech Task,” J. Acoust. Soc. Compressor,” submitted to J. Audio Eng. Soc.,
Am., vol. 114, pp. 1023–1034 (2003). 2011.
THE AUTHORS
Dimitrios Giannoulis studied physics at the National Uni- since 2008, working as as an R&D engineer and special-
versity of Athens, Greece, and he specialized, among izing in digital modeling of analog recording studio hard-
r
others, on signal processing and acoustics. He received ware.
a Master’s degree in music signal processing in 2010
from Queen Mary University of London. He is currently,
pursuing a Ph.D. degree in the Centre for Digital Music Dr. Josh Reiss is a senior lecturer with the Centre for
(C4DM) in Queen Mary University of London. His main Digital Music at Queen Mary University of London. He re-
research areas of interest are machine learning, digital ceived his Ph.D. in physics from Georgia Tech, specializing
signal processing, audio and speech processing and music in analysis of nonlinear systems. Dr. Reiss has published
r
information retrieval. over 100 scientific papers and serves on several steering and
technical committees. He has investigated music retrieval
systems, time scaling and pitch shifting techniques, poly-
Michael Massberg studied media technology at FH Old- phonic music transcription, loudspeaker design, automatic
enburg / Ostfriesland / Wilhelmshaven in Emden, Germany mixing for live sound and digital audio effects, among oth-
and digital music processing at Queen Mary University of ers. His primary focus of research, which ties together many
London, UK, where he received a Master’s degree in 2009. of the above topics, is on the use of state-of-the-art signal
He has been with German pro-audio company Brainworx processing techniques for professional sound engineering.