BRAINDEAD
I, as the author and copyright holder, allow you to do anything you wish with this
book free of charge, including copying, printing and republishing. In return, you
must preserve this notification and the book’s website URL on the title page.
Olli Niemitalo
Contents
1 Sampling basics
1.1 What is sound?
1.2 From air pressure to analog
1.3 From analog to digital
1.4 Quantization error
1.5 Frequencies
1.6 Angular frequency
1.7 Frequency range and aliasing
1.8 Nyquist, we have a problem!
2 Sinusoids
2.1 Amplitude
2.2 Phase
2.3 Addition
3 Processing
3.1 Mathematical model of sampling
3.2 Discrete processing
Internet references
Symbol chart
About this book
The purpose of the book is to be a tutorial for people who want to learn audio
digital signal processing, but find the academic books too cryptic and impractical.
Softsynth and audio software makers, game programmers, computer musicians
etc. could fall into this league. You must know how to program and have some
basic math knowledge, that’s all.
Printing should be done on both sides of the paper, preferably with a color printer.
If your printer is not capable of printing on both sides, first print odd pages on one
side of the paper, re-insert the papers (check you got the right order and position),
and print the even pages on the blank sides. Finally, check that there is no lonely
odd page left in the paper tray. If you can’t trust your printer, do the printing
chapter by chapter.
Don’t forget to check out the book’s website for the latest version!
I started this project because my older, similar text (DSPSTUFF.TXT) began to
seem a bit naive, and I wanted to rewrite the whole thing. ASCII art is not that
accurate, so I chose to use graphics, specifically vector graphics. The quest for the right
software led me to LaTeX (MiKTeX), Adobe products (Illustrator) and GnuPlot.
My motivation is sharing knowledge, and probably a tiny bit of that built-in desire
for 15 minutes of fame. For me, this book also works as an answer to all those
“How was it again...?” questions that hit me every now and then.
I’d like to thank Timo Tossavainen for teaching me stuff, and my big brother Kalle
Niemitalo for helping with the math and writing. Thanks to all the people I’ve
got feedback from. The coolest thing yet is that I have received free software,
documents and even job offers in return for my work! :-) Please don’t stop! It’s
great hearing this is of use. Also, I’d like to know if you have found errors or have
suggestions or questions. Updating is easy, as this is published electronically.
***
1. Sampling basics
Sound is pressure changes traveling in the air, or in some other medium like
water. It can be caused by vibrating objects like guitar strings stirring the air, or
by air turbulence. A nuclear explosion makes a loud bang, too.
An increase in air pressure practically means an increased number of air molecules
in a volume. Low pressure means a lack of air molecules. Whenever there’s a
thinner (local low pressure) spot in the air, surrounding air molecules are pushed
there to fill it, but as they move, they leave behind another thin spot, which is again
filled by surrounding molecules. And so the sound travels, in air at about 330 m/s.
Hey, it isn’t really that simple, but you don’t necessarily need to know more! It’s
the huge number of molecules that turns it all into statistics...
[Figure: signal plotted against time]
This signal is called an analog signal, referring to the fact that the voltage
is analogous to air pressure. In this form, the sound can be recorded for example
mechanically on a vinyl disk or magnetically on a tape, or, after amplification (the
voltage is scaled by multiplication), sent to speakers to convert the voltage changes
back to pressure changes, that is, sound.
The computer’s memory cannot store an infinite amount of data. The memory
is not continuous like the curve on a vinyl disk. Instead, it is divided into a finite
number of memory slots, bits, each of which has only two states, 0 and 1 (black or
white, no greys, one could say). This is called digital.
Therefore it is not possible to save the original sound digitally in all its detail.
Luckily (in this context!) our hearing is limited and we cannot hear very
quiet sounds or very high frequencies, so the amount of information needed to
store an accurate-sounding representation of a sound is finite, and can be easily
reached using today’s equipment.
The way it is done is called sampling. Here’s a sampled version of the fart sound:
[Figure: the sampled sound, amplitude plotted against time]
The vertical axis is now titled amplitude, since we are no longer dealing with a
real quantity like voltage. Sometimes they say things like: “The amplitude of this
signal is 5 volts”. In that case, they are talking of the total zero-level to top height
of the waveform. Here, instead, we mean instantaneous values. Just try to grasp
the concepts, and you’ll be all right with the twisted terminology. You may even
become friends! (Hope not too good ones)
The sampled sound is not a continuous curve. Instead, it is a set of peaks of
different amplitudes, spread in time at equal intervals (meaning the time between
adjacent peaks is constant, the same everywhere). This kind of signal, where time
is quantized, is called a discrete signal. The amplitudes of the peaks are taken
from the instantaneous voltage levels of the original sound, hence the name
sampling. Another name for a single peak is samplepoint, or sample for short.
Samplepoint is preferred, since sample could also mean a longer piece of sound.
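Here is a minimal Python sketch of sampling. The function names and the decaying 100 Hz test tone are made up for illustration, they are not from the book. The "analog" signal is modelled as a function of time, and its instantaneous values are read at equal intervals:

```python
import math

def sample(signal, sampling_frequency, count):
    """Read `count` samplepoints of `signal` at equal time intervals."""
    interval = 1.0 / sampling_frequency          # time between adjacent samplepoints
    return [signal(n * interval) for n in range(count)]

def analog(t):
    # Hypothetical stand-in for the analog voltage: a decaying 100 Hz tone.
    return math.exp(-20.0 * t) * math.sin(2.0 * math.pi * 100.0 * t)

# Eight samplepoints taken 1000 times per second.
samplepoints = sample(analog, 1000.0, 8)
for n, value in enumerate(samplepoints):
    print(n, round(value, 3))
```

Everything between the samplepoints is simply thrown away, which is exactly what makes the result a discrete signal.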
To limit the amount of memory required to save the amplitude of a single
samplepoint, amplitude is also quantized, meaning it can only have values that are
multiples of a constant. The relation of quantized amplitude to unquantized
amplitude is a staircase function, from which the closest step is always taken:
[Figure: Amplitude quantization. Quantized amplitude as a staircase function of unquantized amplitude, with the quantization step marked]
Footnote (on equal intervals): You’d have to be really wicked to want to do nonuniform sampling,
where samplepoints are spaced unevenly in time.
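The staircase can be sketched in Python as rounding to the nearest multiple of the quantization step. The function name `quantize` and the step of 0.25 are assumptions chosen for illustration:

```python
def quantize(amplitude, step):
    """Return the multiple of `step` that is closest to `amplitude`."""
    # round() picks the nearest integer (exact ties go to the even one),
    # so the result is always a whole number of steps.
    return step * round(amplitude / step)

for amplitude in (0.26, 0.40, -0.90):
    print(amplitude, "->", quantize(amplitude, 0.25))
```

Only the number of steps needs to be stored, and that is an integer, so it fits in a fixed number of bits.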
1.4 Quantization error
It would seem logical that the better the amplitude precision, the better the quality. Right!
Adding one more bit to the bit-depth doubles the number of available amplitude
levels, and drops the quantization error to half. Quantization error is the unwanted
addition to your signal due to quantization, and it can be calculated through:
$$\text{quantization error} = \text{quantized signal} - \text{original signal}$$
This is a common procedure. To extract the error from a spoiled signal, for closer
investigation, you subtract the original from it.
Footnote: PCM stands for Pulse Coded Modulation, mumble jumble... Modulation means that the signal is
transformed into another signal representing the original. Pulse refers to the fact that the discrete signal
is a series of narrow peaks, or pulses. Coded is a synonym for digital.
Let’s try the formula visually and see what we get from quantizing a sinusoid
(the name is analogous to humanoid, human-“like”), one of the most basic waveforms:
We can promptly see that the amplitude of the quantization error is strictly limited
to a range. The top limit of the range is equal to half the quantization step, and the
bottom limit is the same with a minus sign.
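The same experiment can be tried numerically. The sketch below is an illustration under stated assumptions, not the book’s own procedure: the amplitude range -1..+1 is split into 2**bits steps, clipping at the ends of the range is ignored, and all names are made up. It quantizes one cycle of a sine and records the largest error seen:

```python
import math

def quantize(amplitude, step):
    """Return the multiple of `step` that is closest to `amplitude`."""
    return step * round(amplitude / step)

def max_quantization_error(bits, count=1000):
    """Quantize one cycle of a sine; return (step, largest absolute error)."""
    step = 2.0 / 2 ** bits          # assumed: range -1..+1 split into 2**bits steps
    worst = 0.0
    for n in range(count):
        original = math.sin(2.0 * math.pi * n / count)
        error = quantize(original, step) - original   # quantized minus original
        worst = max(worst, abs(error))
    return step, worst

for bits in (3, 4, 5):
    step, worst = max_quantization_error(bits)
    print(bits, "bits: step", step, "largest error", worst)
```

Each added bit halves the step, so the limit of half a step is halved along with it.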
1.5 Frequencies
$$\text{amplitude} \cdot \cos(2\pi \cdot \text{frequency} \cdot \text{time} + \text{initial phase})$$
And the same using the abbreviations:
$$A \cos(2\pi f t + \phi_0)$$
Amplitude defines the height of the sinusoid, measured from the zero level to the
top. Initial phase defines the phase of the sinusoid at $t = 0$, a cosine being the
result from $\phi_0 = 0$. Commonly, the time unit is seconds, the frequency unit is Hz
and no unit is used with the amplitude.
We also introduce a new letter to express discrete time, $T$ (upper case!). These
new variables, the angular frequency $\omega$ and $T$, are related to $f$ and $t$ by:
$$\omega = \frac{2\pi f}{f_s}, \qquad T = t f_s$$
where $f_s$ is the sampling frequency. There are no units for these quantities, and the
general sinusoid formula is simplified into:
$$A \cos(\omega T + \phi_0)$$
The convenience gained is not only the simplified formula, but also that the discrete
time can now be used as the samplepoint number and the frequency is expressed relative
to the sampling frequency, kicking it out of the calculations. For example, if we
are assigned to create a sampled sinusoid of some angular frequency, we don’t
need to know the sampling frequency to be able to start typing in the samplepoint
values.
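A small Python sketch can show that the two ways of writing the sinusoid give the same samplepoints. The function names and the 11025 Hz / 44100 Hz figures are assumptions picked for illustration:

```python
import math

def sinusoid_from_hz(amplitude, frequency, initial_phase, sampling_frequency, count):
    """Samplepoints computed from real-world frequency and time."""
    samples = []
    for sample_number in range(count):
        t = sample_number / sampling_frequency    # time in seconds
        samples.append(amplitude * math.cos(2.0 * math.pi * frequency * t + initial_phase))
    return samples

def sinusoid_from_omega(amplitude, omega, initial_phase, count):
    """Samplepoints computed from angular frequency and samplepoint number only."""
    return [amplitude * math.cos(omega * sample_number + initial_phase)
            for sample_number in range(count)]

# 11025 Hz at a 44100 Hz sampling frequency is a quarter of it: omega = pi/2.
omega = 2.0 * math.pi * 11025.0 / 44100.0
a = sinusoid_from_hz(1.0, 11025.0, 0.0, 44100.0, 8)
b = sinusoid_from_omega(1.0, omega, 0.0, 8)
print(max(abs(x - y) for x, y in zip(a, b)))      # a tiny rounding difference at most
```

The second function never sees the sampling frequency, which is the whole point of the angular frequency notation.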
Here are some possibilities for $\omega$ and the corresponding real frequencies:
$\omega = 0 \Rightarrow f = 0$ Hz
$\omega = \pi/2 \Rightarrow f = f_s/4$
$\omega = \pi \Rightarrow f = f_s/2$
$\omega = 2\pi \Rightarrow f = f_s$
[Figure: Example sinusoid with amplitude $A = 1$, angular frequency $\omega = \pi/2$ and initial phase $\phi_0 = 0$; amplitude plotted against time (sample number), samples 0 to 8]
The markers are the samplepoints. Since the angular frequency is $\pi/2$ (90°), which
corresponds to a quarter of the sampling frequency, the sinusoid goes a full cycle every
4 samples: $\pi/2$ is a fourth of a full circle ($4 \cdot \pi/2 = 2\pi$). Starting to understand angular
frequency? You can also consider $\omega$ as a phase increase that is added to the
phase of the sampled sinusoid at every sampling step.
A good visualization aid is a marker going counterclockwise around a unit circle
(radius 1, centered at the origin). The circumference (the length of the circle
straightened) of a unit circle is $2\pi$. At every sampling step the arc traveled by the marker
is of length $\omega$, and so is the angle the marker rotates around the origin. If we are creating a
sampled cosine wave, taking samples of the horizontal coordinate of the marker
and starting from coordinates (1,0) at time 0 does the job:
[Figure: The example sinusoid with unit circle illustration, $A = 1$, $\omega = \pi/2$, $\phi_0 = 0$; amplitude plotted against time (sample number)]
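The marker on the unit circle translates almost directly into code. This is a minimal sketch with made-up names: the marker is rotated counterclockwise by the angular frequency at every sampling step, and its horizontal coordinate is taken as the samplepoint:

```python
import math

def cosine_by_rotation(omega, count):
    """Sample a cosine by rotating a marker around the unit circle."""
    x, y = 1.0, 0.0                               # marker starts at (1, 0) at time 0
    c, s = math.cos(omega), math.sin(omega)       # one sampling step of rotation
    samples = []
    for _ in range(count):
        samples.append(x)                         # horizontal coordinate = cosine sample
        x, y = x * c - y * s, x * s + y * c       # rotate counterclockwise by omega
    return samples

# A quarter turn per sample: the values repeat every 4 samplepoints.
samples = cosine_by_rotation(math.pi / 2.0, 8)
print([round(value, 6) for value in samples])
```

Rounding errors pile up slowly in a loop like this, so for very long signals computing the cosine directly from the phase may be the safer choice.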
1.7 Frequency range and aliasing
[Figure: a sinusoid of angular frequency $\pi$, the Nyquist frequency; amplitude plotted against time (sample number)]
If we have a higher frequency than this Nyquist frequency, and use the same
sampling frequency, shit will happen. Here we have a frequency that is 4/3 of the
Nyquist frequency:
[Figure: a sinusoid of angular frequency $\frac{4}{3}\pi$ with its samplepoints; amplitude plotted against time (sample number)]
Now we wipe out the continuous waveform and store only the discrete samples:
[Figure: the discrete samples alone; amplitude plotted against time (sample number)]
[Figure: a sinusoid of angular frequency $\frac{2}{3}\pi$; amplitude plotted against time (sample number)]
[Figure: aliased frequency (0 Hz to $0.5 f_s$) plotted against unaliased frequency (0 Hz to $2.5 f_s$)]
Now this explains why it is called aliasing! First, as we increase the frequency
above the Nyquist frequency ($0.5 f_s$), it bounces off and aliases over the already used range, and when
increased more, it bounces off the 0Hz. And so on, infinitely.
The amplitude of a sinusoid is preserved in aliasing. What happens to the phase
is usually unimportant, and will not be discussed here.
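The bouncing can be written as a small folding function. This is a sketch assuming non-negative frequencies, with a made-up function name; it is one way to express the zigzag, not necessarily how the book would put it:

```python
def aliased_frequency(frequency, sampling_frequency):
    """Fold a non-negative frequency into the range 0 .. half the sampling frequency."""
    folded = frequency % sampling_frequency       # the zigzag repeats every fs
    if folded > sampling_frequency / 2.0:
        folded = sampling_frequency - folded      # bounce off half the sampling frequency
    return folded

# With the sampling frequency set to 1.0, frequencies read as fractions of it.
for f in (0.3, 0.75, 1.25, 2.0):
    print(f, "->", aliased_frequency(f, 1.0))
```

Both 0.75 and 1.25 land on 0.25 here, one after a single bounce and the other after two.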
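In our example, the unaliased frequency is in the range from $0.5 f_s$ to $1.0 f_s$, so by reading
from the graph, we can write a mathematical equation for the aliasing relation
(applicable in this specific frequency range only):
$$f_{\text{aliased}} = f_s - f$$
We just declared the Nyquist frequency $f_N$ to be:
$$f_N = \frac{f_s}{2}$$
And we had $\frac{4}{3} f_N$ as the unaliased frequency $f$, so in our example:
$$f_{\text{aliased}} = 2 f_N - \frac{4}{3} f_N = \frac{2}{3} f_N$$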
[Figure: amplitude plotted against time (sample number)]
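The example can be checked numerically. In this sketch (variable names are made up) a cosine at 4/3 of the Nyquist frequency and a cosine at 2/3 of it are sampled at the same instants, and the samplepoints are compared:

```python
import math

# Angular frequency pi is the Nyquist frequency, so 4/3 of it is 4*pi/3.
original = [math.cos(4.0 / 3.0 * math.pi * n) for n in range(9)]
alias = [math.cos(2.0 / 3.0 * math.pi * n) for n in range(9)]

for n in range(9):
    print(n, round(original[n], 6), round(alias[n], 6))
```

The two columns agree up to rounding, so once the continuous waveform is wiped out there is no way to tell the two frequencies apart.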
The Nyquist frequency could be thought of as a special case, where the phase
information of the sinusoid is lost, as it is always reconstructed as a cosine.
Here’s an example showing how the phase disappears:
[Figure: example of the phase disappearing; amplitude plotted against time (sample number)]
The amplitude of the reconstructed cosine is not that of the original sinusoid. It is,
as easily interpretable from the figure, the same as the value of the original sinusoid
at time 0. In short, Nyquist frequency sinusoids are attenuated or even muted
depending on their initial phases.
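A quick Python sketch makes the attenuation visible. The function names are made up, and the sinusoid is assumed to be of the form amplitude times cos(pi times sample number plus initial phase):

```python
import math

def nyquist_samples(amplitude, initial_phase, count):
    """Samplepoints of a sinusoid at exactly the Nyquist frequency (omega = pi)."""
    return [amplitude * math.cos(math.pi * n + initial_phase) for n in range(count)]

def reconstructed_amplitude(amplitude, initial_phase):
    """Amplitude of the reconstructed cosine: the original's value at time 0."""
    return amplitude * math.cos(initial_phase)

for initial_phase in (0.0, math.pi / 3.0, math.pi / 2.0):
    samples = nyquist_samples(1.0, initial_phase, 4)
    print(round(reconstructed_amplitude(1.0, initial_phase), 6),
          [round(value, 6) for value in samples])
```

With an initial phase of a quarter circle the sinusoid is a sine, every samplepoint falls on a zero crossing and the sound is muted completely.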
The good thing is that this special problem is limited to the Nyquist frequency only. A
frequency a tiny bit lower does not have the problem. Therefore, the audiophile’s
intuitive argument loses its point: sampling at 40001 Hz is enough for representing
any 20000 Hz sinusoid.
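To get a feel for this, the sketch below (a made-up helper, and only a look at the samplepoints, not a reconstruction) samples a 20000 Hz sine, the worst-case phase, for about one second at both rates and reports the largest samplepoint seen:

```python
import math

def peak_sample(frequency, sampling_frequency, count):
    """Largest absolute samplepoint of a sine sampled `count` times."""
    return max(abs(math.sin(2.0 * math.pi * frequency * n / sampling_frequency))
               for n in range(count))

# Exactly at the Nyquist frequency the sine is muted...
print(peak_sample(20000.0, 40000.0, 40001))
# ...but one extra hertz of sampling frequency lets the samples drift over the whole cycle.
print(peak_sample(20000.0, 40001.0, 40001))
```

At 40000 Hz every samplepoint stays at zero apart from rounding noise, while at 40001 Hz the samplepoints slowly sweep through the waveform and reach nearly full amplitude, so the information is there to be reconstructed.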
Perfect reconstruction is an extremely heavy process, and practical reconstructors
are far from perfect. Still, there is no similar phase-selective attenuation below the
Nyquist frequency. Other kinds of problems, mostly aliasing-related, exist.