
EE2-4: Communication Systems
Dr. Cong Ling
Department of Electrical and Electronic Engineering

Course Information

Lecturer: Dr. Cong Ling (Senior Lecturer)


Office: Room 815, EE Building
Phone: 020 7594 6214
Email: c.ling@imperial.ac.uk

Handouts
Slides: exam is based on slides
Notes: contain more details
Problem sheets

Course homepage
http://www.commsp.ee.ic.ac.uk/~cling
You can access lecture slides/problem sheets/past papers
Also available in Blackboard

Grading
1.5-hour exam, no-choice, closed-book

Lectures
Introduction and background
1. Introduction
2. Probability and random
processes
3. Noise
Effects of noise on analog
communications
4. Noise performance of DSB
5. Noise performance of SSB and
AM
6. Noise performance of FM
7. Pre/de-emphasis for FM and comparison of analog systems

Digital communications
8. Digital representation of signals
9. Baseband digital transmission
10. Digital modulation
11. Noncoherent demodulation
Information theory
12. Entropy and source coding
13. Channel capacity
14. Block codes
15. Cyclic codes

EE2-4 vs. EE1-6

EE1-6: Introduction to Signals and Communications
How do communication systems work?
About modulation, demodulation, signal analysis...
The main mathematical tool is the Fourier transform for deterministic signal analysis.
More about analog communications (i.e., signals are continuous).

EE2-4: Communication Systems
How do communication systems perform in the presence of noise?
About statistical aspects and noise.
This is essential for a meaningful comparison of various communication systems.
The main mathematical tool is probability.
More about digital communications (i.e., signals are discrete).

Learning Outcomes
Describe a suitable model for noise in communications
Determine the signal-to-noise ratio (SNR) performance of
analog communications systems
Determine the probability of error for digital
communications systems
Understand information theory and its significance in
determining system performance
Compare the performance of various communications
systems

About the Classes

You're welcome to ask questions.


You can interrupt me at any time.
Please don't disturb others in the class.

Our responsibility is to facilitate you to learn.


You have to make the effort.
Spend time reviewing lecture notes afterwards.
If you have a question on the lecture material after a class, then
Look up a book! Be resourceful.
Try to work it out yourself.
Ask me during the problem class or one of the scheduled times of availability.

References

C. Ling, Notes of Communication Systems, Imperial College.
S. Haykin & M. Moher, Communication Systems, 5th ed., International Student Version, Wiley, 2009 (43.99 from Wiley)
S. Haykin, Communication Systems, 4th ed., Wiley, 2001. Owns the copyright of many figures in these slides. For convenience, the note "Wiley, Haykin/Communication Systems, 4th ed." is not shown for each figure.
B.P. Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Oxford University Press, 1998
J.G. Proakis and M. Salehi, Communication Systems Engineering, Prentice-Hall, 1994
L.W. Couch II, Digital and Analog Communication Systems, 6th ed., Prentice-Hall, 2001

Multitude of Communications

Telephone network
Internet
Radio and TV broadcast
Mobile communications
Wi-Fi
Satellite and space communications
Smart power grid, healthcare

Analogue communications
AM, FM

Digital communications
Transfer of information in digits
Dominant technology today
Broadband, 3G, DAB/DVB
8

What's Communications?
Communication involves the transfer of information from
one point to another.
Three basic elements
Transmitter: converts message into a form suitable for
transmission
Channel: the physical medium, introduces distortion, noise,
interference
Receiver: reconstructs a recognizable form of the message

Speech
Music
Pictures
Data

Communication Channel
The channel is central to the operation of a communication system
Linear (e.g., mobile radio) or nonlinear (e.g., satellite)
Time invariant (e.g., fiber) or time varying (e.g., mobile radio)

The information-carrying capacity of a communication


system is proportional to the channel bandwidth
Pursuit for wider bandwidth

Copper wire: 1 MHz


Coaxial cable: 100 MHz
Microwave: GHz
Optical fiber: THz
Uses light as the signal carrier
Highest capacity among all practical signals

10

Noise in Communications
Unavoidable presence of noise in the channel
Noise refers to unwanted waves that disturb communications
Signal is contaminated by noise along the path.

External noise: interference from nearby channels, human-made noise, natural noise...
Internal noise: thermal noise, random emission... in
electronic devices
Noise is one of the basic factors that set limits on communications.
A widely used metric is the signal-to-noise (power) ratio (SNR):
SNR = signal power / noise power

Transmitter and Receiver


The transmitter modifies the message signal into a form
suitable for transmission over the channel
This modification often involves modulation
Moving the signal to a high-frequency carrier (up-conversion) and
varying some parameter of the carrier wave
Analog: AM, FM, PM
Digital: ASK, FSK, PSK (SK: shift keying)

The receiver recreates the original message by


demodulation
Recovery is not exact due to noise/distortion
The resulting degradation is influenced by the type of modulation

Design of analog communication is conceptually simple.
Digital communication is more efficient and reliable; its design is more sophisticated.

Objectives of System Design


Two primary resources in communications
Transmitted power (should be green)
Channel bandwidth (very expensive in the commercial market)

In certain scenarios, one resource may be more important


than the other
Power limited (e.g. deep-space communication)
Bandwidth limited (e.g., telephone circuit)

Objectives of a communication system design


The message is delivered both efficiently and reliably, subject to
certain design constraints: power, bandwidth, and cost.
Efficiency is usually measured by the amount of messages sent in
unit power, unit time and unit bandwidth.
Reliability is expressed in terms of SNR or probability of error.

13

Information Theory

In digital communications, is it possible to operate at


zero error rate even though the channel is noisy?
Pioneers: Shannon, Kolmogorov
The maximum rate of reliable transmission is
calculated.
The famous Shannon capacity formula for a channel with bandwidth W (Hz):
C = W log2(1 + SNR) bps (bits per second)

Zero error rate is possible as long as actual signaling


rate is less than C.
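As a quick numerical illustration (an added sketch, not from the slides; the bandwidth and SNR values are arbitrary assumptions), the capacity formula can be evaluated directly:

```python
# Minimal sketch: evaluate C = W log2(1 + SNR) for an assumed channel.
import math

def shannon_capacity(bandwidth_hz, snr_linear):
    """Capacity of an AWGN channel in bits per second."""
    return bandwidth_hz * math.log2(1 + snr_linear)

snr_db = 30                              # assumed SNR of 30 dB
snr = 10 ** (snr_db / 10)                # convert dB to a linear power ratio
print(shannon_capacity(3100, snr))       # ~30.9 kbps for an assumed 3.1 kHz channel
```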

Shannon

Many concepts were fundamental and paved the way for future developments in communication theory.
Provides a basis for the tradeoff between SNR and bandwidth, and for comparing different communication schemes.

Kolmogorov
14

Milestones in Communications

1837, Morse code used in telegraph


1864, Maxwell formulated the electromagnetic (EM) theory
1887, Hertz demonstrated physical evidence of EM waves
1890s-1900s, Marconi & Popov, long-distance radio telegraph
Across Atlantic Ocean
From Cornwall to Canada

1875, Bell invented the telephone


1906, radio broadcast
1918, Armstrong invented superheterodyne radio receiver
(and FM in 1933)
1921, land-mobile communication
15

Milestones (2)

1928, Nyquist proposed the sampling theorem


1947, microwave relay system
1948, information theory
1957, era of satellite communication began
1966, Charles Kuen Kao pioneered fiber-optic communications (Nobel Prize winner)
1970s, era of computer networks began
1981, analog cellular system
1988, digital cellular system debuted in Europe
2000, 3G network
The big 3 telecom manufacturers in 2010

16

Cellular Mobile Phone Network


A large area is partitioned into cells
Frequency reuse to maximize capacity

17

Growth of Mobile Communications


1G: analog communications
AMPS

2G: digital communications


GSM
IS-95

3G: CDMA networks


WCDMA
CDMA2000
TD-SCDMA
TD SCDMA

4G: data rate up to


1 Gbps (giga bits per second)
Pre-4G technologies:
WiMax, 3G LTE
18

Wi-Fi

Wi-Fi connects local computers (usually within 100 m range)

19

IEEE 802.11 Wi-Fi Standard


802.11b
Standard for 2.4 GHz (unlicensed) ISM band
1.6-10 Mbps, 500 ft range

802.11a
Standard for 5GHz band
20-70 Mbps, variable range
Similar to HiperLAN in Europe

802.11g
Standard in 2.4 GHz and 5 GHz bands
Speeds up to 54 Mbps, based on orthogonal frequency division
multiplexing (OFDM)

802.11n
Data rates up to 600 Mbps
Use multi-input multi-output (MIMO)
20

Satellite/Space Communication

Satellite communication
Cover very large areas
Optimized for one-way transmission
Radio (DAB) and movie (SatTV)
broadcasting

Two-way systems
The only choice for remote-area and
maritime communications
Propagation delay (0.25 s) is
uncomfortable in voice
communications

Space communication

Missions to Moon, Mars,


Long distance, weak signals
High-gain antennas
Powerful error-control coding

21

Future Wireless Networks


Ubiquitous Communication Among People and Devices
Wireless Internet access
Nth generation Cellular
Ad Hoc Networks
Sensor Networks
Wireless Entertainment
Smart Homes/Grids
Automated Highways
All this and more
Hard Delay Constraints
Hard Energy Constraints

22

Communication Networks
Today's communications networks are complicated systems
A large number of users sharing the medium
Hosts: devices that communicate with each other
Routers: route data through the network

23

Concept of Layering
Partitioned into layers, each doing a relatively simple task
Protocol stack

[Figure: protocol stacks — the OSI model, the TCP/IP protocol stack (Internet) with Application, Transport, Network, Link and Physical layers, and a 2-layer model]

Communication Systems mostly deals with the physical layer, but some techniques (e.g., coding) can also be applied to the network layer.

24

EE2-4: Communication Systems

Lecture 2: Probability and Random Processes
Dr. Cong Ling
Department of Electrical and Electronic Engineering

Outline
Probability

How probability is defined


cdf and pdf
Mean and variance
Joint distribution
Central limit theorem

Random processes
p
Definition
Stationary random processes
Power spectral density

References
Notes of Communication Systems, Chap. 2.3.
Haykin & Moher, Communication Systems, 5th ed., Chap. 5
Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap. 11
26

Why Probability/Random Process?


Probability is the core mathematical tool for communication theory.
The stochastic model is widely used in the study of
communication systems.
Consider a radio communication system where the received
signal is a random process in nature:

Message is random. No randomness, no information.


Interference is random.
Noise is a random process.
And many more (delay, phase, fading, ...)

Other real-world applications of probability and random


processes include
Stock market modelling, gambling (Brownian motion as shown in the previous slide, random walk)

Probabilistic Concepts
What is a random variable (RV)?
It is a variable that takes its values from the outputs of a random experiment.

What is a random experiment?


It is an experiment the outcome of which cannot be predicted
precisely.
All possible identifiable outcomes of a random experiment
constitute its sample space S.
An event is a collection of possible outcomes of the random experiment.

Example
For tossing a coin, S = { H, T }
For rolling a die, S = { 1, 2, ..., 6 }

28

Probability Properties
PX(xi): the probability of the random variable X taking on
the value xi
The probability of an event to happen is a non-negative
number, with the following properties:
The probability of the event that includes all possible outcomes of
the experiment is 1.
The probability of two events that do not have any common
outcome is the sum of the probabilities of the two events
separately.

Example
Roll a die:
P_X(x = k) = 1/6 for k = 1, 2, ..., 6

29

CDF and PDF

The (cumulative) distribution function (cdf) of a random variable X is defined as the probability of X taking a value less than the argument x:
F_X(x) = P(X ≤ x)
Properties
F_X(−∞) = 0,  F_X(∞) = 1
F_X(x_1) ≤ F_X(x_2) if x_1 ≤ x_2
The probability density function (pdf) is defined as the derivative of the distribution function:
f_X(x) = dF_X(x)/dx
F_X(x) = ∫_{−∞}^{x} f_X(y) dy
P(a < X ≤ b) = F_X(b) − F_X(a) = ∫_{a}^{b} f_X(y) dy
f_X(x) = dF_X(x)/dx ≥ 0, since F_X(x) is non-decreasing

Mean and Variance

If Δx is sufficiently small,
P(x < X ≤ x + Δx) = ∫_{x}^{x+Δx} f_X(y) dy ≈ f_X(x) Δx   (the area under f_X over the interval)
Mean (or expected value, DC level):
E[X] = μ_X = ∫ x f_X(x) dx    E[·]: expectation operator
Variance (≈ power for zero-mean signals):
σ_X² = E[(X − μ_X)²] = ∫ (x − μ_X)² f_X(x) dx = E[X²] − μ_X²

Normal (Gaussian) Distribution

f_X(x) = (1/(√(2π) σ)) exp(−(x − m)²/(2σ²))
F_X(x) = ∫_{−∞}^{x} (1/(√(2π) σ)) exp(−(y − m)²/(2σ²)) dy
E[X] = m,  σ_X² = σ²    σ: rms value

Uniform Distribution

f_X(x) = 1/(b − a) for a ≤ x ≤ b;  0 elsewhere
F_X(x) = 0 for x < a;  (x − a)/(b − a) for a ≤ x ≤ b;  1 for x > b
E[X] = (a + b)/2
σ_X² = (b − a)²/12

Joint Distribution
Joint distribution function for two random variables X and Y:
F_XY(x, y) = P(X ≤ x, Y ≤ y)
Joint probability density function:
f_XY(x, y) = ∂²F_XY(x, y)/(∂x ∂y)
Properties
1) F_XY(∞, ∞) = ∫∫ f_XY(u, v) du dv = 1
2) f_X(x) = ∫ f_XY(x, y) dy
3) f_Y(y) = ∫ f_XY(x, y) dx
4) X, Y are independent ⇔ f_XY(x, y) = f_X(x) f_Y(y)
5) X, Y are uncorrelated ⇔ E[XY] = E[X] E[Y]

Independent vs. Uncorrelated

Independent implies uncorrelated (see problem sheet)
Uncorrelated does not imply independence
For normal RVs (jointly Gaussian), uncorrelated implies independent (this is the only exceptional case!)
An example of uncorrelated but dependent RVs
Let Θ be uniformly distributed in [0, 2π]:
f_Θ(x) = 1/(2π) for 0 ≤ x ≤ 2π
Define RVs X and Y as
X = cos Θ,  Y = sin Θ
Clearly, X and Y are not independent (the locus of (X, Y) is the unit circle).
But X and Y are uncorrelated:
E[XY] = (1/(2π)) ∫_{0}^{2π} cos θ sin θ dθ = 0

Joint Distribution of n RVs

Joint cdf:
F_{X1 X2 ... Xn}(x_1, x_2, ..., x_n) = P(X_1 ≤ x_1, X_2 ≤ x_2, ..., X_n ≤ x_n)
Joint pdf:
f_{X1 X2 ... Xn}(x_1, x_2, ..., x_n) = ∂^n F_{X1 X2 ... Xn}(x_1, x_2, ..., x_n) / (∂x_1 ∂x_2 ... ∂x_n)
Independent:
F_{X1 X2 ... Xn}(x_1, x_2, ..., x_n) = F_{X1}(x_1) F_{X2}(x_2) ... F_{Xn}(x_n)
f_{X1 X2 ... Xn}(x_1, x_2, ..., x_n) = f_{X1}(x_1) f_{X2}(x_2) ... f_{Xn}(x_n)
i.i.d. (independent, identically distributed):
The random variables are independent and have the same distribution.
Example: outcomes from repeatedly flipping a coin.

Central Limit Theorem

For i.i.d. random variables, the sum
z = x_1 + x_2 + ... + x_n
tends to Gaussian as n goes to infinity.
Extremely useful in communications.
That's why noise is usually Gaussian. We often say "Gaussian noise" or "Gaussian channel" in communications.
[Figure: pdfs of x_1, x_1 + x_2, x_1 + x_2 + x_3, x_1 + x_2 + x_3 + x_4, illustrating convergence to the Gaussian distribution]
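A small simulation makes the same point (an illustrative sketch, not from the slides; the uniform source and sample sizes are arbitrary choices): the excess kurtosis of the sum, which is 0 for a Gaussian, shrinks as n grows.

```python
import numpy as np

rng = np.random.default_rng(0)
for n in (1, 2, 4, 16):
    # Sum n i.i.d. uniform(-1, 1) samples; each term has variance 1/3.
    z = rng.uniform(-1, 1, size=(100_000, n)).sum(axis=1)
    kurt = np.mean((z - z.mean())**4) / z.var()**2 - 3    # 0 for a Gaussian
    print(f"n={n:2d}: var={z.var():.3f} (theory {n/3:.3f}), excess kurtosis={kurt:+.3f}")
```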

What is a Random Process?


A random process is a time-varying function that assigns
the outcome of a random experiment to each time instant:
X(t).
For a fixed outcome (sample path): a random process is a time-varying function, e.g., a signal.
For fixed t: a random process is a random variable.
If one scans all possible outcomes of the underlying
random experiment, we shall get an ensemble of signals.
Noise can often be modelled as a Gaussian random
process.

38

An Ensemble of Signals

39

Statistics of a Random Process

For fixed t: the random process becomes a random variable, with mean
μ_X(t) = E[X(t)] = ∫ x f_X(x; t) dx
In general, the mean is a function of t.
Autocorrelation function:
R_X(t_1, t_2) = E[X(t_1) X(t_2)] = ∫∫ xy f_X(x, y; t_1, t_2) dx dy
In general, the autocorrelation function is a two-variable function.
It measures the correlation between two samples.

Stationary Random Processes

A random process is (wide-sense) stationary if
Its mean does not depend on t:
μ_X(t) = μ_X
Its autocorrelation function only depends on the time difference:
R_X(t, t + τ) = R_X(τ)
In communications, noise and message signals can often be modelled as stationary random processes.

Example
Show that the sinusoidal wave with random phase
X(t) = A cos(ω_c t + Θ)
with phase Θ uniformly distributed on [0, 2π] is stationary.
f_Θ(θ) = 1/(2π), θ ∈ [0, 2π]
Mean is a constant:
μ_X(t) = E[X(t)] = ∫_{0}^{2π} A cos(ω_c t + θ) (1/(2π)) dθ = 0
Autocorrelation function only depends on the time difference:
R_X(t, t + τ) = E[X(t) X(t + τ)]
= E[A² cos(ω_c t + Θ) cos(ω_c t + ω_c τ + Θ)]
= (A²/2) E[cos(2ω_c t + ω_c τ + 2Θ)] + (A²/2) E[cos(ω_c τ)]
= (A²/2) ∫_{0}^{2π} cos(2ω_c t + ω_c τ + 2θ) (1/(2π)) dθ + (A²/2) cos(ω_c τ)
= (A²/2) cos(ω_c τ)
R_X(τ) = (A²/2) cos(ω_c τ)
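This result is easy to check numerically. The sketch below (not from the slides; the amplitude, frequency, time instant and lag are arbitrary assumptions) averages over an ensemble of random phases and compares with (A²/2) cos(ω_c τ).

```python
import numpy as np

rng = np.random.default_rng(1)
A, wc = 2.0, 2 * np.pi * 5.0                 # assumed amplitude and angular frequency
t, tau = 0.13, 0.02                          # an arbitrary time instant and lag
theta = rng.uniform(0, 2 * np.pi, 200_000)   # ensemble of random phases
x1 = A * np.cos(wc * t + theta)
x2 = A * np.cos(wc * (t + tau) + theta)
print("ensemble average of X(t)X(t+tau):", np.mean(x1 * x2))
print("theory (A^2/2) cos(wc*tau)      :", A**2 / 2 * np.cos(wc * tau))
```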

Power Spectral Density

Power spectral density (PSD) is a function that measures the distribution of power of a random process with frequency.
PSD is only defined for stationary processes.
Wiener-Khinchine relation: the PSD is equal to the Fourier transform of the autocorrelation function:
S_X(f) = ∫ R_X(τ) e^{−j2πfτ} dτ
A similar relation exists for deterministic signals.
Then the average power can be found as
P = E[X²(t)] = R_X(0) = ∫ S_X(f) df
The frequency content of a process depends on how rapidly the amplitude changes as a function of time.
This can be measured by the autocorrelation function.

Passing Through a Linear System

Let Y(t) be obtained by passing a random process X(t) through a linear system of transfer function H(f). Then the PSD of Y(t) is
S_Y(f) = |H(f)|² S_X(f)   (2.1)
Proof: see Notes 3.4.2.
Cf. the similar relation for deterministic signals.
If X(t) is a Gaussian process, then Y(t) is also a Gaussian process.
Gaussian processes are very important in communications.
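A rough numerical check of (2.1) (an illustrative sketch, not the proof in the notes; the sample rate, noise level and moving-average filter are arbitrary assumptions): the output power of filtered white noise should equal (N0/2) ∫ |H(f)|² df.

```python
import numpy as np

rng = np.random.default_rng(2)
fs, N0 = 1000.0, 2.0                                 # assumed sample rate and noise level
n = rng.normal(0, np.sqrt(N0 / 2 * fs), 1_000_000)   # discrete-time white noise, PSD N0/2
h = np.ones(10) / 10                                 # simple moving-average filter H(f)
y = np.convolve(n, h, mode="same")

H = np.fft.rfft(h, 4096)                             # samples of H(f) on [0, fs/2]
f = np.fft.rfftfreq(4096, 1 / fs)
df = f[1] - f[0]
predicted = N0 / 2 * 2 * np.sum(np.abs(H)**2) * df   # factor 2: negative frequencies too
print("measured output power :", y.var())
print("N0/2 * integral |H|^2 :", predicted)
```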

EE2-4: Communication Systems

Lecture 3: Noise
Dr. Cong Ling
Department of Electrical and Electronic Engineering

Outline

What is noise?
White noise and Gaussian noise
Lowpass noise
Bandpass noise
In-phase/quadrature representation
Phasor representation
p

References
Notes of Communication Systems, Chap. 2.
Haykin & Moher, Communication Systems, 5th ed., Chap. 5
Lathi, Modern Digital and Analog Communication Systems, 3rd ed.,
Chap. 11

46

Noise
Noise refers to unwanted waves, beyond our control, that disturb the transmission of signals.
Where does noise come from?
External sources: e.g., atmospheric, galactic noise, interference;
Internal sources: generated by communication devices themselves.
This type of noise represents a basic limitation on the performance of electronic communication systems.
Shot noise: the electrons are discrete and are not moving in a
continuous steady flow, so the current is randomly fluctuating.
Thermal noise: caused by the rapid and random motion of electrons within a conductor due to thermal agitation.
Both are often stationary and have a zero-mean Gaussian distribution (following from the central limit theorem).

47

White Noise
The additive noise channel:
n(t) models all types of noise; zero mean
White noise
Its power spectral density (PSD) is constant over all frequencies, i.e.,
S_N(f) = N_0/2,  for all f
The factor 1/2 is included to indicate that half the power is associated with positive frequencies and half with negative.
The term white is analogous to white light, which contains equal amounts of all frequencies (within the visible band of EM waves).
It is only defined for stationary noise.
An infinite bandwidth is a purely theoretical assumption.

White vs. Gaussian Noise


White noise
Autocorrelation function of n(t): R_n(τ) = (N_0/2) δ(τ)
Samples at different time instants are uncorrelated.
[Figure: PSD S_N(f), autocorrelation R_n(τ), and PDF of the noise]
Gaussian noise: the distribution at any time instant is Gaussian.
Gaussian noise can be colored.
White noise ≠ Gaussian noise:
White noise can be non-Gaussian.
Nonetheless, in communications, noise is typically additive white Gaussian noise (AWGN).

Ideal Low-Pass White Noise
Suppose white noise is applied to an ideal low-pass filter of bandwidth B such that
S_N(f) = N_0/2 for |f| < B;  0 otherwise
Power P_N = N_0 B
By the Wiener-Khinchine relation, the autocorrelation function is
R_n(τ) = E[n(t) n(t+τ)] = N_0 B sinc(2Bτ)   (3.1)
where sinc(x) = sin(πx)/(πx).
Samples at the Nyquist frequency 2B are uncorrelated:
R_n(τ) = 0 for τ = k/(2B), k = 1, 2, ...
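The sketch below (an added illustration; N_0 and B are arbitrary assumptions) evaluates (3.1) and confirms that the autocorrelation vanishes at multiples of the Nyquist spacing 1/(2B).

```python
import numpy as np

N0, B = 2.0, 1000.0                   # assumed noise level and filter bandwidth

def Rn(tau):
    # np.sinc is the normalized sinc, sin(pi x)/(pi x), matching (3.1)
    return N0 * B * np.sinc(2 * B * tau)

print("power R_n(0):", Rn(0.0))                                      # N0 * B
print("at k/(2B)   :", [round(Rn(k / (2 * B)), 12) for k in (1, 2, 3)])
```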

Bandpass Noise

Any communication system that uses carrier modulation will typically


have a bandpass filter of bandwidth B at the front-end of the receiver.

n(t)

Any noise that enters the receiver will therefore be bandpass in nature: its spectral magnitude is non-zero only for some band concentrated around the carrier frequency f_c (sometimes called narrowband noise).

51

Example
If white noise with PSD of N_0/2 is passed through an ideal bandpass filter, then the PSD of the noise that enters the receiver is given by
S_N(f) = N_0/2 for |f ± f_c| < B;  0 otherwise
Power P_N = 2 N_0 B
Autocorrelation function:
R_n(τ) = 2 N_0 B sinc(2Bτ) cos(2π f_c τ)
which follows from (3.1) by applying the frequency-shift property of the Fourier transform:
g(t) ⇌ G(ω)  ⟹  g(t) 2cos(ω_0 t) ⇌ G(ω − ω_0) + G(ω + ω_0)
Samples taken at frequency 2B are still uncorrelated:
R_n(τ) = 0, τ = k/(2B), k = 1, 2, ...

Decomposition of Bandpass Noise


Consider bandpass noise within |f ± f_c| < B with any PSD (i.e., not necessarily white as in the previous example).
Consider a frequency slice Δf at frequencies f_k and −f_k.
For Δf small:
n_k(t) = a_k cos(2π f_k t + θ_k)
θ_k: a random phase assumed independent and uniformly distributed in the range [0, 2π)
a_k: a random amplitude.
[Figure: a frequency slice Δf at ±f_k within the bandpass spectrum]

Representation of Bandpass Noise


The complete bandpass noise waveform n(t) can be constructed by summing up such sinusoids over the entire band, i.e.,
n(t) = Σ_k n_k(t) = Σ_k a_k cos(2π f_k t + θ_k),  f_k = f_c + k Δf   (3.2)
Now, let f_k = (f_k − f_c) + f_c, and using cos(A + B) = cosA cosB − sinA sinB we obtain the canonical form of bandpass noise
n(t) = n_c(t) cos(2π f_c t) − n_s(t) sin(2π f_c t)
where
n_c(t) = Σ_k a_k cos(2π(f_k − f_c)t + θ_k)   (3.3)
n_s(t) = Σ_k a_k sin(2π(f_k − f_c)t + θ_k)
n_c(t) and n_s(t) are baseband signals, termed the in-phase and quadrature component, respectively.

Extraction and Generation

n_c(t) and n_s(t) are fully representative of bandpass noise.
(a) Given bandpass noise, one may extract its in-phase and quadrature components (using LPFs of bandwidth B). This is extremely useful in the analysis of noise in communication receivers.
(b) Given the two components, one may generate bandpass noise. This is useful in computer simulation.
[Figure: block diagrams for (a) extracting and (b) generating n_c(t) and n_s(t)]
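A simulation sketch of both directions (an added illustration, not the block diagram itself; the sample rate, carrier, bandwidth and Butterworth low-pass filters are arbitrary choices, and scipy is assumed available): generate bandpass noise from two baseband processes, then extract the components again by mixing and low-pass filtering.

```python
import numpy as np
from scipy.signal import butter, filtfilt

rng = np.random.default_rng(3)
fs, fc, B = 100_000.0, 10_000.0, 1_000.0
t = np.arange(200_000) / fs
b, a = butter(4, B / (fs / 2))                    # low-pass filter of bandwidth B

# (b) Generation: two independent low-pass processes modulate the quadrature carriers.
nc = filtfilt(b, a, rng.normal(size=t.size))
ns = filtfilt(b, a, rng.normal(size=t.size))
n = nc * np.cos(2 * np.pi * fc * t) - ns * np.sin(2 * np.pi * fc * t)

# (a) Extraction: mix down and low-pass filter (the 2*fc terms are rejected).
nc_hat = filtfilt(b, a, 2 * n * np.cos(2 * np.pi * fc * t))
ns_hat = filtfilt(b, a, -2 * n * np.sin(2 * np.pi * fc * t))
print("correlation(nc, nc_hat):", np.corrcoef(nc, nc_hat)[0, 1])   # close to 1
print("correlation(ns, ns_hat):", np.corrcoef(ns, ns_hat)[0, 1])   # close to 1
```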

Properties of Baseband Noise


If the noise n(t) has zero mean, then nc(t) and ns(t) have
zero mean.
If the noise n(t) is Gaussian, then nc(t) and ns(t) are
Gaussian.
If the noise n(t) is stationary, then nc(t) and ns(t) are
stationary.
If the noise n(t) is Gaussian and its power spectral density S_N(f) is symmetric with respect to the central frequency f_c, then n_c(t) and n_s(t) are statistically independent.
The components n_c(t) and n_s(t) have the same variance (= power) as n(t).

56

Power Spectral Density


Further, each baseband noise waveform will have the same PSD:
S_c(f) = S_s(f) = S_N(f − f_c) + S_N(f + f_c) for |f| < B;  0 otherwise   (3.4)
This is analogous to
g(t) ⇌ G(ω)  ⟹  g(t) 2cos(ω_0 t) ⇌ G(ω − ω_0) + G(ω + ω_0)
A rigorous proof can be found in A. Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill.
The PSD can also be seen from expressions (3.2) and (3.3), where each of n_c(t) and n_s(t) consists of a sum of closely spaced baseband sinusoids.

Noise Power
For ideally filtered narrowband noise, the PSD of n_c(t) and n_s(t) is therefore given by
S_c(f) = S_s(f) = N_0 for |f| < B;  0 otherwise   (3.5)
Corollary: the average power in each of the baseband waveforms n_c(t) and n_s(t) is identical to the average power in the bandpass noise waveform n(t).
For ideally filtered narrowband noise, the variance of n_c(t) and n_s(t) is 2 N_0 B each:
P_Nc = P_Ns = 2 N_0 B

Phasor Representation
We may write bandpass noise in the alternative form:
n(t) = n_c(t) cos(2π f_c t) − n_s(t) sin(2π f_c t) = r(t) cos[2π f_c t + φ(t)]
r(t) = sqrt( n_c(t)² + n_s(t)² ): the envelope of the noise
φ(t) = tan⁻¹[ n_s(t)/n_c(t) ]: the phase of the noise
[Figure: phasor of length r(t) at angle 2π f_c t + φ(t)]

Distribution of Envelop and Phase


It can be shown that if n_c(t) and n_s(t) are Gaussian-distributed, then the magnitude r(t) has a Rayleigh distribution, and the phase φ(t) is uniformly distributed.
What if a sinusoid A cos(2π f_c t) is mixed with noise? Then the magnitude will have a Rice distribution.
The proof is deferred to Lecture 11, where such distributions arise in demodulation of digital signals.

60

Summary
White noise: PSD is constant over an infinite bandwidth.
Gaussian noise: PDF is Gaussian.
Bandpass noise
In-phase and quadrature components n_c(t) and n_s(t) are low-pass random processes.
n_c(t) and n_s(t) have the same PSD.
n_c(t) and n_s(t) have the same variance as the band-pass noise n(t).
Such properties will be pivotal to the performance analysis of bandpass communication systems.
The in-phase/quadrature representation and phasor representation are not only basic to the characterization of bandpass noise itself, but also to the analysis of bandpass communication systems.

EE2-4: Communication Systems

Lecture 4: Noise Performance of DSB
Dr. Cong Ling
Department of Electrical and Electronic Engineering

Outline
SNR of baseband analog transmission
Revision of AM
SNR of DSB-SC
References
Notes of Communication Systems, Chap. 3.1-3.3.2.
Haykin & Moher, Communication Systems, 5th ed., Chap. 6
Lathi, Modern Digital and Analog Communication Systems, 3rd ed.,
Chap. 12

63

Noise in Analog Communication Systems


How do various analog modulation schemes perform in
the presence of noise?
Which scheme performs best?
How can we measure its performance?

Model of an analog communication system

Noise PSD: BT is the bandwidth,


N0/2 is the double-sided noise PSD
64

SNR
We must find a way to quantify (= to measure) the performance of a modulation scheme.
We use the signal-to-noise ratio (SNR) at the output of the receiver:
SNR_o = (average power of message signal at the receiver output) / (average power of noise at the receiver output) = P_S / P_N
Normally expressed in decibels (dB):
SNR (dB) = 10 log10(SNR)
This is to manage the wide range of power levels in communication systems.
The dB is in honour of Alexander Bell.
Example:
If x is power, X (dB) = 10 log10(x)
If x is amplitude, X (dB) = 20 log10(x)
A ratio of 2 ⟷ 3 dB; 4 ⟷ 6 dB; 10 ⟷ 10 dB

Transmitted Power
PT: The transmitted power
Limited by: equipment capability, battery life, cost, government restrictions, interference with other channels, green communications, etc.
The higher it is, the more the received power (PS), the
higher the SNR
For a fair comparison between different modulation
schemes:
PT should be the same for all

We use the baseband signal to noise ratio SNRbaseband to


calibrate the SNR values we obtain

66

A Baseband Communication System


It does not use modulation.
It is suitable for transmission over wires.
The power it transmits is identical to the message power: P_T = P
No attenuation: P_S = P_T = P
The results can be extended to band-pass systems.

67

Output SNR

Average signal (= message) power:
P = the area under the triangular curve (the message PSD)
Assume: additive, white noise with power spectral density PSD = N_0/2
Average noise power at the receiver:
P_N = area under the straight line = 2W × N_0/2 = N_0 W
SNR at the receiver output:
SNR_baseband = P_T / (N_0 W)
Note: assume no propagation loss.
Improve the SNR by:
increasing the transmitted power (P_T ↑),
restricting the message bandwidth (W ↓),
making the channel/receiver less noisy (N_0 ↓).

Revision: AM
General form of an AM signal:
s(t)_AM = [A + m(t)] cos(2π f_c t)
A: the amplitude of the carrier
f_c: the carrier frequency
m(t): the message signal
Modulation index:
μ = m_p / A
m_p: the peak amplitude of m(t), i.e., m_p = max |m(t)|

Signal Recovery

Receiver model: bandpass filter, followed by the detector, with additive noise n(t).
1) μ ≤ 1 (A ≥ m_p): use an envelope detector.
This is the case in almost all commercial AM radio receivers.
Simple circuit, to make radio receivers cheap.
2) Otherwise: use synchronous detection = product detection = coherent detection.
The terms detection and demodulation are used interchangeably.

Synchronous Detection for AM

Multiply the waveform at the receiver with a local carrier of the same frequency (and phase) as the carrier used at the transmitter:
2cos(2π f_c t) s(t)_AM = [A + m(t)] 2cos²(2π f_c t) = [A + m(t)][1 + cos(4π f_c t)]
Use a LPF to recover A + m(t), and finally m(t).
Remark: at the receiver you need a signal perfectly synchronized with the transmitted carrier.

DSB-SC
Double-sideband suppressed carrier (DSB-SC):
s(t)_DSB-SC = A m(t) cos(2π f_c t)
Signal recovery: with synchronous detection only.
The received noisy signal is
x(t) = s(t) + n(t)
= s(t) + n_c(t) cos(2π f_c t) − n_s(t) sin(2π f_c t)
= A m(t) cos(2π f_c t) + n_c(t) cos(2π f_c t) − n_s(t) sin(2π f_c t)
= [A m(t) + n_c(t)] cos(2π f_c t) − n_s(t) sin(2π f_c t)

Synchronous Detection for DSB-SC

Multiply with 2cos(2π f_c t):
y(t) = 2cos(2π f_c t) x(t)
= A m(t) 2cos²(2π f_c t) + n_c(t) 2cos²(2π f_c t) − n_s(t) 2 sin(2π f_c t) cos(2π f_c t)
= [A m(t) + n_c(t)][1 + cos(4π f_c t)] − n_s(t) sin(4π f_c t)
Use a LPF to keep
ỹ = A m(t) + n_c(t)
Signal power at the receiver output:
P_S = E{A² m²(t)} = A² E{m²(t)} = A² P
Power of the noise n_c(t) (recall (3.5), and message bandwidth W):
P_N = ∫_{−W}^{W} N_0 df = 2 N_0 W

Comparison

SNR at the receiver output:
SNR_o = A² P / (2 N_0 W)
To which transmitted power does this correspond?
P_T = E{A² m²(t) cos²(2π f_c t)} = A² P / 2
So
SNR_o = SNR_DSB-SC = P_T / (N_0 W)
Comparison with
SNR_baseband = P_T / (N_0 W)
gives
SNR_DSB-SC = SNR_baseband
Conclusion: a DSB-SC system has the same SNR performance as a baseband system.

EE2-4: Communication Systems

Lecture 5: Noise Performance of SSB and AM
Dr. Cong Ling
Department of Electrical and Electronic Engineering

Outline
Noise in SSB
Noise in standard AM
Coherent detection
(of theoretic interest only)
Envelope detection

References
Notes of Communication Systems, Chap. 3.3.3-3.3.4.
Haykin & Moher, Communication Systems, 5th ed., Chap. 6
Lathi, Modern Digital and Analog Communication Systems, 3rd ed.,
Chap. 12
76

SSB Modulation
Consider single (lower) sideband AM:
s(t)_SSB = (A/2) m(t) cos(2π f_c t) + (A/2) m̂(t) sin(2π f_c t)
where m̂(t) is the Hilbert transform of m(t).
m̂(t) is obtained by passing m(t) through a linear filter with transfer function −j sgn(f).
m̂(t) and m(t) have the same power P.
The average power is A² P / 4.

Noise in SSB
Received signal x(t) = s(t) + n(t).
Apply a band-pass filter on the lower sideband.
Still denote by n_c(t) the lower-sideband noise (different from the double-sideband noise in DSB).
Using coherent detection:
y(t) = x(t) 2cos(2π f_c t)
= (A/2) m(t) + n_c(t) + [(A/2) m(t) + n_c(t)] cos(4π f_c t) + [(A/2) m̂(t) − n_s(t)] sin(4π f_c t)
After low-pass filtering,
ỹ(t) = (A/2) m(t) + n_c(t)

Noise Power
Noise power for n_c(t) = that for the band-pass noise = N_0 W (halved compared to DSB) (recall (3.4))
[Figure: PSD S_N(f) of the lower-sideband bandpass noise (height N_0/2 over f_c − W < |f| < f_c) and of the resulting baseband noise S_Nc(f) (height N_0/2 over |f| < W)]

Output SNR
Signal power: A² P / 4
SNR at output:
SNR_SSB = A² P / (4 N_0 W)
For a baseband system with the same transmitted power A² P / 4:
SNR_baseband = A² P / (4 N_0 W)
Conclusion: SSB achieves the same SNR performance as DSB-SC (and the baseband model) but only requires half the bandwidth.

Standard AM: Synchronous Detection

Pre-detection signal:
x(t) = [A + m(t)] cos(2π f_c t) + n(t)
= [A + m(t)] cos(2π f_c t) + n_c(t) cos(2π f_c t) − n_s(t) sin(2π f_c t)
= [A + m(t) + n_c(t)] cos(2π f_c t) − n_s(t) sin(2π f_c t)
Multiply with 2cos(2π f_c t):
y(t) = [A + m(t) + n_c(t)][1 + cos(4π f_c t)] − n_s(t) sin(4π f_c t)
LPF:
ỹ = A + m(t) + n_c(t)

Output SNR
Signal power at the receiver output:
P_S = E{m²(t)} = P
Noise power:
P_N = 2 N_0 W
SNR at the receiver output:
SNR_o = SNR_AM = P / (2 N_0 W)
Transmitted power:
P_T = (A² + P) / 2

Comparison
SNR of a baseband signal with the same transmitted power:
SNR_baseband = (A² + P) / (2 N_0 W)
Thus:
SNR_AM = P / (A² + P) × SNR_baseband
Note: P / (A² + P) < 1
Conclusion: the performance of standard AM with synchronous recovery is worse than that of a baseband system.

Model of AM Radio Receiver

AM radio receiver of the superheterodyne type

Model of AM envelope detector


84

Envelope Detection for Standard AM

Phasor diagram of the signals present at an AM receiver.
y(t) = envelope of x(t) = sqrt{ [A + m(t) + n_c(t)]² + n_s(t)² }
The equation is too complicated.
We must use limiting cases to put it in a form where noise and message are added.

Small Noise Case

1st approximation: (a) small noise case
n(t) << [A + m(t)]
Then
n_s(t) << [A + m(t) + n_c(t)]
Thus
y(t) ≈ A + m(t) + n_c(t)
SNR_o = SNR_env = P / (2 N_0 W)
Identical to the post-detection signal in the case of synchronous detection!
And in terms of baseband SNR:
SNR_env = P / (A² + P) × SNR_baseband
Valid for small noise only!

Large Noise Case

2nd approximation: (b) large noise case
n(t) >> [A + m(t)]
Isolate the small quantity:
y²(t) = [A + m(t) + n_c(t)]² + n_s²(t)
= (A + m(t))² + n_c²(t) + 2(A + m(t)) n_c(t) + n_s²(t)
≈ [n_c²(t) + n_s²(t)] { 1 + 2[A + m(t)] n_c(t) / [n_c²(t) + n_s²(t)] }
y(t) ≈ E_n(t) sqrt{ 1 + 2[A + m(t)] n_c(t) / E_n²(t) }
where E_n(t) = sqrt{ n_c²(t) + n_s²(t) }.

Large Noise Case: Threshold Effect

From the phasor diagram: n_c(t) = E_n(t) cos θ_n(t)
Then:
y(t) ≈ E_n(t) sqrt{ 1 + 2[A + m(t)] cos θ_n(t) / E_n(t) }
Use sqrt(1 + x) ≈ 1 + x/2 for x << 1:
y(t) ≈ E_n(t) { 1 + [A + m(t)] cos θ_n(t) / E_n(t) }
= E_n(t) + [A + m(t)] cos θ_n(t)
Noise is multiplicative here!
No term proportional to the message!
Result: a threshold effect; below some carrier power level (very low A), the performance of the detector deteriorates very rapidly.

Summary

A: carrier amplitude, P: power of message signal, N_0: single-sided PSD of noise, W: message bandwidth.

EE2-4: Communication Systems

Lecture 6: Noise Performance of FM
Dr. Cong Ling
Department of Electrical and Electronic Engineering

Outline
Recap of FM
FM system model in noise
PSD of noise
References
Notes of Communication Systems, Chap. 3.4.1-3.4.2.
Haykin & Moher, Communication Systems, 5th ed., Chap. 6
Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap. 12

91

Frequency Modulation
Fundamental difference between AM and FM:
AM: message information contained in the signal
amplitude Additive noise: corrupts directly the
modulated signal.
FM: message information contained in the signal
frequency the effect of noise on an FM signal is
determined by the extent to which it changes the
frequency of the modulated signal.
Consequently, FM signals are less affected by noise than AM signals.

Revision: FM
A carrier waveform
s(t) = A cos[θ_i(t)]
where θ_i(t) is the instantaneous phase angle.
When s(t) = A cos(2π f t), θ_i(t) = 2π f t, and we may say that
f = (1/(2π)) dθ/dt
Generalisation: instantaneous frequency:
f_i(t) = (1/(2π)) dθ_i(t)/dt

FM
In FM: the instantaneous frequency of the carrier varies linearly with the message:
f_i(t) = f_c + k_f m(t)
where k_f is the frequency sensitivity of the modulator.
Hence (assuming θ_i(0) = 0):
θ_i(t) = 2π ∫_0^t f_i(τ) dτ = 2π f_c t + 2π k_f ∫_0^t m(τ) dτ
Modulated signal:
s(t) = A cos[ 2π f_c t + 2π k_f ∫_0^t m(τ) dτ ]
Note:
(a) The envelope is constant.
(b) The signal s(t) is a non-linear function of the message signal m(t).

Bandwidth of FM
m_p = max|m(t)|: peak message amplitude.
f_c − k_f m_p < instantaneous frequency < f_c + k_f m_p
Define the frequency deviation = the deviation of the instantaneous frequency from the carrier frequency:
Δf = k_f m_p
Define the deviation ratio:
β = Δf / W
W: the message bandwidth.
Small β: FM bandwidth ≈ 2× message bandwidth (narrow-band FM)
Large β: FM bandwidth >> 2× message bandwidth (wide-band FM)
Carson's rule of thumb:
B_T = 2W(β + 1) = 2(Δf + W)
β << 1: B_T ≈ 2W (as in AM)
β >> 1: B_T ≈ 2Δf

FM Receiver

Bandpass filter: removes any signals outside the band f_c ± B_T/2, so the predetection noise at the receiver is bandpass with a bandwidth of B_T.
The FM signal has a constant envelope, so use a limiter to remove any amplitude variations.
Discriminator: a device with output proportional to the deviation in the instantaneous frequency; it recovers the message signal.
Final baseband low-pass filter: has a bandwidth of W; it passes the message signal and removes out-of-band noise.

Linear Argument at High SNR

FM is nonlinear (modulation & demodulation), meaning superposition doesn't hold.
Nonetheless, it can be shown that for high SNR, noise output and message signal are approximately independent of each other:
Output ≈ Message + Noise (i.e., no other nonlinear terms).
Any (smooth) nonlinear system is locally linear!
This can be justified rigorously by applying a Taylor series expansion.
Noise does not affect the power of the message signal at the output, and vice versa.
We can compute the signal power for the case without noise, and accept that the result holds for the case with noise too.
We can compute the noise power for the case without message, and accept that the result holds for the case with message too.

Output Signal Power Without Noise

Instantaneous frequency of the input signal:
f_i = f_c + k_f m(t)
Output of discriminator: k_f m(t)
So, output signal power:
P_S = k_f² P
P: the average power of the message signal

Output Signal with Noise

In the presence of additive noise, the real predetection signal is
x(t) = A cos[ 2π f_c t + 2π k_f ∫_0^t m(τ) dτ ] + n_c(t) cos(2π f_c t) − n_s(t) sin(2π f_c t)
It can be shown (by the linear argument again) that for high SNR, the noise output is approximately independent of the message signal.
In order to calculate the power of the output noise, we may assume there is no message, i.e., we only have the carrier plus noise present:
x̃(t) = A cos(2π f_c t) + n_c(t) cos(2π f_c t) − n_s(t) sin(2π f_c t)

Phase Noise
Phasor diagram of the FM carrier and noise signals: carrier A plus in-phase noise n_c(t) and quadrature noise n_s(t).
Instantaneous phase noise:
θ_i(t) = tan⁻¹[ n_s(t) / (A + n_c(t)) ]
For large carrier power (large A):
θ_i(t) ≈ tan⁻¹[ n_s(t)/A ] ≈ n_s(t)/A
Discriminator output = instantaneous frequency:
f_i(t) = (1/(2π)) dθ_i(t)/dt = (1/(2πA)) dn_s(t)/dt

Discriminator Output
The discriminator output in the presence of both signal and noise:
k_f m(t) + (1/(2πA)) dn_s(t)/dt
What is the PSD of n_d(t) = dn_s(t)/dt?
Fourier theory:
if x(t) ⇌ X(f), then dx(t)/dt ⇌ j2πf X(f)
Differentiation with respect to time = passing the signal through a system with transfer function H(f) = j2πf

Noise PSD
It follows from (2.1) that
S_o(f) = |H(f)|² S_i(f)
S_i(f): PSD of input signal
S_o(f): PSD of output signal
H(f): transfer function of the system
PSD of n_d(t) = |j2πf|² × PSD of n_s(t)
Then, since the PSD of n_s(t) is N_0 within the band |f| ≤ B_T/2:
PSD of n_d(t) = |2πf|² N_0
PSD of f_i(t) = (1/(2πA))² |2πf|² N_0 = (f²/A²) N_0
After the LPF, the PSD of the noise output n_o(t) is restricted to the band |f| ≤ W:
S_No(f) = (f²/A²) N_0,  |f| ≤ W   (6.1)

Power Spectral Densities



(a) Power spectral density of quadrature component ns(t) of narrowband noise n(t).
(b) Power spectral density of noise nd(t) at the discriminator output.
(c) Power spectral density of noise no(t) at the receiver output.

103

EE2-4: Communication Systems

Lecture 7: Pre/de-emphasis for FM and Comparison of Analog Systems
Dr. Cong Ling
Department of Electrical and Electronic Engineering

Outline
Derivation of FM output SNR
Pre/de-emphasis to improve SNR
Comparison with AM
References
Notes of Communication Systems, Chap. 3.4.2-3.5.
Haykin & Moher, Communication Systems, 5th ed., Chap. 6
Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap. 12

105

Noise Power
Average noise power at the receiver output:
P_N = ∫_{−W}^{W} S_No(f) df
Thus, from (6.1):
P_N = ∫_{−W}^{W} (f²/A²) N_0 df = 2 N_0 W³ / (3A²)   (7.1)
Average noise power at the output of an FM receiver ∝ 1 / (carrier power A²).
Increasing the carrier power quiets the noise: this is called the quieting effect.

Output SNR
Since P_S = k_f² P, the output SNR is
SNR_o = P_S / P_N = 3 A² k_f² P / (2 N_0 W³) = SNR_FM
Transmitted power of an FM waveform:
P_T = A²/2
From SNR_baseband = P_T / (N_0 W) and β = k_f m_p / W:
SNR_FM = (3 k_f² P / W²) SNR_baseband = 3 β² (P / m_p²) SNR_baseband
∝ β² SNR_baseband (could be much higher than AM)
Valid when the carrier power is large compared with the noise power.

Threshold Effect
The FM detector exhibits a more pronounced threshold effect than the AM envelope detector.
The threshold point occurs around where the signal power is 10 times the noise power:
A² / (2 N_0 B_T) ≈ 10,  B_T = 2W(β + 1)
Below the threshold the FM receiver breaks down (i.e., performance is significantly deteriorated).
Can be analyzed by examining the phasor diagram of the carrier A plus noise components n_c(t), n_s(t).

Qualitative Discussion
As the noise changes randomly, the point P1 wanders around P2.
High SNR: the change of angle is small.
Low SNR: P1 occasionally sweeps around the origin, resulting in changes of 2π in a short time.
[Figure: impulse-like components in dθ(t)/dt produced by changes of 2π in θ(t); (a) and (b) are graphs of θ(t) and dθ(t)/dt, respectively.]

Improve Output SNR

PSD of the noise at the detector output ∝ square of frequency.
PSD of a typical message typically rolls off at around 6 dB per decade.
To increase SNR_FM:
Use a LPF to cut off high frequencies at the output.
The message is attenuated too; not very satisfactory.
Use pre-emphasis and de-emphasis:
The message is unchanged.
High-frequency components of noise are suppressed.

Pre-emphasis and De-emphasis

H_pe(f): used to artificially emphasize the high-frequency components of the message prior to modulation, and hence, before noise is introduced.
H_de(f): used to de-emphasize the high-frequency components at the receiver, and restore the original PSD of the message signal.
In theory, H_pe(f) ∝ f, H_de(f) ∝ 1/f.
This can improve the output SNR by around 13 dB.
Dolby noise reduction uses an analogous pre-emphasis technique to reduce the effects of noise (hissing noise in audiotape recording is also concentrated at high frequency).

Improvement Factor
Assume an ideal pair of pre/de-emphasis filters:
H_de(f) = 1 / H_pe(f),  |f| ≤ W
PSD of the noise at the output of the de-emphasis filter:
(f²/A²) N_0 |H_de(f)|²,  |f| ≤ B_T/2    (recall S_No(f) = (f²/A²) N_0)
Average power of noise with de-emphasis:
P_N = ∫_{−W}^{W} (f²/A²) |H_de(f)|² N_0 df
Improvement factor (using (7.1)):
I = (P_N without pre/de-emphasis) / (P_N with pre/de-emphasis) = (2W³/3) / ∫_{−W}^{W} f² |H_de(f)|² df

Example Circuits

(a) Pre-emphasis filter
H_pe(f) = 1 + jf/f_0
f_0 = 1/(2π r C),  R >> r,  2π f r C << 1
(b) De-emphasis filter
H_de(f) = 1 / (1 + jf/f_0)
Improvement
I = (2W³/3) / ∫_{−W}^{W} f² / (1 + f²/f_0²) df = (W/f_0)³ / { 3[(W/f_0) − tan⁻¹(W/f_0)] }
In commercial FM, W = 15 kHz, f_0 = 2.1 kHz
I = 22 ≈ 13 dB (a significant gain)
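The formula and the commercial FM values above can be checked in two lines (an added sketch; the numbers quoted on the slide are rounded):

```python
import math

W, f0 = 15e3, 2.1e3                      # commercial FM values from the slide
r = W / f0
I = r**3 / (3 * (r - math.atan(r)))      # improvement factor from the formula above
print(f"I = {I:.1f}  ({10 * math.log10(I):.1f} dB)")   # about 21-22, roughly 13 dB
```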

Comparison of Analogue Systems

Assumptions:
single-tone modulation, i.e.: m(t) = A_m cos(2π f_m t);
the message bandwidth W = f_m;
for the AM system, μ = 1;
for the FM system, β = 5 (which is what is used in commercial FM transmission, with Δf = 75 kHz, and W = 15 kHz).
With these assumptions, we find that the SNR expressions for the various modulation schemes become:
SNR_DSB-SC = SNR_baseband = SNR_SSB
SNR_AM = (1/3) SNR_baseband
SNR_FM = (3/2) β² SNR_baseband = (75/2) SNR_baseband   (without pre/de-emphasis)
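Evaluating these factors numerically (an added sketch under the single-tone assumptions above; for AM with μ = 1, P/(A² + P) = (A_m²/2)/(A_m² + A_m²/2) = 1/3):

```python
import math

beta = 5.0
factors = {
    "DSB-SC / SSB" : 1.0,              # same as baseband
    "AM (mu = 1)"  : 1.0 / 3.0,        # P / (A^2 + P) for a full-scale single tone
    "FM (beta = 5)": 1.5 * beta**2,    # 3 beta^2 P / m_p^2 with P/m_p^2 = 1/2
}
for name, g in factors.items():
    print(f"{name:14s}: {g:6.2f} x SNR_baseband ({10 * math.log10(g):+5.1f} dB)")
```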

Performance of Analog Systems

115

Conclusions
(Full) AM: the SNR performance is 4.8 dB worse than a baseband system, and the transmission bandwidth is B_T = 2W.
DSB: the SNR performance is identical to a baseband system, and the transmission bandwidth is B_T = 2W.
SSB: the SNR performance is again identical, but the transmission bandwidth is only B_T = W.
FM: the SNR performance is 15.7 dB better than a baseband system, and the transmission bandwidth is B_T = 2(β + 1)W = 12W (with pre- and de-emphasis the SNR performance is increased by about 13 dB with the same transmission bandwidth).

EE2-4: Communication Systems

Lecture 8: Digital Representation of Signals
Dr. Cong Ling
Department of Electrical and Electronic Engineering

Outline
Introduction to digital communication
Quantization (A/D) and noise
PCM
Companding
References
Notes of Communication Systems, Chap. 4.1-4.3
Haykin & Moher, Communication Systems, 5th ed., Chap. 7
Lathi, Modern Digital and Analog Communication Systems,
3rd ed., Chap. 6

118

Block Diagram of Digital Communication

119

Why Digital?
Advantages:
Digital
g
signals
g
are more immune to channel noise by
y using
g channel
coding (perfect decoding is possible!)
Repeaters along the transmission path can detect a digital signal and retransmit a new noise-free signal
Digital signals derived from all types of analog sources can be
represented using a uniform format
Digital signals are easier to process by using microprocessors and
VLSI (e.g., digital signal processors, FPGA)
Digital systems are flexible and allow for implementation of
sophisticated functions and control
More and more things are digital

For digital communication: analog signals are converted to


digital.
120

Sampling
How densely should we sample an analog signal so that we can reproduce its form accurately?
A signal the spectrum of which is band-limited to W Hz,
can be reconstructed exactly from its samples, if they are
taken uniformly at a rate of R 2W Hz.
Nyquist frequency: fs = 2W Hz

121

Quantization
Quantization is the process of transforming the sample
p
into a discrete amplitude
p
taken from a finite set
amplitude
of possible amplitudes.
The more levels, the better approximation.
Don't need too many levels (human senses can only detect finite differences).
Quantizers can be of a uniform or nonuniform type.

122

Quantization Noise
Quantization noise: the error between the input signal and the output signal

123

Variance of Quantization Noise

Δ: gap between quantizing levels (of a uniform quantizer)
q: quantization error = a random variable in the range −Δ/2 ≤ q ≤ Δ/2
If Δ is sufficiently small, it is reasonable to assume that q is uniformly distributed over this range:
f_Q(q) = 1/Δ for −Δ/2 ≤ q ≤ Δ/2;  0 otherwise
Noise variance:
P_N = E{q²} = ∫_{−Δ/2}^{Δ/2} q² f_Q(q) dq = (1/Δ) ∫_{−Δ/2}^{Δ/2} q² dq = Δ²/12

SNR
Assume: the encoded symbol has n bits, so
the maximum number of quantizing levels is L = 2^n
the maximum peak-to-peak dynamic range of the quantizer = 2^n Δ
P: power of the message signal
m_p = max |m(t)|: maximum absolute value of the message signal
Assume: the message signal fully loads the quantizer:
m_p = (1/2) 2^n Δ = 2^(n−1) Δ   (8.1)
SNR at the quantizer output:
SNR_o = P_S / P_N = P / (Δ²/12) = 12P / Δ²

SNR

From (8.1): Δ = 2 m_p / 2^n, so
SNR_o = 12P / Δ² = 12P 2^(2n) / (4 m_p²) = (3P / m_p²) 2^(2n)   (8.2)
In dB,
SNR_o (dB) = 10 log10(2^(2n)) + 10 log10(3P/m_p²)
= 20n log10 2 + 10 log10(3P/m_p²)
≈ 6n + 10 log10(3P/m_p²)  (dB)
Hence, each extra bit in the encoder adds 6 dB to the output SNR of the quantizer.
Recognize the tradeoff between SNR and n (i.e., rate, or bandwidth).
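A quick simulation sketch of the 6 dB-per-bit rule (an added illustration, not from the slides; the full-scale sine test signal and bit depths are arbitrary choices):

```python
import numpy as np

t = np.linspace(0, 1, 1_000_000, endpoint=False)
m = np.cos(2 * np.pi * 5 * t)                    # full-scale message, m_p = 1, P = 1/2
for n in (4, 8, 12, 16):
    delta = 2.0 / 2**n                           # step size for the range [-1, 1]
    q = np.round(m / delta) * delta              # uniform (mid-tread) quantizer
    snr = 10 * np.log10(np.mean(m**2) / np.mean((m - q)**2))
    print(f"n={n:2d}: measured {snr:5.1f} dB, theory {6.02 * n + 1.76:5.1f} dB")
```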

Example
Sinusoidal message signal: m(t) = A_m cos(2π f_m t).
Average signal power:
P = A_m²/2
Maximum signal value: m_p = A_m.
Substitute into (8.2):
SNR_o = (3 A_m² / (2 A_m²)) 2^(2n) = (3/2) 2^(2n)
In dB:
SNR_o (dB) ≈ 6n + 1.8 dB
Audio CDs: n = 16 ⟹ SNR > 90 dB

Pulse-Coded Modulation (PCM)

Sample the message signal above the Nyquist


frequency
Quantize the amplitude of each sample
Encode the discrete amplitudes into a binary
codeword
Caution: PCM isn't modulation in the usual sense; it's a type of analog-to-digital conversion.
128

The PCM Process

129

Problem With Uniform Quantization


Problem: the output SNR is adversely affected by the peak-to-average power ratio.
Typically small signal amplitudes occur more often than
large signal amplitudes.
The signal does not use the entire range of quantization levels
available with equal probabilities.
Small amplitudes are not represented as well as large amplitudes, as they are more susceptible to quantization noise.

130

Solution: Nonuniform quantization


Nonuniform quantization uses quantization levels of variable spacing: denser at small signal amplitudes, broader at large amplitudes.
[Figure: a waveform quantized uniformly vs. nonuniformly]

131

Companding = Compressing + Expanding

A practical (and equivalent) solution to nonuniform quantization:
Compress the signal first
Quantize it (using a uniform quantizer)
Transmit it
Expand it
Companding corresponds to the pre-emphasis and de-emphasis scheme used for FM:
Predistort a message signal in order to achieve better performance in the presence of noise, and then remove the distortion at the receiver.
The exact SNR gain obtained with companding depends on the exact form of the compression used.
With proper companding, the output SNR can be made insensitive to the peak-to-average power ratio.
132

μ-Law vs. A-Law

(a) μ-law, used in North America and Japan; (b) A-law, used in most other countries of the world. Typical values in practice (T1/E1): μ = 255, A = 87.6.
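A minimal sketch of μ-law companding with μ = 255 (the standard compressor/expander pair; the test values are arbitrary): small amplitudes are boosted before uniform quantization, and the expander undoes the mapping at the receiver.

```python
import numpy as np

MU = 255.0

def compress(x):        # x normalized to [-1, 1]
    return np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)

def expand(y):          # inverse mapping
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(MU)) / MU

x = np.array([-0.5, -0.01, 0.0, 0.01, 0.5, 1.0])
print(compress(x))             # small inputs are mapped toward larger levels
print(expand(compress(x)))     # round trip recovers the original samples
```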

Applications of PCM & Variants

Speech:
PCM: the voice signal is sampled at 8 kHz and quantized into 256 levels (8 bits). Thus, a telephone PCM signal requires 64 kbps.
Need to reduce bandwidth requirements:
DPCM (differential PCM): quantize the difference between consecutive samples; can save 8 to 16 kbps. ADPCM (adaptive DPCM) can go further down to 32 kbps.
Delta modulation: 1-bit DPCM with oversampling; has an even lower symbol rate (e.g., 24 kbps).
Audio CD: 16-bit PCM at 44.1 kHz sampling rate.
MPEG audio coding: 16-bit PCM at 48 kHz sampling rate, compressed to a rate as low as 16 kbps.
134

Demo
Music playing in Windows Sound Recorder

135

Summary
Digitization of signals requires
Sampling: a signal of bandwidth W is sampled at the Nyquist frequency 2W.
Quantization: the link between analog waveforms and digital representation.
SNR (under the high-resolution assumption):
SNR_o (dB) ≈ 6n + 10 log10(3P/m_p²)  (dB)
Companding can improve the SNR.
PCM is a common method of representing audio signals.
In a strict sense, pulse-coded modulation is in fact a (crude) source coding technique (i.e., a method of digitally representing analog information).
There are more advanced source coding (compression) techniques in information theory.

EE2-4: Communication Systems

Lecture 9: Baseband Digital Transmission
Dr. Cong Ling
Department of Electrical and Electronic Engineering

Outline
Line coding
Performance of baseband
digital transmission
Model
Bit error rate

References
Notes of Communication
Systems, Chap. 4.4
Haykin & Moher, Communication Systems, 5th ed., Chap. 8
Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap. 7
138

Line Coding
The bits of PCM, DPCM etc need to be converted into
g
some electrical signals.
Line coding encodes the bit stream for transmission
through a line, or a cable.
Line coding was used prior to the widespread application of channel coding and modulation techniques.
Nowadays, it is used for communications between the
CPU and peripherals, and for short-distance baseband
communications, such as the Ethernet.
139

Line Codes
Unipolar nonreturn-to-zero
(NRZ) signaling (on-off
signaling)
Polar NRZ signaling
Unipolar Return-to-zero (RZ)
signaling
Bipolar RZ signaling
Manchester code
140
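As an illustration (an added sketch, not from the slides; the bit pattern, two samples per bit, and the Manchester polarity convention are assumptions), three of these codes can be generated as sample sequences:

```python
import numpy as np

bits = np.array([1, 0, 1, 1, 0])

unipolar_nrz = np.repeat(bits, 2).astype(float)          # 1 -> +1, 0 -> 0
polar_nrz    = np.repeat(2 * bits - 1, 2).astype(float)  # 1 -> +1, 0 -> -1
# Manchester: each bit becomes a mid-bit transition (one common polarity convention).
manchester   = np.concatenate([[1.0, -1.0] if b else [-1.0, 1.0] for b in bits])

print(unipolar_nrz)
print(polar_nrz)
print(manchester)
```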

Analog and Digital Communications


Different goals between analog and digital communication systems:
Analog communication systems: to reproduce the transmitted waveform accurately. Use the signal-to-noise ratio to assess the quality of the system.
Digital communication systems: the transmitted symbol is to be identified correctly by the receiver. Use the probability of error of the receiver to assess the quality of the system.

141

Model of Binary Baseband Communication System

[Figure: transmitter, channel with additive noise n(t), lowpass filter, and receiver sampling with period T]
We only consider binary PCM with on-off signaling: 0 → 0 and 1 → A, with bit duration T_b.
Assume:
AWGN channel: the channel noise is additive white Gaussian, with a double-sided PSD of N_0/2.
The LPF is an ideal filter with unit gain on [−W, W].
The signal passes through the LPF without distortion (approximately).
142

Distribution of Noise
Effect of additive noise on digital transmission: at the receiver, symbol 1 may be mistaken for 0, and vice versa ⟹ bit errors.
What is the probability of such an error?
After the LPF, the predetection signal is
y(t) = s(t) + n(t)
s(t): the binary-valued function (either 0 or A volts)
n(t): additive white Gaussian noise with zero mean and variance
σ² = ∫_{−W}^{W} N_0/2 df = N_0 W
Reminder: a sample value N of n(t) is a Gaussian random variable drawn from a probability density function (the normal distribution):
p_N(n) = (1/√(2πσ²)) exp(−n²/(2σ²)),  i.e., N ~ N(0, σ²)

Decision
Y: a sample value of y(t)
If a symbol 0 were transmitted: y(t) = n(t)
Y will have a PDF of N (0, 2)

If a symbol 1 were transmitted: y(t) = A + n(t)


Y will have a PDF of N (A, 2)

Use as decision threshold T :


if Y < T,
if Y > T,

choose symbol 0
choose symbol 1

144

Errors
Two cases of decision error:
((i)) a symbol
y
0 was transmitted,, but a symbol
y
1 was chosen
(ii) a symbol 1 was transmitted, but a symbol 0 was chosen

Probability density functions for binary data transmission in noise: (a) symbol 0 transmitted, and (b) symbol 1 transmitted. Here T = A/2.
145

Case (i)
Probability of (i) occurring = (probability of an error, given symbol 0 was transmitted) × (probability of a 0 being transmitted in the first place):
p(i) = P_e0 p_0
where:
p_0: the a priori probability of transmitting a symbol 0
P_e0: the conditional probability of error, given that symbol 0 was transmitted:
P_e0 = ∫_{T}^{∞} (1/√(2πσ²)) exp(−n²/(2σ²)) dn

Case (ii)
Probability of (ii) occurring = (probability of an error, given symbol 1 was transmitted) × (probability of a 1 being transmitted in the first place):
p(ii) = P_e1 p_1
where:
p_1: the a priori probability of transmitting a symbol 1
P_e1: the conditional probability of error, given that symbol 1 was transmitted:
P_e1 = ∫_{−∞}^{T} (1/√(2πσ²)) exp(−(n − A)²/(2σ²)) dn

Total Error Probability

Total error probability:
P_e(T) = p(i) + p(ii) = p_1 P_e1 + (1 − p_1) P_e0
= p_1 ∫_{−∞}^{T} (1/√(2πσ²)) exp(−(n − A)²/(2σ²)) dn + (1 − p_1) ∫_{T}^{∞} (1/√(2πσ²)) exp(−n²/(2σ²)) dn
Choose T so that P_e(T) is minimum:
dP_e(T)/dT = 0

Derivation
Leibnitz's rule for differentiating an integral with respect to a parameter: if
I(λ) = ∫_{a(λ)}^{b(λ)} f(x; λ) dx
then
dI(λ)/dλ = f(b(λ); λ) db(λ)/dλ − f(a(λ); λ) da(λ)/dλ + ∫_{a(λ)}^{b(λ)} ∂f(x; λ)/∂λ dx
Therefore
dP_e(T)/dT = p_1 (1/√(2πσ²)) exp(−(T − A)²/(2σ²)) − (1 − p_1) (1/√(2πσ²)) exp(−T²/(2σ²)) = 0
⟹ p_1/(1 − p_1) = exp(−[T² − (T − A)²]/(2σ²)) = exp(−[2TA − A²]/(2σ²)) = exp(−A(2T − A)/(2σ²))

Optimum Threshold
ln[ p_1/(1 − p_1) ] = −A(2T − A)/(2σ²)
⟹ 2T − A = (2σ²/A) ln[ (1 − p_1)/p_1 ]
⟹ T = A/2 + (σ²/A) ln[ (1 − p_1)/p_1 ]
Equi-probable symbols (p_1 = p_0 = 1 − p_1) ⟹ T = A/2.
For equi-probable symbols, it can be shown that P_e0 = P_e1.
Probability of total error:
P_e = p(i) + p(ii) = p_0 P_e0 + p_1 P_e1 = P_e0 = P_e1
since p_0 = p_1 = 1/2, and P_e0 = P_e1.

Calculation of Pe
Define a new variable of integration:
z = n/σ ⟹ dz = dn/σ ⟹ dn = σ dz
When n = A/2, z = A/(2σ). When n = ∞, z = ∞.
Then
P_e0 = ∫_{A/(2σ)}^{∞} (1/√(2π)) e^{−z²/2} dz
We may express P_e0 in terms of the Q-function:
Q(x) = (1/√(2π)) ∫_{x}^{∞} exp(−t²/2) dt
Then:
P_e = Q(A/(2σ))
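P_e can be evaluated with the complementary error function, using Q(x) = ½ erfc(x/√2) (an added sketch; the two A/σ values are the ones used in the example on the next slide):

```python
from math import erfc, sqrt

def Q(x):
    return 0.5 * erfc(x / sqrt(2))

for A_over_sigma in (7.4, 11.2):
    print(f"A/sigma = {A_over_sigma:4.1f}:  P_e = Q(A/(2 sigma)) = {Q(A_over_sigma / 2):.1e}")
```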

Example
Example 1: A/σ = 7.4 ⟹ Pe ≈ 10⁻⁴
For a transmission rate of 10⁵ bits/sec, there will be an
error every 0.1 seconds on average.
Example 2: A/σ = 11.2 ⟹ Pe ≈ 10⁻⁸
For a transmission rate of 10⁵ bits/sec, there will be an
error every ~17 mins on average.
A/σ = 7.4 corresponds to 20 log10(7.4) = 17.4 dB.
A/σ = 11.2 corresponds to 20 log10(11.2) = 21 dB.
Enormous increase in reliability by a relatively small
increase in SNR (if that is affordable).

152
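A quick numerical check of these figures (a sketch, not part of the notes; it assumes SciPy is available and uses norm.sf for the Q-function, with the threshold T = A/2 derived above):

```python
# Checking the example numbers: Pe = Q(A/(2*sigma)) for the given A/sigma.
import numpy as np
from scipy.stats import norm

def q_func(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return norm.sf(x)

rate = 1e5  # assumed transmission rate in bits per second (from the example)
for a_over_sigma in (7.4, 11.2):
    pe = q_func(a_over_sigma / 2)          # Pe = Q(A/(2*sigma))
    seconds_per_error = 1 / (pe * rate)    # mean time between bit errors
    print(f"A/sigma = {a_over_sigma}: Pe ~ {pe:.1e}, "
          f"one error every {seconds_per_error:.0f} s on average")
```

The printed values (~1e-4 and ~1e-8) agree with the two examples above.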

Probability of Bit Error

153

Q-function
Upper bounds and good approximations:
Q(x) ≤ (1/(√(2π) x)) e^{−x²/2},  x > 0
which becomes tighter for large x, and
Q(x) ≤ (1/2) e^{−x²/2},  x ≥ 0
which is a better upper bound for small x.

154
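A small sketch (assuming SciPy; not from the notes) comparing Q(x) with the two upper bounds quoted on this slide:

```python
import numpy as np
from scipy.stats import norm

x = np.array([0.5, 1.0, 2.0, 4.0, 6.0])
q = norm.sf(x)                                                  # exact Q(x)
bound_large_x = np.exp(-x**2 / 2) / (np.sqrt(2 * np.pi) * x)    # tight for large x
bound_small_x = 0.5 * np.exp(-x**2 / 2)                         # better for small x
for xi, qi, b1, b2 in zip(x, q, bound_large_x, bound_small_x):
    print(f"x={xi:.1f}  Q={qi:.3e}  bound1={b1:.3e}  bound2={b2:.3e}")
```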

Applications to Ethernet and DSL


100BASE-TX
One of the dominant forms of fast Ethernet, transmitting up to 100
Mbps over twisted pair.
The link is a maximum of 100 meters in length.
Line codes: NRZ, NRZ-Invert, MLT-3.
Digital subscriber line (DSL)
Broadband over twisted pair.
Line code 2B1Q achieved a bit error rate of 10⁻⁷ at 160 kbps.
ADSL and VDSL adopt discrete multitone modulation (DMT) for
higher data rates.

155

EE2-4: Communication Systems

Lecture 10: Digital Modulation


Dr. Cong Ling
Department of Electrical and Electronic Engineering

156

Outline
ASK and bit error rate
FSK and bit error rate
PSK and bit error rate
References
Notes of Communication Systems, Chap. 4.5
Haykin & Moher, Communication Systems, 5th ed., Chap. 9
Lathi, Modern Digital and Analog Communication Systems,
3rd ed., Chap. 13

157

Digital Modulation
Three Basic Forms of Signaling Binary Information
(a) Amplitude-shift keying (ASK).
(b) Phase-shift keying (PSK).
(c) Frequency-shift keying (FSK).

158

Demodulation

Coherent (synchronous) demodulation/detection
Use a BPF to reject out-of-band noise
Multiply the incoming waveform with a cosine of the carrier
frequency
Use a LPF
Requires carrier regeneration (both frequency and phase
synchronization by using a phase-lock loop)
Noncoherent demodulation (envelope detection etc.)
Makes no explicit efforts to estimate the phase
159

ASK
Amplitude shift keying (ASK) = on-off keying (OOK)
s0(t) = 0
s1(t) = A cos(2π fc t)
or
s(t) = A(t) cos(2π fc t),  A(t) ∈ {0, A}
Coherent detection
Assume an ideal band-pass filter with unit gain on [fc − W, fc + W].
For a practical band-pass filter, 2W should be
interpreted as the equivalent bandwidth.
160

Coherent Demodulation
Pre-detection signal:
x(t) = s(t) + n(t)
     = A(t) cos(2π fc t) + nc(t) cos(2π fc t) − ns(t) sin(2π fc t)
     = [A(t) + nc(t)] cos(2π fc t) − ns(t) sin(2π fc t)
After multiplication with 2cos(2π fc t):
y(t) = [A(t) + nc(t)] 2cos²(2π fc t) − ns(t) 2 sin(2π fc t) cos(2π fc t)
     = [A(t) + nc(t)](1 + cos(4π fc t)) − ns(t) sin(4π fc t)
After low-pass filtering:
ỹ(t) = A(t) + nc(t)
161

Bit Error Rate


Reminder: The in-phase noise component nc(t) has the
same variance as the original band-pass noise n(t).
The received signal is identical to that for baseband digital
transmission.
The sample values of ỹ(t) will have PDFs that are identical to
those of the baseband case.
For ASK the statistics of the receiver signal are identical to
those of a baseband system.
The probability of error for ASK is the same as for the
baseband case.
Assume equiprobable transmission of 0s and 1s.
Then the decision threshold must be A/2 and the
probability of error is given by:
Pe,ASK = Q(A/(2σ))
162
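A Monte Carlo sketch of this result (not in the notes; it assumes NumPy/SciPy). After coherent demodulation the decision variable is ỹ = A(t) + nc with A(t) ∈ {0, A}, so the simulated error rate should match Q(A/(2σ)):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
A, sigma, n_bits = 2.0, 0.5, 500_000
bits = rng.integers(0, 2, n_bits)            # equiprobable 0s and 1s
y = bits * A + rng.normal(0, sigma, n_bits)  # post-LPF decision variable
decisions = (y > A / 2).astype(int)          # threshold at A/2
ber_sim = np.mean(decisions != bits)
ber_theory = norm.sf(A / (2 * sigma))        # Q(A/(2*sigma))
print(f"simulated BER = {ber_sim:.2e}, theory = {ber_theory:.2e}")
```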

PSK
Phase shift keying (PSK)
s(t) = A(t) cos(2π fc t),  A(t) ∈ {−A, A}
Use coherent detection again, to eventually get the
detection signal:
ỹ(t) = A(t) + nc(t)
Probability density functions for PSK for equiprobable 0s
and 1s in noise (use threshold 0 for detection):
(a): symbol 0 transmitted
(b): symbol 1 transmitted

163

Analysis
Conditional error probabilities:
Pe0 = ∫_0^∞ (1/(√(2π) σ)) exp(−(n + A)²/(2σ²)) dn
Pe1 = ∫_{−∞}^0 (1/(√(2π) σ)) exp(−(n − A)²/(2σ²)) dn
In the first integral, set ñ = n + A ⟹ dñ = dn, and when n = 0, ñ = A:
Pe0 = ∫_A^∞ (1/(√(2π) σ)) exp(−ñ²/(2σ²)) dñ
In the second integral, set ñ = −(n − A) ⟹ dñ = −dn;
when n = 0, ñ = A, and when n = −∞, ñ = ∞:
Pe1 = ∫_A^∞ (1/(√(2π) σ)) exp(−ñ²/(2σ²)) dñ

164

Bit Error Rate


So:
Pe0 = Pe1 = Pe,PSK = ∫_A^∞ (1/(√(2π) σ)) exp(−n²/(2σ²)) dn
Change the variable of integration to z = n/σ ⟹ dn = σ dz,
and when n = A, z = A/σ. Then:
Pe,PSK = ∫_{A/σ}^∞ (1/√(2π)) e^{−z²/2} dz = Q(A/σ)
Remember that
Q(x) = (1/√(2π)) ∫_x^∞ exp(−t²/2) dt

165

FSK
Frequency Shift Keying (FSK)
s0(t) = A cos(2π f0 t),  if symbol 0 is transmitted
s1(t) = A cos(2π f1 t),  if symbol 1 is transmitted
Symbol recovery:
Use two sets of coherent detectors, one operating at a frequency f0
and the other at f1.
Coherent FSK demodulation.
The two BPFs are nonoverlapping in frequency spectrum.

166

Output
Each branch = an ASK detector.
LPF output on each branch:
A + noise,  if the symbol is present
noise,      if the symbol is not present
n0(t): the noise output of the top branch
n1(t): the noise output of the bottom branch
Each of these noise terms has identical statistics to nc(t).
Decision variable = difference of the two branch outputs.
Output if a symbol 1 were transmitted:
y = A + [n1(t) − n0(t)]
Output if a symbol 0 were transmitted:
y = −A + [n1(t) − n0(t)]
167

Bit Error Rate for FSK


Set the detection threshold to 0.
Difference from PSK: the noise term is now n1(t) − n0(t).
The noises in the two channels are independent because
their spectra are non-overlapping
(the proof is done in the problem sheet)
⟹ the variances add
⟹ the noise variance has doubled!
Replace σ² in the PSK expression by 2σ² (or σ by √2 σ):
Pe,FSK = Q(A/(√2 σ))
168

The Sum of Two R.V.


Noise is the sum or difference of two independent zero-mean
random variables:
x1: a random variable with variance σ1²
x2: a random variable with variance σ2²
What is the variance of y = x1 ± x2?
By definition
σy² = E{y²} − E{y}² = E{(x1 ± x2)²}
    = E{x1² ± 2x1x2 + x2²} = E{x1²} ± 2E{x1x2} + E{x2²}
For independent variables: E{x1x2} = E{x1}E{x2}
For zero-mean random variables:
E{x1} = E{x2} = 0 ⟹ E{x1x2} = 0
So
σy² = E{x1²} + E{x2²} = σ1² + σ2²
169

Comparison of Three Schemes

170

Comment
To achieve the same error probability (fixed Pe):
the PSK amplitude can be reduced by 6 dB compared with a baseband
or ASK system (a factor of 2 reduction in amplitude);
the FSK amplitude can be reduced by 3 dB compared with a baseband
or ASK system (a factor of √2 reduction in amplitude).
Caution: The comparison is based on peak SNR. In terms
of average SNR, PSK only has a 3 dB improvement over
ASK, and FSK has the same performance as ASK.

171
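A short sketch (assuming SciPy; not part of the notes) evaluating the three coherent BER formulas at a common peak amplitude A and noise standard deviation σ, which makes the 3 dB and 6 dB gaps above easy to see:

```python
import numpy as np
from scipy.stats import norm

A_over_sigma_dB = np.arange(6, 22, 2)        # peak SNR values in dB
a = 10 ** (A_over_sigma_dB / 20)             # A/sigma as an amplitude ratio
ber_ask = norm.sf(a / 2)                     # Q(A/(2*sigma))
ber_fsk = norm.sf(a / np.sqrt(2))            # Q(A/(sqrt(2)*sigma))
ber_psk = norm.sf(a)                         # Q(A/sigma)
for d, b1, b2, b3 in zip(A_over_sigma_dB, ber_ask, ber_fsk, ber_psk):
    print(f"{d:2d} dB  ASK {b1:.1e}  FSK {b2:.1e}  PSK {b3:.1e}")
```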

Q-function

172

Examples
Consider binary PSK modulation. Assume the carrier
amplitude A = 1 V, and noise standard deviation σ = 1/3.
Determine the bit error probability.
Answer: Pe = Q(3) ≈ 1.3 × 10⁻³.
Now suppose the bit error probability is 10⁻⁵. Determine
the value of A/σ.
Answer: A/σ ≈ 4.3.

173

Application: GMSK
Gaussian minimum shift keying (GMSK), a special form of
FSK preceded by Gaussian filtering, is used in GSM
(Global System for Mobile Communications), a leading
cellular phone standard in the world.
Also known as digital FM, it builds on some
FM-related advantages of AMPS, the first-generation
analog system (30 kHz bandwidth).
Binary data are passed through a
Gaussian filter to satisfy stringent
requirements on out-of-band radiation.
Minimum Shift Keying: its spacing between the two frequencies of
FSK is minimum in a certain sense (see problem sheet).
GMSK is allocated a bandwidth of 200 kHz, shared among 32 users.
This provides a (30/200)×32 = 4.8 times improvement over AMPS.
174

EE2-4: Communication Systems

Lecture 11: Noncoherent Demodulation


Dr. Cong Ling
Department of Electrical and Electronic Engineering

175

Outline
Noncoherent demodulation of ASK
Noncoherent demodulation of FSK
Differential demodulation of DPSK
References
Haykin & Moher, Communication Systems, 5th ed., Chap. 9
Lathi, Modern Digital and Analog Communication Systems,
3rd ed., Chap. 13

176

Noncoherent Demodulation
Coherent demodulation assumes perfect synchronization.
Needs a phase lock loop.
However, accurate phase synchronization may be difficult
in a dynamic channel.
Phase synchronization error is due to varying propagation delays,
frequency drift, instability of the local oscillator, effects of strong
noise ...
Performance of coherent detection will degrade severely.
When the carrier phase is unknown, one must rely on non-coherent detection.
No provision is made for carrier phase recovery.
The phase is assumed to be uniformly distributed on [0, 2π].
Circuitry is simpler, but analysis is more difficult!
177

Noncoherent Demodulation of ASK


Receiver chain: signal plus noise in → bandpass filter → envelope
detector → threshold device (sampled at t = nTb) → decision.
Output of the BPF:
y(t) = n(t),                   when 0 is sent
y(t) = n(t) + A cos(2π fc t),  when 1 is sent
Recall
n(t) = nc(t) cos(2π fc t) − ns(t) sin(2π fc t)
Envelope:
R = √(nc²(t) + ns²(t)),         when 0 is sent
R = √((A + nc(t))² + ns²(t)),   when 1 is sent
178

Distribution of the Envelope


When symbol 0 is sent, the envelope (that of the bandpass
noise alone) has a Rayleigh distribution:
f(r) = (r/σ²) e^{−r²/(2σ²)},  r ≥ 0
When symbol 1 is sent, the envelope (that of a signal +
bandpass noise) has a Rician distribution:
f(r) = (r/σ²) e^{−(r² + A²)/(2σ²)} I0(Ar/σ²),  r ≥ 0
The first case dominates the error probability when
A/σ >> 1.

179

Rayleigh Distribution
Define a random variable R = √(X² + Y²), where X and Y
are independent Gaussian with zero mean and variance σ².
R has a Rayleigh distribution:
f_R(r) = (r/σ²) e^{−r²/(2σ²)},  r ≥ 0
Normalized Rayleigh distribution:
v = r/σ
f_V(v) = σ f_R(r) = v e^{−v²/2}
Proving it requires a change into polar coordinates:
R = √(X² + Y²),  Θ = tan⁻¹(Y/X)
180
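A simulation sketch of this result (assumes NumPy; not part of the notes): generate R = √(X² + Y²) from two independent zero-mean Gaussians and compare the histogram with the Rayleigh pdf above.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, n = 1.0, 200_000
x = rng.normal(0, sigma, n)
y = rng.normal(0, sigma, n)
r = np.hypot(x, y)                                # envelope samples

edges = np.linspace(0, 4 * sigma, 41)
hist, _ = np.histogram(r, bins=edges, density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
pdf = centres / sigma**2 * np.exp(-centres**2 / (2 * sigma**2))
print("max |histogram - pdf| =", np.max(np.abs(hist - pdf)))  # should be small
```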

Derivation
Consider a small area dxdy = r dr dθ. One has
f_{R,Θ}(r, θ) dr dθ = f_{X,Y}(x, y) dx dy = f_{X,Y}(x, y) r dr dθ
                    = (1/(2πσ²)) e^{−(x² + y²)/(2σ²)} r dr dθ
Hence
f_{R,Θ}(r, θ) = (r/(2πσ²)) e^{−r²/(2σ²)}
pdf of R:
f_R(r) = ∫_0^{2π} f_{R,Θ}(r, θ) dθ = (r/σ²) e^{−r²/(2σ²)},  r ≥ 0
pdf of Θ:
f_Θ(θ) = ∫_0^∞ f_{R,Θ}(r, θ) dr = 1/(2π),  θ ∈ [0, 2π]

181

Rician Distribution
If X has nonzero mean A, R has a Rician distribution:
f_R(r) = (r/σ²) e^{−(r² + A²)/(2σ²)} I0(Ar/σ²),  r ≥ 0
where
I0(x) = (1/(2π)) ∫_0^{2π} e^{x cos θ} dθ
is the modified zero-order Bessel function of the first kind.
Normalized Rician distribution:
v = r/σ
a = A/σ
f_V(v) = σ f_R(r)

182

Derivation
Similarly,
f_{R,Θ}(r, θ) dr dθ = f_{X,Y}(x, y) r dr dθ
                    = (1/(2πσ²)) e^{−((x − A)² + y²)/(2σ²)} r dr dθ
                    = (1/(2πσ²)) e^{−(r² + A² − 2Ar cos θ)/(2σ²)} r dr dθ
Hence
f_{R,Θ}(r, θ) = (r/(2πσ²)) e^{−(r² + A² − 2Ar cos θ)/(2σ²)}
pdf of R (the θ-integral gives the Bessel function):
f_R(r) = ∫_0^{2π} f_{R,Θ}(r, θ) dθ
       = (r/σ²) e^{−(r² + A²)/(2σ²)} · (1/(2π)) ∫_0^{2π} e^{Ar cos θ / σ²} dθ
       = (r/σ²) e^{−(r² + A²)/(2σ²)} I0(Ar/σ²),  r ≥ 0
183

Error Probability
Let the threshold be A/2 for simplicity.
The error probability is dominated by symbol 0, and
is given by
Pe ≈ (1/2) ∫_{A/2}^∞ (r/σ²) e^{−r²/(2σ²)} dr
The final expression:
Pe,ASK,noncoherent ≈ (1/2) e^{−A²/(8σ²)}
Cf. coherent demodulation:
Pe,ASK,coherent = Q(A/(2σ)) ≤ (1/2) e^{−A²/(8σ²)}
Noncoherent demodulation results in some performance
degradation. Yet, for a large SNR, the performances of
coherent and noncoherent demodulation are similar.
184

Noncoherent Demodulation of FSK

Block diagram: the received signal feeds two bandpass filters, one
centered at f1 and one at f2, each followed by an envelope detector
giving envelopes R1 and R2.
If R1 > R2, choose 0
If R1 < R2, choose 1

185

Distribution of Envelope
When a symbol 1 is sent, the outputs of the BPFs are
y1(t) = n1(t)
y2(t) = n2(t) + A cos(2π f2 t)
Again, the first branch has a Rayleigh distribution:
f_{R1}(r1) = (r1/σ²) e^{−r1²/(2σ²)},  r1 ≥ 0
while the second has a Rice distribution:
f_{R2}(r2) = (r2/σ²) e^{−(r2² + A²)/(2σ²)} I0(Ar2/σ²),  r2 ≥ 0
Note the envelopes R1 and R2 are statistically independent.
186

Error Probability
An error occurs if the Rician envelope falls below the Rayleigh envelope:
Pe = P(R2 < R1)
   = ∫_0^∞ f_{R2}(r2) [∫_{r2}^∞ f_{R1}(r1) dr1] dr2
   = ∫_0^∞ (r2/σ²) e^{−(r2² + A²)/(2σ²)} I0(Ar2/σ²) e^{−r2²/(2σ²)} dr2
   = ∫_0^∞ (r2/σ²) e^{−(2r2² + A²)/(2σ²)} I0(Ar2/σ²) dr2
With the change of variable x = √2 r2 and ã = A/√2, the integrand becomes
(1/2) e^{−A²/(4σ²)} times a Rician density in x.
Observe the integrand is a Rician density (which integrates to 1) ⟹
Pe,FSK,noncoherent = (1/2) e^{−A²/(4σ²)}
Cf. coherent demodulation:
Pe,FSK,coherent = Q(A/(√2 σ)) ≤ (1/2) e^{−A²/(4σ²)}
187

DPSK: Differential PSK


It is impossible to demodulate PSK with an envelope
detector, since PSK signals have the same frequency and
amplitude.
We can demodulate PSK differentially, where the phase
reference is provided by a delayed version of the signal in
the previous interval.
Differential encoding is essential: bn = bn−1 an, where an,
bn ∈ {±1}.
Transmitter: the input an is multiplied by bn−1 (obtained through a
delay of Tb) to give bn, which modulates the carrier cos(ωc t) to form
the DPSK signal.

188

Differential Demodulation
Receiver chain: signal plus noise → bandpass filter → multiply by the
signal delayed by Tb → lowpass filter → threshold device → decision on an.
Computing the error probability is cumbersome, but
fortunately the final expression is simple:
Pe,DPSK = (1/2) e^{−A²/(2σ²)}
The derivation can be found in Haykin, Communication Systems,
4th ed., Chap. 6.
Performance is degraded in comparison to coherent PSK.
Cf. coherent demodulation:
Pe,PSK = Q(A/σ) ≤ (1/2) e^{−A²/(2σ²)}
189

Illustration of DPSK
[Table, partially lost in extraction: rows list the information symbols {an},
the delayed sequence {bn−1}, the differentially encoded sequence {bn},
the transmitted phase (0 or π radians), the polarity of the lowpass filter
output, and the resulting decisions.]
Note: The symbol 1 inserted at the beginning of the differentially encoded
sequence is the reference symbol.
190

Summary and Comparison


Scheme            Bit Error Rate
Coherent ASK      Q(A/(2σ))
Coherent FSK      Q(A/(√2 σ))
Coherent PSK      Q(A/σ)
Noncoherent ASK   (1/2) exp(−A²/(8σ²))
Noncoherent FSK   (1/2) exp(−A²/(4σ²))
DPSK              (1/2) exp(−A²/(2σ²))

[Figure: bit error rate versus average SNR (dB); the curves group as
noncoherent ASK/FSK, coherent ASK/FSK, and coherent PSK.]
Caution: ASK and FSK have the same bit
error rate if measured by average SNR.
191
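A sketch (assuming SciPy; not from the notes) that evaluates the six expressions in the summary table at one common peak amplitude ratio A/σ, making the ordering of the schemes explicit:

```python
import numpy as np
from scipy.stats import norm

a = 6.0  # A/sigma
ber = {
    "coherent ASK":    norm.sf(a / 2),
    "coherent FSK":    norm.sf(a / np.sqrt(2)),
    "coherent PSK":    norm.sf(a),
    "noncoherent ASK": 0.5 * np.exp(-a**2 / 8),
    "noncoherent FSK": 0.5 * np.exp(-a**2 / 4),
    "DPSK":            0.5 * np.exp(-a**2 / 2),
}
for scheme, p in ber.items():
    print(f"{scheme:16s} Pe = {p:.2e}")
```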

Conclusions
Non-coherent demodulation retains the hierarchy of
performance.
Non-coherent demodulation has error performance slightly
worse than coherent demodulation, but approaches
coherent performance at high SNR.
Non-coherent demodulators are considerably easier to
build.

192

Application: DPSK
WLAN standard IEEE 802.11b
Bluetooth2
Digital audio broadcast (DAB): DPSK + OFDM (orthogonal
frequency division multiplexing)
Inmarsat (International Maritime Satellite Organization):
now a London-based mobile satellite company

193

EE2-4: Communication Systems

Lecture 12: Entropy and Source Coding

Dr. Cong Ling
Department of Electrical and Electronic Engineering

194

Outline
What is information theory?
Entropy
Definition of information
Entropy of a source

Source coding
Source coding theorem
Huffman coding

References
Notes of Communication Systems, Chap. 5.1-5.4
Haykin & Moher, Communication Systems, 5th ed., Chap. 10
Lathi, Modern Digital and Analog Communication Systems, 3rd ed.,
Chap. 15
195

Model of a Digital Communication System

196

What is Information Theory


C. E. Shannon, "A mathematical theory of
communication," Bell System Technical
Journal, 1948.
Two fundamental questions in
communication theory:
What is the ultimate limit on data compression?
What is the ultimate transmission rate of reliable
communication over noisy channels?
Shannon showed reliable (i.e., error-free)
communication is possible for all rates below
channel capacity (using channel coding).
Any source can be represented in bits at any
rate above entropy (using source coding).
Rise of digital information technology
197

What is Information?
Information: any new knowledge about something
How can we measure it?
Messages containing knowledge of a high probability of
occurrence Not very informative
Messages containing knowledge of low probability of
occurrence More informative
A small change in the probability of a certain output should
not change the information delivered by that output by a
large amount.

198

Definition
The amount of information of a symbol s with
probability p:
I(s) = log(1/p)
Properties:
p = 1 ⟹ I(s) = 0: a symbol that is certain to occur contains no
information.
0 ≤ p ≤ 1 ⟹ 0 ≤ I(s) ≤ ∞: the information measure is monotonic and
non-negative.
p = p1 p2 ⟹ I(s) = I(p1) + I(p2): information is additive for
statistically independent events.
199

Example
In communications, log base 2 is commonly used,
resulting in the unit "bit".
Example: two symbols with equal probability p1 = p2 = 0.5.
Each symbol represents I(s) = log2(1/0.5) = 1 bit of
information.
In summary:
I(s) = log2(1/p) bits
Reminder: From logs of base 10 to logs of base 2:
log2 x = log10 x / log10 2 ≈ 3.32 log10 x
200

Discrete Memoryless Source


Suppose we have an information source emitting a
sequence of symbols from a finite alphabet:
S = {s1, s2, . . . , sK}
Discrete memoryless source: the successive symbols
are statistically independent (i.i.d.).
Assume that each symbol has a probability of
occurrence
pk, k = 1, ..., K, such that Σ_{k=1}^{K} pk = 1

201

Source Entropy
If symbol sk has occurred, then, by definition, we have received
I(sk) = log2(1/pk) = −log2 pk
bits of information.
The expected value of I(sk) over the source alphabet:
E{I(sk)} = Σ_{k=1}^{K} pk I(sk) = −Σ_{k=1}^{K} pk log2 pk
Source entropy: the average amount of information per source
symbol:
H(S) = −Σ_{k=1}^{K} pk log2 pk
Units: bits / symbol.

202

Meaning of Entropy
What information about a source does its entropy give us?
It is the amount of uncertainty before we receive it.
It tells us how many bits of information per symbol we
expect to get on the average.
Relation with thermodynamic entropy:
In thermodynamics: entropy measures disorder and randomness;
In information theory: entropy measures uncertainty.
Some argue that to obtain 1 bit of information, the minimum energy
needed is on the order of 10⁻²³ Joules per Kelvin (extremely cheap!).
This may have implications for green computation.

203

Example: Entropy of a Binary Source


s1: occurs with probability p
s0: occurs with probability 1 − p.
Entropy of the source:
H(S) = −(1 − p) log2(1 − p) − p log2 p ≜ H(p)
H(p) is referred to as the entropy function.
Maximum uncertainty when p = 1/2.
[Figure: H(p) versus p, peaking at 1 bit for p = 1/2.]
204

Example: A Three-Symbol Alphabet


A: occurs with probability 0.7
B: occurs with probability 0.2
C: occurs with probability 0.1
Source entropy:
H(S) = −0.7 log2(0.7) − 0.2 log2(0.2) − 0.1 log2(0.1)
     = 0.7 × 0.515 + 0.2 × 2.322 + 0.1 × 3.322
     = 1.157 bits/symbol
How can we encode these symbols in order to
transmit them?
We need 2 bits/symbol if encoded as
A = 00, B = 01, C = 10
Entropy prediction: the average amount of information is
only 1.157 bits per symbol.
We are wasting bits!
205
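A minimal sketch (standard library only; not from the notes) computing the entropy of this three-symbol source:

```python
import math

def entropy(probabilities):
    """H(S) = -sum(p * log2(p)) in bits per symbol."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.7, 0.2, 0.1]))   # ~1.157 bits/symbol, as on the slide
print(entropy([0.5, 0.5]))        # 1 bit/symbol for a fair binary source
```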

Source Coding Theory


What is the minimum number of bits that are required to
particular symbol?
y
transmit a p
How can we encode symbols so that we achieve (or at
least come arbitrarily close to) this limit?
Source encoding: concerned with minimizing the actual
number of source bits that are transmitted to the user
Channel encoding: concerned with introducing redundant
bits to enable the receiver to detect and possibly correct
errors that are introduced by the channel.
206

Average Codeword Length

lk: the number of bits used to code the k-th symbol
K: total number of symbols
pk: the probability of occurrence of symbol k
Define the average codeword length:
L̄ = Σ_{k=1}^{K} pk lk
It represents the average number of bits per symbol in the
source alphabet.
An idea to reduce the average codeword length: symbols that
occur often should be encoded with short codewords;
symbols that occur rarely may be encoded using longer
codewords.
207

Minimum Codeword Length


What is the minimum codeword length for a particular
alphabet of source symbols?
In a system with 2 symbols that are equally likely:
Probability of each symbol to occur: p = 1/2
Best one can do: encode each with 1 bit only: 0 or 1
In a system with n symbols that are equally likely:
Probability of each symbol to occur: p = 1/n
One needs L = log2 n = log2(1/p) = −log2 p bits to represent
the symbols.

208

The General Case

S = {s1, . . . , sK}: an alphabet
pk: the probability of symbol sk to occur
N: the number of symbols generated
We expect Npk occurrences of sk
Assume:
The source is memoryless
⟹ all symbols are independent
⟹ the probability of occurrence of a typical sequence S_N is
p(S_N) = p1^{Np1} p2^{Np2} ... pK^{NpK}
All such typical sequences of N symbols are equally
likely.
All other compositions are extremely unlikely to occur as N → ∞
(Shannon, 1948).
209

Source Coding Theorem


The number of bits required to represent a typical
sequence S_N is
L_N = log2(1/p(S_N)) = −log2(p1^{Np1} p2^{Np2} ... pK^{NpK})
    = −Np1 log2 p1 − Np2 log2 p2 − ... − NpK log2 pK
    = −N Σ_{k=1}^{K} pk log2 pk = N H(S)
Average length for one symbol: L̄ = L_N / N = H(S) bits/symbol.
Given a discrete memoryless source of entropy H(S),
the average codeword length for any source coding
scheme is bounded by H(S):
L̄ ≥ H(S)
210

Huffman Coding

How can one design an efficient source coding
algorithm?
Use Huffman coding (among other algorithms).
It yields the shortest average codeword length.
Basic idea: choose codeword lengths so that more-probable
sequences have shorter codewords.
Huffman code construction:
Sort the source symbols in order of decreasing probability.
Take the two smallest p(xi) and assign each a different bit (i.e., 0
or 1). Then merge into a single symbol.
Repeat until only one symbol remains.
It's very easy to implement this algorithm in a
microprocessor or computer.
Used in JPEG, MP3 ...
211

Example

[Figure: Huffman tree for five symbols with probabilities 0.4, 0.2, 0.2, 0.1, 0.1;
read the diagram backwards for the codewords.]
The average codeword length:
L̄ = (2 × 0.4) + (2 × 0.2) + (2 × 0.2) + (3 × 0.1) + (3 × 0.1) = 2.2
More than the entropy H(S) = 2.12 bits per symbol.
Room for further improvement.
212
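A sketch of Huffman's construction for the five-symbol example above (standard algorithm, not copied from the notes; standard library only). The individual codeword lengths depend on how ties are broken, but the average length is 2.2 either way:

```python
import heapq

def huffman_lengths(probs):
    """Return codeword lengths produced by Huffman's merging procedure."""
    # Each heap entry: (probability, unique id, list of symbol indices)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    uid = len(probs)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)   # two least probable nodes
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:              # each merge adds one bit to these symbols
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, uid, syms1 + syms2))
        uid += 1
    return lengths

probs = [0.4, 0.2, 0.2, 0.1, 0.1]
lengths = huffman_lengths(probs)
print(lengths)                                            # e.g. [2, 2, 2, 3, 3]
print("average length =", sum(p * l for p, l in zip(probs, lengths)))   # 2.2
```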

Application: File Compression


A drawback of Huffman coding is that it requires
knowledge of a probabilistic model, which is not always
available a priori.
Lempel-Ziv coding overcomes this practical limitation and
has become the standard algorithm for file compression:
compress, gzip, GIF, TIFF, PDF, modem ...
A text file can typically be
compressed to half of its
original size.

213

Summary
The entropy of a discrete memoryless information source:
H(S) = −Σ_{k=1}^{K} pk log2 pk
Entropy function (entropy of a binary memoryless source):
H(S) = −(1 − p) log2(1 − p) − p log2 p = H(p)
Source coding theorem: The minimum average codeword
length for any source coding scheme is H(S) for a discrete
memoryless source.
Huffman coding: An efficient source coding algorithm.

214

EE2-4: Communication Systems

Lecture 13: Channel Capacity


Dr. Cong Ling
Department of Electrical and Electronic Engineering

215

Outline
Channel capacity
Channel coding theorem
Shannon formula

Comparison with practical systems
References
Notes of Communication Systems, Chap. 5.5-5.6
Haykin & Moher, Communication Systems, 5th ed., Chap. 10
Lathi, Modern Digital and Analog Communication Systems,
3rd ed., Chap. 15

216

Channel Coding Theorem


Shannon, "A mathematical theory of communication," 1948.
For any channel, there exists an information capacity C
(whose calculation is beyond the scope of this course).
If the transmission rate R ≤ C, then there exists a
coding scheme such that the output of the source can
be transmitted over a noisy channel with an arbitrarily
small probability of error. Conversely, it is not
possible to transmit messages without error if R > C.
Important implication:
The basic limitation due to noise in a communication channel is not
on the reliability of communication, but rather, on the speed of
communication.
217

Shannon Capacity Formula


For an additive white Gaussian noise (AWGN) channel,
the channel capacity is
C = B log2(1 + SNR) = B log2(1 + P/(N0 B))  bps
B: the bandwidth of the channel
P: the average signal power at the receiver
N0: the single-sided PSD of the noise
Important implication: we can communicate error-free
up to C bits per second.
How can we achieve this rate?
Design powerful error-correcting codes to correct as many errors as
possible (the next two lectures).
Use the ideal modulation scheme that does not lose information in
the detection process.
218

Shannon Limit
Tradeoff between power and bandwidth: to get the same
target capacity C, one may
Increase power, decrease bandwidth;
Decrease power, increase bandwidth.
What's the minimum power to send 1 bps (i.e., C = 1 bps)?
From C = B log2(1 + P/(N0 B)) = 1:
P/N0 = B(2^{1/B} − 1)
As B → ∞ (using 2^{1/B} = e^{ln 2 / B} ≈ 1 + (ln 2)/B):
P/N0 → ln 2 ≈ 0.693, i.e., −1.6 dB
[Figure: capacity boundary of 1/B versus P/N0 (dB), separating the
achievable region from the unachievable region.]
This is the ultimate limit of green communications.
For more information, see http://www.greentouch.org
219

Example
A voice-grade channel of the telephone network has a
bandwidth of 3.4 kHz.
(a) Calculate the capacity for a SNR of 30 dB.
(b) Calculate the SNR required to support a rate of 4800 bps.
Answer:
(a) 30 dB ⟹ SNR = 1000
C = B log2(1 + SNR) = 3.4 log2(1 + 1000) ≈ 33.9 kbps
(b)
SNR = 2^{C/B} − 1 = 2^{4.8/3.4} − 1 ≈ 1.66 ≈ 2.2 dB
220
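A sketch reproducing this example with the Shannon formula C = B log2(1 + SNR) (standard library only; not part of the notes):

```python
import math

B = 3.4e3                                   # bandwidth in Hz

# (a) capacity at 30 dB SNR
snr = 10 ** (30 / 10)                       # 30 dB -> 1000
C = B * math.log2(1 + snr)
print(f"C = {C/1e3:.1f} kbps")              # ~33.9 kbps

# (b) SNR needed for 4800 bps
R = 4800
snr_req = 2 ** (R / B) - 1
print(f"SNR = {snr_req:.2f} = {10*math.log10(snr_req):.1f} dB")   # ~1.66, ~2.2 dB
```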

General Model of a Modulated Communication System

221

Input/Output SNR
The maximum rate at which information may arrive at the
receiver is
C_in = B log2(1 + SNR_in)
SNR_in: the predetection signal-to-noise ratio at the input to the
demodulator.
The maximum rate at which information can leave the
receiver is
C_o = W log2(1 + SNR_o)
SNR_o: the SNR at the output of the postdetection filter.
For the ideal modulation scheme:
C_in = C_o ⟹ 1 + SNR_o = [1 + SNR_in]^{B/W}
For high SNR,
SNR_o ≈ SNR_in^{B/W}
It seems that spreading the bandwidth would make the
output SNR increase exponentially (not true).
222

A More Elaborate Analysis


If the channel noise has a double-sided white PSD of N0/2,
the average noise power at the demodulator will be N0 B.
If the signal power is P:
SNR_in = P/(N0 B) = (W/B) · P/(N0 W) = (W/B) · SNR_baseband
SNR_baseband = P/(N0 W): the baseband SNR
Increasing B will reduce SNR_in, and thus
W log2(1 + SNR_o) = B log2(1 + SNR_baseband/(B/W))
Output SNR of an ideal communication system:
SNR_o = (1 + SNR_baseband/(B/W))^{B/W} − 1
      → e^{SNR_baseband} as B/W → ∞
(using lim_{x→∞} (1 + 1/x)^x = e)
223

SNR vs. Bandwidth

[Figure: output SNR_o (dB) versus baseband SNR (dB) for bandwidth
spreading ratios B/W = 1, 2 and 12.]
224

Implications to Analog Systems


A bandwidth spreading ratio of B/W = 1 corresponds to both
SSB and baseband.
A bandwidth spreading ratio of B/W = 2 corresponds to
DSB and AM (the ideal system would have SNR_o ≈ SNR²_baseband / 4).
A bandwidth spreading ratio of B/W = 12 corresponds to
commercial FM broadcasting (the ideal system
would have SNR_o ≈ e^{SNR_baseband}).
SSB and baseband systems provide noise performance
identical to the ideal (which is trivial).
DSB has a worse performance because of its additional
bandwidth requirements.
FM systems only come close to the ideal system near
threshold.
225

Noise Performance of Analog Communication Systems

226

Discussion
Known analogue communication systems do not achieve
the ideal performance in general.
Something must be missing with analogue
communications.
Similarly, it can be shown that simple digital modulation
schemes cannot achieve capacity either.
One must resort to digital communications and coding to
approach the capacity.
The idea of a tradeoff between bandwidth and SNR is useful
in spread-spectrum communications (CDMA, 3G mobile
communications) and ultra wideband.

227

EE2-4: Communication Systems

Lecture 14: Block Codes


Dr. Cong Ling
Department of Electrical and Electronic Engineering

228

Outline
Introduction
Block codes
Error detection and
correction
Generator matrix and
parity-check matrix
References
Haykin & Moher, Communication Systems, 5th ed., Chap. 10
Lathi, Modern Digital and Analog Communication Systems,
3rd ed., Chap. 16
229

Taxonomy of Coding
[Diagram: taxonomy of coding techniques.]
Error control coding – strives to utilize channel capacity by adding extra bits
  Error correction coding (FEC) – no feedback channel
  Error detection coding – used in ARQ as in TCP/IP; feedback channel,
  retransmissions
Source coding – makes bits equally probable
  Compression coding – redundancy removal: destructive (jpeg, mpeg)
  or non-destructive (zip)
Cryptography (ciphering)
Line coding – for baseband communications
FEC: Forward Error Correction
ARQ: Automatic Repeat Request

230

Block vs. Convolutional Codes


Block codes
k input bits → (n, k) encoder → n output bits
Output of n bits depends only on the k input bits.
Convolutional codes
Each source bit influences n(L+1) output bits
L is the memory length
Like convolution in a linear filter
Example: k = 1, n = 2, L = 2, so each input bit influences n(L+1) = 6 output bits.
231

Noise and Errors


Noise can corrupt the information that we wish to transmit.
Generally corruption of a signal is a bad thing that should
be avoided, if possible.
Different systems will generally require different levels of
protection against errors due to noise.
Consequently, a number of different techniques have been
developed to detect and correct different types and
numbers of errors.

232

Channel Model
We can measure the effect of noise in different ways.
The most common is to specify an error probability, p.
Consider the case of a Binary Symmetric Channel
with noise:
0 → 0 with probability 1 − p,  0 → 1 with probability p
1 → 1 with probability 1 − p,  1 → 0 with probability p
All examples considered here will be for error
probabilities that are symmetric, stationary and
statistically independent.
233

Simple Error Checks

If the error probability is small and the information is fairly fault tolerant,
it is possible to use simple methods to detect errors.
Repetition – repeating each bit in the message
If the two symbols in an adjacent pair are different, it is likely that an error
has occurred.
However, this is not very efficient (efficiency is halved).
Repetition provides a means for error checking, but not for error correction.
Parity bit – use of a parity bit at the end of the message
A parity bit is a single bit that corresponds to the sum of the other
message bits (modulo 2).
This allows any odd number of errors to be detected, but not even
numbers.
As with repetition, this technique only allows error checking, not error
correction.
It is more efficient than simple repetition.

234

Block Codes
An important class of codes that can detect and correct
some errors are block codes
The first error-correcting block code was devised by
Hamming around the same time as Shannon was working
on the foundation of information theory
Hamming codes are a particular class of linear block code

Block codes
Encode a series of symbols from the source, a block, into a
longer string: codeword or code block
Errors can be detected as the received coded block will not be
one of the recognized, valid coded blocks
Error correction: To decode and associate a corrupted block to a
valid coded block by its proximity (as measured by the Hamming
distance).
235

Binary Fields and Vectors

We need to discuss some of the mathematics that will be needed.
Fortunately, we have restricted things to binary sources, so the
mathematics is relatively simple.
The binary alphabet A = {0,1} is properly referred to as a Galois field
with two elements, denoted GF(2).
Addition (XOR):
0 + 0 = 0,  0 + 1 = 1 + 0 = 1,  1 + 1 = 0
Multiplication (AND):
0·1 = 0·0 = 1·0 = 0,  1·1 = 1
This is also referred to as Boolean arithmetic in Digital Electronics, or
modulo-2 arithmetic.
A message is built up from a number of binary fields, and forms a
binary vector, rather than a larger binary number.
Hence,
101 ≠ 5
101 = {1}{0}{1}

236

Example

Calculate the following examples of binary field arithmetic:
i.   01001 + 01110
ii.  10010 · 01110
iii. (1111 + 0011) · 0011
Answers
i.   00111
ii.  00010
iii. 1100 · 0011 = 0000

237

Hamming Distance

Hamming Weight
The Hamming weight of a binary vector a (written as wH(a)) is the
number of non-zero elements that it contains.
Hence,
001110011 has a Hamming weight of 5.
000000000 has a Hamming weight of 0.
Hamming Distance
The Hamming distance between two binary vectors, a and b, is
written dH(a,b), and is equal to the Hamming weight of their
(Boolean) sum:
dH(a,b) = wH(a+b)
Hence, 01110011 and 10001011 have a Hamming distance of
dH = wH(01110011 + 10001011) = wH(11111000) = 5
238

Linear Block Codes


A binary linear block code that takes blocks of k bits of
source data and encodes them using n bits is referred to
as an (n, k) binary linear block code.
The ratio between the number of source bits and the number of
bits used in the code, R = k/n, is referred to as the code rate.
The most important feature of a linear block code:
Linearity: the Boolean sum of any codewords must be another
codeword.
This means that the set of code words forms a vector space,
within which mathematical operations can be defined and
performed.

239

Generator Matrix

To construct a linear block code we define a matrix, the
generator matrix G, that converts blocks of source
symbols into longer blocks corresponding to code words.
G is a k×n matrix (k rows, n columns) that takes a
source block u (a binary vector of length k) to a code
word x (a binary vector of length n):
x = uG

240

Systematic Codes
codeword = [k information bits | (n−k) parity-check bits]
Systematic codes: the first k bits are the original
source block.
From linear algebra, a k×n matrix of linearly independent
rows can always be written into the systematic form:
G = [I_{k×k}  P_{k×(n−k)}]
where I_{k×k} is the k×k identity matrix, and P_{k×(n−k)} is a
k×(n−k) matrix of parity bits.
Every linear code is equivalent to a systematic code.
241

Example
Given the generator matrix for a (7,4) systematic code

    [1 0 0 0 0 1 1]
G = [0 1 0 0 1 0 1]
    [0 0 1 0 1 1 0]
    [0 0 0 1 1 1 1]

Calculate the code words for the following source blocks:
i.   1010 [Answer: x = 1010101]
ii.  1101 [Answer: x = 1101001]
iii. 0010 [Answer: x = 0010110]
iv.  Calculate the Hamming distances between each pair of code
words generated in parts (i) to (iii), and compare them to the
Hamming distances for the original source blocks. [Answer: 4,
7, 3, larger than 3, 4, 1]
242
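A sketch (assuming NumPy; not part of the notes) of encoding with this generator matrix, with all arithmetic taken modulo 2:

```python
import numpy as np

G = np.array([[1, 0, 0, 0, 0, 1, 1],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 1, 1, 0],
              [0, 0, 0, 1, 1, 1, 1]])

def encode(u):
    """Codeword x = uG over GF(2)."""
    return np.mod(np.array(u) @ G, 2)

for u in ([1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]):
    print(u, "->", encode(u))   # 1010101, 1101001, 0010110 as in the example
```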

Error Detection

To determine the number of errors a particular code can detect and
correct, we look at the minimum Hamming distance between any two
code words.
From linearity, the zero vector must be a code word.
We define the minimum distance between any two code words to be
dmin = min{dH(a,b), a,b ∈ C} = min{dH(0, a+b), a,b ∈ C}
     = min{wH(c), c ∈ C, c ≠ 0}
where C is the set of code words.
The number of errors that can be detected is then (dmin − 1), since dmin
errors can turn an input code word into a different but valid code word.
Fewer than dmin errors will turn an input code word into a vector that is
not a valid code word.
not a valid code word.

243

Error Correction
The number t of errors that can be corrected is simply
the number of errors that can be detected divided by two
and rounded down to the nearest integer, since any
received vector with fewer than this number of errors will be
nearer to the transmitted code word than to any other:
t = ⌊(dmin − 1)/2⌋,  i.e., dmin ≥ 2t + 1
244

Parity Check Matrix

Decoding a block coded vector is more complicated.
Suppose the generator matrix is of the form
G = [I_{k×k}  P_{k×(n−k)}]
To check for errors, we define a new matrix, the parity check matrix H.
The parity check matrix is an (n−k)×n matrix defined so that
it produces a zero vector when there is no error in the received code
vector x:
x Hᵀ = 0                                   (14.1)
where Hᵀ is the transpose of H.
To satisfy this condition, it is sufficient to write the parity check matrix
in the form
H = [(P_{k×(n−k)})ᵀ  I_{(n−k)×(n−k)}]
The minimum Hamming distance is equal to the smallest number of
columns of H that are linearly dependent.
This follows from the condition (14.1). See Haykin, Chap. 10 for proof.
245

Syndrome
If the received coded block y contains errors, then the
product of the received block with the transpose of the
parity check matrix will not be zero:
y Hᵀ ≠ 0
Writing y = x + e, which is the sum of the original coded
block x and the error e, we find that
y Hᵀ = e Hᵀ
This value is referred to as the syndrome, s = y Hᵀ = e Hᵀ.
The syndrome is a function of the error only, and contains
the information required to isolate the position (or positions)
of the error (or errors).
246

Syndrome Decoding
s is an (n−k) row vector, taking 2^{n−k} − 1 possible nonzero values (the
zero vector corresponds to no error).
This means that it is necessary to calculate and store 2^{n−k} − 1
syndromes as a look-up table to be able to pin-point the
positions of the errors exactly.
Problem: this is impractical for large values of (n − k).
Decoder chain: received vector y → parity-check matrix H gives the
syndrome s → syndrome table gives the error vector e → decoded
codeword x = y + e.
247

Summary
A linear block code can be defined by a generator matrix
G and the associated parity-check matrix H.
Every linear block code is equivalent to a systematic code.
The key parameter of a linear block code is the minimum
Hamming distance dmin:
Up to dmin − 1 errors can be detected;
Up to ⌊(dmin − 1)/2⌋ errors can be corrected.
Syndrome decoding: compute the syndrome and then find
the error pattern.
Only practical for short codes.

248

EE2-4: Communication Systems

Lecture 15: Cyclic Codes


Dr. Cong Ling
Department of Electrical and Electronic Engineering

249

Outline
Hamming codes
A special type of cyclic codes that
correct a single error

Cyclic codes
Can correct more errors
The most important
p
class of block codes
Implementation takes advantage
of polynomial multiplication/division

References
Haykin
ay
& Moher,
o e , Communication
Co
u cat o Systems,
Syste s, 5t
5th ed
ed.,, C
Chap.
ap 10
0
Lathi, Modern Digital and Analog Communication Systems,
3rd ed., Chap. 16

250

Hamming Codes
Hamming codes are a class of linear block codes that
g error. Theyy satisfy
y the condition
can correct a single
r = n k = log2(n+1) n = 2r 1, k = 2r 1 r.
From this expression, it is easy to see that the first few
Hamming codes correspond to
(n, k) = (7,4), (15,11), (31,26),
They are easy to construct and are simple to use.
For all Hamming codes, dmin = 3.
All (correctable) error vectors have unit Hamming weight, and the
syndrome associated with an error in the ith column of the vector
is the ith row of HT .

Columns of H are binary representations of 1, , n.

251

Syndrome Table

For the (7,4) Hamming code, the parity check matrix is


0 1 1 1 1 0 0
H 1 0 1 1 0 1 0
1 1 0 1 0 0 1

G
Generator
t matrix
ti

1
0
G
0

0 0 0 0 1 1
1 0 0 1 0 1
0 1 0 1 1 0

0 0 1 1 1 1

The corresponding syndrome table is:

s eH T

000

0000000

001

0000001

010

0000010

100

0000100

111

0001000

110

0010000

101

0100000

011

1000000

When a coded
Wh
d d vector
t iis received,
i d th
the syndrome
d
iis calculated
l l t d and
d any
single error identified, and corrected by exchanging the relevant bit
with the other binary value However, problems can occur if there is
more than one error.
252

Example
Consider the (7,4) Hamming code. If the code vector
1000011 is sent while 1000001 and 1001100 are received,,
decode the information bits.
Answer:
The first vector
s=(1000001)HT=(010)
e=(0000010)
x=(1000001)+(0000010)=(1000011)
u=(1000)
correct (as there is one error)
The second vector
s=(1001100)HT=(000)
error-free
x=(1001100)
(
)
wrong
g ((as there are 4 errors))
u=(1001)
253
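A sketch (assuming NumPy; not part of the notes) of syndrome decoding for the (7,4) Hamming code with the parity-check matrix H used in this lecture. It reproduces both received vectors of the example: the single error is corrected, while the four-error vector looks error-free and is decoded wrongly.

```python
import numpy as np

H = np.array([[0, 1, 1, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [1, 1, 0, 1, 0, 0, 1]])

def decode(y):
    """Correct at most one error: the syndrome equals the column of H
    at the error position (all arithmetic modulo 2)."""
    y = np.array(y)
    s = np.mod(H @ y, 2)                         # syndrome s = y H^T
    if s.any():
        pos = np.where((H.T == s).all(axis=1))[0][0]   # matching column of H
        y[pos] ^= 1                              # flip the erroneous bit
    return y

print(decode([1, 0, 0, 0, 0, 0, 1]))   # -> 1000011 (one error, corrected)
print(decode([1, 0, 0, 1, 1, 0, 0]))   # -> 1001100 (4 errors, undetected)
```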

Cyclic Codes

Cyclic codes are a subclass of linear block codes offering larger


Hamming distances, thereby stronger error-correction capability.
Whereas the principal property of the simple linear block code is
that the sum of any two code words is also a code word, the cyclic
property:
p y a cyclic
y
shift of any
y code word
codes have an additional p
is also a code word.
A cyclic shift of a binary vector is defined by taking a vector of
length n,
a = [a0, a1, , an-1]
g g the elements,
and rearranging
a = [an-1, a0, a1, , an-2]
A code is cyclic if:
(c0, c1, , cn-1) C (cn-1, c0, , cn-2) C

254

Generator Matrices
A cyclic code is still a linear block code, so all of the
properties
p
p
p
previously
y discussed hold for cyclic
y
codes.
They are constructed by defining a generator matrix, and
an associated parity check matrix and are decoded using
syndromes in exactly the same way as the other linear
block codes that we have discussed.
A generator matrix is defined in the same way as before,
except that the rows are now cyclic shifts of one ndimensional basis vector
vector.
g0
0
G

g1 g n k

g0

g1

g n k

g0

g1

0
0

g n k

255

Encoding

The cyclic property leads to a very useful property: they can be


represented simply (mathematically and in hardware) by polynomial
operations.
operations
We start by looking at the code word x generated by a source block
u,
x = u G = [x0, x1,, xn-1]
With the generator matrix in the cyclic (non-systematic) form given in
the previous slide
slide, the elements of the code words are
are,
x0 = u0g0
x1 = u0g1+ u1g0

xn-1 = uk-1gn-k
These elements take the general form,
k 1

xl ui gl i
i 0

256

Polynomial Representation

Taking a binary vector x, it is possible to represent this as a


y
in z ((over the binary
y field))
polynomial
x = [ x0 , x1 , , xn1 ] x (z) = x0 zn1 + x1 zn2 ++ xn1
Representing the original code word and the n-dimensional basis
vector for the cyclic code as polynomials,
polynomials
u = [u0 , u1 , , uk 1 ] u (z) = u0 zk 1 + u1 zk 2 ++uk 1
g = [g0 , g1, , gnk ,0,,0]
g (z) = g0 znk + g1znk1 ++ gnk
We notice that the product of these two polynomials is exactly the
same as the
h polynomial
l
i l representation
i off the
h corresponding
di code
d
vector.
Nonsystematic
y
Encoding:
g x(z) = u(z) g(z)
This means that the problem of matrix-vector multiplication can be
reduced to a problem of polynomial multiplication.

257

Advantage
The main advantage is that it simplifies the cyclic shift
operation and thereby simplifies the hardware
operation,
implementation of the code considerably.
Multiplication of the code word by z shifts all of the
coefficients along by one, and replacing the term (x0zn)
byy a term ((x0z0)), g
gives a cyclic
y
shift.
A simpler way to achieve the same result is to multiply
the polynomial by z, divide by zn-1, and take the
remainder term, which can be written as
(z x(z)) mod(zn 1)

258

Parity Check Polynomials

One property important to cyclic codes is the ability to factorise


polynomials. Given a polynomial of the form a(z) = zn1 = zn+1 (n > 1),
it is always possible to find two polynomials
polynomials, such that
that,
zn+1 = g(z) h(z)
g g(z) to be a g
generator p
polynomial,
y
, this condition is sufficient for
Taking
a resultant code to be an (n,k) cyclic code, where:
g(z) is a polynomial of degree nk = r.
h(z) is
i a polynomial
l
i l off d
degree k.
The other factor, the polynomial h(z), turns out to be a parity check
polynomial
p
y
, playing
p y g the same role as the p
parity
y check matrix.
If we have a valid code word, x(z), the product of the code word with
h(z) is zero (modulo zn+1), and if we have a code word contained an
error,
error
(y(z)h(z))mod(zn+1) = [(x(z)+e(z))h(z)]mod(zn+1)
)]mod((zn+1))
= [e(z)h(z)]
which is called the syndrome polynomial, s(z).
259

Encoding/Decoding
Encoding: x(z)=u(z)g(z);
Parity check polynomial:
h(z) =[zn+1] /g(z)
Decoding: (y(z) is the received vector)
Calculate the syndrome polynomial:

s(z)=(y(z)h(z))mod(z
s(z)
(y(z)h(z))mod(zn+1)
1)
Look up the syndrome table to get e(z) from s(z);
x(z)=y(z)+e(z);
u(z)=x(z)/g(z);

Thi
This iis th
the en/decoding
/d
di procedure
d
ffor non-systematic
t
ti
codes. A modified version works for systematic codes
(non-examinable)
(non-examinable).
260
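A sketch of the non-systematic encoding step x(z) = u(z) g(z) as binary polynomial multiplication (assumes NumPy; the generator g(z) = z³ + z + 1 is a common choice for the (7,4) cyclic Hamming code and is an assumption here, not taken from these slides):

```python
import numpy as np

def gf2_polymul(a, b):
    """Multiply two binary polynomials (coefficient lists, highest power first)."""
    return np.mod(np.polymul(a, b), 2).astype(int)

g = [1, 0, 1, 1]          # g(z) = z^3 + z + 1 (assumed generator polynomial)
u = [1, 0, 1, 0]          # u(z) = z^3 + z     (example message polynomial)
x = gf2_polymul(u, g)     # codeword polynomial x(z) = u(z) g(z), degree <= 6
print(x)                  # seven coefficients of the resulting codeword
```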

Hardware Implementation

Binary polynomial multiplication: multiply-by-g(z)


a(z)
g0

g2

g1

g3

b(z)

g(z)=
) g0+g1z+g2z2+g3z3, g0=g3=1;
1
Binary polynomial division: divide-by-g(z)

g0

g1

g2
b(z)

a(z)
g(z)=g0+g1z+g2z2+g3z3, g0=g3=1;

261

Examples of Cyclic Codes

Hamming code (used in computer memory)


Cyclic redundancy check (CRC) code (used in Ethernet, ARQ etc.)
B
Bose-Chaudhuri-Hocquenghem
Ch dh i H
h
(BCH) code
d
Reed-Solomon (RS) code (widely used in CD, DVD, hard disks,
wireless, satellites etc.))

SNR per bit (dB)

262

Applications of Coding
The first success was the application of convolutional
p space
p
p
probes 1960s-70s.
codes in deep
Mariner Mars, Viking, Pioneer missions by NASA

Voyager, Galileo missions were further enhanced by


concatenated codes (RS + convolutional).
The next chapter was trellis coded modulation (TCM) for
voice-band modems in 1980s.
1990s saw turbo codes approached capacity limit (now
used
d iin 3G)
3G).
Followed by another breakthrough space-time codes in
2000s
2000
s (used in WiMax
WiMax, 4G)
The current frontier is network coding which may widen the
bottleneck of Internet
Internet.
263

264

265

266

Trigonometric Identities

EE2-4: Communication Systems

Revision Lecture
Dr. Cong Ling
Department of Electrical and Electronic Engineering

267

Lectures
Introduction and background
1. Introduction
2. Probability and random
processes
3 Noise
3.
Effects of noise on analog
communications
4. Noise performance of DSB
5. Noise performance of SSB and
AM
6. Noise performance of FM
p
for FM and
7. Pre/de-emphasis
comparison of analog systems

Digital communications
8. Digital representation of signals
9. Baseband digital transmission
10. Digital modulation
11 Noncoherent
11.
N
h
td
demodulation
d l ti
Information theory
12 Entropy and source coding
12.
13. Channel capacity
14. Block codes
15. Cyclic codes

268

The Exam
The exam paper contains 3 questions. All questions are
p
y
compulsory.
For example, the questions may look like
Question 1 (40 marks): Basic knowledge of communication
systems, elements of information theory and/or coding, mostly
bookwork
Question 2 (30 marks): Analog or digital communications
communications,
bookwork, new example/application/theory
Question 3 (30 marks): Digital or analog communications or
i f
information
i theory/coding,
h
/ di
b
bookwork,
k
k new
example/application/theory

Sample questions:
Past papers
Problems in classes
269

Introduction, Probability and Random


Processes
Primary resources in communications: power, bandwidth,
cost
Objectives of system design: reliability and efficiency
Performance measures: SNR or bit error probability
Probability distribution: Uniform distribution, Gaussian
distribution, Rayleigh
y g distribution, Ricean distribution
Random process: stationary random process, autocorrelation and power spectral density, Wiener-Khinchine
relation

S X ( f ) RX ( )e j 2f d

270

Noise
Why is noise important in communications? How does
performance? What types
yp of noise exist?
noise affect the p
White noise: PSD is constant over an infinite bandwidth.
Gaussian noise: PDF is Gaussian.
Additive white Gaussian noise
Bandlimited noise, bandpass representation, baseband
noise nc(t) and ns(t), power spectral density
n(t ) nc (t ) cos((2f ct ) ns (t ) sin(
i ( 2f ct )
S ( f f c ) S N ( f f c ), | f | B
Sc ( f ) S s ( f ) N
otherwise
0,

271

Noise performance of AM
Signal-to-noise ratio

SNR

PS
PN

Baseband communication model

SNRbaseband

PT
N 0W

AM, DSB-SC, SSB, synchronous detection, envelope


detection
Output SNR

272

Noise performance of FM
FM modulator and demodulator
Method to deal with noise in FM: linear argument at high
SNR
Derivation of the output SNR, threshold effect

Pre-emphasis and de-emphasis, how they increase the


output SNR
273

Digital communications
PCM: sample, quantize, and encode
3P
Quantization noise and SNR SNRo (dB) 6n 10 log10 2 (dB)
mp

Companding (A/-law) and line coding


Baseband data transmission, effects of noise, and
probability of error

274

Noise performance of bandpass digital


communications
Modulation formats: ASK, FSK, PSK
Coherent detection and its error probability
Noncoherent detection and its error probability (including
differential detection for DPSK)
Q-function
Q function Q(x): computation by using the graph or
approximation

1
x2 / 2
Q( x)
e
,x0
2 x
275

Performance of digital modulation


Scheme

Bit Error Rate

Coherent ASK

Q(A/2)

Coherent FSK

Q(A/2)

Coherent PSK

Q(A/)

Noncoherent
ASK

exp(
exp(-A
A2/82)

Noncoherent
FSK

p( 2/42)
exp(-A

DPSK

exp(-A2/22)

Noncoherent
ASK, FSK
Coherent
ASK, FSK

Coherent PSK

Caution: ASK and FSK have the same bit


error rate
t if measured
d by
b average SNR
SNR.
Average SNR, dB
276

Information theory
The entropy of a discrete memoryless information source
K

H ( S ) pk log2 pk
k 1

Entropy function (entropy of a binary memoryless source)


H ( S ) (1 p ) log 2 (1 p ) p log 2 p H ( p )

Source coding theorem: The minimum average codeword


length for any source coding scheme is H(S) for a discrete
memoryless source.
Huffman coding: An efficient source coding algorithm.

277

Channel coding theorem


If the transmission rate R C, then there exists a
coding
g scheme such that the output
p of the source can
be transmitted over a noisy channel with an arbitrarily
small probability of error. Conversely, it is not
possible to transmit messages without error if R > C.
Shannon formula

P
C B log 2 1 SNR B log 2 1

N
B
0

278

Channel coding
Block vs. convolutional codes
Binaryy fields and vector space,
p
, Hamming
g distance/weight
g
A linear block code can be defined by a generator matrix
G and the associated parity-check matrix H.
Every linear block code is equivalent to a systematic code.
The key parameter of a linear block code is the minimum
Hamming distance dmin:
Up to dmin 1 errors can be detected;
Up
p to (d
( min 1)/2
) errors can be corrected.

Syndrome decoding: compute the syndrome and then find


the error pattern.
Hamming codes
Cyclic codes: (polynomial representation is not
examinable)
279

Appendix:
More Background on Probability

Probability

Sample Space
S : Set of all random experiment outcomes
S = { s : s iis an outcome
t
}

Examples
For tossing a coin, S = { H, T }
Roll a die, S = { 1, 2, , 6 }

Event E S
Roll a die twice: S = { (H,H), (H,T), (T,H), (T,T) }
Event E = { (H,H), (T,T) }
A collection of events, F. Obviously, F S

A probability measure on F is a function P : F [0,1]


[0 1] satisfying the
probability axioms:
P(S) = 1
P(F) 0
For events A, B belonging to F, if A B=0, P(A U B) = P(A) + P(B)

281

Random Variables

A random variable X(s) is a variable whose value depends on


the outcome s of a random experiment with the defined
probabilityy measure.

For convenience, we simply denote the random variable by X.

Consider a gamble in Macau (an example of discrete random


variables):

Loss Draw Win


P=3/8 P=1/4 P=3/8

-$5
$5

$5

X (s) 0
5

s W
sD

S W D L

sL

P ( X ( s ) 5) P ( s : s W ) P (W )

3
8

P ( X ( s ) 0) P ( s : s D ) P ( D )

1
4

X(s) P ( X ( s ) 5) P ( s : s L ) P ( L )

3
8

Continuous random variables: take values that vary


continuously, e.g., water flow of River Thames through London
Bridge
282

CDF and pdf

Cumulative Distribution Function (CDF), also known as Probability


Distribution Function
Probability Density Function (pdf)

CDF : F X ( x ) P ( X x )
pdf : f X ( x )

dF X ( x )
dx

FX ( x )

f X ( y ) dyy

FX ( )

f X ( y ) dyy 1

P ( a X b ) FX ( b ) FX ( a )

f X ( y ) dy

Since F X ( x ) is non - decreasing


f X ( x)

dF X ( x )
dx

0
283

Interpretation of pdf

If x is sufficiently small,

P ( x X x x )

f X ( y)

x x

f X ( y ) dyy f X ( x ) x

Area

f X ( x)

Expectation operator (expected value or average):

E[ X ] X

Variance:

y f X ( y ) dy

X2 E[( X X )2 ] ( y X )2 f X ( y)dy E[ X 2 ] X2

284

Moments of a Random Variable

Moments: r E [ X r ]
Average (mean) of X:

for r 1 ,2 ,3 ,...

1 E [ X ] or m X

Central moments,, centered on the mean

r ' E [| X m X |r ]

for r 1 ,2 ,3 ,...

C
Comments

1 ' E[ X m X ] E[ X ] m X 0
Variance : 2 ' E[| X m X |2 ]
Standard deviation : X 2 '
which gives a measure of dispersion of X about its mean

285

Examples of Discrete Distributions

Let p + q = 1

Bernoulli distribution
P ( X 1) p

P ( X 0) q

with p representing the probability of success in an experiment

Bi
Binomial
i l distribution
di t ib ti
n
P (Y k ) p k q n k
k

k 0 ,1, 2 ,..., n

Y represents the total number of successes, in an independent


p
trial of n Bernoulli experiments

Exercise: Verify

E [ X ] p , X2 pq
E [Y ] np , Y2 npq
286

Examples of Continuous Distributions:


Exponential Distribution

f X ( x)

f X ( x ) e x

E
Exercise:
i
V if
Verify

for x 0

E(X )

X2

287

Normal (Gaussian) Distribution

f X ( x)

f X ( x)
FX ( x )

1
2
1
2

( x m )2

2 2

for x

( y m )2

2 2

dy

Exercise: Verify E ( X ) m

2
288

Rayleigh and Rice Distributions

Define a random variable R

where X and Y are

independent Gaussian with zero mean and variance 2

R has Rayleigh distribution:

f R (r )

r 2 /( 2 2 )

r0

If X has nonzero mean A, R has Rice distribution:

f R (r )
where

I0 ( x)

1
2

e
2
r

( r 2 A2 ) /( 2 2 )

e x cos d

I 0 ( Ar2 ) r 0

is the modified zero-order

Bessel function of the first kind

289

Conditional Probability

Consider two events A and B, not necessarily independent of each


other
Define P(A|B) = Probability of A given B

Carry out N independent trials (experiments) and count


N(A) = Number of trials with A outcome
N(A,B) = Number of trials with both A and B outcome

By definition
N ( A)
as
N
N ( A,B )
B ) N (B)

P ( A)
P(A |

P(A B)
Similarly,

N ( A,B )
N

N ( A,B ) N ( A)
N ( A)
N

P ( B | A)P ( A)

P( A B ) P( A | B )P(B )

Statistical independence between A and B

P( A | B) P( A) P( A B) P( A)P(B)
290

Joint Random Variables

Random variables :
y

FXY ( x , y )

X ,Y

XY

( u , v ) dudv

f XY ( x , y ) : the joint pdf


FXY ( x , y ) : the joint CDF

Properties of joint distribution:


1) FXY ( , ) f XY (u, v )dudv 1


2a )
2b )

f X ( x)
fY ( x )

f XY ( x, y )dy

f XY ( x, y )dx

2 FXY ( x , y )
xy

3)

f XY ( x, y )

4)

X , Y are independen t f XY ( x, y ) f X ( x ) fY ( y )
291

Conditional CDF and pdf

Define FY ( y | x ) as the conditional CDF for Y given X = x.


y x x
By conditional probability
probability,
f XY ( u , v ) dudv

FY ( y | x X x x ) P ( YP (xy, xXXx xx ) x ) xx x
f X ( u ) du
x

f XY ( x ,v ) xdv

f X ( x ) x

As x 0,

FY ( y | x )

f XY ( x ,v ) dv

fX (x)

The conditional pdf

fY ( y | x )

dFY ( y | x )
dy

fY ( y | x )

f XY ( x , y )
fX (x)

292

Joint Distribution Function of Several Random Variables

The joint PDF of n random variables

X 1 ,..., X n
FX 1 X 2 ... X n ( x1 , x2 ,... xn ) P ( X 1 x1 , X 2 x2 ,... X n xn )

The joint pdf

f X1 X 2 ... X n ( x1 , x2 ,...xn )

n FX1X 2 ... X n ( x1 , x2 ,... xn )


x1x2 ...xn

Independent random variables

FX1X2 ...Xn (x1, x2 ,...xn ) FX1 (x1)FX2 (x2 )...FXn (xn )


f X1X2 ...Xn (x1, x2 ,,...xn ) f X1 (x1) f X2 (x2 ))...f Xn (xn )

Uncorrelated random variables

E[ X i X j ] E [ X i ]E [ X j ] i, j , i j
293

Covariance and Correlation Coefficient

Covariance of X and Y:

cov( X , Y ) E [( X m X )( Y mY )]
Correlation Coefficient:
X ,Y )
XY cov(

Property 1: 1

E [( X m X )(Y mY )]

XY

XY 1

Property 2: X and Y are linearly related ( i .e., Y aX b )


if and only if XY 1
Property 3: If X and Y are independent, XY 0
Caution: The converse of Property 3 is not true in general.
That is, if XY 0 , X and Y are not necessarily
independent!
294

Joint Gaussian distribution

X, Y jointly normally distribute d with m X mY 0, X Y


f XY ( x , y )

2 2

1
2
1 XY

exp

1
2
2 (1 XY
)
2

( x 2 2 XY xy y 2 )

The marginal density :

fY ( y )

XY

( x , y ) dx

1
2

exp{

Conditiona l density function :


f X |Y ( x | y )

f XY ( x , y )
fY ( y )

1
2
2 1 XY

y2
2 2

exp

1
2
2 (1 XY
)
2

( x XY y ) 2

Note that this conditional pdf is also a normal density function with
2
2
mean XY y and variance (1 XY ) . The effect of the condition
(having Y=y on X) is to change the mean of X to XY y and to
2
reduce
d
th
the variance
i
b
by 2 XY
.
295

Independent vs. Uncorrelated


Independent implies Uncorrelated
Uncorrelated does not imply Independence
For jointly Gaussian random variables, Uncorrelated
implies Independent (this is the only exceptional case!)
Exercise: verify the above claim of jointly Gaussian
random variables

296

An example of uncorrelated but dependent random variables

Let be uniformly distributed in [0,2 ]

f ( x) 21 for 0 x 2
Define random variables X and Y as

X cos Y sin

Clearly, X and Y are not independent. In particular, for a given ,


X and Y are dependent.
If X and Y were independent, we
should see possible sample points of
(X,Y) assume all possible values of X
Y
Locus of
and Y in a unit square.
q
X and Y
But X and Y are uncorrelated as

XY E[( X m X )(Y mY )]
X

E [ XY ]
2

21 0 cos sin d
0!
297

Central Limit Theorem

Define S n X i
i 1

where X i ' s are i.i.d. with

E[ X ] ,

Define a new random variable,, R n

X2 2
S n n
n

Central Limit Theorem :


x

lim P ( R n x )
n

1
2

y2 / 2

dy

Importance: The shifted and scaled sum of a very large number


of i.i.d. random variables has a Gaussian distribution

298
