6 LossyCompression

CMPT 365 Multimedia Systems
Lossy Compression
Spring 2015
CMPT365 Multimedia Systems 1

Lossless vs Lossy Compression
 If the compression and decompression processes induce no information loss, then the compression scheme is lossless; otherwise, it is lossy.
 Why is lossy compression possible?
[Figure: the original image next to versions compressed at ratios 7.7, 12.3, and 33.9]

Outline
 Quantization
 Uniform
 Non-uniform
 Transform coding
 DCT

Quantization
 The process of representing a large (possibly infinite)
set of values with a much smaller set.
 Example: A/D conversion
 An efficient tool for lossy compression
 Review …
Encoder: Transform → Quantization → Entropy coding → channel
Decoder: channel → Entropy decoding → Inverse quantization → Inverse transform

Review: Basic Idea
[Figure: input value x → quantizer (bins 0 to 4) → bin index → entropy coding → entropy decoding → bin index → dequantizer (inverse quantizer) → reconstruction value x̂]
 Quantization is a function that maps each input interval (bin) to one integer index.
 It can reduce the number of bits required to represent the source.
 The reconstructed result is generally not equal to the original input.
 Terminology:
• Decision boundaries $b_i$: the bin boundaries.
• Reconstruction levels $y_i$: the output value that the dequantizer assigns to each bin.

Uniform Quantizer
 All bins have the same size, except possibly the two outer intervals:
• The $b_i$ and the $y_i$ are evenly spaced.
• The spacing of both is ∆ (the step size).
• For inner intervals, each reconstruction level is the midpoint of its bin: $y_i = \frac{1}{2}(b_{i-1} + b_i)$.
 Uniform midrise quantizer: even number of reconstruction levels (±0.5∆, ±1.5∆, ±2.5∆, ±3.5∆ in the figure), with decision boundaries at multiples of ∆. 0 is not a reconstruction level.
 Uniform midtread quantizer: odd number of reconstruction levels (0, ±∆, ±2∆, ±3∆ in the figure), with decision boundaries at ±0.5∆, ±1.5∆, ±2.5∆. 0 is a reconstruction level.
[Figure: input vs. reconstruction staircase plots for the midrise and midtread quantizers]
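The two staircases can be compared numerically. The following is a minimal sketch, assuming an unbounded quantizer (no special outer intervals); the function names are illustrative and do not come from the slides.

```python
import math

def midrise_reconstruct(x, delta):
    """Midrise: boundaries at multiples of delta, levels at odd multiples of delta/2."""
    return (math.floor(x / delta) + 0.5) * delta

def midtread_reconstruct(x, delta):
    """Midtread: levels at integer multiples of delta, so 0 is a level."""
    return math.floor(x / delta + 0.5) * delta

delta = 1.0
for x in (-1.2, -0.2, 0.2, 0.7, 1.6):
    # Print the input next to both reconstructions.
    print(x, midrise_reconstruct(x, delta), midtread_reconstruct(x, delta))
```

Small inputs such as 0.2 are pushed to ±0.5∆ by the midrise quantizer but reconstructed as 0 by the midtread quantizer. Inputs that fall exactly on a decision boundary may be assigned differently by other, equally valid, conventions.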
Midtread Quantizer
[Figure: midtread staircase with reconstruction levels 0, ±∆, ±2∆, ±3∆ and decision boundaries ±0.5∆, ±1.5∆, ±2.5∆]
 Quantization mapping (the output is an index):
$$q = A(x) = \operatorname{sign}(x) \left\lfloor \frac{|x|}{\Delta} + 0.5 \right\rfloor$$
 Example: x = -1.8∆ gives q = -2.
 De-quantization mapping:
$$\hat{x} = B(q) = q\Delta$$
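The midtread mapping and its de-quantizer translate directly into code. This is a small sketch; the names `quantize` and `dequantize` and the step size 0.5 are assumptions chosen for illustration.

```python
import math

def quantize(x, delta):
    """A(x) = sign(x) * floor(|x| / delta + 0.5); returns the integer bin index."""
    sign = -1 if x < 0 else 1
    return sign * math.floor(abs(x) / delta + 0.5)

def dequantize(q, delta):
    """B(q) = q * delta; returns the reconstruction level of bin q."""
    return q * delta

delta = 0.5                      # hypothetical step size
x = -1.8 * delta                 # the example input from the slide
q = quantize(x, delta)
x_hat = dequantize(q, delta)
print(q, x_hat, x - x_hat)       # index, reconstruction, quantization error
```

For inputs inside the quantizer range, the error magnitude never exceeds ∆/2.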

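Model of Quantization
[Figure: x → A → q → B → x̂]
 Quantization: $q = A(x)$
 Inverse quantization: $\hat{x} = B(q) = B(A(x)) = Q(x)$
 B is not exactly the inverse function of A, because in general $\hat{x} \neq x$.
 Quantization error: $e(x) = x - \hat{x}$
 Combining quantizer and de-quantizer: $\hat{x} = Q(x)$, or equivalently $\hat{x} = x - e(x)$, i.e. the input with the error $-e(x)$ added.
[Figure: x → Q → x̂, or x → adder with second input -e(x) → x̂]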
Rate-Distortion Tradeoff
 Things to be determined:
• Number of bins
• Bin boundaries
• Reconstruction levels
[Figure: distortion (vertical axis) vs. rate (horizontal axis) curve with two operating points, A and B]
 A tradeoff between rate and distortion:
• To reduce the size of the encoded bits, we need to reduce the number of bins.
• Fewer bins → larger reconstruction errors.

Measure of Distortion
 Quantization error: $e(x) = x - \hat{x}$
 Mean Squared Error (MSE) for quantization:
• The average squared quantization error over all input values.
• Requires the probability distribution f(x) of the input.
 Number of bins: M
 Decision boundaries: $b_i$, i = 0, …, M
 Reconstruction levels: $y_i$, i = 1, …, M
 Reconstruction: $\hat{x} = y_i$ iff $b_{i-1} < x \le b_i$
 MSE:
$$\mathrm{MSE}_q = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{i=1}^{M} \int_{b_{i-1}}^{b_i} (x - y_i)^2 f(x)\,dx$$
 This equals the variance of e(x) if $\mu_e = E\{e(x)\} = 0$ (zero mean).
 Definition of variance: $\sigma_e^2 = \int_{-\infty}^{\infty} (e - \mu_e)^2 f(e)\,de$
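The MSE sum can be evaluated numerically for any pdf with bounded support. The sketch below uses a midpoint-rule integration; the helper name `quantizer_mse`, the uniform pdf on [-1, 1], and the 4-bin quantizer are all assumptions chosen for illustration.

```python
def quantizer_mse(pdf, boundaries, levels, steps=2000):
    """Approximate sum_i integral_{b_{i-1}}^{b_i} (x - y_i)^2 f(x) dx (midpoint rule).

    boundaries holds b_0..b_M and levels holds y_1..y_M (stored 0-based).
    """
    total = 0.0
    for i, y in enumerate(levels):
        lo, hi = boundaries[i], boundaries[i + 1]
        h = (hi - lo) / steps
        for k in range(steps):
            x = lo + (k + 0.5) * h          # midpoint of the k-th sub-interval
            total += (x - y) ** 2 * pdf(x) * h
    return total

def uniform_pdf(x):
    return 0.5                               # uniform on [-1, 1]

boundaries = [-1.0, -0.5, 0.0, 0.5, 1.0]     # 4 bins, step size 0.5
levels = [-0.75, -0.25, 0.25, 0.75]          # bin midpoints
mse = quantizer_mse(uniform_pdf, boundaries, levels)
print(mse)                                   # compare with 0.5**2 / 12
```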

Rate-Distortion Optimization
 Two Scenarios:
 Given M, find bi and yi that minimize the MSE.
 Given a distortion constraint D, find M, bi and yi such that
the MSE ≤ D.

Outline
 Quantization
 Uniform
 Non-uniform
 Vector quantization
 DCT

Uniform Quantization of a Uniformly Distributed Source
 Input X: uniformly distributed in $[-X_{max}, X_{max}]$: $f(x) = 1 / (2X_{max})$
 Number of bins: M (even for a midrise quantizer)
 The step size is easy to get: $\Delta = 2X_{max} / M$.
 $b_i = (i - M/2)\Delta$
 Example with M = 8:
• Decision boundaries $b_0, …, b_8$: -4∆, -3∆, -2∆, -∆, 0, ∆, 2∆, 3∆, 4∆ (with $b_0 = -X_{max}$ and $b_8 = X_{max}$)
• Reconstruction levels $y_1, …, y_8$: -3.5∆, -2.5∆, -1.5∆, -0.5∆, 0.5∆, 1.5∆, 2.5∆, 3.5∆
 e(x) is uniformly distributed in [-∆/2, ∆/2].
[Figure: sawtooth plot of e(x) against x over [-4∆, 4∆], ranging between -0.5∆ and 0.5∆]
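Uniform Quantization of a Uniformly Distributed Source
 MSE:
$$\mathrm{MSE}_q = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{i=1}^{M} \int_{b_{i-1}}^{b_i} (x - y_i)^2 f(x)\,dx$$
$$= M \cdot \frac{1}{2X_{max}} \int_{0}^{\Delta} \left(x - \frac{\Delta}{2}\right)^2 dx = \frac{M}{2X_{max}} \cdot \frac{\Delta^3}{12} = \frac{\Delta^2}{12}$$
 The last step uses $M\Delta = 2X_{max}$.
 As M increases, ∆ decreases and the MSE decreases.
 This is the variance of a random variable uniformly distributed in [-∆/2, ∆/2]:
$$\sigma_q^2 = \int_{-\Delta/2}^{\Delta/2} (x - 0)^2 \frac{1}{\Delta}\,dx = \frac{\Delta^2}{12}$$
 Optimization: find M such that MSE ≤ D:
$$\frac{\Delta^2}{12} \le D \;\Rightarrow\; \frac{1}{12}\left(\frac{2X_{max}}{M}\right)^2 \le D \;\Rightarrow\; M \ge X_{max}\sqrt{\frac{1}{3D}}$$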
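The bound on M is easy to check with a few lines of code. This is a sketch with hypothetical values ($X_{max} = 1$, D = 0.001); the function names are not from the slides.

```python
import math

def uniform_quantizer_mse(x_max, m):
    """MSE = delta^2 / 12 for an M-bin uniform quantizer on a source uniform in [-x_max, x_max]."""
    delta = 2.0 * x_max / m
    return delta ** 2 / 12.0

def min_bins(x_max, d):
    """Smallest integer M satisfying M >= x_max / sqrt(3 * D)."""
    return math.ceil(x_max / math.sqrt(3.0 * d))

x_max, d = 1.0, 0.001            # hypothetical source range and distortion target
m = min_bins(x_max, d)
print(m, uniform_quantizer_mse(x_max, m), uniform_quantizer_mse(x_max, m - 1))
```

If a midrise quantizer is required, M would additionally be rounded up to the next even number.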
Signal to Noise Ratio (SNR)
 Variance is a measure of signal energy.
 Let $M = 2^n$, so each bin index is represented by n bits.
$$\mathrm{SNR(dB)} = 10\log_{10}\frac{\text{Signal Energy}}{\text{Noise Energy}} = 10\log_{10}\frac{\frac{1}{12}(2X_{max})^2}{\frac{1}{12}\Delta^2}$$
$$= 10\log_{10}\frac{(2X_{max})^2}{(2X_{max}/M)^2} = 10\log_{10} M^2 = 10\log_{10} 2^{2n} = (20\log_{10} 2)\,n \approx 6.02n \text{ dB}$$
 If n → n+1, ∆ is halved, the noise variance reduces to 1/4, and the SNR increases by about 6 dB.
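The 6.02 dB-per-bit rule can be reproduced from the two variances. A minimal sketch, assuming a uniformly distributed source and a hypothetical helper name `snr_db`:

```python
import math

def snr_db(n_bits, x_max=1.0):
    """SNR in dB of an n-bit uniform quantizer on a source uniform in [-x_max, x_max]."""
    m = 2 ** n_bits
    delta = 2.0 * x_max / m
    signal = (2.0 * x_max) ** 2 / 12.0   # variance of the source
    noise = delta ** 2 / 12.0            # variance of the quantization error
    return 10.0 * math.log10(signal / noise)

for n in (1, 2, 8, 16):
    print(n, snr_db(n))
```

The result does not depend on `x_max`, because both variances scale with its square.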

Outline
 Quantization
 Uniform
 Non-uniform
 DCT

Non-uniform Quantization
 A uniform quantizer is not optimal if the source is not uniformly distributed.
 For a given M, to reduce the MSE we want narrow bins where f(x) is high and wide bins where f(x) is low.
$$\sigma_q^2 = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx$$
[Figure: a non-uniform pdf f(x) with its quantization bins]
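Lloyd-Max Quantizer
 Also known as the pdf-optimized quantizer.
$$\sigma_q^2 = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx$$
 Given M, the optimal $b_i$ and $y_i$ that minimize the MSE satisfy the Lagrangian condition:
$$\frac{\partial \sigma_q^2}{\partial y_i} = 0, \qquad \frac{\partial \sigma_q^2}{\partial b_i} = 0.$$
 The first condition gives:
$$\frac{\partial \sigma_q^2}{\partial y_i} = 0 \;\Rightarrow\; y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx}$$
 $y_i$ is the centroid of the interval $[b_{i-1}, b_i]$.
[Figure: pdf f(x) with $y_i$ marked inside the bin $[b_{i-1}, b_i]$]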
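Lloyd-Max Quantizer
 If f(x) = c (uniformly distributed source):
$$y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx} = \frac{c \int_{b_{i-1}}^{b_i} x\,dx}{c\,(b_i - b_{i-1})} = \frac{\frac{1}{2}(b_i^2 - b_{i-1}^2)}{b_i - b_{i-1}} = \frac{1}{2}(b_i + b_{i-1})$$
 The second condition gives:
$$\frac{\partial \sigma_q^2}{\partial b_i} = 0 \;\Rightarrow\; b_i = \frac{y_i + y_{i+1}}{2}$$
 Only the two bins that share $b_i$ depend on it, so the derivative is $(b_i - y_i)^2 f(b_i) - (b_i - y_{i+1})^2 f(b_i)$, which vanishes at the midpoint.
 $b_i$ is the midpoint of $y_i$ and $y_{i+1}$.
[Figure: pdf f(x) with boundaries $b_{i-1}$, $b_i$, $b_{i+1}$ and levels $y_i$, $y_{i+1}$]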
Lloyd-Max Quantizer
 Summary of conditions for the optimal quantizer:
$$y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx}, \qquad b_i = \frac{y_i + y_{i+1}}{2}$$
 Given bi, can find the corresponding optimal yi

 Given yi, can find the corresponding optimal bi
 How to find optimal bi and yi simultaneously?

 A deadlock:
• Reconstruction levels depend on decision levels
• Decision levels depend on reconstruction levels
 Solution: iterative method !

Lloyd Algorithm (Sayood p. 267)
1. Start from an initial set of reconstruction values $y_i$.
2. Find all decision levels: $b_i = \frac{y_i + y_{i+1}}{2}$
3. Compute the MSE: $\sigma_q^2 = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx$
4. Stop if the MSE changes little from last time.
5. Otherwise, update each $y_i$ to the centroid of its bin and go to step 2:
$$y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx}$$
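The five steps can be sketched as follows. This is an illustrative implementation under stated assumptions, not the textbook's code: the pdf has bounded support [lo, hi] and is positive inside it, integrals use a simple midpoint rule, and the triangular pdf and all names are hypothetical.

```python
def integrate(g, lo, hi, steps=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(g(lo + (k + 0.5) * h) for k in range(steps)) * h

def lloyd(pdf, lo, hi, levels, tol=1e-9, max_iter=500):
    levels = list(levels)                    # step 1: initial reconstruction values
    prev_mse = float("inf")
    for _ in range(max_iter):
        # Step 2: decision levels are midpoints of neighbouring reconstruction levels.
        b = [lo] + [(levels[i] + levels[i + 1]) / 2
                    for i in range(len(levels) - 1)] + [hi]
        # Step 3: MSE for the current boundaries and levels.
        mse = sum(integrate(lambda x, y=y: (x - y) ** 2 * pdf(x), b[i], b[i + 1])
                  for i, y in enumerate(levels))
        # Step 4: stop when the MSE barely changes.
        if abs(prev_mse - mse) < tol:
            break
        prev_mse = mse
        # Step 5: move each level to the centroid of its bin.
        levels = [integrate(lambda x: x * pdf(x), b[i], b[i + 1]) /
                  integrate(pdf, b[i], b[i + 1]) for i in range(len(levels))]
    return b, levels, mse

def tri_pdf(x):
    return 1.0 - abs(x)                      # hypothetical triangular pdf on [-1, 1]

b, y, mse = lloyd(tri_pdf, -1.0, 1.0, [-0.75, -0.25, 0.25, 0.75])
print(b, y, mse)
```

For this pdf the inner bins come out narrower than the outer ones, which is the behaviour expected where f(x) is high.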
Outline
 Quantization
 Uniform quantization
 Non-uniform quantization
 Discrete Cosine Transform (DCT)

Why Transform Coding?
 Transform
• From one domain/space to another space
• Time -> Frequency
• Spatial/Pixel -> Frequency
 Purpose of transform
• Remove correlation between input samples
• Compact most of the energy of an input block into a few coefficients
• Small coefficients can be discarded by quantization without much impact on reconstruction quality
Encoder: Transform → Quantization → Entropy coding

1-D Example
 Fourier Transform

1-D Example
 Application (besides compression)
 Boost bass/audio equalizer
 Noise cancellation

1-D Example
 http://www.mathdemos.org/mathdemos/trigsounddemo/trigsounddemo.html
 Sine wave/sound/piano
 www.sagebrush.com/mousing.htm
 An electronic instrument that allows direct control of pitch and
amplitude

1-D Example
 Smooth signals have strong DC (direct current, or zero frequency) and low
frequency components, and weak high frequency components
[Figure: three plots of an 8-sample smooth signal: the original input against sample index, its DFT magnitudes, and its DCT coefficients, the latter two ordered from DC to high frequency]
2-D Example
 Apply the transform to each 8x8 block of the original image.
 Histograms of source and DCT coefficients:
[Figure: histogram of the source pixel values and histogram of the 2-D DCT coefficients (min = -465.37, max = 1789.00)]
 Most transform coefficients are around 0.
 Desired for compression.
Rationale behind Transform
 If Y is the result of a linear transform T of the input vector X such that the components of Y are much less correlated, then Y can be coded more efficiently than X.
 If most information is accurately described by the first few components of a transformed vector, then the remaining components can be coarsely quantized, or even set to zero, with little signal distortion.

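Matrix Representation of Transform
 A linear transform is an N x N matrix: $y_{N \times 1} = T_{N \times N}\, x_{N \times 1}$
 Inverse transform: $x = T^{-1} y$
 Unitary transform (aka orthonormal): $T^{-1} = T^T$, so $x = T^T y$
 For a unitary transform, the rows/columns have unit norm and are orthogonal to each other ($t_i$ denotes the i-th row of T):
$$T T^T = I \;\Rightarrow\; t_i t_j^T = \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases}$$
[Figure: block diagrams x → T → y, y → $T^{-1}$ → x, and y → $T^T$ → x]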

Discrete Cosine Transform (DCT)
 DCT: close to the optimal transform (known as the KL Transform), but much simpler and faster.
 Definition:
$$C_{i,j} = a \cos\left(\frac{(2j + 1)\, i\, \pi}{2N}\right), \quad i, j = 0, ..., N-1,$$
$$a = \sqrt{1/N} \text{ for } i = 0, \qquad a = \sqrt{2/N} \text{ for } i = 1, ..., N-1.$$
 Matlab function:
• dct(eye(N));
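For readers without Matlab, the definition can be evaluated directly. The sketch below builds the N-point DCT matrix from the formula and checks that its rows are orthonormal; `dct_matrix` and `gram` are illustrative names.

```python
import math

def dct_matrix(n):
    """C[i][j] = a_i * cos((2j + 1) * i * pi / (2n)), a_0 = sqrt(1/n), a_i = sqrt(2/n)."""
    rows = []
    for i in range(n):
        a = math.sqrt((1.0 if i == 0 else 2.0) / n)
        rows.append([a * math.cos((2 * j + 1) * i * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def gram(c):
    """Return C * C^T, which is the identity when the rows of C are orthonormal."""
    n = len(c)
    return [[sum(c[i][k] * c[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

c4 = dct_matrix(4)
for row in c4:
    print([round(v, 4) for v in row])
g = gram(c4)
```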

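DCT
 Definition:
$$C_{i,j} = a \cos\left(\frac{(2j + 1)\, i\, \pi}{2N}\right), \quad i, j = 0, ..., N-1,$$
$$a = \sqrt{1/N} \text{ for } i = 0, \qquad a = \sqrt{2/N} \text{ for } i = 1, ..., N-1.$$
 N = 2 (Haar Transform):
$$C_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$
$$\begin{bmatrix} y_0 \\ y_1 \end{bmatrix} = C_2 \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} x_0 + x_1 \\ x_0 - x_1 \end{bmatrix}$$
 y0 captures the mean of x0 and x1 (low-pass)
• x0 = x1 = 1 → y0 = sqrt(2) (DC), y1 = 0
 y1 captures the difference of x0 and x1 (high-pass)
• x0 = 1, x1 = -1 → y0 = 0 (DC), y1 = sqrt(2).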

DCT
 Magnitude frequency responses of the 2-point DCT, $C_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$:
 Can be obtained by freqz( ) in Matlab.
[Figure: magnitude response (dB) of the low-pass and high-pass filters against normalized frequency (0 to 0.5, x 2π), starting at DC. Plot title: DC Att. 403.0103, Mirr Att. 324.2604, Stopband 50, Coding Gain = 5.055 dB]
4-point DCT
 Four subbands
0.5000 0.5000 0.5000 0.5000
0.6533 0.2706 -0.2706 -0.6533
0.5000 -0.5000 -0.5000 0.5000
0.2706 -0.6533 0.6533 -0.2706
[Figure: magnitude response (dB) of the four subband filters against normalized frequency (0 to 0.5, x 2π). Plot title: DC Att. 406.0206, Mirr Att. 324.2604, Stopband 8.3456, Coding Gain = 7.5701 dB]
8-point DCT
 Eight subbands
[Figure: magnitude response (dB) of the eight subband filters against normalized frequency (0 to 0.5, x 2π). Plot title: DC Att. 409.0309, Mirr Att. 320.1639, Stopband 9.9559, Coding Gain = 8.8259 dB]
Example
 x = [100 110 120 130 140 150 160 170]^T;
 8-point DCT:
[381.8377, -64.4232, 0.0, -6.7345, 0.0, -2.0090, 0.0, -0.5070]
 Most of the energy is in the first 2 coefficients.
[Figure: plot of the 8 input samples and plot of the 8 DCT coefficients]
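The example can be reproduced with the DCT definition. This sketch recomputes the coefficients and the share of the energy carried by the first two; the helper `dct_matrix` is an illustrative name.

```python
import math

def dct_matrix(n):
    """N-point DCT matrix: a_i * cos((2j + 1) * i * pi / (2n))."""
    return [[math.sqrt((1.0 if i == 0 else 2.0) / n) *
             math.cos((2 * j + 1) * i * math.pi / (2 * n))
             for j in range(n)] for i in range(n)]

x = [100, 110, 120, 130, 140, 150, 160, 170]
c = dct_matrix(8)
y = [sum(c[i][j] * x[j] for j in range(8)) for i in range(8)]   # y = C x

energy = sum(v * v for v in y)            # equals sum(x^2) for an orthonormal C
first_two = (y[0] ** 2 + y[1] ** 2) / energy
print([round(v, 4) for v in y])
print(first_two)
```

Because the input is a straight ramp, the even-indexed coefficients other than DC vanish.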

Block Transform
 Divide input data into blocks (2D)
 Encode each block separately (sometimes with information from
neighboring blocks)
 Examples:
 Most DCT-based image/video coding standards

2-D DCT Basis
[Figure: 2-D DCT basis images for the 2-point DCT and for the 4-point DCT]

2-D Separable DCT
 X: N x N input block
 T: N x N transform
 A = TX: apply T to each column of X
 B = XT^T: apply T to each row of X
 2-D separable transform:
• Apply T to each row
• Then apply T to each column
$$Y = T X T^T$$
 Inverse transform:
$$X = T^T Y T$$
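The separable transform and its inverse need only matrix products. Below is a minimal sketch using plain lists; the helper names and the 4x4 block are hypothetical.

```python
import math

def dct_matrix(n):
    """N-point DCT matrix: a_i * cos((2j + 1) * i * pi / (2n))."""
    return [[math.sqrt((1.0 if i == 0 else 2.0) / n) *
             math.cos((2 * j + 1) * i * math.pi / (2 * n))
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

def dct2(x):
    t = dct_matrix(len(x))
    return matmul(matmul(t, x), transpose(t))     # Y = T X T^T

def idct2(y):
    t = dct_matrix(len(y))
    return matmul(matmul(transpose(t), y), t)     # X = T^T Y T

block = [[52, 55, 61, 66],                        # hypothetical 4x4 block
         [63, 59, 55, 90],
         [62, 59, 68, 113],
         [63, 58, 71, 122]]
coeffs = dct2(block)
restored = idct2(coeffs)
```

Without quantization the round trip recovers the block up to floating-point error.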
2-D 8-point DCT Example
 Original Data:
89 78 76 75 70 82 81 82
122 95 86 80 80 76 74 81
184 153 126 106 85 76 71 75
221 205 180 146 97 71 68 67
225 222 217 194 144 95 78 82
228 225 227 220 193 146 110 108
223 224 225 224 220 197 156 120
217 219 219 224 230 220 197 151
 2-D DCT Coefficients (after rounding to integers):
1155 259 -23 6 11 7 3 0
-377 -50 85 -10 10 4 7 -3
-4 -158 -24 42 -15 1 0 1
-2 3 -34 -19 9 -5 4 -1
1 9 6 -15 -10 6 -5 -1
3 13 3 6 -9 2 0 -3
8 -2 4 -1 3 -1 0 -2
2 0 -3 2 -2 0 0 -1
 Most energy is in the upper-left corner.
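The coefficient table can be recomputed from the original data. This sketch also measures how much of the energy sits in the upper-left 4x4 corner; choosing a 4x4 corner is an assumption made only for illustration.

```python
import math

def dct_matrix(n):
    """N-point DCT matrix: a_i * cos((2j + 1) * i * pi / (2n))."""
    return [[math.sqrt((1.0 if i == 0 else 2.0) / n) *
             math.cos((2 * j + 1) * i * math.pi / (2 * n))
             for j in range(n)] for i in range(n)]

X = [[89, 78, 76, 75, 70, 82, 81, 82],
     [122, 95, 86, 80, 80, 76, 74, 81],
     [184, 153, 126, 106, 85, 76, 71, 75],
     [221, 205, 180, 146, 97, 71, 68, 67],
     [225, 222, 217, 194, 144, 95, 78, 82],
     [228, 225, 227, 220, 193, 146, 110, 108],
     [223, 224, 225, 224, 220, 197, 156, 120],
     [217, 219, 219, 224, 230, 220, 197, 151]]

T = dct_matrix(8)
R = range(8)
TX = [[sum(T[i][m] * X[m][n] for m in R) for n in R] for i in R]   # T X
Y = [[sum(TX[i][n] * T[j][n] for n in R) for j in R] for i in R]   # (T X) T^T

total = sum(v * v for row in Y for v in row)
corner = sum(Y[i][j] ** 2 for i in range(4) for j in range(4))
print([round(v) for v in Y[0]])
print(corner / total)
```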
Interpretation of Transform
 Forward transform: y = Tx (x is an N x 1 vector)
 Let $t_i$ be the i-th row of T
• $y_i = t_i x = \langle t_i^T, x \rangle$ (inner product)
 $y_i$ measures the similarity between x and $t_i$
• Higher similarity → larger transform coefficient
 Inverse transform:
$$x = T^T y = \begin{bmatrix} t_0^T & t_1^T & \cdots & t_{N-1}^T \end{bmatrix} y = \sum_{i=0}^{N-1} t_i^T y_i$$
 x is a weighted combination of the $t_i$.
 Rows of T are called basis vectors.
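Both readings, analysis by inner products and synthesis by a weighted sum of basis vectors, can be illustrated with the 2-point DCT. The function names and input values are assumptions made for this sketch.

```python
import math

s = 1.0 / math.sqrt(2.0)
T = [[s, s],      # t_0: low-pass basis vector
     [s, -s]]     # t_1: high-pass basis vector

def forward(x):
    """y_i = <t_i, x>: one inner product per basis vector."""
    return [sum(t_k * x_k for t_k, x_k in zip(t, x)) for t in T]

def inverse(y):
    """x = sum_i y_i * t_i^T: weighted combination of the basis vectors."""
    n = len(T)
    return [sum(y[i] * T[i][k] for i in range(n)) for k in range(n)]

y_same = forward([1.0, 1.0])     # input proportional to t_0
y_diff = forward([1.0, -1.0])    # input proportional to t_1
x_back = inverse(forward([3.0, 5.0]))
print(y_same, y_diff, x_back)
```

An input aligned with one basis vector produces a single non-zero coefficient, which is the similarity interpretation in its simplest form.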

Interpretation of 2-D Transform
$$Y = T X T^T \;\Rightarrow\; X = T^T Y T$$
 2-D basis matrices: $t_i^T t_j$, i, j = 0, ..., N-1.
• Outer products of basis vectors
 Proof: define $S_{i,j}$ as the matrix whose (i, j)-th entry is Y(i, j) and whose other entries are all 0. Then
$$X = T^T Y T = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} T^T S_{i,j} T = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} Y(i, j)\, t_i^T t_j$$
 X is a weighted combination of the basis matrices.
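The proof can be checked numerically by rebuilding a block from outer products. This sketch uses the 2-point DCT and a hypothetical 2x2 block; the function names are illustrative.

```python
import math

s = 1.0 / math.sqrt(2.0)
T = [[s, s], [s, -s]]            # rows are the basis vectors t_0, t_1
N = len(T)

def outer(u, v):
    """Outer product u^T v of two row vectors."""
    return [[a * b for b in v] for a in u]

def forward2d(X):
    """Y(i, j) = t_i X t_j^T, i.e. Y = T X T^T."""
    return [[sum(T[i][m] * X[m][n] * T[j][n] for m in range(N) for n in range(N))
             for j in range(N)] for i in range(N)]

def synthesize(Y):
    """X = sum_i sum_j Y(i, j) * (t_i^T t_j)."""
    X = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            basis = outer(T[i], T[j])        # one 2-D basis matrix
            for m in range(N):
                for n in range(N):
                    X[m][n] += Y[i][j] * basis[m][n]
    return X

X = [[4.0, 2.0], [1.0, 3.0]]     # hypothetical 2x2 block
Y = forward2d(X)
X_back = synthesize(Y)
print(Y, X_back)
```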

2-D DCT Basis Matrices
[Figure: 2-D DCT basis matrices for the 2-point DCT and for the 4-point DCT]

2-D DCT Basis Matrices: 8-point DCT

Further Exploration
 Textbook 8.1-8.5
 Other sources
• Introduction to Data Compression by Khalid Sayood
• Vector Quantization and Signal Compression by Allen Gersho and Robert M. Gray
• Digital Image Processing by Rafael C. Gonzalez and Richard E. Woods
• Probability and Random Processes with Applications to Signal Processing by Henry Stark and John W. Woods
• A Wavelet Tour of Signal Processing by Stephane G. Mallat

