
CMPT 365 Multimedia Systems

Lossy Compression

Spring 2015

CMPT365 Multimedia Systems 1


Lossless vs Lossy Compression
 If the compression and decompression processes
induce no information loss, then the compression
scheme is lossless; otherwise, it is lossy.
 Why is lossy compression possible ?

[Figure: the original and three lossy versions, at compression ratios 7.7, 12.3, and 33.9]


Outline

 Quantization
 Uniform
 Non-uniform

 Transform coding
 DCT



Quantization
 The process of representing a large (possibly infinite)
set of values with a much smaller set.
 Example: A/D conversion
 An efficient tool for lossy compression
 Review …

Encoder: Transform → Quantization → Entropy coding → channel
Decoder: channel → Entropy decoding → Inverse quantization → Inverse transform



Review: Basic Idea

[Figure: the quantizer maps an input value x to a bin index (Bin 0 to Bin 4); the index is entropy coded, then entropy decoded, and the dequantizer (inverse quantizer) maps it to the reconstruction value of that bin]
 Quantization is a function that maps an input interval to one integer
 Can reduce the bits required to represent the source.
 Reconstructed result is generally not the original input
 Terminology:
 Decision boundaries bi: bin boundaries
 Reconstruction levels yi: output value of each bin by the dequantizer.



Uniform Quantizer
 All bins have the same size except possibly for the two outer intervals:
 bi and yi are spaced evenly
 The spacing of both bi and yi is ∆ (the step size)
 For inner intervals:
$$ y_i = \frac{1}{2}\left(b_{i-1} + b_i\right) $$
[Figure: staircase input/reconstruction plots of two uniform quantizers]
Uniform midrise quantizer: reconstruction levels ±0.5∆, ±1.5∆, ±2.5∆, ±3.5∆, with decision boundaries at 0, ±∆, ±2∆, ±3∆. Even number of reconstruction levels; 0 is not a reconstruction level.
Uniform midtread quantizer: reconstruction levels 0, ±∆, ±2∆, ±3∆, with decision boundaries at ±0.5∆, ±1.5∆, ±2.5∆. Odd number of reconstruction levels; 0 is a reconstruction level.
Midtread Quantizer
[Figure: midtread staircase plot with reconstruction levels 0, ±∆, ±2∆, ±3∆ and decision boundaries at ±0.5∆, ±1.5∆, ±2.5∆]
 Quantization mapping (the output is an index):
$$ q = A(x) = \operatorname{sign}(x) \left\lfloor \frac{|x|}{\Delta} + 0.5 \right\rfloor $$
 Example: x = -1.8∆, q = -2.
 De-quantization mapping:
$$ \hat{x} = B(q) = q\Delta $$
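The two midtread mappings can be sketched in a few lines of Python. This is only an illustration: the function names `midtread_quantize` and `midtread_dequantize` and the choice ∆ = 1 are assumptions made here, not part of the slides.

```python
import math

def midtread_quantize(x: float, step: float) -> int:
    """A(x): bin index q = sign(x) * floor(|x| / step + 0.5)."""
    sign = (x > 0) - (x < 0)              # -1, 0 or +1
    return sign * math.floor(abs(x) / step + 0.5)

def midtread_dequantize(q: int, step: float) -> float:
    """B(q): reconstruction value x_hat = q * step."""
    return q * step

step = 1.0                                # assumed step size (Delta)
x = -1.8 * step                           # the example value from the slide
q = midtread_quantize(x, step)            # bin index
x_hat = midtread_dequantize(q, step)      # reconstruction
error = x - x_hat                         # quantization error e(x) = x - x_hat
```

For the slide's example the index is -2 and the reconstruction is -2∆, so the error has magnitude 0.2∆, which is within the expected bound of ∆/2.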



Model of Quantization
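[Block diagram: x → A → q → B → x̂]
 Quantization: q = A(x)
 Inverse quantization:
$$ \hat{x} = B(q) = B(A(x)) = Q(x) $$
B(x) is not exactly the inverse function of A(x), because $\hat{x} \neq x$ in general.
 Quantization error:
$$ e(x) = x - \hat{x} $$
 Combining quantizer and de-quantizer: x → Q → x̂, or equivalently an additive model in which −e(x) is added to x:
$$ \hat{x} = x - e(x) $$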
Rate-Distortion Tradeoff
 Things to be determined:
 Number of bins
 Bin boundaries
 Reconstruction levels
[Figure: distortion versus rate curve with two marked operating points, A and B]
 A tradeoff between rate and distortion:
 To reduce the number of encoded bits, we need to reduce
the number of bins
 Fewer bins ⇒ larger reconstruction error



Measure of Distortion
 Quantization error: $e(x) = x - \hat{x}$
 Mean Squared Error (MSE) for Quantization
 Average squared quantization error over all input values
 Need to know the probability distribution of the input

 Number of bins: M
 Decision boundaries: bi, i = 0, …, M
 Reconstruction Levels: yi, i = 1, …, M
 Reconstruction:
$$ \hat{x} = y_i \quad \text{iff} \quad b_{i-1} < x \le b_i $$
 MSE:
$$ \mathrm{MSE}_q = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{i=1}^{M} \int_{b_{i-1}}^{b_i} (x - y_i)^2 f(x)\,dx $$
 Same as the variance of e(x) if μ = E{e(x)} = 0 (zero mean).
 Definition of variance:
$$ \sigma_e^2 = \int_{-\infty}^{\infty} (e - \mu_e)^2 f(e)\,de $$

Rate-Distortion Optimization
 Two Scenarios:
 Given M, find bi and yi that minimize the MSE.
 Given a distortion constraint D, find M, bi and yi such that
the MSE ≤ D.



Outline

 Quantization
 Uniform
 Non-uniform
 Vector quantization

 Transform coding
 DCT



Uniform Quantization of a Uniformly Distributed
Source

 Input X: uniformly distributed in [-Xmax, Xmax]: f(x)= 1 / (2Xmax)


 Number of bins: M (even for midrise quantizer)
 Step size is easy to get: ∆ = 2Xmax / M.
 bi = (i – M/2) ∆

Example with M = 8:
y1, …, y8 = -3.5∆, -2.5∆, -1.5∆, -0.5∆, 0.5∆, 1.5∆, 2.5∆, 3.5∆
b0, …, b8 = -4∆, -3∆, -2∆, -∆, 0, ∆, 2∆, 3∆, 4∆ (with b0 = -Xmax and b8 = Xmax)
 e(x) is therefore uniformly distributed in [-∆/2, ∆/2].
[Figure: plot of e(x) for x in [-4∆, 4∆]; the error stays between -0.5∆ and 0.5∆]
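Uniform Quantization of a Uniformly Distributed
Source

 MSE:
$$ \mathrm{MSE}_q = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{i=1}^{M} \int_{b_{i-1}}^{b_i} (x - y_i)^2 f(x)\,dx $$
$$ = M \cdot \frac{1}{2X_{max}} \int_{0}^{\Delta} \left(x - \frac{\Delta}{2}\right)^2 dx = \frac{M}{2X_{max}} \cdot \frac{1}{12}\Delta^3 = \frac{1}{12}\Delta^2 $$
(the last step uses M∆ = 2Xmax)
 As M increases, ∆ decreases and the MSE decreases.
 Variance of a random variable uniformly distributed in [-∆/2, ∆/2]:
$$ \sigma_q^2 = \int_{-\Delta/2}^{\Delta/2} (x - 0)^2 \frac{1}{\Delta}\,dx = \frac{1}{12}\Delta^2 $$
 Optimization: find M such that MSE ≤ D:
$$ \frac{1}{12}\Delta^2 \le D \;\Rightarrow\; \frac{1}{12}\left(\frac{2X_{max}}{M}\right)^2 \le D \;\Rightarrow\; M \ge X_{max}\sqrt{\frac{1}{3D}} $$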
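The ∆²/12 result and the bound on M can be checked numerically. The sketch below assumes NumPy, a midrise quantizer with Xmax = 1 and M = 8, and randomly drawn test samples; the helper names `midrise_reconstruct` and `min_bins_for_distortion` are illustrative, and rounding M up to an even number is an assumption made here to match the midrise case.

```python
import math
import numpy as np

def midrise_reconstruct(x, x_max, m):
    """Uniform midrise quantizer on [-x_max, x_max] with m bins; returns x_hat."""
    step = 2.0 * x_max / m
    idx = np.clip(np.floor(x / step), -m // 2, m // 2 - 1)  # signed bin index
    return (idx + 0.5) * step                               # midpoint of the bin

def min_bins_for_distortion(x_max, d):
    """Smallest even M with step^2 / 12 <= D, i.e. M >= x_max * sqrt(1 / (3 D))."""
    m = math.ceil(x_max / math.sqrt(3.0 * d))
    return m + (m % 2)                                      # round up to even

rng = np.random.default_rng(0)
x_max, m = 1.0, 8                                           # assumed example values
x = rng.uniform(-x_max, x_max, 200_000)                     # uniformly distributed source
step = 2.0 * x_max / m
mse_empirical = np.mean((x - midrise_reconstruct(x, x_max, m)) ** 2)
mse_theory = step ** 2 / 12.0
m_needed = min_bins_for_distortion(x_max, d=1e-3)
```

With these values the empirical MSE should agree with ∆²/12 to within sampling noise, and `m_needed` is the smallest even bin count that meets the distortion target D = 0.001.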
Signal to Noise Ratio (SNR)
 Variance is a measure of signal energy
 Let M = 2^n
 Each bin index is represented by n bits
$$ \mathrm{SNR(dB)} = 10\log_{10}\frac{\text{Signal Energy}}{\text{Noise Energy}} = 10\log_{10}\frac{\frac{1}{12}(2X_{max})^2}{\frac{1}{12}\Delta^2} $$
$$ = 10\log_{10}\frac{(2X_{max})^2}{(2X_{max}/M)^2} = 10\log_{10}M^2 = 10\log_{10}2^{2n} = (20\log_{10}2)\,n \approx 6.02\,n \ \text{dB} $$
 If n → n+1, ∆ is halved, the noise variance is reduced to 1/4,
and the SNR increases by 6 dB.
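The 6.02n dB rule can be reproduced directly from the two variances. In this sketch the function name `uniform_snr_db` and the default Xmax = 1 are assumptions for illustration; the result does not depend on Xmax.

```python
import math

def uniform_snr_db(n_bits: int, x_max: float = 1.0) -> float:
    """SNR in dB for an n-bit uniform quantizer on a uniform source."""
    m = 2 ** n_bits                          # number of bins, M = 2^n
    step = 2.0 * x_max / m                   # Delta
    signal = (2.0 * x_max) ** 2 / 12.0       # variance of the uniform source
    noise = step ** 2 / 12.0                 # variance of the quantization error
    return 10.0 * math.log10(signal / noise)

snr_table = {n: uniform_snr_db(n) for n in (1, 2, 8, 16)}
gain_per_bit = uniform_snr_db(9) - uniform_snr_db(8)   # effect of one extra bit
```

Each entry of `snr_table` equals (20 log10 2)·n, and `gain_per_bit` is the roughly 6 dB improvement obtained by adding one bit.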



Outline

 Quantization
 Uniform
 Non-uniform

 Transform coding
 DCT



Non-uniform Quantization
 A uniform quantizer is not optimal if the source is not uniformly
distributed
 For a given M, to reduce the MSE we want narrow bins where f(x) is high
and wide bins where f(x) is low
$$ \sigma_q^2 = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx $$
[Figure: sketch of a non-uniform pdf f(x)]
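Lloyd-Max Quantizer
 Also known as pdf-optimized quantizer
$$ \sigma_q^2 = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx $$
 Given M, the optimal bi and yi that minimize the MSE satisfy the Lagrangian condition:
$$ \frac{\partial \sigma_q^2}{\partial y_i} = 0, \qquad \frac{\partial \sigma_q^2}{\partial b_i} = 0 $$
$$ \frac{\partial \sigma_q^2}{\partial y_i} = 0 \;\Rightarrow\; y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx} $$
 yi is the centroid of interval [bi-1, bi].
[Figure: pdf f(x) with the interval [bi-1, bi] and its centroid yi marked on the x axis]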
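Lloyd-Max Quantizer

 If f(x) = c (uniformly distributed source):
$$ y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx} = \frac{c\int_{b_{i-1}}^{b_i} x\,dx}{c\,(b_i - b_{i-1})} = \frac{\frac{1}{2}\left(b_i^2 - b_{i-1}^2\right)}{b_i - b_{i-1}} = \frac{1}{2}\left(b_i + b_{i-1}\right) $$
 Condition on the decision boundaries:
$$ \frac{\partial \sigma_q^2}{\partial b_i} = 0 \;\Rightarrow\; b_i = \frac{y_i + y_{i+1}}{2} $$
 bi is the midpoint of yi and yi+1
[Figure: pdf f(x) with boundaries bi-1, bi, bi+1 and reconstruction levels yi, yi+1 marked on the x axis]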
Lloyd-Max Quantizer
 Summary of conditions for optimal quantizer:
$$ y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx}, \qquad b_i = \frac{y_i + y_{i+1}}{2} $$

 Given bi, can find the corresponding optimal yi


 Given yi, can find the corresponding optimal bi

 How to find optimal bi and yi simultaneously?


 A deadlock:
• Reconstruction levels depend on decision levels
• Decision levels depend on reconstruction levels
 Solution: an iterative method!



Lloyd Algorithm (Sayood pp. 267)

1. Start from an initial set of reconstruction values yi.

2. Find all decision levels:
$$ b_i = \frac{y_i + y_{i+1}}{2} $$
3. Compute the MSE:
$$ \sigma_q^2 = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx $$
4. Stop if the MSE changes little from the previous iteration.

5. Otherwise, update yi as follows and go to step 2:
$$ y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx} $$
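The five steps of the Lloyd algorithm can be sketched as follows. This is a sample-based variant, not the slide's integral form: the pdf f(x) is replaced by the empirical distribution of a set of training samples, so each centroid becomes a sample mean. The function name `lloyd_quantizer`, the evenly spread initial levels, and the stopping tolerance are all assumptions made for this illustration.

```python
import numpy as np

def lloyd_quantizer(samples, m, tol=1e-9, max_iter=500):
    """Lloyd iteration on samples; returns (inner boundaries b, levels y, mse)."""
    x = np.sort(np.asarray(samples, dtype=float))
    # Step 1: initial reconstruction levels, spread evenly over the data range.
    y = np.linspace(x[0], x[-1], m + 2)[1:-1]
    prev_mse = np.inf
    for _ in range(max_iter):
        # Step 2: inner decision levels are midpoints of adjacent levels.
        b = 0.5 * (y[:-1] + y[1:])
        idx = np.searchsorted(b, x)          # bin index of every sample
        # Step 3: MSE with the current boundaries and levels.
        mse = np.mean((x - y[idx]) ** 2)
        # Step 4: stop when the MSE changes little.
        if prev_mse - mse < tol:
            break
        prev_mse = mse
        # Step 5: move each level to the centroid (sample mean) of its bin.
        for k in range(m):
            members = x[idx == k]
            if members.size:                 # leave an empty bin's level unchanged
                y[k] = members.mean()
    return b, y, mse

rng = np.random.default_rng(0)
samples = rng.standard_normal(100_000)       # assumed test source: unit Gaussian
b, y, mse = lloyd_quantizer(samples, m=4)
```

For a Gaussian source the resulting bins are narrow near zero, where the density is high, and wide in the tails, which is the behaviour motivated on the non-uniform quantization slide.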
Outline

 Quantization
 Uniform quantization
 Non-uniform quantization

 Transform coding
 Discrete Cosine Transform (DCT)



Why Transform Coding ?
 Transform
 From one domain/space to another space
 Time -> Frequency
 Spatial/Pixel -> Frequency

 Purpose of transform
 Remove correlation between input samples
 Transform most energy of an input block into a few
coefficients
 Small coefficients can be discarded by quantization without too
much impact on reconstruction quality

Encoder: Transform → Quantization → Entropy coding



1-D Example
 Fourier Transform



1-D Example
 Application (besides compression)
 Boost bass/audio equalizer
 Noise cancellation



1-D Example
 http://www.mathdemos.org/mathdemos/trigsounddemo/trigsounddemo.html
 Sine wave/sound/piano

 www.sagebrush.com/mousing.htm
 An electronic instrument that allows direct control of pitch and
amplitude



1-D Example
 Smooth signals have strong DC (direct current, or zero frequency) and low
frequency components, and weak high frequency components
[Figure: three panels for a smooth 8-sample signal: Original Input (versus sample index 1 to 8), DFT Magnitudes, and DCT Coefficients, the latter two ordered from DC to high frequency]
2-D Example
 Apply transform to each 8x8 block
 Histograms of source and DCT coefficients
[Figure: original image; histogram of the source pixel values; histogram of the 2-D DCT coefficients (min = -465.37, max = 1789.00)]
 Most transform coefficients are around 0.
 Desired for compression
Rationale behind Transform

 If Y is the result of a linear transform T of the
input vector X in such a way that the components
of Y are much less correlated, then Y can be
coded more efficiently than X.
 If most information is accurately described by
the first few components of a transformed
vector, then the remaining components can be
coarsely quantized, or even set to zero, with little
signal distortion.



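Matrix Representation of Transform
 A linear transform is an N x N matrix T:
$$ y_{N \times 1} = T_{N \times N}\, x_{N \times 1} $$
 Inverse transform:
$$ x = T^{-1} y $$
 Unitary transform (aka orthonormal):
$$ T^{-1} = T^T \;\Rightarrow\; x = T^T y $$
[Block diagrams: x → T → y; y → T^{-1} → x; y → T^T → x]
 For a unitary transform, the rows/columns have unit norm and are
orthogonal to each other. With ti denoting the i-th row of T:
$$ T T^T = I \;\Rightarrow\; t_i t_j^T = \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} $$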



Discrete Cosine Transform (DCT)
 DCT: close to the optimal transform (known as the KL Transform), but much
simpler and faster

 Definition:
$$ C_{i,j} = a \cos\left(\frac{(2j+1)\,i\,\pi}{2N}\right), \quad i, j = 0, \ldots, N-1 $$
$$ a = \sqrt{1/N} \ \text{for } i = 0, \qquad a = \sqrt{2/N} \ \text{for } i = 1, \ldots, N-1 $$

 Matlab function:
 dct(eye(N));
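As a rough NumPy counterpart to the Matlab call, the matrix can also be built directly from the definition. The function name `dct_matrix` is an assumption for this sketch; it is not a library routine.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """N x N DCT matrix with C[i, j] = a_i * cos((2j + 1) * i * pi / (2N))."""
    i = np.arange(n).reshape(-1, 1)          # row index (frequency)
    j = np.arange(n).reshape(1, -1)          # column index (sample position)
    c = np.cos((2 * j + 1) * i * np.pi / (2 * n))
    a = np.full((n, 1), np.sqrt(2.0 / n))    # scale for rows i = 1, ..., N-1
    a[0, 0] = np.sqrt(1.0 / n)               # scale for the DC row i = 0
    return a * c

C4 = dct_matrix(4)
is_orthonormal = np.allclose(C4 @ C4.T, np.eye(4))   # rows have unit norm, mutually orthogonal
```

With the square-root scale factors the rows are orthonormal, so the transpose of the matrix serves as its inverse.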



DCT
 Definition:
$$ C_{i,j} = a \cos\left(\frac{(2j+1)\,i\,\pi}{2N}\right), \quad i, j = 0, \ldots, N-1 $$
$$ a = \sqrt{1/N} \ \text{for } i = 0, \qquad a = \sqrt{2/N} \ \text{for } i = 1, \ldots, N-1 $$
 N = 2 (Haar Transform):
$$ C_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} $$
$$ \begin{bmatrix} y_0 \\ y_1 \end{bmatrix} = C_2 \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} x_0 + x_1 \\ x_0 - x_1 \end{bmatrix} $$
 y0 captures the mean of x0 and x1 (low-pass)
 x0 = x1 = 1 ⇒ y0 = sqrt(2) (DC), y1 = 0
 y1 captures the difference of x0 and x1 (high-pass)
 x0 = 1, x1 = -1 ⇒ y0 = 0 (DC), y1 = sqrt(2).



DCT
 Magnitude Frequency Responses of 2-point DCT:
 Can be obtained by freqz( ) in Matlab.

[Figure: magnitude response (dB) versus normalized frequency (x 2π, from 0 to 0.5) of the two filters of C2 = (1/sqrt(2)) [1 1; 1 -1]: a low-pass response that peaks at DC and a high-pass response. Plot title: DC Att. 403.0103, Mirr Att. 324.2604, Stopband 50, Coding Gain = 5.055 dB]
4-point DCT
 Four subbands
0.5000 0.5000 0.5000 0.5000
0.6533 0.2706 -0.2706 -0.6533
0.5000 -0.5000 -0.5000 0.5000
0.2706 -0.6533 0.6533 -0.2706
[Figure: magnitude responses (dB) versus normalized frequency (x 2π, from 0 to 0.5) of the four subband filters. Plot title: DC Att. 406.0206, Mirr Att. 324.2604, Stopband 8.3456, Coding Gain = 7.5701 dB]
8-point DCT
 Eight subbands

[Figure: magnitude responses (dB) versus normalized frequency (x 2π, from 0 to 0.5) of the eight subband filters. Plot title: DC Att. 409.0309, Mirr Att. 320.1639, Stopband 9.9559, Coding Gain = 8.8259 dB]
Example
 x = [100 110 120 130 140 150 160 170]^T;
 8-point DCT:
[381.8377, -64.4232, 0.0, -6.7345, 0.0, -2.0090, 0.0, -0.5070]
Most of the energy is in the first 2 coefficients.
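The coefficients above can be reproduced with a short NumPy sketch. The helper `dct_matrix` is an illustrative name that builds the orthonormal DCT matrix from the definition given earlier; the energy ratio is an extra check added here, not a figure from the slides.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal N x N DCT matrix built from the cosine definition."""
    i = np.arange(n).reshape(-1, 1)
    j = np.arange(n).reshape(1, -1)
    c = np.cos((2 * j + 1) * i * np.pi / (2 * n))
    a = np.full((n, 1), np.sqrt(2.0 / n))
    a[0, 0] = np.sqrt(1.0 / n)
    return a * c

x = np.arange(100, 180, 10, dtype=float)      # [100, 110, ..., 170]
y = dct_matrix(8) @ x                         # 8-point DCT coefficients
# Fraction of the total energy carried by the first two coefficients.
energy_fraction = np.sum(y[:2] ** 2) / np.sum(y ** 2)
```

Because the transform is orthonormal, the sum of squared coefficients equals the sum of squared samples, so `energy_fraction` is a meaningful measure of energy compaction.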

[Figure: plots of the input x (top) and its 8 DCT coefficients (bottom) against index 1 to 8]



Block Transform
 Divide input data into blocks (2D)
 Encode each block separately (sometimes with information from
neighboring blocks)
 Examples:
 Most DCT-based image/video coding standards



2-D DCT Basis

[Figure: 2-D DCT basis images for the 2-point DCT and for the 4-point DCT]



2-D Separable DCT

 X: N x N input block
 T: N x N transform
 A = TX: Apply T to each column of X
 B = XT^T: Apply T to each row of X
 2-D Separable Transform:
 Apply T to each row
 Then apply T to each column
$$ Y = T X T^T $$
 Inverse Transform:
$$ X = T^T Y T $$
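The separable structure can be checked numerically: transforming every row and then every column gives the same result as the single matrix expression. In this sketch `dct_matrix`, `dct2` and `idct2` are illustrative names, and the random 8x8 block is an assumed test input.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal N x N DCT matrix built from the cosine definition."""
    i = np.arange(n).reshape(-1, 1)
    j = np.arange(n).reshape(1, -1)
    c = np.cos((2 * j + 1) * i * np.pi / (2 * n))
    a = np.full((n, 1), np.sqrt(2.0 / n))
    a[0, 0] = np.sqrt(1.0 / n)
    return a * c

def dct2(block: np.ndarray) -> np.ndarray:
    """Forward 2-D transform Y = T X T^T for a square block."""
    t = dct_matrix(block.shape[0])
    return t @ block @ t.T

def idct2(coeffs: np.ndarray) -> np.ndarray:
    """Inverse 2-D transform X = T^T Y T."""
    t = dct_matrix(coeffs.shape[0])
    return t.T @ coeffs @ t

rng = np.random.default_rng(1)
X = rng.integers(0, 256, size=(8, 8)).astype(float)   # assumed test block
T = dct_matrix(8)
B = np.array([T @ row for row in X])                  # apply T to each row of X
rows_then_cols = T @ B                                # then apply T to each column
Y = dct2(X)
X_rec = idct2(Y)                                      # should recover X
```

Here `B` equals X T^T, `rows_then_cols` equals `Y`, and `X_rec` matches `X` up to floating-point error.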
2-D 8-point DCT Example
 Original Data:
89 78 76 75 70 82 81 82
122 95 86 80 80 76 74 81
184 153 126 106 85 76 71 75
221 205 180 146 97 71 68 67
225 222 217 194 144 95 78 82
228 225 227 220 193 146 110 108
223 224 225 224 220 197 156 120
217 219 219 224 230 220 197 151

 2-D DCT Coefficients (after rounding to integers):

1155 259 -23 6 11 7 3 0
-377 -50 85 -10 10 4 7 -3
-4 -158 -24 42 -15 1 0 1
-2 3 -34 -19 9 -5 4 -1
1 9 6 -15 -10 6 -5 -1
3 13 3 6 -9 2 0 -3
8 -2 4 -1 3 -1 0 -2
2 0 -3 2 -2 0 0 -1
Most energy is in the upper-left corner.
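The example block can be pushed through Y = T X T^T to see the energy compaction numerically. As before, `dct_matrix` is an illustrative helper built from the DCT definition, and the energy ratio of the upper-left 4x4 corner is an extra quantity computed here, not a number from the slides.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal N x N DCT matrix built from the cosine definition."""
    i = np.arange(n).reshape(-1, 1)
    j = np.arange(n).reshape(1, -1)
    c = np.cos((2 * j + 1) * i * np.pi / (2 * n))
    a = np.full((n, 1), np.sqrt(2.0 / n))
    a[0, 0] = np.sqrt(1.0 / n)
    return a * c

X = np.array([
    [ 89,  78,  76,  75,  70,  82,  81,  82],
    [122,  95,  86,  80,  80,  76,  74,  81],
    [184, 153, 126, 106,  85,  76,  71,  75],
    [221, 205, 180, 146,  97,  71,  68,  67],
    [225, 222, 217, 194, 144,  95,  78,  82],
    [228, 225, 227, 220, 193, 146, 110, 108],
    [223, 224, 225, 224, 220, 197, 156, 120],
    [217, 219, 219, 224, 230, 220, 197, 151],
], dtype=float)

T = dct_matrix(8)
Y = T @ X @ T.T                               # 2-D DCT coefficients
Y_rounded = np.round(Y).astype(int)           # rounded to integers, as on the slide
# Share of the total energy held by the 4x4 upper-left (low-frequency) corner.
corner_energy = np.sum(Y[:4, :4] ** 2) / np.sum(Y ** 2)
```

The DC coefficient `Y[0, 0]` is the block sum divided by 8, and the low-frequency corner carries almost all of the energy, which is what makes coarse quantization of the remaining coefficients cheap.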
Interpretation of Transform

 Forward transform y = Tx (x is an N x 1 vector)
 Let ti be the i-th row of T
 yi = ti x = <ti^T, x> (inner product)
 yi measures the similarity between x and ti
 Higher similarity ⇒ larger transform coefficient

 Inverse transform:
$$ x = T^T y = \begin{bmatrix} t_0^T & t_1^T & \cdots & t_{N-1}^T \end{bmatrix} y = \sum_{i=0}^{N-1} t_i^T y_i $$
 x is the weighted combination of the ti.
 Rows of T are called basis vectors.
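The weighted-combination view can be made concrete by rebuilding a signal from its basis vectors, first with all coefficients and then with only the two largest. The sketch reuses the ramp signal from the earlier 8-point example; `dct_matrix` is an illustrative helper, and keeping exactly two coefficients is an assumption chosen for the demonstration.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal N x N DCT matrix built from the cosine definition."""
    i = np.arange(n).reshape(-1, 1)
    j = np.arange(n).reshape(1, -1)
    c = np.cos((2 * j + 1) * i * np.pi / (2 * n))
    a = np.full((n, 1), np.sqrt(2.0 / n))
    a[0, 0] = np.sqrt(1.0 / n)
    return a * c

x = np.arange(100, 180, 10, dtype=float)        # ramp signal [100, ..., 170]
T = dct_matrix(8)
y = T @ x                                       # y_i = <t_i, x>
# Full reconstruction: x = sum_i y_i * t_i (row T[i] is the basis vector t_i).
x_full = sum(y[i] * T[i] for i in range(8))
# Approximation that keeps only the first two basis vectors.
x_two = y[0] * T[0] + y[1] * T[1]
squared_error_two = np.sum((x - x_two) ** 2)    # energy of the dropped coefficients
```

The squared error of the two-term approximation equals the energy of the six discarded coefficients, which is small for this smooth input.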



Interpretation of 2-D Transform
$$ Y = T X T^T \;\Rightarrow\; X = T^T Y T $$
 2-D basis matrices:
$$ t_i^T t_j, \quad i, j = 0, \ldots, N-1 $$
 Outer products of basis vectors

 Proof: define S_{i,j} as the matrix whose (i, j)-th entry is Y(i, j) and whose other entries are all 0. Then
$$ X = T^T Y T = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} T^T S_{i,j} T = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} Y(i,j)\, t_i^T t_j $$
 X is the weighted combination of basis matrices.
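The outer-product expansion can be verified on a small block. In the sketch below, `dct_matrix` is an illustrative helper, the 4x4 random block is an assumed test input, and `basis` is a hypothetical convenience function returning the matrix t_i^T t_j.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal N x N DCT matrix built from the cosine definition."""
    i = np.arange(n).reshape(-1, 1)
    j = np.arange(n).reshape(1, -1)
    c = np.cos((2 * j + 1) * i * np.pi / (2 * n))
    a = np.full((n, 1), np.sqrt(2.0 / n))
    a[0, 0] = np.sqrt(1.0 / n)
    return a * c

N = 4
T = dct_matrix(N)

def basis(i: int, j: int) -> np.ndarray:
    """2-D basis matrix t_i^T t_j: outer product of rows i and j of T."""
    return np.outer(T[i], T[j])

rng = np.random.default_rng(2)
X = rng.standard_normal((N, N))                 # assumed test block
Y = T @ X @ T.T                                 # forward 2-D transform
# X as a weighted combination of the N*N basis matrices.
X_sum = sum(Y[i, j] * basis(i, j) for i in range(N) for j in range(N))
# Each coefficient is the inner product of X with its basis matrix.
y_12 = np.sum(basis(1, 2) * X)
```

`X_sum` reproduces `X`, and `y_12` matches `Y[1, 2]`, mirroring the 1-D statement that each coefficient measures similarity to a basis vector.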



2-D DCT Basis Matrices

[Figure: 2-D DCT basis matrices for the 2-point DCT and for the 4-point DCT]



2-D DCT Basis Matrices: 8-point DCT



Further Exploration

 Textbook 8.1-8.5
 Other sources
 Introduction to Data Compression by Khalid Sayood
 Vector Quantization and Signal Compression by Allen Gersho
and Robert M. Gray
 Digital Image Processing by Rafael C. Gonzalez and Richard
E. Woods
 Probability and Random Processes with Applications to Signal
Processing by Henry Stark and John W. Woods
 A Wavelet Tour of Signal Processing by Stephane G. Mallat

