
Compressibility

Rakesh K R
209040
Computer Science & Engineering
Government Engineering College, Idukki
Basics of Compression
Goals:
to understand how image/audio/video signals are
compressed to save storage and increase
transmission efficiency
to understand in detail common formats like GIF,
JPEG and MPEG
What does compression do
Reduces signal size by taking advantage of
correlation
Spatial
Temporal
Spectral
Compression Methods
Taxonomy of compression methods:
Lossless
  Statistical: Huffman, Arithmetic
  Universal: Lempel-Ziv
Lossy
  Waveform-Based
    Spatial/Time-Domain
    Frequency-Domain
      Filter-Based: Subband, Wavelet
      Transform-Based: Fourier, DCT
  Model-Based: Linear Predictive, AutoRegressive, Polynomial Fitting
Compression Issues
Lossless compression
Coding Efficiency
Compression ratio
Coder Complexity
Memory needs
Power needs
Operations per second
Coding Delay
Compression Issues
Additional Issues for Lossy Compression
Signal quality
Bit error probability
Signal/Noise ratio
Mean opinion score
Compression Method Selection
Constrained output signal quality? TV
Retransmission allowed? interactive sessions
Degradation type at decoder acceptable?
Multiple decoding levels? browsing and retrieving
Encoding and decoding in tandem? video editing
Single-sample-based or block-based?

Fixed coding delay, variable efficiency?
Fixed output data size, variable rate?
Fixed coding efficiency, variable signal quality? CBR
Fixed quality, variable efficiency? VBR
Fundamental Ideas
Run-length encoding
Average Information Entropy
For source S generating symbols S_1 through S_N
Self entropy: I(s_i) = log_2(1 / p_i) = -log_2 p_i
Entropy of source S: H(S) = Σ_i p_i log_2(1 / p_i)
Average number of bits to encode S ≥ H(S) (Shannon)
Differential Encoding
To improve the probability distribution of symbols
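A small Python sketch of the self-entropy and entropy formulas above, using a made-up four-symbol source for illustration:

import math

def self_information(p):
    # Self entropy I(s_i) = log2(1 / p_i), in bits
    return math.log2(1.0 / p)

def entropy(probs):
    # H(S) = sum_i p_i * log2(1 / p_i), in bits per symbol
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

probs = [0.5, 0.25, 0.125, 0.125]            # made-up source
print([self_information(p) for p in probs])  # [1.0, 2.0, 3.0, 3.0]
print(entropy(probs))                        # 1.75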
Huffman Encoding
Let an alphabet have N symbols S_1 ... S_N
Let p_i be the probability of occurrence of S_i
Order the symbols by their probabilities: p_1 > p_2 > p_3 > ... > p_N
Replace symbols S_{N-1} and S_N by a new symbol H_{N-1} such that it has the probability p_{N-1} + p_N
Repeat until there is only one symbol
This generates a binary tree
Huffman Encoding Example
Symbols picked up as
K+W
{K,W}+?
{K,W,?}+U
{R,L}
{K,W,?,U}+E
{{K,W,?,U,E},{R,L}}
Codewords are generated in a tree-traversal pattern
Symbol Probability Codeword
K 0.05 10101
L 0.2 01
U 0.1 100
W 0.05 10100
E 0.3 11
R 0.2 00
? 0.1 1011
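A minimal Python sketch of the merging procedure above, run on the probabilities from this example. The exact bit patterns depend on how ties are broken, so they may differ from the table, but the average codeword length comes out the same (2.6 bits/symbol):

import heapq

def huffman_code(probs):
    # Heap entries: (probability, tie-breaker, {symbol: partial codeword})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)      # two least probable subtrees
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}          # label one branch 0 ...
        merged.update({s: "1" + w for s, w in c2.items()})    # ... and the other 1
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {"K": 0.05, "L": 0.2, "U": 0.1, "W": 0.05, "E": 0.3, "R": 0.2, "?": 0.1}
code = huffman_code(probs)
print(code)
print(round(sum(probs[s] * len(code[s]) for s in probs), 2))  # 2.6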
Properties of Huffman Codes
Fixed-length inputs become variable-length outputs
Average codeword length: Σ_i l_i p_i
Upper bound of average length: H(S) + p_max
Code construction complexity: O(N log N)
Prefix-condition code: no codeword is a prefix for another
Makes the code uniquely decodable
Huffman Decoding
Look-up table-based decoding
Create a look-up table
Let L be the longest codeword
Let c_i be the codeword corresponding to symbol s_i
If c_i has l_i bits, make an L-bit address such that the first l_i bits are c_i and the rest can be any combination of 0s and 1s
Make 2^(L - l_i) such addresses for each symbol
At each entry, record (s_i, l_i) pairs
Decoding
Read L bits into a buffer
Get the symbol s_k that has a length of l_k
Discard l_k bits and fill up the L-bit buffer
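A Python sketch of this look-up-table decoder, using the codewords from the earlier example; it assumes the input bit string is a complete, valid encoding:

def build_decode_table(code):
    # Every L-bit pattern maps to (symbol, codeword length l_i)
    L = max(len(cw) for cw in code.values())
    table = {}
    for sym, cw in code.items():
        pad = L - len(cw)
        for fill in range(2 ** pad):          # 2^(L - l_i) addresses per symbol
            suffix = format(fill, "0" + str(pad) + "b") if pad else ""
            table[cw + suffix] = (sym, len(cw))
    return table, L

def decode(bits, code):
    table, L = build_decode_table(code)
    out, pos = [], 0
    while pos < len(bits):
        window = bits[pos:pos + L].ljust(L, "0")  # read L bits into the buffer
        sym, length = table[window]
        out.append(sym)
        pos += length                             # discard l_k bits and refill
    return "".join(out)

code = {"K": "10101", "L": "01", "U": "100", "W": "10100",
        "E": "11", "R": "00", "?": "1011"}
print(decode("110110010101", code))   # "ELUK"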
Constraining Code Length
The problem: the code length should be at most L bits
Algorithm
Let threshold T = 2^(-L)
Partition S into subsets S_1 and S_2
S_1 = {s_i | p_i > T}  (t-1 symbols)
S_2 = {s_i | p_i ≤ T}  (N-t+1 symbols)
Create a special symbol Q whose frequency q = Σ_{i=t}^{N} p_i
Create a code word c_q for symbol Q and a normal Huffman code for every other symbol in S_1
For any message string in S_1, create the regular code word
For any message string in S_2, first emit c_q and then a regular code for the number of symbols in S_2
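A rough Python sketch of this partitioning scheme, reusing the huffman_code helper from the earlier sketch. The fixed-length index used for S_2 symbols is an assumption, since the slide only says "a regular code":

def constrained_code(probs, L):
    T = 2.0 ** (-L)
    s1 = {s: p for s, p in probs.items() if p > T}
    s2 = sorted(s for s, p in probs.items() if p <= T)
    q = sum(probs[s] for s in s2)               # frequency of the special symbol Q
    code = huffman_code({**s1, "Q": q})         # Huffman over S_1 plus Q
    width = max(1, (len(s2) - 1).bit_length())  # fixed-length index for S_2
    def encode(sym):
        if sym in s1:
            return code[sym]
        return code["Q"] + format(s2.index(sym), "0" + str(width) + "b")
    return encode

enc = constrained_code(probs, 3)   # T = 1/8; K, W, U, ? fall into S_2
print(enc("E"), enc("K"))          # short codeword vs. c_q followed by a 2-bit index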
Constraining Code Length
Another algorithm
Set threshold T like before
If for any symbol s_i, p_i ≤ T, set p_i = T
Design the codebook
If the resulting codebook violates the ordering of code
length according to symbol frequency, reorder the codes
How does this differ from the previous algorithm?
What is its complexity?
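For comparison, a brief sketch of this second variant, again reusing huffman_code from the earlier sketch; Huffman construction works on unnormalised weights, and the reordering step mentioned above is omitted here:

def clamped_code(probs, L):
    T = 2.0 ** (-L)
    weights = {s: max(p, T) for s, p in probs.items()}  # raise rare symbols up to T
    return huffman_code(weights)                        # weights need not sum to 1

print(clamped_code(probs, 3))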
Golomb-Rice Compression
Take a source having symbols occurring with a geometric
probability distribution
P(n) = (1 - p_0) p_0^n,  n ≥ 0,  0 < p_0 < 1
Here P(n) is the probability of a run length of n for any symbol
Take a positive integer m and determine q, r where n = mq + r
G_m code for n: q 0s, then a 1, then bin(r)
bin(r) has ⌊log_2 m⌋ or sup(log_2 m) = ⌈log_2 m⌉ bits, depending on r
Optimal if m = ⌈-log(1 + p_0) / log(p_0)⌉
Golomb-Rice Compression
Now if m = 2^k
Get q by a k-bit right shift of n, and r from the last k bits of n
Example:
If m=4 and n=22 the code is 00000110
Decoding G-R code:
Count the number of 0s before the first 1. This gives q. Skip
the 1. The next k bits gives r
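A small Python sketch of the m = 2^k (Rice) case, checked against the example above:

def rice_encode(n, k):
    q, r = n >> k, n & ((1 << k) - 1)   # q by a k-bit right shift, r = last k bits
    return "0" * q + "1" + format(r, "0" + str(k) + "b")

def rice_decode(bits, k):
    q = bits.index("1")                 # number of 0s before the first 1
    r = int(bits[q + 1:q + 1 + k], 2)   # the next k bits give r
    return (q << k) + r                 # n = m*q + r with m = 2^k

print(rice_encode(22, 2))               # "00000110" (m = 4, n = 22)
print(rice_decode("00000110", 2))       # 22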
The Lossless JPEG standard
Try to predict the value of pixel X as y; the prediction residual is r = y - X
y can be one of 0, a, b, c, a+b-c, a+(b-c)/2, b+(a-c)/2
The scan header records the choice of y
The residual r is computed modulo 2^16
The number of bits needed for r is called its category; the binary value of r is its magnitude
The category is Huffman encoded
c b
a X
(a = left neighbour, b = neighbour above, c = upper-left neighbour of X)
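A Python sketch of the prediction step for a single pixel, following the predictor list and the modulo 2^16 rule from this slide; the mode numbering is just an index into that list, and the integer halving and sample values are assumptions for illustration:

def predict(a, b, c, mode):
    # a = left, b = above, c = upper-left neighbour of X (see the layout above)
    return {0: 0, 1: a, 2: b, 3: c, 4: a + b - c,
            5: a + (b - c) // 2, 6: b + (a - c) // 2}[mode]

def residual(x, a, b, c, mode):
    y = predict(a, b, c, mode)
    return (y - x) % (1 << 16)          # r = y - X, taken modulo 2^16

# made-up neighbourhood: a = 100, b = 104, c = 102, current pixel X = 103
print(residual(103, 100, 104, 102, mode=4))   # predictor a+b-c = 102, r = 65535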
