Motivation
The Information Revolution
Motivation
Consider a 3-minute song: assuming two channels, 16-bit resolution, and a
sampling rate of 48 kHz, it takes about 33 MB of disk space to store the song.
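As a quick check on that figure, the arithmetic can be worked out directly (a minimal sketch in Python):

```python
# Storage required for 3 minutes of stereo, 16-bit, 48 kHz PCM audio.
seconds = 3 * 60
channels = 2
bits_per_sample = 16
sampling_rate_hz = 48_000

total_bits = seconds * channels * bits_per_sample * sampling_rate_hz
total_bytes = total_bits // 8
print(f"{total_bytes / 2**20:.1f} MiB")   # ~33.0 MiB
```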
Introduction
If data generation is growing at an explosive
rate, why not focus on improving transmission
and storage technologies?
Transmission and storage technologies are
improving, but not at the rate at which data is
being generated.
This is especially true for wireless
communications where the radio spectrum is
limited.
Introduction
Data compression is the art or science of
representing information in a compact form.
Data compression is performed by identifying
and exploiting structure and redundancies in
the data.
The data can be samples of audio, images, or text
files; it can be generated by sensors, scientific
instruments, social networks, markets, etc.
Introduction
Consider Morse code, developed in the 19th
century, in which letters are encoded with dots
and dashes.
Some letters (e and a) occur more often than others (q and j).
Letters that occur more frequently are encoded using shorter
sequences: e: .    a: .-
Letters that occur less frequently are encoded using longer
sequences: q: --.-    j: .---
Introduction
There are many other types of structure in
data that can be exploited to achieve
compression.
In speech, the physical structure of our vocal
tract determines the kinds of sounds that we
can produce; instead of sending speech
samples, we can send information about the
vocal tract to the receiver.
We can also exploit characteristics of the end
user of the data.
Introduction
In many cases, when transmitting images or
audio, the end user is a human.
Humans have limited hearing and vision
abilities.
We can exploit the limitations of human
perception to discard irrelevant information
and obtain higher compression.
Compression Algorithm
[Figure: a compression algorithm has two stages — compression, which produces the compressed representation of the original data, and reconstruction (decompression), which produces the reconstructed data.]
Lossless Compression
Lossless compression involves no loss of
information.
The recovered data is an exact copy of the
original.
Useful in applications that cannot tolerate any
difference:
medical images
scientific data
financial records
computer programs
Lossy Compression
In lossy compression some loss of information is
tolerated.
The original data cannot be recovered exactly,
but higher compression ratios can be achieved.
Useful in applications where some loss of
information is not critical:
speech coding
telephone communications
video coding
digital photography
Compression Performance
Compression ratio (CR):
CR = (number of bits in the original data) / (number of bits in the compressed data)
Peak signal-to-noise ratio (PSNR), used to measure the distortion introduced by lossy compression:
PSNR (dB) = 10 log10( x_max^2 / MSE )
where x_max is the peak value the signal can take and MSE = (1/N) * sum_i (x_i - y_i)^2 is the mean squared error between the original and the reconstructed data.
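As an illustration of these measures, here is a minimal sketch (Python with NumPy) that computes PSNR; the toy image data and the x_max = 255 assumption for 8-bit samples are made up for the example:

```python
import numpy as np

def compression_ratio(original_bits: int, compressed_bits: int) -> float:
    return original_bits / compressed_bits

def psnr(original: np.ndarray, reconstructed: np.ndarray, x_max: float = 255.0) -> float:
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    return 10.0 * np.log10(x_max ** 2 / mse)

# Toy example: an 8-bit image and a slightly noisy reconstruction of it.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64))
rec = np.clip(img + rng.integers(-3, 4, size=img.shape), 0, 255)
print(f"PSNR = {psnr(img, rec):.1f} dB")
```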
Example 1
Let's consider the following input sequence:
x = [9, 11, 11, 11, 14, 13, 15, 17, 16, 17, 20, 21]
To encode this sequence using a plain binary code, we would need 5
bits per number and a total of 60 bits.
K. Sayood, Introduction to Data Compression, 2nd edition, Morgan Kaufmann
Example 1
If we use the model x_hat[n] = n + 8, the prediction residuals e[n] = x[n] - x_hat[n] are
e = [0, 1, 0, -1, 1, -1, 0, 1, -1, -1, 1, 1]
The residuals take only three values (-1, 0, 1), so each one can be encoded with 2 bits, reducing the total from 60 to 24 bits (plus the small cost of describing the model).
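A short sketch verifying the residuals produced by this model:

```python
# Prediction residuals for the model x_hat[n] = n + 8 (n = 1, 2, ..., 12).
x = [9, 11, 11, 11, 14, 13, 15, 17, 16, 17, 20, 21]
residuals = [xn - (n + 8) for n, xn in enumerate(x, start=1)]
print(residuals)        # [0, 1, 0, -1, 1, -1, 0, 1, -1, -1, 1, 1]
print(set(residuals))   # only three values: {-1, 0, 1} -> 2 bits per residual
```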
Example 2
Input sequence:
a_barayaran_array_ran_far_faar_faaar_away
The sequence is made of eight different characters (symbols):
a, b, f, n, r, w, y, _
Hence, we can use three bits per symbol to encode the
sequence, resulting in a total of 41 x 3 = 123 bits for the entire
sequence.
However, we can use fewer bits if we realize that some
symbols occur more frequently than others.
We can use fewer bits to encode the more frequent symbols.
K. Sayood, Introduction to Data Compression, 2nd edition, Morgan Kaufmann
Example 2
Input sequence: a_barayaran_array_ran_far_faar_faaar_away
Input character   Frequency
a                 16
r                 8
_                 7
y                 3
f                 3
n                 2
b                 1
w                 1
The fixed-length code assigns a 3-bit codeword (000 through 111) to each of the eight symbols; the variable-length code assigns the shortest codewords to the most frequent symbols (a single bit for a) and the longest codewords (up to 5 bits, for b and w) to the least frequent ones.
Using variable-length codes we can encode the sequence using only 97 bits.
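A small sketch that reproduces the symbol frequencies and the 123-bit fixed-length cost for this sequence (the variable-length codeword assignment itself is left out):

```python
from collections import Counter

seq = "a_barayaran_array_ran_far_faar_faaar_away"
freq = Counter(seq)
print(freq.most_common())
# [('a', 16), ('r', 8), ('_', 7), ('y', 3), ('f', 3), ('n', 2), ('b', 1), ('w', 1)]
print(len(seq) * 3, "bits")   # 41 symbols x 3 bits = 123 bits with a fixed-length code
```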
Statistical Redundancy
Statistical redundancy was employed in
Example 2 to build a code to encode the input
sequence.
When compressing text, statistical redundancy
can be exploited not only at the level of characters but
also at the level of words and recurring strings; this is the dictionary technique.
Examples of compression solutions that use
the dictionary technique include the Lempel-Ziv (LZ) family of algorithms (e.g., LZ77), gzip, Zip, PNG, and PKZip.
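As a toy illustration of the dictionary idea, here is a sketch of an LZ78-style parse (gzip, Zip, and PNG actually use LZ77/DEFLATE-style dictionaries of recently seen data):

```python
def lz78_parse(text):
    """Toy LZ78 parse: emit (dictionary index, next character) pairs."""
    dictionary = {"": 0}          # index 0 is the empty string
    phrase, output = "", []
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch          # keep extending the current match
        else:
            output.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:                    # flush whatever is left at the end
        output.append((dictionary[phrase[:-1]], phrase[-1]))
    return output

print(lz78_parse("abababbbab"))
# [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'b'), (2, 'a'), (0, 'b')]
```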
A source emits symbols from an alphabet A = {a1, a2, ..., an}; a message is a sequence of symbols from the alphabet, for example a1 a2 a3 a6 a8 a5 a3 a4.
Entropy
H(A) = - sum_{i=1..n} P(a_i) log P(a_i)
If the base of the logarithm is 2 the units of entropy are bits. If the base is
10 the units are hartleys. If the base is e the units are nats.
The first-order entropy assumes that the symbols occur independently of
each other.
The entropy is a measure of the average number of bits needed to
encode the output of the source.
Claude Shannon showed that the best average rate a lossless compression
algorithm can achieve is the entropy of the source.
Example:
Let's consider a source with an alphabet consisting of four symbols: a1, a2, a3, a4.
P(a1) = 1/2, P(a2) = 1/4, P(a3) = 1/8, P(a4) = 1/8
H = -(1/2 log2(1/2) + 1/4 log2(1/4) + 1/8 log2(1/8) + 1/8 log2(1/8)) = 1.75
bits/symbol.
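A one-line check of this value (a minimal sketch):

```python
import math

def entropy(probabilities):
    """First-order entropy in bits/symbol."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([1/2, 1/4, 1/8, 1/8]))   # 1.75
```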
Coding
Coding is the process of assigning binary sequences to symbols of
an alphabet.
Example:
Let's consider a source with a four-symbol alphabet such that: P(a1) = 1/2,
P(a2) = 1/4, P(a3) = 1/8, P(a4) = 1/8
H = 1.75 bits/symbol.
Symbol   Probability   Code 1   Code 2   Code 3   Code 4
a1       0.5           0        0        0        0
a2       0.25          0        1        10       01
a3       0.125         1        00       110      011
a4       0.125         10       11       111      0111
Average length         1.125    1.25     1.75     1.875   bits/symbol
Only Codes 3 and 4 are uniquely decodable; Code 3 is, in addition, a prefix code.
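A small sketch that recomputes the average lengths in the table above, assuming the codeword assignments shown there:

```python
probs = {"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125}
codes = {
    "Code 1": {"a1": "0", "a2": "0", "a3": "1", "a4": "10"},
    "Code 2": {"a1": "0", "a2": "1", "a3": "00", "a4": "11"},
    "Code 3": {"a1": "0", "a2": "10", "a3": "110", "a4": "111"},
    "Code 4": {"a1": "0", "a2": "01", "a3": "011", "a4": "0111"},
}
for name, code in codes.items():
    avg = sum(probs[s] * len(cw) for s, cw in code.items())
    print(name, avg)   # 1.125, 1.25, 1.75, 1.875 bits/symbol
```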
Prefix Codes
Consider two codewords, C1 of n bits and C2 of k bits, with k > n.
If C1 is identical to the first n bits of C2, then C1 is a prefix of C2 and the
remaining k - n bits of C2 are called the dangling suffix.
In a prefix code no codeword is a prefix of any other codeword; such a code is
uniquely decodable and can be decoded instantaneously.
Huffman Coding
Huffman coding is an algorithm for building optimum prefix
codes.
It was developed by David Huffman as a class assignment in the first course on
information theory, taught by Robert Fano at MIT in the early 1950s.
Huffman coding assumes that the probabilities of the source are
known.
Huffman coding is based on the following observations about
optimum prefix codes:
Symbols with higher probability have shorter codewords than
less probable symbols.
The two symbols with the lowest probabilities have codewords of the same
length (this can be shown by contradiction).
In a Huffman code the codewords corresponding to the two
symbols with the lowest probabilities differ only in the last bit.
Huffman Coding
Example:
Let's build a Huffman code for a source with a four-symbol alphabet
such that: P(a1) = 0.5, P(a2) = 0.25, P(a3) = 0.125, P(a4) = 0.125
[Figure: Huffman tree construction, step 1 — the two least probable symbols, a3 and a4 (0.125 each), are merged into a node with probability 0.25.]
[Figure: Huffman tree construction, steps 2 and 3 — the new node (0.25) is merged with a2 (0.25) into a node with probability 0.5, which is then merged with a1 (0.5).]
[Figure: the final Huffman tree has a root with probability 1.0; labelling the two branches of every node with 0 and 1 gives the codewords.]
Symbol   Probability   Codeword
a1       0.5           0
a2       0.25          10
a3       0.125         110
a4       0.125         111
Decoding example: the received sequence is 0110101110. Reading bits until a codeword is recognized:
0 -> a1, 110 -> a3, 10 -> a2, 111 -> a4, 0 -> a1
Decoded message: a1 a3 a2 a4 a1
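A compact Huffman implementation, as a sketch (the exact bit patterns of a Huffman code depend on how ties are broken; with the tie-breaking below it reproduces the code in the table above):

```python
import heapq

def huffman_code(probabilities):
    """Build a Huffman code from a {symbol: probability} mapping."""
    # Each heap entry: (probability, tie-breaker, {symbol: partial codeword}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)   # lowest probability
        p1, _, code1 = heapq.heappop(heap)   # second lowest
        # The two least probable subtrees receive a leading 0 and 1, respectively.
        merged = {s: "0" + c for s, c in code0.items()}
        merged.update({s: "1" + c for s, c in code1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

code = huffman_code({"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125})
print(code)   # {'a1': '0', 'a2': '10', 'a3': '110', 'a4': '111'}
```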
Golomb-Rice Codes
The Golomb-Rice codes are a family of codes commonly used in data
compression applications due to their low-complexity and good
compression performance.
The JPEG committee and the Consultative Committee for Space Data
Systems (CCSDS), for instance, have adopted the Golomb-Rice codes as
part of their standards.
Golomb-Rice codes have also been adopted for lossless audio
compression and are used in many commercial audio compression programs.
The Golomb-Rice codes have their origin in the pioneering work of
Golomb, who proposed a method to encode the run lengths of a
binary source when p0^m = 1/2, where p0 is the probability of the more
frequent symbol and m is an integer.
Golomb-Rice Codes
Consider a binary source with alphabet A = {0, 1} that produces sequences such as
100001000100001000000010001001, in which 0s are much more frequent than 1s
(p0^m = 1/2, where p0 is the probability of the more frequent symbol and m is an integer).
[Figure: the run lengths of 0s between consecutive 1s (0, 1, 2, ..., 10, 11, 12, ...) follow a geometric distribution; Golomb coding encodes these run lengths.]
Golomb-Rice Codes
The Golomb-Rice codes consider the special case m = 2^k (k >= 0).
Encoding procedure: to encode a nonnegative integer n, the quotient
floor(n / 2^k) is encoded in unary (a run of 1s terminated by a 0) and the
remainder n mod 2^k is appended in natural binary using k bits.
For example, for an 8-bit input n = b7 b6 b5 b4 b3 b2 b1 b0 and k = 4, the
codeword is the unary code of the value b7 b6 b5 b4 followed by the bits
b3 b2 b1 b0 (a quotient of 6 gives the unary prefix 1111110).
Example: n = 17 (00010001)
k = 0: codeword = 111111111111111110
k = 1: codeword = 1111111101
k = 2: codeword = 1111001
k = 3: codeword = 110001
k = 4: codeword = 100001
k = 5: codeword = 010001
k = 6: codeword = 0010001
k = 7: codeword = 00010001
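A minimal sketch of this encoding procedure; it reproduces the codewords of the example above:

```python
def golomb_rice_encode(n: int, k: int) -> str:
    """Encode a nonnegative integer n with Rice parameter k (m = 2**k)."""
    quotient, remainder = n >> k, n & ((1 << k) - 1)
    unary = "1" * quotient + "0"                       # quotient in unary, 0-terminated
    binary = format(remainder, f"0{k}b") if k > 0 else ""
    return unary + binary                              # k-bit remainder appended

for k in range(8):
    print(k, golomb_rice_encode(17, k))
# k=2 -> 1111001, k=3 -> 110001, k=4 -> 100001, ... (matches the example above)
```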
Golomb-Rice Codes
Practical sources produce both positive and negative numbers
(a double-sided distribution P(n) centered at zero).
Before Golomb-Rice coding, each value n is therefore mapped to a nonnegative integer:
M(n) = 2n          if n >= 0
M(n) = 2|n| - 1    if n < 0
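A one-line sketch of this mapping, which interleaves positive and negative values onto the nonnegative integers:

```python
def map_signed(n: int) -> int:
    """Map a signed integer to a nonnegative index before Golomb-Rice coding."""
    return 2 * n if n >= 0 else 2 * abs(n) - 1

print([map_signed(n) for n in (-3, -2, -1, 0, 1, 2, 3)])   # [5, 3, 1, 0, 2, 4, 6]
```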
[Figure: the source output is fed to the Golomb-Rice (G-R) coder, which produces the codewords; an adaptive algorithm with threshold M adjusts the coding parameter k.]
1) Initialize k to k_ini;
2) Reset the counter;
3) Read input n and encode it using parameter k;
4) If the unary part is longer than 1, increment the counter;
5) If the unary part is 0, decrement the counter;
6) If the counter reaches M, k++ and go to 2;
7) If the counter reaches -M, k-- and go to 2.
Entropy Coding
If the source has a narrow distribution P(n), an entropy encoder (Huffman, Golomb-Rice,
arithmetic) can be applied directly:
[Figure: source -> entropy encoder -> compressed output]
If the distribution is not narrow, the source is first decorrelated (by predictive coding,
transform coding, or subband coding) and the decorrelated output, which has a much
narrower distribution, is then entropy encoded:
[Figure: source -> decorrelation (predictive coding, transform coding, subband coding) -> entropy encoder -> compressed output]
Predictive Coding
[Figure: pixel prediction example — the current pixel X = 64 is predicted from neighboring pixel values (55 57 59 63 / 58 61 63 69 / 60 64 ...); the prediction residual e = X - (predicted X) takes small values such as -2 and -1, so the histogram of the residuals is much narrower than the histogram of the original pixels.]
[Figure: causal pixel neighborhood used for prediction — the current pixel X is predicted from its previously coded neighbors W, WW, NW, N, NN, NE, and NNE.]
The prediction X_hat is computed with a gradient-adjusted predictor:
d_h = |W - WW| + |N - NW| + |N - NE|
d_v = |W - NW| + |N - NN| + |NE - NNE|
if (d_v - d_h > 80)        X_hat = W
else if (d_h - d_v > 80)   X_hat = N
else {
    X_hat = (W + N)/2 + (NE - NW)/4
    if (d_v - d_h > 32)        X_hat = (X_hat + W)/2
    else if (d_v - d_h > 8)    X_hat = (3*X_hat + W)/4
    else if (d_h - d_v > 32)   X_hat = (X_hat + N)/2
    else if (d_h - d_v > 8)    X_hat = (3*X_hat + N)/4
}
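The pseudocode above corresponds to a gradient-adjusted predictor (GAP) of the kind used in CALIC-style coders; a sketch under that reading (the thresholds 80, 32, and 8 are taken from the pseudocode above, and the sample neighborhood values are made up for illustration):

```python
def gap_predict(W, N, NW, NE, WW, NN, NNE):
    """Gradient-adjusted prediction of the current pixel X from its causal neighbors."""
    dh = abs(W - WW) + abs(N - NW) + abs(N - NE)    # horizontal gradient estimate
    dv = abs(W - NW) + abs(N - NN) + abs(NE - NNE)  # vertical gradient estimate
    if dv - dh > 80:        # sharp horizontal edge -> predict from the left
        return W
    if dh - dv > 80:        # sharp vertical edge -> predict from above
        return N
    pred = (W + N) / 2 + (NE - NW) / 4
    if dv - dh > 32:
        pred = (pred + W) / 2
    elif dv - dh > 8:
        pred = (3 * pred + W) / 4
    elif dh - dv > 32:
        pred = (pred + N) / 2
    elif dh - dv > 8:
        pred = (3 * pred + N) / 4
    return pred

# Hypothetical neighborhood values, for illustration only.
print(gap_predict(W=62, N=61, NW=60, NE=63, WW=60, NN=59, NNE=62))   # 62.25
```

The value actually entropy coded is the prediction residual, i.e., the difference between X and this prediction.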
Transform Coding
In transform coding the input sequence is transformed into another sequence in
which most of the information is contained in only a few elements.
For a 1D signal x, such as audio or speech, the forward transform is defined as:
theta = A x
and the inverse transform is defined as:
x = A^-1 theta
where A is an orthonormal transform matrix: A A^T = A^T A = I, so A^-1 = A^T.
Transform Coding
In the JPEG standard, the forward transform is the Discrete Cosine Transform
(DCT) and the inverse transform is the Inverse Discrete Cosine Transform (IDCT).
The DCT transform matrix is defined as:
C(i, j) = sqrt(1/N) cos( (2j + 1) i pi / (2N) ),   i = 0,              j = 0, 1, ..., N-1
C(i, j) = sqrt(2/N) cos( (2j + 1) i pi / (2N) ),   i = 1, 2, ..., N-1, j = 0, 1, ..., N-1
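A small sketch that builds this matrix and checks that it is orthonormal:

```python
import numpy as np

def dct_matrix(N: int = 8) -> np.ndarray:
    """Build the N x N DCT-II transform matrix C defined above."""
    C = np.zeros((N, N))
    for i in range(N):
        scale = np.sqrt(1.0 / N) if i == 0 else np.sqrt(2.0 / N)
        for j in range(N):
            C[i, j] = scale * np.cos((2 * j + 1) * i * np.pi / (2 * N))
    return C

C = dct_matrix(8)
print(np.allclose(C @ C.T, np.eye(8)))   # True: the transform is orthonormal
```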
[Figure: JPEG encoder — the input image is divided into 8x8 blocks; each block is transformed with the DCT; the coefficients are quantized using a quantization table; the quantized DC coefficients are coded with DPCM and the AC coefficients with run-length coding (RLC); an entropy encoder then produces the compressed image.]
Example: an 8x8 block of input pixel values (79, 39, 38, 39, 45, 50, 50, 54, 41, 38, 42, 61, 57, 54, 55, 55, ...) is transformed with the DCT. The top-left output coefficient is the DC coefficient; the remaining 63 are the AC coefficients.
DCT coefficients:
502.0  119.5   83.8   48.3    6.0    0.0   -0.1   -0.3
 88.6  173.4   90.9   22.5   11.5   -1.8   -0.2   -0.8
 62.0   78.7   22.2  -44.9  -19.8   -9.4   -7.3   -1.1
 12.2    4.7  -37.1  -44.6  -30.2  -12.2    5.0   -3.0
  3.5  -22.5  -36.9  -20.3  -13.0    4.1   11.5    5.1
 12.1    9.7   -7.0   -6.6    2.6   11.3    8.5   11.5
  9.2    7.9    3.7   -6.4    6.3   10.1    3.8    1.8
  2.6    9.8    1.4   -2.0    0.3   -1.2    2.3   -5.1
Quantization table (JPEG luminance table):
16  11  10  16  24   40   51   61
12  12  14  19  26   58   60   55
14  13  16  24  40   57   69   56
14  17  22  29  51   87   80   62
18  22  37  56  68  109  103   77
24  35  55  64  81  104  113   92
49  64  78  87 103  121  120  101
72  92  95  98 112  100  103   99
Each quantized coefficient is obtained as q(i, j) = round( C(i, j) / Q(i, j) ), where C(i, j) is the DCT coefficient and Q(i, j) the corresponding quantization table entry.
Quantized coefficients (scaled back to their reconstructed values, round(C/Q) x Q):
496  121   80   48    0   0   0   0
 84  168   84   19    0   0   0   0
 56   78   16  -48    0   0   0   0
 14    0  -44  -58  -51   0   0   0
  0  -22  -37    0    0   0   0   0
 24    0    0    0    0   0   0   0
  0    0    0    0    0   0   0   0
  0    0    0    0    0   0   0   0
Most of the quantized coefficients are zero, which the run-length and entropy coding stages exploit.
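A minimal sketch of the quantization step applied to the first row of the coefficient and quantization tables above:

```python
import numpy as np

# q[i, j] = round(theta[i, j] / Q[i, j]); the dequantized value is q[i, j] * Q[i, j].
theta = np.array([502.0, 119.5, 83.8, 48.3, 6.0, 0.0, -0.1, -0.3])   # first row of DCT coefficients
Q     = np.array([16, 11, 10, 16, 24, 40, 51, 61], dtype=float)      # first row of the luminance table
q = np.round(theta / Q)
print(q)        # quantizer indices: 31, 11, 8, 3, 0, 0, 0, 0
print(q * Q)    # 496, 121, 80, 48, 0, ... -> matches the first row of the quantized coefficients above
```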
Sub-band Coding
In sub-band coding the input signal is decomposed into several subbands using an analysis filter bank.
Depending on the signal different sub-bands will contain different
amounts of information.
Sub-bands with lots of information are encoded using more bits while
sub-bands with little information are encoded using fewer bits.
At the decoder side, the signal is reconstructed using a bank of synthesis
filters.
[Figure: the signal spectrum is divided into frequency bands f1, f2, f3, ..., fM.]
Subband Coding
[Figure: sub-band codec — the input is split by analysis filters 1 ... M; each sub-band is entropy encoded; at the decoder, each sub-band is entropy decoded and passed through the corresponding synthesis filter 1 ... M, and the filter outputs are combined to form the output signal.]
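As a toy illustration of analysis/synthesis with perfect reconstruction, here is a sketch using a two-band Haar filter pair (not the filter bank of any particular standard; the input signal is made up):

```python
import numpy as np

def haar_analysis(x):
    """Split a signal into low-pass and high-pass sub-bands (2:1 decimation)."""
    x = np.asarray(x, dtype=float)
    low  = (x[0::2] + x[1::2]) / np.sqrt(2)
    high = (x[0::2] - x[1::2]) / np.sqrt(2)
    return low, high

def haar_synthesis(low, high):
    """Reconstruct the signal from its two sub-bands."""
    x = np.empty(2 * len(low))
    x[0::2] = (low + high) / np.sqrt(2)
    x[1::2] = (low - high) / np.sqrt(2)
    return x

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
low, high = haar_analysis(x)
print(low)    # low band carries most of the energy
print(high)   # high band values are small -> can be coded with few bits
print(np.allclose(haar_synthesis(low, high), x))   # True (perfect reconstruction)
```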
Further Reading
Khalid Sayood, Introduction to Data Compression, 4th edition, Morgan
Kaufmann, San Francisco, 2012.
G. Held and T. R. Marshall, Data Compression, 3rd edition, John Wiley
and Sons, New York, 1991.
N. S. Jayant and P. Noll, Digital Coding of Waveforms, Prentice Hall,
Englewood Cliffs, 1984.
B. E. Usevitch, A tutorial on modern lossy wavelet image compression:
foundations of JPEG 2000, IEEE Signal Processing Magazine, vol. 18, no.
5, 2001.
D. Pan, Digital audio compression, Digital Technical Journal, vol. 5, no.
2, 1993.
M. Hans and R. W. Schafer, Lossless compression of digital audio, IEEE
Signal Processing Magazine, vol. 18, no. 4, 2001.
G. E. Blelloch, Introduction to Data Compression, course notes,
Computer Science Department, Carnegie Mellon University