Jennie G. Abraham, Fall 2009, EE5355. Reference Book: THE TRANSFORM AND DATA COMPRESSION HANDBOOK, edited by K.R. Rao and P.C. Yip.
4.0 Transform Introduction

In general, several characteristics are desirable for the purpose of data compression. Transforms are useful entities that encapsulate some or all of these characteristics:

Data decorrelation: The ideal transform completely decorrelates the data in a sequence/block; i.e., it packs the largest amount of energy into the fewest number of coefficients. In this way, many coefficients can be discarded after quantization and prior to encoding. It is important to note that the transform operation itself does not achieve any compression; it aims at decorrelating the original data and compacting a large fraction of the signal energy into relatively few transform coefficients.

Data-independent basis functions: Owing to the large statistical variations among data, the optimum transform usually depends on the data, and finding the basis functions of such a transform is a computationally intensive task. This is particularly a problem if the data blocks are highly nonstationary, which necessitates the use of more than one set of basis functions to achieve high decorrelation. Therefore, it is desirable to trade optimum performance for a transform whose basis functions are data-independent.

Fast implementation: The number of operations required for an n-point transform is generally of the order O(n^2). Some transforms have fast implementations, which reduce the number of operations to O(n log n). For a separable n × n 2-D transform, performing the row and column 1-D transforms successively reduces the number of operations from O(n^4) to O(2n^2 log n).

4.1 DCT Introduction
The discrete cosine transform (DCT) and discrete sine transform (DST) are members of a family of sinusoidal unitary transforms. They are real, orthogonal, and separable, with fast algorithms for their computation, and they have great relevance to data compression.

Sinusoidal unitary transform: an invertible linear transform whose kernel describes a set of complete, orthogonal discrete cosine and/or sine basis functions. E.g., the KLT, generalized DFT, generalized discrete Hartley transform, and various types of the DCT and DST are members of this class of unitary transforms.

The family of discrete trigonometric transforms consists of 8 versions of the DCT. Each transform is identified as EVEN or ODD and of type I, II, III, or IV. All present digital signal and image processing applications (mainly transform coding and digital filtering of signals) involve only the even types of the DCT and DST. Therefore, we consider these four even types of DCT:

DCT-I (Wang and Hunt): defined for the order N+1.
DCT-II (Ahmed, Natarajan, and Rao): excellent energy compaction property; best approximation to the optimal KLT.
DCT-III (Ahmed, Natarajan, and Rao): inverse of DCT-II.
DCT-IV (Jain): fast implementation of the lapped orthogonal transform for efficient transform/subband coding.
Note: For the normalized even types of the DCT in matrix form:
- N is assumed to be an integer power of 2, i.e., N = 2^m
- the subscript of a matrix denotes its order
- the superscript denotes the version number
Unitary Property: each normalized DCT matrix is unitary; since it is real, its inverse is simply its transpose, [C]^-1 = [C]^T.
Linearity Property: for a matrix M, constants α and β, and vectors f and g, M(αf + βg) = αMf + βMg; thus all DCTs are linear transforms.
The Convolution-Multiplication Property Convolution in the spatial domain is equivalent to taking an inverse transform of the product of forward transforms of two data sequences. The convolution multiplication property is a powerful tool for performing digital filtering in the transform domain.
All DCTs are separable transforms: a multidimensional transform can be decomposed into successive applications of one-dimensional (1-D) transforms in the appropriate directions.
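Separability can be checked numerically. The sketch below (pure Python, helper names are mine, not from the text) computes a 2-D DCT-II both straight from the double-sum definition and by applying the 1-D transform to every row and then to every column; the two agree to machine precision:

```python
import math

def dct2_1d(x):
    """Orthonormal 1-D DCT-II, direct O(N^2) evaluation."""
    N = len(x)
    return [(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
            * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
            for k in range(N)]

def dct2_2d_separable(block):
    """2-D DCT by separability: 1-D DCT on each row, then on each column."""
    rows = [dct2_1d(list(r)) for r in block]
    cols = [dct2_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def dct2_2d_direct(block):
    """2-D DCT straight from the double-sum definition, for comparison."""
    N = len(block)
    def a(k): return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return [[a(k) * a(l) * sum(block[m][n]
                               * math.cos(math.pi * (2 * m + 1) * k / (2 * N))
                               * math.cos(math.pi * (2 * n + 1) * l / (2 * N))
                               for m in range(N) for n in range(N))
             for l in range(N)]
            for k in range(N)]

B = [[1.0, 2.0, 3.0, 4.0],
     [2.0, 4.0, 6.0, 8.0],
     [3.0, 6.0, 9.0, 12.0],
     [4.0, 8.0, 12.0, 16.0]]
```

For an N × N block the direct form costs O(N^4) operations, while the separable form costs O(N^3) here (and O(N^2 log N) once each 1-D transform is done with a fast algorithm), which is exactly the saving claimed in Section 4.0.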
4.3 Relations to the KLT

The KLT is an optimal transform for data compression in a statistical sense because it decorrelates a signal in the transform domain, packs the most information into a few coefficients, and minimizes the mean-square error between the reconstructed and original signal compared to any other transform. However, the KLT is constructed from the eigenvalues and the corresponding eigenvectors of a covariance matrix of the data to be transformed; it is signal-dependent, and there is no general algorithm for its fast computation.

There is an asymptotic equivalence of the family of DCTs with respect to the KLT for a first-order stationary Markov process, in terms of the transform size and the adjacent (inter-element) correlation coefficient ρ. The performance of DCTs, particularly important in transform coding, is therefore judged against the KLT. For finite-length data, DCTs and DSTs provide different approximations to the KLT, and the best approximating transform varies with the value of the correlation coefficient ρ. E.g., as ρ → 1 the KLT reduces to DCT-II (DCT-III); as ρ → 0, to DST-I; and as ρ → -1, to DST-II (DST-III).
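The near-optimality of DCT-II for a first-order Markov source can be seen directly. The sketch below (my own illustration, not from the text) builds the covariance matrix R with entries ρ^|i-j| of an AR(1) process, rotates it into the DCT domain as C·R·C^T, and looks at the resulting coefficient variances; for ρ = 0.9 the DC coefficient alone carries most of the total variance, which is the energy packing the KLT would deliver:

```python
import math

def dct_matrix(N):
    """Orthonormal DCT-II matrix; row k is the k-th basis vector."""
    return [[(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
             * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N)]
            for k in range(N)]

N, rho = 8, 0.9
R = [[rho ** abs(i - j) for j in range(N)] for i in range(N)]  # AR(1) covariance

C = dct_matrix(N)
CR = [[sum(C[i][k] * R[k][j] for k in range(N)) for j in range(N)] for i in range(N)]
S = [[sum(CR[i][k] * C[j][k] for k in range(N)) for j in range(N)] for i in range(N)]

variances = [S[k][k] for k in range(N)]   # transform-domain coefficient variances
# variances[0] / sum(variances) is ~0.77: one of eight coefficients holds
# most of the variance; the total variance (trace) is preserved by the rotation.
```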
For infinite-length data, i.e., as the transform size N increases (N → ∞), the KLT reduces to DCT-I or DCT-IV. This asymptotic behavior implies that DCTs and DSTs can be used as substitutes for the KLT of certain random processes.

4.4 Relation to DFT
The DCT is a Fourier-related transform similar to the discrete Fourier transform (DFT), but using only real numbers. DCTs are equivalent to DFTs of roughly twice the length, operating on real data with even symmetry. The obvious distinction between a DCT and a DFT is that the former uses only cosine functions, while the latter uses both cosines and sines (in the form of complex exponentials). Compared with the DFT, the DCT has two main advantages: It is a real transform with better computational efficiency than the DFT, which by definition is a complex transform. It does not introduce discontinuities when imposing periodicity on the time signal. In the DFT, as the time signal is truncated and assumed periodic, a discontinuity is introduced in the time domain and corresponding artifacts appear in the frequency domain. But as even symmetry is assumed while truncating the time signal, no discontinuity and related artifacts are introduced in the DCT.
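The discontinuity argument can be made concrete. For a ramp signal, the periodic extension assumed by the DFT has a jump at the block boundary, so spectral energy leaks into many bins; the symmetric extension assumed by the DCT has no jump, so the energy stays in a few low-frequency coefficients. A small sketch (my own illustration, direct O(N^2) transforms):

```python
import math, cmath

def dct2_1d(x):
    """Orthonormal DCT-II."""
    N = len(x)
    return [(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
            * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
            for k in range(N)]

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * n * k / N) for n in range(N))
            for k in range(N)]

x = [float(n) for n in range(8)]   # a ramp: its periodic extension has a jump

X_dct = dct2_1d(x)
X_dft = dft(x)

# Fraction of total energy captured by the lowest-frequency coefficients.
dct_energy = sum(v * v for v in X_dct)        # equals signal energy (orthonormal)
dct_low = (X_dct[0] ** 2 + X_dct[1] ** 2) / dct_energy

dft_energy = sum(abs(v) ** 2 for v in X_dft)  # = N * signal energy (Parseval)
dft_low = (abs(X_dft[0]) ** 2 + abs(X_dft[1]) ** 2 + abs(X_dft[7]) ** 2) / dft_energy
# (bins 1 and N-1 are a conjugate pair for real input)
# dct_low is ~0.996 while dft_low is only ~0.90 for this ramp.
```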
4.5 Relevance to data compression

The performance of DCT-II is closest to the statistically optimal KLT based on a number of performance criteria:
o variance distribution
o energy packing efficiency
o residual correlation
o rate distortion
o maximum reducible bits
DCT-II also exhibits the characteristics desirable for data compression, namely:
o Data decorrelation
o Data-independent basis functions
o Fast implementation
The importance of DCT-II is further accentuated by its:
o Superiority in bandwidth compression (redundancy reduction) of a wide range of signals.
o Powerful performance in bit-rate reduction.
o Existence of fast algorithms for its implementation.
DCT-II and its inverse, DCT-III, have been employed in the international image/video coding standards, e.g., JPEG, MPEG, H.261, H.263, H.264.

4.6 DCT Computation

4.6.1 DCT Definition

The DCT-II of an N-point sequence x[n] is defined as

  X[k] = a(k) Σ_{n=0}^{N-1} x[n] cos( (2n+1)kπ / 2N ),  k = 0, 1, ..., N-1

where a(0) = √(1/N) and a(k) = √(2/N) for k ≥ 1.
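The original worked example did not survive the conversion of these notes; as a stand-in, here is a small pure-Python transcription of the DCT-II definition (direct O(N^2) evaluation with the orthonormal scaling a(k)):

```python
import math

def dct2(x):
    """DCT-II per the definition: X[k] = a(k) * sum_n x[n] cos((2n+1)k*pi/2N)."""
    N = len(x)
    def a(k): return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return [a(k) * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                       for n in range(N))
            for k in range(N)]

# A constant input puts all its energy in the DC coefficient:
X = dct2([1.0, 1.0, 1.0, 1.0])
# X[0] = a(0) * 4 = 2.0; X[1], X[2], X[3] vanish because every non-DC
# basis vector sums to zero over a period.
```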
4.6.3 Computation of DCT from DFT (using a 2N-point FFT):

To derive the DCT of an N-point real signal sequence x[n], n = 0, ..., N-1, we first construct a new sequence of 2N points:

  x'[n] = x[n],          0 ≤ n ≤ N-1
  x'[n] = x[2N-1-n],     N ≤ n ≤ 2N-1

This 2N-point sequence is assumed to repeat itself outside the range 0 ≤ n ≤ 2N-1; it is even symmetric about the half-sample point n = -1/2.

The DFT of this 2N-point even symmetric sequence can be found as:

  X'[k] = Σ_{n=0}^{2N-1} x'[n] e^{-j2πnk/2N},  k = 0, ..., 2N-1

Substituting m = 2N-1-n in the second half of the summation and factoring out the half-sample phase e^{jπk/2N} gives

  X'[k] = e^{jπk/2N} Σ_{n=0}^{N-1} x[n] [ e^{-j(2n+1)kπ/2N} + e^{j(2n+1)kπ/2N} ]
        = e^{jπk/2N} · 2 Σ_{n=0}^{N-1} x[n] cos( (2n+1)kπ / 2N )

Since the phase-shifted sequence is even, all sine (odd) terms cancel in the summation, while all cosine (even) terms add; it follows that e^{-jπk/2N} X'[k] is real and even. Note that since all terms in the summation are even symmetric, only the first half of the data points need to be used. Moreover, as the cosine function is even and periodic with period 2π, we have cos( (2n+1)(2N-k)π / 2N ) = -cos( (2n+1)kπ / 2N ), indicating that each point in the second half (k = N, ..., 2N-1) is the same, up to sign, as its corresponding point in the first half, i.e., the second half is redundant and can be dropped. Now we have the discrete cosine transform (DCT):

  X[k] = (1/2) e^{-jπk/2N} X'[k] = Σ_{n=0}^{N-1} x[n] cos( (2n+1)kπ / 2N ),  k = 0, ..., N-1
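The derivation above translates directly into code. This sketch (my own, using a direct O(N^2) DFT where a real 2N-point FFT would be used in practice) mirrors the even extension and the half-sample phase correction, and checks the result against the plain cosine sum:

```python
import math, cmath

def dct_via_2n_dft(x):
    """Unnormalized DCT-II through the DFT of the 2N-point even extension."""
    N = len(x)
    xe = list(x) + list(reversed(x))          # x'[n]: even-symmetric 2N-point sequence
    X2 = [sum(xe[n] * cmath.exp(-2j * math.pi * n * k / (2 * N)) for n in range(2 * N))
          for k in range(N)]                  # only the first N bins are needed
    # Undo the half-sample phase; the result is real (tiny imaginary residue only).
    return [(0.5 * cmath.exp(-1j * math.pi * k / (2 * N)) * X2[k]).real
            for k in range(N)]

def dct_direct(x):
    """The cosine sum from the end of the derivation, for comparison."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
            for k in range(N)]

x = [1.0, 2.0, 3.0, 4.0]
via_dft = dct_via_2n_dft(x)
direct = dct_direct(x)
```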
In matrix form, the transform can be written X = C·x where, including a scaling factor, the DCT matrix C has elements

  C[k,n] = √(2/N) cos( (2n+1)kπ / 2N ),  k, n = 0, ..., N-1

All row vectors of this DCT matrix are orthogonal and normalized except the first one (k = 0): the norm of each row is unity and the dot product of any pair of distinct rows is zero (the product terms may be expressed as the sum of a pair of cosine functions, which are each zero mean). The first row, whose elements all equal √(2/N), instead has norm √2. To make the DCT an orthonormal transform, we define a coefficient

  a(k) = 1/√2,  k = 0
  a(k) = 1,     k = 1, ..., N-1

so that the first row is modified with the factor a(0) = 1/√2, giving elements C[k,n] = a(k) √(2/N) cos( (2n+1)kπ / 2N ). As these rows are now orthonormal, the matrix is orthogonal:

  Σ_{n=0}^{N-1} C[k,n] C[k',n] = δ(k - k')

or in matrix form:

  C·C^T = C^T·C = I,  i.e.,  C^{-1} = C^T
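The orthonormality claim is easy to verify numerically: build the scaled DCT matrix and check that C·C^T is the identity. A short sketch (my own helper names):

```python
import math

def dct_matrix(N):
    """Orthonormal DCT-II matrix: a(0)*sqrt(2/N) = sqrt(1/N) for the first row."""
    return [[(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
             * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N)]
            for k in range(N)]

N = 8
C = dct_matrix(N)
G = [[sum(C[i][n] * C[j][n] for n in range(N)) for j in range(N)] for i in range(N)]
# G should be the identity matrix to machine precision: ones on the
# diagonal (unit-norm rows), zeros elsewhere (orthogonal rows).
```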
4.6.4 DCT Fast Algorithms:
1. N-point DCT via 2N-point FFT
2. N-point DCT via N-point FFT
3. Recursive Fast Algorithm
4. Sparse Matrix Factors
5. Prime Factor Algorithm for DCT
6. DIT & DIF Algorithms for DCT

Fast DCT algorithm

Forward DCT: The DCT of a sequence x[n] can be implemented by the FFT. First we define a new sequence y[m] by reordering x[n]:

  y[m] = x[2m],        y[N-1-m] = x[2m+1],    m = 0, ..., N/2 - 1

and split the DCT summation into two parts:

  X[k] = Σ_{n even} x[n] cos( (2n+1)kπ / 2N ) + Σ_{n odd} x[n] cos( (2n+1)kπ / 2N )

where the first summation is over all even terms and the second over all odd terms. For the even terms, writing n = 2m, the cosine argument becomes (4m+1)kπ/2N. For the odd terms we substitute p = N-1-m, so that x[2m+1] = y[p], and by the periodicity of the cosine the second summation becomes

  cos( (4N-4p-1)kπ / 2N ) = cos( 2πk - (4p+1)kπ/2N ) = cos( (4p+1)kπ / 2N )

and the limits of the two summations can be combined:

  X[k] = Σ_{m=0}^{N-1} y[m] cos( (4m+1)kπ / 2N )
       = Re{ e^{-jπk/2N} Σ_{m=0}^{N-1} y[m] e^{-j2πmk/N} }
       = Re{ e^{-jπk/2N} Y[k] }

where Y[k] is the N-point DFT of y[m] (defined for m = 0, ..., N-1), which can be computed with an FFT; we multiply by the phase factor e^{-jπk/2N} and take the real part of the result (keeping in mind that both x[n] and y[m] are real). Since y[m] is real, Y[k] is conjugate-symmetric (Y[N-k] = Y*[k]), and X[k] is obtained from Y[k] by one complex multiplication per coefficient.
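The three steps of this forward algorithm (reorder, N-point DFT, phase correction + real part) can be sketched as follows; the direct O(N^2) DFT below stands in for a real radix-2 FFT, and the result is checked against the plain cosine sum:

```python
import math, cmath

def dct_fast_forward(x):
    """Unnormalized DCT-II via a single N-point DFT (Makhoul-style reordering)."""
    N = len(x)
    # Step 1: even-indexed samples ascending, then odd-indexed samples descending.
    y = [x[2 * m] for m in range(N // 2)] + \
        [x[2 * m + 1] for m in range(N // 2 - 1, -1, -1)]
    # Step 2: N-point DFT of y (an FFT would make this O(N log N)).
    Y = [sum(y[m] * cmath.exp(-2j * math.pi * m * k / N) for m in range(N))
         for k in range(N)]
    # Step 3: quarter-sample phase correction, then take the real part.
    return [(cmath.exp(-1j * math.pi * k / (2 * N)) * Y[k]).real for k in range(N)]

def dct_direct(x):
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
            for k in range(N)]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
fast = dct_fast_forward(x)
ref = dct_direct(x)
```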
Inverse DCT: The most obvious way to do the inverse DCT is to reverse the order and the mathematical operations of the three steps of the forward DCT:

Step 1: Obtain Y[k] from X[k]. In step 3 above there are N equations X[k] = Re{ e^{-jπk/2N} Y[k] } but 2N variables (the real and imaginary parts of Y[k]). However, note that as the y[m] are real, Y[N-k] = Y*[k] (the real part of Y[k] is even and the imaginary part odd), leaving only N independent variables. So there are only N variables, which can be obtained by solving the N equations.

Step 2: Obtain y[m] from Y[k] by inverse DFT, using the FFT with O(N log N) complexity.

Step 3: Obtain x[n] from y[m] by undoing the reordering: x[2m] = y[m], x[2m+1] = y[N-1-m], m = 0, ..., N/2 - 1.
However, there is a more efficient way to do the inverse DCT. Consider first the real part of the inverse DFT of the sequence Z[k] = a'(k) e^{jπk/2N} X[k], where a'(0) = 1 and a'(k) = 2 for k = 1, ..., N-1:

  z[n] = Re{ (1/N) Σ_{k=0}^{N-1} Z[k] e^{j2πnk/N} }
       = (1/N) [ X[0] + 2 Σ_{k=1}^{N-1} X[k] cos( (4n+1)kπ / 2N ) ]

As 4n+1 = 2(2n)+1, this equation gives the inverse DCT of the even data points: x[2n] = z[n]. To obtain the odd data points, recall that by the periodicity of the cosine, cos( (4(N-1-n)+1)kπ / 2N ) = cos( (4n+3)kπ / 2N ); the odd points can therefore be obtained from the second half of the previous equation in reverse order: x[2n+1] = z[N-1-n]. In summary, we have these steps to compute the IDCT:

Step 1: Obtain Z[k] from X[k] by Z[k] = a'(k) e^{jπk/2N} X[k].
Step 2: Obtain z[n] from Z[k] by inverse DFT, also using the FFT. (Only the real part needs to be computed.)
Step 3: Obtain x[n] from z[n] by x[2n] = z[n], x[2n+1] = z[N-1-n], n = 0, ..., N/2 - 1.
These three steps are mathematically equivalent to the steps of the first method.
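A round-trip check ties the forward and inverse algorithms together. This sketch (my own; direct O(N^2) DFTs stand in for FFTs, and the a'(k) scaling is the one consistent with the unnormalized forward transform used in this section) recovers the input exactly:

```python
import math, cmath

def dct(x):
    """Unnormalized DCT-II, direct evaluation (reference forward transform)."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
            for k in range(N)]

def idct_fast(X):
    """Fast IDCT: phase-rotate, inverse DFT (real part only), de-interleave."""
    N = len(X)
    # Step 1: Z[k] = a'(k) e^{j pi k/2N} X[k], with a'(0)=1 and a'(k)=2 otherwise.
    Z = [(1 if k == 0 else 2) * cmath.exp(1j * math.pi * k / (2 * N)) * X[k]
         for k in range(N)]
    # Step 2: inverse DFT; only the real part is needed (an FFT would be O(N log N)).
    z = [(sum(Z[k] * cmath.exp(2j * math.pi * n * k / N) for k in range(N)) / N).real
         for n in range(N)]
    # Step 3: even samples from the first half, odd samples from the reversed second half.
    x = [0.0] * N
    for n in range(N // 2):
        x[2 * n] = z[n]
        x[2 * n + 1] = z[N - 1 - n]
    return x

data = [1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0, 34.0]
round_trip = idct_fast(dct(data))
```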
Data Compression

Although representing images in digital form allows visual information to be easily manipulated in useful and novel ways, there is one potential problem with digital images: the large number of bits required to represent even a single digital image directly. The need for image compression becomes apparent when we compute the number of bits per image resulting from typical sampling and quantization schemes. We consider the amount of storage for the Lena digital image shown in Fig. 4.7.
The monochrome (grayscale) version of this image with a resolution of 512 × 512 × 8 bits/pixel requires a total of 2,097,152 bits, or equivalently 262,144 bytes. The color version of the same image in RGB format (red, green, and blue color bands) with a resolution of 8 bits/color requires a total of 6,291,456 bits (= 512 × 512 × 3 × 8 bits), or 786,432 bytes. Such an image should be compressed for efficient storage or transmission. In order to utilize digital images effectively, specific techniques are needed to reduce the number of bits required for their representation. Fortunately, digital images generally contain a significant amount of redundancy (spatial, spectral, or temporal redundancy). Image data compression (the art/science of efficient coding of the picture data) aims at taking advantage of this redundancy to reduce the number of bits required to represent an image. This can result in significantly reducing the memory needed for image storage and channel capacity for image transmission.
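The storage figures above follow from simple arithmetic, spelled out here for clarity:

```python
# Bits and bytes for a 512 x 512 image at 8 bits per pixel.
width, height, bpp = 512, 512, 8

gray_bits = width * height * bpp     # 2,097,152 bits (grayscale)
gray_bytes = gray_bits // 8          # 262,144 bytes

rgb_bits = gray_bits * 3             # three 8-bit color bands: 6,291,456 bits
rgb_bytes = rgb_bits // 8            # 786,432 bytes
```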
Image compression methods can be classified into two fundamental groups: lossless and lossy.

Lossless compression: the reconstructed image after compression is identical to the original image. Modest 1:2 or 1:3 compression ratios are achieved.

Lossy compression: the reconstructed image contains degradations relative to the original. Generally, more compression is obtained at the expense of more distortion.

Transform Coding Compression Scheme: The most used lossy compression technique is transform coding.
A general transform coding scheme involves subdividing an N × N image into smaller nonoverlapping n × n sub-image blocks and performing a unitary transform on each block. The transform operation itself does not achieve any compression. It aims at decorrelating the original data and compacting a large fraction of the signal energy into a relatively small set of transform coefficients (energy packing property). In this way, many coefficients can be discarded after quantization and prior to encoding. In principle, the DCT introduces no loss to the source samples; it merely transforms them to a domain in which they can be more efficiently encoded.
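The scheme can be sketched end to end on a single sub-image block. The toy example below (my own; a thresholding stand-in replaces a real quantizer and entropy coder) transforms a smooth 8 × 8 block, discards small coefficients, and inverts. Because the transform is orthonormal, the reconstruction error energy equals exactly the energy of the discarded coefficients (Parseval), which is why compacting energy into few coefficients keeps the distortion small:

```python
import math

def dct_1d(x, inverse=False):
    """Orthonormal DCT-II (forward) / DCT-III (inverse), direct evaluation."""
    N = len(x)
    def a(k): return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    if not inverse:
        return [a(k) * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                           for n in range(N)) for k in range(N)]
    return [sum(a(k) * x[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N)) for n in range(N)]

def dct_2d(block, inverse=False):
    """Separable 2-D transform: rows first, then columns."""
    rows = [dct_1d(list(r), inverse) for r in block]
    cols = [dct_1d(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

n = 8
block = [[100.0 + 0.5 * i * j for j in range(n)] for i in range(n)]  # smooth sub-image

coeffs = dct_2d(block)
# Crude "quantization": discard every coefficient with magnitude below a threshold.
discarded_energy, kept = 0.0, 0
for k in range(n):
    for l in range(n):
        if abs(coeffs[k][l]) < 1.0:
            discarded_energy += coeffs[k][l] ** 2
            coeffs[k][l] = 0.0
        else:
            kept += 1

recon = dct_2d(coeffs, inverse=True)
error_energy = sum((recon[i][j] - block[i][j]) ** 2
                   for i in range(n) for j in range(n))
# error_energy == discarded_energy (up to rounding): the distortion is exactly
# the energy of what was thrown away, so good energy packing means few kept
# coefficients and little error.
```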
Most practical transform coding systems are based on the DCT of types II and III, which:
o Provides a good compromise between energy packing ability and computational complexity. The energy packing property of the DCT is superior to that of any other unitary transform: transforms that redistribute or pack the most information into the fewest coefficients provide the best sub-image approximations and, consequently, the smallest reconstruction errors.
o Has fixed (image-independent) basis images, as opposed to the optimal KLT, which is data-dependent.

E.g.: DCT-Based Image Compression/Decompression
Block diagram of encoder and decoder for JPEG DCT-based image compression and decompression.