
DISCRETE COSINE TRANSFORMS

Jennie G. Abraham, Fall 2009, EE5355. Reference book: THE TRANSFORM AND DATA COMPRESSION HANDBOOK, edited by K.R. Rao and P.C. Yip.

4.0 Transform Introduction

In general, there are several characteristics that are desirable for the purpose of data compression. Transforms are useful entities that encapsulate some or all of these characteristics:

Data decorrelation: The ideal transform completely decorrelates the data in a sequence/block; i.e., it packs the largest amount of energy into the fewest number of coefficients. In this way, many coefficients can be discarded after quantization and prior to encoding. It is important to note that the transform operation itself does not achieve any compression; it aims at decorrelating the original data and compacting a large fraction of the signal energy into relatively few transform coefficients.

Data-independent basis functions: Owing to the large statistical variations among data, the optimum transform usually depends on the data, and finding the basis functions of such a transform is a computationally intensive task. This is particularly a problem if the data blocks are highly nonstationary, which necessitates the use of more than one set of basis functions to achieve high decorrelation. Therefore, it is desirable to trade optimum performance for a transform whose basis functions are data-independent.

Fast implementation: The number of operations required for an n-point transform is generally of the order O(n^2). Some transforms have fast implementations, which reduce the number of operations to O(n log n). For a separable n x n 2-D transform, performing the row and column 1-D transforms successively reduces the number of operations from O(n^4) to O(2n^2 log n).

4.1 DCT Introduction

The discrete cosine transform (DCT) and discrete sine transform (DST) are members of a family of sinusoidal unitary transforms. They are real, orthogonal, and separable, with fast algorithms for their computation, and they have great relevance to data compression.

Sinusoidal unitary transform: an invertible linear transform whose kernel describes a set of complete, orthogonal discrete cosine and/or sine basis functions. E.g., the KLT, the generalized DFT, the generalized discrete Hartley transform, and the various types of DCT and DST are members of this class of unitary transforms.

The family of discrete trigonometric transforms consists of 8 versions of the DCT. Each transform is identified as EVEN or ODD and of type I, II, III, or IV. Present digital signal and image processing applications (mainly transform coding and digital filtering of signals) involve only the even types of the DCT and DST. Therefore, we consider these four even types of DCT:

DCT-I (Wang and Hunt): defined for the order N + 1.
DCT-II (Ahmed, Natarajan, and Rao): excellent energy compaction property; best approximation to the optimal KLT.
DCT-III (Ahmed, Natarajan, and Rao): inverse of DCT-II.
DCT-IV (Jain): fast implementation of the lapped orthogonal transform for efficient transform/subband coding.

4.1.2 Definitions of DCTs

Note: For the normalized even types of DCT in matrix form, the (n, k) entry of each matrix is obtained by evaluating the right-hand side for each n and k. N is assumed to be an integer power of 2, i.e., N = 2^m; the subscript of a matrix denotes its order, and the superscript denotes the version number.
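For reference, standard orthonormal forms of the four even DCT types, following common conventions in the literature (the exact normalization used in the original figures may differ), are:

$$[C_{N+1}^{I}]_{nk} = \sqrt{\tfrac{2}{N}}\;\epsilon_n\,\epsilon_k\,\cos\frac{\pi n k}{N}, \qquad n, k = 0, 1, \dots, N$$

$$[C_{N}^{II}]_{nk} = \sqrt{\tfrac{2}{N}}\;\epsilon_n\,\cos\frac{\pi n (2k+1)}{2N}, \qquad n, k = 0, 1, \dots, N-1$$

$$[C_{N}^{III}]_{nk} = \sqrt{\tfrac{2}{N}}\;\epsilon_k\,\cos\frac{\pi (2n+1) k}{2N} = \left([C_{N}^{II}]^{T}\right)_{nk}$$

$$[C_{N}^{IV}]_{nk} = \sqrt{\tfrac{2}{N}}\;\cos\frac{\pi (2n+1)(2k+1)}{4N}$$

where $\epsilon_j = 1/\sqrt{2}$ for $j = 0$ or $j = N$, and $\epsilon_j = 1$ otherwise.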

4.1.3 Mathematical Properties

DCT matrices are real and orthogonal.

Unitary property: since the DCT matrices are real and orthogonal, they are also unitary; the inverse of each DCT matrix equals its (conjugate) transpose.

Linearity property: for a transform matrix M, constants α and β, and vectors f and g, M(αf + βg) = αMf + βMg; i.e., all DCTs are linear transforms.

The convolution-multiplication property: convolution in the spatial domain is equivalent to taking the inverse transform of the product of the forward transforms of the two data sequences. This property is a powerful tool for performing digital filtering in the transform domain.

Separability: all DCTs are separable transforms; a multidimensional transform can be decomposed into successive applications of one-dimensional (1-D) transforms in the appropriate directions, as the sketch below illustrates.
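As an illustration of separability, the following sketch (using SciPy's scipy.fft routines; the function names belong to SciPy, not to the reference text) computes a 2-D DCT-II of a block by applying 1-D transforms along rows and then along columns, and checks the result against the library's 2-D routine.

import numpy as np
from scipy.fft import dct, dctn

block = np.random.default_rng(0).standard_normal((8, 8))

# 1-D DCT-II along rows, then along columns (separability)
rows = dct(block, type=2, norm='ortho', axis=1)
both = dct(rows, type=2, norm='ortho', axis=0)

# Direct 2-D DCT-II for comparison
direct = dctn(block, type=2, norm='ortho')

assert np.allclose(both, direct)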

4.3 Relations to the KLT

The KLT is an optimal transform for data compression in a statistical sense because it decorrelates a signal in the transform domain, packs the most information into a few coefficients, and minimizes the mean-square error between the reconstructed and original signal compared to any other transform. However, the KLT is constructed from the eigenvalues and the corresponding eigenvectors of a covariance matrix of the data to be transformed; it is signal-dependent, and there is no general algorithm for its fast computation.

There is an asymptotic equivalence of the family of DCTs with respect to the KLT for a first-order stationary Markov process, in terms of the transform size and the adjacent (inter-element) correlation coefficient ρ. The performance of the DCTs, particularly important in transform coding, is therefore judged against the KLT. For finite-length data, DCTs and DSTs provide different approximations to the KLT, and the best approximating transform varies with the value of the correlation coefficient ρ. E.g.:

ρ → 1: the KLT reduces to DCT-II (DCT-III)
ρ = 0: the KLT reduces to DST-I
ρ → -1: the KLT reduces to DST-II (DST-III)

(A numerical comparison for ρ close to 1 is sketched below.)
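As a numerical illustration (the correlation value 0.95 and the size N = 8 are arbitrary choices for this sketch), the fragment below builds the covariance matrix R[i, j] = ρ^|i-j| of a first-order stationary Markov process, computes its eigenvectors (the KLT basis), and compares them with the orthonormal DCT-II basis vectors.

import numpy as np

N, rho = 8, 0.95

# Covariance matrix of a first-order stationary Markov (AR(1)) process
idx = np.arange(N)
R = rho ** np.abs(idx[:, None] - idx[None, :])

# KLT basis: eigenvectors of R, ordered by decreasing eigenvalue
eigvals, eigvecs = np.linalg.eigh(R)
klt = eigvecs[:, ::-1].T            # rows are KLT basis vectors

# Orthonormal DCT-II basis vectors
n, m = np.meshgrid(idx, idx, indexing='ij')
C = np.sqrt(2.0 / N) * np.cos((2 * m + 1) * n * np.pi / (2 * N))
C[0, :] /= np.sqrt(2.0)

# Compare each DCT row with the corresponding KLT row (sign is arbitrary)
similarity = [abs(np.dot(C[k], klt[k])) for k in range(N)]
print(np.round(similarity, 3))      # values close to 1 indicate nearly identical basis vectors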

For infinite-length data, i.e., as the transform size N increases (N tends to infinity), the KLT reduces to DCT-I or DCT-IV. This asymptotic behavior implies that DCTs and DSTs can be used as substitutes for the KLT of certain random processes.

4.4 Relation to DFT

The DCT is a Fourier-related transform similar to the discrete Fourier transform (DFT), but it uses only real numbers. DCTs are equivalent to DFTs of roughly twice the length, operating on real data with even symmetry. The obvious distinction between a DCT and a DFT is that the former uses only cosine functions, while the latter uses both cosines and sines (in the form of complex exponentials). Compared with the DFT, the DCT has two main advantages:

It is a real transform with better computational efficiency than the DFT, which by definition is a complex transform.

It does not introduce discontinuities when imposing periodicity on the time signal. In the DFT, as the time signal is truncated and assumed periodic, discontinuities are introduced in the time domain and corresponding artifacts appear in the frequency domain. Because even symmetry is assumed when extending the truncated time signal, no such discontinuities or related artifacts are introduced by the DCT.

4.5 Relevance to data compression

The performance of DCT-II is closest to that of the statistically optimal KLT based on a number of performance criteria: variance distribution, energy packing efficiency, residual correlation, rate distortion, and maximum reducible bits.

DCT-II exhibits the characteristics desirable for data compression, namely:
o Data decorrelation
o Data-independent basis functions
o Fast implementation

The importance of DCT-II is further accentuated by its superiority in bandwidth compression (redundancy reduction) of a wide range of signals, its powerful performance in bit-rate reduction, and the existence of fast algorithms for its implementation.

DCT-II and its inverse, DCT-III, have been employed in the international image/video coding standards, e.g., JPEG, MPEG, H.261, H.263, and H.264.

4.6 DCT Computation

4.6.1 DCT Definition
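For reference, a standard statement of the orthonormal N-point DCT-II and its inverse, consistent with the derivation in Section 4.6.3 below, is:

$$X(n) = \sqrt{\frac{2}{N}}\; a(n) \sum_{m=0}^{N-1} x(m)\cos\frac{(2m+1)n\pi}{2N}, \qquad n = 0, 1, \dots, N-1$$

$$x(m) = \sqrt{\frac{2}{N}} \sum_{n=0}^{N-1} a(n)\, X(n)\cos\frac{(2m+1)n\pi}{2N}, \qquad a(n) = \begin{cases} 1/\sqrt{2}, & n = 0 \\ 1, & n \neq 0 \end{cases}$$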

4.6.2 DCT Matrix Form:

Example of a 4x4 DCT Matrix:

Example of a 4x4 IDCT Matrix:

Example: An N-point DCT matrix A can be generated directly from the definition above. Assume the signal is the vector x; then its DCT transform is X = Ax, and the inverse transform is x = A^T X, since A is orthogonal. (A numeric 4-point example is generated in the sketch below.)
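As a concrete sketch (the 4-point size and the sample signal are arbitrary illustrative choices, not the example from the reference), the following Python fragment generates the orthonormal DCT matrix from the definition and applies the forward and inverse transforms.

import numpy as np

def dct_matrix(N):
    """Orthonormal N x N DCT-II matrix: A[n, m] = sqrt(2/N)*a(n)*cos((2m+1)*n*pi/(2N))."""
    n = np.arange(N)[:, None]
    m = np.arange(N)[None, :]
    A = np.sqrt(2.0 / N) * np.cos((2 * m + 1) * n * np.pi / (2 * N))
    A[0, :] /= np.sqrt(2.0)
    return A

A = dct_matrix(4)
print(np.round(A, 4))               # 4x4 DCT matrix
print(np.round(A.T, 4))             # 4x4 IDCT matrix (the transpose, since A is orthogonal)

x = np.array([1.0, 2.0, 3.0, 4.0])  # example signal (arbitrary values)
X = A @ x                           # forward DCT: X = A x
x_rec = A.T @ X                     # inverse DCT: x = A^T X
assert np.allclose(x, x_rec)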

4.6.3 Computation of DCT from DFT (using a 2N-point FFT)

To derive the DCT of an N-point real signal sequence x[m], m = 0, ..., N-1, we first construct a new sequence of 2N points:

$$x'[m] = \begin{cases} x[m], & 0 \le m \le N-1 \\ x[2N-1-m], & N \le m \le 2N-1 \end{cases}$$

This 2N-point sequence is assumed to repeat itself outside the range $0 \le m \le 2N-1$, i.e., it is periodic with period 2N, and it is even symmetric with respect to the point m = -1/2 (and to m = N - 1/2):

$$x'[m] = x'[-1-m]$$

If we shift the signal to the right by 1/2 or, equivalently, shift the index to the left by 1/2 by defining another index m' = m + 1/2, then x'[m' - 1/2] is even symmetric with respect to m' = 0. In the following we simply represent this new function by y[m'] = x'[m' - 1/2].

The DFT of this 2N-point even symmetric sequence can be found as:

$$X'[k] = \sum_{m'} y[m']\, e^{-j 2\pi m' k / 2N} = \sum_{m'} y[m'] \left[\cos\frac{\pi m' k}{N} - j \sin\frac{\pi m' k}{N}\right]$$

where the sum runs over the 2N half-integer indices m' = -N + 1/2, ..., N - 1/2. Since $\cos(\pi m' k / N)$ is even and $\sin(\pi m' k / N)$ is odd with respect to m' = 0, all terms in the second summation are odd and the summation is zero (while all terms in the first summation are even). It can also be seen that X'[k] is real and even. Next, we replace m' by m + 1/2 and get

$$X'[k] = 2 \sum_{m=0}^{N-1} x[m] \cos\frac{(2m+1)k\pi}{2N}$$

Note that since all terms in the summation are even symmetric, only the first half of the data points need to be used. Moreover, as the cosine function is even and periodic, a point X'[2N - k] in the second half is determined (up to sign) by its corresponding point X'[k] in the first half; the second half is therefore redundant and can be dropped. Dropping the factor of 2 and introducing a scale factor $\sqrt{2/N}$ (whose role is discussed below), we have the discrete cosine transform (DCT):

$$X[n] = \sqrt{\frac{2}{N}} \sum_{m=0}^{N-1} x[m] \cos\frac{(2m+1)n\pi}{2N} = \sum_{m=0}^{N-1} c[n,m]\, x[m], \qquad n = 0, 1, \dots, N-1$$

where

$$c[n, m] = \sqrt{\frac{2}{N}} \cos\frac{(2m+1)n\pi}{2N}$$

is the element in the nth row and mth column of the DCT matrix. All row vectors of this DCT matrix are orthogonal and normalized except the first one (n = 0):

$$\langle \mathbf{c}_n, \mathbf{c}_n \rangle = \sum_{m=0}^{N-1} c^2[n, m] = \begin{cases} 2, & n = 0 \\ 1, & n = 1, \dots, N-1 \end{cases}$$

It is straightforward to show that the rows are mutually orthogonal, since the dot product of any pair of distinct rows is zero (the product terms may be expressed as the sum of a pair of cosine functions, each of which sums to zero over the index range). To make the DCT an orthonormal transform, we define a coefficient

$$a(n) = \begin{cases} 1/\sqrt{2}, & n = 0 \\ 1, & n \ne 0 \end{cases}$$

so that the DCT now becomes

$$X[n] = \sqrt{\frac{2}{N}}\, a(n) \sum_{m=0}^{N-1} x[m] \cos\frac{(2m+1)n\pi}{2N}, \qquad n = 0, \dots, N-1$$

where c[n, m] is modified with a(n),

$$c[n, m] = \sqrt{\frac{2}{N}}\, a(n) \cos\frac{(2m+1)n\pi}{2N}$$

which is also the component in the nth row and mth column of the N-by-N cosine transform matrix $\mathbf{C} = [c[n,m]]$. Here the row vectors are orthonormal:

$$\langle \mathbf{c}_i, \mathbf{c}_j \rangle = \sum_{m=0}^{N-1} c[i, m]\, c[j, m] = \delta_{ij}$$

where $\mathbf{c}_i$ is the ith row of the DCT transform matrix C. As these row vectors are orthonormal, the DCT matrix C is orthogonal:

$$\mathbf{C}^{-1} = \mathbf{C}^{T}, \qquad \mathbf{C}\mathbf{C}^{T} = \mathbf{C}^{T}\mathbf{C} = \mathbf{I}$$

The inverse DCT is

$$x[m] = \sqrt{\frac{2}{N}} \sum_{n=0}^{N-1} a(n)\, X[n] \cos\frac{(2m+1)n\pi}{2N}, \qquad m = 0, \dots, N-1$$

or in matrix form:

$$\mathbf{X} = \mathbf{C}\mathbf{x}, \qquad \mathbf{x} = \mathbf{C}^{-1}\mathbf{X} = \mathbf{C}^{T}\mathbf{X}$$
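The 2N-point FFT relation derived at the beginning of this subsection can be checked numerically. The following sketch (scale factors omitted, variable names illustrative) mirrors an N-point signal into a 2N-point even-symmetric sequence, takes its standard 2N-point FFT, and recovers the unnormalized DCT coefficients after the half-sample phase shift.

import numpy as np

N = 8
x = np.random.default_rng(1).standard_normal(N)

# Unnormalized DCT-II, directly from the defining sum
n = np.arange(N)[:, None]
m = np.arange(N)[None, :]
X_direct = np.cos((2 * m + 1) * n * np.pi / (2 * N)) @ x

# Same coefficients via a 2N-point FFT of the even-symmetric extension
x_ext = np.concatenate([x, x[::-1]])     # x'[m]: x followed by its mirror image
Y = np.fft.fft(x_ext)                    # 2N-point DFT
k = np.arange(N)
X_fft = 0.5 * np.real(np.exp(-1j * np.pi * k / (2 * N)) * Y[:N])

assert np.allclose(X_direct, X_fft)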

4.6.4 DCT Fast Algorithms

1. N-point DCT via 2N-point FFT
2. N-point DCT via N-point FFT
3. Recursive fast algorithm
4. Sparse matrix factors
5. Prime factor algorithm for DCT
6. DIT and DIF algorithms for DCT

Fast DCT algorithm

Forward DCT: The DCT of a sequence x[m] can be implemented by an N-point FFT. First we define a new sequence y[m] by reordering x[m], taking the even-indexed samples in order followed by the odd-indexed samples in reverse order:

$$y[m] = x[2m], \qquad y[N-1-m] = x[2m+1], \qquad m = 0, 1, \dots, N/2 - 1$$

Then the DCT of x[m] can be written as the following (the normalization coefficient $\sqrt{2/N}\,a(n)$ is dropped for now for simplicity):

$$X[n] = \sum_{m=0}^{N-1} x[m]\cos\frac{(2m+1)n\pi}{2N} = \sum_{m=0}^{N/2-1} x[2m]\cos\frac{(4m+1)n\pi}{2N} + \sum_{m=0}^{N/2-1} x[2m+1]\cos\frac{(4m+3)n\pi}{2N}$$

where the first summation is over all even terms and the second over all odd terms. We define m' = N - 1 - m for the second summation; then the limits of the summation become m' = N/2, ..., N-1, and since x[2m+1] = y[m'] and 4m + 3 = 4N - 4m' - 1, the second summation can be written as

$$\sum_{m'=N/2}^{N-1} y[m'] \cos\frac{(4N - 4m' - 1)n\pi}{2N} = \sum_{m'=N/2}^{N-1} y[m'] \cos\frac{(4m'+1)n\pi}{2N}$$

where the equal sign is due to the trigonometric identity

$$\cos\left(2n\pi - \frac{(4m'+1)n\pi}{2N}\right) = \cos\frac{(4m'+1)n\pi}{2N}$$

Now the two summations in the expression of X[n] can be combined:

$$X[n] = \sum_{m=0}^{N-1} y[m] \cos\frac{(4m+1)n\pi}{2N}$$

Next, consider the DFT of y[m]:

$$Y[n] = \sum_{m=0}^{N-1} y[m]\, e^{-j 2\pi m n / N}$$

If we multiply both sides by $e^{-j n\pi/2N}$ and take the real part of the result (keeping in mind that both x[m] and y[m] are real), we get:

$$\mathrm{Re}\left[ e^{-j n\pi/2N}\, Y[n] \right] = \mathrm{Re}\left[ \sum_{m=0}^{N-1} y[m]\, e^{-j (4m+1) n\pi/2N} \right] = \sum_{m=0}^{N-1} y[m] \cos\frac{(4m+1)n\pi}{2N}$$

The last equal sign is due to the identity $\mathrm{Re}[e^{-j\theta}] = \cos\theta$. This expression for $\mathrm{Re}[e^{-j n\pi/2N} Y[n]]$ is identical to that for X[n] above; therefore we get

$$X[n] = \mathrm{Re}\left[ e^{-j n\pi/2N}\, Y[n] \right], \qquad n = 0, \dots, N-1$$

where Y[n] is the DFT of y[m] (defined from x[m]), which can be computed using the FFT algorithm with time complexity O(N log N).

In summary, the fast forward DCT can be implemented in 3 steps:

Step 1: Generate the sequence y[m] from the given sequence x[m] by the reordering above.

Step 2: Obtain the DFT Y[n] of y[m] using the FFT. (As y[m] is real, Y[n] is conjugate symmetric and only half of the data points need be computed.)

Step 3: Obtain the DCT X[n] from Y[n] by $X[n] = \mathrm{Re}[e^{-j n\pi/2N} Y[n]]$ (and reapply the normalization coefficient if the orthonormal form is required).
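A compact sketch of these three steps (computing unnormalized coefficients, as in the derivation above; the variable names and the test size are illustrative):

import numpy as np

def fast_dct(x):
    """Unnormalized DCT-II of x via an N-point FFT (Steps 1-3 above)."""
    N = len(x)
    # Step 1: reorder -- even-indexed samples, then odd-indexed samples in reverse
    y = np.concatenate([x[0::2], x[1::2][::-1]])
    # Step 2: N-point DFT of the reordered sequence
    Y = np.fft.fft(y)
    # Step 3: phase-rotate and keep the real part
    n = np.arange(N)
    return np.real(np.exp(-1j * np.pi * n / (2 * N)) * Y)

N = 16
x = np.random.default_rng(2).standard_normal(N)

# Direct evaluation of the defining sum, for comparison
n = np.arange(N)[:, None]
m = np.arange(N)[None, :]
X_direct = np.cos((2 * m + 1) * n * np.pi / (2 * N)) @ x

assert np.allclose(fast_dct(x), X_direct)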

Inverse DCT: The most obvious way to compute the inverse DCT is to reverse the order and the mathematical operations of the three steps of the forward DCT:

Step 1: Obtain Y[n] from X[n]. In Step 3 above there are N equations but 2N unknowns (both the real and imaginary parts of Y[n]). However, note that as y[m] is real, the real part of its spectrum Y[n] is even (N/2 + 1 independent values) and the imaginary part is odd (N/2 - 1 independent values). So there are only N unknowns, which can be obtained by solving the N equations.

Step 2: Obtain y[m] from Y[n] by the inverse DFT, also using the FFT, in O(N log N) complexity.

Step 3: Obtain x[m] from y[m] by undoing the reordering.

However, there is a more efficient way to compute the inverse DCT. Consider the real part of the inverse DFT of the sequence

$$Z[n] = e^{j n\pi/2N}\left( X[n] - j\, X[N-n] \right), \qquad n = 0, \dots, N-1, \quad X[N] \triangleq 0$$

Using $\cos\frac{(4m+1)(N-n)\pi}{2N} = \sin\frac{(4m+1)n\pi}{2N}$, it follows that $\mathrm{Im}[e^{-jn\pi/2N} Y[n]] = -X[N-n]$; together with $\mathrm{Re}[e^{-jn\pi/2N} Y[n]] = X[n]$ this gives Z[n] = Y[n] exactly, so the real part of the inverse DFT of Z[n] recovers y[m]:

$$y[m] = \mathrm{Re}\left[ \frac{1}{N}\sum_{n=0}^{N-1} Z[n]\, e^{j 2\pi m n / N} \right]$$

This equation gives all the even data points: x[2m] = y[m] for m = 0, ..., N/2 - 1. To obtain the odd data points, recall the definition of y[m]: all odd data points x[2m+1] = y[N-1-m] can be obtained from the second half of the previous result in reverse order. In summary, we have these steps to compute the IDCT:

Step 1: Generate the sequence Z[n] from the given DCT sequence X[n] as above.

Step 2: Obtain y[m] from Z[n] by the inverse DFT, also using the FFT. (Only the real part need be computed.)

Step 3: Obtain x[m] from y[m]: the even samples from the first half, the odd samples from the second half in reverse order.

These three steps are mathematically equivalent to the steps of the first method.
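A sketch of this more efficient inverse (again for unnormalized coefficients, with illustrative names), checked by a round trip against a direct evaluation of the forward sum:

import numpy as np

def fast_idct(X):
    """Inverse of the unnormalized DCT-II, via an N-point inverse FFT (Steps 1-3 above)."""
    N = len(X)
    n = np.arange(N)
    # Step 1: rebuild the DFT of the reordered sequence y from the DCT coefficients,
    # using Y[n] = exp(j*pi*n/(2N)) * (X[n] - j*X[N-n]), with X[N] taken as 0
    X_rev = np.concatenate([[0.0], X[:0:-1]])
    Y = np.exp(1j * np.pi * n / (2 * N)) * (X - 1j * X_rev)
    # Step 2: inverse DFT; the result equals y and is real up to round-off
    y = np.real(np.fft.ifft(Y))
    # Step 3: undo the reordering -- first half gives the even samples,
    # second half in reverse order gives the odd samples
    x = np.empty(N)
    x[0::2] = y[:N // 2]
    x[1::2] = y[:N // 2 - 1:-1]
    return x

# Round-trip check against a direct evaluation of the forward DCT sum
N = 16
x = np.random.default_rng(3).standard_normal(N)
nn, mm = np.meshgrid(np.arange(N), np.arange(N), indexing='ij')
X = np.cos((2 * mm + 1) * nn * np.pi / (2 * N)) @ x   # unnormalized forward DCT
assert np.allclose(fast_idct(X), x)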

Data Compression

Although representing images in digital form allows visual information to be easily manipulated in useful and novel ways, there is one potential problem with digital images: the large number of bits required to represent even a single digital image directly. The need for image compression becomes apparent when we compute the number of bits per image resulting from typical sampling and quantization schemes. We consider the amount of storage for the Lena digital image shown in Fig. 4.7.

The monochrome (grayscale) version of this image with a resolution of 512 x 512 pixels at 8 bits/pixel requires a total of 2,097,152 bits, or equivalently 262,144 bytes. The color version of the same image in RGB format (red, green, and blue color bands) at 8 bits per color band requires a total of 6,291,456 bits (= 512 x 512 x 3 x 8 bits), or 786,432 bytes. Such an image should be compressed for efficient storage or transmission. In order to utilize digital images effectively, specific techniques are needed to reduce the number of bits required for their representation. Fortunately, digital images generally contain a significant amount of redundancy (spatial, spectral, or temporal). Image data compression (the art/science of efficient coding of picture data) aims at taking advantage of this redundancy to reduce the number of bits required to represent an image. This can significantly reduce the memory needed for image storage and the channel capacity needed for image transmission.

Image compression methods can be classified into two fundamental groups: lossless and lossy.

Lossless compression: The reconstructed image after compression is identical to the original image. Modest compression ratios of 1:2 or 1:3 are achieved.

Lossy compression: The reconstructed image contains degradations relative to the original. Generally, more compression is obtained at the expense of more distortion.

Transform Coding Compression Scheme

The most widely used lossy compression technique is transform coding.

A general transform coding scheme involves subdividing an N x N image into smaller nonoverlapping n x n sub-image blocks and performing a unitary transform on each block. The transform operation itself does not achieve any compression. It aims at decorrelating the original data and compacting a large fraction of the signal energy into a relatively small set of transform coefficients (energy packing property). In this way, many coefficients can be discarded after quantization and prior to encoding. In principle, the DCT introduces no loss to the source samples; it merely transforms them to a domain in which they can be more efficiently encoded.

Most practical transform coding systems are based on the DCT of types II and III, which provide a good compromise between energy packing ability and computational complexity. The energy packing property of the DCT is superior to that of other data-independent unitary transforms. Transforms that redistribute or pack the most information into the fewest coefficients provide the best sub-image approximations and, consequently, the smallest reconstruction errors. DCT basis images are fixed (image independent), as opposed to the optimal KLT, which is data dependent. E.g.: DCT-based image compression/decompression, sketched below.

Block diagram of encoder and decoder for JPEG DCT-based image compression and decompression.
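A toy sketch of such a block-transform coding loop (not the JPEG standard: the 8 x 8 block size, the uniform quantization step, and the synthetic test data are arbitrary illustrative choices):

import numpy as np
from scipy.fft import dctn, idctn

def encode_decode(image, block=8, step=20.0):
    """Block DCT -> uniform quantization -> inverse DCT, one block at a time."""
    h, w = image.shape
    out = np.zeros_like(image, dtype=float)
    for i in range(0, h, block):
        for j in range(0, w, block):
            tile = image[i:i + block, j:j + block]
            coeffs = dctn(tile, type=2, norm='ortho')      # 2-D DCT-II of the block
            quantized = np.round(coeffs / step) * step     # coarse uniform quantization
            out[i:i + block, j:j + block] = idctn(quantized, type=2, norm='ortho')
    return out

# Example on a synthetic 64 x 64 "image" (random data, for illustration only)
rng = np.random.default_rng(4)
img = rng.integers(0, 256, size=(64, 64)).astype(float)
rec = encode_decode(img)
print("mean absolute error:", np.mean(np.abs(img - rec)))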
