
A Sasmito Adibowo – 1299000029

Digital Image Processing


This paper surveys several topics in the field of digital image processing. The
discussion consists of brief introductions to image compression, data redundancy,
image segmentation, and classification.

Image Compression
The main motivation for image reduction and compression techniques is that
digital images typically are large two-dimensional matrices of intensity values,
and these matrices often contain redundant data. The goal is to reduce the
amount of data required to store, manipulate, and transmit digital images.

Relative Data Redundancy


Data redundancy – the central issue in digital image compression – is not an
abstract concept, but a mathematically quantifiable entity. If n1 and n2 denote
the amounts of data in two sets that represent the same information, the relative
redundancy RD of the first data set (the one characterized by n1) can be defined
as RD = 1 − 1/CR, where CR, commonly called the compression ratio, is
CR = n1/n2.

For the case n2 = n1, CR = 1 and RD = 0, saying that the first representation of the
information contains no redundant data compared with the second data set.
When n2 << n1, CR → ∞ and RD → 1, implying highly redundant data. In the case
n2 >> n1, CR → 0 and RD → −∞, saying that the second data set contains much
more data than the first.
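As a rough illustration, the following Python sketch evaluates these two formulas
for a hypothetical 512×512, 8-bit image compressed to 64 KB; the sizes are
illustrative values only, not figures from the references.

# A small worked example of the formulas above; the image size and the
# compressed size are hypothetical illustration values.
n1 = 512 * 512          # bytes in the original 8-bit, 512x512 image
n2 = 65_536             # bytes after compression (hypothetical)

compression_ratio = n1 / n2                        # CR = n1 / n2
relative_redundancy = 1 - 1 / compression_ratio    # RD = 1 - 1/CR

print(compression_ratio, relative_redundancy)      # 4.0 and 0.75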

Redundancy in digital images consists of three main categories: coding redundancy,
interpixel redundancy, and psychovisual redundancy. Data compression is
achieved when one or more of these redundancies are reduced or eliminated.

Coding Redundancy
Redundancy of this type is caused by the internal representation of digital
images, which is typically a matrix of intensity values. These values are stored
using equal-sized codes regardless of their frequency of occurrence. Data
compression techniques that address this type of redundancy assign shorter
codes to the most frequently occurring intensity values and longer codes to the
less frequent ones. This process is commonly called variable-length coding.
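The Python sketch below illustrates the idea of variable-length coding by
computing Huffman-style code lengths from symbol frequencies; the toy pixel
values are hypothetical, and the code is only an illustrative sketch, not an
implementation taken from the references.

# A minimal sketch of variable-length (Huffman) coding for gray-level
# frequencies; the example pixel values are hypothetical.
import heapq
from collections import Counter

def huffman_code_lengths(frequencies):
    """Return a {symbol: code length} map built from symbol frequencies."""
    # Each heap entry: (total frequency, tie-breaker, list of (symbol, depth)).
    heap = [(freq, i, [(sym, 0)]) for i, (sym, freq) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        merged = [(sym, depth + 1) for sym, depth in a + b]
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return dict(heap[0][2])

pixels = [0, 0, 0, 0, 128, 128, 255, 64]      # toy image data
lengths = huffman_code_lengths(Counter(pixels))
print(lengths)   # the most frequent gray level receives the shortest code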

Interpixel Redundancy
Most images from the real world are continuous; that is, most pixel values differ
little from their neighbors. In the digitization process, the codes assigned to
pixels have nothing to do with the correlation between pixels. For example, an
image of a textbook page consists of mostly white background with black text
arranged in an ordered fashion: the white pixels of the background are mostly
adjacent to other white pixels, and the black pixels that make up the letters
are mostly – although not always – adjacent to other black pixels. Redundancy of
this type has a variety of names, including spatial redundancy, geometric
redundancy, and interframe redundancy.
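One simple way to exploit this correlation is to store differences between
neighboring pixels rather than the raw values; the short Python sketch below
illustrates the idea on a hypothetical scanline (the values are not from the
references).

# A minimal sketch of exploiting interpixel redundancy by storing differences
# between neighboring pixels instead of the raw values; the scanline below is
# a hypothetical example.
row = [200, 200, 201, 201, 202, 90, 90, 90]   # slowly varying scanline

# Previous-pixel (first-order) prediction: keep the first value, then deltas.
deltas = [row[0]] + [cur - prev for prev, cur in zip(row, row[1:])]
print(deltas)          # [200, 0, 1, 0, 1, -112, 0, 0] -- mostly small values

# The original row is recovered exactly by a running sum, so no information
# is lost; the small deltas are what a variable-length coder compresses well.
recovered = []
total = 0
for d in deltas:
    total += d
    recovered.append(total)
assert recovered == row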

Psychovisual Redundancy
The self-adjusting nature of the human visual system means that intensity
variations (Mach bands) can be perceived in a region of constant intensity. This
comes from the fact that although the human eye can recognize a wide range of
intensity values, it can only recognize a small subset of those values at any given
time. The human observer does not analyze every pixel or luminance value in the
image, but searches for distinguishing features such as edges or textural regions.
Certain information simply has less relative importance than other information in
normal visual processing. This information can be eliminated without significantly
impairing the quality of image perception.

Psychovisual redundancy is fundamentally different from the redundancies
described earlier. Unlike coding or interpixel redundancy, psychovisual
redundancy is associated with how humans visually interpret image data.
Elimination of this type of redundancy entails the loss of quantifiable
information. Thus, compression that addresses this type of redundancy is
inherently irreversible because it discards some visual information.

Compression Methods

There are two main categories of image compression methods, depending on how
much data is retained: error-free compression and lossy compression. In error-free
– or lossless – compression, image data are compressed using a reversible
transformation. Thus, the image data can be reconstructed in their entirety. On
the other hand, lossy compression eliminates unimportant information in the
image.

Error-free compression techniques include Huffman coding, arithmetic coding,
bit-plane coding, constant area coding, run-length coding, contour tracing, and
lossless predictive coding. The performance of these coding methods – both the
compression ratio and the processing power required – varies depending on the
particular input image. Thus, each method is used for compressing different
types of images. For example, a variation of run-length coding is used in the
CCITT standard for fax transmission, since the typical fax document consists of
mostly white space interspersed with black text.
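The following Python sketch shows the run-length idea behind such fax-style
coding on a hypothetical bilevel scanline; it is an illustrative sketch only, not
the CCITT scheme itself.

# A minimal sketch of run-length coding; the bilevel scanline is hypothetical.
def run_length_encode(scanline):
    """Encode a sequence as (value, run length) pairs."""
    runs = []
    for value in scanline:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(v, n) for v, n in runs]

# 0 = black text pixel, 1 = white background pixel.
scanline = [1] * 20 + [0] * 3 + [1] * 40 + [0] * 2 + [1] * 15
print(run_length_encode(scanline))   # [(1, 20), (0, 3), (1, 40), (0, 2), (1, 15)]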

Error-free compression is used for images for which the loss of data is not
allowed – or is even illegal in some circumstances. Examples of such applications
are medical X-ray images, satellite photography, and business documents. In
these and other cases, the nature or the intended use of the image motivates the
need for error-free compression.

Lossy compression techniques include the Discrete Cosine Transform (DCT) –
now used to compress JPEG images – Improved Gray Scale (IGS) quantization,
lossy predictive coding, and Differential Pulse Code Modulation (DPCM). Like the
error-free compression techniques described earlier, the performance of the
various techniques also depends on the input image.
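To make the DCT concrete, the sketch below computes the 2-D DCT of one 8×8
block, the building block of JPEG-style coding. It is written directly from the
DCT-II definition rather than any particular library, and the constant-intensity
block is a hypothetical example.

# A minimal sketch of the 2-D Discrete Cosine Transform on one 8x8 block.
import numpy as np

def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block."""
    n = block.shape[0]
    # 1-D DCT basis matrix C, so that the 2-D transform is C @ block @ C.T.
    c = np.zeros((n, n))
    for u in range(n):
        alpha = np.sqrt(1.0 / n) if u == 0 else np.sqrt(2.0 / n)
        for x in range(n):
            c[u, x] = alpha * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    return c @ block @ c.T

block = np.full((8, 8), 128.0)        # a flat block of constant intensity
coeffs = dct_2d(block)
# Only the DC coefficient is non-zero for a constant block; lossy coders then
# quantize (and often discard) the small high-frequency coefficients.
print(round(coeffs[0, 0], 1), np.abs(coeffs[1:, 1:]).max() < 1e-9)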

Lossy compression is used for image storage and retrieval purposes that can
tolerate a slight – relative to the particular application – degradation of the image
data. Applications such as this include real-life photographs, television and
motion picture transmission.

Image Segmentation
The first step in image analysis is to segment the image. Image segmentation is
the process of subdividing the image into its constituent parts or objects. Image
segmentation is normally followed by image classification, which is the process of
classifying image segments into meaningful objects. For example, in the
application of a heat-seeking missile system, the targeting circuitry is given data
from the heat sensor as an image in which heat is coded with increasing intensity.
The targeting circuitry then segments the image according to the various heat
ranges and classifies the hottest object as the target, separating it from other
unimportant objects.

Image segmentation is done in two ways, edge-based and region-based. Edge-based
segmentation works by detecting discontinuities in the image and using
those discontinuities as the outlines of the segments. Region-based segmentation
works by grouping similar-valued adjacent pixels.

Edge-based Segmentation
This type of segmentation detects edges in the image by using the first or second
derivative of the image to produce outlines that border the segments. The
derivative is obtained by applying various masks to the image. These masks are
n×n matrices proposed by Sobel, Laplace, Kirsch, and various others. The masks
each have their own output characteristics, and the selection of a particular
mask depends largely on the input image.
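The Python sketch below illustrates gradient-based edge detection with the
standard Sobel masks using plain NumPy; the small step-edge image is a
hypothetical example, and the convolution is written out explicitly for clarity
rather than efficiency.

# A minimal sketch of edge detection with the 3x3 Sobel masks.
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def convolve3x3(image, mask):
    """Apply a 3x3 mask to every interior pixel (borders left at zero)."""
    out = np.zeros_like(image, dtype=float)
    for i in range(1, image.shape[0] - 1):
        for j in range(1, image.shape[1] - 1):
            region = image[i - 1:i + 2, j - 1:j + 2]
            out[i, j] = np.sum(region * mask)
    return out

image = np.zeros((6, 6))
image[:, 3:] = 255.0                  # a vertical step edge

gx = convolve3x3(image, SOBEL_X)
gy = convolve3x3(image, SOBEL_Y)
magnitude = np.hypot(gx, gy)          # gradient magnitude; large at the edge
print(magnitude[2])                   # peaks around columns 2-3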

Edge-based segmentation can detect fine edges in the image, but it typically
enhances noise as well. Although this technique requires a relatively large
amount of processing power – on the order of n² operations per pixel for an n×n
mask – it always terminates in a finite amount of time, unlike image clustering –
a technique of region-based segmentation – which may not terminate for certain
types of images.

Region-based Segmentation
This type of segmentation segments the image by grouping similar-valued
neighboring pixels. Techniques include thresholding, clustering, region growing,
and splitting.

Thresholding works by grouping intensity values using a set of predefined
threshold values as boundaries. Intensity values that fall between the same pair
of boundary values are assigned the same value, and the corresponding pixels are
thus grouped into the same segment.
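The sketch below shows this on a small array using NumPy; the threshold values
and image are hypothetical illustration values.

# A minimal sketch of thresholding-based segmentation.
import numpy as np

def threshold_segment(image, thresholds):
    """Assign each pixel the index of the intensity interval it falls in."""
    # np.digitize returns, for each pixel, how many thresholds it exceeds,
    # which is exactly a segment label per intensity interval.
    return np.digitize(image, bins=sorted(thresholds))

image = np.array([[ 10,  20, 200],
                  [ 15, 120, 210],
                  [ 90, 100, 250]])
labels = threshold_segment(image, thresholds=[64, 128, 192])
print(labels)   # 0 = darkest interval, 3 = brightest interval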

Clustering works by first randomly picking a certain number of pixels as starting
points and then grouping other pixels whose intensity values are near – within a
predetermined tolerance of – the starting points. The average value of each group
is then calculated and used as the new starting point. This process is repeated
until no new starting points are selected.
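A minimal k-means-style sketch of this intensity clustering is given below; the
pixel values, number of clusters, and iteration limit are hypothetical choices, not
parameters from the references.

# A minimal sketch of intensity clustering in the style described above.
import numpy as np

def cluster_intensities(pixels, k, iterations=20, seed=0):
    """Cluster 1-D intensity values around k centers; return (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(pixels, size=k, replace=False).astype(float)
    for _ in range(iterations):
        # Assign each pixel to its nearest center.
        labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
        # Recompute each center as the mean intensity of its group.
        new_centers = np.array([
            pixels[labels == c].mean() if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break                      # no starting point changed: converged
        centers = new_centers
    return centers, labels

pixels = np.array([12, 15, 10, 200, 210, 205, 98, 102, 100], dtype=float)
centers, labels = cluster_intensities(pixels, k=3)
print(np.sort(centers), labels)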

Region growing – a bottom-up approach – works by first randomly picking a
certain number of pixels as starting points and then selecting adjacent pixels
whose intensity values are near those of the starting points. These pixels are
grouped into a region, and the region is expanded in the same way until no more
pixels meet the criterion or the regions collide.
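The following sketch grows a region from a single seed pixel using 4-neighbor
connectivity; the image, seed position, and tolerance are hypothetical
illustration values.

# A minimal sketch of region growing from a single seed pixel.
import numpy as np
from collections import deque

def grow_region(image, seed, tolerance):
    """Return a boolean mask of pixels reachable from seed within tolerance."""
    region = np.zeros(image.shape, dtype=bool)
    region[seed] = True
    frontier = deque([seed])
    seed_value = float(image[seed])
    while frontier:
        i, j = frontier.popleft()
        # Examine the 4-neighbors of the current pixel.
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if (0 <= ni < image.shape[0] and 0 <= nj < image.shape[1]
                    and not region[ni, nj]
                    and abs(float(image[ni, nj]) - seed_value) <= tolerance):
                region[ni, nj] = True
                frontier.append((ni, nj))
    return region

image = np.array([[ 10,  12, 200, 205],
                  [ 11,  13, 198, 202],
                  [ 90,  14, 199, 201],
                  [ 91,  92, 197, 203]])
mask = grow_region(image, seed=(0, 0), tolerance=10)
print(mask.astype(int))   # the dark top-left region is grown from the seed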

Region splitting – a top-down approach – works by subdividing the image into a
set of arbitrary disjoint regions and then further merging and/or splitting the
regions to meet the segmentation criteria. The approach subdivides the image
into four disjoint regions forming a quadtree (a tree in which each node has four
descendants), subdividing further when the criterion of a region is not met, or
merging adjacent regions to form a segment. The steps can be summarized as
follows (a small sketch of the split step appears after the list):

1. split into four disjoint quadrants any region Ri for which P(Ri) = FALSE;
2. merge any adjacent regions Rj and Rk for which P(Rj ∪ Rk) = TRUE; and
3. stop when no further merging or splitting is possible.
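The sketch below implements only the recursive quadtree split; the homogeneity
predicate (intensity range within a limit) and the test image are hypothetical
choices, not the predicate used in the references.

# A minimal sketch of the quadtree split step: a region is split into four
# quadrants whenever the homogeneity predicate P is false.
import numpy as np

def predicate(region, limit=10):
    """P(R): TRUE when the region's intensity range is within the limit."""
    return region.max() - region.min() <= limit

def split(image, top=0, left=0, size=None, leaves=None):
    """Recursively split; return a list of (top, left, size) homogeneous leaves."""
    if size is None:
        size, leaves = image.shape[0], []
    region = image[top:top + size, left:left + size]
    if predicate(region) or size == 1:
        leaves.append((top, left, size))
    else:
        half = size // 2
        for dt, dl in ((0, 0), (0, half), (half, 0), (half, half)):
            split(image, top + dt, left + dl, half, leaves)
    return leaves

image = np.zeros((8, 8), dtype=int)
image[:4, :4] = 100                       # one bright quadrant
print(split(image))                       # the bright quadrant stays one leaf
# A full split-and-merge pass would then merge adjacent leaves Rj, Rk for
# which P(Rj ∪ Rk) is TRUE.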

Decision-theoretic segmentation is based on the probability density function and
the maximum-likelihood decision rule. This technique is rather complex and
involves the use of artificial intelligence to segment the image based on sample
image segments used to train the system.

Combination of Edge-based and Region-based Segmentation


The fusion of edge-based and region-based segmentation attempts to take the
best of both techniques. The procedure is as follows:

1. Perform edge-detection that will result in an image with only edge and non-
edge pixels.

2. Separate the regions by searching for connected regions – sets of 4-neighbor
non-edge pixels.
3. Let pi be the perimeter and ni the area of region Ri; let Bij be the length of
the shared boundary between regions Ri and Rj, and Eij the number of edge
pixels on that boundary. For each pair Ri and Rj, calculate these three criteria
(a small sketch of this merge test follows the list):

Boundary strength: Γij¹ = Eij / Bij. Γij¹ has a value between 0 and 1; a larger
value suggests a stronger border between the regions Ri and Rj.

Similarity measure: Γij² = (mRi − mRj)² / (450 × scaling factor). A smaller value
of Γij² suggests a greater chance that Ri and Rj are really a single region.

Connectivity measure: Γij³ = min(pi, pj) / (4 × Bij). The possibility of merging is
greater when the shared boundary between the two regions is long relative to
their perimeters.

4. Regions Ri and Rj are combined when these conditions are fulfilled:
• ni < nj (one region is smaller);
• Γij¹ < ts (e.g., fewer than 40% of the shared boundary pixels are actual edge
pixels);
• Γij² × Γij³ = min over all k of (Γik² × Γik³), and this value is below tc (among all
neighbors Rk of Ri, the region Rj best fits both conditions above and may be
combined with Ri).
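The Python sketch below evaluates the three merge criteria for one pair of
regions; the region statistics and thresholds are hypothetical illustration values,
and only the quoted formulas are taken from the text above.

# A minimal sketch of the merge test from the combined edge/region procedure.
def merge_criteria(e_ij, b_ij, mean_i, mean_j, p_i, p_j, scaling_factor=1.0):
    """Return (boundary strength, similarity, connectivity) for regions Ri, Rj."""
    gamma1 = e_ij / b_ij                                        # boundary strength
    gamma2 = (mean_i - mean_j) ** 2 / (450.0 * scaling_factor)  # similarity
    gamma3 = min(p_i, p_j) / (4.0 * b_ij)                       # connectivity
    return gamma1, gamma2, gamma3

# Two regions sharing a 20-pixel boundary of which 3 pixels are edge pixels.
g1, g2, g3 = merge_criteria(e_ij=3, b_ij=20, mean_i=100, mean_j=110,
                            p_i=60, p_j=80, scaling_factor=1.0)
t_s, t_c = 0.4, 0.5                             # hypothetical thresholds
print(g1 < t_s and g2 * g3 < t_c)   # weak boundary and similar regions: merge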
Image Classification
There are two types of image classification: supervised and unsupervised. In
supervised classification the image processing system is controlled by a human
operator who is typically an expert in the type of image being processed. In
unsupervised classification the computer system processes the image without any
human intervention.

The steps of image classification are:

1. Object categorization
In supervised classification, objects in the image are categorized by the
human operator. In unsupervised classification, clusters are classified as
objects.
2. Training data selection
In supervised classification, a sample set is selected and labeled, then the
statistics of each object category are calculated. In unsupervised
classification, the samples are unlabeled and the statistics of each cluster
are calculated.
One group of image classification methods is the decision-theoretic methods. These
methods use decision (or discriminant) functions, one function for each pattern
class. They work by feeding the unknown pattern into each decision function; the
function that yields the largest value determines the pattern class of the unknown
pattern. In other words, by substituting the unknown pattern x into the decision
functions f1(x), f2(x), ..., fn(x), if the function fi(x) results in the largest
numerical value, then x is classified as a member of pattern class i. Ties are
resolved arbitrarily.
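The sketch below shows this decision rule directly: every decision function is
evaluated on the unknown pattern and the class with the largest value wins. The
two decision functions (negative squared distance to a class mean) and the
pattern values are hypothetical.

# A minimal sketch of decision-theoretic classification.
def classify(pattern, decision_functions):
    """Return the index i whose decision function f_i(pattern) is largest."""
    scores = [f(pattern) for f in decision_functions]
    return max(range(len(scores)), key=lambda i: scores[i])  # ties: lowest index

# Hypothetical minimum-distance decision functions around two class means.
means = [(10.0, 10.0), (200.0, 180.0)]
decision_functions = [
    # Larger value = closer to the class mean (negative squared distance).
    lambda x, m=m: -((x[0] - m[0]) ** 2 + (x[1] - m[1]) ** 2)
    for m in means
]
print(classify((15.0, 12.0), decision_functions))   # 0: nearest to the first mean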

The geometric approach of the decision-theoretic method is to use the minimum-distance
classifier as the discriminant function. The function is

d = √((x2 − x1)² + (y2 − y1)²)

The statistical approach of the decision-theoretic method is to use the Gaussian
maximum-likelihood classifier as the discriminant function. The function is

p(x | ci) = (1 / (√(2π) · σi)) · exp(−(x − µi)² / (2σi²))
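The sketch below applies this Gaussian likelihood to a single intensity feature
and picks the most likely class; the per-class means and standard deviations are
hypothetical training statistics, not values from the references.

# A minimal sketch of Gaussian maximum-likelihood classification on one feature.
import math

def gaussian_likelihood(x, mean, sigma):
    """p(x | class) for a univariate Gaussian with the given mean and sigma."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))

# Two hypothetical classes: dark background and bright object.
classes = {"background": (30.0, 12.0), "object": (190.0, 20.0)}

x = 175.0                                   # intensity of the unknown pixel
likelihoods = {name: gaussian_likelihood(x, m, s) for name, (m, s) in classes.items()}
print(max(likelihoods, key=likelihoods.get))   # "object": the most likely class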

Object Recognition and Image Interpretation


The goal of automated image analysis is to produce a computer system capable of
approximating the intelligence possessed by human beings. Specifically these
systems should be able to:

• Extract pertinent information from a background of irrelevant details.
• Learn from examples and generalize this knowledge so that it will apply in
new and different circumstances.
• Make inferences from incomplete information.
The current state-of-the-art image analysis systems are based on heuristic
formulas tailored to solve specific problems. For example, some machines can
read printed, properly formatted documents at speeds faster than the average
skilled human reader. However, these systems are too specific and have almost
no extensibility. Thus, current theoretical and implementation limitations in the
field of image analysis imply solutions that are highly problem dependent.

In image analysis systems, the objects of interest are separated from the background
using one or more of the various segmentation and classification techniques.
The objects are then further categorized and assigned meaningful labels.
The process continues by recognizing the relations between objects and producing
the intended interpretation. This process is complex and requires a mix of the
image processing and artificial intelligence disciplines.

References
Aniati Murni [2000], Image Processing, class handouts, Faculty of Computer
Science, University of Indonesia, Jakarta.
Gonzalez, Rafael C., and Richard E. Woods [1992], Digital Image Processing,
Addison-Wesley Publishing Company, Inc., Reading, Massachusetts.
