
CCR: Clustering and Collaborative

Representation
for Fast Single Image Super-Resolution

ABSTRACT

Clustering and collaborative representation (CCR) have recently been used in fast single image super-resolution (SR). In this paper, we propose an effective and fast single image SR
algorithm by combining clustering and collaborative representation. In particular, we first cluster the
algorithm by combining clustering and collaborative representation. In particular, we first cluster the
feature space of low-resolution (LR) images into multiple LR feature subspaces and group the
corresponding high-resolution (HR) feature subspaces. The local geometry property learned from the
clustering process is used to collect numerous neighbour LR and HR feature subsets from the whole
feature spaces for each cluster center. Multiple projection matrices are then computed via collaborative
representation to map LR feature subspaces to HR subspaces. For an arbitrary input LR feature, the desired
HR output can be estimated according to the projection matrix, whose corresponding LR cluster center is
nearest to the input. Moreover, by learning statistical priors from the clustering process, our clustering-
based SR algorithm would further decrease the computational time in the reconstruction phase. Extensive
experimental results on commonly used datasets indicate that
our proposed SR algorithm obtains compelling SR images quantitatively and qualitatively against many
state-of-the-art methods.
This letter addresses the problem of generating a super-resolution (SR) image from a single low-resolution
(LR) input image in the wavelet domain. To achieve a sharper image, an intermediate stage for estimating
the high-frequency (HF) sub-bands has been proposed. This stage includes an edge preservation procedure
and mutual interpolation between the input LR image and the HF sub-band images, as performed via the
discrete wavelet transform (DWT). Sparse mixing weights are calculated over blocks of coefficients in an
image, which provides a sparse signal representation in the LR image. All of the sub-band images are used
to generate the new high-resolution image using the inverse DWT. Experimental results indicated that the
proposed approach outperforms existing methods in terms of objective criteria and subjective perception, improving the image resolution.

CHAPTER-1

I. INTRODUCTION

The images and video sequences registered by radar, optical, medical, and other sensors, and presented on high-definition television, in electron microscopy, etc., are obtained from electronic devices that use a variety of sensors. Therefore, a preprocessing technique that permits enhancement of the image resolution should be used.

This step can be performed by estimating a high-resolution (HR) image x(m, n) from measurements of a

low-resolution (LR) image y(m, n) that were obtained through a linear operator V that forms a degraded

version of the unknown HR image, which was additionally contaminated by an additive noise w(m, n), i.e.,

y(m, n) = V [x(m, n)] + w(m, n).

In most applications, V is a subsampling operator that should be inverted to restore the original image size, and this problem usually should be treated as an ill-posed problem.
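
As a hedged illustration of this observation model, the following MATLAB sketch simulates y(m, n) from an HR test image; the downsampling factor, the bicubic kernel used for V, and the noise level are assumptions made for the example, not values taken from the letter.

% Minimal sketch of the observation model y(m,n) = V[x(m,n)] + w(m,n).
% Requires the Image Processing Toolbox for imread/imresize.
x = im2double(imread('cameraman.tif'));    % stands in for the unknown HR image x(m,n)
V = @(img) imresize(img, 1/4, 'bicubic');  % subsampling operator V (factor 4 assumed)
sigma = 0.02;                              % assumed standard deviation of the noise w(m,n)
y = V(x) + sigma * randn(size(V(x)));      % degraded LR observation y(m,n)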

In remote sensing monitoring and navigation missions with small airborne or unmanned flying vehicle platforms, LR sensors with simple and cheap hardware, such as unfocused fractional SAR systems, optical cameras, etc., are commonly employed. However, such cheap sensors or the fractal synthesis mode inevitably sacrifice spatial resolution.

The system could also suffer from uncertainties that are attributed to random signal perturbations, imperfect system calibration, etc. Therefore, SR algorithms, as cost-effective solutions, have an important application in the processing of satellite or aerial images obtained by radar or optical sensors.

The wavelet technique as a simple sparse representation also plays a significant role in many image

processing applications, in particular in resolution enhancement, and recently, many novel algorithms have

been proposed.

Prior information on the image sparsity has been widely used for image interpolation. The principal idea behind sparse SR algorithms is that the HR results can be improved by using more prior information on the image properties.

The predominant challenge of this study is to employ an approach that is similar to the approach of these

wavelet-based algorithms, accounting for both spatial and spectral wavelet pixel information to enhance

the resolution of a single image.

The principal difference of the novel SR approach in comparison with existing methods consists in the

mutual interpolation via Lanczos and nearest neighbour interpolation (NNI) techniques for wavelet

transform (WT) high-frequency (HF) subband images and edge extracting images via discrete wavelet

transform (DWT); additionally, an adaptive

directional LR image interpolation is computed by estimating sparse image mixture models in a DWT

image. To obtain robustness of the SR process in the presence of noise, the novel framework uses special denoising filtering, employing the nonlocal means (NLM) technique for the input LR image.

Finally, all of the sub-band images are combined, reconstructing via the inverse DWT (IDWT) the output HR image. The result appears to demonstrate the superiority of the designed algorithm in terms of objective criteria and subjective perception (via the human visual system), in comparison with the best existing techniques.

To justify that the novel algorithm called super resolution using wavelet domain interpolation with

edge extraction and sparse representation (SR-WDIEE-SR) has real advantages, we have compared the

proposed SR procedure with other similar techniques, such as the following: Demirel-Anbarjafari Super Resolution, Wavelet Domain Image Resolution Enhancement Using Cycle-Spinning, Image Resolution Enhancement by Using Discrete and Stationary Wavelet Decomposition, Discrete Wavelet Transform-Based Satellite Image Resolution Enhancement, and Dual-Tree Complex Wavelet Transform.

To ascertain the effectiveness of the proposed algorithm over other wavelet domain resolution-

enhancement techniques, numerous LR images of different nature (video and radar) obtained from the

Web pages were tested. The first database consists of 38 images, and the second database contains about

20 radar images.

The remainder of this letter is organized as follows. Section II presents a short introduction to the

NLM filtering method and to implementation of the interpolation through the inverse mixing estimator for

a single image in WT space. The proposed technique is presented in Section III. Section IV discusses the

qualitative and quantitative results of the proposed algorithm in comparison with the best conventional techniques. Finally, the conclusions are drawn in the final section.


II. PRELIMINARIES

B) Interpolations with sparse wavelet mixtures


The subsampled image x̂(m, n) is decomposed with a one-level DWT into the LL, LH, HL, and HH image subbands, which are treated as matrices Ψ whose columns (approximations and details) are the vectors of a single wavelet image scale {ψd,n}, 0 ≤ d ≤ 3, n ∈ G. The decomposition process is performed with a dual frame matrix Ψ̃ whose columns are the dual wavelet frames {ψ̃d,n}, 0 ≤ d ≤ 3, n ∈ G [15]; the wavelet coefficients z(d, n) are obtained by projecting x̂(m, n) onto this frame.

The WT separates a low-frequency (LF) image zl (an approximation) that is projected over the LF scaling filters {ψ0,n}, n ∈ G, and an HF image zh (details) that is projected over the finest-scale HF wavelets in three directions {ψd,n}, 1 ≤ d ≤ 3, n ∈ G, i.e.,

z_l = \sum_{n \in G} z(0,n)\,\tilde{\psi}_{0,n}, \qquad z_h = \sum_{d=1}^{3} \sum_{n \in G} z(d,n)\,\tilde{\psi}_{d,n}.    (5)

zl has little aliasing and can thus be interpolated with a Lanczos interpolator V+. zh is interpolated by selecting directional interpolators Vθ+ for θ ∈ Θ, where Θ is a set of angles uniformly discretized between 0 and π.

For each angle θ, a directional interpolator Vθ+ is applied to each block B = Bθ,q, which is interpolated with the directional interpolator VB+ = Vθ+. The HF image zh and the LF image zl are interpolated with a separable Lanczos interpolator V+. The resulting interpolator can be written in the following form [15]:

X_{LL}(m,n) = V^+ z_l(m,n) + \sum_{\theta \in \Theta} \sum_{q} \bigl(V_\theta^+ - V^+\bigr)\bigl[\tilde{a}(B_{\theta,q})\, \mathbf{1}_{B_{\theta,q}}\, z_h\bigr](m,n).    (6)

For each angle θ, an update is computed over the wavelet coefficients of each block of direction θ, multiplied by their mixing weight ã(Bθ,q), with the difference between the separable Lanczos interpolator V+ and the directional interpolator Vθ+ along θ. 1B is the indicator of the approximation set B. This overall interpolator is calculated with 20 angles, with blocks having a width of 2 pixels and a length between 6 and 12 pixels depending on their orientation.
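
A simplified MATLAB sketch of the decomposition and interpolation in (5) and (6) is given below. It performs the one-level DWT and the separable Lanczos interpolation V+, but deliberately omits the directional interpolators Vθ+ and the sparse mixing weights ã(Bθ,q), so it only approximates the estimator (6); dwt2 requires the Wavelet Toolbox, and the bior1.3 wavelet is chosen only because it appears later in the letter.

% One-level DWT splits the image into the LF approximation zl (LL band) and the
% HF details zh (LH, HL, HH bands); zl is then upsampled with a separable
% Lanczos interpolator V+. Directional interpolation and mixing are omitted.
z = im2double(imread('cameraman.tif'));
[LL, LH, HL, HH] = dwt2(z, 'bior1.3');   % zl ~ LL; zh ~ {LH, HL, HH}
zl_up = imresize(LL, 2, 'lanczos3');     % separable Lanczos interpolation V+ of zl
zh    = LH + HL + HH;                    % crude HF image (directional handling omitted)
zh_up = imresize(zh, 2, 'lanczos3');     % isotropic stand-in for the directional step of (6)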

III. PROPOSED SR-WDIEE-SR TECHNIQUE


In this letter, a one-level DWT using different wavelet families is applied to decompose the input image. The DWT separates an image into different subband images, namely, LL, LH, HL, and HH, where the last three subbands contain the HF components of the image. The interpolation process should be applied to all subband images. To suppress the noise influence, the novel framework applies a denoising procedure by using the NLM technique for the input LR image (see step 1 in Fig. 1).

In the proposed SR procedure, the LR image is used as the input data in the sparse representation for the resolution-enhancement process in the following way (see step 2a in Fig. 1).

The LR image is interpolated by using a 1-D interpolation in a given direction θ, followed by the computation of the new samples along the oversampled rows, columns, or diagonals. Finally, in this step, the algorithm computes the missing samples along the direction θ from the previously calculated new samples, where the entire sparse process is performed with the Lanczos interpolation (factor α = 2), reconstructing the LL subband.
The differences between the interpolated (factor α = 2) LL subband image and the LR input image lie in their HF components, which is why an intermediate process that corrects the estimated HF components by applying this difference image has been proposed.

Fig. 1. Block diagram of the proposed resolution-enhancement technique.

Fig. 2. Visual perception results for the Aerial-A image contaminated by Gaussian additive noise (PSNR = 17 dB).

As shown in step 2b of the algorithm (see Fig. 1), this
difference is performed in HF subbands by interpolating each band via the NNI
process (changing the values of pixels in agreement with the closest neighbor
value), including additional HF features into the HF images.

To preserve more edge information (to obtain a sharper enhanced image), we have proposed an edge-extraction step using the HF subband images HH, HL, and LH; next, the edge information is incorporated into the HF subbands employing the NNI process (see step 2c in Fig. 1). The extracted edges are calculated as follows [16]:

E = \sqrt{HH^2 + HL^2 + LH^2}.    (7)

In the concluding stage, we perform an additional interpolation with the Lanczos interpolation (factor α = 2) to reach the required size for the IDWT process (see step 3 in Fig. 1).

It is noticed that the intermediate process of adding the difference image and the
edge extraction stage, both of which contain the additional HF features,
generate a significantly sharper reconstructed SR image. This sharpness is
boosted by the fact that the interpolation of the isolated HF components in HH,
HL, and LH appears to preserve more HF components than interpolating from
the LR image directly.
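
A condensed MATLAB sketch of the flow just described is given below. It is a simplification under stated assumptions rather than the authors' implementation: the sparse directional mixing of the LL band is replaced by plain Lanczos interpolation, imnlmfilt (Image Processing Toolbox, R2017a or later) stands in for the NLM denoising of step 1, and the contributions of the difference image and the edge map are not weighted or tuned.

% Steps: NLM denoising (1), one-level DWT, Lanczos interpolation of LL and NNI
% interpolation of the HF subbands plus the difference image (2a-2b), the edge
% map of Eq. (7) (2c), and IDWT reconstruction (3).
lr = im2double(imread('cameraman.tif'));          % LR input (example image)
lr = imnlmfilt(lr);                               % step 1: NLM denoising (assumed filter)

[LL, LH, HL, HH] = dwt2(lr, 'bior1.3');           % one-level DWT, Bior1.3 as in the letter
D   = lr - imresize(LL, size(lr), 'lanczos3');    % difference image: HF lost in the LL band
sz  = 2 * size(LL);                               % subband size after interpolation (alpha = 2)
LLi = imresize(LL, sz, 'lanczos3');               % step 2a: Lanczos interpolation of LL
LHi = imresize(LH, sz, 'nearest') + imresize(D, sz, 'nearest');   % step 2b: NNI + difference
HLi = imresize(HL, sz, 'nearest') + imresize(D, sz, 'nearest');
HHi = imresize(HH, sz, 'nearest') + imresize(D, sz, 'nearest');

E   = imresize(sqrt(HH.^2 + HL.^2 + LH.^2), sz, 'nearest');       % step 2c: edge map, Eq. (7)
hr  = idwt2(LLi, LHi + E, HLi + E, HHi + E, 'bior1.3');           % step 3: IDWT -> SR estimate
imshow(hr, [])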

IV. SIMULATION RESULTS AND DISCUSSION


This section reports the results of the statistical simulations and the performance evaluation conducted via objective metrics (Peak Signal-to-Noise Ratio, Mean Absolute Error, and Structural Similarity Index Measure) [17]. In addition, a subjective visual comparison of the SR images produced by the different algorithms was employed, which made it possible to evaluate the performance of the analyzed techniques in a different manner.
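
A hedged sketch of how these objective criteria can be computed in MATLAB is shown below; psnr and ssim are Image Processing Toolbox functions (R2014a or later), MAE is computed directly, and the degraded test image only stands in for an actual SR result.

ref = im2double(imread('cameraman.tif'));            % reference HR image (example)
sr  = imresize(imresize(ref, 0.25), size(ref));      % stand-in for an SR result
fprintf('PSNR = %.2f dB\n', psnr(sr, ref));          % Peak Signal-to-Noise Ratio
fprintf('MAE  = %.4f\n', mean(abs(sr(:) - ref(:)))); % Mean Absolute Error
fprintf('SSIM = %.3f\n', ssim(sr, ref));             % Structural Similarity Index Measure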

Numerous aerial optical and radar satellite images from [13] and [14], particularly the Aerial-A and SAR-B images of different nature and physical characteristics, were studied by applying the designed and the best existing SR procedures. In simulations, the pixels of the LR image have been obtained by downsampling the original HR image by a factor of 4 in each axis.

Fig. 3. Visual perception results for the SAR-B image contaminated by Gaussian additive
noise (PSNR = 17 dB).

TABLE I
OBJECTIVE CRITERIA RESULTS (Aerial-A Image) OF THE RESOLUTION
ENHANCEMENT FROM 128 × 128 TO 512 × 512

TABLE II
OBJECTIVE CRITERIA RESULTS (SAR-B Image) OF THE RESOLUTION
ENHANCEMENT FROM 128 × 128 TO 512 × 512

In the denoising stage, the NLM filter from (2) was applied; the neighborhood Q was set to 5 × 5 pixels, and the parameter δ = 2 was chosen. In this letter, the following families of classic wavelet functions are used: Daubechies (Db1) and Biorthogonal (Bior1.3).
For the Aerial-A image (see Fig. 2), it is easy to see better performance, in accordance with the objective criteria and via subjective visual perception, when the proposed SR-WDIEE-SR algorithm is employed with the wavelet Bior1.3, demonstrating better preservation of the fine features in the zoomed part of the image. It has better sharpness and less smoothing at the edges, preventing pixel blocking (known as jaggies), blurred details, and ringing artifacts around edges.

Referring to the SR image SAR-B, one can see in Fig. 3 that the novel SR algorithm appears to perform better in terms of objective criteria (PSNR and SSIM), as well as in subjective visual perception, particularly when using the wavelet Bior1.3. This can be seen particularly in the well-defined borders, where the designed framework restores rather regular geometrical structures and the fine details appear to be preserved better.
The presented analysis of many simulation results obtained in SR for images of different nature (optical and radar) using state-of-the-art techniques has shown that the novel SR-WDIEE-SR framework outperforms the competing methods, presenting better performance. Given that the textures and chromaticity properties of these images are different, the performance results confirm the robustness of the current proposal.

Because noise presence in the LR image is natural in practice, we compare two configurations of the novel framework: one in which the NLM filter is used and one in which, as in the algorithms [8]–[12] (except [12]), this filter is not applied to the corrupted image.

In Tables I and II, one can observe the superiority of the proposed SR-WDIEE-SR framework, which shows better performance in the objective criteria (PSNR, MAE, and SSIM).

We performed a comparative evaluation of the average objective criteria values (PSNR and SSIM) over all images from the mentioned databases (Aerial and SAR) for the proposed framework and the competitor techniques. The results obtained are as follows: PSNR = 31.12 dB and SSIM = 0.672 in the case of our proposal, and PSNR = 28.62 dB and SSIM = 0.614 for the best counterpart technique, DASR [8].
This last comparison of criteria values over the image databases, the subjective visual perception (see Figs. 2 and 3), and the results presented in Tables I and II have also confirmed the robustness of the current proposal.

MATLAB
INTRODUCTION TO MATLAB

What Is MATLAB?

MATLAB® is a high-performance language for technical computing. It


integrates computation, visualization, and programming in an easy-to-use
environment where problems and solutions are expressed in familiar
mathematical notation. Typical uses include

Math and computation

Algorithm development

Data acquisition

Modeling, simulation, and prototyping

Data analysis, exploration, and visualization

Scientific and engineering graphics

Application development, including graphical user interface building.


MATLAB is an interactive system whose basic data element is an array that does not require dimensioning.

This allows you to solve many technical computing problems, especially those with matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar noninteractive language such as C or FORTRAN.
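
For instance, a small matrix-vector formulation that would require explicit loops in a scalar language reduces to a single vectorized statement in MATLAB (a generic example, not tied to this project):

A = [2 1 0; 1 3 1; 0 1 2];   % coefficient matrix
b = [1; 2; 3];               % right-hand side vector
x = A \ b;                   % the backslash operator solves A*x = b directly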

The name MATLAB stands for matrix laboratory. MATLAB was


originally written to provide easy access to matrix software developed by the
LINPACK and EISPACK projects. Today, MATLAB engines incorporate the
LAPACK and BLAS libraries, embedding the state of the art in software for
matrix computation.

MATLAB has evolved over a period of years with input from many users.
In university environments, it is the standard instructional tool for introductory
and advanced courses in mathematics, engineering, and science. In industry,
MATLAB is the tool of choice for high-productivity research, development,
and analysis.

MATLAB features a family of add-on application-specific solutions


called toolboxes. Very important to most users of MATLAB, toolboxes allow
you to learn and apply specialized technology. Toolboxes are comprehensive
collections of MATLAB functions (M-files) that extend the MATLAB
environment to solve particular classes of problems. Areas in which toolboxes
are available include signal processing, control systems, neural networks, fuzzy
logic, wavelets, simulation, and many others.
The MATLAB System:

The MATLAB system consists of five main parts:

Development Environment:

This is the set of tools and facilities that help you use MATLAB
functions and files. Many of these tools are graphical user interfaces. It includes
the MATLAB desktop and Command Window, a command history, an editor
and debugger, and browsers for viewing help, the workspace, files, and the
search path.

The MATLAB Mathematical Function:

This is a vast collection of computational algorithms ranging from elementary functions like sum, sine, cosine, and complex arithmetic, to more sophisticated functions like matrix inverse, matrix eigenvalues, Bessel functions, and fast Fourier transforms.
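
A few of these elementary and more sophisticated functions are illustrated below on a small example matrix (generic MATLAB calls shown only for orientation):

A    = [4 1; 2 3];
s    = sum(A(:));        % elementary: sum of all elements
y    = sin(pi/6);        % elementary: trigonometric function
Ainv = inv(A);           % matrix inverse
lam  = eig(A);           % matrix eigenvalues
bj   = besselj(0, 2.5);  % Bessel function of the first kind, order 0
F    = fft([1 0 0 0]);   % fast Fourier transform of a short sequence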

The MATLAB Language:

This is a high-level matrix/array language with control flow statements,


functions, data structures, input/output, and object-oriented programming
features. It allows both "programming in the small" to rapidly create quick and
dirty throw-away programs, and "programming in the large" to create complete
large and complex application programs.
Graphics:

MATLAB has extensive facilities for displaying vectors and matrices as


graphs, as well as annotating and printing these graphs. It includes high-level
functions for two-dimensional and three-dimensional data visualization, image
processing, animation, and presentation graphics. It also includes low-level
functions that allow you to fully customize the appearance of graphics as well
as to build complete graphical user interfaces on your MATLAB applications.

The MATLAB Application Program Interface (API):

This is a library that allows you to write C and Fortran programs that interact with MATLAB. It includes facilities for calling routines from MATLAB (dynamic linking), calling MATLAB as a computational engine, and reading and writing MAT-files.

MATLAB WORKING ENVIRONMENT:

MATLAB DESKTOP:-

The MATLAB desktop is the main MATLAB application window. The desktop contains five subwindows: the Command Window, the Workspace Browser, the Current Directory window, the Command History window, and one or more figure windows, which are shown only when the user displays a graphic.
The command window is where the user types MATLAB commands and
expressions at the prompt (>>) and where the output of those commands is
displayed.

MATLAB defines the workspace as the set of variables that the user creates in a work session. The Workspace Browser shows these variables and some information about them. Double-clicking on a variable in the Workspace Browser launches the Array Editor, which can be used to obtain information and, in some instances, edit certain properties of the variable.

The Current Directory tab above the workspace tab shows the contents of the current directory, whose path is shown in the Current Directory window. For example, in the Windows operating system the path might be as follows: C:\MATLAB\Work, indicating that the directory "Work" is a subdirectory of the main directory "MATLAB", which is installed in drive C. Clicking on the arrow in the Current Directory window shows a list of recently used paths. Clicking on the button to the right of the window allows the user to change the current directory.

MATLAB uses a search path to find M-files and other MATLAB-related files, which are organized in directories in the computer file system. Any file run in MATLAB must reside in the current directory or in a directory that is on the search path. By default, the files supplied with MATLAB and MathWorks toolboxes are included in the search path.

The easiest way to see which directories are on the search path, or to add or modify the search path, is to select Set Path from the File menu on the desktop, and then use the Set Path dialog box. It is good practice to add any commonly used directories to the search path to avoid repeatedly having to change the current directory.
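
The same search-path management can also be done from the command line; the folder name below is hypothetical:

addpath('C:\MATLAB\Work\myproject');   % add a commonly used directory to the search path
path                                   % display the current search path
% savepath                             % uncomment to make the change permanent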

The Command History window contains a record of the commands a user has entered in the Command Window, including both current and previous MATLAB sessions. Previously entered MATLAB commands can be selected and re-executed from the Command History window by right-clicking on a command or sequence of commands. This action launches a menu from which to select various options in addition to executing the commands. This is a useful feature when experimenting with various commands in a work session.

Using the MATLAB Editor to create M-Files:

The MATLAB editor is both a text editor specialized for creating M-files
and a graphical MATLAB debugger. The editor can appear in a window by
itself, or it can be a sub window in the desktop. M-files are denoted by the
extension .m, as in pixelup.m.

The MATLAB editor window has numerous pull-down menus for tasks such as saving, viewing, and debugging files. Because it performs some simple checks and also uses color to differentiate between various elements of code, this text editor is recommended as the tool of choice for writing and editing M-functions. Typing edit filename at the prompt opens the M-file filename.m in an editor window, ready for editing. As noted earlier, the file must be in the current directory or in a directory on the search path.

Getting Help:

The principal way to get help online is to use the MATLAB help browser,
opened as a separate window either by clicking on the question mark symbol (?)
on the desktop toolbar, or by typing help browser at the prompt in the command
window. The help Browser is a web browser integrated into the MATLAB
desktop that displays a Hypertext Markup Language(HTML) documents. The
Help Browser consists of two panes, the help navigator pane, used to find
information, and the display pane, used to view the information. Self-
explanatory tabs other than navigator pane are used to perform a search.
CHAPTER-4

DIGITAL IMAGE PROCESSING



Background:

Digital image processing is an area characterized by the need for extensive


experimental work to establish the viability of proposed solutions to a given
problem. An important characteristic underlying the design of image
processing systems is the significant level of testing & experimentation that
normally is required before arriving at an acceptable solution. This
characteristic implies that the ability to formulate approaches & quickly
prototype candidate solutions generally plays a major role in reducing the cost
& time required to arrive at a viable system implementation.

What is DIP
An image may be defined as a two-dimensional function f(x, y), where x
& y are spatial coordinates, & the amplitude of f at any pair of coordinates (x,
y) is called the intensity or gray level of the image at that point. When x, y &
the amplitude values of f are all finite discrete quantities, we call the image a
digital image. The field of DIP refers to processing digital images by means of a
digital computer. A digital image is composed of a finite number of elements,
each of which has a particular location & value. The elements are called pixels.

Vision is the most advanced of our senses, so it is not surprising that images play the single most important role in human perception. However, unlike humans, who are limited to the visual band of the EM spectrum, imaging machines cover almost the entire EM spectrum, ranging from gamma to radio waves.

They can also operate on images generated by sources that humans are not accustomed to associating with images.

There is no general agreement among authors regarding where image processing stops & other related areas such as image analysis & computer vision start. Sometimes a distinction is made by defining image processing as a discipline in which both the input & output of a process are images. This is a limiting & somewhat artificial boundary. The area of image analysis (image understanding) is in between image processing & computer vision.

There are no clear-cut boundaries in the continuum from image processing


at one end to complete vision at the other. However, one useful paradigm is to
consider three types of computerized processes in this continuum: low-, mid-, &
high-level processes. Low-level processes involve primitive operations such as image preprocessing to reduce noise, contrast enhancement, & image sharpening. A low-level process is characterized by the fact that both its inputs & outputs are images.

Mid-level processes on images involve tasks such as segmentation, description of objects to reduce them to a form suitable for computer processing, & classification of individual objects. A mid-level process is characterized by the fact that its inputs generally are images but its outputs are attributes extracted from those images.
Finally, higher-level processing involves "making sense" of an ensemble of recognized objects, as in image analysis, & at the far end of the continuum, performing the cognitive functions normally associated with human vision.

Digital image processing, as already defined, is used successfully in a broad range of areas of exceptional social & economic value.

What is an image?

An image is represented as a two dimensional function f(x, y) where x


and y are spatial co-ordinates and the amplitude of ‘f’ at any pair of coordinates
(x, y) is called the intensity of the image at that point.

Gray scale image:

A grayscale image is a function I(x, y) of the two spatial coordinates of the image plane.

I(x, y) is the intensity of the image at the point (x, y) on the image plane.

I(x, y) takes non-negative values; assuming the image is bounded by a rectangle [0, a] × [0, b], we have I: [0, a] × [0, b] → [0, ∞).

Color image:

It can be represented by three functions, R(x, y) for red, G(x, y) for green, and B(x, y) for blue.
An image may be continuous with respect to the x and y coordinates and
also in amplitude. Converting such an image to digital form requires that the
coordinates as well as the amplitude to be digitized. Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called quantization.

Coordinate convention:

The result of sampling and quantization is a matrix of real numbers. We


use two principal ways to represent digital images. Assume that an image f(x, y)
is sampled so that the resulting image has M rows and N columns. We say that
the image is of size M X N. The values of the coordinates (x, y) are discrete
quantities. For notational clarity and convenience, we use integer values for
these discrete coordinates.

In many image processing books, the image origin is defined to be at (x, y) = (0, 0). The next coordinate values along the first row of the image are (x, y) = (0, 1). It is important to keep in mind that the notation (0, 1) is used to signify the second sample along the first row. It does not mean that these are the actual values of the physical coordinates when the image was sampled. The following figure shows the coordinate convention. Note that x ranges from 0 to M-1 and y from 0 to N-1 in integer increments.

The coordinate convention used in the toolbox to denote arrays is different from that of the preceding paragraph in two minor ways. First, instead of using (x, y), the toolbox uses the notation (r, c) to indicate rows and columns. Note, however, that the order of coordinates is the same as the order discussed in the previous paragraph, in the sense that the first element of a coordinate tuple, (a, b), refers to a row and the second to a column.

The other difference is that the origin of the coordinate system is at (r, c) = (1, 1); thus, r ranges from 1 to M and c from 1 to N in integer increments. The IPT documentation refers to these as pixel coordinates. Less frequently, the toolbox also employs another coordinate convention, called spatial coordinates, which uses x to refer to columns and y to refer to rows. This is the opposite of our use of the variables x and y.

Image as Matrices:

The preceding discussion leads to the following representation for a


digitized image function:

f(x, y) = [ f(0,0)      f(0,1)      ...  f(0,N-1)
            f(1,0)      f(1,1)      ...  f(1,N-1)
            ...         ...              ...
            f(M-1,0)    f(M-1,1)    ...  f(M-1,N-1) ]

The right side of this equation is a digital image by definition. Each


element of this array is called an image element, picture element, pixel or pel.
The terms image and pixel are used throughout the rest of our discussions to
denote a digital image and its elements.

A digital image can be represented naturally as a MATLAB matrix:

f = [ f(1,1)   f(1,2)   ...  f(1,N)
      f(2,1)   f(2,2)   ...  f(2,N)
      ...      ...           ...
      f(M,1)   f(M,2)   ...  f(M,N) ]

where f(1,1) = f(0,0) (note the use of a monospace font to denote MATLAB quantities). Clearly, the two representations are identical except for the shift in origin. The notation f(p, q) denotes the element located in row p and column q. For example, f(6,2) is the element in the sixth row and second column of the matrix f. Typically, we use the letters M and N, respectively, to denote the number of rows and columns in a matrix. A 1xN matrix is called a row vector, whereas an Mx1 matrix is called a column vector. A 1x1 matrix is a scalar.
Matrices in MATLAB are stored in variables with names such as A, a, RGB, real_array, and so on. Variables must begin with a letter and contain only letters, numerals, and underscores.

As noted in the previous paragraph, all MATLAB quantities are written using monospace characters. We use conventional Roman italic notation, such as f(x, y), for mathematical expressions.
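
A small example of this matrix view of an image, using a generic test matrix rather than project data:

f = magic(4);        % a 4x4 matrix standing in for a tiny image
p = f(3, 2);         % the pixel in row 3, column 2 (the (r, c) convention)
[M, N] = size(f);    % number of rows M and columns N
rowvec = f(2, :);    % a 1xN row vector (one image row)
colvec = f(:, 4);    % an Mx1 column vector (one image column)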

Reading Images:

Images are read into the MATLAB environment using the function imread, whose syntax is

imread('filename')

Format name   Description                        Recognized extensions

TIFF          Tagged Image File Format           .tif, .tiff
JPEG          Joint Photographic Experts Group   .jpg, .jpeg
GIF           Graphics Interchange Format        .gif
BMP           Windows Bitmap                     .bmp
PNG           Portable Network Graphics          .png
XWD           X Window Dump                      .xwd

Here, filename is a string containing the complete name of the image file (including any applicable extension). For example, the command line

>> f = imread('8.jpg');

reads the JPEG image (see the table above) into image array f. Note the use of single quotes (') to delimit the string filename. The semicolon at the end of a command line is used by MATLAB for suppressing output. If a semicolon is not included, MATLAB displays the results of the operation(s) specified in that line. The prompt symbol (>>) designates the beginning of a command line, as it appears in the MATLAB Command Window.
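
A short, self-contained example of reading and displaying an image; 'pout.tif' is a sample image shipped with the Image Processing Toolbox and is used here only as a convenient test file.

f = imread('pout.tif');   % read the image into array f
[M, N] = size(f)          % rows and columns (no semicolon, so the result is displayed)
imshow(f)                 % display the image in a figure window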

Data Classes:

Although we work with integer coordinates, the values of the pixels themselves are not restricted to be integers in MATLAB. The table above lists the various data classes supported by MATLAB and the IPT for representing pixel values. The first eight entries in the table are referred to as numeric data classes. The ninth entry is the char class and, as shown, the last entry is referred to as the logical data class.

All numeric computations in MATLAB are done in double quantities, so double is also a frequent data class encountered in image processing applications.

Class uint8 also is encountered frequently, especially when reading data from storage devices, as 8-bit images are the most common representations found in practice. These two data classes, class logical, and, to a lesser degree, class uint16 constitute the primary data classes on which we focus.

Many IPT functions, however, support all the data classes listed in the table. Data class double requires 8 bytes to represent a number, uint8 and int8 require 1 byte each, uint16 and int16 require 2 bytes each, and uint32, int32, and single require 4 bytes each.
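
A short example of these data classes and of conversions between them (the conversion functions im2double and im2uint16 are assumed to be available from the IPT):

f8  = imread('pout.tif');   % uint8 image: 1 byte per pixel
fd  = im2double(f8);        % double image scaled to [0, 1]: 8 bytes per element
f16 = im2uint16(f8);        % uint16 image: 2 bytes per element
bw  = fd > 0.5;             % logical image: 1 byte per element
whos f8 fd f16 bw           % report the class and memory use of each variable
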
CHAPTER 1

INTRODUCTION TO IMAGE PROCESSING

INTRODUCTION

1.1 IMAGE:

An image is a two-dimensional picture that has a similar appearance to some subject, usually a physical object or a person.

An image may be two-dimensional, such as a photograph or screen display, or three-dimensional, such as a statue. Images may be captured by optical devices (such as cameras, mirrors, lenses, telescopes, and microscopes) or by natural objects and phenomena, such as the human eye or water surfaces.
The word image is also used in the broader sense of any two-dimensional figure such as a map, a
graph, a pie chart, or an abstract painting. In this wider sense, images can also be rendered manually, such
as by drawing, painting, carving, rendered automatically by printing or computer graphics technology, or
developed by a combination of methods, especially in a pseudo-photograph.
Fig 1 General image

An image is a rectangular grid of pixels. It has a definite height and a definite width counted in
pixels. Each pixel is square and has a fixed size on a given display. However, different computer monitors may use different-sized pixels. The pixels that constitute an image are ordered as a grid (columns and rows); each pixel consists of numbers representing magnitudes of brightness and color.

Fig 1.1 Image pixel

Each pixel has a color. The color is a 32-bit integer. The first eight bits determine the redness of the
pixel, the next eight bits the greenness, the next eight bits the blueness, and the remaining eight bits the
transparency of the pixel.
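
As a hedged illustration, the four 8-bit channels of such a 32-bit pixel can be unpacked with bit operations; the example value and the byte order (red in the most significant byte, following the description above) are assumptions made for the example.

p = uint32(hex2dec('80FF4020'));           % example packed 32-bit pixel value
r = bitand(bitshift(p, -24), uint32(255)); % first eight bits: redness
g = bitand(bitshift(p, -16), uint32(255)); % next eight bits: greenness
b = bitand(bitshift(p,  -8), uint32(255)); % next eight bits: blueness
a = bitand(p, uint32(255));                % remaining eight bits: transparency
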
Fig1.2 Transparency image

1.2 IMAGE FILE SIZES:

Image file size is expressed as the number of bytes and increases with the number of pixels composing an image and with the color depth of the pixels. The greater the number of rows and columns, the greater the image resolution, and the larger the file. Also, each pixel of an image increases in size when its color depth increases: an 8-bit pixel (1 byte) stores 256 colors, while a 24-bit pixel (3 bytes) stores 16 million colors, the latter known as true color.

Image compression uses algorithms to decrease the size of a file. High-resolution cameras produce large image files, ranging from hundreds of kilobytes to megabytes, depending on the camera's resolution and the image-storage format. High-resolution digital cameras record 12 megapixel (1 MP = 1 million pixels) images, or more, in true color. For example, consider an image recorded by a 12 MP camera: since each pixel uses 3 bytes to record true color, the uncompressed image would occupy 36,000,000 bytes of memory, a great amount of digital storage for one image, given that cameras must record and store many images to be practical. Faced with large file sizes, both within the camera and on a storage disc, image file formats were developed to store such large images.
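
The storage figure quoted above can be reproduced with a few lines of arithmetic:

pixels        = 12e6;                         % 12 megapixels
bytesPerPixel = 3;                            % 24-bit true color
uncompressedBytes = pixels * bytesPerPixel    % 36,000,000 bytes
uncompressedMB    = uncompressedBytes / 1e6   % roughly 36 MB per uncompressed image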

1.3 IMAGE FILE FORMATS:

Image file formats are standardized means of organizing and storing images. This entry is about
digital image formats used to store photographic and other images. Image files are composed of either
pixel or vector (geometric) data that are rasterized to pixels when displayed (with few exceptions) in a
vector graphic display. Including proprietary types, there are hundreds of image file types. The PNG,
JPEG, and GIF formats are most often used to display images on the Internet.
Fig1.3 Resolution image

In addition to straight image formats, Metafile formats are portable formats which can include both
raster and vector information. The metafile format is an intermediate format. Most Windows applications
open metafiles and then save them in their own native format.

1.3.1 RASTER FORMATS:

These formats store images as bitmaps (also known as pixmaps)

 JPEG/JFIF:

JPEG (Joint Photographic Experts Group) is a compression method. JPEG compressed images are
usually stored in the JFIF (JPEG File Interchange Format) file format. JPEG compression is lossy
compression. Nearly every digital camera can save images in the JPEG/JFIF format, which supports 8 bits
per color (red, green, blue) for a 24-bit total, producing relatively small files. Photographic images may be
better stored in a lossless non-JPEG format if they will be re-edited, or if small "artifacts" are
unacceptable. The JPEG/JFIF format also is used as the image compression algorithm in many Adobe PDF
files.

 EXIF:

The EXIF (Exchangeable image file format) format is a file standard similar to the JFIF format
with TIFF extensions. It is incorporated in the JPEG writing software used in most cameras. Its purpose is
to record and to standardize the exchange of images with image metadata between digital cameras and
editing and viewing software. The metadata are recorded for individual images and include such things as
camera settings, time and date, shutter speed, exposure, image size, compression, name of camera, color
information, etc. When images are viewed or edited by image editing software, all of this image
information can be displayed.

 TIFF:

The TIFF (Tagged Image File Format) format is a flexible format that normally saves 8 bits or 16 bits per color (red, green, blue) for 24-bit and 48-bit totals, respectively, usually using either the TIFF or TIF filename extension. TIFF supports both lossy and lossless compression; some variants offer relatively good lossless compression for bi-level (black & white) images. Some digital cameras can save in TIFF format, using the LZW compression algorithm for lossless storage. The TIFF image format is not widely supported by web browsers. TIFF remains widely accepted as a photograph file standard in the printing business. TIFF can handle device-specific color spaces, such as the CMYK defined by a particular set of printing press inks.
 PNG:

The PNG (Portable Network Graphics) file format was created as the free, open-source successor
to the GIF. The PNG file format supports true color (16 million colors) while the GIF supports only 256
colors. The PNG file excels when the image has large, uniformly colored areas. The lossless PNG format
is best suited for editing pictures, and the lossy formats, like JPG, are best for the final distribution of
photographic images, because JPG files are smaller than PNG files. PNG is an extensible file format for the lossless, portable, well-compressed storage of raster images. It provides a patent-free replacement for
GIF and can also replace many common uses of TIFF. Indexed-color, grayscale, and true color images are
supported, plus an optional alpha channel. PNG is designed to work well in online viewing applications,
such as the World Wide Web. PNG is robust, providing both full file integrity checking and simple
detection of common transmission errors.

 GIF:

GIF (Graphics Interchange Format) is limited to an 8-bit palette, or 256 colors. This makes the GIF
format suitable for storing graphics with relatively few colors such as simple diagrams, shapes, logos and
cartoon style images. The GIF format supports animation and is still widely used to provide image
animation effects. It also uses a lossless compression that is more effective when large areas have a single
color, and ineffective for detailed images or dithered images.

 BMP:

The BMP file format (Windows bitmap) handles graphics files within the Microsoft Windows OS.
Typically, BMP files are uncompressed, hence they are large. The advantage is their simplicity and wide
acceptance in Windows programs.

1.3.2 VECTOR FORMATS:

As opposed to the raster image formats above (where the data describes the characteristics of each
individual pixel), vector image formats contain a geometric description which can be rendered smoothly at
any desired display size.

At some point, all vector graphics must be rasterized in order to be displayed on digital monitors.
However, vector images can be displayed with analog CRT technology such as that used in some
electronic test equipment, medical monitors, radar displays, laser shows and early video games. Plotters
are printers that use vector data rather than pixel data to draw graphics.

 CGM:
CGM (Computer Graphics Metafile) is a file format for 2D vector graphics, raster graphics, and
text. All graphical elements can be specified in a textual source file that can be compiled into a binary file
or one of two text representations. CGM provides a means of graphics data interchange for computer
representation of 2D graphical information independent from any particular application, system, platform,
or device.

 SVG:

SVG (Scalable Vector Graphics) is an open standard created and developed by the World Wide
Web Consortium to address the need for a versatile, scriptable and all purpose vector format for the web
and otherwise. The SVG format does not have a compression scheme of its own, but due to the textual
nature of XML, an SVG graphic can be compressed using a program such as gzip.

1.4 IMAGE PROCESSING:

Digital image processing, the manipulation of images by computer, is a relatively recent development in terms of man's ancient fascination with visual stimuli. In its short history, it has been applied to practically every type of image with varying degrees of success. The inherent subjective appeal of pictorial displays attracts perhaps a disproportionate amount of attention from scientists and also from laymen. Digital image processing, like other glamour fields, suffers from myths, misconceptions, misunderstandings and misinformation. It is a vast umbrella under which fall diverse aspects of optics, electronics, mathematics, photography, graphics and computer technology. It is a truly multidisciplinary endeavor plagued with imprecise jargon.

Several factors combine to indicate a lively future for digital image processing. A major factor is the declining cost of computer equipment. Several new technological trends promise to further promote digital image processing. These include parallel processing made practical by low-cost microprocessors, and the use of charge-coupled devices (CCDs) for digitizing, storage during processing, and display, as well as large, low-cost image storage arrays.

1.5 FUNDAMENTAL STEPS IN DIGITAL IMAGE PROCESSING:


Fig 1.5
Image fundamental

1.5.1 Image Acquisition:


Image acquisition is the process of acquiring a digital image. To do so requires an image sensor and the capability to digitize the signal produced by the sensor. The sensor could be a monochrome or color TV camera that produces an entire image of the problem domain every 1/30 s. The image sensor could also be a line-scan camera that produces a single image line at a time; in this case, the object's motion past the line scanner produces a two-dimensional image.

Fig 1.5.1 Digital camera image

If the output of the camera or other imaging sensor is not in digital form, an analog-to-digital converter digitizes it. The nature of the sensor and the image it produces are determined by the application.
Fig 1.5.2 digital camera cell

1.5.2 Image Enhancement:

Image enhancement is among the simplest and most appealing areas of digital image processing. Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or simply to highlight certain features of interest in an image. A familiar example of enhancement is when we increase the contrast of an image because "it looks better." It is important to keep in mind that enhancement is a very subjective area of image processing.

Fig 1.5.3 Image enhancement
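
A hedged example of two common enhancement operations, contrast stretching and histogram equalization, applied to a low-contrast sample image that ships with the Image Processing Toolbox:

f  = imread('pout.tif');    % low-contrast test image
g1 = imadjust(f);           % stretch intensities to fill the full display range
g2 = histeq(f);             % histogram equalization
figure
subplot(1,3,1), imshow(f),  title('original')
subplot(1,3,2), imshow(g1), title('imadjust')
subplot(1,3,3), imshow(g2), title('histeq')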

1.5.3 Image restoration:


Image restoration is an area that also deals with improving the appearance of an image. However,
unlike enhancement, which is subjective, image restoration is objective, in the sense that restoration
techniques tend to be based on mathematical or probabilistic models of image degradation.
Fig 1.5.4 Image restoration
Enhancement, on the other hand, is based on human subjective preferences regarding what constitutes a "good" enhancement result. For example, contrast stretching is considered an enhancement technique because it is based primarily on the pleasing aspects it might present to the viewer, whereas removal of image blur by applying a deblurring function is considered a restoration technique.

1.5.4 Color image processing:


The use of color in image processing is motivated by two principal factors. First, color is a
powerful descriptor that often simplifies object identification and extraction from a scene. Second, humans
can discern thousands of color shades and intensities, compared to about only two dozen shades of gray.
This second factor is particularly important in manual image analysis.

Fig 1.5.5 Color & Gray scale image
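
A minimal example of converting a color image to gray scale (peppers.png is a sample image shipped with the Image Processing Toolbox):

rgb  = imread('peppers.png');   % RGB color image
gray = rgb2gray(rgb);           % weighted combination of the R, G and B planes
figure
subplot(1,2,1), imshow(rgb),  title('RGB')
subplot(1,2,2), imshow(gray), title('grayscale')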

1.5.5 Wavelets and multiresolution processing:


Wavelets are the foundation for representing images in various degrees of resolution. Although the Fourier transform has been the mainstay of transform-based image processing since the late 1950s, a more recent transformation, called the wavelet transform, is now making it even easier to compress, transmit, and analyze many images. Unlike the Fourier transform, whose basis functions are sinusoids, wavelet transforms are based on small waves, called wavelets, of varying frequency and limited duration.
Fig 1.5.6 rgb histogram image

Wavelets were first shown to be the foundation of a powerful new approach to signal processing
and analysis called Multiresolution theory. Multiresolution theory incorporates and unifies techniques
from a variety of disciplines, including sub band coding from signal processing, quadrature mirror filtering
from digital speech recognition, and pyramidal image processing.
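
A hedged sketch of a two-level multiresolution decomposition with the Wavelet Toolbox; the wavelet family and the number of levels are arbitrary choices for the example.

x = im2double(imread('cameraman.tif'));
[C, S] = wavedec2(x, 2, 'db1');            % two-level 2-D wavelet decomposition
A2 = appcoef2(C, S, 'db1', 2);             % coarse approximation at level 2
[H1, V1, D1] = detcoef2('all', C, S, 1);   % horizontal, vertical, diagonal details at level 1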

1.5.6 Compression:
Compression, as the name implies, deals with techniques for reducing the storage required to save an image, or the bandwidth required to transmit it. Although storage technology has improved significantly over the past decade, the same cannot be said for transmission capacity. This is particularly true in uses of the Internet, which are characterized by significant pictorial content. Image compression is familiar to most users of computers in the form of image file extensions, such as the jpg file extension used in the JPEG (Joint Photographic Experts Group) image compression standard.
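
A hedged example of lossy JPEG compression from MATLAB; the quality settings and output file names are arbitrary:

f = imread('peppers.png');                     % sample image shipped with the toolbox
imwrite(f, 'peppers_q90.jpg', 'Quality', 90);  % mild compression
imwrite(f, 'peppers_q10.jpg', 'Quality', 10);  % heavy compression, visible artifacts
d = dir('peppers_q*.jpg');
disp([{d.name}; {d.bytes}])                    % compare the resulting file sizes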

1.5.7 Morphological processing:


Morphological processing deals with tools for extracting image components that are useful in the
representation and description of shape. The language of mathematical morphology is set theory. As such,
morphology offers a unified and powerful approach to numerous image processing problems. Sets in
mathematical morphology represent objects in an image. For example, the set of all black pixels in a
binary image is a complete morphological description of the image.
Fig 1.5.7 blur to deblur image

In binary images, the sets in question are members of the 2-D integer space Z2, where each element
of a set is a 2-D vector whose coordinates are the (x,y) coordinates of a black(or white) pixel in the image.
Gray-scale digital images can be represented as sets whose components are in Z3. In this case, two
components of each element of the set refer to the coordinates of a pixel, and the third corresponds to its
discrete gray-level value.
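
A short example of basic morphological operations on a binary test image (circles.png ships with the Image Processing Toolbox; the structuring-element size is arbitrary):

bw  = imread('circles.png');   % binary test image
se  = strel('disk', 5);        % disk-shaped structuring element
bwD = imdilate(bw, se);        % dilation grows the white objects
bwE = imerode(bw, se);         % erosion shrinks them
bwO = imopen(bw, se);          % opening removes small protrusions and noise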

1.5.8 Segmentation:
Segmentation procedures partition an image into its constituent parts or objects. In general,
autonomous segmentation is one of the most difficult tasks in digital image processing. A rugged
segmentation procedure brings the process a long way toward successful solution of imaging problems that
require objects to be identified individually.

Fig 1.5.8 Image segmentation

On the other hand, weak or erratic segmentation algorithms almost always guarantee eventual
failure. In general, the more accurate the segmentation, the more likely recognition is to succeed.
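
A hedged example of one simple segmentation procedure, global thresholding with Otsu's method (imbinarize requires a recent Image Processing Toolbox; im2bw is the older equivalent):

f  = imread('coins.png');             % grayscale test image with distinct objects
t  = graythresh(f);                   % Otsu threshold, normalized to [0, 1]
bw = imbinarize(f, t);                % binary segmentation: objects vs. background
figure, imshowpair(f, bw, 'montage')  % compare the original and the segmentation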

1.5.9 Representation and description:


Representation and description almost always follow the output of a segmentation stage, which
usually is raw pixel data, constituting either the boundary of a region (i.e., the set of pixels separating one
image region from another) or all the points in the region itself. In either case, converting the data to a
form suitable for computer processing is necessary. The first decision that must be made is whether the
data should be represented as a boundary or as a complete region. Boundary representation is appropriate
when the focus is on external shape characteristics, such as corners and inflections.

Regional representation is appropriate when the focus is on internal properties, such as texture or
skeletal shape. In some applications, these representations complement each other. Choosing a
representation is only part of the solution for transforming raw data into a form suitable for subsequent
computer processing. A method must also be specified for describing the data so that features of interest
are highlighted. Description, also called feature selection, deals with extracting attributes that result in
some quantitative information of interest or are basic for differentiating one class of objects from another.

1.5.10 Object recognition:


The last stage involves recognition and interpretation. Recognition is the process that assigns a
label to an object based on the information provided by its descriptors. Interpretation involves assigning
meaning to an ensemble of recognized objects.

1.5.11 Knowledgebase:
Knowledge about a problem domain is coded into an image processing system in the form of a knowledge database. This knowledge may be as simple as detailing regions of an image where the information of interest is known to be located, thus limiting the search that has to be conducted in seeking that information. The knowledge base can also be quite complex, such as an interrelated list of all major possible defects in a materials inspection problem, or an image database containing high-resolution satellite images of a region in connection with change-detection applications. In addition to guiding the operation of each processing module, the knowledge base also controls the interaction between modules. The system must be endowed with the knowledge to recognize the significance of the location of a string with respect to other components of an address field. This knowledge guides not only the operation of each module, but also aids in feedback operations between modules through the knowledge base. We implemented the preprocessing techniques using MATLAB.

1.6 COMPONENTS OF AN IMAGE PROCESSING SYSTEM:


As recently as the mid-1980s, numerous models of image processing systems being sold
throughout the world were rather substantial peripheral devices that attached to equally substantial host
computers. Late in the 1980s and early in the 1990s, the market shifted to image processing hardware in
the form of single boards designed to be compatible with industry standard buses and to fit into
engineering workstation cabinets and personal computers. In addition to lowering costs, this market shift
also served as a catalyst for a significant number of new companies whose specialty is the development of
software written specifically for image processing.

Fig 1.6 Component of image processing

Although large-scale image processing systems still are being sold for massive imaging
applications, such as processing of satellite images, the trend continues toward miniaturizing and blending
of general-purpose small computers with specialized image processing hardware. Figure 1.24 shows the
basic components comprising a typical general-purpose system used for digital image processing. The
function of each component is discussed in the following paragraphs, starting with image sensing.

 Image sensors:
With reference to sensing, two elements are required to acquire digital images. The first is a
physical device that is sensitive to the energy radiated by the object we wish to image. The second, called a
digitizer, is a device for converting the output of the physical sensing device into digital form. For
instance, in a digital video camera, the sensors produce an electrical output proportional to light intensity.
The digitizer converts these outputs to digital data.
 Specialized image processing hardware:
Specialized image processing hardware usually consists of the digitizer just mentioned, plus hardware that performs other primitive operations, such as an arithmetic logic unit (ALU), which performs arithmetic and logical operations in parallel on entire images. One example of how an ALU is used is in averaging images as quickly as they are digitized, for the purpose of noise reduction. This type of hardware sometimes is called a front-end subsystem, and its most distinguishing characteristic is speed. In other words, this unit performs functions that require fast data throughputs (e.g., digitizing and averaging video images at 30 frames/s) that the typical main computer cannot handle.
 Computer:
The computer in an image processing system is a general-purpose computer and can range from a
PC to a supercomputer. In dedicated applications, sometimes specially designed computers are used to
achieve a required level of performance, but our interest here is on general-purpose image processing
systems. In these systems, almost any well-equipped PC-type machine is suitable for offline image
processing tasks.
 Image processing software:
Software for image processing consists of specialized modules that perform specific tasks. A well-
designed package also includes the capability for the user to write code that, as a minimum, utilizes the
specialized modules. More sophisticated software packages allow the integration of those modules and
general-purpose software commands from at least one computer language.
 Mass storage:
Mass storage capability is a must in image processing applications. An image of size 1024*1024 pixels, in which the intensity of each pixel is an 8-bit quantity, requires one megabyte of storage space if the image is not compressed. When dealing with thousands, or even millions, of images, providing adequate storage in an image processing system can be a challenge. Digital storage for image processing applications falls into three principal categories: (1) short-term storage for use during processing, (2) online storage for relatively fast recall, and (3) archival storage, characterized by infrequent access. Storage is measured in bytes (eight bits), Kbytes (one thousand bytes), Mbytes (one million bytes), Gbytes (meaning giga, or one billion, bytes), and Tbytes (meaning tera, or one trillion, bytes).

One method of providing short-term storage is computer memory. Another is by specialized


boards, called frame buffers that store one or more images and can be accessed rapidly, usually at video
rates. The latter method allows virtually instantaneous image zoom, as well as scroll (vertical shifts) and
pan (horizontal shifts). Frame buffers usually are housed in the specialized image processing hardware unit
shown in Fig. 1.24. Online storage generally takes the form of magnetic disks or optical-media storage.
The key factor characterizing on-line storage is frequent access to the stored data. Finally, archival storage
is characterized by massive storage requirements but infrequent need for access. Magnetic tapes and
optical disks housed in “jukeboxes” are the usual media for archival applications.

 Image displays:
Image displays in use today are mainly color (preferably flat screen) TV monitors. Monitors are
driven by the outputs of image and graphics display cards that are an integral part of the computer system.
Seldom are there requirements for image display applications that cannot be met by display cards available
commercially as part of the computer system. In some cases, it is necessary to have stereo displays, and
these are implemented in the form of headgear containing two small displays embedded in goggles worn
by the user.

 Hardcopy:
Hardcopy devices for recording images include laser printers, film cameras, heat-sensitive devices,
inkjet units, and digital units, such as optical and CD-ROM disks. Film provides the highest possible
resolution, but paper is the obvious medium of choice for written material. For presentations, images are
displayed on film transparencies or in a digital medium if image projection equipment is used. The latter
approach is gaining acceptance as the standard for image presentations.

 Network:
Networking is almost a default function in any computer system in use today. Because of the large
amount of data inherent in image processing applications, the key consideration in image transmission is
bandwidth. In dedicated networks, this typically is not a problem, but communications with remote sites
via the Internet are not always as efficient. Fortunately, this situation is improving quickly as a result of
optical fiber and other broadband technologies.

CCM:
In music

 Contemporary Christian music, a genre of popular music which is lyrically focused on matters concerned
with the Christian faith
o CCM Magazine, a magazine that covers Contemporary Christian music
 University of Cincinnati College-Conservatory of Music, the performing arts college of the University of
Cincinnati
 Cincinnati Conservatory of Music, a conservatory formed in 1867 as part of a girls' finishing school which
later became part of the University of Cincinnati College-Conservatory of Music.
 Contemporary Commercial Music
 In the context of MIDI: control change message.

In cryptography

 CCM mode, a mode of operation for cryptographic block ciphers


 Combined Cipher Machine, a common cipher machine system for securing Allied communications during
World War II

In medicine

 Cerebral cavernous malformation, a vascular disorder of the central nervous system that may appear either
sporadically or exhibit autosomal dominant inheritance
 Classical Chinese medicine, a medicine that developed from germ theory
 Comprehensive Care Management, a member of the Beth Abraham Family of Health Services
 Critical Care Medicine, a peer-reviewed medical journal in the field of critical care medicine

In politics

 Chama Cha Mapinduzi, the ruling political party of Tanzania


 Crown Council of Monaco, a seven-member administrative body which meets at least twice annually to
advise the Prince of Monaco on various domestic and international affairs
 Convention on Cluster Munitions, is an international treaty that prohibits the use of cluster bombs, a type of
explosive weapon which scatters submunitions ("bomblets") over an area.
 Coalition for a Conservative Majority, a nonprofit, political advocacy group organized by Tom DeLay and
Kenneth Blackwell

In religion

 Catholic Campus Ministry, a Catholic student organization on many college campuses


 Council of Churches of Malaysia, an ecumenical body in Malaysia comprising mainline Protestant churches
and Oriental Orthodox Church
 Christ Church Manchester, otherwise known as CCM is a church in Manchester that meets in numerous
locations
 Christian Compassion Ministries, a mission organisation in the Philippines

In sports

 CCM (The Hockey Company), a manufacturing company of Canada


 CCM (cycle), a manufacturing company of Canada
 Central Coast Mariners FC, an Australian professional football (soccer) team based on the Central Coast of
New South Wales, Australia

In technology

 Continuous Controls Monitoring describes techniques of continuously monitoring and auditing an IT system
 Continuous Current Mode, operational mode of DC-DC converters
 Cisco CallManager, a Cisco product
 Cloud Computing Manifesto
 Configuration & Change Management
 CORBA Component Model, a portion of the CORBA standard for software componentry
 A deprecated abbreviation for the cubic centimetre unit of volume measurement

In transportation

 CCM (cycle), a cycle manufacturer


 CCM Airlines, a regional airline based in Ajaccio, Corsica, France
 Clews Competition Motorcycles, a British motorcycle manufacturer based in Blackburn, England
 Cabin Crew Member, another name for flight attendant

In education

 County College of Morris, a two-year, public community college located off of Route 10 on Center Grove
Road in Randolph Township, New Jersey
 City College Manchester, a Further Education college in the United Kingdom.
 City College of Manila, in the Philippines.
In military

 Center for Countermeasures, a United States military center based at White Sands Missile Range, New
Mexico
 Command Chief Master Sergeant, a position in the United States Air Force

In other fields

 Corn cob mix, a kind of silage consisting of corn cobs and kernels.
 El Centro Cultural de Mexico, an alternative space in Santa Ana, Orange County, California
 Cerberus Capital Management, a large privately owned hedge fund
 Certified Consulting Meteorologist, a person designated by the American Meteorological Society to
possess attributes as they pertain to the field of meteorology
 Crime Classification Manual, FBI produced text for a standardized system to investigate and classify
violent crimes.
 Cervecería Cuauhtémoc Moctezuma, a major brewery in Mexico that produces brands such as Dos Equis
and Tecate.

Color and texture are two low-level features widely used for image classification, indexing and retrieval. Color is

usually represented as a histogram, which is a first order statistical measure that captures global distribution of color

in an image. One of the main drawbacks of the histogram-based approaches is that the spatial distribution and local

variations in color are ignored. Local spatial variation of pixel intensity is commonly used to capture texture

information in an image. Grayscale Co-occurrence Matrix (GCM) is a well-known method for texture extraction in

the spatial domain. A GCM stores the number of pixel neighborhoods in an image that have a particular grayscale

combination. Let I be an image and let p and Np respectively denote any arbitrary pixel and its neighbor in a given

direction. If GL denotes the total number of quantized gray levels and gl denotes the individual gray levels, where,

gl ∈ {0, . . ., GL − 1}, then each component of GCM can be written as gcm(i, j) = Pr((glp, glNp) = (i, j)). Here,
gcm(i, j) is the number of times the gray level of a pixel p denoted by glp equals i, and the gray

level of its neighbor Np denoted by glNp equals j, as a fraction of the total number of pixels in the image. Thus, it

estimates the probability that the gray level of an arbitrary pixel in an image is i, and that of its neighbor is j. One

GCM matrix is generated for each possible neighborhood direction, namely, 0°, 45°, 90° and 135°. Average and range of

14 features like Angular Second Moment, Contrast, Correlation, etc., are generated by combining all the four

matrices to get a total of 28 features. In the GCM approach for texture extraction, color information is completely

lost since only pixel gray levels are considered.
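To make the construction concrete, the following MATLAB sketch builds a grayscale co-occurrence matrix for a single neighborhood direction. The function name, the uniform quantization to GL levels, and the offset convention are illustrative choices, not taken from the text.

function gcm = grayCooccurrence(I, GL, offset)
% gcm(i+1, j+1) holds the fraction of pixel pairs whose quantized gray levels
% are (i, j), with the neighbor taken at the given [row column] offset,
% e.g. [0 1] for the horizontal (0-degree) direction.
    I = double(I);
    q = min(floor(I / (max(I(:)) + eps) * GL), GL - 1);   % quantize to 0 .. GL-1
    [R, C] = size(q);
    gcm = zeros(GL, GL);
    for r = max(1, 1 - offset(1)) : min(R, R - offset(1))
        for c = max(1, 1 - offset(2)) : min(C, C - offset(2))
            i = q(r, c);                                   % gray level of pixel p
            j = q(r + offset(1), c + offset(2));           % gray level of its neighbor Np
            gcm(i + 1, j + 1) = gcm(i + 1, j + 1) + 1;
        end
    end
    gcm = gcm / sum(gcm(:));                               % counts expressed as fractions
end

Averaging the four directional matrices and taking the range across them would then yield the 28 texture features mentioned above.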

To incorporate spatial information along with the color of image pixels, a feature called

color correlogram has recently been proposed. It is a three dimensional matrix that represents the probability of

finding pixels of any two given colors at a distance ‘d’ apart. Autocorrelogram is a variation of correlogram, which
represents the probability of finding two pixels with the same color at a distance ‘d’ apart. This approach can

effectively represent color distribution in an image. However, correlogram features do not capture intensity variation.

Many image databases often contain both color as well as gray scale images. The color correlogram method does not

constitute a good descriptor in such databases.

Another method called Color Co-occurrence Matrix (CCM) has been proposed to capture

color variation in an image. CCM is represented as a three-dimensional matrix, where color pair of the pixels p and

Np are captured in the first two dimensions of the matrix and the spatial distance ‘d’ between these two pixels is

captured in the third dimension. This approach is a generalization of the color correlogram and reduces to the pure

color correlogram for d = 1. CCM is generated using only the Hue plane of the HSV (Hue, Saturation and Intensity

Value) color space. The Hue axis is quantized into HL number of levels. If individual hue values are denoted by hl,

where hl ∈ {0, . . ., HL − 1}, then each component of CCM is defined analogously to gcm(i, j), with quantized hue levels in place of gray levels.

Four matrices representing neighbors at angles 0°, 90°, 180° and 270° are considered. This approach

was further extended by separating the diagonal and the non-diagonal components of CCM to generate a Modified

Color Co-occurrence Matrix (MCCM). MCCM, thus, may be written as follows: MCCM = (CCMD;CCMND)
Here, CCMD and CCMND correspond to the diagonal and off-diagonal components of CCM.

The main drawback of this approach is that, like correlogram, it also captures only color information and intensity

information is completely ignored.
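As a sketch of how the hue-plane CCM and its MCCM split could be formed for d = 1, one might reuse the co-occurrence helper from the GCM example above; the number of hue levels HL, the single 0-degree neighbor, and the variable rgb (an RGB image with values in [0, 1]) are assumptions made for illustration.

% Hue-plane color co-occurrence matrix for d = 1, plus the MCCM split.
HL  = 16;                               % assumed number of quantized hue levels
hsv = rgb2hsv(rgb);                     % rgb is an M-by-N-by-3 image in [0, 1]
hue = hsv(:, :, 1);                     % hue plane, values in [0, 1)
ccm = grayCooccurrence(hue, HL, [0 1]); % co-occurrence of quantized hue pairs

ccmD  = diag(diag(ccm));                % diagonal part: neighbor has the same hue level
ccmND = ccm - ccmD;                     % off-diagonal part: neighbor has a different hue level
mccm  = [ccmD(:); ccmND(:)];            % MCCM = (CCM_D ; CCM_ND) as one feature vector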

An alternative approach is to capture intensity variation as a texture feature from an image and

combine it with color features like histograms using suitable weights . One of the challenges of this approach is to

determine suitable weights since these are highly application-dependent. In certain applications like Content-based

Image Retrieval (CBIR), weights are often estimated from relevance feedback given by users.

While relevance feedback is sometimes effective, it makes the process of image retrieval user-

dependent and iterative. There is also no guarantee on the convergence of the weight-learning algorithms. In order to

overcome these problems, researchers have tried to combine color and texture features together during extraction.

Two approaches have been proposed for capturing color and intensity variations from an image using the LUV color space. In

the Single-channel Co-occurrence Matrix (SCM), variations for each color channel, namely, L, U and V are

considered independently. In the Multi channel Co-occurrence Matrix (MCM), variations are captured taking two

channels at a time – UV, LU and LV. Since the LUV color space separates out chrominance (U and V) from
luminance (L), SCM, in effect, generates one GCM and two CCMs from each image independently. As a result,

correlation between the color channels is lost

However, in MCM, the count of pair wise occurrences of the values of different channels of

the color space is captured. Thus, each component of MCM can be written as follows:

mcmUV(i, j) = Pr((up, vNp) = (i, j))
mcmLU(i, j) = Pr((lp, uNp) = (i, j))
mcmLV(i, j) = Pr((lp, vNp) = (i, j))

Here, mcmUV(i, j) is the number of times the U chromaticity value of a pixel p denoted by

up equals i, and the V chromaticity value of its neighbor Np denoted by vNp equals j, as a fraction of the total

number of pixels in the image. Similarly, mcmLU(i, j) and mcmLV(i, j) are defined. One MCM matrix is generated

for each of the four neighborhood directions, namely, 0°, 45°, 90° and 135°.
Deng and Manjunath (2001) proposed a two-stage method called JSEG, which combines color

and texture after image segmentation. In the first stage, colors are quantized to the required levels for differentiating

between various regions of an image. Pixel values of the regions are then replaced by their quantized color levels to

form a color map. Spatial variation of color levels between different regions in the map is viewed as a type of texture

composition of the image.

Yu et al. (2002) suggested the use of color texture moments to represent both color and texture of

an image. This approach is based on the calculation of

Local Fourier Transformation (LFT) coefficients. Eight templates equivalent to LFT are operated over an image to

generate a characteristic map of the image. Each template is a 3 · 3 filter that considers eight neighbors of the current

pixel for LFT calculation. First and second order moments of the characteristic map are then used to generate a set

of features.

In this paper, we propose an integrated approach for capturing spatial variation of both color and

intensity levels in the neighborhood of each pixel using the HSV color space. In contrast to the other methods, for

each pixel and its neighbor, the amount of color and intensity variation between them is estimated using a weight

function. Suitable constraints are satisfied while choosing the weight function for effectively relating visual

perception of color and the HSV color space properties. The color and intensity variations are represented in a single

composite feature known as Integrated Color and Intensity Co-occurrence Matrix (ICICM). While the existing

schemes generally treat color and intensity separately, the proposed method provides a composite view to both color

and intensity variations in the same feature. The main advantage of using ICICM is that it avoids the use of weights

to combine individual color and texture features. We use ICICM feature in an image retrieval application from large

image databases.

Early result on this work was reported in (Vadivel et al., 2004a). In the next section, we describe

the proposed feature extraction technique after introducing some of the properties of the HSV color space. Choice of

quantization levels for color and intensity axes, selection of parameter values and a brief overview of the image

retrieval application are also discussed.

Integrated color and intensity co-occurrence matrix:


We propose to capture color and intensity variation around each pixel in a two-dimensional

matrix called Integrated Color and Intensity Co-occurrence Matrix (ICICM). This is a generalization of the

Grayscale Co-occurrence Matrix and the Color Co-occurrence Matrix techniques. For each pair of neighboring

pixels, we consider

their contribution to both color perception as well as gray level perception to the human eye. Some of the useful

properties of the HSV color space and their relationship to human color perception are utilized for extracting this

feature. In the next sub-section, we briefly explain relevant properties of the HSV color space. In the subsequent

subsection, we describe how the properties can be effectively used for generating ICICM.

HSV color space:

HSV stands for Hue, Saturation and Value, the three basic properties (or dimensions) of color, and the full
spectrum of colors can be created by editing these three values. The first dimension is hue, the quality of a color
as determined by its dominant wavelength. Hues are broadly classified into three categories: primary, secondary
and tertiary. The primary hues are red, yellow and blue. A secondary hue is formed by combining equal amounts
of two primary hues, giving orange, green and violet. A tertiary hue is formed by combining a primary hue with a
secondary hue, and mixing the primary hues in different amounts produces a limitless number of colors. The
second dimension is saturation, the degree of purity of a color: saturation and intensity drop when colors are
mixed together or when black is added, while adding white makes the color lighter rather than more intense. The
third dimension is value, the brightness of the color: when the value is zero the color space is totally black, and as
the value increases the brightness increases and the various colors appear. Value therefore describes the lightness
or darkness of a color and, like saturation, is associated with tints and shades; tints are colors with added white
and shades are colors with added black.

Properties of the HSV color space:

Sensing of light from an image in the layers of human retina is a complex process with rod cells

contributing to scotopic or dim-light vision and cone cells to photopic or bright-light vision (Gonzalez and Woods,

2002). At low levels of illumination, only the rod cells are excited so that only gray shades are perceived. As the

illumination level increases, more and more cone cells are excited, resulting in increased color perception. Various

color spaces have been introduced to represent and specify colors in a way suitable for storage, processing or

transmission of color information in images. Out of these, HSV is one of the models that separate out the luminance

component (Intensity) of a pixel color from its chrominance components (Hue and Saturation). Hue represents pure

color, which is perceived when incident light is of sufficient illumination and contains a single wavelength.

Saturation gives a measure of the degree by which a pure color is diluted by white light. For light with low

illumination, corresponding intensity value in the HSV color space is also low.

The HSV color space can be represented as a hexacone, with the central vertical axis denoting the luminance
component, I (often denoted by V for Intensity Value). Hue, H, is a chrominance component defined as an angle
in the range [0, 2π] relative to the red axis, with red at angle 0, green at 2π/3, blue at 4π/3 and red again at 2π.
Saturation, S, is the other chrominance component, measured as the radial distance from the central axis of the
hexacone, with value between 0 at the center and 1 at the outer surface. For zero saturation, as the intensity is

increased, we move from black to white through various shades of gray. On the other hand, for a given intensity and

hue, if the saturation is changed from 0 to 1, the perceived color changes from a shade of gray to the most pure form

of the color represented by its hue. When saturation is near 0, all the pixels in an image look alike even though their

hue values are different.


As we increase saturation towards 1, the colors get separated out and are visually perceived as the

true colors represented by their hues. Low saturation implies presence of a large number of spectral components in

the incident light, causing loss of color information even though the illumination level is sufficiently high. Thus, for

low values of saturation or intensity, we can approximate a pixel color by a gray level while for higher saturation

and intensity, the pixel color can be approximated by its hue. For low intensities, even for a high saturation, a pixel

color is close to its gray value. Similarly, for low saturation even for a high value of intensity, a pixel is perceived as

gray. We use these properties to estimate the degree by which a pixel contributes to color perception and gray level

perception.

One possible way of capturing color perception of a pixel is to choose suitable thresholds on

the intensity and saturation. If the saturation and the intensity are above their respective thresholds, we may consider

the pixel to have color dominance; else, it has gray level dominance. However, such a hard thresholding does not

properly capture color perception near the threshold values. This is due to the fact that there is no fixed level of

illumination above which the cone cells get excited. Instead, there is a gradual transition from scotopic to photopic

vision. Similarly, there is no fixed threshold for the saturation of cone cells that leads to loss of chromatic

information at higher levels of illumination caused by color dilution. We, therefore, use suitable weights that vary

smoothly with saturation and intensity to represent both color and gray scale perception for each pixel.
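The exact weight functions are not reproduced in this excerpt, so the following MATLAB sketch simply uses logistic (sigmoid) curves in saturation and intensity as a stand-in; the thresholds s0, v0 and the slope k are invented for illustration and would need to be replaced by the functions actually chosen for ICICM.

% Soft weights for color vs. gray-level perception of a pixel (assumed logistic form).
% s and v are saturation and intensity (value) in [0, 1]; here s is swept for plotting.
s = linspace(0, 1, 101);
v = 0.8 * ones(size(s));                      % fixed, fairly bright intensity

s0 = 0.2; v0 = 0.2; k = 25;                   % illustrative thresholds and slope
wColor = 1 ./ (1 + exp(-k * (s - s0))) .* 1 ./ (1 + exp(-k * (v - v0)));
wGray  = 1 - wColor;                          % low s or low v pushes the weight toward gray

plot(s, wColor, s, wGray);
xlabel('saturation'); legend('color weight', 'gray weight');

The weights vary smoothly rather than switching at a hard threshold, which is the behaviour the preceding paragraph argues for.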

NON INTERVAL QUANTIZATION:

Because each HSV component has a large range, computing retrieval features directly over the full range makes
the computation too heavy to ensure rapid retrieval. It is therefore essential to quantize the HSV components to
reduce computation and improve efficiency. At the same time, because the human eye can distinguish only a
limited number of colors, not all segments need to be calculated. Unequal-interval quantization according to
human color perception is therefore applied to the H, S and V components.

Based on an analysis of the color model, hue is divided into eight parts, while saturation and intensity (value) are
each divided into three parts, in accordance with the ability of the human eye to distinguish them. Hue (H),
saturation (S) and value (V) are quantified according to these levels and to subjective color perception.

In accordance with the quantization levels above, the three-dimensional H, S, V feature vector is combined, with
different weights for the different components, into a one-dimensional feature vector given by the following
equation:

G = Qs*Qv*H + Qv*S + V

where Qs is the number of quantized levels of S and Qv is the number of quantized levels of V. Setting
Qs = Qv = 3 gives G = 9H + 3S + V.

In this way the three HSV components are mapped to a one-dimensional value that quantizes the whole color
space into 72 main colors, so the image can be handled as a one-dimensional histogram with 72 bins. This
quantization is effective in reducing the influence of light intensity on the representation, while also reducing
the computational time and complexity.
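A minimal MATLAB sketch of this quantization and the resulting 72-bin histogram might look as follows; for brevity the H, S and V axes are cut into equal intervals (8, 3 and 3 levels), whereas the text prescribes unequal, perception-based intervals, so the bin boundaries here are placeholders. The variable rgb is again assumed to be an RGB image with values in [0, 1].

% 72-bin HSV color feature using G = Qs*Qv*H + Qv*S + V with Qs = Qv = 3.
hsv = rgb2hsv(rgb);
H = min(floor(hsv(:, :, 1) * 8), 7);          % hue quantized to 0 .. 7
S = min(floor(hsv(:, :, 2) * 3), 2);          % saturation quantized to 0 .. 2
V = min(floor(hsv(:, :, 3) * 3), 2);          % value quantized to 0 .. 2

G = 9 * H + 3 * S + V;                        % one-dimensional code in 0 .. 71
hist72 = histcounts(G(:), -0.5:1:71.5);       % 72-bin histogram of the image
hist72 = hist72 / sum(hist72);                % normalized for retrieval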


IMAGE RETRIEVAL:

An image retrieval system is a computer system for browsing, searching and retrieving images from a large
database of digital images. Most traditional and common methods of image retrieval add some form of metadata
to the images, such as captions, keywords or descriptions, so that retrieval can be performed over the annotations.
Manual image annotation is time consuming, expensive and laborious, so a large amount of research has been
done on automatic image annotation. It is crucial to understand the scope and nature of the image data in order to
determine the complexity of the image search system design, which also depends largely on the type of collection
involved: archives, domain-specific collections, enterprise collections, personal collections, the Web, and so on.

Invention of the digital camera has given the common man the privilege to capture his

world in pictures, and conveniently share them with others. One can today generate volumes of images with content

as diverse as family get-togethers and national park visits. Low-cost storage and easy Web hosting has fueled the

metamorphosis of common man from a passive consumer of photography in the past to a current-day active

producer. Today, searchable image data exists with extremely diverse visual and semantic content, spanning

geographically disparate locations, and is rapidly growing in size. All these factors have created innumerable

possibilities and hence considerations for real-world image search system designers.

As far as technological advances are concerned, growth in Content-based image

retrieval has been unquestionably rapid. In recent years, there has been significant effort put into understanding the

real world implications, applications, and constraints of the technology. Yet, real-world application of the

technology is currently limited. We devote this section to understanding image retrieval in the real world and discuss
user expectations, system constraints and requirements, and the research effort to make image retrieval a reality in

the not-too-distant future.

An image retrieval system designed to serve a personal collection should focus on features

such as personalization, flexibility of browsing, and display methodology. For example, Google’s Picasa system

[Picasa 2004] provides a chronological display of images taking a user on a journey down memory lane. Domain

specific collections may impose specific standards for presentation of results. Searching an archive for content

discovery could involve long user search sessions. Good visualization and a rich query support system should be the

design goals. A system designed for the Web should be able to support massive user traffic. One way to supplement

software approaches for this purpose is to provide hardware support to the system architecture. Unfortunately, very

little has been explored in this direction, partly due to the lack of agreed-upon indexing and retrieval methods. The

notable few applications include an FPGA implementation of a color-histogram-based image retrieval system

[Kotoulas and Andreadis 2003], an FPGA implementation for sub image retrieval within an image database [Nakano

and Takamichi 2003], and a method for efficient retrieval in a network of imaging devices [Woodrow and

Heinzelman 2002].

Discussion. Regardless of the nature of the collection, as the expected user-base grows, factors such as

concurrent query support, efficient caching, and parallel and distributed processing of requests become critical. For

future real-world image retrieval systems, both software and hardware approaches to address these issues are

essential. More realistically, dedicated specialized servers, optimized memory and storage support, and highly

parallelizable image search algorithms to exploit cluster computing powers are where the future of large-scale image

search hardware support lies.

OVERVIEW OF TEXTURE:
Texture is a term everyone recognizes but finds hard to define precisely; two different textures can nevertheless be
told apart by recognizing their similarities and differences. Texture is commonly used in three ways: to segment an
image into regions based on texture, to differentiate between (or classify) already segmented regions, and to
reproduce a texture by generating a description of it. Texture can be analyzed in three different ways: spectral,
structural and statistical.

Image Types:
The toolbox supports four types of images:

1 .Intensity images;

2. Binary images;

3. Indexed images;

4. R G B images.

Most monochrome image processing operations are carried out using


binary or intensity images, so our initial focus is on these two image types.
Indexed and RGB colour images.

Intensity Images:

An intensity image is a data matrix whose values have been scaled to
represent intensities. When the elements of an intensity image are of class uint8
or class uint16, they have integer values in the range [0, 255] or [0, 65535],
respectively. If the image is of class double, the values are floating point
numbers. Values of scaled, double intensity images are in the range [0, 1] by
convention.

Binary Images:

Binary images have a very specific meaning in MATLAB. A binary image
is a logical array of 0s and 1s. Thus, an array of 0s and 1s whose values are of a numeric data
class, say uint8, is not considered a binary image in MATLAB. A numeric array
is converted to binary using the function logical. Thus, if A is a numeric array
consisting of 0s and 1s, we create a logical array B using the statement
B = logical(A)

If A contains elements other than 0s and 1s, use of the logical function
converts all nonzero quantities to logical 1s and all entries with value 0 to
logical 0s.

Using relational and logical operators also creates logical arrays.

To test whether an array is logical we use the islogical function: islogical(c).

If c is a logical array, this function returns a 1; otherwise it returns a 0.

Logical arrays can be converted to numeric arrays using the data class
conversion functions.
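The short MATLAB example below simply exercises the conversions just described; the array contents are arbitrary.

A = uint8([0 1 1; 1 0 1; 0 0 1]);   % numeric array of 0s and 1s (class uint8, not binary)
B = logical(A);                     % now a binary image in the MATLAB sense

islogical(A)                        % returns 0: A is numeric, not logical
islogical(B)                        % returns 1: B is a logical array

C = double(B);                      % data class conversion back to a numeric array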

Indexed Images:

An indexed image has two components:

A data matrix of integers, x

A color map matrix, map

Matrix map is an m×3 array of class double containing floating point
values in the range [0, 1]. The length m of the map is equal to the number of
colors it defines. Each row of map specifies the red, green and blue components
of a single color. An indexed image uses “direct mapping” of pixel intensity
values to color map values.
The color of each pixel is determined by using the corresponding value of the
integer matrix x as a pointer into map. If x is of class double, then all of its
components with values less than or equal to 1 point to the first row in map, all
components with value 2 point to the second row, and so on. If x is of class uint8
or uint16, then all components with value 0 point to the first row in map, all
components with value 1 point to the second, and so on.

RGB Image:

An RGB color image is an M×N×3 array of color pixels, where each color
pixel is a triplet corresponding to the red, green and blue components of an RGB
image at a specific spatial location. An RGB image may be viewed as a “stack”
of three gray scale images that, when fed into the red, green and blue inputs of a
color monitor, produce a color image on the screen. By convention, the three images forming an
RGB color image are referred to as the red, green and blue component images.
The data class of the component images determines their range of values. If an
RGB image is of class double, the range of values is [0, 1].

Similarly, the range of values is [0, 255] or [0, 65535] for RGB images of
class uint8 or uint16, respectively. The number of bits used to represent the pixel
values of the component images determines the bit depth of an RGB image. For
example, if each component image is an 8-bit image, the corresponding RGB
image is said to be 24 bits deep.
Generally, the number of bits in all component images is the same. In this
case the number of possible colors in an RGB image is (2^b)^3, where b is the
number of bits in each component image. For the 8-bit case the number is
16,777,216 colors.
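As a small illustration of the points above, the sketch below builds an RGB array from three arbitrary grayscale component images and evaluates the color-count arithmetic for the 8-bit case.

% An RGB image as a "stack" of three grayscale component images.
R = rand(64); G = rand(64); B = rand(64);   % three double component images in [0, 1]
rgbImg = cat(3, R, G, B);                   % M-by-N-by-3 RGB array

b = 8;                                      % bits per component image
numColors = (2^b)^3;                        % 16,777,216 possible colors at 24-bit depth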

Many denoising approaches, including sparse representation-based and nonlocal self-similarity-based ones, have been widely studied and
exploited for noise removal. In spite of the great success of many denoising algorithms, they tend to
smooth the fine scale image textures when removing noise, degrading the image visual quality. To address
this problem, in this paper, we propose a texture enhanced image denoising method by enforcing the
gradient histogram of the denoised image to be close to a reference gradient histogram of the original
image. Given the reference gradient histogram, a novel gradient histogram preservation (GHP) algorithm
is developed to enhance the texture structures while removing noise. Two region-based variants of GHP
are proposed for the denoising of images consisting of regions with different textures. An algorithm is also
developed to effectively estimate the reference gradient histogram from the noisy observation of the
unknown image. Our experimental results demonstrate that the proposed GHP algorithm can well preserve
the texture appearance in the denoised images, making them look more natural.
The problem of recovering patterns and structures in images from corrupted observations is encountered in
many engineering and science applications, ranging from computer vision and consumer electronics to
medical imaging. In many practical image processing problems, observed images often contain noise that
should be removed beforehand for improving the visual pleasure and the reliability of subsequent image
analysis tasks. Images may be contaminated by various types of noise. Among them, impulse noise is
one of the most frequently occurring types, and it may be introduced into images during acquisition and
transmission. For example, it may be caused by malfunctioning pixels in camera sensors, faulty memory
locations in hardware or transmission in a noisy channel [1]. In this paper, we focus on the task of impulse
noise removal. Removing impulse noise from images is a challenging image processing problem, because
edges, which can also be modeled as abrupt intensity jumps in a scan line, are highly salient features for
visual attention. Therefore, besides impulse noise removal, another important requirement for image
denoising procedures is that they should preserve important image structures, such as edges and major
texture features. A vast variety of impulse noise removal methods are available in the literature, touching
different fields of signal processing, mathematics and statistics. From a signal processing perspective,
impulse noise removal poses a fundamental challenge for conventional linear methods. They typically
achieve the target of noise removal by low-pass filtering which is performed by removing the high-
frequency components of images. This is effective for smooth regions in images. But for texture and detail
regions, the low-pass filtering typically introduces large, spurious oscillations near the edges, known as
the Gibbs phenomenon [2], [3]. Accordingly, nonlinear filtering
techniques are invoked to achieve effective performance. Among the most popular and robust
nonlinear filters are the so-called decision-based filters, which first employ an impulse-noise detector to
determine which pixels should be filtered and then replace them by using the median filter or its variants,
while leaving all other pixels unchanged. The representative methods include the adaptive median filter
(AMF) [4] and the adaptive center-weighted median filter (ACWMF) [5]. Besides, many successful
frameworks for impulse noise removal can be derived from the energy method. In this framework, image
denoising is considered as a variational problem where a restored image is computed by a minimization of
some energy functions. Typically, such functions consist of a fidelity term such as the norm difference
between the recovered image and the noisy image, and a regularization term which penalizes high
frequency noise. For example, Chan et al. [6] propose a powerful two-stage scheme, in which noise
candidates are selectively restored using an objective function with an

ℓ1 data-fidelity term and an edge-preserving regularization term. Under a similar scheme, Cai et al. [7]
propose an enhanced algorithm for deblurring and denoising, and achieve strong objective and subjective
performance. Different from Chan's and Cai's work, Li et al. [8] formulate the problem with a new
variational functional, in which a content-dependent fidelity term assimilates the strengths of fidelity terms
measured by the ℓ1 and ℓ2 norms, and the regularizer is formed by the ℓ1 norm of the tight framelet
coefficients of the underlying image. From a statistical perspective, recovering images from degraded
forms is inherently an ill-posed inverse problem. It often can be formulated as an energy minimization
problem in which either the optimal or most probable configuration is the goal. The performance of an
image recovery algorithm largely depends on how well it can employ regularization
conditions or priors when numerically solving the problem, because the useful prior statistical knowledge
can regulate estimated pixels. Therefore, image modeling lies at the core of image denoising problems.
One common prior assumption for natural images is intensity consistency, which means: (1) nearby pixels
are likely to have the same or similar intensity values; and (2) pixels on the same structure are likely to
have the same or similar intensity values. Note that the first assumption means images are locally smooth,
and the second assumption means images
have the property of non-local self-similarity. Accordingly, how to choose statistical models that
thoroughly exploit these two types of prior knowledge directly determines the performance of image recovery
algorithms. Another important characteristic of natural images is that they are comprised of structures at
different scales. Through multi-scale decomposition, the structures of images at different scales become
better exposed, and hence be more easily predicted. At the same time, the availability of multi-scale
structures can significantly reduce the dimension of the problem and hence make the ill-posed problem
better posed [9], [10]. Early heuristic observations about the local smoothness of the image intensity field have
been quantified by several linear parametric models, such as the piecewise autoregressive (PAR) image
model [11], [12]. Moreover, the study of natural image statistics reveals that the second order statistics of
natural images tend to be invariant across different scales, as illustrated in Fig. 1 (denoted by inter-scale
correlation). Those scale-invariant features are shown to be crucial for human visual perception [13],
[14]. This observation inspires us to learn and propagate the statistical features across different scales to
keep the local smoothness of images. On the other hand, the idea of exploiting the non-local self-similarity
of images has attracted increasingly more attention in the field of image processing [15], [16]. Referring to
Fig. 1 (denoted
by intra-scale correlation), the non-local self-similarity is based on the observation that image patches tend
to repeat
themselves in the whole image plane, which in fact reflects the intra-scale correlation. All those findings
tell us that local/non-local redundancy and intra-/inter-scale correlation can be thought of as two sides of the
same coin. The multiscale framework provides a natural way to efficiently combine the principle
of local smoothness and non-local similarity for image recovery. Moreover, recent progress in semi-
supervised learning gives us additional inspiration to address the problem of image recovery. Semi-
supervised learning is motivated by a considerable
interest in the problem of learning from both labeled (measured) and unlabeled (unmeasured) points [17].
Specifically,
geometry-based semi-supervised learning methods show that natural images do not fill up the
ambient Euclidean space; rather, they reside on or close to an underlying submanifold [18], [19]. In this
paper, we try to extract this kind of low-dimensional structure and use it as prior knowledge to regularize
the process of image denoising. In other words, in the algorithm design, we will explicitly take into
account the intrinsic manifold structure by making use of both labeled and unlabeled data points.
Motivated by the above observation, the well-known theory of kernels [20] and works on graph-based
signal processing [24]–[26], in this paper, we propose a powerful algorithm to perform progressive image
recovery based on hybrid graph Laplacian regularized regression. Part of our previous work has been
reported in [21]. In our method, a multi-scale representation of the target image is constructed by
Laplacian pyramid, through which we try to effectively combine local smoothness and non-local self-
similarity. On one hand, within each scale, a graph Laplacian regularization model represented by implicit
kernel is learned which simultaneously minimizes the least square error on the measured samples and
preserves the geometrical structure of the image data space by exploring non-local self-similarity. In this
procedure, the intrinsic
manifold structure is considered by using both measured and unmeasured samples. On the other hand,
between two scales, the proposed model is extended to the parametric manner through explicit kernel
mapping to model the interscale correlation, in which the local structure regularity is learned and
propagated from coarser to finer scales. It is worth noting that the proposed method is a general framework
to address the problem of image recovery. We choose one typical image recovery task, impulse noise
removal (though the framework is not limited to this task), to validate the performance

of the proposed algorithm. Moreover, in our method the objective functions are formulated in the same
form for
intra-scale and inter-scale processing, but with different solutions obtained in different feature spaces: the
solution in the original feature space by implicit kernel is used for intra-scale prediction, and the other
solution in a higher feature space mapped by explicit kernel is used for inter-scale prediction. Therefore,
the proposed image recovery algorithm actually casts the consistency of local and global correlation
through the multi-scale scheme into a unified framework. The rest of the paper is organized as follows: In
Section II, we introduce the proposed graph Laplacian regularized model and its kernel-based optimization
solutions. Section III details the proposed multi-scale image recovery framework. Section IV presents
some experimental results and comparative studies. Section V concludes the paper.

IMAGE RECOVERY VIA GRAPH LAPLACIAN REGULARIZED REGRESSION:

Problem Description:
Given a degraded image X with n pixels, each pixel can be described by its feature vector xi = [ui, bi] ∈ R^(m+2),
where ui = (h, w) is the coordinate and bi ∈ R^m is a certain context of xi which is defined differently
for different tasks. All pixels in the image construct the sample set χ = {x1, x2, . . . , xn}. We call the
grayscale value yi as the label of xi . For image impulse noise removal, when image measures are noise-
dominated, the performance of image recovery can be improved by implementing it in two steps [6]: The
first step is to classify noisy and clean samples by using the adaptive median filter or its variant, which
depends on the noise type. Then, noisy samples are treated as unlabeled ones with their intensity values to
be re-estimated, and the remaining clean samples are treated as labeled ones with their intensity values unchanged.
The second step is to adjust the inference to give a best fit to labeled measures and uses the fitted model to
estimate the unlabeled samples. In view of machine learning, this task can
be addressed as a problem of semi-supervised regression.

Graph Laplacian Regularized Regression (GLRR):

What we want to derive is the prediction function f, which gives the re-estimated values of noisy
samples. Given labeled samples Xl = {(x1, y1), . . . , (xl, yl)} as the training data, one direct approach to
learning the prediction function f is to minimize the prediction error on the set of labeled samples, which is
formulated as follows:

argmin_{f ∈ Hκ} J(f) = argmin_{f ∈ Hκ} Σ_{i=1}^{l} ||yi − f(xi)||² + λ||f||²,   (1)

where Hκ is the Reproducing Kernel Hilbert Space (RKHS) associated with the kernel κ. Hκ will be the
completion of the linear span given by κ(xi, ·) for all xi ∈ χ, i.e., Hκ = span{κ(xi, ·) | xi ∈ χ}.
The above regression model only makes use of the labeled samples to carry out inference. When the noise
level is heavy, which means there are few labeled samples, it is hard to achieve a robust recovery of noisy
image. Moreover, it fails to take into account the intrinsic geometrical structure of the image data. Note
that we also have a bunch of unlabeled samples {xl+1, . . . , xn} at hand. In the field of machine learning,
the success of semi-supervised learning [20]–[25] is plausibly due to effective utilization of the large
amounts of unlabeled data to extract information that is useful for generalization. Therefore, it is
reasonable to leverage both labeled and unlabeled data to achieve better predictions. In order to make use
of unlabeled data, we follow the well-known manifold assumption, which is implemented by a graph
structure. Specially, the whole image sample set is modeled as a undirected graph, in which the vertices
are all the data points and the edges represent the relationships between vertices. Each edge is assigned a
weight to reflect the similarity between the connected vertices. As stated by the manifold assumption [18],
data points in the graph with larger affinity weights should have similar values. Meanwhile, with the above
definition, the intrinsic geometrical structure of the data space can be described by the graph Laplacian.
Through
the graph Laplacian regularization, the manifold structure can be incorporated in the objective function.
Mathematically, the manifold assumption can be implemented by minimizing the term
Σ_{i, j} Wij (f(xi) − f(xj))², where Wij is in inverse proportion to d²(xi, xj). Wij is defined as the edge weight
in the data adjacency graph, which reflects the affinity between the two vertices xi and xj. In graph
construction, edge weights play a crucial role. In this paper, we combine the edge-preserving property of the
bilateral filter [16] and the robust property of the non-local-means weight [15] to design the edge weights.
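Since the explicit weight definition does not appear in this excerpt, the MATLAB sketch below is only a simplified stand-in: it connects each pixel to its 8 spatial neighbors, combines a bilateral-style spatial term with an intensity-difference term (in place of the full non-local-means patch weight), and then solves the graph-Laplacian-regularized least-squares problem for the unlabeled (noisy) pixels. The function name, the neighborhood, and all parameters are assumptions, and the kernel machinery of GLRR is omitted.

function xHat = glrrDenoiseSketch(x, labeled, sigmaS, sigmaR, lambda)
% x       : noisy grayscale image (double)
% labeled : logical mask, true where a pixel is considered clean (measured)
% Solves  argmin_f ||M (f - x)||^2 + lambda * f' * L * f,  with L = D - W.
    [R, C] = size(x);  n = R * C;
    idx = reshape(1:n, R, C);
    offsets = [-1 -1; -1 0; -1 1; 0 -1; 0 1; 1 -1; 1 0; 1 1];
    I = []; J = []; V = [];
    for k = 1:size(offsets, 1)
        dr = offsets(k, 1);  dc = offsets(k, 2);
        r = max(1, 1 - dr):min(R, R - dr);
        c = max(1, 1 - dc):min(C, C - dc);
        p = idx(r, c);  q = idx(r + dr, c + dc);
        w = exp(-(dr^2 + dc^2) / (2 * sigmaS^2)) ...       % spatial (bilateral-style) term
            .* exp(-(x(p) - x(q)).^2 / (2 * sigmaR^2));    % intensity affinity term
        I = [I; p(:)];  J = [J; q(:)];  V = [V; w(:)];
    end
    W = sparse(I, J, V, n, n);                             % symmetric adjacency weights
    L = spdiags(full(sum(W, 2)), 0, n, n) - W;             % graph Laplacian
    M = spdiags(double(labeled(:)), 0, n, n);              % data fidelity only on clean pixels
    f = (M + lambda * L) \ (M * x(:));                     % regularized least squares
    xHat = reshape(f, R, C);
end

In the two-step scheme described earlier, the labeled mask would come from the impulse-noise detector (e.g. the adaptive median filter), so that only the pixels flagged as noisy are re-estimated.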
1.1 WAVELETS

1.1.1 FOURIER ANALYSIS

Signal analysts already have at their disposal an impressive arsenal of tools. Perhaps the most well-
known of these is Fourier analysis, which breaks down a signal into constituent sinusoids of different
frequencies. Another way to think of Fourier analysis is as a mathematical technique for transforming our
view of the signal from time-based to frequency-based.
For many signals, Fourier analysis is extremely useful because the signal’s frequency content is of great
importance. So why do we need other techniques, like wavelet analysis?

Fourier analysis has a serious drawback. In transforming to the frequency domain, time information
is lost. When looking at a Fourier transform of a signal, it is impossible to tell when a particular event took
place. If the signal properties do not change much over time — that is, if it is what is called a stationary
signal—this drawback isn’t very important. However, most interesting signals contain numerous non
stationary or transitory characteristics: drift, trends, abrupt changes, and beginnings and ends of events.
These characteristics are often the most important part of the signal, and Fourier analysis is not suited to
detecting them.

1.1.2 SHORT-TIME FOURIER ANALYSIS:

In an effort to correct this deficiency, Dennis Gabor (1946) adapted the Fourier transform to
analyze only a small section of the signal at a time—a technique called windowing the signal. Gabor’s
adaptation, called the Short-Time Fourier Transform (STFT), maps a signal into a two-dimensional
function of time and frequency.

The STFT represents a sort of compromise between the time- and frequency-based views of a
signal. It provides some information about both when and at what frequencies a signal event occurs.
However, you can only obtain this information with limited precision, and that precision is determined by
the size of the window. While the STFT compromise between time and frequency information can be
useful, the drawback is that once you choose a particular size for the time window, that window is the
same for all frequencies. Many signals require a more flexible approach—one where we can vary the
window size to determine more accurately either time or frequency.

1.1.3 WAVELET ANALYSIS


Wavelet analysis represents the next logical step: a windowing technique with variable-sized
regions. Wavelet analysis allows the use of long time intervals where we want more precise low-frequency
information, and shorter regions where we want high-frequency information.

Here’s what this looks like in contrast with the time-based, frequency-based, and STFT views of a
signal:

You may have noticed that wavelet analysis does not use a time-frequency region, but rather a time-scale
region. For more information about the concept of scale and the link between scale and frequency, see
“How to Connect Scale to Frequency?”

One major advantage afforded by wavelets is the ability to perform local analysis, that is, to
analyze a localized area of a larger signal. Consider a sinusoidal signal with a small discontinuity — one
so tiny as to be barely visible. Such a signal easily could be generated in the real world, perhaps by a
power fluctuation or a noisy switch.

A plot of the Fourier coefficients (as provided by the fft command) of this signal shows nothing
particularly interesting: a flat spectrum with two peaks representing a single frequency. However, a plot of
wavelet coefficients clearly shows the exact location in time of the discontinuity.

Wavelet analysis is capable of revealing aspects of data that other signal analysis techniques miss, aspects
like trends, breakdown points, discontinuities in higher derivatives, and self-similarity. Furthermore,
because it affords a different view of data than those presented by traditional techniques, wavelet analysis
can often compress or de-noise a signal without appreciable degradation. Indeed, in their brief history
within the signal processing field, wavelets have already proven themselves to be an indispensable
addition to the analyst’s collection of tools and continue to enjoy a burgeoning popularity today.

Now that we know some situations when wavelet analysis is useful, it is worthwhile asking “What is
wavelet analysis?” and even more fundamentally,

“What is a wavelet?”

A wavelet is a waveform of effectively limited duration that has an average value of zero.

Compare wavelets with sine waves, which are the basis of Fourier analysis.

Sinusoids do not have limited duration — they extend from minus to plus infinity. And where
sinusoids are smooth and predictable, wavelets tend to be irregular and asymmetric.

Fourier analysis consists of breaking up a signal into sine waves of various frequencies. Similarly,
wavelet analysis is the breaking up of a signal into shifted and scaled versions of the original (or mother)
wavelet. Just looking at pictures of wavelets and sine waves, you can see intuitively that signals with sharp
changes might be better analyzed with an irregular wavelet than with a smooth sinusoid, just as some
foods are better handled with a fork than a spoon. It also makes sense that local features can be described
better with wavelets that have local extent.

1.1.4 NUMBER OF DIMENSIONS:

Thus far, we’ve discussed only one-dimensional data, which encompasses most ordinary signals.
However, wavelet analysis can be applied to two-dimensional data (images) and, in principle, to higher
dimensional data. This toolbox uses only one and two-dimensional analysis techniques.

1.4.1 THE CONTINUOUS WAVELET TRANSFORM:

Mathematically, the process of Fourier analysis is represented by the Fourier transform

F(ω) = ∫ f(t) e^(−jωt) dt,

which is the sum over all time of the signal f(t) multiplied by a complex exponential. (Recall that a
complex exponential can be broken down into real and imaginary sinusoidal components.) The results of
the transform are the Fourier coefficients F(w), which when multiplied by a sinusoid of frequency w yields
the constituent sinusoidal components of the original signal. Graphically, the process looks like:

Similarly, the continuous wavelet transform (CWT) is defined as the sum over all time of the signal
multiplied by scaled, shifted versions of the wavelet function ψ:

C(scale, position) = ∫ f(t) ψ(scale, position, t) dt

The result of the CWT is a series of many wavelet coefficients C, which are a function of scale and
position.

Multiplying each coefficient by the appropriately scaled and shifted wavelet yields the constituent
wavelets of the original signal:

1.4.2 SCALING

We’ve already alluded to the fact that wavelet analysis produces a time-scale view of a signal and
now we’re talking about scaling and shifting wavelets.

What exactly do we mean by scale in this context?

Scaling a wavelet simply means stretching (or compressing) it.

To go beyond colloquial descriptions such as “stretching,” we introduce the scale factor, often
denoted by the letter a.

If we’re talking about sinusoids, for example the effect of the scale factor is very easy to see:

The scale factor works exactly the same with wavelets. The smaller the scale factor, the more
“compressed” the wavelet.
It is clear from the diagrams that for a sinusoid sin (wt) the scale factor ‘a’ is related (inversely) to
the radian frequency ‘w’. Similarly, with wavelet analysis the scale is related to the frequency of the
signal.

1.4.3 SHIFTING

Shifting a wavelet simply means delaying (or hastening) its onset.

1.2 THE DISCRETE WAVELET TRANSFORM:

Calculating wavelet coefficients at every possible scale is a fair amount of work, and it generates
an awful lot of data. What if we choose only a subset of scales and positions at which to make our
calculations? It turns out, rather remarkably, that if we choose scales and positions based on powers of two
(so-called dyadic scales and positions), then our analysis will be much more efficient and just as accurate.
We obtain such an analysis from the discrete wavelet transform (DWT).

An efficient way to implement this scheme using filters was developed in 1988 by Mallat. The
Mallat algorithm is in fact a classical scheme known in the signal processing community as a two-channel
sub band coder. This very practical filtering algorithm yields a fast wavelet transform — a box into which
a signal passes, and out of which wavelet coefficients quickly emerge. Let’s examine this in more depth.

For many signals, the low-frequency content is the most important part. It is what gives the signal its
identity. The high-frequency content on the other hand imparts flavor or nuance. Consider the human
voice. If you remove the high-frequency components, the voice sounds different but you can still tell
what’s being said. However, if you remove enough of the low-frequency components, you hear gibberish.
In wavelet analysis, we often speak of approximations and details. The approximations are the high-scale,
low-frequency components of the signal. The details are the low-scale, high-frequency components.
The original signal S passes through two complementary filters and emerges as two signals.
Unfortunately, if we actually perform this operation on a real digital signal, we wind up with twice as
much data as we started with. Suppose, for instance that the original signal S consists of 1000 samples of
data. Then the resulting signals will each have 1000 samples, for a total of 2000.

These signals A and D are interesting, but we get 2000 values instead of the 1000 we had. There
exists a more subtle way to perform the decomposition using wavelets. By looking carefully at the
computation, we may keep only one point out of two in each of the two 2000-length samples to get the
complete information. This is the notion of downsampling. We produce two sequences called cA and cD.

The process on the right, which includes downsampling, produces DWT coefficients. To gain a better
appreciation of this process, let's perform a one-stage discrete wavelet transform of a signal. Our signal will
be a pure sinusoid with high-frequency noise added to it.

Here is our schematic diagram with real signals inserted into it:

Notice that the detail coefficients cD are small and consist mainly of high-frequency noise, while
the approximation coefficients cA contain much less noise than does the original signal.

You may observe that the actual lengths of the detail and approximation coefficient vectors are
slightly more than half the length of the original signal. This has to do with the filtering process, which is
implemented by convolving the signal with a filter. The convolution “smears” the signal, introducing
several extra samples into the result.
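As a base-MATLAB illustration of this one-stage decomposition (the Wavelet Toolbox function dwt would serve the same purpose), the sketch below filters a noisy sinusoid with Haar analysis filters and keeps every second sample; the signal length, frequency and noise level are arbitrary.

% One-stage DWT of a noisy sinusoid using Haar analysis filters.
n = 0:999;
s = sin(2 * pi * 0.02 * n) + 0.2 * randn(size(n));   % pure sinusoid plus high-frequency noise

lo = [1  1] / sqrt(2);        % lowpass  (approximation) analysis filter
hi = [1 -1] / sqrt(2);        % highpass (detail) analysis filter

a = conv(s, lo);              % filter ...
d = conv(s, hi);
cA = a(2:2:end);              % ... then keep one point out of two (downsampling)
cD = d(2:2:end);

subplot(3, 1, 1); plot(s);  title('signal S');
subplot(3, 1, 2); plot(cA); title('approximation coefficients cA');
subplot(3, 1, 3); plot(cD); title('detail coefficients cD');

With the two-tap Haar filters, cA and cD each contain exactly half as many samples as S; longer analysis filters give the "slightly more than half" lengths mentioned above.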

1.3 MULTIPLE-LEVEL DECOMPOSITION:

The decomposition process can be iterated, with successive approximations being decomposed in
turn, so that one signal is broken down into many lower resolution components. This is called the wavelet
decomposition tree.
Looking at a signal’s wavelet decomposition tree can yield valuable information.

1.3.1 NUMBER OF LEVELS:

Since the analysis process is iterative, in theory it can be continued indefinitely. In reality, the
decomposition can proceed only until the individual details consist of a single sample or pixel. In practice,
you’ll select a suitable number of levels based on the nature of the signal, or on a suitable criterion such as
entropy.

The double-density DWT is an improvement upon the critically sampled DWT with important additional
properties: (1) It employs one scaling function and two distinct wavelets, which are designed to be offset
from one another by one half, (2) The double-density DWT is overcomplete by a factor of two, and (3) It
is nearly shift-invariant. In two dimensions, this transform outperforms the standard DWT in terms of
denoising; however, there is room for improvement because not all of the wavelets are directional. That is,
although the double-density DWT utilizes more wavelets, some lack a dominant spatial orientation, which
prevents them from isolating specific image orientations.
A solution to this problem is provided by the double-density complex DWT, which combines the
characteristics of the double-density DWT and the dual-tree DWT. The double-density complex DWT is
based on two scaling functions and four distinct wavelets, each of which is specifically designed such that
the two wavelets of the first pair are offset from one another by one half, and the other pair of wavelets form
an approximate Hilbert transform pair. By ensuring these two properties, the double-density complex
DWT possesses improved directional selectivity and can be used to implement complex and directional
wavelet transforms in multiple dimensions.

We construct the filter bank structures for both the double-density DWT and the double-density complex
DWT using finite impulse response (FIR) perfect reconstruction filter banks, which are discussed in detail
at the beginning of each section. These filter banks are then applied recursively to the lowpass subband,
using the analysis filters for the forward transform and the synthesis filters for the inverse transform. By
doing this, it is then possible to evaluate each transform's performance in several applications, including
signal denoising and image enhancement.

Figure 1. A 3-Channel Perfect Reconstruction Filter Bank.

The analysis filter bank consists of three analysis filters—one lowpass filter denoted by h0(-n) and two
distinct highpass filters denoted by h1(-n) and h2(-n). As the input signal x(n) travels through the system,
the analysis filter bank decomposes it into three subbands, each of which is then down-sampled by 2. From
this process we obtain the signals c(n), d1(n), and d2(n), which represent the low frequency (or coarse)
subband, and the two high frequency (or detail) subbands, respectively.

The synthesis filter bank consists of three synthesis filters—one lowpass filter denoted by h0(n) and two
distinct highpass filters denoted by h1(n) and h2(n)—which are essentially the inverse of the analysis
filters. As the three subband signals travel through the system, they are up-sampled by two, filtered, and
then combined to form the output signal y(n).

One of the main concerns in filter bank design is to ensure the perfect reconstruction (PR) condition. That
is, to design h0(n), h1(n), and h2(n) such that y(n)=x(n).
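The three-channel double-density filters described above do not ship with standard MATLAB, but the PR condition itself is easy to check with an ordinary two-channel bank; the sketch below is a stand-in for the double-density design and uses wfilters, dwt and idwt from the Wavelet Toolbox:

[Lo_D,Hi_D,Lo_R,Hi_R] = wfilters('db2');   % analysis and synthesis filters of a PR pair
x = randn(1,256);                          % arbitrary test signal
[cA,cD] = dwt(x,Lo_D,Hi_D);                % analysis filtering followed by downsampling by 2
y = idwt(cA,cD,Lo_R,Hi_R,length(x));       % upsampling by 2, synthesis filtering, recombination
max(abs(y - x))                            % on the order of 1e-15, i.e. y(n) = x(n) up to round-off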
The transform is 2-times expansive because for an N-point signal it gives 2N DWT coefficients. If the
filters in the upper and lower DWTs are the same, then no advantage is gained. However, if the filters are
designed in a specific way, then the subband signals of the upper DWT can be interpreted as the real part
of a complex wavelet transform, and subband signals of the lower DWT can be interpreted as the
imaginary part. Equivalently, for specially designed sets of filters, the wavelet associated with the upper
DWT can be an approximate Hilbert transform of the wavelet associated with the lower DWT. When
designed in this way, the dual-tree complex DWT is nearly shift-invariant, in contrast with the critically-
sampled DWT. Moreover, the dual-tree complex DWT can be used to implement 2D wavelet transforms
where each wavelet is oriented, which is especially useful for image processing. (For the separable 2D
DWT, recall that one of the three wavelets does not have a dominant orientation.) The dual-tree complex
DWT outperforms the critically-sampled DWT for applications like image denoising and enhancement.
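The shift variance of the critically sampled DWT that motivates the dual-tree design can be seen with a small experiment (ordinary single-tree DWT only, Wavelet Toolbox assumed):

x0 = zeros(1,128); x0(60) = 1;          % a unit impulse
x1 = circshift(x0,[0 1]);               % the same impulse shifted by one sample
[~,d0] = dwt(x0,'db4');
[~,d1] = dwt(x1,'db4');
plot(1:length(d0),d0,1:length(d1),d1)   % the detail coefficients change shape, not merely shift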

With the increased use of digital multimedia content, large amounts of data are transferred and distributed
easily, and copying of digital media has become comparatively simple. These products can be transmitted and
redistributed easily without any authentication, so there is a need for copyright protection of such data.
Digital watermarking is the process of hiding digital information in a carrier signal; the information may be
the name of the creator, status, recipient, etc. Watermarking can be applied to different types of digital
data where copyright needs to be protected. Digital watermarks are used to verify the authenticity of the
carrier signal and are prominently used for tracing copyright violations. Like traditional watermarks, digital
watermarks are only perceptible under certain conditions, i.e. after applying some algorithm. A watermark
is a digital code permanently embedded into cover content, in the case of this system, into a video sequence.
Applications of watermarking include copy prevention, broadcast monitoring, authentication and data
hiding. Here the watermarking technique is used for data hiding. The main aspects of information hiding are
capacity, security and robustness [4-6]. Capacity is the amount of information that can be hidden, security
refers to detecting the information correctly, and robustness refers to the resistance to modification of the
cover content. Video watermarking algorithms usually favour robustness: with a robust algorithm it is not
possible to remove the watermark without severe degradation of the cover content.

Video watermarking approaches can be classified into two main classes based on the method of hiding
watermark bits in the host video. The two classes are: spatial domain watermarking, where embedding and
detection of the watermark are performed by directly manipulating the pixel intensity values of the video
frame, and transform domain techniques [8-9], which, on the other hand, alter spatial pixel values of the
host video according to a pre-determined transform and are more robust than spatial domain techniques,
since they disperse the watermark over the spatial domain of the video frame, making it difficult to remove
the watermark through malicious attacks such as cropping, scaling, rotation and other geometrical attacks.
The commonly used transform domain techniques are the Discrete Fourier Transform (DFT), the Discrete
Cosine Transform (DCT), and the Discrete Wavelet Transform (DWT).
1) Shot segmentation of video sequences

The original input video sequence is first segmented into non-overlapping units, called shots, that describe
different actions. Each shot is characterized by no significant changes in its content, which is determined by
the background and the objects present in the scene. Here, the proposed method uses DCT and a correlation
measure to identify the number of frames involved in each shot.

2) Bit plane slicing of grayscale image

Bit-plane slicing is a technique in which the image is sliced into different bit planes. Instead of highlighting
gray-level ranges, it may be desirable to highlight the contribution made to total image appearance by
specific bits.
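A minimal MATLAB sketch of bit-plane slicing using bitget is given below; the example file cameraman.tif is just a stand-in for any 8-bit grayscale image (Image Processing Toolbox assumed):

I = imread('cameraman.tif');           % any uint8 grayscale image
planes = false([size(I) 8]);
for b = 1:8
    planes(:,:,b) = bitget(I,b) > 0;   % plane b holds the b-th bit of every pixel
end
imshow(planes(:,:,8))                  % the most significant plane carries most of the visible structure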

3) Pixel permutation

After the bit-plane slicing process, the sliced images are permuted pixel by pixel to improve the security of
the hidden information.

4) Decomposition of image using DWT

Wavelets are functions defined over a finite interval and having an average value of zero. The basic idea of
the wavelet transform is to represent any arbitrary function as a superposition of a set of such wavelets
or basis functions. Figure 1 shows the watermark embedding process.
Several researchers have concentrated on using the DWT because of its multiresolution characteristics. PCA
has been used in different ways in image and video watermarking methods. For the implementation of a robust
video watermarking scheme, the following transforms are used: the Discrete Wavelet Transform (DWT) and
Principal Component Analysis (PCA). DWT is used to implement a simple watermarking scheme. The 2-D
discrete wavelet transform (DWT) decomposes the image into sub-images; the approximation looks like the
original, only at 1/4 of the size. The 2-D DWT is an application of the 1-D DWT in both the horizontal
and the vertical directions. The DWT decomposes an image into a lower resolution approximation image
(LL) as well as horizontal (HL), vertical (LH) and diagonal (HH) detail components. Due to its excellent
spatial-frequency localization properties, the DWT is very suitable for identifying areas in the host video
frame where a watermark can be embedded imperceptibly. Embedding the watermark in the low frequencies
obtained by wavelet decomposition increases the robustness with respect to attacks that have low-pass
characteristics, such as lossy compression, filtering, and geometric distortions.

Figure 2.1: Standard DWT decomposition
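The following hedged sketch shows the single-level 2-D DWT of one frame and an additive embedding in the LL subband; the frame, the watermark pattern and the strength 0.01 are purely illustrative and do not reproduce the exact embedding rule of the scheme described above:

F = im2double(imread('cameraman.tif'));    % stand-in for one grayscale video frame
[LL,HL,LH,HH] = dwt2(F,'haar');            % approximation and horizontal/vertical/diagonal details
wm = 0.01*randn(size(LL));                 % hypothetical watermark pattern and strength
Fw = idwt2(LL + wm, HL, LH, HH, 'haar');   % watermarked frame after the inverse 2-D DWT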

PCA is a method of identifying patterns in data and expressing the data in such a way as to highlight
their similarities and differences. PCA produces linear combinations of the original variables to generate
the axes, also known as principal components, or PCs. The PCA transform is used to embed the watermark in
each colour channel of each frame of the video. The main advantage of this approach is that the same
watermark, or multiple watermarks, can be embedded into the three colour channels of the image in order to
increase the robustness of the watermark. For each block the covariance matrix is calculated, and each block
is then transformed into its PCA components before the watermark image is embedded.
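A rough block-wise PCA sketch is shown below; the 8x8 block size and the grayscale test image are assumptions made for illustration, and the watermark embedding step itself is omitted:

B = im2col(im2double(imread('cameraman.tif')),[8 8],'distinct');  % each column is one 8x8 block
C = cov(B');                                   % 64x64 covariance matrix estimated across blocks
[V,D] = eig(C);
[~,order] = sort(diag(D),'descend');
pcs    = V(:,order);                           % principal components (eigenvectors of the covariance)
coeffs = pcs' * bsxfun(@minus,B,mean(B,2));    % projection of every (centred) block onto the PCs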

In this paper, we propose a novel framework to address the analysis, detection, and processing of blur
kernels and blurry images. Central to this work is the notion of the double discrete wavelet transform (DDWT),
which is designed to sparsify the blurred image and the blur kernel simultaneously. We contrast DDWT with
prior work that regularizes the image and blur kernel in terms of their sparsity in a linear transform domain.
The major disadvantage of the regularization approach is that the image/blur coefficients are not directly
observed, hence requiring a computationally taxing search to minimize some "cost" function. On the other
hand, our DDWT provides a way to observe the wavelet coefficients of the image and blur kernel directly.
This gives DDWT coefficients a very intuitive interpretation and simplifies the task of decoupling the blur
from the signal, regardless of why the blur occurred (e.g. object motion, defocus, and camera shake) or the
type of blur (e.g. global and spatially varying blur). In this sense, DDWT is likely

to impact computer vision and image processing applications broadly. Although the primary goal of this
article is to develop the DDWT as an analytical tool, we also show example applications in blur kernel
estimation, deblurring, and near-blur-invariant image feature extraction to illustrate
the potential of DDWT.

Fig. 2.2 (a) DDWT analysis in (2)

Fig. 2.3 (b) DDWT analysis in (3)


The two processing pipelines above are equivalent: (a) is the direct result of applying DDWT to
the observed blurry image y, while (b) is the interpretation we give to the DDWT coefficients.
Recent advancements in blind and non-blind deblurring have enabled us to handle complex uniform blur
kernels. By comparison, progress in blind and non-blind deblurring for spatially varying blur kernels has
been slow, since there is limited data available to support localized blur kernel estimation. For this reason,
it is more common to address this problem using multiple input images and additional hardware. Computational
approaches include supervised or unsupervised foreground/background segmentation, statistical modeling,
homography-based blur kernel modeling methods and partial differential equation (PDE) methods. In particular,
sparsifying transforms have played key roles in the detection of blur kernels: gradient operators and
wavelet/framelet/curvelet transforms have been used for this purpose. However, existing works have
shortcomings, such as ringing artifacts in deblurring or an inability to handle spatially varying blur. It is
also common for deblurring algorithms to require iteration, which is highly undesirable for many real-time
applications. Besides PDE methods, the authors are unaware of any existing framework that unifies the
analysis, detection, and processing of camera shake, object motion, defocus, global, and spatially varying
blurs.

2.1 Object Motion Detection


Naturally, the human eye is well suited to the task of identifying the replicated DWT coefficients u within
the DDWT coefficients; in military applications, for example, a human "analyst" can easily detect complex
motion (such as rotation) when presented with a picture of DDWT coefficients. For computational object
motion detection, our ability to detect complex motion is limited primarily by the capabilities of the
computer vision algorithms to extract local image features from DDWT and to find similar features located
an unknown (k-pixel) distance away. There are a number of ways that this can be accomplished; we emphasize
that the correlation-based technique we present below should not be taken as the best way, but rather as a
proof-of-concept on DDWT. A more advanced image feature extraction strategy founded on computer vision
principles would likely improve the overall performance of the object motion detection; we leave this as a
direction for future research.
Double Density Wavelet Transform

2.2 Double Density Complex DWT (DDC)


The input data are processed by two parallel iterated filter banks hi(n) and gi(n), where i = 0, 1, 2. The
real part of a complex wavelet transform [2] is produced by the sub-band signals of the upper DWT and the
imaginary part by the lower DWT.

Fig.2.4 Filter bank diagram of Double Density Complex DWT

In this modern era, many ensemble learning techniques are used to automatically detect a particular
disease. These techniques need large databases as training sets to learn about the disease. The learning is
effective if the training set has little noise. As the training set is not guaranteed to be noise free, various
denoising techniques, depending upon the application, are employed to suppress the unwanted data. The
different types of noise discussed in this paper are as follows:

Random noise is caused by interference from random processes such as thermal noise and photon counting.
The central limit theorem states that, irrespective of the shapes of the individual random distributions,
the convolution of these random functions in cascade tends to a Gaussian distribution. Gaussian noise is
therefore simulated to study the effects of random noise, and it follows an additive model. Let Im,n be the
N×N image pixels and ηm,n the N×N noise pixels, where 1 ≤ m,n ≤ N:

fm,n = Im,n + ηm,n

The probability density function of a Gaussian random variable x with variance σx² and mean μx is given by:

Px(x) = (1 / (√(2π) σx)) · exp( −(x − μx)² / (2σx²) )

During transmission, additive noise is caused by passing automobiles, static electricity, power lines, etc.,
near the transmission lines. Additive white Gaussian noise (AWGN) is a zero-mean additive Gaussian noise
with a flat power spectrum. João M. Sanches, Jacinto C. Nascimento, and Jorge S. Marques [6] remove Gaussian
noise in an image by solving computationally intensive equations using numerical methods.

Noise in MR images is generally modeled as white and Rician distributed [7,8]. For images with high SNR,
Rician noise is well approximated by Gaussian noise, but in low-SNR cases the Rician noise distribution is
considerably different from the Gaussian. The probability density of Rician-distributed image pixels is
given by

PN(N) = (N / σ²) · exp( −(I² + N²) / (2σ²) ) · I₀( I·N / σ² )

where N is the Rician noise pixel, I is the original image, σ² is the noise variance and I₀ is the modified
Bessel function of the first kind of order zero.

Jan Aelterman et al. [9] propose a two-step Rician noise removal process: first the bias is removed, and then
denoising is performed on the square root of the image in the wavelet domain. The uncertain behaviour in the
emission of photons at the sensors of the scanners leads to Poisson noise, which is characterized by a random
variable with a Poisson probability distribution. It is associated with systems such as PET, SPECT, and
fluorescent confocal microscopy imaging; shot noise in scanning devices can also be modeled as Poisson noise.
The likelihood of the Poisson noise pixels η given an observed image I, with λ as a proportionality factor, is:

P(ηm,n | Im,n, λ) = ∏_{m,n=1}^{N} (λ Im,n)^(ηm,n) · exp(−λ Im,n) / ηm,n!

Speckle noise occurs due to random disturbances in the coherent properties of the emitted wave in imaging
systems such as laser, acoustic and synthetic aperture radar imagery. It follows a multiplicative noise model
and is modeled by a generalized gamma distribution. It is represented as

fm,n = Im,n · ηm,n

A generalized gamma random variable x with scale parameter α, shape parameters β and γ, and gamma function Γ
has the probability density function:

P(x) = γ x^(γβ−1) · exp( −(x/α)^γ ) / ( α^(γβ) Γ(β) )

During transmission, multiplicative noise is caused by turbulence in the air, reflections, refractions, etc.,
along the transmission path. Y. Guo et al. modify the non-local means filter, which works well for Gaussian
denoising, to adapt it to the speckle reduction process. This new filter is shown to have good denoising
properties along with better preservation of edge information.

Salt-and-pepper noise is induced by malfunctioning acquisition sensors and synchronization errors in
transmission. It causes sudden changes in pixel values of an image and is an impulse type of noise.
Nawazish Naveeda et al. used neural networks to detect the impulse noise and a weighted average of three
filters to clear out the noise in corrupted mammographic images.
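Most of these noise models can be simulated directly with imnoise (Image Processing Toolbox); the sketch below uses cameraman.tif as a stand-in for an MR test image, and the noise parameters are arbitrary:

I   = imread('cameraman.tif');                 % uint8 test image (stand-in for an MR slice)
In1 = imnoise(I,'gaussian',0,0.01);            % additive Gaussian noise, zero mean, variance 0.01 (on the [0,1] scale)
In2 = imnoise(I,'poisson');                    % Poisson noise, pixel values treated as photon counts
In3 = imnoise(I,'speckle',0.04);               % multiplicative speckle noise, variance 0.04
In4 = imnoise(I,'salt & pepper',0.05);         % impulse noise affecting 5% of the pixels
% Rician noise is not built into imnoise; it can be simulated as the magnitude of a
% complex signal whose real and imaginary parts are corrupted by Gaussian noise:
Id = im2double(I); sigma = 0.05;
Ir = sqrt((Id + sigma*randn(size(Id))).^2 + (sigma*randn(size(Id))).^2);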

In this paper, medical image denoising is performed using 2-dimensional non-subsampled contourlets, curvelets
and the double-density dual-tree complex wavelet transform. Issues such as shift variance, aliasing and the
lack of directionality of the traditional discrete wavelet transform (DWT) make it less effective for medical
image denoising. Over the years, many variants of the wavelet transform, such as the double-density DWT and
the dual-tree complex DWT, have been developed to overcome these disadvantages. The double-density DWT and
the dual-tree complex DWT share several properties: near shift invariance, overcompleteness by a factor of
two, and the use of perfect reconstruction filter banks. The double-density DWT has a single scaling function,
and its wavelets are smooth and nearly shift invariant. It is widely believed that the best waveform for image
processing is the Gabor atom, which is complex in nature. To incorporate this complex nature into wavelets, a
new type of wavelet transform, the double-density dual-tree complex wavelet transform, which combines the
properties of the double-density DWT and the dual-tree complex DWT, is introduced.

The curvelet is a multiscale transform that efficiently represents singularities and the edges of curves. Its
representation of edges is less scattered than that of wavelets, i.e. the energy of the object is localized in
a few coefficients. This sparseness allows for better image reconstruction and coding algorithms.

The contourlet is an extension of wavelet theory that adds directionality, but it is not shift invariant. The
non-subsampled contourlet transform (NSCT) is a directional multiresolution transform known for its shift
invariance, multiscale analysis and preservation of the vital information in natural scenes.

In this paper, the effect of different 2D geometric multiscale transforms (non-subsampled contourlets,
curvelets and the double-density dual-tree complex wavelet transform) on several types of noise present in
MR images, namely AWGN, salt & pepper noise, speckle noise, and Poisson noise, is analysed.
MATLAB
INTRODUCTION TO MATLAB

What Is MATLAB?

MATLAB® is a high-performance language for technical computing. It integrates computation,
visualization, and programming in an easy-to-use environment where problems and solutions are
expressed in familiar mathematical notation. Typical uses include

Math and computation

Algorithm development

Data acquisition

Modeling, simulation, and prototyping

Data analysis, exploration, and visualization

Scientific and engineering graphics

Application development, including graphical user interface building.

MATLAB is an interactive system whose basic data element is an array that does not require
dimensioning. This allows you to solve many technical computing problems, especially those with matrix
and vector formulations, in a fraction of the time it would take to write a program in a scalar
non-interactive language such as C or FORTRAN.
The name MATLAB stands for matrix laboratory. MATLAB was originally written to provide easy access
to matrix software developed by the LINPACK and EISPACK projects. Today, MATLAB engines
incorporate the LAPACK and BLAS libraries, embedding the state of the art in software for matrix
computation.

MATLAB has evolved over a period of years with input from many users. In university environments, it
is the standard instructional tool for introductory and advanced courses in mathematics, engineering, and
science. In industry, MATLAB is the tool of choice for high-productivity research, development, and
analysis.

MATLAB features a family of add-on application-specific solutions called toolboxes. Very
important to most users of MATLAB, toolboxes allow you to learn and apply specialized technology.
Toolboxes are comprehensive collections of MATLAB functions (M-files) that extend the MATLAB
environment to solve particular classes of problems. Areas in which toolboxes are available include signal
processing, control systems, neural networks, fuzzy logic, wavelets, simulation, and many others.

The MATLAB System:

The MATLAB system consists of five main parts:

Development Environment:

This is the set of tools and facilities that help you use MATLAB functions and files. Many of these
tools are graphical user interfaces. It includes the MATLAB desktop and Command Window, a command
history, an editor and debugger, and browsers for viewing help, the workspace, files, and the search path.

The MATLAB Mathematical Function:

This is a vast collection of computational algorithms ranging from elementary functions like sum,
sine, cosine, and complex arithmetic, to more sophisticated functions like matrix inverse, matrix
eigenvalues, Bessel functions, and fast Fourier transforms.

The MATLAB Language:


This is a high-level matrix/array language with control flow statements, functions, data structures,
input/output, and object-oriented programming features. It allows both "programming in the small" to
rapidly create quick and dirty throw-away programs, and "programming in the large" to create complete
large and complex application programs.

Graphics:

MATLAB has extensive facilities for displaying vectors and matrices as graphs, as well as
annotating and printing these graphs. It includes high-level functions for two-dimensional and three-
dimensional data visualization, image processing, animation, and presentation graphics. It also includes
low-level functions that allow you to fully customize the appearance of graphics as well as to build
complete graphical user interfaces on your MATLAB applications.

The MATLAB Application Program Interface (API):

This is a library that allows you to write C and Fortran programs that interact with MATLAB. It
includes facilities for calling routines from MATLAB (dynamic linking), calling MATLAB as a
computational engine, and for reading and writing MAT-files.

MATLAB WORKING ENVIRONMENT:

MATLAB DESKTOP:-

Matlab Desktop is the main Matlab application window. The desktop contains five sub windows, the
command window, the workspace browser, the current directory window, the command history window,
and one or more figure windows, which are shown only when the user displays a graphic.

The command window is where the user types MATLAB commands and expressions at the prompt
(>>) and where the output of those commands is displayed. MATLAB defines the workspace as the set of
variables that the user creates in a work session. The workspace browser shows these variables and some
information about them. Double clicking on a variable in the workspace browser launches the Array
Editor, which can be used to obtain information about, and in some instances edit, certain properties of the
variable.
The Current Directory tab above the workspace tab shows the contents of the current directory, whose
path is shown in the current directory window. For example, in the Windows operating system the path
might be as follows: C:\MATLAB\Work, indicating that directory "work" is a subdirectory of the main
directory "MATLAB", which is installed in drive C. Clicking on the arrow in the current
directory window shows a list of recently used paths. Clicking on the button to the right of the window
allows the user to change the current directory.

MATLAB uses a search path to find M-files and other MATLAB related files, which are organized
in directories in the computer file system. Any file run in MATLAB must reside in the current directory or
in a directory that is on the search path. By default, the files supplied with MATLAB and MathWorks
toolboxes are included in the search path.

The easiest way to see which directories are on the search path, or to add or modify the search path, is to
select Set Path from the File menu on the desktop, and then use the Set Path dialog box. It is good practice
to add any commonly used directories to the search path to avoid repeatedly having to change the current
directory.

The Command History Window contains a record of the commands a user has entered in the
command window, including both current and previous MATLAB sessions. Previously entered MATLAB
commands can be selected and re-executed from the command history window by right clicking on a
command or sequence of commands.

This action launches a menu from which to select various options in addition to executing the commands.
This is a useful feature when experimenting with various commands in a work session.

Using the MATLAB Editor to create M-Files:

The MATLAB editor is both a text editor specialized for creating M-files and a graphical MATLAB
debugger. The editor can appear in a window by itself, or it can be a sub window in the desktop. M-files
are denoted by the extension .m, as in pixelup.m. The MATLAB editor window has numerous pull-down
menus for tasks such as saving, viewing, and debugging files. Because it performs some simple checks and
also uses color to differentiate between various elements of code, this text editor is recommended as the
tool of choice for writing and editing M-functions. To open the editor, type edit at the prompt, or type
edit filename to open the M-file filename.m in an editor window, ready for editing. As noted earlier, the
file must be in the current directory, or in a directory on the search path.

Getting Help:

The principal way to get help online is to use the MATLAB help browser, opened as a separate
window either by clicking on the question mark symbol (?) on the desktop toolbar, or by typing help
browser at the prompt in the command window. The Help Browser is a web browser integrated into the
MATLAB desktop that displays Hypertext Markup Language (HTML) documents. The Help Browser
consists of two panes: the Help Navigator pane, used to find information, and the display pane, used to view
the information. Self-explanatory tabs in the Help Navigator pane are used to perform a search.
DIGITAL IMAGE PROCESSING
BACKGROUND:

Digital image processing is an area characterized by the need for extensive experimental work to
establish the viability of proposed solutions to a given problem. An important characteristic underlying
the design of image processing systems is the significant level of testing & experimentation that normally
is required before arriving at an acceptable solution. This characteristic implies that the ability to formulate
approaches & quickly prototype candidate solutions generally plays a major role in reducing the cost &
time required to arrive at a viable system implementation.

What is DIP
An image may be defined as a two-dimensional function f(x, y), where x & y are spatial
coordinates, & the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of
the image at that point. When x, y & the amplitude values of f are all finite discrete quantities, we call
the image a digital image. The field of DIP refers to processing digital images by means of a digital
computer. A digital image is composed of a finite number of elements, each of which has a particular
location & value. The elements are called pixels.

Vision is the most advanced of our senses, so it is not surprising that images play the single most
important role in human perception. However, unlike humans, who are limited to the visual band of the
EM spectrum, imaging machines cover almost the entire EM spectrum, ranging from gamma rays to radio
waves. They can also operate on images generated by sources that humans are not accustomed to
associating with images.

There is no general agreement among authors regarding where image processing stops & other related
areas such as image analysis & computer vision start. Sometimes a distinction is made by defining image
processing as a discipline in which both the input & output of a process are images. This is a limiting &
somewhat artificial boundary. The area of image analysis (image understanding) is in between image
processing & computer vision.
There are no clear-cut boundaries in the continuum from image processing at one end to complete
vision at the other. However, one useful paradigm is to consider three types of computerized processes in
this continuum: low-, mid-, & high-level processes. A low-level process involves primitive operations such
as image processing to reduce noise, contrast enhancement & image sharpening, and is characterized by the
fact that both its inputs & outputs are images. A mid-level process on images involves tasks such as
segmentation, description of objects to reduce them to a form suitable for computer processing, &
classification of individual objects; it is characterized by the fact that its inputs generally are images but
its outputs are attributes extracted from those images. Finally, higher-level processing involves "making
sense" of an ensemble of recognized objects, as in image analysis, & at the far end of the continuum,
performing the cognitive functions normally associated with human vision.

Digital image processing, as already defined, is used successfully in a broad range of areas of
exceptional social & economic value.

What is an image?

An image is represented as a two dimensional function f(x, y) where x and y are spatial co-
ordinates and the amplitude of ‘f’ at any pair of coordinates (x, y) is called the intensity of the image at
that point.

Gray scale image:

A grayscale image is a function I(x, y) of the two spatial coordinates of the image plane.

I(x, y) is the intensity of the image at the point (x, y) on the image plane.

I(x, y) takes non-negative values; assuming the image is bounded by a rectangle [0, a] × [0, b], we have
I: [0, a] × [0, b] → [0, ∞).

Color image:

It can be represented by three functions, R(x, y) for red, G(x, y) for green and B(x, y)
for blue.

An image may be continuous with respect to the x and y coordinates and also in amplitude.
Converting such an image to digital form requires that the coordinates as well as the amplitude be
digitized. Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called
quantization.

Coordinate convention:
The result of sampling and quantization is a matrix of real numbers. We use two principal ways to
represent digital images. Assume that an image f(x, y) is sampled so that the resulting image has M rows
and N columns. We say that the image is of size M X N. The values of the coordinates (x, y) are discrete
quantities. For notational clarity and convenience, we use integer values for these discrete coordinates. In
many image processing books, the image origin is defined to be at (x, y) = (0, 0). The next coordinate
values along the first row of the image are (x, y) = (0, 1). It is important to keep in mind that the notation
(0, 1) is used to signify the second sample along the first row. It does not mean that these are the actual
values of physical coordinates when the image was sampled. The following figure shows the coordinate
convention. Note that x ranges from 0 to M-1 and y from 0 to N-1 in integer increments.

The coordinate convention used in the toolbox to denote arrays is different from the preceding paragraph
in two minor ways. First, instead of using (x, y), the toolbox uses the notation (r, c) to indicate rows and
columns. Note, however, that the order of coordinates is the same as the order discussed in the previous
paragraph, in the sense that the first element of a coordinate tuple, (a, b), refers to a row and the second
to a column. The other difference is that the origin of the coordinate system is at (r, c) = (1, 1); thus, r
ranges from 1 to M and c from 1 to N in integer increments. IPT documentation refers to these as pixel
coordinates. Less frequently, the toolbox also employs another coordinate convention called spatial
coordinates, which uses x to refer to columns and y to refer to rows. This is the opposite of our use of the
variables x and y.

Image as Matrices:

The preceding discussion leads to the following representation for a digitized image function:

f(x, y) = [ f(0,0)      f(0,1)      ...   f(0,N-1)
            f(1,0)      f(1,1)      ...   f(1,N-1)
            ...         ...         ...   ...
            f(M-1,0)    f(M-1,1)    ...   f(M-1,N-1) ]

The right side of this equation is a digital image by definition. Each element of this array is called
an image element, picture element, pixel or pel. The terms image and pixel are used throughout the rest of
our discussions to denote a digital image and its elements.

A digital image can be represented naturally as a MATLAB matrix:

f = [ f(1,1)   f(1,2)   ...   f(1,N)
      f(2,1)   f(2,2)   ...   f(2,N)
      ...      ...      ...   ...
      f(M,1)   f(M,2)   ...   f(M,N) ]

Where f(1,1) = f(0,0) (note the use of a monospace font to denote MATLAB quantities). Clearly
the two representations are identical, except for the shift in origin. The notation f(p, q) denotes the element
located in row p and column q. For example, f(6,2) is the element in the sixth row and second column
of the matrix f. Typically we use the letters M and N respectively to denote the number of rows and
columns in a matrix. A 1xN matrix is called a row vector, whereas an Mx1 matrix is called a column
vector. A 1x1 matrix is a scalar.

Matrices in MATLAB are stored in variables with names such as A, a, RGB, real_array and so on.
Variables must begin with a letter and contain only letters, numerals and underscores. As noted in the
previous paragraph, all MATLAB quantities are written using monospace characters. We use
conventional Roman, italic notation, such as f(x, y), for mathematical expressions.

Reading Images:

Images are read into the MATLAB environment using function imread, whose syntax is

imread('filename')

Format name   Description                        Recognized extensions
TIFF          Tagged Image File Format           .tif, .tiff
JPEG          Joint Photographic Experts Group   .jpg, .jpeg
GIF           Graphics Interchange Format        .gif
BMP           Windows Bitmap                     .bmp
PNG           Portable Network Graphics          .png
XWD           X Window Dump                      .xwd


Here filename is a string containing the complete name of the image file (including any applicable
extension). For example, the command line

>> f = imread('8.jpg');

reads the JPEG image (see the table above) into the image array f. Note the use of single quotes (')
to delimit the string filename. The semicolon at the end of a command line is used by MATLAB for
suppressing output; if a semicolon is not included, MATLAB displays the results of the operation(s)
specified in that line. The prompt symbol (>>) designates the beginning of a command line, as it appears in
the MATLAB command window.

Data Classes:

Although we work with integer coordinates, the values of the pixels themselves are not restricted to
be integers in MATLAB. The table below lists the various data classes supported by MATLAB and IPT for
representing pixel values. The first eight entries in the table are referred to as numeric data classes. The
ninth entry is the char class and, as shown, the last entry is referred to as the logical data class.

All numeric computations in MATLAB are done in double precision, so this is also a frequent data
class encountered in image processing applications. Class uint8 is also encountered frequently, especially
when reading data from storage devices, as 8-bit images are the most common representation found in
practice. These two data classes, class logical, and, to a lesser degree, class uint16 constitute the primary
data classes on which we focus. Many IPT functions, however, support all the data classes listed in the table.
Data class double requires 8 bytes to represent a number, uint8 and int8 require one byte each, uint16 and
int16 require 2 bytes each, and uint32, int32 and single require 4 bytes each.
Name      Description
double    Double-precision, floating-point numbers in the approximate range -10^308 to 10^308 (8 bytes per element).
uint8     Unsigned 8-bit integers in the range [0, 255] (1 byte per element).
uint16    Unsigned 16-bit integers in the range [0, 65535] (2 bytes per element).
uint32    Unsigned 32-bit integers in the range [0, 4294967295] (4 bytes per element).
int8      Signed 8-bit integers in the range [-128, 127] (1 byte per element).
int16     Signed 16-bit integers in the range [-32768, 32767] (2 bytes per element).
int32     Signed 32-bit integers in the range [-2147483648, 2147483647] (4 bytes per element).
single    Single-precision floating-point numbers in the approximate range -10^38 to 10^38 (4 bytes per element).
char      Characters (2 bytes per element).
logical   Values are 0 or 1 (1 byte per element).

The char data class holds characters in Unicode representation. A character string is merely a 1×n array of
characters. A logical array contains only the values 0 and 1, with each element being stored in one byte;
logical arrays are created using the function logical or by using relational operators.

Image Types:

The toolbox supports four types of images:

1 .Intensity images;
2. Binary images;

3. Indexed images;

4. R G B images.

Most monochrome image processing operations are carried out using binary or intensity images,
so our initial focus is on these two image types. Indexed and RGB colour images are discussed in later
sections.
Intensity Images:

An intensity image is a data matrix whose values have been scaled to represent intensities. When
the elements of an intensity image are of class uint8 or class uint16, they have integer values in the range
[0, 255] and [0, 65535], respectively. If the image is of class double, the values are floating-point numbers.
Values of scaled, double intensity images are in the range [0, 1] by convention.
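For example, a uint8 image can be converted to a scaled double intensity image with im2double (Image Processing Toolbox); the example file is an assumption:

I8 = imread('cameraman.tif');     % class uint8, integer values in [0, 255]
Id = im2double(I8);               % class double, values scaled to [0, 1]
[min(Id(:)) max(Id(:))]           % both bounds lie inside [0, 1]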

Binary Images:

Binary images have a very specific meaning in MATLAB. A binary image is a logical array of 0s
and 1s. Thus, an array of 0s and 1s whose values are of a numeric data class, say uint8, is not considered a
binary image in MATLAB. A numeric array is converted to binary using the function logical.

Thus, if A is a numeric array consisting of 0s and 1s, we create a logical array B using the statement

B = logical(A)

If A contains elements other than 0s and 1s, use of the logical function converts all nonzero
quantities to logical 1s and all entries with value 0 to logical 0s.

Using relational and logical operators also creates logical arrays.

To test whether an array is logical we use the islogical function: islogical(C). If C is a logical array, this
function returns a 1; otherwise it returns a 0. Logical arrays can be converted to numeric arrays using the
data class conversion functions.

Indexed Images:

An indexed image has two components:

A data matrix of integers, x

A color map matrix, map

Matrix map is an m*3 array of class double containing floating-point values in the range [0, 1]. The length
m of the map is equal to the number of colors it defines. Each row of map specifies the red, green and blue
components of a single color. An indexed image uses "direct mapping" of pixel intensity values to color map
values. The color of each pixel is determined by using the corresponding value of the integer matrix x as a
pointer into map. If x is of class double, then all of its components with values less than or equal to 1 point
to the first row in map, all components with value 2 point to the second row and so on. If x is of class uint8
or uint16, then all components with value 0 point to the first row in map, all components with value 1 point
to the second and so on.
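A short sketch of this direct mapping, using a hypothetical indexed example file and the ind2rgb function, is:

[x,map] = imread('trees.tif');    % x holds integer indices, map is an m-by-3 colormap of doubles in [0, 1]
RGB = ind2rgb(x,map);             % each pixel's colour is looked up in the corresponding row of map
imshow(x,map)                     % equivalently, display the indexed image with its own colormap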

RGB Image:

An RGB color image is an M*N*3 array of color pixels, where each color pixel is a triplet
corresponding to the red, green and blue components of an RGB image at a specific spatial location. An
RGB image may be viewed as a "stack" of three grayscale images that, when fed into the red, green and
blue inputs of a color monitor, produce a color image on the screen. By convention, the three images forming
an RGB color image are referred to as the red, green and blue component images. The data class of the
component images determines their range of values. If an RGB image is of class double, the range of values
is [0, 1]; similarly, the range of values is [0, 255] or [0, 65535] for RGB images of class uint8 or uint16,
respectively. The number of bits used to represent the pixel values of the component images determines the
bit depth of an RGB image. For example, if each component image is an 8-bit image, the corresponding
RGB image is said to be 24 bits deep.

Generally, the number of bits in all component images is the same. In this case the number of
possible colors in an RGB image is (2^b)^3, where b is the number of bits in each component image. For the
8-bit case the number is 16,777,216 colors.
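This count can be verified directly:

b = 8;                            % bits per component image
ncolors = (2^b)^3                 % 16777216 possible colors for a 24-bit-deep RGB image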
In this letter, a novel resolution-enhancement technique based on the interpolation of the HF subband
images in the wavelet domain has been presented. In contrast with other state-of-the-art resolution-
enhancement techniques, the designed framework applies the edge and fine features information that is
obtained in WT space, performs sparse interpolation over an oriented block in an LR image, and uses the
NLM denoising algorithm for the SR restoration. Experimental results highlight the superior performance
of the proposed algorithm in terms of objective criteria, as well as in the subjective perception via the
human visual system, in comparison with other conventional methods.

PROGRESSIVE HYBRID GRAPH LAPLACIAN REGULARIZATION


As stated in the above section, the IK-GLRR and EK-GLRR models provide two complementary views of the current image
recovery task. A natural question is how to combine them into an elegant framework. In this paper, we propose to use
a simple multi-scale framework to achieve such a purpose. There are several reasons why we use the multi-scale
framework. First, one important characteristic of natural images is that they are comprised of structures at different scales.
Through multi-scale decomposition, the structures of images at different scales become better exposed, and hence are more
easily predicted. Second, a multi-scale scheme gives a more compact representation of imagery data because it encodes the low
frequency and high frequency parts separately. As is well known, the second order statistics of natural images tend to be
invariant across different scales [13], [14]; therefore, the low frequency parts can be extracted from a much smaller downsampled
image. Third, the stronger correlations among adjacent image blocks will be captured in the downsampled images because every
four image blocks are merged into one block in the downsampled image. As a consequence, in this paper we propose an effective
approach to recover noisy imagery data by combining hybrid models and the multi-scale paradigm. In this way, we can construct
a Laplacian pyramid. In the practical implementation, we construct a three-level Laplacian pyramid. At the beginning, we have
the image I2 at scale 2 at hand, which is defined on the coarsest grid of pixels G2. This initial image lacks a fraction
of its samples. We start off by recovering the missing samples using the proposed IK-GLRR model, which has been
detailed in Section II-C, to get a more complete grid I2. This procedure can be performed iteratively by feeding the
processing results I2 to the GLRR model as a prior for computing the kernel distance κ. In the

Using the above progressive recovery based on intra-scale and inter-scale correlation, we gradually recover an image with few
artifacts. Note that the basic scheme we use is closely related to the work in [27]; however, we replace most of the components it
uses with application-specific ones described in the above section. The first contribution is that we explicitly take into
account the intrinsic manifold structure by making use of both labeled and unlabeled data points. This is especially useful for
the current impulse noise removal task, in which the number of clean samples is usually not enough to train a robust predictor
when the noise level is heavy. The large amounts of unlabeled data can be effectively utilized to extract information that is
useful for generalization. The second contribution is that we propose to use the model induced by the implicit kernel to exploit
the scale-invariance property of natural images, which is shown to be essential for visual perception. This model can effectively
learn and propagate statistical features across different scales to keep the local smoothness of images.
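As a loose illustration of the three-level pyramid only (not the authors' IK-GLRR/EK-GLRR code), a pyramid can be built and traversed coarse-to-fine with impyramid from the Image Processing Toolbox:

I0 = im2double(imread('cameraman.tif'));   % finest scale, stand-in for the observed image
I1 = impyramid(I0,'reduce');               % scale 1: half resolution
I2 = impyramid(I1,'reduce');               % scale 2: the coarsest grid G2, recovered first
U1 = impyramid(I2,'expand');               % expanded back up one scale to guide the recovery of I1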

EXPERIMENTAL RESULTS AND ANALYSIS


In this section, extensive experimental results are presented to demonstrate the superiority of the proposed algorithm on the task
of impulse noise removal. In the experiments, we test two cases, denoising only and joint denoising and deblurring, to show the
power of our method in handling different distortions. For these two cases, we test two kinds of impulse noise: salt-and-pepper
noise and random-valued noise. For thoroughness of our comparison study, we select seven widely used images in the literature
as test images, as illustrated in Fig. 4. The images are all of size 512 × 512. There are a few parameters involved in the
proposed algorithm: σ2 and ε2 are fixed to 0.5, and λ and γ are set to 0.5 and 0.01 respectively. For comprehensive comparison,
the proposed algorithm is compared with some state-of-the-art work in the literature. More specifically, four approaches are
included in our comparative study: (1) kernel regression (KR) based methods [22]; (2) the two-phase method proposed by Cai et al.
[7]; (3) the iterative framelet-based method (IFASDA) proposed by Li et al. [8]; (4) our method. The source code of our method
is available from.

Salt-and-Pepper Noise Removal

We first examine the performance comparison on restoring images contaminated by salt-and-pepper noise only. The test images are
corrupted by salt-and-pepper noise with high noise rates: 80%, 85%, 90%. For detecting salt-and-pepper noise, we use the AM
filter [5] with a maximum window size of 19. We quantify the objective performance of all methods by PSNR. Table I tabulates the
objective performance of the compared methods. It is clear that for all images our method gives the highest PSNR values among
the compared methods. The average PSNR gain is up to 0.53 dB compared with the second-best performing algorithm. For Lena, the
performance gain is 1.01 dB when the noise level is 85%. It is worth mentioning that KR can be regarded as a special case of our
method, which only performs single-scale estimation. KR fails to handle high noise levels, such as 90%. At noise levels 80% and
85%, our method works better than KR for all images except Barbara. In Barbara, there are many globally repeated textures with
regular directions. Since such regions are not piecewise stationary, the geometric duality across different scales does not exist.
Therefore, the multiscale framework does not work well in this case. In contrast, the single-scale and patch-based kernel
regression works better. For images with repetitive structures like Barbara, we can degenerate the proposed scheme to single-scale
to get better results. Given the fact that the human visual system (HVS) is the ultimate receiver of the restored images, we also
show the subjective comparison results. The recovered results for Lena, Peppers and Boat are illustrated in Fig. 5, corresponding
to 90% salt-and-pepper noise. The contents of these noisy images are barely visible. From the results, we can find that, under
high noise levels, the kernel regression methods generate some spurious high-frequency artifacts; Cai's method over-blurs the
results and cannot keep the edge structure well; while the IFASDA approach causes irregular outliers along edges and textures. It
can be clearly observed that the proposed algorithm achieves the best overall visual quality by combining the intra-scale and
inter-scale correlation: the image is sharper due to the local smoothness preservation when using inter-scale correlation, and
the edges are more consistent due to the exploitation of non-local self-similarity when using intra-scale correlation.

Random-Valued Impulse Noise Removal

We now consider the case where the test images are corrupted by random-valued impulse noise only. The random noise values are
identically and uniformly distributed in [dmin, dmax]; therefore, random-valued impulse noise is clearly more difficult to detect
than salt-and-pepper noise, and the task of random-valued noise removal is expected to be harder than salt-and-pepper noise
removal. Therefore, for random-valued impulse noise removal, we test three medium noise levels: 40%, 50% and 60%. In our
experiments, the noise is detected by ACWMF [14], which is successively performed four times with different parameters for one
image. The parameters are chosen to be the same as those in [23]. In Table II, we show the PSNR values when restoring the
corrupted images with random-valued impulse noise. As

shoulder in Lena, the edge along the pepper in Peppers, and the edges along the mast in Boat. It is easy to see that the edge
across the region with heavy noise cannot be well recovered with the other methods. This further demonstrates the power of the
proposed multi-scale impulse noise removal algorithm. The strength of the proposed progressive recovery approach comes from its
full utilization of the intra-scale and inter-scale correlations, which are neglected by the currently available single-scale
methods. Many large-scale structures can be well recovered based upon the progressively computed low-level results, which is
impossible for traditional single-level impulse noise removal algorithms. The recovered results correspond to 90% salt-and-pepper
noise and blurring. Our method produces the most visually pleasant results among all comparative studies. Even under blur and
impulse noise simultaneously, the proposed algorithm is still capable of restoring major edges and repetitive textures of the
images. It is noticed that the proposed method can more accurately recover global object contours, such as the edge along the

Random-Valued Noise Removal and Deblurring

We now consider the case where blurred images are corrupted by random-valued impulse noise. We again test three medium noise
levels: 40%, 50% and 60%. Table IV tabulates the objective quality comparison with respect to PSNR for the four tested methods.
From this table, we can see that at lower noise levels, such as 40%, for the test images Wheel and Man, the proposed method loses
slightly compared with IFASDA, but the average PSNR is still higher than that of IFASDA. At higher noise levels, such as 50% and
60%, the proposed method works the best among the compared methods for all test images. The average gain is up to 0.65 dB. This
demonstrates that our method can handle more difficult cases. Fig. 8 illustrates the subjective quality comparison for Lena,
Peppers and Boat when the noise level is 50%. It is noticed that the proposed method can more accurately recover the images. Both
the superior subjective and objective quality of the proposed algorithm convincingly demonstrate the potential of the proposed
hybrid graph Laplacian regularized regression for impulse noise removal.
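For reference, the PSNR figure of merit used throughout the tables can be computed as below; the "restored" image here is only a stand-in:

ref = im2double(imread('cameraman.tif'));          % clean reference image in [0, 1]
rec = imnoise(ref,'gaussian',0,0.001);             % stand-in for a restored image
mse = mean((rec(:) - ref(:)).^2);                  % mean squared error
psnr_dB = 10*log10(1/mse)                          % PSNR in dB for images scaled to [0, 1]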

Running Time vs. Performance Comparison


Another major issue that needs to be considered in image denoising is the computational complexity. Here we use random-valued
noise removal and deblurring as an example to show the practical processing time and performance comparison among the compared
methods. Table V gives the PSNR versus average processing time results on a typical computer (2.5 GHz Intel Dual Core, 3 GB
memory). All of the compared algorithms are run in Matlab R2012a. As depicted in Table V, the computational complexity of the
proposed method is higher than KR but lower than the two other state-of-the-art impulse noise removal algorithms, and it
achieves much better quality with respect to PSNR than the other methods.
V. CONCLUSION

In this paper, we have proposed a scheme for upsampling piecewise smooth signals and its extension to images by modelling
images as lines of piecewise smooth signals. We show that the method proposed improves classical linear reconstruction results
by making use of an additional non-linear reconstruction method based on FRI theory. The method is further improved by using a
self-learning approach which also makes use of FRI. The resulting algorithm outperforms state-of-the-art methods and does not
require the use of external datasets.

REFERENCES

[1] Y. V. Shkvarko, J. Tuxpan, and S. R. Santos, "l2 − l1 structured descriptive experiment design regularization based
enhancement of fractional SAR imagery," Signal Process., vol. 93, no. 12, pp. 3553–3566, Dec. 2013.
[2] T. M. Lillesand, R. W. Kiefer, and J. W. Chipman, Remote Sensing and Image Interpretation. Hoboken, NJ, USA: Wiley, 2004.
[3] A. Temizel and T. Vlachos, "Image resolution upscaling in the wavelet domain using directional cycle spinning," J. Electron.
Imaging, vol. 14, no. 4, p. 040501, 2005.
[4] M. Elad, M. A. T. Figueiredo, and Y. Ma, "On the role of sparse and redundant representations in image processing," Proc.
IEEE, vol. 98, no. 6, pp. 972–982, Jun. 2010.
[5] M. Unser, "Splines: A perfect fit for signal and image processing," IEEE Signal Process. Mag., vol. 16, no. 6, pp. 22–38,
Nov. 1999.
[6] K. Turkowski, "Filters for common resampling tasks," in Graphics Gems. New York, NY, USA: Academic, 1990.
[7] M. Protter, M. Elad, H. Takeda, and P. Milanfar, "Generalizing the nonlocal-means to super-resolution reconstruction," IEEE
Trans. Image Process., vol. 18, no. 1, pp. 36–51, Jan. 2009.
[8] G. Anbarjafari and H. Demirel, "Image super resolution based on interpolation of wavelet domain high frequency subbands and
the spatial domain input image," ETRI J., vol. 32, no. 3, pp. 390–394, 2010.
[9] A. Temizel and T. Vlachos, "Wavelet domain image resolution enhancement using cycle-spinning," Electron. Lett., vol. 41,
no. 3, pp. 119–121, Feb. 2005.
[10] H. Demirel and G. Anbarjafari, "Image resolution enhancement by using discrete and stationary wavelet decomposition," IEEE
Trans. Image Process., vol. 20, no. 5, pp. 1458–1460, May 2011.
[11] H. Demirel and G. Anbarjafari, "Discrete wavelet transform-based satellite image resolution enhancement," IEEE Trans.
Geosci. Remote Sens., vol. 49, no. 6, pp. 1997–2004, Jun. 2011.
[12] M. Iqbal, A. Ghafoor, and A. Siddiqui, "Satellite image resolution enhancement using dual-tree complex wavelet transform and
nonlocal means," IEEE Geosci. Remote Sens. Lett., vol. 10, no. 3, pp. 451–455, May 2013.
[13] [Online]. Available: http://sipi.usc.edu/database/
[14] [Online]. Available: http://www.jpl.nasa.gov/radar/sircxsar/
[15] S. Mallat and G. Yu, "Super-resolution with sparse mixing estimators," IEEE Trans. Image Process., vol. 19, no. 11,
pp. 2889–2900, Nov. 2010.
[16] L. Feng, C. Y. Suen, Y. Y. Tang, and L. H. Yang, "Edge extraction of images by reconstruction using wavelet decomposition
details at different resolution levels," Int. J. Pattern Recog. Artif. Intell., vol. 14, no. 6, pp. 779–793, Sep. 2000.
[17] Z. Wang and A. Bovik, "Mean squared error: Love it or leave it? A new look at signal fidelity measures," IEEE Signal
Process. Mag., vol. 26, no. 1, pp. 98–117, Jan. 2009.
