
AUTOMATIC MAIL SORTING MACHINE (AMSM)

By
Sami ur Rehman
2006-NUST-BEE-146
Sarosh Khan
2006-NUST-BEE-147
Waqas Siddique
2006-NUST-BEE-162

Project Report in fulfillment of


the requirements for the award of
Bachelor of Engineering (Electronics Engineering) degree

In

School of Electrical Engineering and Computer Science (SEECS)


National University of Sciences and Technology (NUST)
H-12 Islamabad, Pakistan
Final Year Project Report
Spring `09

CERTIFICATE

This is to certify that, the work contained in the report entitled


“Automatic Mail Sorting Machine”
was carried out by Mr. Waqas Siddique, Mr. Sami ur Rehman, and Mr. Sarosh Khan
of the Faculty of Electrical Engineering, under my supervision, and that in my opinion it is
fully adequate, in scope and quality, for the degree of B.S. in their respective faculties.

Advisor: ___________________
Mr. Raheel Querashi

Co-Advisor: _________________
Dr. Rehan Hafiz
Acknowledgment

All praise to Almighty Allah, Who bestowed us with knowledge and enabled us to complete this
project work. We present our humble respect to the last and final Prophet Muhammad
(peace be upon him), whose life is a perfect model for all mankind.

We are greatly thankful to our advisor Mr. Raheel Querashi and Dr. Rehan Hafiz for their effort in the
completion of this project work. Their inspiring guidance, dynamic supervision and constructive
criticism helped us to accomplish the task fairly. We would also like to thank all our teachers whose
valuable knowledge, assistance, cooperation and guidance enabled us to take initiative and to
develop and furnish our academic careers.


ABSTRACT

Our project aims to develop an "Automatic Mail Sorting Machine" (AMSM), a promising replacement
for the labor-intensive and time-consuming job of manual mail sorting, thereby bringing efficiency to
the mail service. On the software side we implemented Address Block Location (ABL) and OCR
technologies to identify the city name written in the address block, since this is the parameter on
which automated sorting of the mail is performed. Using feature extraction, template matching and
edge detection algorithms as the underlying concepts, we successfully locate the address block on
the letter, extract the city name written on it, apply the OCR system and pass the output to the
SIMULINK block. The SIMULINK block, using the SIMULINK card installed in the CPU, then drives the
plunger mechanism mounted on the conveyer belt to throw the letters into the required destination
bins. All the software modules are written and integrated in MATLAB. A speed-controlled DC motor
continuously drives the conveyer belt. A webcam feeds the image to the software, which processes
it and generates output through SIMULINK for the plunger motors to throw the letters on the belt
into the destination bins.
CHAPTER # 1
INTRODUCTION

HISTORY OF AUTOMATED MAIL SORTING


During most of the 20th century, mail was sorted by hand using what is called the “pigeon-hole message
box” method: addresses were read and letters manually slotted into specific compartments. While early
forms of a mechanical mail sorter were developed and tested in the 1920s, the first sorting machine
was put into operation in the 1950s.

In 1965, the Postal Service put the first high-speed optical character reader (OCR) into operation
that could handle a preliminary sort automatically. And in 1982, the first computer-driven single-line
optical character reader was employed – which reads the mailpiece destination address then prints
a barcode on the envelope that could be used to automate mail sorting from start to finish.

Such automated mail services are available in the post offices of advanced countries, but the concept is
quite new in a country like Pakistan. We designed and developed this system to present a model
of an automated mail sorting machine which could be a stunning replacement for the tedious and
time-consuming job of manual letter sorting.

BASIC MODULES OF OUR ARCHITECTURE


Our system is equipped with a computer database system, input peripheral devices, user input
devices, a webcam and a plunger mechanism. The computer database system processes the data
generated from the input peripheral devices and generates a sorted database output according to
the user-selected sorting option. The mail or package is delivered to the appropriate destination
following the sorted database output.

Using feature extraction, template matching and edge detection algorithms as the underlying
concepts, we successfully locate the address block on the letter, extract the city name written on it,
apply the OCR system and pass the output to the SIMULINK block. The SIMULINK block, using the
SIMULINK card installed in the CPU, then drives the plunger mechanism mounted on the conveyer
belt to throw the letters into the required destination bins. All the software modules are written and
integrated in MATLAB. A speed-controlled DC motor continuously drives the conveyer belt. A webcam
feeds the image to the software, which processes it and generates output through SIMULINK for the
plunger motors to throw the letters on the belt into the destination bins.

Block Diagram of our System

IMAGE ACQUISITION
The data processing part of the system starts with image capture, which has to be done by a
fast and efficient camera. The camera most suitable for OCR was selected and finalized.
The camera has its own software for getting images from the camera memory and loading them into
the processing unit's memory (which in our case is the PC), but we perform this job using MATLAB.
The output of this stage is an image to be processed by the OCR part.

PREPROCESSING STAGE
Now the image may have various irrelevant data on it, e.g. advertisements and unnecessary
handwritten information. Even within the address itself, the complete address of the recipient is
written, but we only want to know the city where the letter is to be delivered. To separate the
irrelevant data from the relevant, we need a preprocessing stage which we call the Address Block
Locator (ABL). This is the second software module in our system, and it has to communicate with
the camera software.
OPTICAL CHARACTER RECOGNITION


Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic
translation of scanned images of handwritten, typewritten or printed text into machine-encoded
text. It is widely used to convert books and documents into electronic files, to computerize a record-
keeping system in an office, or to publish the text on a website. OCR makes it possible to edit the
text, search for a word or phrase, store it more compactly, display or print a copy free of scanning
artifacts, and apply techniques such as machine translation, text-to-speech etc. OCR is a field of
research in pattern recognition, artificial intelligence and computer vision.

In our case we wrote the code of the OCR in MATLAB, which successfully translated the text
written on the image into editable text for further processing.

OCR

Hardware Infrastructure

Our hardware consists primarily of a 1.5-meter conveyer belt, upon which are mounted the plungers
that throw the letters into their required destination bins, and a webcam which acquires the image of
the letter to be sorted and sends it to the computer for further processing. Our aim was to learn new
things during this project, and we incorporated various new ideas into it. The webcam
takes images of the still letter and then passes them to the computer for further processing.
CHAPTER # 2

LITERATURE REVIEW
RGB Image:
An RGB image has three channels: red, green, and blue. RGB channels roughly follow the color receptors in
the human eye, and are used in computer displays and image scanners.

If the RGB image is 24-bit (the industry standard as of 2005), each channel has 8 bits for red, green,
and blue; in other words, the image is composed of three images (one for each channel), where
each image can store discrete pixels with conventional brightness intensities between 0 and 255. If
the RGB image is 48-bit (very high color depth), each channel is made of 16-bit images.

RGB image from the perspective of MATLAB:


An RGB image, sometimes referred to as a true color image, is stored in MATLAB as an m-by-n-by-3
data array that defines red, green, and blue color components for each individual pixel. RGB images
do not use a palette. The color of each pixel is determined by the combination of the red, green, and
blue intensities stored in each color plane at the pixel's location. Graphics file formats store RGB
images as 24-bit images, where the red, green, and blue components are 8 bits each. This yields a
potential of 16 million colors. The precision with which a real-life image can be replicated has led to
the commonly used term truecolor image.

An RGB array can be of class double, uint8, or uint16. In an RGB array of class double, each color
component is a value between 0 and 1. A pixel whose color components are (0,0,0) is displayed as
black, and a pixel whose color components are (1,1,1) is displayed as white. The three color
components for each pixel are stored along the third dimension of the data array. For example, the
red, green, and blue color components of the pixel (10,5) are stored in RGB(10,5,1), RGB(10,5,2),
and RGB(10,5,3), respectively.

The following figure depicts an RGB image of class double.

To determine the color of the pixel at (2,3), you would look at the RGB triplet stored in (2,3,1:3).
Suppose (2,3,1) contains the value 0.5176, (2,3,2) contains 0.1608, and (2,3,3) contains 0.0627. The
color of the pixel at (2,3) is then (0.5176, 0.1608, 0.0627).

Illustration of RGB in MATLAB


To further illustrate the concept of the three separate color planes used in an RGB image, the code
sample below creates a simple RGB image containing uninterrupted areas of red, green, and blue,
and then creates one image for each of its separate color planes (red, green, and blue). It displays
each color plane image separately, and also displays the original image.

% Build a 64x64x3 truecolor image whose rows follow the jet colormap
RGB = reshape(ones(64,1)*reshape(jet(64),1,192),[64,64,3]);

% Extract the three separate color planes
R = RGB(:,:,1);
G = RGB(:,:,2);
B = RGB(:,:,3);

% Display each color plane and then the original image
imshow(R)
figure, imshow(G)
figure, imshow(B)
figure, imshow(RGB)
Grayscale Image
A grayscale digital image is an image in which the value of each pixel is a single sample, that is, it
carries only intensity information. Images of this sort, also known as black-and-white, are composed
exclusively of shades of gray, varying from black at the weakest intensity to white at the strongest.
Grayscale images are distinct from one-bit black-and-white images, which in the context of
computer imaging are images with only two colors, black and white; grayscale images have
many shades of gray in between. Grayscale images are also called monochromatic, denoting the
absence of any chromatic variation.

Numerical representations:
The intensity of a pixel is expressed within a given range between a minimum and a maximum,
inclusive. This range is represented in an abstract way as a range from 0 (total absence, black) to 1
(total presence, white), with any fractional values in between.

Another convention is to employ percentages, so the scale then runs from 0% to 100%. This is used for
a more intuitive approach, but if only integer values are used, the range encompasses a total of only
101 intensities, which is insufficient to represent a broad gradient of grays. The percentage
notation is also used in printing to denote how much ink is employed in halftoning, but there the scale is
reversed, with 0% being the paper white (no ink) and 100% a solid black (full ink).

Converting color (RGB) to grayscale:


Conversion of a color image to grayscale is not unique; different weightings of the color channels
effectively represent the effect of shooting black-and-white film with different-colored
photographic filters on the camera. A common strategy is to match the luminance of the grayscale image
to the luminance of the color image.

To convert any color to a grayscale representation of its luminance, first one must obtain the values
of its red, green, and blue (RGB) primaries in linear intensity encoding, by gamma expansion. Then,
add together 30% of the red value, 59% of the green value, and 11% of the blue value (these
weights depend on the exact choice of the RGB primaries, but are typical). The formula (11*R +
16*G + 5*B) /32 is also popular since it can be efficiently implemented using only integer operations.
Regardless of the scale employed (0.0 to 1.0, 0 to 255, 0% to 100%, etc.), the resultant number is the
desired linear luminance value; it typically needs to be gamma compressed to get back to a
conventional grayscale representation.
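As a quick illustration of the weighting described above, here is a minimal MATLAB sketch; 'peppers.png' is a sample image shipped with MATLAB and is used purely as an example, and the Image Processing Toolbox function rgb2gray performs an equivalent conversion.

% Weighted RGB-to-grayscale conversion using the 0.30/0.59/0.11 weights described above
rgb  = im2double(imread('peppers.png'));
gray = 0.30*rgb(:,:,1) + 0.59*rgb(:,:,2) + 0.11*rgb(:,:,3);

% Integer-only variant of the same idea, (11*R + 16*G + 5*B)/32, on uint8 data
rgb8  = imread('peppers.png');
gray8 = uint8((11*uint16(rgb8(:,:,1)) + 16*uint16(rgb8(:,:,2)) + 5*uint16(rgb8(:,:,3)))/32);

imshow(gray), figure, imshow(gray8)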

Here is an example of color channel splitting of a full RGB color image. The column at left shows the
isolated color channels in natural colors, while at right are their grayscale equivalents:
Binary Image:
A binary image is a digital image that has only two possible values for each pixel. Typically the two
colors used for a binary image are black and white, though any two colors can be used. The color
used for the object(s) in the image is the foreground color, while the rest of the image is the
background color.

Binary images are also called bi-level or two-level. This means that each pixel is stored as a single bit
(0 or 1). The names black-and-white, B&W, monochrome or monochromatic are often used for this
concept, but may also designate any image that has only one sample per pixel, such as a grayscale image.

YUV Image:
YUV is a color space typically used as part of a color image pipeline. It encodes a color image or video
taking human perception into account, allowing reduced bandwidth for the chrominance components and thereby
typically enabling transmission errors or compression artifacts to be masked more efficiently by human
perception than with a "direct" RGB representation.

The term YUV is commonly used in the computer industry to describe file formats that are encoded using
YCbCr.

The Y'UV model defines a color space in terms of one luma (Y') and two chrominance (UV) components.

Converting between YUV and RGB:
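The exact constants vary between Y'UV variants; the MATLAB sketch below uses the commonly quoted BT.601-based matrices as an assumed example (in our system the Image Acquisition Toolbox performs this conversion for us when the returned color space is set to RGB, as in the appendix code).

% Assumed BT.601-style Y'UV conversion matrices (illustrative values, not from this project)
yuv_from_rgb = [ 0.299    0.587    0.114;
                -0.14713 -0.28886  0.436;
                 0.615   -0.51499 -0.10001];
rgb_from_yuv = [ 1.0      0.0      1.13983;
                 1.0     -0.39465 -0.58060;
                 1.0      2.03211  0.0    ];

rgb = [0.5; 0.2; 0.8];          % one example pixel, values in [0,1]
yuv = yuv_from_rgb * rgb;       % RGB -> Y'UV
rgb_back = rgb_from_yuv * yuv;  % Y'UV -> RGB, recovers the pixel up to rounding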

Edges in image Processing:


Edges are significant local changes of intensity in an image.

Edges typically occur on the boundary between two different regions in an image.

What is Edge Detection?


Edge detection is a terminology in image processing and computer vision, particularly in the areas
of feature detection and feature extraction, to refer to algorithms which aim at identifying points in
a digital image at which the image brightness changes sharply or more formally has discontinuities.

The purpose of detecting sharp changes in image brightness is to capture important events and
changes in properties of the world. It can be shown that under rather general assumptions for an
image formation model, discontinuities in image brightness are likely to correspond to:

 discontinuities in depth
 discontinuities in surface orientation
 changes in material properties
 variations in scene illumination

In the ideal case, the result of applying an edge detector to an image may lead to a set of connected
curves that indicate the boundaries of objects, the boundaries of surface markings, as well as curves
that correspond to discontinuities in surface orientation. Thus, applying an edge detector to an
image may significantly reduce the amount of data to be processed and may therefore filter out
information that may be regarded as less relevant, while preserving the important structural
properties of an image. If the edge detection step is successful, the subsequent task of interpreting
the information contents in the original image may therefore be substantially simplified.
Unfortunately, however, it is not always possible to obtain such ideal edges from real-life images of
moderate complexity. Edges extracted from non-trivial images are often hampered by
fragmentation, meaning that the edge curves are not connected, missing edge segments, as well
as false edges not corresponding to interesting phenomena in the image, thus complicating the
subsequent task of interpreting the image data.
Edge properties:
The edges extracted from a two-dimensional image of a three-dimensional scene can be classified as
either viewpoint dependent or viewpoint independent. A viewpoint independent edge typically
reflects inherent properties of the three-dimensional objects, such as surface markings and surface
shape. A viewpoint dependent edge may change as the viewpoint changes, and typically reflects the
geometry of the scene, such as objects occluding one another.

A typical edge might for instance be the border between a block of red color and a block of yellow.
In contrast, a line can be a small number of pixels of a different color on an otherwise unchanging
background. For a line, there may therefore usually be one edge on each side of the line.

Edges play quite an important role in many applications of image processing, in particular
for machine vision systems that analyze scenes of man-made objects under controlled illumination
conditions.

A simple edge model:


Although certain literature has considered the detection of ideal step edges, the edges obtained
from natural images are usually not at all ideal step edges. Instead they are normally affected by
one or several of the following effects:

 Focal blur caused by a finite depth-of-field and finite point spread function.
 Penumbral blur caused by shadows created by light sources of non-zero radius.
 Shading at a smooth object

A one-dimensional image f which has exactly one edge placed at x = 0 may be modeled as:

f(x) = (I_r − I_l)/2 · ( erf( x / (√2·σ) ) + 1 ) + I_l

At the left side of the edge, the intensity is I_l = lim (x→−∞) f(x), and right of the edge it is
I_r = lim (x→+∞) f(x). The scale parameter σ is called the blur scale of the edge.


Why edge detection is a non-trivial task?
To illustrate why edge detection is not a trivial task, let us consider the problem of detecting edges
in the following one-dimensional signal. Here, we may intuitively say that there should be an edge
between the 4th and 5th pixels.

5 7 6 4 152 148 149

If the intensity difference were smaller between the 4th and the 5th pixels and if the intensity
differences between the adjacent neighboring pixels were higher, it would not be as easy to say that
there should be an edge in the corresponding region. Moreover, one could argue that this case is
one in which there are several edges.

5 7 6 41 113 148 149

Hence, to firmly state a specific threshold on how large the intensity change between two
neighboring pixels must be for us to say that there should be an edge between these pixels is not
always a simple problem. Indeed, this is one of the reasons why edge detection may be a non-trivial
problem unless the objects in the scene are particularly simple and the illumination conditions can
be well controlled.
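The point can be made concrete with a few lines of MATLAB on the two signals above; the threshold of 20 is an arbitrary choice used only for illustration.

% First differences of the two 1-D signals discussed above
s1 = [5 7 6 4 152 148 149];
s2 = [5 7 6 41 113 148 149];
d1 = diff(s1);            % [ 2 -1 -2 148  -4  1]  -> one obvious jump
d2 = diff(s2);            % [ 2 -1 35  72  35  1]  -> where does the edge start?
edges1 = abs(d1) > 20     % flags a single edge between the 4th and 5th pixels
edges2 = abs(d2) > 20     % flags three candidate edges in the second signal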

Goal of edge detection:


Produce a line drawing of a scene from an image of that scene.
Important features can be extracted from the edges of an image (e.g., corners, lines, curves).
These features are used by higher-level computer vision algorithms (e.g., recognition).

What causes intensity changes?


Various physical events cause intensity changes.

Geometric events:
 object boundary (discontinuity in depth and/or surface color and texture)
 surface boundary (discontinuity in surface orientation and/or surface color and texture)

Non-geometric events:
 specularity (direct reflection of light, such as a mirror)
 shadows (from other objects or from the same object)
 inter-reflections

Edge descriptors
Edge normal: unit vector in the direction of maximum intensity change.
Edge direction: unit vector perpendicular to the edge normal.
Edge position or center: the image position at which the edge is located.
Edge strength: related to the local image contrast along the normal.

Modeling intensity changes:


Edges can be modeled according to their intensity profiles.

Step edge: the image intensity abruptly changes from one value on one side of the discontinuity to a
different value on the opposite side.

Ramp edge: a step edge where the intensity change is not instantaneous but occurs over a finite distance.

Ridge edge: the image intensity abruptly changes value but then returns to the starting value within
some short distance (usually generated by lines).

Roof edge: a ridge edge where the intensity change is not instantaneous but occurs over a finite
distance (usually generated by the intersection of surfaces).


The four steps of edge detection:
(1) Smoothing: Suppress as much noise as possible, without destroying the true edges.

(2) Enhancement: apply a filter to enhance the quality of the edges in the image (sharpening).

(3) Detection: determine which edge pixels should be discarded as noise and which should be
retained (usually, thresholding provides the criterion used for detection).

(4) Localization: determine the exact location of an edge (sub-pixel resolution might be required for
some applications, that is, estimate the location of an edge to better than the spacing between
pixels). Edge thinning and linking are usually required in this step.

Edge detection using derivatives:


Calculus describes changes of continuous functions using derivatives. An image is a 2D function, so
operators describing edges are expressed using partial derivatives.

- Points which lie on an edge can be detected by:

(1) Detecting local maxima or minima of the first derivative

(2) Detecting the zero-crossing of the second derivative

Definition of the gradient

- The gradient is a vector which has a certain magnitude and direction:

∇f = [ ∂f/∂x , ∂f/∂y ],   |∇f| = sqrt( (∂f/∂x)² + (∂f/∂y)² ),   direction = atan2( ∂f/∂y , ∂f/∂x )

To save computation, the magnitude of the gradient is usually approximated by:

|∇f| ≈ |∂f/∂x| + |∂f/∂y|

Properties of the gradient

The magnitude of gradient provides information about the strength of the edge. The direction of
gradient is always perpendicular to the direction of the edge (the edge direction is rotated with
respect to the gradient direction by -90 degrees).

Estimating the gradient with finite differences

The gradient can be approximated by finite differences:

∂f/∂x ≈ f(x+1, y) − f(x, y),   ∂f/∂y ≈ f(x, y+1) − f(x, y)

Using pixel-coordinate notation (remember: j corresponds to the x direction and i to the negative y direction):

∂f/∂x ≈ f(i, j+1) − f(i, j),   ∂f/∂y ≈ f(i, j) − f(i+1, j)
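A minimal MATLAB sketch of these finite-difference estimates, again using the sample image 'peppers.png' purely as a stand-in for a letter image:

% Forward differences along x (columns) and y (rows), then the two magnitude forms
f  = im2double(rgb2gray(imread('peppers.png')));
gx = f(:, 2:end) - f(:, 1:end-1);          % df/dx
gy = f(2:end, :) - f(1:end-1, :);          % df/dy
gx = gx(1:end-1, :);  gy = gy(:, 1:end-1); % trim to a common size
mag      = sqrt(gx.^2 + gy.^2);            % gradient magnitude
mag_fast = abs(gx) + abs(gy);              % cheaper |Gx| + |Gy| approximation
imshow(mag, [])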

Standard deviation
The standard deviation of a statistical population, a data set, or a probability distribution is the square
root of its variance. Standard deviation is a widely used measure of variability or dispersion,
being algebraically more tractable though practically less robust than the average absolute deviation.

It shows how much variation there is from the "average" (mean) (or expected/ budgeted value). A
low standard deviation indicates that the data points tend to be very close to the mean, whereas
high standard deviation indicates that the data are spread out over a large range of values.

For example, the average height for adult men in Pakistan is about 70 inches (178 cm), with a standard
deviation of around 3 in (8 cm). This means that most men (about 68 percent, assuming a normal
distribution) have a height within 3 in (8 cm) of the mean (67–73 in (170–185 cm)) – one standard
deviation, whereas almost all men (about 95%) have a height within 6 in (15 cm) of the mean (64–76
in (163–193 cm)) – 2 standard deviations. If the standard deviation were zero, then all men would
be exactly 70 in (178 cm) high. If the standard deviation were 20 in (51 cm), then men would have
much more variable heights, with a typical range of about 50 to 90 in (127 to 229 cm). Three
standard deviations account for 99.7% of the sample population being studied, assuming the
distribution is normal (bell-shaped).
OPTICAL CHARACTER RECOGNITION MODULE

WHAT IS OCR?

Full form of OCR is Optical Character Recognition. It is a computer program designed to convert
scanned or digital images of handwritten or typewritten text into machine-editable text, or to
translate pictures of characters into a standard encoding scheme representing them (e.g. ASCII or
Unicode). OCR began as a field of research in pattern recognition, artificial intelligence and machine
vision. Though academic research in the field continues, the focus on OCR has shifted to
implementation of proven techniques.

OCR BACKGROUND

Developing a proprietary OCR system is a complicated task and requires a lot of effort. Such systems
are usually quite complicated and can hide a lot of logic behind the code. The use of an artificial neural
network in OCR applications can dramatically simplify the code and improve the quality of recognition
while achieving good performance. Another benefit of using a neural network in OCR is the extensibility of
the system, i.e. the ability to recognize more character sets than initially defined. Most traditional OCR
systems are not extensible enough, because a task such as working with tens of thousands of
Chinese characters, for example, is not as easy as working with the 68-character English typed character set
and can easily bring a traditional system to its knees.

MODULES OF OCR SYSTEM

OCR systems consist of five major stages

1. Pre-processing
2. Segmentation
3. Feature Extraction
4. Classification
5. Post-processing

1-Pre-processing

The raw data is subjected to a number of preliminary processing steps to make it usable in
the descriptive stages of character analysis. Pre-processing aims to produce data that are
easy for the OCR system to process accurately. The main objectives of pre-processing are:

• Binarization
• Noise reduction
• Stroke width normalization
• Skew correction
• Slant removal

Binarization

Document image binarization (thresholding) refers to the conversion of a gray-scale image
into a binary image. There are two categories of thresholding:

Global, which uses a single threshold value for the entire image (e.g. Otsu's method)

Adaptive (local), which uses different values for each pixel according to the local area information
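A short MATLAB sketch of both categories follows; the global branch (Otsu threshold via graythresh and im2bw) is the approach used by the code in the appendix, while the adaptive branch, thresholding against a local mean with an ad hoc offset, is included only to illustrate the idea. The file name 'letter.jpg' is a placeholder.

% Global vs. adaptive binarization
g = rgb2gray(imread('letter.jpg'));                 % placeholder input image

level     = graythresh(g);                          % Otsu's method, value in [0,1]
bw_global = im2bw(g, level);                        % one threshold for the whole page

localMean   = imfilter(im2double(g), fspecial('average', 25), 'replicate');
bw_adaptive = im2double(g) > (localMean - 0.05);    % threshold follows the local background

figure, imshow(bw_global), figure, imshow(bw_adaptive)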

Noise Reduction

Noise reduction improves the quality of the document. Two main approaches:

• Filtering (masks)
• Morphological Operations (erosion, dilation, etc)

Normalization and Thinning

Normalization provides a tremendous reduction in data size, while thinning extracts the shape
information of the characters.

Skew Correction

Skew Correction methods are used to align the paper document with the coordinate system of the
scanner. Main approaches for skew detection include correlation, projection profiles, Hough
transform.
Slant Removal

The slant of handwritten text varies from user to user. Slant removal methods are used to
normalize all characters to a standard form.

Popular deslanting techniques are:

• Bozinovic–Srihari Method (BSM)
• Calculation of the average angle of near-vertical elements

2-Segmentation

Segmentation implies segmenting the characters within the text:


Two approaches are commonly used for this purpose:

 Explicit Segmentation

In explicit approaches one tries to identify the smallest possible word segments (primitive segments)
that may be smaller than letters, but surely cannot be segmented further. Later in the recognition
process these primitive segments are assembled into letters based on input from the character
recognizer. The advantage of the first strategy is that it is robust and quite straightforward, but is
not very flexible.

 Implicit Segmentation

In implicit approaches the words are recognized entirely without segmenting them into letters. This
is most effective and viable only when the set of possible words is small and known in advance, such
as in the recognition of bank checks and postal addresses.

3-Feature Extraction

In the feature extraction stage each character is represented as a feature vector, which becomes its
identity. The major goal of feature extraction is to extract a set of features which maximizes the
recognition rate with the least amount of elements.

Due to the nature of handwriting, with its high degree of variability and imprecision, obtaining
these features is a difficult task. Feature extraction methods are based on three types of features:

 Statistical
 Structural
 Global transformations and moments
Statistical Features

Representation of a character image by statistical distribution of points takes care of style variations
to some extent.

The major statistical features used for character representation are:

• Zoning
• Projections and profiles
• Crossings and distances

Zoning

The character image is divided into NxM zones. From each zone features are extracted to form the
feature vector. The goal of zoning is to obtain the local characteristics instead of global
characteristics.

Zoning – Density Features

The number of foreground pixels, or the normalized number of foreground pixels, in each cell is
considered a feature.

Projection Histograms

The basic idea behind using projections is that character images, which are 2-D signals, can
be represented as 1-D signals. These features, although independent of noise and
deformation, depend on rotation.
Projection histograms count the number of pixels in each column and row of a character
image. Projection histograms can separate characters such as “m” and “n”.
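A small MATLAB sketch of these projections, assuming charImg holds one segmented binary character (foreground = 1), for example the variable n1 extracted in the appendix code:

% Projection histograms of a binary character image
colHist = sum(charImg, 1);    % vertical projection: foreground pixels per column
rowHist = sum(charImg, 2);    % horizontal projection: foreground pixels per row
figure, bar(colHist),  title('Column projection')
figure, barh(rowHist), title('Row projection')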

Profiles

The profile counts the number of pixels (distance) between the bounding box of the character image
and the edge of the character. The profiles describe well the external shapes of characters and allow
distinguishing between a great number of letters, such as “p” and “q”.

Structural Features

Three types of features:

Horizontal and vertical projection histograms
Radial histogram
Radial out-in and radial in-out profiles

Feature Extraction

 Two types of features:

Features based on zones:

 The character image is divided into horizontal and vertical zones and the
density of character pixels is calculated for each zone.

Features based on character projection profiles:

 The centre of mass of the image is first found.

 Upper/lower profiles are computed by considering, for each image column, the distance
between the horizontal line through the centre of mass and the pixel closest to the
upper/lower boundary of the character image. This results in two zones, depending on
whether a pixel lies above or below that horizontal line. Both zones are then divided into
vertical blocks, and for every block we calculate the area of the upper/lower character profile.

 Similarly, we extract the features based on left/right profiles.

MOTION BLUR REMOVAL


A blurred image has an associated Point Spread Function (PSF), the mathematical function responsible for
the distortion of the image itself. There are various algorithms and filters used for removing blur
from an image, but all of these assume that the PSF is already known; they simply deconvolve the
blurred image with the PSF to recover the original image.

DEBLURRING FUNCTIONS

Wiener Filter (deconvwnr)

Implements a least squares solution. You should provide some information about the noise to
reduce possible noise amplification during deblurring.

Regularized Filter (deconvreg)

Implements a constrained least squares solution, where you can place constraints on the output
image (the smoothness requirement is the default). You should provide some information about the
noise to reduce possible noise amplification during deblurring. See Deblurring with a Regularized
Filter for more information.

Lucy-Richardson Algorithm (deconvlucy)

Implements an accelerated, damped Lucy-Richardson algorithm. This function performs multiple
iterations, using optimization techniques and Poisson statistics. You do not need to provide
information about the additive noise in the corrupted image. See Deblurring with the
Lucy-Richardson Algorithm for more information.

Blind Deconvolution Algorithm (deconvblind)

Implements the blind deconvolution algorithm, which performs deblurring without knowledge of
the PSF. You pass your initial guess at the PSF as an argument. The deconvblind function returns a
restored PSF in addition to the restored image. The implementation uses the same damping and
iterative model as the deconvlucy function. See Deblurring with the Blind Deconvolution Algorithm
for more information.
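The following MATLAB sketch shows typical calls to the four functions listed above, assuming a guessed linear-motion PSF; in the AMSM case the true PSF was unknown, so these calls are purely illustrative and 'letter.jpg' is a placeholder file name.

% Simulate a motion blur, then try each deblurring function
I       = im2double(rgb2gray(imread('letter.jpg')));
PSF     = fspecial('motion', 15, 0);                  % assumed 15-pixel horizontal motion blur
blurred = imfilter(I, PSF, 'conv', 'circular');

J1 = deconvwnr(blurred, PSF, 0.01);                   % Wiener filter, estimated noise-to-signal ratio
J2 = deconvreg(blurred, PSF);                         % regularized (constrained least squares) filter
J3 = deconvlucy(blurred, PSF, 10);                    % Lucy-Richardson, 10 iterations
[J4, PSFest] = deconvblind(blurred, ones(size(PSF))); % blind: only an initial PSF guess is supplied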

SOBEL’S EDGE DETECTION ALGORITHM

There are many ways to perform edge detection. However, the majority of different methods may
be grouped into two categories, gradient and Laplacian. The gradient method detects the edges by
looking for the maximum and minimum in the first derivative of the image. The Laplacian method
searches for zero crossings in the second derivative of the image to find edges. An edge has the one-
dimensional shape of a ramp and calculating the derivative of the image can highlight its location.
Suppose we have the following signal, with an edge shown by the jump in intensity below:
If we take the gradient of this signal (which, in one dimension, is just the first derivative with respect
to t) we get the following:

Based on this one-dimensional analysis, the theory can be carried over to two-dimensions as long as
there is an accurate approximation to calculate the derivative of a two-dimensional image. The
Sobel operator performs a 2-D spatial gradient measurement on an image. Typically it is used to find
the approximate absolute gradient magnitude at each point in an input grayscale image. The Sobel
edge detector uses a pair of 3x3 convolution masks, one estimating the gradient in the x-direction
(columns) and the other estimating the gradient in the y-direction (rows). A convolution mask is
usually much smaller than the actual image. As a result, the mask is slid over the image,
manipulating a square of pixels at a time. The actual Sobel masks are shown below:

Gx = [ -1  0  +1          Gy = [ +1  +2  +1
       -2  0  +2                  0   0   0
       -1  0  +1 ]               -1  -2  -1 ]

The magnitude of the gradient is then calculated using the formula:

|G| = sqrt(Gx^2 + Gy^2)

An approximate magnitude can be calculated using:

|G| = |Gx| + |Gy|


SOBEL EXPLANATION

The mask is slid over an area of the input image, changes that pixel's value, and then shifts one pixel
to the right, continuing until it reaches the end of the row; it then starts at the beginning of the next
row. The example below shows the mask being slid over the top-left portion of the input image,
represented by the green outline. The formula shows how a particular pixel in the output image
would be calculated. The center of the mask is placed over the pixel being manipulated in the image,
and the i and j values are used to move the file pointer so that, for example, pixel (a22) can be
multiplied by the corresponding mask value (m22). It is important to notice that pixels in the first and
last rows, as well as the first and last columns, cannot be manipulated by a 3x3 mask, because when
the center of the mask is placed over a pixel in the first row (for example), the mask extends outside
the image boundaries.
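The whole procedure can be sketched in a few lines of MATLAB; the threshold value is an arbitrary example and 'letter.jpg' is a placeholder file name. The built-in edge(I,'sobel') wraps the same idea.

% Sobel edge detection: convolve with the two masks, combine, threshold
I  = im2double(rgb2gray(imread('letter.jpg')));
Sx = [-1 0 1; -2 0 2; -1 0 1];     % horizontal (x) gradient mask
Sy = [ 1 2 1;  0 0 0; -1 -2 -1];   % vertical (y) gradient mask

Gx = conv2(I, Sx, 'same');
Gy = conv2(I, Sy, 'same');

G     = sqrt(Gx.^2 + Gy.^2);       % gradient magnitude
Gfast = abs(Gx) + abs(Gy);         % approximate magnitude |Gx| + |Gy|

edges = G > 0.5;                   % ad hoc threshold, tuned per image
imshow(edges)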
CHAPTER # 3

SYSTEM DESIGN AND


ARCHITECTURE

SOFTWARE PART
Image Acquisition
We are using a DANY-made 1.3-megapixel webcam to acquire the image of the letter moving along
the conveyer belt for further processing.

Specifications of the camera:

Sensor: 1/6" CMOS (OV7670)

Hardware Resolution: 300K(640H*480V)

Software Resolution: 1.3 Megapixels

Pixel Point: 3.6um*3.6um

Image Format: VGA

Data Format: YUV, RGB

Operating Port: USB 2.0 and down to USB1.1

Max. Frame Rate: 30fps (VGA)

Min. Low Photo Light: 6Lux

S/N Ratio: >46 dB

Definition Level: >300TV Line(middle level)

Lens: 5P F1.1.8/f2.95

Focus Range: 2cm to infinity

Visual Angle: 60°

Photo Control: Saturation, Compare, Sharp

White Balance: Automatic, Manual

Automatic White Balance: 2600°K~5000°K

Exposure: Automatic, Manual

Image/Video: Automatic, Manual

Storage Temperature: -40° to 95°

Operating Temperature: 30° to 70°


Interfacing Camera with PC:
Interfacing the webcam with MATLAB is the first step in the software module of the system. The
main commands used for this purpose are as follows.

The imaqhwinfo command lists the installed adaptors; we are using 'winvideo' for this purpose.
When 'winvideo' is fed to imaqhwinfo we get the information about that adaptor and its devices, and
the commands shown below bring up the camera interface.

Collecting the best image frame:
We developed an algorithm based on the pixel sum: it calculates the sum of all the pixels of the
binary image, which is easy to do in MATLAB for our 640 x 480 binary image.

Frames are continuously taken by the webcam, and for each one the sum of all the pixels
(white = 1 and black = 0) is calculated. This is done by taking the sum twice: the first sum over the
image returns a row vector, and summing that vector gives a single value. This value is compared
with the threshold window for the best image; if it falls inside that window, we assume that it is the
best image.

The threshold window was between 280,000 and 290,000.
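A minimal sketch of this selector, assuming the camera has been opened and set to return RGB as shown above; the 280,000-290,000 window is the empirically chosen threshold mentioned in the text.

% Pixel-sum based best-frame selection
bestFound = false;
while ~bestFound
    frame = getsnapshot(vid);               % vid opened as in the acquisition sketch above
    g  = rgb2gray(frame);
    bw = im2bw(g, graythresh(g));            % binary frame: white = 1, black = 0
    pixelSum = sum(sum(bw));                 % first sum gives a row vector, second a scalar
    if pixelSum >= 280000 && pixelSum <= 290000
        bestFound = true;                    % accept this frame for ABL and OCR
    end
end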

Interaction with the Best frame selector:


In this case the pixel sum was greater than 290,000, so this frame was not selected.

In this case the pixel sum was determined to be between 280,000 and 290,000, so this frame was
selected.

Now, in the next frame (on the next page), the pixel sum again exceeded 290,000, so that frame was
not selected.
Locating Address Block
We developed an algorithm based on zero crossings: it detects the change from black pixels to white
pixels, since we are dealing with a binary image.

Initially we have a YUV image from the webcam. By applying the conversion described in the
literature survey, this image is transformed into an RGB image. The RGB image is then converted to a
grayscale image, and that image is in turn converted into a binary image. By negating the binary
image we obtain a negated binary image, and it is on that image that we apply our basic algorithm;
the figures show this chain of image transformations applied to the best frame.

In the algorithm below, the address block is located by calculating the x and y coordinates of the
address, considering the negated binary image obtained after the manipulations of the original
image described above.

As noted in the literature survey, a white pixel has a value of 1 and a black pixel a value of 0, so the
algorithm uses a summing approach to identify the address horizontally. If we take C1 and C2 to be
the starting and ending columns of the binary image matrix above, they can be determined easily
with the help of the algorithm below: C1 is found first, and C2 is then found in the same way.

While the sum approach was the more useful one for finding the address block horizontally, after
experiments the standard deviation approach was found to work better for finding the address block
vertically. Standard deviation is explained in the literature survey.
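The two measures can be sketched in MATLAB as follows, assuming A is the negated 640 x 480 binary image (text pixels = 1); the full boundary-search loops appear in the appendix.

% Column sums for the horizontal extent, per-row standard deviation for the vertical extent
colSums = sum(A, 1);              % 1x640: text pixels in each column, used to find C1 and C2
rowSums = sum(A, 2);              % 480x1: text pixels in each row
rowStd  = std(double(A), 0, 2);   % 480x1: rows containing text vary much more than blank rows
plot(colSums), figure, plot(rowStd)   % inspect the profiles used to pick the block boundaries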

Since we will be using the city name for the OCR, the algorithm is designed so that R1 (the start of
the address block vertically) identifies the line containing the city name. After this algorithm we have
R1 identified as the start of the address block; we then roughly estimate R2 by adding 50 to R1.
Because this image is going to be cropped, we only need the height of the rectangle, which is not as
critical as R1, so it is set by estimation.

When these values are fed to imcrop (a MATLAB command), the image is cropped as shown below.
Output of ABL:

This city name is then passed to the OCR for further processing.

REMOVING MOTION BLUR


The blurring, or degradation, of an image can be caused by many factors. In our case the reason for
blur is the movement during the image capture process, by the camera.

A blurred image can be simply represented by the following mathematical equation:

g = Hf + n

Here “g” represents the blurred image .


“H” represents the distortion factor, also called the point spread function (PSF). In the spatial
domain, the PSF describes the degree to which an optical system blurs (spreads) a point of
light. The PSF is the inverse Fourier transform of the optical transfer function (OTF). In the
frequency domain, the OTF describes the response of a linear, position-invariant system to
an impulse. The OTF is the Fourier transform of the point spread function (PSF). The
distortion operator, when convolved with the image, creates the distortion. Distortion
caused by a point spread function is just one type of distortion.

The lowercase "f" is the original image.

The lowercase "n" is any additive noise.

The Point Spread Function describes the response of the system to a point source, much as the
impulse response is the response of a linear system to an impulse. If the PSF is convolved with an
image it will blur or distort the image. In our case we already receive a blurred image with an
unknown PSF and additive noise. We therefore first convolve the image with the Sobel operator to
work out the edges of the blurred image and then subtract the result from the original image to
obtain the deblurred image.

There are various filters and algorithms used for deblurring an image, but in most cases either the
PSF or the additive noise has to be known, whereas in our case we needed to work out the PSF first.

We were now left with two options: either to calculate the PSF or stop the conveyer and then take
the image.

OPTICAL CHARACTER RECOGNITION


Our OCR system aims to read the text written on the letter and convert it into a machine-readable
format. Although an OCR system may consist of various image processing techniques such as image
segmentation, image classification, pattern detection, edge detection and so on, the line of
action we followed is quite simple. We only aim to detect computer-printed text, not
handwritten text. The OCR system we established performs the following two operations:

Character recognition through edge detection


Template matching with stored character templates

CHARACTER RECOGNITION THROUGH EDGE DETECTION


We used the Sobel operator and convolved it with the image to detect its edges. This operator
detects the intensity variations in the image through convolution and creates a resultant
image indicating only the points where the intensity varies. The gradient at a point in the image is
estimated from the horizontal and vertical derivatives produced by the two Sobel kernels given in
the literature review.

The Sobel operator:

The Sx kernel is used to determine the derivatives in the horizontal direction and the Sy kernel the
derivatives in the vertical direction. The two kernels are convolved with the 320x240-pixel
image while maintaining an intensity-variation threshold: points that exceed this threshold
are assigned fixed values and appear in the resultant image, while points that lie
within the limits of the threshold are treated as having zero derivative and are neglected.

TEMPLATE MATCHING

The resultant image from the above process is then segmented using character
segmentation techniques and finally matched against the set of characters maintained in a
templates folder.

The maintained database of detectable characters
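The actual templates and the read_letter function are loaded from a saved file (see the appendix), so the sketch below is only a plausible reconstruction of what such a matching routine typically does: correlate the resized 42x24 candidate with every stored template and pick the best score. The alphabet ordering and the cell-array layout of templates are assumptions.

% Hypothetical read_letter.m: correlation-based template matching
function letter = read_letter(img_r, num_letras)
    global templates
    alphabet = ['A':'Z' '0':'9'];                    % assumed ordering of the template set
    scores = zeros(1, num_letras);
    for k = 1:num_letras
        scores(k) = corr2(templates{1,k}, img_r);    % 2-D correlation with each stored template
    end
    [best_score, best] = max(scores);                % index of the best-matching template
    letter = alphabet(best);
end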

OUTPUT OF THE OCR SYSTEM

Running OCR system correctly yields following output in text format:


INTERFACING WITH THE SIMULINK
After the city is detected by the OCR, the software sends the signal for that detected city to the
SIMULINK block in the form of a vector. For each city there exists a unique vector, as follows:

For Wah Cantt: [0 0 1]'

For Multan: [0 1 0]'

For Lahore: [1 0 0]'

Detecting Wah Cantt:


Detecting Multan
Detecting Lahore
HARDWARE PART
DC POWER SUPPLY:
We needed a variable DC power supply for increasing and decreasing the speed of the conveyer
belt.

We designed a DC power supply for this purpose.

Circuitry and Explanation:


This power supply is based on the LM317 variable regulator. The LM317 is an adjustable
three-terminal regulator; together with the external pass transistor it supplies a current of up to 5 A
over a variable output voltage of 2 V to 25 V DC. It comes in handy to power many electronic circuits
when you are assembling or building electronic devices. The schematic and parts list are designed for
a power supply input of 220 V AC.

Components used in the DC power Supply:

Transformer (to convert 220V AC to 12V AC)

Capacitor, 2200 microfarad
LM317
TIP142 (Darlington BJT)
Variable resistor (linear potentiometer), 5K
Resistor of 220 ohms

The above power supply converts 220V AC to 2V-20V DC, and that voltage is fed to the DC motor.
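For reference, in the usual LM317 feedback arrangement (the fixed resistor between the output and the ADJ pin, and the potentiometer between ADJ and ground) the output voltage follows the standard relation Vout ≈ 1.25 V x (1 + Rpot/Rfixed). With the 220-ohm resistor and the 5K potentiometer listed above, turning the pot sweeps the output from about 1.25 V up to the limit imposed by the rectified input voltage, which is broadly consistent with the 2-20 V range quoted here.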
Features of our dc power supply:
* Adjustable output down to 1.2V
* Guaranteed 1.5A output current
* Line regulation typically 0.01%/V
* Load regulation typically 0.1%
* 80 dB ripple rejection

DC motor:
A brushless DC (BLDC) motor, also known as an electronically commutated motor, is a
synchronous electric motor powered by direct-current (DC) electricity and having an electronic
commutation system rather than a mechanical commutator and brushes.

In BLDC motors, current to torque and voltage to rpm are linear relationships.

We have used a DC motor instead of a stepper motor.

Why not a Stepper Motor for the Conveyer Belt?

Disadvantages of stepper motors:

Resonances can occur if not properly controlled.
Not easy to operate at extremely high speeds.

Advantages of DC motor:

Brushless d.c. (BLDC) motors provide performance advantages over PSC and brushed d.c. (BDC)
motors, including the following:

• The ratio of output power to frame size is higher in BLDC motors. This reduces the size and weight
of the product. This also saves the cost of motor mounting and shipping expenses.

• The BLDC motors operate at higher-power efficiency compared to induction motors and BDC
motors because they have permanent magnets on the rotor and there are no brushes for
commutation.

• Brush inspection is eliminated, making them suitable for limited-access areas like compressors and
fans. This also increases the life of the motor and reduces the service requirements.

• BLDC motors have less electromagnetic interference (EMI) generation. With BDC motors, the
brushes tend to break and make contacts while the motor is rotating, resulting in the emission of
electromagnetic noise into the surroundings.
• BLDC motors have a relatively flat speed-torque characteristic ( See Figure). This enables the
motor to operate at lower speeds without compromising torque when the motor is loaded.

Better speed versus torque characteristics


High dynamic response
High efficiency
Long operating life
Noiseless operation
Higher speed ranges
DC motors provide excellent speed control for acceleration and deceleration with effective and
simple torque control. Because the power supply of a DC motor connects directly to the field of the
motor, it allows precise voltage control, which is necessary in speed and torque control applications.

Comparison of both motors:

Type          | Advantages                                      | Disadvantages                            | Typical Application                        | Typical Drive
Stepper       | Precision positioning, high holding torque      | Requires a controller                    | Positioning in printers and floppy drives  | DC
Brushless DC  | Long lifespan, low maintenance, high efficiency | High initial cost, requires a controller | Hard drives, CD/DVD players, electric vehicles | DC

AutoCAD Layout of the Hardware:


AutoCAD is a CAD (Computer Aided Design or Computer Aided Drafting) software application for
2D and 3D design and drafting.

We have made our Hardware Design in AutoCAD.

PLUNGER MOTORS
Now we could have used two types of motors for plunging the letters into the destination bins

Free spinning electric motors


Stepper motors

 Free-Spinning Electric Motors - A free-spinning electric motor uses precisely timed opposing
magnetic fields to cause an armature shaft to rotate. Free-spinning electric motors can be
designed to run on AC or DC current, with brushes or brushless, depending upon their
application.

 Stepper Motors - The armature of a stepper motor can be rotated an exact number of turns
or just a fraction of a turn. Stepper motors are controlled by a computer to position a
mechanical device in an exact location. A typical stepper motor can be positioned to 256
different positions.
We are using a window power motor, which is a type of free-spinning electric motor. It
requires a minimum current of 3-4 amps for proper operation. The motor rotates through 360
degrees and, using the plungers mounted on it, accurately pushes the letter on the conveyer belt
into its destination bin. We are using three such motors, as we are sorting letters into three cities.
Window motor used to plunge the letters

PLUNGER MOTOR DRIVER CIRCUITRY

To drive these heavy-current motors we used a battery to provide them with a current
of 3-4 amps. We designed this circuit ourselves; it worked fine and employed a heavy battery.
Circuit made in multisim

CIRCUIT COMPONENTS DESCRIPTION:

Optocouplers

Optocouplers are used to detect the 5V signal at the output of the SIMULINK card. Optocouplers are
great for tinkering. They enable you to control one circuit from another circuit when there is no
electronic connection between the two circuits.

Op Amp
Op Amps are used to convert the 5 volt into 15 volt signal. Because of the high impedance of the op-
amps' input stage, combined with "bootstrapping" effects caused by negative feedback, the input
impedance of an op-amp is infinite for all practical purposes.

Relays

Relays are used to serve as current switches. They are electro-magnetically activated switches.
Literally, there is an electromagnet inside the relay, and energizing that electromagnet causes the
switch to change position by pulling the movable parts of the switch mechanism to a different
position. To the greatest extent possible, the electromagnet is made to be electrically isolated from
the signal path.

Battery

The motors draw current from the battery placed at the corner of the circuit board. The battery is
capable of producing 3-4 volts.

SIMULINK CARD and SIMULINK MODEL


We are using an NI 6052E model SIMULINK card. It takes the data from the OCR code and produces a
five-volt signal with minimal current. The output from this card is fed into the current amplifier
circuitry discussed above. The SIMULINK model is shown below.
The SIMULINK card we are using is the National Instruments PCI-6052E:
Used SIMULINK card in our project
Our SIMULINK card yields three outputs, each with its own ground. The key specifications of the card
we are using are given below:

 Functionally equivalent to National Instruments' PCI-6052E


 16 single-ended/8 differential 16-bit analog inputs
 333 kS/s maximum sample rate
 Two 16-bit analog outputs
 Eight digital I/O lines and 2 counter/timers
 Triggering of measurement and control via both analog and digital signals
 Easy synchronization of multiple measurement boards

The SIMULINK card has an output board which generates 5 V for each signal sent to it as a city is
detected by MATLAB. From there the signal passes through the current amplifier and the
plunger motor is driven.
CHAPTER # 4
DISCUSSION AND
CONCLUSION

This chapter discusses the merits and demerits of the Automatic Mail Sorting Machine. It
captures the essence of the project and the techniques employed to complete it. The project is
also compared with others of its kind in a more subjective way.

PRESENT STATUS OF THE PROJECT:

The goal of the project was to develop an automated mail sorting machine that would sort letters at
high speed, at least faster than manual letter sorting. Due to the non-availability of resources and our
limited financial capabilities, however, we could only build a system that operates at roughly the
same speed as manual letter sorting. Our project is also only 70-80% operational. Although the image
processing part has been achieved quite accurately and successfully in MATLAB, the interfacing
presented some major loopholes in the automation.

Due to the motion blur that accompanies the image acquired by the camera as the letter moves on
the conveyer belt, image processing on a moving image could not be achieved, although we
successfully implemented the various image processing algorithms on still images.

UPS AND DOWNS DURING THE PROJECT:

Motion blur represented a major loophole in the automation. The blur could have been somewhat
reduced using a high-resolution camera, but due to our financial limitations we abandoned the idea
of buying such a state-of-the-art device. Another way to reduce the blur was through software, using
some other image processing technique, and it was this idea which we adopted. However, every
motion deblurring technique requires either the PSF or the noise function to be known, and in our
case both were simply unknown. Calculating the PSF is a laborious task, but we still proceeded and
achieved roughly 30 percent success in this area.

The very first task was to determine the address block from the acquired image. The code we wrote
for this purpose was named the Address Block Locator (ABL). This task was followed by locating the
city name within the identified address block, yet another momentous task that took weeks before
final completion. This software is followed by the OCR code, and it took us several more weeks to
accurately interface the two codes together. By now both codes work fine and take at most 3
seconds to produce the output.

The plunger motor drive presented yet another problem. After a great deal of experimentation and
thought we were left with only one option: using window power motors. We designed their driving
circuitry, as each motor requires almost 3-4 amps of current for proper operation. Driven by the
output from SIMULINK they now operate accurately, which is another achievement.

Throughout our experimentation on the AMSM we achieved various successes and got stuck in
certain areas as well. We successfully located the address block on the acquired image and extracted
the city name from that address. We then ran the optical character recognition software on the
extracted city name and sorted the letters as required.

Image processing on a moving image could not be fully achieved, although we made considerable
progress in this domain.

Overall we believe we have almost achieved the goal we set for ourselves in the beginning of the
project.
APPENDIX
MATLAB CODE:
% OCR (Optical Character Recognition).
% Automatic Mail Sorting Machine (AMSM)
% This code is intended to receive the frames produced by the camera, perform
% optical character recognition and generate the corresponding signal for
% Simulink to drive the plunger mechanism.
% PRINCIPAL PROGRAM
%////////////////////////////////////////////////////////////////////
warning off %#ok<WNOFF>
% Clearing command window
clc
% Closing all the opened figures
close all
%//////////////////////////////////////////////////////////////////////
% Read image
% camera interfacing....
vid1 = videoinput('winvideo',1,'YUY2_320x240') % for 320x240 video
set(vid1,'Returnedcolorspace','RGB')% for RGB image
preview(vid1)
snap=getsnapshot(vid1);
%imshow(snap)
imwrite(snap,'lahore.jpg')
imshow(snap)
%/////////////////////////////////////////////////////////////////////////
%IMAGE FILTERING...SMOOTHING.....
f=imread('wah.jpg');
f=rgb2gray(f);
[M,N]=size(f);
F=fft2(double(f));
u=0:(M-1);
v=0:(N-1);
idx=find(u>M/2);
u(idx)=u(idx)-M;
idy=find(v>N/2);
v(idy)=v(idy)-N;
[V,U]=meshgrid(v,u);
D=sqrt(U.^2+V.^2);
%H=double(D<=P);
G=1.*F;
g=real(ifft2(double(G)));
%/////////////////////////////////////////////////////////////////////////
%imshow(f),figure,
%a=imshow(g,[0 255])
%figure;
%READING THE IMAGE.........................
imagen=imread('wah.jpg');
% Show image
imshow(imagen);
title('IMAGE TRANSFORMED INTO MATLAB')
%////////////////////////////////////////////////////////////////////////
% Converting the RGB image to gray scale to reduce processing
if size(imagen,3)==3 %RGB image
imagen=rgb2gray(imagen);
end
%////////////////////////////////////////////////////////////////////////
% Convert to BW
% First we determine the level of threshold
threshold = graythresh(imagen);
% Now we convert the image to binary format
imagen =~im2bw(imagen,threshold);
%/////////////////////////////////////////////////////////////////////////
% Address Block Location Software
% Clear variables and functions from memory.
clear all
clc
%putting two variables=0 for the further use as the counters
k=0;
u=0;
e=0;
f=0;
% Now reading a grayscale or color image from the file specified by the string FILENAME
A=imread('addd.jpg');
% RGB2GRAY converts RGB images to grayscale by eliminating the hue and saturation
% information while retaining the luminance
A=rgb2gray(A);
% %computing a global threshold (LEVEL) that can be used to convert an
% %intensity image to a binary image with IM2BW.
threshold = graythresh(A);
% %image converted to binary and negated
A=~im2bw(A,threshold);
% X=[0 -.25 0;-.25 1 -.25;0 -.25 0]
% C = convn(A,X)
% C=~C;
% calculatin the sum colum wise giving a row vector
B=sum(A)
%loop for the start of the address block
for m=1:640
%checking for columns that contain text (white) pixels
if (B(m)>1)
e=e+1
if(e>25)
C1=m-e
% Start of the address block identified
break
end
end
end
%loop for the end of the address block horizontally
for i=C1:640
if(B(i)<1)
k=k+1
if(k>42)
C2=i
%End of the address block identified
break
end
end
end
%now calculating the start and end of the address block vertically
Z=sum(A,2)
for l=35:480
if(Z(l)>1)
f=f+1
if(f>15)
r1=l-f
%start of address block identified vertically
break
end
end
end
for s=r1:480
if(Z(s)<1)
u=u+1
if(u>20)
r2=s
%end of address block identified vertically
break
end
end
end
r1=r1+55
r2=r2-1
C1=C1-1
C2=C2-1

% Image negated converted to normal form


A=~A;
% RECT is a 4-element vector with the form [XMIN YMIN WIDTH HEIGHT];
% these values are specified in spatial coordinates and are provided below
% after calculation above.
D = imcrop(A,[C1 r1 C2-C1 r2-r1]);
%showing the original image and the address block
imshow(A)
figure, imshow(D)

% Remove all object containing fewer than 30 pixels


imagen = bwareaopen(imagen,30);
%Storage matrix word from image
word=[ ];
re=imagen;
%Opens text.txt as file for write
fid = fopen('text.txt', 'wt');
% Load templates
load templates
global templates
% Compute the number of letters in template file
num_letras=size(templates,2);
while 1
%Fcn 'lines' separate lines in text
[fl re]=lines(re);
imgn=fl;
%Uncomment line below to see lines one by one
%imshow(fl);pause(0.5)
%-----------------------------------------------------------------
% Label and count connected components
[L Ne] = bwlabel(imgn);
for n=1:Ne
[r,c] = find(L==n);
% Extract letter
n1=imgn(min(r):max(r),min(c):max(c));
% Resize letter (same size of template)
img_r=imresize(n1,[42 24]);
%Uncomment line below to see letters one by one
%imshow(img_r);pause(0.5)
%-------------------------------------------------------------------
% Call fcn to convert image to text
letter=read_letter(img_r,num_letras);
% Letter concatenation
word=[word letter];
end
%fprintf(fid,'%s\n',lower(word));%Write 'word' in text file (lower)
fprintf(fid,'%s\n',word);%Write 'word' in text file (upper)
% Clear 'word' variable
word=[ ];
%*When the sentences finish, breaks the loop
if isempty(re) %See variable 're' in Fcn 'lines'
break
end
end
%/////////////////////////////////////////////////////////////////////////
fclose(fid);
%Open 'text.txt' file
winopen('text.txt')
save text
load ('text')
fid = fopen('text.txt')
C = textscan(fid, '%s',2)
city=C{:}
sami=cell2mat(city);
fclose(fid);
city1='LAHORE';
city2='MULTAN';
city3='WAHCANTT';
% Compare the first three letters of the recognised text with each known city
% and build the vector that is sent to the SIMULINK block
match4lahore   = strncmp(sami,city1,3);
match4multan   = strncmp(sami,city2,3);
match4wahcantt = strncmp(sami,city3,3);
if match4lahore==1
simulink=[1 0 0]';
elseif match4multan==1
simulink=[0 1 0]';
elseif match4wahcantt==1
simulink=[0 0 1]';
end
REFERENCES
BOOKS:

Digital Image Processing (2nd Edition) by Rafael C. Gonzalez and Richard E. Woods


Practical Algorithms for Image Analysis by Lawrence O'Gorman and Michael Seul
The Image Processing Handbook, Fifth Edition, by John C. Russ
Machine Vision, Third Edition: Theory, Algorithms, Practicalities (Signal Processing and its
Applications) by E. R. Davies
RESEARCH PAPERS:

INTERNET RESOURCES:

www.wikipedia.org
www.owlnet.rice.edu
www.pages.drexel.edu
homepages.inf.ed.ac.uk/rbf/HIPR2/sobel.htm
www.trcelectronics.com
powerelectronics.com
www.electronics-lab.com/projects

OTHER SOURCES:
 Power Point presentation by Giorgos Vamvakas titled “Optical Character Recognition for
Handwritten Characters” from National Center for Scientific Research “Demokritos” Athens
– Greece.

 Introduction to Optical Character Recognition (OCR), Workshop on international standards,
contemporary technologies and regional cooperation, Noumea, New Caledonia, 4–8 February 2008.

 Castleman, K.R., Digital Image Processing, Prentice Hall, 1995.

 John Canny, "A computational approach to edge detection." IEEE Transactions on PAMI,
8(6):679–698, 1986.

 James Elder and Richard Goldberg, "Image editing in the contour domain." IEEE Transactions on
PAMI, 23(3):291–296, 2001.

 Scott Konishi, Alan Yuille, James Coughlin, and Song Chun Zhu, "Statistical edge detection:
Learning and evaluating edge cues." IEEE Transactions on PAMI, 25(1):57–74, 2003.

 William Freeman and Edward Adelson, "The design and use of steerable filters." IEEE Transactions
on PAMI, 13:891–906, 1991.

 David Martin, Charless Fowlkes, and Jitendra Malik, "Learning to detect natural image boundaries
using local brightness, color, and texture cues." IEEE Transactions on PAMI, 26(5):530–549, 2004.
