By
Sami ur Rehman
2006-NUST-BEE-146
Sarosh Khan
2006-NUST-BEE-147
Waqas Siddique
2006-NUST-BEE-162
In
CERTIFICATE
Advisor: ___________________
Mr. Raheel Querashi
Co-Advisor: _________________
Dr. Rehan Hafiz
Acknowledgment
All praise to Almighty Allah, Who bestowed us with knowledge and enabled us to complete
this project work. We present our humble respect to the last and final Prophet Muhammad
(peace be upon him), whose life is a perfect model for all mankind.
We are greatly thankful to our Advisor Mr. Raheel Querashi and Dr. Rehan Hafiz for their effort in
the completion of this project work. Their inspiring guidance, dynamic supervision and constructive
criticism helped us accomplish the task fairly. We would like to thank all our teachers whose
valuable knowledge, assistance, cooperation and guidance enabled us to take initiative, develop and
Abstract
Our project aims to develop an Automatic Mail Sorting Machine (AMSM), a promising
replacement for the labor-intensive and time-consuming job of manual mail sorting, thereby
bringing efficiency to the mail service. On the software side, we implemented Address Block
Location and OCR technologies to identify the city name written in the address block, since this is
the parameter on which the automated sorting of mail letters is performed. Using feature
extraction, template matching and edge detection algorithms as the underlying concepts, we
locate the address block on the letter, extract the city name written on it, apply the OCR system
and pass the output to the SIMULINK block. The SIMULINK block, using the SIMULINK card
installed in the CPU, then drives the plunger mechanism mounted on the conveyer belt to throw
the letters into the required destination bins. All the software modules are written and integrated
in MATLAB. A speed-controlled DC motor continuously drives the conveyer belt. A webcam
inputs the image to the software, which processes it and generates output through SIMULINK for
the plunger motors to throw the letters on the belt into the destination bins.
CHAPTER # 1
INTRODUCTION
In 1965, the Postal Service put the first high-speed optical character reader (OCR) into operation
that could handle a preliminary sort automatically. And in 1982, the first computer-driven single-line
optical character reader was employed – which reads the mailpiece destination address then prints
a barcode on the envelope that could be used to automate mail sorting from start to finish.
Such automated mail services are available in the post offices of advanced countries, but the
concept is quite new in a country like Pakistan. We designed and developed this system to present
a model of an automated mail sorting machine which could be a promising replacement for the
tedious and time-consuming job of manual letter sorting.
Using feature extraction, template matching and edge detection algorithms as the underlying
concepts, we locate the address block on the letter, extract the city name written on it, apply the
OCR system and pass the output to the SIMULINK block. The SIMULINK block, using the SIMULINK
card installed in the CPU, then drives the plunger mechanism mounted on the conveyer belt to
throw the letters into the required destination bins. All the software modules are written and
integrated in MATLAB. A speed-controlled DC motor continuously drives the conveyer belt. A
webcam inputs the image to the software, which processes it and generates output through
SIMULINK for the plunger motors to throw the letters on the belt into the destination bins.
IMAGE ACQUISITION
The data-processing part of the system starts with image capture, which has to be done by a fast
and efficient camera. We evaluated and selected the camera most suitable for OCR. The camera
comes with its own software for reading images from the camera memory and loading them into
the processing unit's memory (in our case, a PC), but we perform this job in MATLAB. The output
of this stage is an image to be processed by the OCR part.
PREPROCESSING STAGE
The image may contain irrelevant data, e.g. various advertisements and unnecessary handwritten
information. Even within the address itself, the complete address of the recipient is written, but we
only want to know the city where the letter is to be delivered. To separate the irrelevant data from
the relevant, we need a preprocessing stage, which we call the Address Block Locator (ABL). This
is the second piece of software in our system, and it has to communicate with the camera
software.
OCR
We wrote the OCR code in MATLAB; it successfully translates the text written on the image into
editable text for further processing.
Hardware Infrastructure
Our hardware consists primarily of a 1.5-meter conveyer belt, upon which are mounted the
plungers that throw the letters into their destination bins, and a webcam which acquires the image
of the letter to be sorted and sends it to the computer for further processing. Our aim was to learn
new things during this project, and we incorporated various new ideas into it. The webcam takes
images of the still letter and then passes them to the computer for further processing.
CHAPTER # 2
LITERATURE REVIEW
RGB Image:
An RGB image has three channels: red, green, and blue. RGB channels roughly follow the color receptors in
the human eye, and are used in computer displays and image scanners.
If the RGB image is 24-bit (the industry standard as of 2005), each channel has 8 bits for red, green,
and blue; in other words, the image is composed of three images (one for each channel), where
each image can store discrete pixels with conventional brightness intensities between 0 and 255. If
the RGB image is 48-bit (very high color depth), each channel is made of 16-bit images.
An RGB array can be of class double, uint8, or uint16. In an RGB array of class double, each color
component is a value between 0 and 1. A pixel whose color components are (0,0,0) is displayed as
black, and a pixel whose color components are (1,1,1) is displayed as white. The three color
components for each pixel are stored along the third dimension of the data array. For example, the
red, green, and blue color components of the pixel (10,5) are stored in RGB(10,5,1), RGB(10,5,2),
and RGB(10,5,3), respectively.
To determine the color of the pixel at (2,3), you would look at the RGB triplet stored in (2,3,1:3).
Suppose (2,3,1) contains the value 0.5176, (2,3,2) contains 0.1608, and (2,3,3) contains 0.0627. The
color of the pixel at (2,3) is then (0.5176, 0.1608, 0.0627).
% Build a 64x64 RGB test image from the jet colormap
RGB = reshape(ones(64,1)*reshape(jet(64),1,192),[64,64,3]);
% Extract the three color channels
R = RGB(:,:,1);
G = RGB(:,:,2);
B = RGB(:,:,3);
% Display each channel and the full-color image
imshow(R)
figure, imshow(G)
figure, imshow(B)
figure, imshow(RGB)
Grayscale Image
A grayscale digital image is an image in which the value of each pixel is a single sample; that is, it
carries only intensity information. Images of this sort, also known as black-and-white, are composed
exclusively of shades of gray, varying from black at the weakest intensity to white at the strongest.
Grayscale images are distinct from one-bit black-and-white images, which in the context of
computer imaging are images with only two colors, black and white; grayscale images have many
shades of gray in between. Grayscale images are also called monochromatic, denoting the absence
of any chromatic variation.
Numerical representations:
The intensity of a pixel is expressed within a given range between a minimum and a maximum,
inclusive. This range is represented in an abstract way as a range from 0 (total absence, black) to 1
(total presence, white), with any fractional values in between.
Another convention is to employ percentages, so the scale then runs from 0% to 100%. This is a
more intuitive approach, but if only integer values are used, the range encompasses a total of only
101 intensities, which is insufficient to represent a broad gradient of grays. Percentile notation is
also used in printing to denote how much ink is employed in halftoning, but there the scale is
reversed, with 0% being paper white (no ink) and 100% solid black (full ink).
To convert any color to a grayscale representation of its luminance, first one must obtain the values
of its red, green, and blue (RGB) primaries in linear intensity encoding, by gamma expansion. Then,
add together 30% of the red value, 59% of the green value, and 11% of the blue value (these
weights depend on the exact choice of the RGB primaries, but are typical). The formula (11*R +
16*G + 5*B) /32 is also popular since it can be efficiently implemented using only integer operations.
Regardless of the scale employed (0.0 to 1.0, 0 to 255, 0% to 100%, etc.), the resultant number is the
desired linear luminance value; it typically needs to be gamma compressed to get back to a
conventional grayscale representation.
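The weighted-sum conversion described above can be sketched in Python; the project itself works in MATLAB, so this is only an illustration of the two formulas given in the text:

```python
def rgb_to_gray(r, g, b):
    """Weighted sum: 30% red, 59% green, 11% blue."""
    return 0.30 * r + 0.59 * g + 0.11 * b

def rgb_to_gray_int(r, g, b):
    """Integer-only variant (11R + 16G + 5B) / 32 from the text."""
    return (11 * r + 16 * g + 5 * b) // 32
```

Both variants map pure white (255, 255, 255) to 255 and pure black to 0; the integer form avoids floating-point arithmetic entirely.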
Here is an example of color channel splitting of a full RGB color image. The column at left shows the
isolated color channels in natural colors, while at right are their grayscale equivalents:
Binary Image:
A binary image is a digital image that has only two possible values for each pixel. Typically the two
colors used for a binary image are black and white, though any two colors can be used. The color
used for the object(s) in the image is the foreground color, while the rest of the image is the
background color.
Binary images are also called bi-level or two-level. This means that each pixel is stored as a single bit
(0 or 1). The names black-and-white, B&W, monochrome or monochromatic are often used for this
concept, but may also designate any image that has only one sample per pixel, such as a grayscale
image.
YUV Image:
YUV is a color space typically used as part of a color image pipeline. It encodes a color image or
video taking human perception into account, allowing reduced bandwidth for the chrominance
components; transmission errors and compression artifacts are thereby more efficiently masked
from human perception than with a direct RGB representation.
The term YUV is commonly used in the computer industry to describe file formats that are encoded
using YCbCr.
The Y'UV model defines a color space in terms of one luma (Y') and two chrominance (UV)
components.
Converting between YUV and RGB:
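The conversion formulas themselves did not survive in this copy; for reference, a commonly used BT.601-style RGB-to-Y'UV mapping can be sketched as follows. The coefficients are the standard ones, not values taken from this document:

```python
def rgb_to_yuv(r, g, b):
    """BT.601 luma plus scaled color-difference channels,
    for r, g, b in the range [0, 1]."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma
    u = 0.492 * (b - y)                     # blue color difference
    v = 0.877 * (r - y)                     # red color difference
    return y, u, v
```

For a neutral gray or white input the color-difference channels U and V are zero, which is what allows their bandwidth to be reduced without visible loss.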
Edge Detection:
Edges typically occur on the boundary between two different regions in an image.
The purpose of detecting sharp changes in image brightness is to capture important events and
changes in properties of the world. It can be shown that under rather general assumptions for an
image formation model, discontinuities in image brightness are likely to correspond to:
discontinuities in depth
discontinuities in surface orientation
changes in material properties
variations in scene illumination
In the ideal case, the result of applying an edge detector to an image may lead to a set of connected
curves that indicate the boundaries of objects, the boundaries of surface markings as well curves
that correspond to discontinuities in surface orientation. Thus, applying an edge detector to an
image may significantly reduce the amount of data to be processed and may therefore filter out
information that may be regarded as less relevant, while preserving the important structural
properties of an image. If the edge detection step is successful, the subsequent task of interpreting
the information contents in the original image may therefore be substantially simplified.
Unfortunately, however, it is not always possible to obtain such ideal edges from real-life images of
moderate complexity. Edges extracted from non-trivial images are often hampered by
fragmentation (meaning that the edge curves are not connected), missing edge segments, and
false edges that do not correspond to interesting phenomena in the image, thus complicating the
subsequent task of interpreting the image data.
Edge properties:
The edges extracted from a two-dimensional image of a three-dimensional scene can be classified as
either viewpoint dependent or viewpoint independent. A viewpoint independent edge typically
reflects inherent properties of the three-dimensional objects, such as surface markings and surface
shape. A viewpoint dependent edge may change as the viewpoint changes, and typically reflects the
geometry of the scene, such as objects occluding one another.
A typical edge might, for instance, be the border between a block of red color and a block of yellow.
In contrast, a line can be a small number of pixels of a different color on an otherwise unchanging
background; for a line there will therefore usually be one edge on each side of the line.
Edges play quite an important role in many applications of image processing, in particular
for machine vision systems that analyze scenes of man-made objects under controlled illumination
conditions.
Edges in real images are rarely ideal step edges; they are typically smeared by effects such as:
Focal blur, caused by a finite depth-of-field and finite point spread function.
Penumbral blur, caused by shadows created by light sources of non-zero radius.
Shading at a smooth object.
A one-dimensional image f which has exactly one edge placed at x = 0 may be modeled as a
smoothed step: f(x) = (I_r - I_l)/2 * (erf(x / (sqrt(2) * sigma)) + 1) + I_l, where I_l and I_r are the
intensities to the left and right of the edge and sigma controls the amount of blur.
If the intensity difference were smaller between the 4th and the 5th pixels and if the intensity
differences between the adjacent neighboring pixels were higher, it would not be as easy to say that
there should be an edge in the corresponding region. Moreover, one could argue that this case is
one in which there are several edges.
Hence, to firmly state a specific threshold on how large the intensity change between two
neighboring pixels must be for us to say that there should be an edge between these pixels is not
always a simple problem. Indeed, this is one of the reasons why edge detection may be a non-trivial
problem unless the objects in the scene are particularly simple and the illumination conditions can
be well controlled.
Edge descriptors
Edge normal: unit vector in the direction of maximum intensity change.
Edge direction: unit vector perpendicular to the edge normal.
Edge position or center: the image position at which the edge is located.
Edge strength: related to the local image contrast along the normal.
Edge types:
Step edge: the image intensity abruptly changes from one value on one side of the discontinuity to
a different value on the opposite side.
Ridge edge: the image intensity abruptly changes value but then returns to the starting value
within some short distance.
Roof edge: a ridge edge where the intensity change is not instantaneous but occurs over a finite
distance.
Edge detection typically involves four steps:
(1) Smoothing: suppress as much noise as possible, without destroying the true edges.
(2) Enhancement: apply a filter to enhance the quality of the edges in the image (sharpening).
(3) Detection: determine which edge pixels should be discarded as noise and which should be
retained (usually, thresholding provides the criterion used for detection).
(4) Localization: determine the exact location of an edge (sub-pixel resolution might be required for
some applications, that is, estimate the location of an edge to better than the spacing between
pixels). Edge thinning and linking are usually required in this step.
For an image f(x, y), the gradient is the vector of partial derivatives (df/dx, df/dy), and the edge
strength is given by its magnitude sqrt((df/dx)^2 + (df/dy)^2).
The magnitude of the gradient provides information about the strength of the edge. The direction
of the gradient is always perpendicular to the direction of the edge (the edge direction is rotated
with respect to the gradient direction by -90 degrees).
Standard deviation
The standard deviation of a statistical population, a data set, or a probability distribution is the
square root of its variance. Standard deviation is a widely used measure of variability or dispersion,
being algebraically more tractable, though practically less robust, than the average absolute
deviation.
It shows how much variation there is from the "average" (mean, or expected value). A low standard
deviation indicates that the data points tend to be very close to the mean, whereas a high standard
deviation indicates that the data are spread out over a large range of values.
For example, the average height for adult men in Pakistan is about 70 inches (178 cm), with a standard
deviation of around 3 in (8 cm). This means that most men (about 68 percent, assuming a normal
distribution) have a height within 3 in (8 cm) of the mean (67–73 in (170–185 cm)) – one standard
deviation, whereas almost all men (about 95%) have a height within 6 in (15 cm) of the mean (64–76
in (163–193 cm)) – 2 standard deviations. If the standard deviation were zero, then all men would
be exactly 70 in (178 cm) high. If the standard deviation were 20 in (51 cm), then men would have
much more variable heights, with a typical range of about 50 to 90 in (127 to 229 cm). Three
standard deviations account for 99.7% of the sample population being studied, assuming the
distribution is normal (bell-shaped).
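The definition above (square root of the variance) can be checked with a short sketch; the data set here is an arbitrary illustrative example, not project data:

```python
import math

def std_dev(xs):
    """Population standard deviation: square root of the variance."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
```

As in the height example, a sample in which every value equals the mean has a standard deviation of exactly zero.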
OPTICAL CHARACTER RECOGNITION MODULE
WHAT IS OCR?
OCR stands for Optical Character Recognition. It refers to software designed to convert scanned or
digital images of handwritten or typewritten text into machine-editable text, or to translate
pictures of characters into a standard encoding scheme representing them (e.g. ASCII or
Unicode). OCR began as a field of research in pattern recognition, artificial intelligence and machine
vision. Though academic research in the field continues, the focus on OCR has shifted to
implementation of proven techniques.
OCR BACKGROUND
Developing a proprietary OCR system is a complicated task and requires a lot of effort. Such
systems are usually quite complex and can hide a lot of logic behind the code. The use of an
artificial neural network in OCR applications can dramatically simplify the code and improve the
quality of recognition while achieving good performance. Another benefit of using a neural network
in OCR is the extensibility of the system: the ability to recognize more character sets than initially
defined. Most traditional OCR systems are not extensible enough, because a task such as working
with tens of thousands of Chinese characters is not as easy as working with a 68-character English
typed character set, and it can easily bring a traditional system to its knees.
An OCR system typically consists of five stages:
1. Pre-processing
2. Segmentation
3. Feature Extraction
4. Classification
5. Post-processing
1-Pre-processing
The raw data is subjected to a number of preliminary processing steps to make it usable in
the descriptive stages of character analysis. Pre-processing aims to produce data that are
easy for the OCR system to operate on accurately. The main objectives of pre-processing are:
• Binarization
• Noise reduction
• Stroke width normalization
• Skew correction
• Slant removal
Binarization
Binarization converts a grayscale image into a binary one. Thresholding can be global, using a
single threshold value for the entire image, or adaptive (local), using different values for each pixel
according to the local area information.
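A minimal sketch of the global variant follows; the project uses MATLAB's built-in routines, so this Python fragment only illustrates the idea, and the threshold value 128 is an assumption:

```python
def binarize_global(img, threshold=128):
    """Global thresholding: one threshold value for the whole image.
    Pixels at or above the threshold become 1, the rest become 0."""
    return [[1 if px >= threshold else 0 for px in row] for row in img]
```

An adaptive method would instead compute a threshold per pixel from statistics of its local neighborhood.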
Noise Reduction
Noise reduction improves the quality of the document. Two main approaches:
• Filtering (masks)
• Morphological Operations (erosion, dilation, etc)
Normalization and Thinning
Normalization provides a tremendous reduction in data size; thinning extracts the shape
information of the characters.
Skew Correction
Skew Correction methods are used to align the paper document with the coordinate system of the
scanner. Main approaches for skew detection include correlation, projection profiles, Hough
transform.
Slant Removal
The slant of handwritten text varies from user to user. Slant removal methods are used to
normalize all characters to a standard form.
2-Segmentation
Explicit Segmentation
In explicit approaches, one tries to identify the smallest possible word segments (primitive
segments) that may be smaller than letters but surely cannot be segmented further. Later in the
recognition process these primitive segments are assembled into letters based on input from the
character recognizer. The advantage of this strategy is that it is robust and quite straightforward,
but it is not very flexible.
Implicit Segmentation
In implicit approaches, the words are recognized entirely without segmenting them into letters.
This is most effective and viable only when the set of possible words is small and known in
advance, such as in the recognition of bank checks and postal addresses.
3-Feature Extraction
In feature extraction stage each character is represented as a feature vector, which becomes its
identity. The major goal of feature extraction is to extract a set of features, which maximizes the
recognition rate with the least amount of elements.
Due to the nature of handwriting, with its high degree of variability and imprecision, obtaining
these features is a difficult task. Feature extraction methods are based on three types of
features:
Statistical
Structural
Global transformations and moments
Statistical Features
Representation of a character image by statistical distribution of points takes care of style variations
to some extent.
• Zoning
• Projections and profiles
• Crossings and distances
Zoning
The character image is divided into NxM zones, and from each zone features are extracted to form
the feature vector. The goal of zoning is to obtain local characteristics instead of global
characteristics.
The number of foreground pixels, or the normalized number of foreground pixels, in each cell is
considered a feature.
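Zoning as described can be sketched as follows; the zone grid and the sample image are illustrative assumptions, not the project's actual parameters:

```python
def zoning_features(img, n, m):
    """Divide a binary image into n x m zones; each feature is the
    normalized count of foreground (1) pixels in a zone."""
    rows, cols = len(img), len(img[0])
    rh, cw = rows // n, cols // m           # zone height and width
    feats = []
    for i in range(n):
        for j in range(m):
            zone = [img[r][c]
                    for r in range(i * rh, (i + 1) * rh)
                    for c in range(j * cw, (j + 1) * cw)]
            feats.append(sum(zone) / len(zone))
    return feats
```

The resulting vector has one entry per zone, capturing where in the character the ink is concentrated.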
Projection Histograms
The basic idea behind using projections is that character images, which are 2-D signals, can
be represented as 1-D signals. These features, although relatively insensitive to noise and
deformation, depend on rotation.
Projection histograms count the number of pixels in each column and row of a character
image. Projection histograms can separate characters such as “m” and “n”.
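The row and column counts described above reduce the 2-D character image to two 1-D signals; a minimal sketch:

```python
def projections(img):
    """Row and column foreground-pixel counts of a binary image."""
    row_hist = [sum(row) for row in img]          # one count per row
    col_hist = [sum(col) for col in zip(*img)]    # one count per column
    return row_hist, col_hist
```

Characters such as "m" and "n" differ in their column histograms (three vertical strokes versus two), which is what makes this simple feature discriminative.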
Profiles
The profile counts the number of pixels (distance) between the bounding box of the character image
and the edge of the character. The profiles describe well the external shapes of characters and allow
distinguishing between a great number of letters, such as “p” and “q”.
Structural Features
Feature Extraction
The character image is divided into horizontal and vertical zones and the
density of character pixels is calculated for each zone.
Upper/lower profiles are computed by considering, for each image column,
the distance between the horizontal line and the pixel closest to the
upper/lower boundary of the character image. This results in two zones.
Both zones are then divided into vertical blocks, and for each block we
calculate the area of the upper/lower character profile.
DEBLURRING FUNCTIONS
deconvwnr: Implements a least squares solution. You should provide some information about the
noise to reduce possible noise amplification during deblurring.
deconvreg: Implements a constrained least squares solution, where you can place constraints on
the output image (the smoothness requirement is the default). You should provide some
information about the noise to reduce possible noise amplification during deblurring. See
Deblurring with a Regularized Filter for more information.
deconvblind: Implements the blind deconvolution algorithm, which performs deblurring without
knowledge of the PSF. You pass as an argument your initial guess at the PSF. The deconvblind
function returns a restored PSF in addition to the restored image. The implementation uses the
same damping and iterative model as the deconvlucy function. See Deblurring with the Blind
Deconvolution Algorithm for more information.
There are many ways to perform edge detection. However, the majority of different methods may
be grouped into two categories, gradient and Laplacian. The gradient method detects the edges by
looking for the maximum and minimum in the first derivative of the image. The Laplacian method
searches for zero crossings in the second derivative of the image to find edges. An edge has the one-
dimensional shape of a ramp and calculating the derivative of the image can highlight its location.
Suppose we have the following signal, with an edge shown by the jump in intensity below:
If we take the gradient of this signal (which, in one dimension, is just the first derivative with respect
to t) we get the following:
Based on this one-dimensional analysis, the theory can be carried over to two-dimensions as long as
there is an accurate approximation to calculate the derivative of a two-dimensional image. The
Sobel operator performs a 2-D spatial gradient measurement on an image. Typically it is used to find
the approximate absolute gradient magnitude at each point in an input grayscale image. The Sobel
edge detector uses a pair of 3x3 convolution masks, one estimating the gradient in the x-direction
(columns) and the other estimating the gradient in the y-direction (rows). A convolution mask is
usually much smaller than the actual image. As a result, the mask is slid over the image,
manipulating a square of pixels at a time. The actual Sobel masks are:
Sx = [-1 0 1; -2 0 2; -1 0 1]    Sy = [1 2 1; 0 0 0; -1 -2 -1]
The mask is slid over an area of the input image, changes that pixel's value, and then shifts one
pixel to the right, continuing until it reaches the end of the row; it then starts at the beginning of
the next row. The example below shows the mask being slid over the top-left portion of the input
image, represented by the green outline. The formula shows how a particular pixel in the output
image would be calculated: the center of the mask is placed over the pixel being manipulated, and
the I and J values are used to move the file pointer so that, for example, pixel (a22) can be
multiplied by the corresponding mask value (m22). It is important to notice
that pixels in the first and last rows, as well as the first and last columns cannot be manipulated
by a 3x3 mask. This is because when placing the center of the mask over a pixel in the first row
(for example), the mask will be outside the image boundaries.
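The sliding-mask computation described above can be sketched as follows. This is a Python illustration of the standard Sobel masks (the project's implementation is in MATLAB), and border pixels are skipped, as the text notes:

```python
SX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # gradient in the x-direction (columns)
SY = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]    # gradient in the y-direction (rows)

def sobel_magnitude(img):
    """Slide the 3x3 masks over the image; pixels in the first and last
    rows and columns are skipped, since the mask would fall outside
    the image boundaries there."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SX[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SY[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = abs(gx) + abs(gy)   # common approximation of |G|
    return out
```

On an image with a sharp vertical intensity step, the interior output pixels along the step take large values while uniform regions stay at zero.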
CHAPTER # 3
SOFTWARE PART
Image Acquisition
We are using a DANY 1.3-megapixel webcam to acquire the image of the letter moving on the
conveyer belt for further processing.
Lens: 5P F1.1.8/f2.95
The imaqhwinfo command lists the installed adaptors; we are using the 'winvideo' adaptor for this
purpose.
Frames are continuously taken by the webcam, and for each binarized frame (white = 1, black = 0)
we calculate the sum of all the pixels. Taking the sum of the image once returns a row vector, and
taking the sum of that vector gives a single value. This value is compared with the threshold for the
best image; if it is close to that value, we assume that it is the best image.
In the next frame (shown on the next page) the pixel sum exceeded 290000, so it was not
selected.
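The frame-selection step can be sketched like this; the target sum and tolerance here are illustrative assumptions, not the project's actual 290000-range values:

```python
def pixel_sum(img):
    """Sum the binary image twice: per-row sums, then their total."""
    return sum(sum(row) for row in img)

def best_frame(frames, target, tolerance):
    """Pick the frame whose white-pixel count is closest to the target;
    reject it if even the closest one is outside the tolerance."""
    err, frame = min(((abs(pixel_sum(f) - target), f) for f in frames),
                     key=lambda t: t[0])
    return frame if err <= tolerance else None
```

Frames whose totals fall too far from the expected value (for example, when no letter is fully in view) are simply discarded.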
Locating Address Block
We have developed an algorithm based on zero crossings: the algorithm detects transitions from
black pixels to white pixels, since we are dealing with a binary image.
Consider the above part of the algorithm: applying it to our best image produced the
transformation shown. The captured RGB image was first converted to a grayscale image, and that
image was then converted into a binary image. By negating the binary image we obtained a
negated binary image, and on that image we applied our basic algorithm.
In the algorithm below, the address block is located by calculating the x and y coordinates of the
address, considering the negated binary image obtained after the manipulations of the original
image described above.
As known from the literature survey, a white pixel has a value of 1 and a black pixel a value of 0, so
the algorithm uses a summing approach for identifying the address horizontally. If we take C1 and
C2 to be the starting and ending columns of the binary image matrix above, they can be
determined easily with the help of the algorithm below.
Having found C1 with the above algorithm, C2 is found in the same way from the other end.
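The column-sum search for C1 and C2 can be sketched as follows; this is a Python illustration, not the project's actual MATLAB code:

```python
def column_bounds(neg_img):
    """C1, C2: first and last columns of the negated binary image
    whose column sum is nonzero, i.e. that contain text pixels."""
    col_sums = [sum(col) for col in zip(*neg_img)]
    c1 = next(i for i, s in enumerate(col_sums) if s > 0)
    c2 = next(i for i in reversed(range(len(col_sums))) if col_sums[i] > 0)
    return c1, c2
```

Because text pixels are 1 and background is 0 in the negated image, any column with a nonzero sum must intersect the writing.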
The sum approach was useful for finding the address block horizontally, but after experiments the
standard deviation approach was found to be better for finding the address block vertically.
Standard deviation is explained in the literature survey.
Since we will be using the city name for applying the OCR, the algorithm is designed so that R1 (the
vertical start of the address block) identifies the city name. After this algorithm we have R1, and we
can roughly estimate R2 by adding 50 to R1: since the image is going to be cropped, we only need
the height of the rectangle, which is not as critical as R1, so this is done by estimation.
When these values are fed to imcrop (a MATLAB command) the image is cropped, as shown below.
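The vertical search by standard deviation, with R2 estimated as R1 + 50, can be sketched as follows; the standard-deviation threshold is an assumed illustrative value:

```python
import math

def row_std(row):
    """Population standard deviation of one image row."""
    mean = sum(row) / len(row)
    return math.sqrt(sum((p - mean) ** 2 for p in row) / len(row))

def vertical_bounds(neg_img, std_threshold=0.1, height=50):
    """R1: first row whose pixel standard deviation exceeds the
    threshold (a uniform row has zero deviation, a text row does not);
    R2 is then estimated simply as R1 plus a fixed height."""
    r1 = next(i for i, row in enumerate(neg_img)
              if row_std(row) > std_threshold)
    return r1, r1 + height
```

A blank row (all zeros) has zero standard deviation, so the first row with mixed 0s and 1s marks the start of the text.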
Output of ABL:
The observed image g can be modeled as the true image f degraded by the blurring system H plus
additive noise n:
g = Hf + n
The Point Spread Function (PSF) describes the response of the system to a point source, much as
the impulse response describes the response of a linear system. If the PSF is convolved with an
image, it blurs or distorts the image. In our case we receive an already-blurred image with an
unknown PSF and additive noise. We therefore first convolve the image with the Sobel operator to
work out the edges of the blurred image and then subtract the result from the original image to
obtain the deblurred image.
There are various filters and algorithms for deblurring an image, but in most cases either the PSF
or the additive noise must be known; in our case we would have needed to work out the PSF first.
We were now left with two options: either to calculate the PSF or stop the conveyer and then take
the image.
The Sx kernel is used to determine the derivatives in the horizontal direction and Sy to detect the
derivatives in the vertical direction. The two kernels are convolved with the 320x240 pixel image
while maintaining an intensity-variation threshold: points that exceed this threshold are assigned
certain values and shown in the resultant image, while points that lie within the limits of the
threshold are taken to have zero derivative and are hence neglected.
TEMPLATE MATCHING
The resultant image from the above process is then segmented using character segmentation
techniques and finally matched against the set of characters maintained in a templates folder.
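At its simplest, template matching compares each segmented character against the stored templates and picks the best score. The agreement-fraction score and the tiny template set below are illustrative assumptions, not the project's actual matching code:

```python
def match_score(segment, template):
    """Fraction of pixels on which the segment and template agree
    (both are binary images of the same size)."""
    h, w = len(segment), len(segment[0])
    agree = sum(segment[y][x] == template[y][x]
                for y in range(h) for x in range(w))
    return agree / (h * w)

def classify(segment, templates):
    """Label of the template with the highest agreement score."""
    return max(templates, key=lambda name: match_score(segment, templates[name]))
```

In practice the segments must first be normalized to the template size, which is part of what the pre-processing stage described earlier provides.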
The above power supply converts 220V AC to 2V-20V DC; that voltage is fed to the DC motor.
Features of our dc power supply:
* Adjustable output down to 1.2V
* Guaranteed 1.5A output current
* Line regulation typically 0.01%/V
* Load regulation typically 0.1%
* 80 dB ripple rejection
DC motor:
A brushless DC (BLDC) motor, also known as an electronically commutated motor, is a
synchronous electric motor powered by direct-current (DC) electricity and having an electronic
commutation system rather than a mechanical commutator and brushes.
In BLDC motors, current to torque and voltage to rpm are linear relationships.
Advantages of DC motor:
Brushless d.c. (BLDC) motors provide performance advantages over PSC and brushed d.c. (BDC)
motors, including the following:
• The ratio of output power to frame size is higher in BLDC motors. This reduces the size and weight
of the product. This also saves the cost of motor mounting and shipping expenses.
• The BLDC motors operate at higher-power efficiency compared to induction motors and BDC
motors because they have permanent magnets on the rotor and there are no brushes for
commutation.
• Brush inspection is eliminated, making them suitable for limited-access areas like compressors and
fans. This also increases the life of the motor and reduces the service requirements.
• BLDC motors have less electromagnetic interference (EMI) generation. With BDC motors, the
brushes tend to break and make contacts while the motor is rotating, resulting in the emission of
electromagnetic noise into the surroundings.
• BLDC motors have a relatively flat speed-torque characteristic (see figure). This enables the
motor to operate at lower speeds without compromising torque when the motor is loaded.
PLUNGER MOTORS
Two types of motors could be used for plunging the letters into the destination bins:
Free-Spinning Electric Motors - A free-spinning electric motor uses precisely timed opposing
magnetic fields to cause an armature shaft to rotate. Free-spinning electric motors can be
designed to run on AC or DC current, with brushes or brushless, depending upon their
application.
Stepper Motors - The armature of a stepper motor can be rotated an exact number of turns or just a fraction of a turn. Stepper motors are controlled by a computer to position a mechanical device in an exact location. A typical stepper motor offers 256 discrete positions per revolution.
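The 256-position figure above corresponds to a step angle of 360/256 degrees:

```python
positions = 256               # discrete positions per revolution, as quoted above
step_angle = 360 / positions  # degrees moved per step
print(step_angle)             # 1.40625
```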
We are using a window power motor, which is a type of free-spinning electric motor. It requires a minimum current of 3-4 A to function properly. The motor rotates through 360 degrees and, using the plungers mounted on it, accurately pushes letters off the conveyer belt into the destination bins. We use three such motors, one per city, since letters are sorted into three city bins.
Window motor used to plunge the letters
Because these motors draw a heavy current, we used a battery to supply them with 3-4 A. We designed this driving circuit ourselves; it worked well, although it required a heavy battery.
Circuit schematic drawn in Multisim
Optocouplers
Optocouplers are used to detect the 5 V signals at the output of the SIMULINK card. An optocoupler lets one circuit control another while keeping the two circuits electrically isolated from each other.
Op Amp
Op amps are used to step the 5 V signal up to a 15 V signal. Because of the high impedance of the op amp's input stage, combined with the "bootstrapping" effect of negative feedback, the input impedance of an op amp is effectively infinite.
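If the 5 V to 15 V conversion is done with a standard non-inverting stage (an assumption; the report does not give the exact topology or resistor values), the required gain of 3 follows from the feedback resistor ratio:

```python
def noninverting_gain(rf, rg):
    """Closed-loop gain of a non-inverting op-amp stage: 1 + Rf/Rg."""
    return 1 + rf / rg

# Hypothetical resistor pair giving the gain of 3 needed for 5 V -> 15 V:
vout = 5.0 * noninverting_gain(20e3, 10e3)
print(vout)  # 15.0
```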
Relays
Relays are used to serve as current switches. They are electro-magnetically activated switches.
Literally, there is an electromagnet inside the relay, and energizing that electromagnet causes the
switch to change position by pulling the movable parts of the switch mechanism to a different
position. To the greatest extent possible, the electromagnet is made to be electrically isolated from
the signal path.
Battery
The motors draw current from the battery placed at the corner of the circuit board. The battery is capable of supplying 3-4 V.
The SIMULINK card's output board generates a 5 V signal on the corresponding line whenever MATLAB detects a city. From there the signal passes through the current amplifier, which drives the plunger motor.
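The output-board behaviour described above can be modelled as a simple dispatch from the detected city to an output line. The city names and line assignments below are hypothetical placeholders; the actual wiring order is not specified in this report:

```python
# Hypothetical city-to-line assignment (illustrative only):
CITY_TO_LINE = {"lahore": 0, "wah": 1, "islamabad": 2}

def card_outputs(detected_city):
    """Model of the output board: 5 V on the matched city's line, 0 V elsewhere."""
    lines = [0.0, 0.0, 0.0]
    idx = CITY_TO_LINE.get(detected_city.lower())
    if idx is not None:
        lines[idx] = 5.0
    return lines

print(card_outputs("Wah"))  # [0.0, 5.0, 0.0]
```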
CHAPTER # 4
DISCUSSION AND
CONCLUSION
This chapter discusses the merits and demerits of the Automatic Mail Sorting Machine. It captures the essence of the project and the techniques employed to complete it, and compares the project with others of its kind in a more subjective way.
The goal of the project was to develop an automated mail sorting machine that would sort letters far faster than manual sorting. Due to the unavailability of resources and our limited finances, however, we could only build a machine that operates at roughly the same speed as manual sorting. Our project is also only 70-80% operational: although the image processing part was achieved quite accurately and successfully in MATLAB, the interfacing presented some major loopholes in the automation.
Because of the motion blur that accompanies any image acquired by the camera while a letter moves on the conveyer belt, image processing on moving images could not be achieved, although we successfully implemented various image processing algorithms on still images.
Motion blur represented a major loophole in the automation. The blur could have been reduced somewhat with a high-resolution camera, but our financial limitations forced us to abandon the idea of buying such state-of-the-art hardware. The alternative was to reduce the blur in software using an image processing technique, and this is the route we took. Every motion-deblurring technique, however, requires knowledge of either the point spread function (PSF) or the noise function, and in our case both were unknown. Estimating the PSF is a tedious task, but we proceeded nonetheless and achieved roughly 30 percent success in this area.
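Had the PSF been known, a standard Wiener deconvolution would have applied. The following is a minimal sketch (not the code we ran), assuming a horizontal motion-blur PSF and using a constant k in place of the unknown noise-to-signal ratio that made deblurring so hard in practice:

```python
import numpy as np

def motion_psf(length, size):
    """Horizontal motion-blur PSF of the given pixel length in a size x size kernel."""
    psf = np.zeros((size, size))
    psf[size // 2, :length] = 1.0 / length
    return psf

def wiener_deblur(blurred, psf, k=0.01):
    """Frequency-domain Wiener filter: F_hat = conj(H) / (|H|^2 + k) * G.
    k stands in for the unknown noise-to-signal ratio."""
    H = np.fft.fft2(psf, s=blurred.shape)
    G = np.fft.fft2(blurred)
    F_hat = np.conj(H) / (np.abs(H) ** 2 + k) * G
    return np.real(np.fft.ifft2(F_hat))
```

With a correct PSF and low noise this recovers the image almost exactly; with a wrong PSF, as in our case, the restoration degrades quickly, which matches the partial success reported above.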
The very first task was to locate the address block in the acquired image. The code we wrote for this purpose is named Address Block Location (ABL). This was followed by the task of locating the city name within the extracted address block, another demanding task that took weeks to finish. The ABL code is followed by the OCR code, and it took several more weeks to interface the two accurately. Both modules now work fine and take at most 3 seconds to produce the output.
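The ABL step scans projection profiles of the binarized image with run-length counters (see the MATLAB code in the appendix). A distilled sketch of that search in Python, slightly hardened to require consecutive runs rather than cumulative counts:

```python
def find_block(profile, ink_run, gap_run, start=0):
    """Find the [first, last) extent of a text block along one axis.

    'profile' is a projection profile (per-column or per-row ink counts)
    of a binarized image. The block starts once ink_run consecutive inked
    entries are seen, and ends after gap_run consecutive empty ones --
    the same counter scheme as the appendix's ABL code.
    """
    ink = gap = 0
    first = last = None
    for i in range(start, len(profile)):
        if first is None:
            if profile[i] > 0:
                ink += 1
                if ink >= ink_run:
                    first = i - ink + 1  # block start identified
            else:
                ink = 0
        else:
            if profile[i] == 0:
                gap += 1
                if gap >= gap_run:
                    last = i - gap + 1   # block end identified
                    break
            else:
                gap = 0
    return first, last
```

Applied once to the column sums and once to the row sums (B = sum(A) and Z = sum(A,2) in the MATLAB listing), this yields the bounding box of the address block.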
The plunger motor drive presented yet another problem. After a great deal of experimentation and thought, we were left with only one option: using a window power motor. We designed its driving circuitry, since the motor requires 3-4 A of current to function properly. Driven by the output from SIMULINK, the motors now operate accurately, which represents another achievement.
Throughout our experimentation on the AMSM we achieved several milestones and got stuck in certain areas as well. We successfully located the address block in the acquired image and extracted the city name from that address. We then ran optical character recognition on the extracted city name and sorted the letters as required.
Image processing on moving images could not be fully achieved, although we made considerable progress in this domain.
Overall, we believe we have largely achieved the goal we set for ourselves at the beginning of the project.
APPENDIX
MATLAB CODE:
% OCR (Optical Character Recognition)
% Automatic Mail Sorting Machine (AMSM)
% This code receives the frames produced by the camera, performs
% optical character recognition and generates the corresponding signal for
% SIMULINK to drive the plunger mechanism.
% PRINCIPAL PROGRAM
%////////////////////////////////////////////////////////////////////
warning off %#ok<WNOFF>
% Clearing command window
clc
% Closing all the opened figures
close all
%//////////////////////////////////////////////////////////////////////
% Read image
% camera interfacing
vid1 = videoinput('winvideo',1,'YUY2_320x240') % for 320x240 video
set(vid1,'Returnedcolorspace','RGB')% for RGB image
preview(vid1)
snap=getsnapshot(vid1);
%imshow(snap)
imwrite(snap,'lahore.jpg')
imshow(snap)
%/////////////////////////////////////////////////////////////////////////
%IMAGE FILTERING...SMOOTHING.....
f=imread('wah.jpg');
f=rgb2gray(f);
[M,N]=size(f);
F=fft2(double(f));
u=0:(M-1);
v=0:(N-1);
idx=find(u>M/2);
u(idx)=u(idx)-M;
idy=find(v>N/2);
v(idy)=v(idy)-N;
[V,U]=meshgrid(v,u);
D=sqrt(U.^2+V.^2);
%H=double(D<=P); % ideal low-pass filter (currently disabled)
G=1.*F; % filter bypassed, so G = F
g=real(ifft2(double(G)));
%/////////////////////////////////////////////////////////////////////////
%imshow(f),figure,
%a=imshow(g,[0 255])
%figure;
%READING THE IMAGE.........................
imagen=imread('wah.jpg');
% Show image
imshow(imagen);
title('IMAGE TRANSFORMED INTO MATLAB')
%////////////////////////////////////////////////////////////////////////
% Converting the RGB image to gray scale to reduce processing
if size(imagen,3)==3 %RGB image
imagen=rgb2gray(imagen);
end
%////////////////////////////////////////////////////////////////////////
% Convert to BW
% First we determine the level of threshold
threshold = graythresh(imagen);
% Now we convert the image to binary format
imagen =~im2bw(imagen,threshold);
%/////////////////////////////////////////////////////////////////////////
% Address Block Location software
% Clear variables and functions from memory.
clear all
clc
% initialising the counter variables to zero for later use
k=0;
u=0;
e=0;
f=0;
% Now reading a grayscale or color image from the file specified by the string FILENAME
A=imread('addd.jpg');
% RGB2GRAY converts RGB images to grayscale by eliminating the hue and
% saturation information while retaining the luminance
A=rgb2gray(A);
% %computing a global threshold (LEVEL) that can be used to convert an
% %intensity image to a binary image with IM2BW.
threshold = graythresh(A);
% %image converted to binary and negated
A=~im2bw(A,threshold);
% X=[0 -.25 0;-.25 1 -.25;0 -.25 0]
% C = convn(A,X)
% C=~C;
% calculating the column-wise sum, giving a row vector
B=sum(A)
% loop for the start of the address block horizontally
for m=1:640
%checking the black pixels
if (B(m)>1)
e=e+1
if(e>25)
C1=m-e
% start of the address block identified
break
end
end
end
% loop for the end of the address block horizontally
for i=C1:640
if(B(i)<1)
k=k+1
if(k>42)
C2=i
% end of the address block identified
break
end
end
end
% now calculating the row-wise sum to locate the address block vertically
Z=sum(A,2)
for l=35:480
if(Z(l)>1)
f=f+1
if(f>15)
r1=l-f
% start of the address block identified vertically
break
end
end
end
for s=r1:480
if(Z(s)<1)
u=u+1
if(u>20)
r2=s
% end of the address block identified vertically
break
end
end
end
r1=r1+55
r2=r2-1
C1=C1-1
C2=C2-1
INTERNET RESOURCES:
www.wikipedia.org
www.owlnet.rice.edu
www.pages.drexel.edu
homepages.inf.ed.ac.uk/rbf/HIPR2/sobel.htm
www.trcelectronics.com
powerelectronics.com
www.electronics-lab.com/projects
OTHER SOURCES:
Power Point presentation by Giorgos Vamvakas titled “Optical Character Recognition for
Handwritten Characters” from National Center for Scientific Research “Demokritos” Athens
– Greece.