NUMBER
PLATE
EXTRACTION
CONTENTS

ABSTRACT
CHAPTER-1
1. INTRODUCTION
CHAPTER-2
2. BUSINESS PROBLEM AND SOLUTIONS
2.1 Existing System
2.2 Proposed Solution
2.3 System Requirements
2.4 Functional Requirements
2.5 Non-Functional Requirements
2.6 User Requirements
2.7 Input and Output Requirements
2.8 System Architecture
CHAPTER-3
3. LITERATURE SURVEY
3.1 Machine Learning
3.2 Types of Machine Learning
3.3 Testing and Validating
3.4 Methodology
3.5 Implementation
3.5.1 CV2 for Image Classification
3.5.2 Pytesseract
3.5.3 Numpy
3.6 PIL
CHAPTER-4
4. DESIGN
4.1 Introduction to UML
4.2 UML Diagrams
4.2.1 Use Case Diagram
4.2.2 Class Diagram
4.2.3 Architecture Diagram
IMPLEMENTATION
Output Images
CHAPTER-5
5. CONCLUSION AND FUTURE ENHANCEMENTS
REFERENCES
ABSTRACT
Automatic Number Plate Extraction is an image processing technology used to identify the
license plates of vehicles. The main objective is to design a machine learning model that can
extract the vehicle number from a given image. This technology is mainly useful to the Police
department: it can detect traffic violators, track vehicle thefts, and authenticate the owner of a
vehicle. It can also be used by private and government organizations at entry and exit points
for security control. The system first detects the vehicle and then extracts the number plate,
which is then checked against the vehicle database so that the required information about the
vehicle and its owner can be obtained.
CHAPTER-1
INTRODUCTION
Most number plate localization algorithms merge several procedures, resulting in long
computational time; this may be reduced by applying fewer and simpler algorithms. The
results are highly dependent on image quality, since the reliability of the procedures degrades
severely for complex, noisy pictures that contain a lot of detail. Unfortunately, the various
procedures barely offer a remedy for this problem; precise camera adjustment is the only
solution. This means the car must be photographed so that the environment is excluded as
much as possible and the number plate appears as large as possible. Adjusting the size is
especially difficult for fast cars, since the optimum moment of exposure can hardly be
guaranteed.
Number Plate Localization on the Basis of Edge Finding: these algorithms rely on the
observation that number plates usually appear as high-contrast areas in the image
(black-and-white or black-and-yellow).
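Edge-based localization can be illustrated in miniature. The sketch below is a simplified, assumption-laden illustration (a real system would use 2-D gradients such as Sobel filters on a full image): it scores each horizontal band of a toy grayscale image by counting strong intensity transitions, and picks the band with the most edges as the plate candidate.

```python
def edge_density(row, thresh=60):
    """Count strong intensity transitions (edges) in one image row."""
    return sum(1 for a, b in zip(row, row[1:]) if abs(a - b) > thresh)

def best_band(image, window=3):
    """Return the start index of the horizontal band with the highest
    edge density -- number plates show up as high-contrast bands."""
    scores = [sum(edge_density(r) for r in image[i:i + window])
              for i in range(len(image) - window + 1)]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy image: flat background rows plus a high-contrast "plate" band.
flat = [100] * 20
plate = [0, 255] * 10
image = [flat, flat, plate, plate, plate, flat, flat]
print(best_band(image))  # the band starting at row 2 wins
```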
CHAPTER-2
BUSINESS PROBLEM AND SOLUTIONS
2.1 Existing System
The present system is a brute-force method in which a Traffic Constable manually notes down
the license plates, which is a time-consuming process, and extra maintenance is required to
safeguard the data.
Since this is human work, it may also lead to manual errors while noting down the
registration number from license plates.
We cannot even provide an image as proof when required to prove a traffic violator guilty.
2.2 Proposed Solution
Our system takes an image as input and processes it to extract the license plate by character
segmentation using an SVM; the output is a string of 10 characters.
Capacity: the ability of the system to hold a large amount of input data.
A dataset containing up to 20 pictures, each about 4x4 cm, with a good resolution of at
least 420 px.
The standard output is 10 characters, which can then be used to obtain details of the
vehicle.
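A minimal sketch of how these input/output requirements could be enforced before and after processing. The function names, the interpretation of "420 px" as a minimum for both dimensions, and the alphanumeric check are assumptions for illustration, not part of any existing codebase.

```python
MIN_SIDE = 420   # minimum acceptable resolution, per the input requirements
PLATE_LEN = 10   # standard output length, per the output requirements

def valid_input(width, height):
    """Reject images below the required resolution before processing."""
    return width >= MIN_SIDE and height >= MIN_SIDE

def valid_output(plate):
    """The extracted plate should be exactly 10 alphanumeric characters."""
    return len(plate) == PLATE_LEN and plate.isalnum()

print(valid_input(640, 480), valid_output("TS09EA1234"))
```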
Step 3 : Extract the feature vector of each normalized candidate.

2.8 System Architecture
Fig: Camera (input image) -> License Plate Extraction -> Character Segmentation -> OCR
Character Recognition -> Output Text
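The pipeline above can be sketched as a chain of functions. The bodies here are trivial stand-ins for the real computer vision stages described in Chapter 3; only the wiring between the stages is the point.

```python
def extract_plate(camera_image):
    """Stand-in for license plate extraction (crop the plate region)."""
    return camera_image["plate"]

def segment_chars(plate):
    """Stand-in for character segmentation (split into single characters)."""
    return list(plate)

def recognize(chars):
    """Stand-in for OCR character recognition."""
    return "".join(chars)

def anpr_pipeline(camera_image):
    """Camera image -> plate extraction -> segmentation -> OCR -> text."""
    return recognize(segment_chars(extract_plate(camera_image)))

print(anpr_pipeline({"plate": "TS09EA1234"}))
```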
CHAPTER-3
LITERATURE SURVEY
3.1 Machine Learning
Machine Learning is the science (and art) of programming computers so they can learn from
data. Here is a slightly more general definition: [Machine Learning is the] field of study that
gives computers the ability to learn without being explicitly programmed.
For example, the spam filter is a Machine Learning program that can learn to flag spam
given examples of spam emails (e.g., flagged by users) and examples of regular (non-spam,
also called "ham") emails. The examples that the system uses to learn are called the training
set. Each training example is called a training instance (or sample). In this case, the task T is
to flag spam for new emails, the experience E is the training data and the performance
measure P needs to be defined; for example, the ratio of correctly classified emails can be
used. This particular performance measure is called accuracy and it is often used in
classification tasks.
1. First look at what spam typically looks like. It can be noticed that some words or phrases
(such as "4U," "credit card," "free," and "amazing") tend to come up a lot in the subject line.
A few other patterns can be noticed in the sender's name and the email's body.
2. Write a detection algorithm for each of the patterns identified; the program flags emails as
spam if a number of these patterns are detected.
3. Test the program, and repeat steps 1 and 2 until it is good enough.
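Steps 1 and 2 can be sketched as a tiny rule-based filter; the pattern list and the two-hit threshold are illustrative assumptions.

```python
# Patterns identified by hand in step 1 (illustrative list).
SPAM_PATTERNS = ("4U", "credit card", "free", "amazing")

def flag_spam(subject, min_hits=2):
    """Step 2: flag an email when enough known patterns appear."""
    s = subject.lower()
    hits = sum(p.lower() in s for p in SPAM_PATTERNS)
    return hits >= min_hits

print(flag_spam("Amazing offer: FREE credit card"))  # several patterns match
print(flag_spam("Minutes of the weekly meeting"))    # no patterns match
```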
Since the problem is not trivial, the program will likely become a long list of complex rules -
pretty hard to maintain. In contrast, a spam filter based on Machine Learning techniques
automatically learns which words and phrases are good predictors of spam by detecting
unusually frequent patterns of words in the spam examples compared to the ham examples
(Figure 1-2). The program is much shorter, easier to maintain, and most likely more accurate.
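A minimal sketch of the learning-based alternative: count word frequencies in spam and ham examples and keep the words that are unusually frequent in spam. This is an assumption-level toy (whitespace tokens, an arbitrary frequency ratio, and spam-only words always qualifying), not a real spam filter.

```python
from collections import Counter

def learn_spam_words(spam, ham, ratio=2.0):
    """Learn which words are unusually frequent in spam vs. ham.

    A word qualifies when it appears more than `ratio` times as often
    in spam as in ham (so spam-only words always qualify).
    """
    spam_counts = Counter(w for msg in spam for w in msg.lower().split())
    ham_counts = Counter(w for msg in ham for w in msg.lower().split())
    return {w for w, c in spam_counts.items() if c > ratio * ham_counts.get(w, 0)}

words = learn_spam_words(["free credit card", "free offer now"],
                         ["credit report", "offer for team"])
print(sorted(words))  # "free" is learned; "credit" and "offer" are not
```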
3.2 Types of Machine Learning
Machine Learning systems can be classified according to:
1. Whether or not they are trained with human supervision (supervised, unsupervised, semi-
supervised, and Reinforcement Learning)
2. Whether or not they can learn incrementally on the fly (online versus batch learning)
3. Whether they work by simply comparing new data points to known data points, or instead
detect patterns in the training data and build a predictive model, much like scientists do
(instance-based versus model-based learning).
3.3 Testing and Validating
The only way to know how well a model will generalize to new cases is to actually try it out
on new cases. One way to do that is to put the model in production and monitor how well it
performs. This works well, but if the model is horribly bad, users will complain - not the best
idea.
A better option is to split the data into two sets: the training set and the test set. As these
names imply, train the model using the training set, and test it using the test set. The error rate
on new cases is called the generalization error (or out-of-sample error), and by evaluating the
model on the test set, an estimation of this error can be obtained. This value tells how well the
model will perform on instances it has never seen before.
If the training error is low (i.e., the model makes few mistakes on the training set) but the
generalization error is high, it means that the model is overfitting the training data.
So evaluating a model is simple enough: just use a test set. But how to decide between two
models? One option is to train both and compare how well they generalize using the test set.
Now suppose the linear model generalizes better, but some regularization must be applied to
avoid overfitting. The question is: how to choose the value of the regularization
hyperparameter? One option is to train 100 different models using 100 different values for
this hyperparameter. Suppose the best hyperparameter value produces the model with the
lowest generalization error, say just 5% error, and this model is launched into production;
unfortunately it does not perform as well as expected and produces 15% errors.
What just happened? The problem is that the generalization error was measured multiple
times on the test set, and the model and hyperparameters were adapted to produce the best
model for that particular set. This means the model is unlikely to perform as well on new
data. A common solution to this problem is to have a second holdout set called the validation
set. Train multiple models with various hyperparameters using the training set, select the
model and hyperparameters that perform best on the validation set, and run a single final test
against the test set to get an estimate of the generalization error. To avoid "wasting" too much
training data in validation sets, a common technique is to use cross-validation: the training
set is split into complementary subsets, and each model is trained against a different
combination of these subsets and validated against the remaining parts. Once the model type
and hyperparameters have been selected, a final model is trained using these hyperparameters
on the full training set, and the generalization error is measured on the test set.
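The holdout scheme described above can be sketched as a single split function; the 20%/20% ratios and the fixed seed are arbitrary illustrative choices.

```python
import random

def split_data(data, test_ratio=0.2, val_ratio=0.2, seed=42):
    """Shuffle once, then carve off a test set and a validation set;
    the remainder is the training set."""
    data = list(data)
    random.Random(seed).shuffle(data)  # fixed seed for reproducibility
    n_test = int(len(data) * test_ratio)
    n_val = int(len(data) * val_ratio)
    return (data[n_test + n_val:],        # training set
            data[n_test:n_test + n_val],  # validation set
            data[:n_test])                # test set

train, val, test = split_data(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```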
Prepare the Data for Machine Learning Algorithms:
It is time to prepare the data for Machine Learning algorithms. Instead of doing this
manually, functions should be written to do it, for several good reasons:
1. This allows the transformations to be reproduced easily on any dataset (e.g., the next time
a fresh dataset is obtained).
2. Gradually, a library of transformation functions is built up that can be reused in future
projects.
3. These functions can be used in the live system to transform new data before feeding it to
the algorithms.
4. This makes it possible to easily try various transformations and see which combination of
transformations works best.
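A minimal sketch of such reusable transformation functions; the min-max scaler is just one illustrative transformation, and the `prepare` helper is an assumed name, not an existing API.

```python
def scale_minmax(values):
    """One reusable transformation: rescale features to the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def prepare(dataset, transforms):
    """Apply the same sequence of transformations to any dataset,
    so the pipeline can be reused on fresh data or in a live system."""
    for transform in transforms:
        dataset = transform(dataset)
    return dataset

print(prepare([2, 4, 6], [scale_minmax]))  # [0.0, 0.5, 1.0]
```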
3.4 Methodology
Image processing techniques have to cope with increasingly complex problems and must
adapt to human vision. Since vision is complex, machine learning has emerged as a key
component of intelligent computer vision programs when adaptation is needed (e.g., face
recognition). With the advent of image datasets and benchmarks, machine learning and
image processing have recently received a lot of attention.
The primary purpose of this special issue is to increase the awareness of image processing
researchers to the impact of machine learning algorithms. The special issue discusses
problems and their proposed solutions currently under research by the community.
B) Image Recognition:
An image recognition algorithm takes an image as input and outputs what the image
contains. In other words, the output is a class label. An image recognition algorithm knows
the contents of an image only by being trained to learn the differences between classes. To
find number plates in images, an image recognition algorithm should be trained with
thousands of images of number plates and thousands of background images that do not
contain number plates. Needless to say, such an algorithm can only understand the
objects/classes it has learned.
Image processing steps:
Original image -> Top-hat image -> Threshold image -> Box image -> Dilate image ->
Segmented image
3.5 Implementation
Data Collection:
Number plate recognition starts with the acquisition of images from an image source,
desirably a surveillance camera. The image acquisition technique determines the quality of
the captured number plate image with which the detection algorithm has to work: the better
the quality of the acquired images, the higher the accuracy. Preprocessing prepares the image
for better feature extraction; it can be considered the stage that gets the vehicle image ready
for pattern recognition and image processing. The choice of preprocessing strategy to be
applied to a vehicle image depends upon the sort of application for which the image is being
used.
Data Pre-Processing:
Data Integration : Data with different representations are put together and conflicts within
the data are resolved.
Data Reduction : This step aims to present a reduced representation of the data in a data
warehouse.
Pre-Processing : Pre-processing is the first step in number plate recognition. It consists of the
following major stages: 1. Binarization, 2. Noise Removal.
Binarization : The input image is initially processed to improve its quality and prepare it for
the next stages of the system. First, the system converts RGB images to gray-level images.
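Binarization can be sketched without any imaging library: convert each RGB pixel to a gray level, then threshold it. The luminosity weights and the threshold of 128 are conventional choices for illustration, not values taken from this project.

```python
def to_gray(pixel):
    """Luminosity conversion from an (R, G, B) pixel to one gray level."""
    r, g, b = pixel
    return int(0.299 * r + 0.587 * g + 0.114 * b)

def binarize(rgb_image, thresh=128):
    """RGB -> gray -> binary: bright pixels become 255, dark become 0."""
    return [[255 if to_gray(p) >= thresh else 0 for p in row]
            for row in rgb_image]

# One row: a light pixel and a dark pixel.
print(binarize([[(200, 200, 200), (10, 10, 10)]]))  # [[255, 0]]
```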
Noise Removal : In this stage, noise is removed from the image while preserving its
sharpness. After successful localization of the number plate, we proceed with Optical
Character Recognition, which involves segmentation, feature extraction and number plate
recognition.
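One common way to remove noise while preserving sharpness is a median filter; the 1-D sketch below shows the idea on a single pixel row (a real implementation, such as OpenCV's `cv2.medianBlur`, operates on 2-D images).

```python
def median_filter(row, k=3):
    """Replace each pixel with the median of its k-neighbourhood --
    this removes salt-and-pepper noise without blurring edges much."""
    half = k // 2
    out = []
    for i in range(len(row)):
        window = sorted(row[max(0, i - half):i + half + 1])
        out.append(window[len(window) // 2])
    return out

# A single noisy spike (255) in a flat row is removed entirely.
print(median_filter([10, 10, 255, 10, 10]))  # [10, 10, 10, 10, 10]
```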
Accuracy Measures:
Accuracy is a common yet interesting concept used in project management. It is defined as
how close the measured values are to the intended target (value) or, simply, an assessment of
correctness. Thus, if the measurements are accurate, the values are close to the target.
Accuracy is often mistaken for precision. It is important to note that while accuracy is being
close to the target value, precision refers to repeated measurements that are clustered together
but not necessarily close to the target value. If there is little scatter in the values, they are
considered precise even if they are far from the desired target. While accurate data can be
precise, the two do not mean the same thing, or the other way around.
Precision:
There are two measurement systems used in project management - accuracy and precision.
While accuracy is defined as the closeness of the measured value to a known standard,
precision is defined as the measure of exactness. The level of the increment used in every
interval determines how precise the measurements are; precision is achieved with a greater
number of increments.
Risk Data Quality Assessment : The risk data quality assessment is a project management
technique used to evaluate the degree to which data about risks is useful for risk
management. This technique also involves analyzing the degree to which the risk is
understood, and it looks into the accuracy, reliability, quality and integrity of the data
concerning the risk.
3.5.1 CV2 for Image Classification
OpenCV (imported in Python as cv2) is an open-source computer vision library.
The library has more than five hundred optimized algorithms. It is used around the world,
with forty thousand people in the user group. Uses range from interactive art, to mine
inspection, and advanced robotics.
The library is mainly written in C/C++, which makes it portable to specific platforms such
as digital signal processors. Wrappers for languages such as Python, Ruby and Java (via
JavaCV) have been developed to encourage adoption by a wider audience.
The recent releases have interfaces for C++. It focuses mainly on real-time image processing.
OpenCV is a cross-platform library, which can run on Linux, Mac OS and Windows. To date,
OpenCV is the best open source computer vision library that developers and researchers can
think of.
With OpenCV 3.3, we can utilize pre-trained networks with popular deep learning
frameworks. The fact that they are pre-trained implies that we don’t need to spend many
hours training the network — rather we can complete a forward pass and utilize the output to
make a decision within our application.
OpenCV does not (and does not intend to be) to be a tool for training networks — there are
already great frameworks available for that purpose. Since a network (such as a CNN) can be
used as a classifier, it makes logical sense that OpenCV has a Deep Learning module that we
can leverage easily within the OpenCV ecosystem.
Aleksandr Rybnikov, the main contributor for this module, has ambitious plans for this
module so be sure to stay on the lookout and read his release notes (in Russian, so make sure
you have Google Translation enabled in your browser if Russian is not your native language).
It’s my opinion that the dnn module will have a big impact on the OpenCV community, so
let’s get the word out.
Configure your machine with OpenCV 3.3
Installing OpenCV 3.3 is on par with installing other versions. The same install tutorials can
be utilized — just make sure you download and use the correct release.
Simply follow these instructions for macOS or Ubuntu, making sure to use
the opencv and opencv_contrib releases for OpenCV 3.3. If you opt for the macOS +
homebrew install instructions, be sure to use the --HEAD switch (among the others
mentioned) to get the bleeding-edge version of OpenCV.
If you’re using virtual environments (highly recommended), you can easily install OpenCV
3.3 alongside a previous version. Just create a brand new virtual environment (and name it
appropriately) as you follow the tutorial corresponding to your system.
Keras is currently not supported (since Keras is actually a wrapper around backends such as
TensorFlow and Theano), although I imagine it’s only a matter of time until Keras is directly
supported given the popularity of the deep learning library.
3.5.2 Pytesseract
Python-tesseract is an optical character recognition (OCR) tool for Python. That is, it will
recognize and "read" the text embedded in images.
3.5.3 Numpy
Numpy is the core library for scientific computing in Python. It provides a high-performance
multidimensional array object, and tools for working with these arrays.
Arrays
A numpy array is a grid of values, all of the same type, and is indexed by a tuple of
nonnegative integers. The number of dimensions is the rank of the array; the shape of an
array is a tuple of integers giving the size of the array along each dimension.
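For example (assuming NumPy is installed), the rank and shape of an array can be inspected directly:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])  # a rank-2 array of integers

print(a.ndim)   # number of dimensions (the rank): 2
print(a.shape)  # size along each dimension: (2, 3)
print(a.dtype)  # all elements share one type
```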
3.6 PIL
The Python Imaging Library supports a wide variety of image file formats. To read files from
disk, use the open() function in the Image module. You don’t have to know the file format to
open a file. The library automatically determines the format based on the contents of the file.
To save a file, use the save() method of the Image class. When saving files, the name
becomes important. Unless you specify the format, the library uses the filename extension to
discover which file storage format to use.
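A short sketch of this open/save behaviour (assuming Pillow is installed), using an in-memory buffer instead of a disk file: since a buffer has no filename extension, the format must be given explicitly on save, but is auto-detected from the file contents on open.

```python
import io
from PIL import Image

img = Image.new("RGB", (64, 32), color=(255, 255, 255))  # a blank test image

buf = io.BytesIO()
img.save(buf, format="PNG")  # no filename, so the format is stated explicitly
buf.seek(0)

reopened = Image.open(buf)   # format detected from the contents, not a name
print(reopened.format, reopened.size)
```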
CHAPTER-4
DESIGN
4.1 Introduction to UML
The Unified Modeling Language allows the software engineer to express an analysis model
using a modeling notation that is governed by a set of syntactic, semantic and pragmatic
rules.
A UML system is represented using five different views that describe the system from
distinctly different perspectives. Each view is defined by a set of diagrams. For example, the
analysis representation describes a usage scenario from the end-users' perspective.
In this view, the structural and behavioral parts of the system are represented as they are to
be built.
4.2.1 Use Case Diagram
Use case diagrams are used to gather the requirements of a system, including internal and
external influences. These requirements are mostly design requirements. So, when a system
is analysed to gather its functionalities, use cases are prepared and actors are identified.
Fig: Use Case Diagram
Our software system can be used to support a library environment by creating a Digital
Library, where several license plate images are converted into electronic form for access by
users. For this purpose, the printed plates must be recognized before they are converted into
electronic form. The resulting electronic documents are accessed by users such as the police
and the general public for reading and retrieving information.
4.2.2 Class Diagram
The class diagram describes the attributes and operations of a class and also the constraints
imposed on the system. Class diagrams are widely used in the modeling of object-oriented
systems because they are the only UML diagrams that can be mapped directly to object-
oriented languages. The class diagram shows a collection of classes, interfaces, associations,
collaborations and constraints. It is also known as a structural diagram.
UML diagrams like the activity diagram and sequence diagram can only give the sequence
flow of the application, but the class diagram is a bit different, which makes it the most
popular UML diagram in the coder community.
The class diagram gives a clear picture of all the processes involved in the background to
carry out the recognition process. It shows all the classes that operate in the background and
gives a clear view of how they relate to one another to help recognize the characters on the
plates. The class diagram contains all the attributes involved in each class or method. It also
gives a clear idea of the entire processing of the image - how the image is processed in order
to recognize the characters.
4.2.3 Architecture Diagram
Software architecture concerns the high-level structure of a software system's abstraction,
using decomposition and composition, with architectural styles and quality attributes. A
software architecture design must conform to the major functionality and performance
requirements of the system, as well as satisfy non-functional requirements such as reliability,
scalability, portability, and availability.
IMPLEMENTATION
Code Snippets:
Import Statements:-
import os
from PIL import Image

Re-sizing Images:-

outPath = r"N:\ML\ANPR\RDS"  # Resized Images Folder
inPath = r"N:\ML\ANPR\RAW"   # source images folder (assumed name)
for name in os.listdir(inPath):  # resize every image to an assumed fixed size
    Image.open(os.path.join(inPath, name)).resize((640, 480)).save(os.path.join(outPath, name))
Scaling:-
Output:
Fig 2 : Processed Image
Fig 4 : Input Image
Fig 6 : Output Extracted Number Plate
Performance measures:
Many metrics can be used to measure whether or not a program is learning to perform its task
more effectively. For supervised learning problems, many performance metrics measure the
number of prediction errors. There are two fundamental causes of prediction error: a model's
bias and its variance. Assume that there are many training sets that are all unique, but equally
representative of the population. A model with a high bias will produce similar errors for an
input regardless of the training set it was trained with; the model biases its own assumptions
about the real relationship over the relationship demonstrated in the training data. A model
with high variance, conversely, will produce different errors for an input depending on the
training set that it was trained with. A model with high bias is inflexible, but a model with
high variance may be so flexible that it models the noise in the training set. That is, a model
with high variance overfits the training data, while a model with high bias underfits the
training data. It can be helpful to visualize bias and variance as darts thrown at a dartboard.
Each dart
is analogous to a prediction from a different dataset. A model with high bias but low variance
will throw darts that are far from the bull's eye, but tightly clustered. A model with high bias
and high variance will throw darts all over the board; the darts are far from the bull's eye and
each other.
Limitations:
The limitations of the study are those characteristics of design or methodology that impacted
or influenced the interpretation of the findings from your research. They are the constraints
on generalizability, applications to practice, and/or utility of findings that are the result of the
ways in which you initially chose to design the study and/or the method used to establish
internal and external validity.
* Any blurred image that cannot be properly gray-scaled prevents detection of the license plate.
* Low-resolution images cannot be resized, because resizing reduces the resolution further.
CHAPTER-5
CONCLUSION AND FUTURE ENHANCEMENTS
This project presents a recognition method in which the vehicle plate image is obtained by a
digital camera and processed to get the number plate information. A rear image of a vehicle
is captured and processed using various algorithms. Further, we plan to study the
characteristics of the automatic number plate system for better performance.
ANPR can be further exploited for vehicle owner identification, vehicle model identification,
traffic control, vehicle speed control and vehicle location tracking.
Most ANPR systems focus on processing one vehicle number plate, but in real time there can
be more than one vehicle number plate in the captured images.
In [5], multiple vehicle number plate images are considered for ANPR, while most other
systems take offline vehicle images from an online database such as [78] as input, so the
exact results may deviate from the results shown in Table 1 and Table 2.
To segment multiple vehicle number plates, a coarse-to-fine strategy could be helpful,
together with the addition of more features.
References
1) https://www.semanticscholar.org/paper/Survey-on-Automatic-Number-Plate-Recognition-(ANR)-Sonavane-Soni/044beb18e386503aa5551fa73169ea432062f1b5
2) https://www.irjet.net/archives/V4/i4/IRJET
3) https://ieeexplore.ieee.org/abstract/document
4) http://ictactjournals.in/paper/IJIVP