VEXO

A PROJECT REPORT
SUBMITTED IN PARTIAL FULFILLMENT OF THE
REQUIREMENTS FOR THE DEGREE OF
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

GOVERNMENT ENGINEERING COLLEGE, WAYANAD
NOVEMBER 2017
HANDWRITING RECOGNITION TOOL
Project Name: VEXO - HANDWRITING RECOGNITION TOOL
Project Members:
This project is done in a group of four people. Project members are:
1. ATHUL K S
2. PRANAV M
3. SHINO C S
4. VALLIVALAPPIL GOKUL
We express our sincere thanks to our
principal, Dr. Abdul Hameed K. M.
We are equally grateful to our guides, Miss Amaya Anna
Joy and Amal of the Computer Science department, for their
invaluable guidance, suggestions and supervision.
We would also like to thank our Head of Department,
Dr. Anitha, Mechanical Department, for her expert
guidance.
We owe sincere thanks to all faculty members of the
Computer Science department and to our well-wishers for
their encouragement and moral support.
This is to certify that this project report on the topic
“VEXO” is a bona fide work of ATHUL K S (WYD15CS016),
PRANAV M (WYD15CS047), SHINO C S (WYD15CS057)
and VALLIVALAPPIL GOKUL (WYD15CS060).
Signature
Dr. Anitha
HEAD OF DEPARTMENT
DEPARTMENT OF COMPUTER SCIENCE ENGINEERING
GEC WAYANAD
Vexo – Handwriting Recognition
ABSTRACT
INTRODUCTION
RELATED WORKS
DESIGN THINKING FRAMEWORK
• Identifying user and user behavior
• Customer experience
PROBLEM STATEMENT
RESEARCH TO CONCEPT
DESIGN HEURISTICS
PROTOTYPE OF PROPOSAL
CONCLUSION
REFERENCES
ABSTRACT
The aim of this work is to review existing methods for the
handwritten character recognition problem using machine
learning algorithms and to implement one of them in a user-friendly
Android application. The application addresses four main tasks:
handwriting recognition based on touch input, handwriting
recognition from live camera frames or a picture file, learning new
characters, and learning interactively from the user's feedback.
The recognition model we have chosen is a multilayer perceptron, a
feedforward artificial neural network, selected mainly for its high
performance on nonlinearly separable problems. It has also proved
powerful in OCR and ICR systems, which could be seen as a further
extension of this work. We evaluated the perceptron's performance
and configured its parameters in the GNU Octave programming
language, after which we implemented the Android application using
the same perceptron architecture, learning parameters and
optimization algorithms. The application was then tested on a
training set consisting of digits, with the ability to learn
alphabetical or other characters.
INTRODUCTION
Handwritten character recognition is a field of research in
artificial intelligence, computer vision, and pattern
recognition. A computer performing handwriting
recognition is said to be able to acquire and detect
characters in paper documents, pictures, touch-screen
devices and other sources and convert them into
machine-encoded form. Its application is found in optical
character recognition (OCR) and more advanced intelligent
character recognition (ICR) systems. Most of these systems
nowadays implement machine learning mechanisms such as
neural networks.

Machine learning is a branch of artificial intelligence
inspired by psychology and biology that deals with learning
from a set of data and can be applied to solve a wide
spectrum of problems. A supervised machine learning model
is given instances of data specific to a problem domain and
an answer that solves the problem for each instance. When
learning is complete, the model is able to provide answers
with high precision not only for the data it has learned on,
but also for yet unseen data.

Neural networks are learning models used in machine
learning. Their aim is to simulate the learning process that
occurs in an animal or human neural system. Being one of
the most powerful learning models, they are useful in the
automation of tasks where the decision of a human being
takes too long or is imprecise. A neural network can be
very fast at delivering results and may detect connections
between seen instances of data that a human cannot see.

We have decided to implement a neural network in an
Android application that recognizes characters written by
hand on the device's touch screen or extracted from the
camera and from images provided by the device. Using the
knowledge explained in this text, we implemented the
neural network at a low level, without using libraries that
already facilitate the process. By doing this, we evaluate
the performance of neural networks on the given problem
and provide source code for the network that can be used
to solve many different classification problems. The
resulting system is a subset of a complex OCR or ICR
system; these are seen as possible future extensions of
this work.

For an overview, we briefly explain the specific algorithms
that have been used in the implementation of the project.
We also explain the chosen neural network model and its
algorithms at a low level. Then we discuss the design
choices made before implementing the Android application.
The requirements of the application are specified and the
plan of the solution is laid out. The implementation chapter
describes how the requirements have been satisfied, what
problems arose and how they were solved, and also serves
as a technical guide for the application's users. The
structure of the source code is also depicted. Finally, in the
conclusion, we talk about the accomplishments of this work
and state how the application can be used in other projects
or extended to a more complex system.
RELATED WORKS
iSkysoft is an OCR tool for PDF files. It can automatically
recognize a scanned PDF and make it editable with built-in
editing tools, and it provides several OCR languages. It also
lets you edit PDF text, images, links and other elements, and
convert PDF files to other formats. It is available for the Mac
and Windows platforms. iSkysoft is less expensive than other
available software, but it does not have a free version.

TopOCR is designed to be simple and user-friendly for scanning
books and magazines with document cameras and scanners. It
combines a full-featured image editor and word processor with
advanced multi-core image processing and three different OCR
engines. For document cameras, it also has a single-click
real-time document camera image preview and capture dialog
that makes it easy to position documents properly for scanning.
It supports 60+ regional languages, but only an online version
of this software is currently available.

MyScript is the market leader in accurate, high-performance
handwriting recognition and digital ink management software
technology. MyScript technology combines digital ink
management with easy searching of handwritten text, as well
as the accurate recognition of complex mathematical
equations, geometric shapes, diagrams and music notation.
Although it is available for mobile and desktop devices, it is
expensive software. MyScript uses time-ordered digital ink
stroke input for conversion to digital form.

Google Handwriting Input is handwriting recognition software
developed by Google that works on touch input devices. It is
designed primarily for Android smartphones. Google
Handwriting Input is an ICR handwriting recognition system.
DESIGN THINKING FRAMEWORK
Identifying user and user behavior
We conducted an online survey (created using Google
Forms) to identify our users. From our rough initial estimate
we could identify only 5 personas, but the detailed survey
of 23 participants let us identify 4 more personas for
our project. According to the survey report, 15 people are
interested in having handwriting recognition software to
make their work faster and easier. We have marked these 15
people as our user personas.
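[Figure: pie chart "Users", showing survey respondents by occupation.
Categories: Pharmaceutist, Novelist, Police officers, Data entry workers,
Journalists, Students, Teachers, Bank Employees, others.
Slice values as extracted: 6%, 1%, 4%, 6%, 7%, 22%, 22%, 17%, 15%.]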
Customer experience
The survey report states that a few people have already been
using handwriting recognition software, and we recorded
their experience with it. Users' experience with the current
software that was meant to make their job easier was not
good enough.
The major problems with the software they currently use
were the following:
➜ Not available in regional languages
➜ The software is sometimes unable to recognize handwriting
that is hard to read
PROBLEM STATEMENT
• Most of the available applications are expensive
• Free software has poor performance
• Most of the available software does not support
regional languages
RESEARCH TO CONCEPT
The algorithms used in character recognition may be
divided into three categories: image preprocessing,
feature extraction, and classification.
Image Preprocessing
Image preprocessing is crucial in the recognition pipeline
for correct character prediction. These methods typically
include noise removal, image segmentation, cropping,
scaling, and more. In our project, these methods have
mainly been used when recognizing from an image, but
some of them, such as cropping the written character and
scaling it to our input size, are also performed in the touch
mode. Digital capture and conversion of an image often
introduces noise, which makes it hard to decide what is
actually a part of the object and what is not. For the
problem of character recognition, we want to remove
as much noise as possible while preserving the strokes of
the characters.
For this task we use convolutional masks that scan
an image, ideally removing all unwanted noise. Masks are
square matrices whose elements represent the weights of the
surrounding pixels that determine the light intensity
value of the pixel at hand.
The task of image segmentation is to split an image into
parts that correlate strongly with objects or real-world
properties that the image represents. Probably the simplest
image segmentation method is thresholding. Thresholding is
the extraction of the foreground, which is a character in our
case, from the rather monotonic background.
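The two steps above can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the code used in the application: the 3x3 weighted mask, the threshold level of 150, the border handling and the function names apply_mask and threshold are all chosen for the example.

```python
def apply_mask(image, mask):
    """Replace each pixel by the mask-weighted sum of its neighbourhood.

    image is a list of rows of grayscale values; mask is a square matrix of
    weights. Neighbours outside the image are clamped to the nearest edge pixel
    (an assumption made for this sketch).
    """
    height, width = len(image), len(image[0])
    radius = len(mask) // 2
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny = min(max(y + dy, 0), height - 1)
                    nx = min(max(x + dx, 0), width - 1)
                    total += mask[dy + radius][dx + radius] * image[ny][nx]
            result[y][x] = total
    return result


def threshold(image, level):
    """Return a binary image: 1 for dark (stroke) pixels, 0 for background."""
    return [[1 if pixel < level else 0 for pixel in row] for row in image]


# Illustrative smoothing mask: the centre pixel counts most, the weights sum to 1.
SMOOTHING_MASK = [[w / 16 for w in row] for row in ([1, 2, 1], [2, 4, 2], [1, 2, 1])]

# Light background (255), a dark vertical stroke in column 1,
# and a single dark noise pixel at row 2, column 4.
noisy = [
    [255, 0, 255, 255, 255, 255],
    [255, 0, 255, 255, 255, 255],
    [255, 0, 255, 255, 0, 255],
    [255, 0, 255, 255, 255, 255],
    [255, 0, 255, 255, 255, 255],
]

smoothed = apply_mask(noisy, SMOOTHING_MASK)
binary = threshold(smoothed, level=150)
for row in binary:
    print(row)
```

In this example the isolated noise pixel is averaged with its light neighbours and ends up above the threshold, so it is classified as background, while the stroke stays below the threshold and is kept as foreground.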
Feature Extraction
Features of input data are the measurable properties of
observations, which one uses to analyze or classify these
instances of data. The task of feature extraction is to
choose relevant features that discriminate the instances
well and are independent of each other. Selection of a
feature extraction method is probably the single most
important factor in achieving high recognition
performance. There are many methods for
feature extraction from character images, each having
different characteristics, invariance properties, and
reconstructability of characters.
To describe the way feature extraction is sometimes done
in handwriting recognition, we look at projection
histograms. Projection histograms were introduced in 1956
in an OCR system by Glauberman and are used in the
segmentation of characters, words, and text lines, or to
detect whether a scanned text page is rotated. We collect
the horizontal and vertical projections of an image by
setting each horizontal and vertical “bin” value to the
count of foreground pixels in the respective row or column;
neighboring bins can be merged to make the features scale
independent. The projection is, however, sensitive to
rotation and to variability in writing style. The two
histograms are then compared and a dissimilarity measure
is obtained, which can be used as a feature vector.
However, this takes a lot of time.
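A minimal Python sketch of projection histograms follows. It assumes the image has already been binarized (1 for stroke pixels, 0 for background). The function names, the bin_size parameter and the sum of absolute differences used as the dissimilarity measure are our own illustrative choices; the text above does not fix a particular measure.

```python
def projection_histograms(binary, bin_size=1):
    """Return the (horizontal, vertical) projections of a binary image.

    The horizontal projection counts stroke pixels per row, the vertical one
    per column. bin_size neighbouring rows or columns are merged into one bin.
    """
    row_counts = [sum(row) for row in binary]
    col_counts = [sum(col) for col in zip(*binary)]

    def merge(counts):
        return [sum(counts[i:i + bin_size]) for i in range(0, len(counts), bin_size)]

    return merge(row_counts), merge(col_counts)


def dissimilarity(hist_a, hist_b):
    """Sum of absolute bin differences between two histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


# A vertical stroke, roughly the digit "1".
one = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
]
# A horizontal bar, for comparison.
bar = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

rows_one, cols_one = projection_histograms(one)
rows_bar, cols_bar = projection_histograms(bar)
print(rows_one, cols_one)                 # [1, 1, 1, 1] [0, 0, 4, 0]
print(dissimilarity(rows_one, rows_bar))  # 6
print(projection_histograms(one, bin_size=2))  # ([2, 2], [0, 4])
```

The example also shows the rotation sensitivity mentioned above: the bar is the stroke rotated by 90 degrees, and its two projections are essentially swapped.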
So in our work, we have used the multilayer
perceptron neural network model, which will be
described in more detail later. For now, we can think of this
model as a directed graph consisting of at least 3 layers of
nodes. The first layer is called the input layer, the last layer
is the output layer, and the intermediate layers are
known as hidden layers. Except for the input layer, the nodes
of a neural network are also called neurons or units. Each
node of a layer typically has a weighted connection to the
nodes of the next layer. The hidden layers are important
for feature extraction, as they create an internal
abstraction of the data fed into the network. The more
hidden layers there are in a network, the more abstract the
extracted features are.
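The layered structure can be illustrated with a small forward pass in Python. This is a sketch under our own assumptions, not the network used in the application: the sigmoid activation, the 2-2-1 layer sizes and the hand-picked (untrained) weights are chosen only so that the example computes XOR, a simple nonlinearly separable problem.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, layers):
    """Propagate the inputs through the network, one layer after another.

    Each layer is a pair (weights, biases); weights[j][i] is the weight of the
    connection from node i of the previous layer to neuron j of this layer.
    """
    activations = inputs
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(neuron_weights, activations)) + bias)
            for neuron_weights, bias in zip(weights, biases)
        ]
    return activations


# Hand-picked weights for illustration only (a trained network would learn them).
XOR_LAYERS = [
    # Hidden layer: neuron 0 behaves like OR, neuron 1 like AND.
    ([[20.0, 20.0], [20.0, 20.0]], [-10.0, -30.0]),
    # Output layer: fires when OR is on and AND is off.
    ([[20.0, -20.0]], [-10.0]),
]

for a in (0, 1):
    for b in (0, 1):
        output = forward([a, b], XOR_LAYERS)[0]
        print(a, b, round(output))
```

The two hidden neurons act as intermediate features (OR and AND of the inputs); the output neuron can separate the classes only because it works on these features rather than on the raw inputs.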
Classification
Classification is defined as the task of assigning labels
(categories, classes) to yet unseen observations (instances
of data). In machine learning, this is done by
training an algorithm on a set of training examples.
Classification is a supervised learning problem, where a
“teacher” links a label to every instance of data. A label is a
discrete number that identifies the class a particular
instance belongs to. It is usually represented as a
nonnegative integer. There are many machine learning
models that implement classification; these are known as
classifiers. The aim of a classifier is to fit a decision
boundary in feature space that separates the training
examples, so that a new observation instance
can be correctly labeled. In general, the decision boundary
is a hypersurface that separates an N-dimensional space
into two partitions, itself being (N−1)-dimensional.
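As a small illustration in Python, consider a two-dimensional feature space, where a linear decision boundary is a line (a 1-dimensional object). The weights, bias and sample points below are made up for the example and are not taken from the project.

```python
def classify(point, weights, bias):
    """Return label 1 if the point lies on the positive side of the boundary
    weights . x + bias = 0, otherwise label 0."""
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return 1 if score > 0 else 0


# Illustrative boundary: the line x1 + x2 - 1 = 0.
WEIGHTS, BIAS = [1.0, 1.0], -1.0

# Labelled training examples: (feature vector, label given by the "teacher").
training = [
    ((0.0, 0.0), 0),
    ((0.2, 0.3), 0),
    ((1.0, 1.0), 1),
    ((0.9, 0.6), 1),
]

predictions = [classify(point, WEIGHTS, BIAS) for point, _ in training]
print(predictions)                            # [0, 0, 1, 1]
print(classify((0.8, 0.9), WEIGHTS, BIAS))    # unseen observation, labelled 1
```

Here the boundary separates all four training examples correctly, and a new point is labelled according to the side of the line on which it falls.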
• Police officers can use this software to make a
standard copy of an FIR.
• Journalists can use this software to instantly
convert their reports into a standard format.
• Novelists can use this software to convert their long
handwritten novels to text format.
• Legal agreements can be converted to an encrypted text
format.
Vexo is currently in progress. Once the
prototype is ready, Vexo has to be trained and
tested with the handwritten digits in the MNIST
database.
Work done so far
• Attended the course and completed the assignments on ML by
Andrew Ng on Coursera.
• Read and studied the first four chapters of Neural
Networks and Deep Learning by Michael Nielsen
and gained some basic knowledge.
• Learned and practiced Python, Git and Lua
to a basic level.
• Installed and set up Torch and loaded the
MNIST data.
• Understood, tried and practiced the MNIST tutorial
provided by Andrea Ferretti on the RNDuja Blog.
Works to be done
• Train Vexo with data found at
http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k
• Use a better data set from
https://lvdmatten.github.io/software/code/wride.tar.gz/
• Study convolutional neural networks and their
implementation at
http://cs231n.github.io/convolutional-networks
• Implement a convolutional neural network
architecture on the old data set after filtering out
garbage data.
• Use the character segmentation code for MATLAB by
Diego Barragan, Technical University of Loja,
Ecuador, available at
http://www.mathworks.com/matlabcentral/fileexchange/22922-image-segmentation---extractionfacilitating
• Use graph plotting tools to show graphs of loss
vs time and accuracy vs time.
CONCLUSION
This work has mostly been focused on the machine learning methods used
in the project. At first, we reviewed the approaches that are nowadays
used in similar applications. After that, we delved into the inner workings
of a multilayer perceptron, focusing on backpropagation and resilient
backpropagation, which has been implemented in the Android application.
With the knowledge we had described, we specified the requirements of
the project and planned the solution. During the development of the
application, we ran into a few problems, which, along with the application
structure and details, have been described in the implementation chapter.
Finally, the results of the implementation of the learning algorithms have
been compared. The Android application performs character recognition
based on touch, image, and camera input. We have developed a Java package
containing classes that implement the multilayer perceptron learning
model, which can also be used in other applications thanks to its modular
design, which supports the loose coupling principle. The application itself
uses this package in this way. Several improvements to the application or
to the learning model used within it can be suggested. For example, the
feature extraction performed by the neural network could be constrained to
operate on more strictly preprocessed data. Also, several classifiers
learning on different features could be combined to make the system more
robust.
We are working on a possible extension of the project VEXO.
REFERENCES
• First four chapters of Neural Networks and Deep
Learning by Michael Nielsen.
• Tutorial provided by Andrea Ferretti on the RNDuja
blog.
• www.ee.surrey.ac.uk
• lvdmatten.github.io