Aim: Optical character recognition of handwritten and printed English, Konkani and Marathi text using an artificial neural network

LITERATURE SURVEY
Determination of Optimal Features Database for OCR of Printed Telugu Text by C. Vasantha Lakshmi, Sarika Singh and C. Patvardhan
OCR (Optical Character Recognition) systems are being developed for their numerous applications, even for Indian scripts like Telugu, which are complicated by the large number of symbols they use. Experimental results on text document images with multiple fonts and sizes show that the database design strategy proposed in this paper for OCR of printed Telugu text achieves both of its objectives. This is the first reported approach to such a database design for Telugu OCR.
Implementation of an Optical Character Reader (OCR) for Bengali Language by Muhammed Tawfiq Chowdhury, Md. Saiful Islam et al.
In this research, the ‘Solaimanlipi’ font and 200 input files are used to test the accuracy of the OCR. For clean image files, the accuracy of the software is as high as 97.56%, where accuracy is measured as the percentage of correctly recognized characters and words.

INTRODUCTION
India is a country with many written languages, and there is a need to digitize books and documents and to convert the text in these books, images and documents into editable text. This project will convert text images into editable text for the English, Konkani and Marathi languages using a neural network. The conversion will be done in MATLAB. The main aim is to identify the characters in an image. This starts with pre-processing of the image. The text is then segmented to separate the characters from each other. After segmentation, the letters are extracted, resized and fed to the neural network, which classifies each character to give the ASCII text. This text is then converted into editable text.

OCR Accuracy Prediction Method Based on Blur Estimation by Van-Cuong Kieu, Florence Cloppet and Nicole Vincent
This paper proposes an OCR accuracy prediction method based on local blur estimation, since blur is one of the main factors that damage OCR accuracy. The proposed method is evaluated on a published database and on an industrial one. Its correlation with OCR accuracy is also reported for comparison with state-of-the-art methods.
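The processing chain described in the introduction (pre-processing, segmentation, resizing, then classification) can be sketched roughly as follows. This is only an illustration under stated assumptions: the project plans to use MATLAB, while Python is used here for a compact, dependency-free example. The function names, the fixed threshold of 128 and the output glyph size are all hypothetical choices, not part of the proposal. The image is assumed to hold a single text line as rows of 0-255 grayscale values with dark ink on a light background.

```python
def binarize(gray, threshold=128):
    """Pre-processing: 1 = ink (dark pixel), 0 = background. Threshold is an assumed value."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment_columns(binary):
    """Segmentation: cut a single text line into glyphs wherever a column has no ink."""
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a glyph begins
        elif not ink and start is not None:
            spans.append((start, x))       # a glyph ends at the first blank column
            start = None
    if start is not None:
        spans.append((start, width))       # glyph touching the right edge
    return [[row[a:b] for row in binary] for a, b in spans]


def resize_nearest(glyph, size=8):
    """Resize a glyph to size x size using nearest-neighbour sampling."""
    h, w = len(glyph), len(glyph[0])
    return [[glyph[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def to_feature_vector(glyph):
    """Flatten a glyph row by row into the input vector for a classifier."""
    return [px for row in glyph for px in row]


# Tiny 3 x 6 example image containing two separate strokes.
gray = [
    [255,   0, 255, 255,   0,   0],
    [255,   0, 255, 255,   0, 255],
    [255,   0, 255, 255,   0,   0],
]
glyphs = segment_columns(binarize(gray))
features = [to_feature_vector(resize_nearest(g, size=4)) for g in glyphs]
print(len(glyphs), len(features[0]))  # prints: 2 16
```

Each feature vector would then be passed to the neural network classifier. Splitting at blank columns is likely to be too simple for Marathi and Konkani in Devanagari script, where a header line joins the characters of a word, so an additional step such as header-line removal would probably be needed before segmentation.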
Optical Character Recognition (OCR) System for Roman Script & English Language using Artificial Neural Network (ANN) Classifier by Honey Mehta, Sanjay Singla and Aarti Mahajan
The authors address a main difficulty in character recognition, the different styles and fonts in which text is written, with a new approach that combines an Artificial Neural Network with a Nearest Neighbour method to recognize characters from scanned images. Three layers are used for classification: the input layer takes the segmented characters, the hidden layer consists of neurons trained during network training, and the output layer consists of output neurons that generate Unicode.

SOFTWARE TOOLS
MATLAB

RESEARCH COMPONENT
Conversion of handwritten and printed text images into editable text using a back-propagation neural network
High accuracy in detecting text
Detection of English, Marathi and Konkani text in images
Block diagram:

APPLICATION
Mobile applications for translating signboards through internet search
Digitization of old texts
Conversion of printed text to editable .txt documents
Automatic filling of e-forms from OCR sheets (form readers)
Handwritten text to .txt conversion
Process automation in postal and delivery services
Automobile number plate recognition

TIMELINE
First sem: Pre-processing and feature extraction
Second sem: Identification and classification of extracted