
Optical Character Recognition (OCR) USING MATLAB

Summary. - This article shows how to use MATLAB and functions from its Image Processing Toolbox to recognize a word, or a set of words and numbers, in an image. Two-dimensional correlation is used to measure the similarity between each input character and the templates. Characters must be at least 42 x 24 pixels so that they fit the size of the templates.
This program was tested on MATLAB 7.1.

INTRODUCTION
OCR technology gives scanning and imaging systems the ability to convert images of machine-printed characters into characters that a computer can interpret or recognize. To do so, the images of the machine-printed characters are extracted from the bitmap produced by the scanner [1].

Fig.1. Schematic of an OCR.

The OCR process involves several stages, such as segmentation, feature extraction and classification [2]. The Image Processing Toolbox provides a set of functions that extend MATLAB's capabilities for developing applications and new algorithms in image processing and analysis. MATLAB's mathematical environment is well suited to image processing because images are, after all, matrices. The toolbox includes functions for:
* Filter design.
* Image enhancement and retouching.
* Image analysis and statistics.
* Morphological, geometric and color operations.
* 2D transformations.
Image processing is a crucial area of work for groups and industries in fields such as medical diagnostics, astronomy, geophysics, environmental sciences, laboratory data analysis and industrial inspection [3].

PROGRAM DEVELOPMENT
SEGMENTATION
As a first step, the image is cropped to fit the text. After this, the text is separated line by line. The function that crops the image is shown below:
function imgn = clip(imagen)
% Crops a black letter with white background.
% Example:
% imagen = imread('metal.bmp');
% imgn = clip(imagen);
% subplot(2,1,1); imshow(imagen); title('INPUT IMAGE')
% subplot(2,1,2); imshow(~imgn); title('OUTPUT IMAGE')
if ~islogical(imagen)
    imagen = im2bw(imagen, 0.99);   % binarize: only near-white pixels become 1
end
a = ~imagen;                        % invert so that the text is 1
[f, c] = find(a);                   % rows and columns of every text pixel
lmaxc = max(c); lminc = min(c);
lmaxf = max(f); lminf = min(f);
imgn = a(lminf:lmaxf, lminc:lmaxc); % Crops image
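To make the cropping step concrete, the following minimal sketch applies the same idea to a tiny grayscale matrix. It uses only base MATLAB, so the comparison against 0.99*255 stands in for im2bw (an assumption made for illustration), and the names gray, ink and cropped are hypothetical.

```matlab
% Toy 5x6 grayscale image: 255 = white background, darker values = ink.
gray = 255 * ones(5, 6);
gray(2:4, 3) = 0;            % a vertical stroke
gray(3, 4)   = 252;          % almost white, but still below the 0.99 level
bw  = gray > 0.99 * 255;     % stand-in for im2bw(gray, 0.99): 1 = background
ink = ~bw;                   % 1 = text pixels
[f, c] = find(ink);          % row and column indices of every text pixel
cropped = ink(min(f):max(f), min(c):max(c));   % bounding box of the text
disp(size(cropped))
```

The pixel with value 252 counts as ink because 252 is below 0.99*255 = 252.45, so the bounding box spans two columns instead of one.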

The following figure shows how the clip function works:

Fig.2. Scheme of the function that crops the image to the size of the text.

As seen in the function, the threshold used to convert the image to binary is 0.99 (the level passed to im2bw). This threshold was chosen so that even color values very close to 255 (the maximum) are treated as 0 in the binary image; only pixels brighter than about 252 are treated as white. Once the image has been cropped, the next step is to separate each line. For this the following function is used:
function [fl re] = lines(aa)
% Divide text in lines.
% aa -> input image; fl -> first line; re -> remaining lines
% Example:
% aa = imread('heavy_metal.bmp');
% [fl re] = lines(aa);
% subplot(3,1,1); imshow(aa); title('INPUT IMAGE')
% subplot(3,1,2); imshow(fl); title('FIRST LINE')
% subplot(3,1,3); imshow(re); title('REMAIN LINES')
aa = clip(aa);
r = size(aa, 1);
for s = 1:r
    if sum(aa(s, :)) == 0            % blank row: gap between two lines
        nm = aa(1:s-1, 1:end);       % First line matrix
        rm = aa(s:end, 1:end);       % Remain line matrix
        fl = ~clip(~nm);
        re = ~clip(~rm);
        % *-*-* Uncomment the lines below to see result *-*-*
        % subplot(2,1,1); imshow(fl);
        % subplot(2,1,2); imshow(re);
        break
    else
        fl = ~aa;                    % Only one line.
        re = [];
    end
end
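The splitting rule used by lines (cut at the first row that contains no text) can be illustrated on a small matrix. This is a sketch in base MATLAB under simplifying assumptions: it uses find instead of the explicit loop, skips the re-cropping done by clip, and the names page, first and rest are hypothetical.

```matlab
% Toy binary page (1 = ink) with two text lines separated by a blank row.
page = logical([1 1 0 1;
                1 0 0 1;
                0 0 0 0;
                0 1 1 0;
                0 1 1 0]);
rowInk = sum(page, 2);           % amount of ink in each row
s = find(rowInk == 0, 1);        % index of the first completely blank row
if isempty(s)
    first = page;                % only one line on the page
    rest  = [];
else
    first = page(1:s-1, :);      % rows above the gap: first line
    rest  = page(s:end, :);      % the gap and everything below it
end
disp(size(first)), disp(size(rest))
```

In the article's function the two parts are additionally passed through clip, which removes the blank row left at the top of the remainder.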

The following figure shows how the lines function works:

Fig.3. Outline of the function that separates the lines of the image.

Once each line of the image has been obtained separately, the next step is to extract each letter from the line matrix fl. For this the bwlabel function is used, which labels the connected components of the image. In other words, it finds the connected regions and numbers them. The following code separates each letter:
% *-*-*-*-* Calculating connected components *-*-*-*-*
% Code from:
% http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=8031&objectType=FILE
L = bwlabel(imgn);
mx = max(max(L));                  % number of connected components
BW = edge(double(imgn), 'sobel');
[imx, imy] = size(BW);
for n = 1:mx
    [r, c] = find(L == n);         % pixels that belong to component n
    rc = [r c];
    [sx sy] = size(rc);            % sx = number of pixels in the component
    n1 = zeros(imx, imy);
    for i = 1:sx
        x1 = rc(i, 1);
        y1 = rc(i, 2);
        n1(x1, y1) = 255;          % copy the component onto a blank image
    end
    % *-*-*-*-* END Calculating connected components *-*-*-*-*
    % (n1 is then cropped, resized and classified; see MAIN PROGRAM)
end
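As a small illustration of what bwlabel returns, the sketch below labels a toy line image and crops each component to its bounding box. It assumes the Image Processing Toolbox is available; the bounding-box crop is an alternative to the per-pixel copy loop, not the author's code, and the names bw, piece and widths are hypothetical.

```matlab
% Toy line image: three "letters" (1 = ink) separated by blank columns.
bw = logical([1 0 1 1 0 1;
              1 0 1 1 0 1;
              1 0 0 1 0 1]);
L  = bwlabel(bw);           % label matrix: 0 = background, 1..mx = components
mx = max(L(:));             % number of components found
widths = zeros(1, mx);
for n = 1:mx
    [r, c] = find(L == n);  % pixels that belong to component n
    % Crop component n to its bounding box, keeping only its own pixels.
    piece = (L(min(r):max(r), min(c):max(c)) == n);
    widths(n) = size(piece, 2);
end
disp(mx)
disp(sort(widths))
```

Note that a character made of several disconnected parts would be split into separate components by this approach.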

Each letter is then normalized to a size of 42 x 24 pixels, which is the size of the templates used in the correlation. The following function performs the normalization:
function img_r = same_dim(imagen_g)
% Example:
% imagen_g = imread('a_reducir.bmp');
% img_r = same_dim(imagen_g);
% subplot(2,1,1); imshow(imagen_g); title('Image m x n')
% subplot(2,1,2); imshow(img_r); title('Image 42 x 24')
img_r = imresize(imagen_g, [42 24]);
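A quick check of the normalization step, as a sketch: any cropped letter, whatever its size, comes out as a 42 x 24 matrix. The 'nearest' option used here is an assumption for illustration (same_dim itself relies on the default interpolation); it is shown because it keeps a binary image strictly binary.

```matlab
letter = true(10, 7);                            % stand-in for a cropped letter of arbitrary size
img_r  = imresize(letter, [42 24], 'nearest');   % force 42 rows x 24 columns
disp(size(img_r))
```

Because the target size is fixed, the aspect ratio of the letter is not preserved: narrow characters are stretched to fill the 24 columns.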

CLASSIFICATION

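The main operation used for classification is two-dimensional correlation. This operation gives a measure of similarity between two matrices (images). The corr2 function computes it using the following equation [4]:

r = \frac{\sum_m \sum_n (A_{mn} - \bar{A})(B_{mn} - \bar{B})}{\sqrt{\left(\sum_m \sum_n (A_{mn} - \bar{A})^2\right)\left(\sum_m \sum_n (B_{mn} - \bar{B})^2\right)}}

where \bar{A} and \bar{B} are the mean values of the matrices A and B.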
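The normalized two-dimensional correlation coefficient can also be written out by hand, which helps when reading the scores that corr2 produces. The sketch below uses only base MATLAB; corr2d is a hypothetical helper name and T is a toy template, not one of the article's bitmaps.

```matlab
% Hand-written 2-D correlation coefficient: subtract each matrix's mean,
% then divide the summed product by the product of the two norms.
corr2d = @(A, B) sum(sum((A - mean(A(:))) .* (B - mean(B(:))))) / ...
    sqrt(sum(sum((A - mean(A(:))).^2)) * sum(sum((B - mean(B(:))).^2)));

T = [1 1 1;
     0 1 0;
     0 1 0];                 % toy "T" shaped template
r_same = corr2d(T, T);       % a matrix compared with itself
r_inv  = corr2d(T, 1 - T);   % a matrix compared with its negative image
disp([r_same r_inv])
```

A score of 1 means a perfect match and -1 a perfect inverse, so the classifier simply keeps the template with the highest score.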

The following function computes the correlation between each template and the extracted letter:
function letter = read_letter(imagn)
% Computes the correlation between template and input image
% and its output is a string containing the letter.
% Size of 'imagn' must be 42 x 24 pixels
% Example:
% imagn = imread('D.bmp');
% letter = read_letter(imagn)
comp = [];
load templates
for n = 1:36
    sem = corr2(templates{1, n}, imagn);
    comp = [comp sem];
end
vd = find(comp == max(comp));   % index of the best-matching template
% *-*-*-*-*-*-*-*-*-*-*-*-*
if vd == 1
    letter = 'A';
elseif vd == 2
    letter = 'B';
elseif vd == 3
    letter = 'C';
elseif vd == 4
    letter = 'D';
elseif vd == 5
    letter = 'E';
elseif vd == 6
    letter = 'F';
elseif vd == 7
    letter = 'G';
elseif vd == 8
    letter = 'H';
elseif vd == 9
    letter = 'I';
elseif vd == 10
    letter = 'J';
elseif vd == 11
    letter = 'K';
elseif vd == 12
    letter = 'L';
elseif vd == 13
    letter = 'M';
elseif vd == 14
    letter = 'N';
elseif vd == 15
    letter = 'O';
elseif vd == 16
    letter = 'P';
elseif vd == 17
    letter = 'Q';
elseif vd == 18
    letter = 'R';
elseif vd == 19
    letter = 'S';
elseif vd == 20
    letter = 'T';
elseif vd == 21
    letter = 'U';
elseif vd == 22
    letter = 'V';
elseif vd == 23
    letter = 'W';
elseif vd == 24
    letter = 'X';
elseif vd == 25
    letter = 'Y';
elseif vd == 26
    letter = 'Z';
% *-*-*-*-*
elseif vd == 27
    letter = '1';
elseif vd == 28
    letter = '2';
elseif vd == 29
    letter = '3';
elseif vd == 30
    letter = '4';
elseif vd == 31
    letter = '5';
elseif vd == 32
    letter = '6';
elseif vd == 33
    letter = '7';
elseif vd == 34
    letter = '8';
elseif vd == 35
    letter = '9';
else
    letter = '0';
end
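The if/elseif chain maps a template index to a character. As a more compact alternative (a sketch, not the author's code), the same mapping can be stored in a character vector whose order matches the 36 templates. The scores in comp are made up for the example, and chars is a hypothetical name.

```matlab
chars = ['A':'Z' '1':'9' '0'];   % same order as the 36 templates
comp  = zeros(1, 36);            % stand-in correlation scores
comp(4) = 0.93;                  % pretend template 4 gave the best match
[best, vd] = max(comp);          % max returns a single index, the first on ties
letter = chars(vd);
disp(letter)
```

Using the second output of max also avoids the case where find returns several indices because two templates tie for the maximum.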

TEMPLATES
Each template is a binary BMP image of 42 x 24 pixels. The script that stores the templates in a cell array is as follows:
% CREATE TEMPLATES
% Letter
A = imread('A.bmp'); B = imread('B.bmp');
C = imread('C.bmp'); D = imread('D.bmp');
E = imread('E.bmp'); F = imread('F.bmp');
G = imread('G.bmp'); H = imread('H.bmp');
I = imread('I.bmp'); J = imread('J.bmp');
K = imread('K.bmp'); L = imread('L.bmp');
M = imread('M.bmp'); N = imread('N.bmp');
O = imread('O.bmp'); P = imread('P.bmp');
Q = imread('Q.bmp'); R = imread('R.bmp');
S = imread('S.bmp'); T = imread('T.bmp');
U = imread('U.bmp'); V = imread('V.bmp');
W = imread('W.bmp'); X = imread('X.bmp');
Y = imread('Y.bmp'); Z = imread('Z.bmp');
% Number
one = imread('1.bmp'); two = imread('2.bmp');
three = imread('3.bmp'); four = imread('4.bmp');
five = imread('5.bmp'); six = imread('6.bmp');
seven = imread('7.bmp'); eight = imread('8.bmp');
nine = imread('9.bmp'); zero = imread('0.bmp');
% *-*-*-*-*-*-*-*-*-*-*
letter = [A B C D E F G H I J K L M ...
    N O P Q R S T U V W X Y Z];
number = [one two three four five ...
    six seven eight nine zero];
character = [letter number];
templates = mat2cell(character, 42, [24 24 24 24 24 24 24 ...
    24 24 24 24 24 24 24 ...
    24 24 24 24 24 24 24 ...
    24 24 24 24 24 24 24 ...
    24 24 24 24 24 24 24 24]);
save('templates', 'templates')
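The role of mat2cell in the script above is easier to see on a reduced example. This sketch uses three made-up 42 x 24 blocks instead of the 36 bitmaps, and repmat to generate the list of column widths; both are assumptions for illustration.

```matlab
% Three fake 42x24 "templates" placed side by side, as in the script.
character = [zeros(42, 24), ones(42, 24), 2 * ones(42, 24)];
% Split into one row of cells, each 42 rows by 24 columns.
templates = mat2cell(character, 42, repmat(24, 1, 3));
disp(size(templates))
disp(size(templates{1, 2}))
```

With the full character matrix, repmat(24, 1, 36) would produce the same list of 36 widths that the script writes out by hand.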

MAIN PROGRAM
The main program is as follows:
% OCR (Optical Character Recognition). MAIN PROGRAM
% Author: Diego Orlando Barragan Guerrero
% E-mail: diegokillemall@yahoo.com
% LOJA - ECUADOR - SOUTH AMERICA
% *********************************************************
warning off
clc, close all, clear all
imagen = imread('scanner.bmp');   % Read binary image
% Try with images: heavy_metal.bmp, scanner.bmp
imshow(imagen);
title('INPUT IMAGE WITH NOISE')
% *-*-* Image noise filter *-*-*
if length(size(imagen)) == 3      % RGB image
    imagen = rgb2gray(imagen);
end
imagen = medfilt2(imagen);
[f c] = size(imagen);
imagen(1, 1) = 255;
imagen(f, 1) = 255;
imagen(1, c) = 255;
imagen(f, c) = 255;
% *-*-* END image noise filter *-*-*
word = [];                        % Storage matrix for the word read from the image
re = imagen;
fid = fopen('text.txt', 'wt');    % Opens text.txt for writing
while 1
    [fl re] = lines(re);          % Fcn 'lines' separates the lines of text
    imgn = ~fl;
    % *-* Uncomment line below to see lines one by one *-*
    % imshow(fl); pause(1)
    % *-*-*-*-* Calculating connected components *-*-*-*-*
    % Code from:
    % http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=8031&objectType=FILE
    L = bwlabel(imgn);
    mx = max(max(L));
    BW = edge(double(imgn), 'sobel');
    [imx, imy] = size(BW);
    for n = 1:mx
        [r, c] = find(L == n);
        rc = [r c];
        [sx sy] = size(rc);
        n1 = zeros(imx, imy);
        for i = 1:sx
            x1 = rc(i, 1);
            y1 = rc(i, 2);
            n1(x1, y1) = 255;
        end
        % *-*-*-*-* END Calculating connected components *-*-*-*-*
        n1 = ~n1;
        n1 = ~clip(n1);
        img_r = same_dim(n1);     % Resize to 42 x 24
        % *-* Uncomment line below to see letters one by one *-*
        % imshow(img_r); pause(1)
        letter = read_letter(img_r);   % img to text
        word = [word letter];
    end
    % fprintf(fid, '%s\n', lower(word));   % Write 'word' in text file (lower)
    fprintf(fid, '%s\n', word);            % Write 'word' in text file (upper)
    word = [];                    % Clear 'word' variable
    % *-*-* When the sentences finish, break the loop *-*-*
    if isempty(re)                % See variable 're' in Fcn 'lines'
        break
    end
end
fclose(fid);
winopen('text.txt')               % Open 'text.txt' file

TEST PROGRAM
Once the program is complete, the tic and toc functions can be placed at the beginning and end of the code to measure the execution time. When it finishes running, the program opens a text file that contains the words recognized in the image:

Figure 5. Results of the program.

Another test was carried out with a sheet containing several handwritten words and some numbers, scanned at a resolution of 200 DPI and a brightness of 128 (the scanner's default parameters). The sample is shown in the following figure:

Figure 6. Scanned test text.

Scanning introduces some noise (black dots) into the image. The solution is to apply a filter at the beginning of the code to remove it:
% *-*-* Image noise filter *-*-*
if length(size(imagen)) == 3      % RGB image
    imagen = rgb2gray(imagen);
end
imagen = medfilt2(imagen);        % 3 x 3 median filter
[f c] = size(imagen);
imagen(1, 1) = 255;               % reset the four corners to white
imagen(f, 1) = 255;
imagen(1, c) = 255;
imagen(f, c) = 255;
% *-*-* END image noise filter *-*-*
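The reason the filter code resets the four corner pixels can be seen on a toy image. The sketch below assumes the Image Processing Toolbox and the default behavior of medfilt2 (a 3 x 3 window with zero padding at the borders); the names img, out and cornerBefore are hypothetical.

```matlab
img = uint8(255 * ones(7, 7));   % white page
img(4, 4) = 0;                   % isolated black dot (scanner noise)
out = medfilt2(img);             % 3x3 median filter, zero padding by default
cornerBefore = out(1, 1);        % corner window holds 5 padded zeros out of 9 values
[f, c] = size(out);
out([1 f], [1 c]) = 255;         % reset the four corners to white
disp(cornerBefore)
disp(all(out(:) == 255))
```

The isolated dot disappears because it is outvoted by its eight white neighbours, while each corner turns black because most of its window lies in the zero padding. Left uncorrected, those black corners would be treated as text by the cropping step.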

The following figure shows the text output:

Figure 7. Recognized text.

It can be inferred that the error in the first three sevens is related to the slight curvature of the stem, which the last 7 does have.

CODE METRICS
MATLAB has two tools that help improve the code. One is the profiler, which, among other things, measures the execution time of each function and shows which variables can be removed from the code to speed it up. The second tool is code metrics, a program available on the MathWorks File Exchange [5]. This program checks whether the names of the functions we use conflict with functions already defined on the MATLAB path, measures the complexity of the program (cyclomatic complexity), and offers practical suggestions to improve the program. The result of applying code metrics to the OCR code is as follows:

CONCLUSIONS

* The font size must not be less than 42 x 24 pixels.
* The input image may or may not be in color.
* If the image has more noise at the edges than medfilt2 can filter out, crop the image with the imcrop function.
* If the strokes of the font are thin, use the imdilate function to thicken them before passing the image to the program.
* To obtain the text in lower case, apply the lower function to the word variable in the main program.
* The processing time for the test text in Figure 6 was:
>> tic; OCR, toc
Elapsed time is 3.806446 seconds.
However, on a PC with 1 GB of RAM the processing time was 2.4 seconds. This processing time is reduced if templates of 20 x 20 pixels are used.
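Two of the hints above, thickening thin strokes and producing lower-case output, can be sketched as follows. The example assumes the Image Processing Toolbox for imdilate; the 3 x 3 neighborhood and the names page, ink, thick and word are illustrative choices, not values taken from the article.

```matlab
page = true(7, 7);                 % white page (1 = background)
page(2:6, 4) = false;              % one-pixel-wide black stroke
ink = ~page;                       % dilation grows the 1s, so work on the ink mask
thick = ~imdilate(ink, ones(3));   % grow the stroke by one pixel on every side
strokeWidth = sum(~thick(4, :));   % ink pixels in the middle row
word = lower('OCR');               % lower-case version of a recognized word
disp(strokeWidth), disp(word)
```

Since the article's images have black text on a white background, the image is inverted before dilation and inverted back afterwards; dilating the original directly would thin the strokes instead.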

REFERENCES
[1] http://www.pearsonncs.com/
[2] Jesse Hansen, "MATLAB Project in Optical Character Recognition (OCR)".
[3] http://www.eldish.net/hp/automat/matlab.htm
[4] Marcelo C. Valdiviezo, "Handwritten digit recognizer", magazine "In Short Circuit", January 2007.
[5] Code metrics on the MathWorks File Exchange: http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=10514&objectType=file
