Bengali character recognition using Bidirectional Associative Memories (BAM) neural network
M. M. Mahbubul Syeed, Fazlul Hasan Siddiqui, Abu Saleh Abdullah Al-Mamun,
Syed Khairuzzaman Tanbeer and M. Abdul Mottalib
Department of Computer Science and Information Technology (CIT)
Islamic University of Technology, Board Bazar, Gazipur-1704
Emails: rajit_cit@hotmail.com,fhsani@yahoo.com, mamuncitiut@hotmail.com, tanbeer2000@yahoo.com

Abstract: This paper presents the recognition of Bengali text using a BAM (Bidirectional Associative Memories) neural network, together with a proposed feature extraction procedure for a Bengali character. Conventional methods are used for the steps from text scanning to segmentation of a text line into single characters. An efficient procedure is proposed for boundary extraction and scaling of a character, and a BAM neural network, which increases the performance of character recognition, is used.

Keywords: Bengali, Character, Neural Network, BAM (Bidirectional Associative Memories), Feature, Scaling, Recognition.

1. INTRODUCTION

Bengali character recognition has become an active area of research over the last few years, with a wide variety of applications. Much work has already been done on printed Bengali characters [1] and handwritten Bengali characters [2].

In this paper an approach to recognizing scanned Bengali text is proposed. The Bengali text is first scanned with a scanner and converted into an image format. Several techniques are then applied to separate the characters in the text. The separated characters are passed to a classifier that recognizes them as one of several Bengali characters stored in memory. In this phase the BAM (Bidirectional Associative Memories) neural network model is used.

Neural networks are developed to perform some of the activities of the human brain. Because the recognition of images, voice, etc. is performed most efficiently and accurately by the human brain, artificial neural networks are developed on computers to perform these recognition tasks.

Like the human brain, a neural network has a parallel-distributed architecture with a large number of nodes and connections. Each connection points from one node to another and is associated with a weight. A typical model of a neural network is shown in figure 1.

Construction of a neural network involves the following tasks:
a) Determination of network properties: the network topology or connectivity, the type of connection between the nodes of the network, and the order of connection are decided.
b) Determination of node properties: the activation range of a node (discrete {0, 1} or continuous [0, 1]) and the type of node activation function (hard limiting function or sigmoid function) are determined.
c) Determination of system dynamics: the weight initialization scheme, the activation calculation formula, and the network learning rule (weight adjustment) are determined.

[Figure 1: The neural network computational model. An input layer (nodes i), a hidden layer (nodes j) and an output layer are linked by weights $W_{ji}$ through feedforward and recurrent connections. A single node with inputs $X_1, X_2, \ldots, X_n$ and weights $W_1, \ldots, W_n$ produces the output $O_j = F\left(\sum_{i=1}^{n} W_i X_i\right)$, where F is the activation function.]

A neural network classifier technique, namely BAM, is used for recognition. BAM is an associative neural network, that is, one that retrieves an object or memory based on part of the object itself. The term memory in an associative network can be defined as follows. If a binary n-dimensional vector X is a memory, then for each component (neuron) i = 1, 2, ..., n
$$X_i = F_h\left(\sum_{j=0}^{n} W_{ij} X_j\right)$$

Here,
$F_h$ is the hard limiting function.
$W_{ij}$ is the weight from node j to node i.
$X_j$ is the input on node j.

This means that X is a memory if the network is stable at that point. A detailed discussion of BAM is given in the section on recognition.

2. PHASES OF BENGALI TEXT RECOGNITION

The full cycle of Bengali text recognition consists of the following parts:
Data acquisition.
Text digitization and noise removal.
Oblique/skew detection and removal.
Block detection.
Segmentation.
Feature extraction.
Learning and character recognition by neural network.
A block diagram representation of this recognition system is shown in figure 2.

The Bengali text is first scanned by a scanning device and then stored in digital image format. The histogram threshold technique is used because it gives better results. Other techniques also exist, such as [3] and [4].

Noise is then cleaned from this digital image and the skew (oblique) is removed, using the algorithm proposed in [5]. Other skew detection algorithms, such as those based on [6], [7], [8], [9] and [10], are also found in the literature.

The text regions are then separated from non-text regions using any one of the methods mentioned in [9], [10], [13], [14]. Among these, the page segmentation and classification method in [14] is used here.

Next the lines are segmented from the text, each line is segmented into words, and finally the words are segmented into their constituent characters. The segmentation algorithm in [5] is used for this. The characters are then passed on for feature extraction.

[Figure 2: Block diagram representation of Bengali character recognition system. Data acquisition -> Text digitization and noise removing -> Oblique/skew detection and removing -> Block detection -> Segmentation (line segmentation, word segmentation, character segmentation) -> Feature extraction of the input character -> Input to the neural network -> Classification and recognition -> Output: recognized character. In the learning and character recognition stage, learning by sample characters supplies the knowledge of the neural network.]

3. FEATURE EXTRACTION

Feature extraction [15] is an important part of character recognition. It converts the segmented character pixels into an approximate binary-valued character. There are two approaches to feature extraction, namely the statistical and the structural approach. Here feature extraction is done in two phases: boundary extraction and scaling.

3.1. Boundary extraction

It is necessary to find the boundary position of the character image. In this phase a single character placed in a single window is extracted by horizontal and vertical scanning that starts from the upper-left and bottom-right positions of the window. Each scan halts as soon as it meets a pixel of the character. The proposed algorithm for boundary extraction is given below:
1. Get the square boundary within which a single character exists.
2. Continue horizontal scanning from the topmost line towards the bottom of the window until a pixel is met.
3. Continue vertical scanning from the leftmost line towards the right of the window until a pixel is met.
4. Say the innermost two scanned lines meet at a pixel P.
5. Similarly, continue horizontal scanning from the bottommost line towards the top of the window until a pixel is met.
6. Continue vertical scanning from the rightmost line towards the left of the window until a pixel is met.
7. Say the innermost two scanned lines meet at a pixel Q.
8. The extracted boundary is the one whose upper-left corner is positioned at P and whose bottom-right corner is positioned at Q.
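
The scanning steps of the boundary extraction algorithm can be sketched in Python as follows. This is only an illustrative sketch, not the authors' Turbo C++ implementation: the window is assumed to be a list of rows holding 1 for a character pixel and 0 for background, and the function names (`first_row_with_pixel`, `first_col_with_pixel`, `extract_boundary`, `crop`) are hypothetical.

```python
def first_row_with_pixel(window, rows):
    """Scan the given row indices in order; return the first row holding a pixel."""
    for r in rows:
        if any(window[r]):
            return r
    return None


def first_col_with_pixel(window, cols):
    """Scan the given column indices in order; return the first column holding a pixel."""
    for c in cols:
        if any(row[c] for row in window):
            return c
    return None


def extract_boundary(window):
    """Return (P, Q) as ((top, left), (bottom, right)), or None for an empty window."""
    height, width = len(window), len(window[0])
    top = first_row_with_pixel(window, range(height))                  # step 2
    if top is None:
        return None                                                    # no character pixel
    left = first_col_with_pixel(window, range(width))                  # step 3
    bottom = first_row_with_pixel(window, range(height - 1, -1, -1))   # step 5
    right = first_col_with_pixel(window, range(width - 1, -1, -1))     # step 6
    return (top, left), (bottom, right)                                # P and Q (steps 4, 7)


def crop(window, p, q):
    """Cut out the region whose upper-left corner is P and bottom-right corner is Q."""
    (top, left), (bottom, right) = p, q
    return [row[left:right + 1] for row in window[top:bottom + 1]]


window = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
p, q = extract_boundary(window)
print(p, q)  # (1, 1) (3, 3)
```

The cropped region returned by `crop(window, p, q)` is what would then be handed to the scaling phase.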

3.2. Scaling

After the boundary position is determined, the character needs to be scaled. Let the horizontal and vertical lengths of the extracted character be denoted by L_H and L_V respectively. The character is to be converted into a 16x16 matrix containing binary digits (because the input of the BAM neural network is taken as a 16x16 matrix), so the size of each unit region is X (horizontal) by Y (vertical), where X = L_H / 16 and Y = L_V / 16.

Each pixel in a unit region causes flooding of some other pixels around the original pixel. From our experiments it is found that the amount of flooded pixels is 0.03 times the maximum capacity of each unit region. To convert each unit region into an entry of the binary matrix, the following procedure is used:

For each unit region
    If
        The region contains at least 38% of the maximum pixels of a unit region
    Then
        Set its matrix block to 1
    Else
        Set its matrix block to 0

In the experiments it is found that the maximum accuracy for converting a region into the binary matrix [1] is obtained when a considerable 38% of the area of a unit region is flooded by pixels. The overall activities for feature extraction are shown in figure 3.

4. RECOGNITION

The BAM neural network is used for recognition. A model of the BAM neural network is shown in figure 4. As shown in this figure, BAM has a two-layer recurrent architecture in which the backward weight matrix (the weights from the output to the input) is the transpose of the forward weight matrix (from the input to the output). The sizes of the input and output layers are determined by the dimensions of the pairs of associated vectors. As in the figure, an input pattern P is applied to the weight matrix W and produces an output vector Q, which is then applied to $W^T$ to produce a new input vector. This process is repeated until the network reaches a stable state (a state in which the input vector does not change in further iterations). This vector is then compared with the previously stored patterns that were used to train the network. The stored pattern that best matches the pattern in the stable state is taken as the output.

Figure 4: Block diagram representation of BAM neural network

4.1. Working mechanism of BAM

The BAM neural network must be given some stored input patterns and their associated output patterns for its learning. During the learning process the network calculates its weights for each link (from each input node to each output node and vice versa) using these stored patterns. For our purpose we trained the network with fifty Bengali characters and their associated output patterns. An example of such patterns is given below:
Input pattern BA:
-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,-1,-1,
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,-1, 1, 1,-1,
 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,-1, 1, 1, 1, 1,-1,
 1, 1, 1, 1, 1, 1, 1,-1,-1, 1, 1, 1, 1, 1, 1,-1,
 1, 1, 1, 1, 1,-1,-1, 1, 1, 1, 1, 1, 1, 1, 1,-1,
 1, 1, 1,-1,-1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,
 1, 1,-1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,
 1, 1,-1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,
 1, 1, 1,-1,-1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,
 1, 1, 1, 1, 1,-1,-1, 1, 1, 1, 1, 1, 1, 1, 1,-1,
 1, 1, 1, 1, 1, 1, 1,-1,-1, 1, 1, 1, 1, 1, 1,-1,
 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,-1, 1, 1, 1, 1,-1,
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,-1, 1, 1,-1,
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,-1,-1,
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1
Associated output pattern:
-1,-1,-1- 1,
 1, 1,-1,-1,
 1,-1, 1,-1,
 1, 1,-1,-1

The BAM network works using the following procedure:
1. Calculation of weight
The forward weight (as in figure 4) is calculated by the following formula:

$$W_{ji} = \sum_{p=1,\,q=1}^{m} P_{i,p}\, Q_{j,q}$$

Here, m is the number of stored pattern pairs; p and q advance together, so each term multiplies the two members of the same stored pair.
$W_{ji}$ is the connection weight from unit i to unit j,
$P_{i,p}$ is the i-th component of the pattern vector p,
$Q_{j,q}$ is the j-th component of the pattern vector q.
The backward weight $W_{ij}$ is equal to the forward weight $W_{ji}$.

2. Activation calculation
a. Initialization of the input units at time 0 with $O_i(0) = P_i$.
Here, $P_i$ is the i-th component of the input pattern and $O_i(0)$ is the i-th component of the output at time t = 0.
b. At time t (t > 0):

$$O_j(t+1) = F_h\left(\sum_{i} W_{ji}\, O_i(t)\right)$$

Here, $O_j(t)$ is the activation level of unit j at time t and $F_h$ is a hard limiting function given by the following values:
$F_h(a) = 1$, if a > 0.
$F_h(a) = -1$, if a < 0.
$F_h(a) = O_j(t)$, if a = 0.

3. Step 2 is repeated until equilibrium (i.e. the activation level remains unchanged with further iterations). The resulting pattern is then compared with the stored patterns, and the best-matched stored pattern is taken as the output.

4.2. Performance of BAM

For testing the performance of BAM, fifty stored input Bengali characters and their associated output patterns are used. Each input pattern is a 16x16 matrix, so there are 256 input nodes in the network, and each output pattern is a 4x4 matrix, so there are 16 output nodes. The performance is measured against the following criteria:

a) Deformed/distorted input to the network:
If the input pattern is very close to the actual pattern (character), the BAM network gives the maximum accuracy in recognizing the character. The accuracy is shown in the accuracy diagram below.

[Accuracy diagram: Deviation (%) and Correctness (%) plotted against readings 1 to 7]

Reading          1     2     3     4     5     6     7
Deviation (%)    0     6     12    18    24    30    48
Correctness (%)  99.9  97.6  95.4  90.5  83.3  66.6  50.2

This diagram shows the recognition accuracy of BAM at different deviations. For example, 6% deviation means that 6% of the bits in the input character matrix (16x16) are changed from the actual or correct character.

b) Execution time for recognition (on different platforms):
The average execution time for recognizing a character is measured on different machines: Pentium IV (1.5 GHz), Pentium II (350 MHz), Pentium III (700 MHz) and Celeron (733 MHz), keeping all other configurations the same. The times for the different machines are shown in diagram 1.
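
The weight calculation, the recall iteration of Section 4.1 and the deformed-input test of criterion (a) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' Turbo C++ code: it uses two small hypothetical pattern pairs (8 inputs, 3 outputs) instead of the fifty 16x16 characters, the function names are hypothetical, and the output units are assumed to start at -1 so that the a = 0 case of the hard limiting function has a previous value to keep.

```python
def hard_limit(a, previous):
    """F_h: +1 if a > 0, -1 if a < 0, otherwise keep the previous activation."""
    if a > 0:
        return 1
    if a < 0:
        return -1
    return previous


def train(pairs):
    """Forward weights: W[j][i] is the sum over stored pairs of P[i] * Q[j]."""
    n_in, n_out = len(pairs[0][0]), len(pairs[0][1])
    w = [[0] * n_in for _ in range(n_out)]
    for p, q in pairs:
        for j in range(n_out):
            for i in range(n_in):
                w[j][i] += p[i] * q[j]
    return w


def recall(w, pattern, max_iters=100):
    """Pass activations forward through W and backward through its transpose until stable."""
    n_out, n_in = len(w), len(w[0])
    x = list(pattern)
    y = [-1] * n_out  # assumption: initial output activations
    for _ in range(max_iters):
        new_y = [hard_limit(sum(w[j][i] * x[i] for i in range(n_in)), y[j])
                 for j in range(n_out)]
        new_x = [hard_limit(sum(w[j][i] * new_y[j] for j in range(n_out)), x[i])
                 for i in range(n_in)]
        if new_x == x and new_y == y:
            break  # equilibrium: nothing changed in this iteration
        x, y = new_x, new_y
    return x, y


def best_match(x, pairs):
    """Index of the stored input pattern agreeing with x in the most positions."""
    return max(range(len(pairs)),
               key=lambda k: sum(a == b for a, b in zip(x, pairs[k][0])))


# Two hypothetical stored pairs (the input patterns are mutually orthogonal).
p1, q1 = [1, 1, 1, 1, -1, -1, -1, -1], [1, 1, -1]
p2, q2 = [1, -1, 1, -1, 1, -1, 1, -1], [1, -1, 1]
pairs = [(p1, q1), (p2, q2)]
w = train(pairs)

noisy = [-p1[0]] + p1[1:]        # one of eight bits changed (12.5% deviation)
x, y = recall(w, noisy)
print(best_match(x, pairs))      # 0
```

In this toy case the distorted input settles back onto the first stored pair, which mirrors the behaviour reported for small deviations in the accuracy table; with many stored patterns and larger deviations the network may settle elsewhere.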
[Diagram 1: Execution time in different platforms. Cost (in time) diagram showing Time (ms) against processors (Celeron 733 MHz, P-II 350 MHz, P-III 700 MHz, P-IV 1.5 GHz); data labels 896.3645, 549.4893, 54.945, 0.1024.]

5. CONCLUSION

The nature of the BAM neural network is to reach a stable state (a state in which the successive input patterns to the network do not change in further iterations). This state is reached by considering the given input pattern and the information about the stored patterns that was used in the learning phase of the network. For this reason, in the case of largely deformed input patterns BAM gives higher accuracy in recognizing the character.

Again, the learning time of BAM is negligible compared to other networks. For a Back Propagation Network, for example, the average learning time on a Pentium IV (1.5 GHz) is about 54.945 milliseconds for only two input nodes and one hidden layer in a three-layer network, and this time increases further with the number of input nodes and hidden layers.

The code for this system is implemented using Turbo C++ (version 3.0).

REFERENCES

[1] Md. Morshedul Arefin, Md. Khademul Islam Molla, Md. Lutfar Rahman, M. Gangar Ali, Size independent Bangla optical character recognition system, Proc. ICCIT2001, pp. 314-318, Dhaka, 28-29 December 2001.
[2] Mohammad Ali Asgar, Muhammad Mansur Ali, Recognition of handwritten Bangla digit by intelligent regional search method, Proc. ICCIT2001, pp. 297-302, Dhaka, 28-29 December 2001.
[3] J. N. Kapur, P. K. Sahoo and Wong, A new method for gray level picture thresholding using the entropy of histogram, Comput. Vision Graphics Image Process. 29, pp. 273-285 (1985).
[4] N. Otsu, A threshold selection method from gray level histogram, IEEE Trans. Systems Man Cybernet. 9, pp. 62-66 (1979).
[5] B. B. Chaudhuri and U. Pal, A Complete Printed Bangla OCR System, Pattern Recognition, vol. 31, no. 5, pp. 531-549, 1998.
[6] L. O'Gorman, The document spectrum for page layout analysis, IEEE Trans. Pattern Anal. Mach. Intell. 15, pp. 1162-1172 (1993).
[7] H. Yun, Skew correction of document images using inter line cross-correlation, CVGIP: Graphical Model Image Process. 55, pp. 538-543 (1993).
[8] T. Pavlidis and J. Zhou, Page segmentation and classification, Comput. Vision Graphics Image Process. 54, pp. 484-496 (1992).
[9] T. Akiyama and N. Hagita, Automatic entry system for printed document, Pattern Recognition 23, pp. 1141-1154 (1990).
[10] D. S. Le, G. R. Thoma and H. Wechsler, Automatic page orientation and skew angle detection for binary document images, Pattern Recognition 27, pp. 1325-1344 (1994).
[11] A. K. Jain and S. Bhattacharjee, Text segmentation using automatic document processing, Mach. Vision Appl. 5, pp. 169-184 (1992).
[12] L. O'Gorman, The document spectrum for page layout analysis, IEEE Trans. Pattern Anal. Mach. Intell. 15, pp. 1162-1173 (1993).
[13] L. A. Fletcher and R. Kasturi, A robust algorithm for text string separation from mixed text / graphics images, IEEE Trans. Pattern Anal. Mach. Intell. 10, pp. 910-918 (1988).
[14] T. Pavlidis and J. Zhou, Page segmentation and classification, Comput. Vision Graphics Image Process. 54, pp. 484-496 (1992).
[15] Md. Resaul Bashar, Md. Khademul Islam Mollah and Md. Lutfar Rahman, Recognition of different sized Bangla optical character using neural network, Proc. ICCIT2000, pp. 185-188, Dhaka, Bangladesh, 25-26 January, 2001.
