
Proceedings of the 7th World Congress on Intelligent Control and Automation
June 25 - 27, 2008, Chongqing, China

Unconstrained Handwritten Character Recognition Based on WEDF and Multilayer Neural Network

Minhua Li, Chunheng Wang, Ruwei Dai
Key Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences
Beijing, 100080, China
{minhua.li, chunheng.wang, ruwei.dai}@ia.ac.cn

Abstract - In this paper, we propose a new approach for unconstrained handwritten character recognition based on the Wavelet Energy Density Feature (WEDF) and a multilayer neural network. Unlike other methods that take the wavelet coefficients directly as features, our method uses the wavelet energy density features instead. The proposed approach consists of a feature extraction stage, which extracts wavelet energy density features with the wavelet transform, and a classification stage, which classifies handwritten characters with a simple neural network. In order to verify the performance of the proposed method, experiments are carried out on handwritten numeral recognition. Experimental results indicate that the WEDF is stable and reliable in handwritten character recognition and performs better than the wavelet coefficient features; it provides a high recognition rate on both training samples and testing samples.

Index Terms - handwritten character recognition, wavelet analysis, wavelet energy density feature (WEDF), multilayer neural network

I. INTRODUCTION

Unconstrained handwritten character recognition has been a topic of intense research since the sixties. However, the problem is still not fully solved due to the large degree of variability of human writing. Attaining a high recognition rate for handwritten characters requires good feature extractors and powerful classifiers that are robust to variations of writing style and size. The advantage of using wavelets is the multi-resolution analysis, which provides local information in both the space domain and the frequency domain [4, 5]. Neural networks have demonstrated their abilities in various pattern recognition tasks, including character recognition. In this paper, we adopt wavelet features and a neural network to recognize unconstrained handwritten numerals. Several papers have been published that use the wavelet transform as a feature extractor and a neural network as a classifier for handwritten numeral recognition [3, 6, 7].

Reference [3] proposed a scheme to recognize unconstrained handwritten numerals using Haar wavelets to extract multi-resolution features and a multilayer cluster neural network to classify the extracted features. Two types of feature vectors were considered: the first type used only the features at one resolution level and the second type used all the features at two resolution levels. This scheme achieved good results and was very robust to variations of style and size. Reference [6] proposed a method to recognize unconstrained handwritten numerals using the spline wavelet Cohen-Daubechies-Feauveau (CDF) 3/7 to extract features. In that system, the bi-dimensional wavelet transform was applied and the four resulting sub-band images of coefficients were used as the feature vector. The classifier was a multilayer cluster neural network trained with the back propagation momentum algorithm. In their later research [7], four experiments were performed in order to test improvements in the performance of the recognition system using the spline wavelet CDF 3/7. The first experiment validated the system using numerals from NIST, which is much larger than the database used in [3]. The second experiment used an independent neural network for each sub-band image of the feature vector to verify the importance of each one in the recognition process. In the third experiment, ten neural networks, each trained on a distinct class, were tested. In the fourth experiment, the size-normalized numerals were presented directly to a neural network without any feature extraction.

The method proposed in this paper differs from the above methods in the following aspects:
(1) The wavelet energy density features (WEDF) are extracted instead of the wavelet coefficient features;
(2) The features are extracted by dividing the wavelet-decomposed image into sub-blocks, which decreases the number of features;
(3) The DB4 wavelet is adopted to decompose the image;
(4) The classifier in our method is a simple BP neural network whose structure is much simpler than those of the above methods.

We carry out experiments to test the stability of the wavelet energy density feature and to study the relationship between recognition rate and sub-block size by dividing the sub-band images into sub-blocks of different sizes. Then, in order to verify the performance of the proposed method, three experiments are carried out. The first experiment takes the whole normalized character image as the input feature without any feature extraction step; the second experiment takes all the wavelet coefficients in the seven sub-images as input features after two scales of wavelet decomposition; the third experiment takes the wavelet energy density features proposed in this paper as input features. The classifier in all three experiments is a BP neural network. Experimental results show that the proposed method is efficient: it performs better than the wavelet coefficients, and meanwhile it decreases the number of features input to the neural network and makes the structure of the neural network simpler.
This work is supported by the National Natural Science Foundation of China under Grants No. 60602031 and No. 60621001.

The rest of the paper is organized as follows: Section II introduces the wavelet transform and the wavelet energy density feature extraction method; Section III briefly describes the BP classifier adopted in this paper; experiments and results are presented in Section IV; finally, Section V concludes the paper.

II. FEATURE EXTRACTION

A. Wavelet Transformation

The wavelet transform gives us an invariant interpretation of character images at different physical levels and presents a multi-resolution analysis in the form of coefficient matrices [4]. It provides local information in both the space domain and the frequency domain. The ability to capture local information is essential because numerals are locally quite different.

Let $\varphi(x)$ and $\psi(x)$ be the scaling and wavelet function of variable $x$, respectively; then the functions obtained by dilations and translations of them are:

$$\varphi_{k,i}(x) = 2^{-k/2}\,\varphi(2^{-k}x - i), \qquad \psi_{k,i}(x) = 2^{-k/2}\,\psi(2^{-k}x - i) \qquad (1)$$

In the 2D case, the corresponding transform basis functions are as follows:

$$\varphi^{LL}_{k:i,j}(x,y) = \varphi_{k,i}(x)\,\varphi_{k,j}(y), \qquad \psi^{LH}_{k:i,j}(x,y) = \psi_{k,i}(x)\,\varphi_{k,j}(y),$$
$$\psi^{HL}_{k:i,j}(x,y) = \varphi_{k,i}(x)\,\psi_{k,j}(y), \qquad \psi^{HH}_{k:i,j}(x,y) = \psi_{k,i}(x)\,\psi_{k,j}(y) \qquad (2)$$

Let $f(x,y)$ be the intensity of a pixel at $(x,y)$, where $x$ varies from 0 to $M-1$, $y$ from 0 to $N-1$, and $M$ and $N$ are the height and width of the image, respectively. For a given image, the wavelet decomposition at one scale level results in four sub-band images $\{LL, LH, HL, HH\}$. $LL$ corresponds to the low-frequency components (global information), and $\{LH, HL, HH\}$ represent the high-frequency components: $LH$ gives the vertical high frequencies (horizontal details), $HL$ the horizontal high frequencies (vertical details), and $HH$ the high frequencies in both diagonal directions (diagonal details). The Mallat decomposition algorithm plays an important part in wavelet decomposition; it can be implemented using filter banks consisting of high-pass and low-pass filters. The application to an image consists of a filtering process in the horizontal direction and a subsequent filtering process in the vertical direction. The decomposition algorithm is expressed as (3), and Fig. 1 shows the decomposition steps. The DB4 wavelet is employed here because it performs better in handwritten character recognition [8].

$$LL^{(k)}(m,n) = \Big[\big[[LL^{(k-1)} \ast_{\mathrm{rows}} H]_{\rightarrow 2}\big] \ast_{\mathrm{columns}} H\Big]_{\downarrow 2}$$
$$HL^{(k)}(m,n) = \Big[\big[[LL^{(k-1)} \ast_{\mathrm{rows}} H]_{\rightarrow 2}\big] \ast_{\mathrm{columns}} G\Big]_{\downarrow 2}$$
$$LH^{(k)}(m,n) = \Big[\big[[LL^{(k-1)} \ast_{\mathrm{rows}} G]_{\rightarrow 2}\big] \ast_{\mathrm{columns}} H\Big]_{\downarrow 2}$$
$$HH^{(k)}(m,n) = \Big[\big[[LL^{(k-1)} \ast_{\mathrm{rows}} G]_{\rightarrow 2}\big] \ast_{\mathrm{columns}} G\Big]_{\downarrow 2}$$
$$m = 1, \ldots, M/2^{k}, \qquad n = 1, \ldots, N/2^{k} \qquad (3)$$

where $LL^{(0)}$ represents the original image, $\rightarrow 2$ and $\downarrow 2$ denote downsampling by two along the rows and along the columns, respectively, $k$ is the number of wavelet analysis steps (the decomposition level), and $H$ and $G$ act as the low-pass and high-pass filters. $H$ and $G$ are related as (4):

$$G_{n} = (-1)^{1-n} H_{1-n} \qquad (4)$$

Fig. 1 Wavelet decomposition steps (after two scales, the second-scale sub-bands LL2, HL2, LH2, HH2 replace the first-scale LL1, alongside the first-scale HL1, LH1, HH1 bands)
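For illustration, one level of the filter-bank decomposition in (3) can be sketched as follows. This sketch is an editorial illustration rather than the authors' implementation; it assumes NumPy and PyWavelets, and it takes the DB4 analysis filters from PyWavelets, whose high-pass filter already satisfies the quadrature-mirror relation (4). In practice, pywt.dwt2(image, 'db4') performs the same one-level decomposition directly.

    # Illustrative sketch (not the authors' code): one level of the Mallat
    # filter-bank decomposition of Eq. (3), assuming NumPy and PyWavelets.
    import numpy as np
    import pywt

    def analysis_filters():
        # Low-pass filter H taken from the DB4 wavelet; the high-pass filter G
        # follows the quadrature-mirror relation of Eq. (4), which PyWavelets
        # already applies when building dec_hi from dec_lo.
        w = pywt.Wavelet('db4')
        return np.asarray(w.dec_lo), np.asarray(w.dec_hi)

    def filter_downsample(rows, filt):
        # Convolve every row with `filt` (periodic extension) and keep every
        # second sample: one "filter then downsample by 2" stage of Eq. (3).
        n = rows.shape[1]
        out = np.empty_like(rows, dtype=float)
        for i, row in enumerate(rows):
            out[i] = np.convolve(np.tile(row, 3), filt, mode='same')[n:2 * n]
        return out[:, ::2]

    def dwt2_one_level(image):
        # Eq. (3): filter the rows with H or G, then the columns, downsampling
        # by two after each stage, to obtain the LL, HL, LH and HH sub-bands.
        h, g = analysis_filters()
        lo = filter_downsample(image, h)      # rows * H, downsample
        hi = filter_downsample(image, g)      # rows * G, downsample
        ll = filter_downsample(lo.T, h).T     # columns * H
        hl = filter_downsample(lo.T, g).T     # columns * G
        lh = filter_downsample(hi.T, h).T
        hh = filter_downsample(hi.T, g).T
        return ll, hl, lh, hh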

B. Wavelet Energy Density Feature Extraction

Before extracting features, the images are normalized to a fixed size; after that, we obtain the skeleton of the normalized image (a preprocessing sketch is given at the end of this subsection). Then the wavelet decomposition algorithm is applied at two scales of resolution, yielding seven sub-band images $\{LL^2, LH^2, HL^2, HH^2, LH^1, HL^1, HH^1\}$, as shown in Fig. 1. The sub-band image $LL^2$ represents the basic shape of the character, and $\{HL^2, HL^1\}$, $\{LH^2, LH^1\}$ and $\{HH^2, HH^1\}$ respectively represent the strokes of the character in the vertical, horizontal and diagonal directions at different scales. Fig. 2 shows the decomposition result for a character image. The features are extracted from the seven sub-band images.

Fig. 2 Wavelet decomposition of a handwritten character at two scales

The wavelet energy density feature is adopted as the input feature to the classifier. The experimental results reported later show that the wavelet energy density feature performs better in character recognition than the wavelet coefficients.
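A minimal sketch of the preprocessing and two-scale decomposition described above is given below. It assumes scikit-image for resizing and skeletonization and PyWavelets for the DB4 decomposition; it is not the authors' implementation, and the mapping of PyWavelets' detail bands onto the LH/HL naming used in this paper is only an assumption.

    # A minimal sketch (assumptions: scikit-image for resizing/skeletonization,
    # PyWavelets for the DB4 decomposition) of the preprocessing and two-scale
    # decomposition described above.
    import numpy as np
    import pywt
    from skimage.transform import resize
    from skimage.morphology import skeletonize

    def seven_subbands(char_image, size=64):
        # 1. normalize the binary character image to a fixed size
        img = resize(char_image.astype(float), (size, size), order=0)
        # 2. thin the strokes to a one-pixel-wide skeleton
        skel = skeletonize(img > 0.5).astype(float)
        # 3. two-scale DB4 decomposition; 'periodization' keeps each sub-band
        #    at exactly half the size of the previous level
        ll2, (lh2, hl2, hh2), (lh1, hl1, hh1) = pywt.wavedec2(
            skel, 'db4', level=2, mode='periodization')
        # NOTE: PyWavelets orders the detail bands as (horizontal, vertical,
        # diagonal); the correspondence to the paper's LH/HL naming is assumed.
        return {'LL2': ll2, 'LH2': lh2, 'HL2': hl2, 'HH2': hh2,
                'LH1': lh1, 'HL1': hl1, 'HH1': hh1}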

In order to decrease the feature dimension and improve the recognition rate, we divide each sub-band image into sub-blocks and extract the wavelet energy density feature from every sub-block of the sub-band image. The details of extracting the wavelet energy density feature can be expressed in the following steps (a sketch of the whole procedure follows the list):

(1) Divide the sub-band image into sub-blocks.
After the wavelet transformation, each sub-image is divided into sub-blocks, as shown in Fig. 3. Dividing the sub-band image into sub-blocks of different sizes results in different numbers of features and different recognition rates. In Section IV, experiments are carried out to investigate the relationship between recognition rate and sub-block size.

Fig. 3 Dividing the image into sub-blocks

(2) Extract the wavelet energy of each sub-band image.
In this paper, the wavelet energy of sub-band image $k$ $(k = 1, 2, 3, \ldots, 7)$ is defined as (5):

$$WE\_SubI(k) = \sum_{x}\sum_{y} Coef(x, y) \qquad (5)$$

where $x \in SubI(k)$, $y \in SubI(k)$, $SubI(k)$ is one of $\{LL^2, LH^2, HL^2, HH^2, LH^1, HL^1, HH^1\}$, and $Coef(x, y)$ is the wavelet coefficient at $(x, y)$ in $SubI(k)$.

(3) Extract the wavelet energy of each sub-block.
According to (5), the wavelet energy of each sub-block can be computed; we denote the wavelet energy of sub-block $n$ as (6):

$$WE\_SubB(n) = \sum_{x}\sum_{y} Coef(x, y) \qquad (6)$$

where $x \in SubB(n)$, $y \in SubB(n)$, $SubB(n)$ is a sub-block in one sub-band image, and $Coef(x, y)$ is the wavelet coefficient at $(x, y)$ in $SubB(n)$.

(4) Extract the wavelet energy density feature of each sub-block in the sub-band image.
Suppose a sub-block $SubB(n)$ belongs to a sub-band image $SubI(k)$; the wavelet energy density feature of sub-block $n$ is obtained from (7):

$$WED\_SubB(n) = \frac{WE\_SubB(n)}{WE\_SubI(k)} \qquad (7)$$

where $SubB(n) \in SubI(k)$ and $SubI(k)$ is one of $\{LL^2, LH^2, HL^2, HH^2, LH^1, HL^1, HH^1\}$.

(5) Construct the feature vector.
For each sub-block, we extract the wavelet energy density feature according to (7). Then all the wavelet energy density features are input to the neural network for classification.
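The following sketch (ours, assuming NumPy) walks through steps (1)-(5) on the sub-band images produced by a decomposition such as the one sketched in Section II-A. The energies in (5) and (6) are computed here from the absolute values of the coefficients, which is our assumption so that positive and negative coefficients do not cancel; otherwise the functions follow (5)-(7) directly.

    # A sketch of steps (1)-(5), assuming NumPy; `subbands` is a dict such as
    # the one returned by seven_subbands() in the earlier sketch.
    import numpy as np

    def subband_energy(coefs):
        # Eq. (5)/(6): wavelet energy of a sub-band image or a sub-block.
        # Absolute values are an assumption (see the lead-in above).
        return np.abs(coefs).sum()

    def wedf_features(subband, block):
        # Step (1): cut the sub-band image into block x block sub-blocks
        # (any ragged border pixels are ignored).
        # Steps (2)-(4): Eq. (7), energy of each sub-block divided by the
        # energy of the whole sub-band image.
        total = subband_energy(subband)
        rows, cols = subband.shape
        feats = []
        for r in range(0, rows - rows % block, block):
            for c in range(0, cols - cols % block, block):
                feats.append(subband_energy(subband[r:r + block, c:c + block])
                             / max(total, 1e-12))
        return feats

    def feature_vector(subbands, block_sizes):
        # Step (5): concatenate the densities of every sub-block of every
        # sub-band image into one input vector for the classifier.
        vec = []
        for name, sb in subbands.items():
            vec.extend(wedf_features(sb, block_sizes[name]))
        return np.asarray(vec)

With 64x64 images and 4x4 sub-blocks on the second-scale sub-bands, each of LL2, LH2, HL2 and HH2 would contribute 16 features, matching the stability study in Section IV.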
III. CLASSIFICATION

In this paper, a BP neural network is applied as the classifier. The BP neural network is one kind of ANN, also called an error back propagation network. It is a supervised learning network which can realize a non-linear mapping from n dimensions to m dimensions. Many convergence algorithms have been put forward; in this paper the gradient-descent algorithm is applied to the neural network learning (a sketch follows Fig. 4). Fig. 4 shows the multilayer neural network structure, where V and W are the hidden-layer weight matrix and the output-layer weight matrix, respectively. Through learning and training, the output of the neural network becomes very close to the ideal output. When it reaches the set precision, we save all the weights of the neural network, such as V and W, and the neural network with V and W can then be used for character recognition. In this paper, a three-layer BP neural network is adopted.

Fig. 4 Structure of the multilayer neural network
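A minimal sketch of such a three-layer BP network trained by plain gradient descent is given below (assuming NumPy; biases and momentum are omitted for brevity, and this is not the authors' code). For example, the Experiment 3 configuration of Section IV would be ThreeLayerBP(252, 20, 10).

    # A minimal sketch, not the authors' network: a three-layer perceptron
    # trained with gradient descent on the squared error, assuming NumPy.
    import numpy as np

    class ThreeLayerBP:
        def __init__(self, n_in, n_hidden, n_out, lr=0.1, seed=0):
            rng = np.random.default_rng(seed)
            self.V = rng.normal(0, 0.1, (n_in, n_hidden))    # hidden-layer weights
            self.W = rng.normal(0, 0.1, (n_hidden, n_out))   # output-layer weights
            self.lr = lr

        @staticmethod
        def _sigmoid(a):
            return 1.0 / (1.0 + np.exp(-a))

        def forward(self, x):
            h = self._sigmoid(x @ self.V)
            return h, self._sigmoid(h @ self.W)

        def train_step(self, x, target):
            # Back-propagate the squared error and take one gradient step.
            h, y = self.forward(x)
            delta_out = (y - target) * y * (1 - y)
            delta_hid = (delta_out @ self.W.T) * h * (1 - h)
            self.W -= self.lr * np.outer(h, delta_out)
            self.V -= self.lr * np.outer(x, delta_hid)
            return 0.5 * np.sum((y - target) ** 2)

        def predict(self, x):
            # A numeral is assigned to the class whose output neuron is largest.
            _, y = self.forward(x)
            return int(np.argmax(y))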
IV. EXPERIMENTS AND RESULTS

In order to verify the performance of the proposed method, handwritten numeral recognition is taken as an example. We use the database collected by our lab; some samples of the database are shown in Fig. 5. The database contains 48280 samples written by different people, 4828 samples for each numeral. We choose 15000 samples for training and 10000 samples for testing. In all the experiments, the DB4 wavelet is adopted to decompose the image, and the low-pass and high-pass filter coefficients of the wavelet filter bank are listed in Table I.

Fig. 5 Some examples in the database

TABLE I  DB4 WAVELET LOW-PASS FILTER (Hn) AND HIGH-PASS FILTER (Gn) COEFFICIENTS

A. Experiments for Analysing the Wavelet Energy Density Feature

(1) Analyse the stability of the wavelet energy density feature
After two-scale wavelet decomposition, the image is divided into seven sub-band images: the sub-band image LL2 represents the basic shape of the character, and {HL2, HL1}, {LH2, LH1} and {HH2, HH1} respectively represent the energy of the character in the vertical, horizontal and diagonal directions at different scales. The wavelet energy density feature extracted from a sub-block in essence represents the relative share of the whole energy in different directions and scales. In order to verify the stability of this feature, we extract features from sub-blocks in the sub-band images {LL2, LH2, HL2, HH2} for classes 0-9 and study the feature distribution (a sketch of this check follows Fig. 6). Each class contains 10 examples; the character images are normalized to 64x64 pixels before wavelet decomposition, and we divide the sub-band images {LL2, LH2, HL2, HH2} into 4x4-pixel sub-blocks, so 16 features can be extracted from each of these sub-band images. Fig. 6 shows the wavelet energy density feature distribution in each sub-band image for each class, extracted from ten samples.

Experimental results in Fig. 6 demonstrate that the wavelet energy density feature distributions within each class are similar in each sub-band image and are dissimilar for different classes in each sub-band image, which proves the feature's stability and reliability.

Fig. 6 Wavelet energy density feature distribution in the second-scale sub-band images for the 10 numeral classes: (a) HL2 sub-band image, (b) LH2 sub-band image, (c) HH2 sub-band image, (d) LL2 sub-band image
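A sketch of this stability check is given below. It reuses seven_subbands() and wedf_features() from the earlier sketches and assumes that samples_by_class maps each digit label to a list of binary character images; it is an editorial illustration, not the authors' analysis code.

    # Sketch of the stability check in (1): for each digit class, collect the
    # 16 WEDF values of one second-scale sub-band over a few samples and
    # compare the per-class means and spreads.
    import numpy as np

    def class_distributions(samples_by_class, band='HL2', block=4):
        stats = {}
        for digit, images in samples_by_class.items():
            feats = np.array([wedf_features(seven_subbands(img)[band], block)
                              for img in images])
            stats[digit] = (feats.mean(axis=0), feats.std(axis=0))
        return stats

    # Small within-class spread and distinct between-class means correspond to
    # the stable, discriminative behaviour reported in Fig. 6.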

(2) Research the relationship between recognition rate and sub-block size
Dividing the sub-band image into sub-blocks of different sizes results in different recognition rates. In this part, we carry out experiments to study the relationship between recognition rate and sub-block size. In all the experiments in this part, a simple distance matching classifier is adopted for classification (a sketch of such a matcher follows Fig. 7), and all the images are normalized to 64x64 pixels before wavelet transformation. We divide the sub-band images into sub-blocks of different sizes and extract the wavelet energy density features according to the steps in Section II. Experimental results are listed in Table II and Fig. 7. Here, n x n denotes that the sub-block size is n x n pixels; 1 x 1 denotes that the sub-block size is 1 x 1 pixels, in which case a sub-block contains one pixel.

The results show that the recognition rate is related to the sub-block size. If the sub-image is divided into small sub-blocks, the recognition rate will be high, but the number of features will be large, which costs a lot of computing time. Considering the recognition rate and the computing time together, the sub-block size should be neither too big nor too small.

TABLE II  EXPERIMENTAL RESULTS USING DIFFERENT SUB-BLOCK SIZES

Fig. 7 Relationship of recognition rate and sub-block size (recognition rate, from 0.5 to 1, plotted against sub-block sizes from 1x1 to 10x10)
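The paper does not specify the distance matching classifier in detail; the sketch below shows one plausible reading, a nearest-class-mean matcher on the WEDF vectors under the Euclidean distance, and should be taken only as an assumption.

    # Hedged sketch of a simple distance-matching classifier: each class is
    # represented by the mean WEDF vector of its training samples, and a test
    # vector is assigned to the class with the smallest Euclidean distance.
    import numpy as np

    def fit_class_means(train_vectors, train_labels):
        train_vectors = np.asarray(train_vectors)
        train_labels = np.asarray(train_labels)
        return {c: train_vectors[train_labels == c].mean(axis=0)
                for c in np.unique(train_labels)}

    def match(vector, class_means):
        dists = {c: np.linalg.norm(vector - m) for c, m in class_means.items()}
        return min(dists, key=dists.get)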
B. Experiments for Verifying the Performance of the Proposed Method
Three kinds of experiments have been considered to test the performance of the proposed method. In these experiments the BP neural network is taken as the classifier.

Experiment 1: The character image is normalized to 20x20 pixels, and the whole image, without any feature extraction, is directly input to the BP neural network for classification as a 400-dimensional feature vector.

Experiment 2: The character image is normalized to 20x20 pixels; after the image goes through a two-scale wavelet transformation, seven sub-images with 400 wavelet coefficients in total are obtained. All the wavelet coefficients, 400-dimensional features in total, are input to the BP neural network for classification.

Experiment 3: In order to obtain a high recognition rate and save computing time, the sub-block size should be neither too big nor too small. The character images are normalized to 48x48 pixels before wavelet transformation.

Our proposal for dividing the sub-band images into sub-blocks is as follows: the sub-band images {LL2, LH2, HL2, HH2} are divided into smaller sub-blocks, and the sub-band images {LH1, HL1, HH1} are divided into bigger sub-blocks. The sub-band images {LL2, LH2, HL2, HH2} are 12x12 pixels; we divide them into 2x2-pixel sub-blocks, so 6x6x4 = 144 features are extracted. The sub-band images {LH1, HL1, HH1} are 24x24 pixels; we divide them into 4x4-pixel sub-blocks, so 6x6x3 = 108 features are extracted. The total number of features is therefore 144 + 108 = 252, and all the features are input to the BP neural network for classification (the sketch below illustrates this layout).
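The Experiment 3 feature layout can be summarized in the following sketch, which reuses the earlier seven_subbands() and feature_vector() functions and simply reproduces the block-size arithmetic above; the function and variable names are ours.

    # Experiment 3 layout as a sketch: 48x48 input, second-scale sub-bands of
    # 12x12 pixels cut into 2x2 blocks, first-scale sub-bands of 24x24 pixels
    # cut into 4x4 blocks, giving 144 + 108 = 252 features.
    BLOCK_SIZES = {'LL2': 2, 'LH2': 2, 'HL2': 2, 'HH2': 2,
                   'LH1': 4, 'HL1': 4, 'HH1': 4}

    def experiment3_feature_count(image_size=48):
        level2, level1 = image_size // 4, image_size // 2   # 12 and 24 pixels
        n2 = (level2 // 2) ** 2 * 4                         # 6*6*4 = 144
        n1 = (level1 // 4) ** 2 * 3                         # 6*6*3 = 108
        return n2 + n1                                      # 252

    # feature_vector(seven_subbands(img, size=48), BLOCK_SIZES) would then
    # produce the 252-dimensional input for the 252-20-10 network.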
The structures of the BP neural network for each experiment are as follows:
Experiment 1: 400 input neurons, 60 hidden neurons, 10 output neurons;
Experiment 2: 400 input neurons, 60 hidden neurons, 10 output neurons;
Experiment 3: 252 input neurons, 20 hidden neurons, 10 output neurons.

For every experiment in this part, there are 10 output neurons, one neuron for each class. A numeral is classified as belonging to class i if the i-th output of the neural network produces the largest value.

Experiment results are listed in Table III. The recognition rates of the proposed method are 99.87% on the training samples and 99.18% on the testing samples, which is much better than the method in Experiment 1 without wavelet decomposition, whose recognition rates on the training and testing samples are 96.6312% and 95.6787%, and better than the method in Experiment 2, which takes all the wavelet coefficients as features and whose recognition rates on the training and testing samples are 99.5933% and 98.15%. If the sub-images are divided into smaller sub-blocks, the recognition rate could be even higher. Meanwhile, the number of features in the proposed method is much smaller and the structure of the classifier is much simpler than in Experiment 1 and Experiment 2. This proves that the wavelet energy density feature is stable and reliable in handwritten character recognition and that the proposed method is effective.

TABLE III  RECOGNITION RESULTS FOR THE 3 EXPERIMENTS

V. CONCLUSION

A new method for unconstrained handwritten character recognition based on wavelet analysis and a multilayer neural network is proposed in this paper. The wavelet energy density feature is adopted as the input feature and a simple BP network is taken as the classifier. In order to test the performance of the method, experiments are carried out on handwritten numeral recognition. Experimental results indicate that the wavelet energy density feature is stable and reliable in handwritten character recognition, performs better than the wavelet coefficients, and reduces the number of features input to the classifier, which decreases the computing time and makes the structure of the BP neural network much simpler. The method proposed in this paper is efficient and achieves a high recognition rate on both the training samples and the testing samples, providing a good method for handwritten character recognition.

REFERENCES
[1] DAI Ruwei, LIU Chenglin and XIAO Baihua, "Chinese character recognition: history, status and prospects," The 18th International Conference on Pattern Recognition, Hong Kong, China, August 2006.
[2] Cheng-Lin Liu, Kazuki Nakashima, Hiroshi Sako and Hiromichi Fujisawa, "Handwritten digit recognition: investigation of normalization and feature extraction techniques," Pattern Recognition, Vol. 37, No. 2, 2004, pp. 265-279.
[3] S.-W. Lee, C.-H. Kim and Y. Y. Tang, "Multiresolution recognition of unconstrained handwritten numerals with wavelet transform and multilayer cluster neural network," Pattern Recognition, 29(12): 1953-1961, 1996.
[4] I. Daubechies, "Ten Lectures on Wavelets," SIAM, 1992.
[5] S. G. Mallat, "A theory for multiresolution signal decomposition: the wavelet representation," IEEE Trans. on Pattern Analysis and Machine Intelligence, 11(7), pp. 674-693, 1989.
[6] S. E. N. Correia and J. M. Carvalho, "Optimizing the recognition rates of unconstrained handwritten numerals using biorthogonal spline wavelets," in Proceedings of ICPR 2000, Barcelona, Spain, September 2000.
[7] S. E. N. Correia, J. M. Carvalho and Robert Sabourin, "On the performance of wavelets for handwritten numerals recognition," in Proceedings of the 16th International Conference on Pattern Recognition, Vol. 3, 11-15 Aug. 2002, pp. 127-130.
[8] Lei Huang and Xiao Huang, "Multiresolution recognition of offline handwritten Chinese characters with wavelet transform," in Proceedings of the Sixth International Conference on Document Analysis and Recognition, 10-13 Sept. 2001, pp. 631-634.
[9] Yuanyan Tang and Ling Wang, "Wavelet analysis for text and character recognition," Beijing: The Science Publishing House, 2005.
