Abstract The process of face recognition is complicated by background and pose
variations in the images. Using the pre-processing techniques proposed in this paper,
the essential invariant features in an image are made available for extraction.
Background removal based on eccentricity is implemented by incorporating both the
YCbCr and HSV color models to eliminate unnecessary features in the background.
Multi-scaled fusion is included to nullify the variation in pose. Next, the images are
subjected to feature extraction using the two-dimensional Discrete Wavelet Transform
(DWT) and a feature selection algorithm. Experimental results show the effectiveness of
the above-mentioned techniques for face recognition on two benchmark face databases,
namely CMU-PIE and Caltech.
1 Introduction
The concept of face recognition has been at the centre of image processing, more
specifically in the field of biometrics, for many years. Its applications span a
variety of fields, i.e. surveillance and tracking, criminal identification and
authentication of identity. It involves identifying the distinguishing features of a face
and classifying them accordingly [1]. The success of an FR system is affected by
various factors such as background, pose and illumination, which can adversely
affect its recognition accuracy.
The images used for testing the proposed techniques may include uncooperative
subjects with varying backgrounds. The background tends to affect the efficiency of
the system, so the background features need to be removed from the images in order
to relieve the system of any unnecessary computation.
The face recognition process is optimized by isolating only the face to be recognized
while removing the non-essential background. This background removal is a
combination of skin segmentation, morphological operations and eccentricity-range
based region selection. The method is enhanced by introducing another color model,
namely HSV [4], during the skin segmentation stage for extracting the face region.
In addition to the background issue, there is the matter of pose variance, which can
also adversely affect the recognition ability of the system. The differences in pose can
be neutralised by fusing the left and right poses of the face into a single image.
3 Prior Art
This paper improves the performance of the FR system by incorporating the
techniques described below:
In this step the face is normalised in terms of pose by combining it with its mirrored
image and compressing it, as shown in Fig. 1, thereby establishing symmetry along
the vertical axis. Lengthwise compression and fusion of images with different ratios
reduce the redundancy in images (due to vertical symmetry) [5]. This technique
reduces the effect of pose variance and improves the correlation between
images of the same subject in different profiles.
Face Recognition Using Background Removal Based . . . 35
Fig. 1 Multi-scaled fusion: the initial image is added with its mirror image after lengthwise compression at different ratios (85 % and 73 % shown) to yield pose-normalised images
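The fusion step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and crude column subsampling stands in for proper interpolation-based resizing.

```python
import numpy as np

def compress_width(img, ratio):
    """Crudely compress image width by keeping a fraction of columns.

    A real implementation would use interpolation-based resampling;
    column subsampling keeps this sketch dependency-free.
    """
    h, w = img.shape[:2]
    new_w = max(1, int(round(w * ratio)))
    cols = np.linspace(0, w - 1, new_w).round().astype(int)
    return img[:, cols]

def fuse_with_mirror(img, ratio=0.85):
    """Compress the image lengthwise, then average it with its mirror.

    The output is symmetric about the vertical axis, which reduces the
    effect of left/right pose variation.
    """
    compressed = compress_width(img, ratio).astype(float)
    return (compressed + compressed[:, ::-1]) / 2.0

# Example: the fused image is exactly mirror-symmetric.
face = np.random.default_rng(0).random((100, 120))
fused = fuse_with_mirror(face, ratio=0.85)
```

Because the mirror of an averaged image equals the average itself, `fused` is identical to its own horizontal flip.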
The Discrete Wavelet Transform (DWT) [6] is used in feature extraction because it
produces both frequency and spatial representations of a signal simultaneously. Here,
2D DWT is used for feature extraction. The image is decomposed into the approximation
(cA), horizontal (cH), vertical (cV) and diagonal (cD) sub-bands. The information in the
low spatial frequency bands plays a dominant role in face recognition [7], and the facial
features in the cA sub-band are least sensitive to external parameter variations.
X_i = [x_i1, x_i2, x_i3, ..., x_iD]  (1)
where i = (1, 2, ..., N) and N is the size of the swarm; Pi_best is the particle's best
reached solution and Gbest is the global best solution in the swarm. C1 and C2 are the
cognitive and social parameters, bounded between 0 and 2. rand1 and rand2 are two
random numbers with uniform distribution U(0, 1), and Vmax is the maximum
velocity.
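For reference, these parameters enter the standard PSO velocity and position updates, which in the usual formulation read as follows (the inertia weight w is not stated in this excerpt and is an assumed addition):

```latex
v_{id}^{t+1} = w\, v_{id}^{t}
  + C_1\, \mathrm{rand}_1 \left( P_{i\_best,d} - x_{id}^{t} \right)
  + C_2\, \mathrm{rand}_2 \left( G_{best,d} - x_{id}^{t} \right),
  \qquad |v_{id}^{t+1}| \le V_{max}
```

```latex
x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1}
```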
36 A. Lawrence et al.
4 Proposed Methodology
The FR system is divided into two stages, i.e. training and testing, as shown in
Fig. 2. The training stage deals with training the FR system using a certain number
of images from the database; the time taken to train and select the features (training
time) signifies the complexity and speed of the algorithm. After training, the system
is tested using the remainder of the images from the database. The training and
testing stages each consist of four general component blocks, as shown in Fig. 2.
The images are subjected to skin segmentation in the YCbCr and HSV color spaces and
are thresholded such that only skin-colored regions are included. The HSV color model
is more akin to human color perception [9]. The H and S channels provide information
about skin color; channel S is used to identify Asian and Caucasian ethnicities [10].
The obtained binary image is subjected to morphological operations.
Fig. 2 Block diagram of the training stage: images from the training image gallery undergo 2-dimensional DWT, the sub-bands (cA, cH, cV, cD) are rearranged into a 1-D vector, and the binary PSO algorithm (gbest) selects the face features
(1) The color images of the subject in the RGB color space are converted to the
YCbCr and HSV color spaces, as it is relatively easier to threshold skin-color
regions in these. The YCbCr color space is relatively immune to illumination
changes and thus provides better separation between the illumination component
Y and the color components Cb and Cr. Skin in channel H (Hue) is characterized
by values between 0.01 and 0.25, and in channel S (Saturation) from 0.9 to 0.103.
The skin pixels are assigned a 1 and non-skin pixels a 0 using (5); thus the image
is converted to a binary image, with the white portions being the skin-colored
regions within the threshold values.
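The thresholding step can be sketched as below. This is an assumed illustration, not the paper's code: the H range follows the text (0.01 to 0.25), while the S range printed in the text (0.9 to 0.103) appears inverted, so the sketch assumes 0.103 ≤ S ≤ 0.9.

```python
import numpy as np

# Hypothetical thresholds: H range from the text; the S range printed
# there (0.9 to 0.103) looks inverted, so 0.103 <= S <= 0.9 is assumed.
H_MIN, H_MAX = 0.01, 0.25
S_MIN, S_MAX = 0.103, 0.9

def skin_mask_hsv(hsv):
    """Binary skin mask from an HSV image (channels scaled to [0, 1]).

    Pixels whose hue and saturation fall inside the skin ranges are set
    to 1 (white), everything else to 0.
    """
    h, s = hsv[..., 0], hsv[..., 1]
    mask = (h >= H_MIN) & (h <= H_MAX) & (s >= S_MIN) & (s <= S_MAX)
    return mask.astype(np.uint8)

# Tiny example: one skin-like pixel and one blue-ish pixel.
hsv = np.array([[[0.08, 0.45, 0.9],    # skin-toned: inside both ranges
                 [0.60, 0.80, 0.9]]])  # blue: hue outside the skin range
print(skin_mask_hsv(hsv))  # [[1 0]]
```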
(3) Regions having higher values of eccentricity are removed from the image; such
regions usually correspond to long rectangular areas. The facial areas correspond
to regions with low eccentricity values (usually between 0.4 and 0.6). In the event
that multiple regions are identified, they are selected based on their area.
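A sketch of this eccentricity-based region filter follows. The function names and the 0.65 cut-off are illustrative assumptions; the eccentricity is computed from the second central moments of each connected region, as in standard region-property analysis.

```python
import numpy as np
from scipy import ndimage

def region_eccentricity(coords):
    """Eccentricity of a pixel region from its second central moments.

    0 for a circular/square region, approaching 1 for long thin ones.
    """
    rc = coords - coords.mean(axis=0)
    cov = rc.T @ rc / len(rc)          # 2x2 covariance of pixel coords
    minor, major = np.linalg.eigvalsh(cov)  # ascending eigenvalues
    return float(np.sqrt(1.0 - minor / max(major, 1e-12)))

def keep_low_eccentricity(mask, ecc_max=0.65):
    """Remove connected regions whose eccentricity exceeds ecc_max.

    Long rectangular background regions (high eccentricity) are dropped;
    roughly face-shaped regions (eccentricity around 0.4-0.6) survive.
    """
    labels, n = ndimage.label(mask)
    out = np.zeros_like(mask)
    for i in range(1, n + 1):
        coords = np.argwhere(labels == i).astype(float)
        if region_eccentricity(coords) <= ecc_max:
            out[labels == i] = 1
    return out

mask = np.zeros((60, 60), dtype=np.uint8)
mask[5:9, 10:50] = 1    # long 4x40 bar: eccentricity ~0.995, removed
mask[25:43, 20:40] = 1  # near-square 18x20 blob: ~0.44, kept
filtered = keep_low_eccentricity(mask)
```

Only the near-square blob survives the filter; the elongated bar is discarded as background.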
The extraction stage reduces the dimensions of the target, thereby eliminating the
unimportant aspects of the image. In this paper, we have used the two-dimensional
DWT to achieve this, with the cA component used for further extraction as shown in
Fig. 2: 2D DWT is applied and the cA sub-band is extracted, up to 2 levels, using the
sym4 wavelet. DWT thus reduces the image to the number of features required to
represent it effectively, and is used as the extractor in this system.
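The two-level cA extraction can be sketched as below. The paper uses the sym4 wavelet, which in practice comes from a wavelet library (e.g. `pywt.wavedec2` in PyWavelets); to keep this sketch self-contained, the simpler Haar wavelet is substituted, which changes the coefficient values but not the structure of the procedure.

```python
import numpy as np

def haar_dwt2_cA(img):
    """One level of 2-D DWT, returning only the approximation band cA.

    Orthonormal Haar filters are used for simplicity (the paper uses
    sym4): each cA coefficient is the sum of a 2x2 block divided by 2.
    """
    img = img[: img.shape[0] // 2 * 2, : img.shape[1] // 2 * 2]
    return (img[0::2, 0::2] + img[0::2, 1::2]
            + img[1::2, 0::2] + img[1::2, 1::2]) / 2.0

def extract_features(img, levels=2):
    """Repeatedly keep cA, as in the two-level extraction above,
    then flatten into a 1-D feature vector."""
    cA = np.asarray(img, dtype=float)
    for _ in range(levels):
        cA = haar_dwt2_cA(cA)
    return cA.ravel()

features = extract_features(np.ones((64, 64)))
print(features.shape, features[0])  # (256,) 4.0
```

Each level quarters the number of coefficients, so a 64 × 64 image yields a 256-element feature vector after two levels.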
The BPSO algorithm is used for feature selection. It optimises the set of features
by acting as a funnel to reduce their quantity [13], such that the class separation
is maximized. This minimizes the number of redundant features, giving a compact
representation of the image.
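A minimal BPSO sketch is given below. It is an illustrative assumption rather than the authors' implementation: the toy fitness function stands in for the class-separation criterion, and the usual sigmoid-of-velocity rule maps real velocities to binary feature masks.

```python
import numpy as np

rng = np.random.default_rng(42)

def bpso_select(score, n_features, n_particles=20, n_iter=50,
                c1=2.0, c2=2.0, v_max=4.0):
    """Minimal binary PSO: find a 0/1 mask maximizing score(mask).

    Velocities are real-valued and clamped to [-v_max, v_max]; the
    sigmoid of each velocity gives the probability that the bit is 1.
    """
    X = (rng.random((n_particles, n_features)) < 0.5).astype(int)
    V = np.zeros((n_particles, n_features))
    pbest = X.copy()
    pbest_score = np.array([score(x) for x in X])
    g = int(np.argmax(pbest_score))
    gbest, gbest_score = pbest[g].copy(), pbest_score[g]

    for _ in range(n_iter):
        r1, r2 = rng.random(X.shape), rng.random(X.shape)
        V = V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)
        V = np.clip(V, -v_max, v_max)
        X = (rng.random(X.shape) < 1.0 / (1.0 + np.exp(-V))).astype(int)
        s = np.array([score(x) for x in X])
        improved = s > pbest_score
        pbest[improved], pbest_score[improved] = X[improved], s[improved]
        g = int(np.argmax(pbest_score))
        if pbest_score[g] > gbest_score:
            gbest, gbest_score = pbest[g].copy(), pbest_score[g]
    return gbest, gbest_score

# Toy fitness: reward selecting the first 10 "discriminative" features
# while penalising mask size (a stand-in for class separation).
def score(mask):
    return 2 * mask[:10].sum() - mask.sum()

mask, best = bpso_select(score, n_features=40)
```

In the real system the fitness would be a class-separation measure over the DWT feature vectors, and the returned mask picks the retained features.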
The Euclidean classifier is used to ascertain the similarity between the testing
image and the training images and thus, in the process, recognize the subject in the
testing image [13]. The Euclidean distance, the straight-line distance between two
images, is a measure of this similarity. If A and B are feature vectors corresponding
to a training image and a testing image respectively, and L is the length of the
feature vector, then the Euclidean distance is calculated as

D(A, B) = sqrt( sum_{i=1..L} (A_i - B_i)^2 )
Fig. 4 13 images of a subject in CMU-PIE database (left) and the corresponding images after
background removal (right)
In order to verify the effectiveness and robustness of the proposed technique, the
following experiments were conducted using standard face databases, namely
CMU-PIE and Caltech.
The CMU Pose, Illumination and Expression (PIE) database [14] contains more than
40,000 images of 68 subjects taken between October 2000 and December 2000. It
contains images with highly complex backgrounds and non-uniform lighting
conditions; the subjects in these images adopt different poses as well. For the
experiments related to CMU-PIE, a subset containing images of 30 subjects with
pose variations captured against complex backgrounds has been utilised. In order to
simulate the pose variance factor, a total of 13 images from each subject with a size
of 640 × 486 (RGB) have been considered. The images differ from one another in
pose and background.
The Caltech frontal face dataset [15] contains a total of 450 face images of 27 unique
subjects, each of size 896 × 592, in JPEG format. It consists of images with different
lighting conditions and non-uniform backgrounds. All the images are frontal, with a
constant pose throughout. This database has been customised by considering
a total of 16 subjects having 15 images each with varying expressions. The
backgrounds of these images are complex, and the background removal process
has been adjusted accordingly. This database has been created for experimental
purposes from the initial set of images; its creation takes a total time of 62 s.
Table 1 Comparison of recognition rate for different Tr:Te ratios for the proposed datasets

Dataset            | Tr:Te | RR (%) | Avg. no. of features | Training time (s) | Testing time (ms)
-------------------|-------|--------|----------------------|-------------------|------------------
CMU-PIE            | 5:8   | 32.04  | 395                  | 75.36             | 118.34
                   | 7:6   | 34.97  | 393                  | 64.56             | 121.45
                   | 3:10  | 26.23  | 387                  | 70.37             | 103.23
                   | 1:12  | 21.65  | 402                  | 76.97             | 100.64
Caltech faces 1999 | 5:10  | 50.20  | 386                  | 46.62             | 32.45
                   | 6:9   | 54.16  | 391                  | 55.36             | 35.62
                   | 7:8   | 58.73  | 345                  | 64.79             | 36.80
                   | 8:7   | 58.63  | 358                  | 74.37             | 38.67
8.2 Experimentation
8.2.1 Experiment 1
In real-world applications, the training-to-testing ratio varies depending on the
situation; we simulate this scenario by varying the training-to-testing ratio. The
experiment is carried out for different ratios, as shown in Table 1. For a given ratio,
the RR varies slightly across iterations as the training and testing images are chosen
in a pseudo-random manner [16]. Thus, the results for each of the databases
mentioned in this paper are computed by averaging over 10 iterations. In this
experiment, as the ratio of training to testing images increases, the RR and the
average training time increase. In addition, the testing time also increases, because
each test image has to be compared with a larger number of training images.
In the Caltech database, it can be observed that the RR varies from 58 to 50 %; the
best and worst results correspond to the 7:8 and 5:10 ratios respectively. The results
associated with the CMU-PIE dataset are similar. The system performs well under
most constraints and the average RR varies between 34 and 28 %. The CMU-PIE
dataset also contains images that are not captured at face level, and therefore
obtaining a high RR is a challenge.
8.2.2 Experiment 2
This experiment compares the recognition rates of the proposed pre-processing
technique with those obtained using the YCbCr-only method. The Caltech images are
pose invariant, and therefore the multi-scaled fusion technique does not affect the RR
significantly. Thus the proposed method enhances the recognition efficiency through
the proposed pre-processing technique.
9 Conclusions
The proposed pre-processing method utilises two color models in order to zero in
on the face and reduce the image to the features that improve the chances of
recognition. The processed images are then coupled with DWT-based feature
extraction, BPSO-based feature selection and the Euclidean classifier, using
MATLAB, to achieve improved face recognition. The accuracy of this system is
realised by the results obtained by applying the proposed FR system on two different
databases, which have variations in pose (CMU-PIE) and background (Caltech), as
shown in Fig. 5. The results for the CMU-PIE database indicate that the improved
background removal along with the pose remedial method has given a recognition
rate of 32.04 % with a training-to-testing ratio of 5:8. Considering that the Caltech
database has varying backgrounds but no pose variation, its results are mainly
influenced by the background removal method, which gives a recognition rate of
50.2 % with a training-to-testing ratio of 5:10. Therefore the improved background
removal technique has effectively increased the recognition efficiency.
Fig. 5 Graphical plots describing Experiment 1 (a) and Experiment 2 (b). a Comparison of
recognition rate for different Tr:Te ratios for the proposed datasets. b Comparison of recognition
rates for different methods
References
1. John D. Woodward Jr., Biometrics: The Ultimate Reference, 2009 edition, John Wiley and
Sons, 2009.
2. G. Gordon, T. Darrell, M. Harville, J. Woodfill, "Background Estimation and Removal Based
on Range and Color," International Conference on Computer Vision and Pattern Recognition,
1999.
3. Son Lam Phung, Abdesselam Bouzerdoum, Douglas Chai, "Skin Segmentation Using Color
Pixel Classification: Analysis and Comparison," IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 27, no. 1, 2005.
4. M. V. Daithankar, K. J. Karande, A. D. Harale, "Analysis of Skin Color Models for Face
Detection," International Conference on Communications and Signal Processing, 2014.
5. Nitish S. Prabhu, Thejas N. Kesari, K. Manikantan, S. Ramachandran, "Face Recognition using
Eccentricity-Range based Background Removal and Multi-Scaled Fusion as Pre-processing
Techniques," Nirma University International Conference on Engineering, NUiCONE, 2013.