FACIAL RECOGNITION SYSTEMS AND ALGORITHMS
Seminar Report
Submitted by
SIJOMON BENNY
for the BACHELOR DEGREE in COMPUTER APPLICATIONS
CERTIFICATE
This is to certify that the Seminar titled FACIAL RECOGNITION SYSTEMS AND
ALGORITHMS was prepared and presented by SIJOMON BENNY (Reg. No: 140021044833),
of the Sixth Semester, Bachelor of Computer Applications, Marian College, Kuttikkanam, in
partial fulfillment of the requirements for the award of the Degree of Bachelor of Computer
Applications by Mahatma Gandhi University, Kottayam, January 2017.
Dr Rajimol A
HEAD OF THE DEPARTMENT
(Dept. Seal)
Examiners:
1:
2:
ABSTRACT
Face recognition has been one of the most interesting and important research fields of the past
two decades. The reasons come from the need for automatic recognition and surveillance
systems, the interest in how the human visual system performs face recognition, the design of
human-computer interfaces, and so on. This research involves knowledge and researchers from
disciplines such as neuroscience, psychology, computer vision, pattern recognition, image
processing, and machine learning. Numerous papers have been published that try to overcome
different factors (such as illumination, expression, scale, and pose) and achieve better
recognition rates, yet there is still no technique robust to uncontrolled practical cases, which
may involve several of these factors simultaneously. In this report, we go through the general
ideas and structures of recognition, important issues and factors of human faces, and critical
techniques and algorithms, and finally give a comparison and conclusion.
TABLE OF CONTENTS
Abstract
List of Tables
List of Figures
1. Introduction
   1.1 What is Face Recognition?
2. History
3. Issues and factors of human faces
   3.1 Domain-knowledge of human faces and human visual system
       3.1.1 Aspects from psychophysics and neuroscience
       3.1.2 Holistic-based or feature-based
       3.1.3 Thatcher Illusion
   3.2 Factors of human appearance variations
   3.3 Design issues
4. Face detection
   4.1 Knowledge-based methods
       4.1.1 Hierarchical knowledge-based method
       4.1.2 Horizontal / vertical projection
   4.2 Feature invariant approaches
       4.2.1 Face detection using color information
       4.2.2 Face detection based on random labelled graph matching
5. Feature Extraction and Face Recognition
   5.1 Holistic-based methods
       5.1.1 Eigenface and Principal Component Analysis
       5.1.2 Fisherface and Linear Discriminant Analysis
       5.1.3 Independent Component Analysis
   5.2 Part-based methods
       5.2.1 Component-based face recognition
       5.2.2 Person-specific SIFT features for face recognition
List of Tables
List of Figures
1. Fig 1.1: Configuration of a general face recognition structure.
2. Fig 1.2: An example of how the three steps work on an input image.
3. Fig 3.1: (a) The head is located upside down, and it is hard to notice that the eyes are pasted
in the reverse direction in the right-side picture, while in (b) we can easily recognize the strange
appearance.
4. Fig 3.2: Face-patch changes under different illumination conditions. We can easily see how
strongly illumination can affect the face appearance.
5. Fig 3.3: Face-patch changes under different pose conditions.
6. Fig 4.1: The multi-resolution hierarchy of images created by averaging and sub-sampling.
7. Fig 4.2: Examples of the horizontal / vertical projection method.
8. Fig 4.3: The flowchart of the face detection algorithm proposed by Hsu et al.
9. Fig 4.4: The flowchart to generate the eye map.
10. Fig 4.5: The flowchart to generate the mouth map.
11. Fig 4.6: The locations of the missing features are estimated from two feature points.
12. Fig 5.1: Eigenface and Principal Component Analysis.
13. Fig 5.2: The reconstruction process based on the eigenface representation.
14. Fig 5.3: The figure of using different bases for projection.
15. Fig 5.4: (top) Example 3-D data distribution and corresponding PC and IC axes.
16. Fig 5.5: Two architectures for performing the ICA on images.
17. Fig 5.6: Image synthesis model for Architecture I.
18. Fig 5.7: The independent basis image representation.
19. Fig 5.8: Image synthesis model for Architecture II.
20. Fig 5.9: Binary tree of face images generated by divisive clustering.
21. Fig 5.10: System overview of the component-based face detector using four components.
22. Fig 5.11: Examples of the component-based face detection and the 10 components used
for face recognition.
23. Fig 5.12: SIFT features on sample images and feature matches in faces with expression
variation.
24. Fig 5.13: Sub-region construction and similarity computation scheme for the face
recognition system.
1. Introduction
1.1 What is Face Recognition?
Face recognition is a technique used in computer systems to identify human faces by means of
various algorithms and feature extractions; facial recognition systems are the application
software that performs this function. We generally separate the face recognition procedure into
three steps, as shown in Fig 1.1: face detection, feature extraction, and face recognition. Face
detection locates the face region in the input image, and feature extraction then represents the
extracted face patch as a feature vector with fixed dimension or as a set of fiducial points and
their corresponding locations. In some of the literature, feature extraction is included in either
face detection or face recognition.
Fig 1.2: An example of how the three steps work on an input image. (a) The input image and
the result of face detection (the red rectangle). (b) The extracted face patch. (c) The feature
vector after feature extraction. (d) Comparing the input vector with the stored vectors in the
database by classification techniques and determining the most probable class (the red
rectangle). Here we express each face patch as a d-dimensional vector, the vector x_{m,n} as
the nth vector in the mth class, and N_k as the number of faces stored in the kth class.
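The classification step in Fig 1.2(d) can be sketched, under the notation above, as a nearest-neighbour search over the stored class vectors. This is a minimal illustration of the idea, not the implementation of any specific system:

```python
import numpy as np

def classify_face(x, database):
    """Nearest-neighbour classification of a d-dimensional feature vector x.

    database maps a class id m to an array of shape (N_m, d) holding the
    stored vectors x_{m,n} for that class.  The input is assigned to the
    class containing the closest stored vector (Euclidean distance).
    """
    best_class, best_dist = None, np.inf
    for m, vectors in database.items():
        # Distance from x to the nearest stored vector of class m.
        d = np.linalg.norm(vectors - x, axis=1).min()
        if d < best_dist:
            best_class, best_dist = m, d
    return best_class
```

In practice the raw face patch would first be mapped to a lower-dimensional feature vector (e.g. by the eigenface projection discussed later) before this comparison.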
2. History
Automated face recognition is a relatively new concept. Developed in the 1960s, the first
semi-automated system for face recognition required the administrator to locate features (such
as eyes, ears, nose, and mouth) on the photographs before it calculated distances and ratios to
a common reference point, which were then compared to reference data. In the 1970s,
Goldstein, Harmon, and Lesk used 21 specific subjective markers, such as hair color and lip
thickness, to automate the recognition. The problem with both of these early solutions was that
the measurements and locations were manually computed. In 1988, Kirby and Sirovich applied
principal component analysis, a standard linear algebra technique, to the face recognition
problem. This was considered something of a milestone, as it showed that fewer than one
hundred values were required to accurately code a suitably aligned and normalized face image.
In 1991, Turk and Pentland discovered that, when using the eigenface technique, the residual
error could be used to detect faces in images, a discovery that enabled reliable real-time
automated face recognition systems. Although the approach was somewhat constrained by
environmental factors, it nonetheless created significant interest in furthering the development
of automated face recognition technologies. The technology first captured the public's attention
through the media reaction to a trial implementation at the January 2001 Super Bowl, which
captured surveillance images and compared them to a database of digital mugshots. This
demonstration initiated much-needed analysis of how to use the technology to support national
needs while being considerate of the public's social and privacy concerns. Today, face
recognition technology is being used to combat passport fraud, support law enforcement,
identify missing children, and minimize benefit/identity fraud.
3. Issues and factors of human faces
Fig 3.1: (a) The head is located upside down, and it is hard to notice that the eyes are pasted in
the reverse direction in the right-side picture, while in (b) we can easily recognize the strange
appearance.
Table 3.1: Factors of human appearance variations, including pose, RST (rotation, scale,
translation) variation, cluttering, and occlusion.
In our future work, we will pay more attention to illumination-invariant and multi-view
detection and recognition, and to partial-observation situations, which have not yet been well
solved.
Fig 3.2: Face-patch changes under different illumination conditions. We can easily see how
strongly illumination can affect the face appearance.
Fig 3.3: Face-patch changes under different pose conditions. When the head pose changes, the
spatial relations (distance, angle, etc.) among fiducial points (eyes, mouth, etc.) also change,
resulting in serious distortion of the traditional appearance representation.
First, execution speed is an important design issue, and researchers often include the execution
time in the experimental results as a basis for comparison and judgment against other
techniques.
Second, the training data size is another important issue in algorithm design. Intuitively, the
more data we include, the more information we can exploit and the better performance we can
achieve. In practical cases, however, the database size is usually limited owing to the difficulty
of data acquisition and to human privacy concerns. Under the condition of limited data size,
the designed algorithm should not only capture information from the training data but also
incorporate some prior knowledge, or try to predict and interpolate the missing and unseen
data. In the comparison between the eigenface and the Fisherface, it has been observed that
under limited data size the eigenface performs better than the Fisherface.
Finally, how to bring the algorithms into uncontrolled conditions is as yet an unsolved problem.
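The Fisherface comparison above rests on linear discriminant analysis. A minimal sketch of the LDA step might look as follows; note that the real Fisherface method first applies a PCA projection so that the within-class scatter matrix is non-singular, which this illustration omits:

```python
import numpy as np

def lda_directions(X, y, k=1):
    """Minimal sketch of the Fisherface idea (linear discriminant analysis):
    find the k directions that maximise the between-class scatter S_b
    relative to the within-class scatter S_w.

    X: (N, d) data matrix, y: (N,) class labels.
    """
    classes = np.unique(y)
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))   # within-class scatter
    Sb = np.zeros((d, d))   # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mu)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Solve the generalized eigenproblem S_b v = lambda S_w v via pinv(S_w) S_b.
    w, V = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(w.real)[::-1][:k]
    return V[:, order].real
```

Unlike PCA, which keeps the directions of largest overall variance, these directions favour class separability, which is exactly the discriminative factor PCA can discard.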
4. Face detection
From this section on, we discuss the technical and algorithmic aspects of face recognition. We
follow the three-step procedure depicted in Fig 1.1 and introduce each step in order: face
detection is introduced in this section, and feature extraction and face recognition are
introduced in the next section. Face detection algorithms are classified into four categories:
knowledge-based, feature invariant, template matching, and appearance-based methods. We
follow this idea, describe each category, and present representative examples in the following
subsections. Note that there are generally two face detection cases: one is based on grey-level
images, and the other on coloured images.
Fig 4.1: The multi-resolution hierarchy of images created by averaging and sub-sampling. (a) The original image.
(b) The image with each 4-by-4 square substituted by the averaged intensity of the pixels in that square. (c) The
image with 8-by-8 squares. (d) The image with 16-by-16 squares.
Fig 4.2: Examples of the horizontal / vertical projection method. Images (a) and (b) are sub-sampled with 8-by-8
squares by the method described in Fig 4.1, and (c) with 4-by-4 squares. The projection method performs well on
image (a), but cannot handle the complicated backgrounds and multi-face images in (b) and (c).
The top layer works on the lowest resolution and extracts possible face candidates based on the
general look of faces. The middle and bottom layers then carry rules of more detail, such as the
alignment of facial features, and verify each face candidate. This method suffers from many of
the factors described in Section 3, especially the RST variation, and does not achieve a high
detection rate (50 true positives in 60 test images); however, the coarse-to-fine strategy does
reduce the required computation and is widely adopted by later algorithms.
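The horizontal / vertical projection idea of Fig 4.2 can be sketched as below. The fractional threshold for detecting dark bands is an illustrative choice, not the exact rule used by the original method:

```python
import numpy as np

def projections(gray):
    """Horizontal and vertical projections of a grey-level image.

    HI(y) sums the intensities of each row; VI(x) sums each column.
    Valleys in these profiles (dark bands from hair, eyes, mouth)
    suggest the bounds of a face candidate.
    """
    hi = gray.sum(axis=1)   # one value per row
    vi = gray.sum(axis=0)   # one value per column
    return hi, vi

def dark_band(profile, frac=0.9):
    """Indices where the projection drops below frac * mean: candidate
    dark bands used to bound the face region (illustrative threshold)."""
    return np.where(profile < frac * profile.mean())[0]
```

As Fig 4.2(b) and (c) suggest, this works only for single faces on plain backgrounds, since every dark object in the scene leaves a valley in the profiles.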
Fig 4.3: The flowchart of the face detection algorithm proposed by Hsu et al.
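As an illustration of the skin-colour stage in this pipeline, one can threshold in YCbCr space. The rectangular Cb/Cr ranges below are common illustrative values only; the actual algorithm of Hsu et al. fits a more elaborate elliptical skin model after lighting compensation:

```python
import numpy as np

def skin_mask(rgb):
    """Rough skin-colour detection in YCbCr space (illustrative sketch).

    rgb: (H, W, 3) uint8 image.  Returns a boolean mask of pixels whose
    chrominance falls inside a rectangular skin region; the thresholds
    are commonly quoted values, assumed here for illustration.
    """
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    # Standard RGB -> YCbCr chrominance conversion (luma Y is not needed).
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)
```

Connected regions of the mask would then be passed to the face-shape verification and eye/mouth-map stages shown in the flowchart.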
Fig 4.6: The locations of the missing features are estimated from two feature points. The ellipses show the areas
which, with high probability, include the missing features.
The orthogonal PCA bases are proven to preserve the most projection energy and the largest
variation after projection, though the proof is not included in this report. In the work of Turk et
al., a speed-up algorithm was proposed to obtain the eigenvectors from the covariance matrix,
and the vectors after dimension reduction were used for face detection and face recognition.
They also proposed some criteria for face detection and face tracking.
The PCA method has been shown to discard noise and outlier data from the training set, but it
may also ignore some key discriminative factors which do not have large variation yet dominate
our perception. We will compare this effect in the next subsection on the Fisherface and linear
discriminant analysis. Notably, the eigenface algorithm has significantly influenced the design
of holistic-based face recognition algorithms over the past twenty years, so it is a great starting
point for readers trying to build a face recognition system.
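The eigenface computation, including the Turk-Pentland speed-up of diagonalising the small N-by-N matrix instead of the huge d-by-d covariance matrix, can be sketched as:

```python
import numpy as np

def eigenfaces(faces, k):
    """Compute the top-k eigenfaces from an (N, d) matrix of flattened
    face patches.

    Instead of the d-by-d covariance matrix, we diagonalise the small
    N-by-N matrix A A^T (the Turk-Pentland trick) and map its
    eigenvectors back to d dimensions.
    """
    mean = faces.mean(axis=0)
    A = faces - mean                       # (N, d), centred data
    L = A @ A.T                            # (N, N) small matrix
    w, v = np.linalg.eigh(L)               # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]          # pick the k largest
    U = A.T @ v[:, idx]                    # map back: columns are eigenfaces
    U /= np.linalg.norm(U, axis=0)         # normalise each eigenface
    return mean, U

def project(x, mean, U):
    """Dimension-reduced representation of a face patch x (Fig 5.2's
    projection values)."""
    return U.T @ (x - mean)
```

Reconstruction, as in Fig 5.2, is simply `mean + U @ project(x, mean, U)`, and is exact when all available eigenfaces are kept.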
Fig 5.1: (a) We generate a database with only 10 faces, and each face patch is of size 100-by-100. Through the
computation of the PCA basis, we get (b) a mean face and (c) 9 eigenfaces (the eigenfaces are ordered from the
highest eigenvalue downward, from left to right and from top to bottom).
Fig 5.2: The reconstruction process based on the eigenface representation. (a) The original face in the database
can be reconstructed from its eigenface representation and the set of projection values (lossless if we use all PCA
projection vectors; otherwise the reconstruction is lossy). (b) The reconstruction process with different numbers
of bases used: from left to right and from top to bottom, we add one projection vector at a time with its
corresponding projection value. The bottom-right picture, using all 9 projection vectors and the mean vector, is
the perfect reconstruction result.
Fig 5.3: The figure of using different bases for projection. With suitable bases, the dimension-reduced result can
preserve the discriminative nature of the original data.
Fig 5.4: (top) Example 3-D data distribution and the corresponding PC and IC axes. Each axis is a column of the
projection matrix W found by the PCA or the ICA. Note that the PC axes are orthogonal while the ICA axes are
not. If only two components are allowed, the ICA chooses a different subspace than the PCA. (bottom left)
Distribution of the first PCA coordinates of the data. (bottom right) Distribution of the first ICA coordinates of
the data. Note that since the ICA axes are non-orthogonal, the relative distances between points are different in
the PCA than in the ICA, as are the angles between points. As can be seen, the bases found by the ICA preserve
more of the original structure than the PCA.
The ICA can be applied to face images in two ways, depending on whether the images or the
pixels are treated as the random variables. Architecture I, depicted in Fig 5.5, finds n source
pixel images, each with the appearance shown in the columns of U illustrated in Fig 5.6, and a
human face can be decomposed into a weight vector as in Fig 5.5. This architecture finds a set
of statistically independent basis images, each of which captures features of human faces such
as eyes, eyebrows, and mouths.
Architecture II finds basis images with appearances similar to those the PCA finds, as shown
in Fig 5.7, and has the decomposition shown in Fig 5.8. This architecture uses the ICA to find
a representation in which the coefficients used to code images are statistically independent.
Generally speaking, the first architecture finds spatially local basis images for the faces, while
the second architecture produces a factorial face code. In their experimental results, both of
these representations were superior to the PCA-based representation for recognizing faces
across days and changes in expression, and a classifier combining the two ICA representations
gave the best performance.
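Both architectures rely on the same underlying source-separation step. A minimal FastICA sketch (tanh nonlinearity with symmetric decorrelation, one of several possible ICA estimators) is:

```python
import numpy as np

def fastica(X, n_iter=200, seed=0):
    """Minimal FastICA sketch: X is an (n_signals, n_samples) matrix of
    mixed signals; returns the recovered independent sources, up to
    order, sign, and scale.
    """
    X = X - X.mean(axis=1, keepdims=True)
    # Whitening: decorrelate and normalise variance via eigendecomposition.
    d, E = np.linalg.eigh(X @ X.T / X.shape[1])
    Z = (E @ np.diag(d ** -0.5) @ E.T) @ X
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[0], X.shape[0]))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        # FastICA fixed-point update, applied to all rows of W at once:
        # W <- E[g(WZ) Z^T] - diag(E[g'(WZ)]) W, with g = tanh.
        W = G @ Z.T / Z.shape[1] - np.diag((1 - G ** 2).mean(axis=1)) @ W
        # Symmetric decorrelation keeps the rows of W orthonormal.
        u, _, vt = np.linalg.svd(W)
        W = u @ vt
    return W @ Z
```

For faces, the rows of X would hold either the images (Architecture I) or the pixels (Architecture II), which is exactly the difference between the two decompositions above.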
Fig 5.5: Two architectures for performing the ICA on images. (a) Architecture I for finding statistically
independent basis images; performing source separation on the face images produces IC images in the rows of U.
(b) The grey values at pixel location i are plotted for each face image; the ICA in Architecture I finds weight
vectors in the directions of statistical dependencies among the pixel locations. (c) Architecture II for finding a
factorial code; performing source separation on the pixels produces a factorial code in the columns of the output
matrix U. (d) Each face image is plotted according to the grey values taken on at each pixel location; the ICA in
Architecture II finds weight vectors in the directions of statistical dependencies among the face images.
Fig 5.6: Image synthesis model for Architecture I. To find a set of IC images, the images in X are considered to
be linear combinations of statistically independent basis images S, where A is an unknown mixing matrix. The
basis images are estimated as the learned ICA output U.
Fig 5.7: The independent basis image representation consists of the coefficients b for the linear combination of
independent basis images u that comprise each face image x.
Fig 5.8: Image synthesis model for Architecture II. Each image in the dataset is considered to be a linear
combination of underlying basis images in the matrix A. The basis images are each associated with a set of
independent causes, given by a vector of coefficients S. The factorial code representation consists of the
independent coefficients u for the linear combination of basis images in A that comprise each face image x.
Fig 5.10: System overview of the component-based face detector using four components.
Fig 5.11: Examples of the component-based face detection and the 10 components used for face recognition.
The SIFT detector locates keypoints in a scale-invariant fashion and generates a descriptor
based on the orientation and gradient information calculated in a scale-specific region around
each detected point. Fig 5.12 shows the SIFT features extracted from sample faces and some
corresponding matching points in two face images. In each face image, the number and the
positions of the features selected by the SIFT point detector are different, so these features are
person-specific. In order to compare only the feature pairs with similar physical meaning
between the faces in the database and the input face, the same number of sub-regions is
constructed in each face image; the similarity between each pair of sub-regions is computed
based on the features inside, and the average similarity value is finally taken. The authors
proposed to employ a K-means clustering scheme to construct the sub-regions automatically
based on the locations of features in the training samples. After the sub-regions are constructed
on the face images, when a new image is tested, all the SIFT features extracted from it are
assigned to the corresponding sub-regions based on their locations. The construction of five
sub-regions is illustrated in Fig 5.13, and it can be seen that the centres of the regions, denoted
by crosses, correspond to the two eyes, the nose, and the two mouth corners, which agrees with
face recognition experience, as these areas are the most discriminative parts of face images.
Based on the constructed sub-regions, a local-and-global combined matching strategy is used
for face recognition.
Fig 5.12: SIFT features on sample images and feature matches in faces with expression variation.
Fig 5.13: Sub-region construction and similarity computation scheme for the face recognition system.
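The sub-region matching scheme of Fig 5.13 can be sketched as below. The best-match-per-region similarity and the simple averaging are illustrative simplifications of the local-and-global strategy described above:

```python
import numpy as np

def assign(feats, centers):
    """Assign each (location, descriptor) feature to its nearest sub-region
    centre (e.g. the two eyes, nose, and two mouth corners)."""
    buckets = {r: [] for r in range(len(centers))}
    for loc, desc in feats:
        r = int(np.argmin(np.linalg.norm(centers - loc, axis=1)))
        buckets[r].append(desc / np.linalg.norm(desc))
    return buckets

def subregion_similarity(feats_a, feats_b, centers):
    """Similarity between two faces: per sub-region, the best cosine
    similarity between any descriptor pair, averaged over the regions
    that contain features in both faces (illustrative simplification)."""
    a, b = assign(feats_a, centers), assign(feats_b, centers)
    sims = []
    for r in range(len(centers)):
        if a[r] and b[r]:
            sims.append(max(float(u @ v) for u in a[r] for v in b[r]))
    return float(np.mean(sims)) if sims else 0.0
```

The sub-region centres themselves would come from K-means clustering of feature locations over the training samples, as described above.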
(Summary table of face detection methods.) Method categories: feature-based, example-based
learning, appearance-based, part-based, and template matching. Characteristics of the surveyed
methods include: coarse-to-fine procedures; combining skin-colour detection, face-shape
verification, and facial-feature configuration for detection; combining simple features with
statistical learning and estimation; learning facial shape and appearance variation from data;
learning the face and non-face distributions with a mixture of Gaussians; AdaBoost for
speed-up; unsupervised extraction of important facial features, with the relations among parts
and the face/non-face discrimination learned through a graphical model structure; and global
and local SVMs for detection.
(Summary table of face recognition methods.) Holistic-based methods: sparse representation,
LBP, LTP. Feature-based methods: Gabor and dynamic link architecture, Gabor and elastic
bunch graph matching, AAM. Template matching methods.
Table 6.3. The summary of popular databases used for detection and recognition tasks
Finally, there is still no robust face detection and recognition technique for unconstrained
real-world applications, and these remain our directions for future work.
7. REFERENCES
[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a
survey," Proc. IEEE, vol. 83, no. 5, pp. 705-740, 1995.
[2] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, "Face recognition: a literature
survey," Technical Report CAR-TR-948, Center for Automation Research, University of
Maryland, 2002.
[3] A. F. Abate, M. Nappi, D. Riccio, and G. Sabatino, "2D and 3D face recognition: a survey,"
Pattern Recognition Letters, vol. 28, no. 14, pp. 1885-1906, 2007.
[4] M. Grgic and K. Delac, "Face recognition homepage." [Online]. Available:
http://www.face-rec.org/general-info. [Accessed May 27, 2010].
[5] A. K. Jain, R. P. W. Duin, and J. C. Mao, "Statistical pattern recognition: a review," IEEE
Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 4-37, 2000.
[6] [Online]. Available: http://www.michaelbach.de/ot/fcs_thompson-thatcher/index.html.
[Accessed May 27, 2010].
[7] M. H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE
Trans. Pattern Analysis and Machine Intelligence, vol. 24, pp. 34-58, 2002.
[8] G. Yang and T. S. Huang, "Human face detection in complex background," Pattern
Recognition, vol. 27, no. 1, pp. 53-63, 1994.
[9] C. Kotropoulos and I. Pitas, "Rule-based face detection in frontal views," Proc. Int'l Conf.
Acoustics, Speech and Signal Processing, vol. 4, pp. 2537-2540, 1997.
[10] R. L. Hsu, M. Abdel-Mottaleb, and A. K. Jain, "Face detection in color images," IEEE
Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 696-706, 2002.
[11] www.wikipedia.com