

REAL TIME MULTIPLE FACE
RECOGNITION SECURITY SYSTEM
(RTM-FS)
College Of Engineering Perumon- EC - Project 2013

ACKNOWLEDGEMENT

It is with great pleasure and pride that we present this report before you. At this moment
of triumph, it would be unfair to neglect all those who helped us in the successful
completion of this project.
First of all, we place ourselves at the feet of God Almighty for His everlasting love and
for the blessings and courage He gave us, which made it possible to see through the
turbulence and set us on the right path. We would like to thank our Principal,
Dr. Z. A. Zoya, for providing the proper ambience to go on with the project. We would also
like to thank our Head of the Department, Mrs. Ananda Resmi, for all the help and guidance
she provided. We are grateful to our project coordinator, Mr. Sudheer V. R., Assistant
Professor in the Department of Electronics and Communication, for his guidance and
wholehearted support.
We take this opportunity to thank our friends, who were always a source of
encouragement.

ABSTRACT

This report describes the working and implementation of a multiple face detection and
recognition system, presented strictly at an undergraduate level of understanding. It
covers familiarization with the topic, details of the creation of a working model, and the
testing of the same. The scope of the discussion is to understand the working of the face
recognition system and its implementation model. The project is designed to improve
automated security systems. The system identifies people in real time and sounds an alarm
when a person is recognized as dangerous by law enforcement agencies or by the user
himself. In essence, by implementing this system we can alert the user whenever a person
recorded in the database comes under our surveillance camera.

TABLE OF CONTENTS

NAME PAGE NO

ACKNOWLEDGEMENT 1
ABSTRACT 2
TABLE OF CONTENTS 3
LIST OF FIGURES 4
CHAPTER 1 INTRODUCTION 5
CHAPTER 2 HISTORY 8
CHAPTER 3 OVERVIEW OF THE SYSTEM 9
CHAPTER 4 DESCRIPTION 10
CHAPTER 5 REQUIREMENTS 25
CHAPTER 6 IMPLEMENTATION DETAILS 26
CHAPTER 7 TEST AND TRIAL 74
CHAPTER 8 APPLICATION OF THE PROPOSED SYSTEM 75
CHAPTER 9 CONCLUSION 77
REFERENCE 78
APPENDIX

LIST OF FIGURES

SL NO NAME PAGE NO

FIG 4.1 EXAMPLE OF HAAR FEATURES 11


FIG 4.2 USE OF HAAR CASCADE 12
FIG 4.3.1 COMPUTATION OF INTEGRAL IMAGE 13
FIG 4.3.2 ALGORITHM FLOW CHART 14
FIG 4.4 FACE RECOGNITION SCHEMATICS 16
FIG 4.5 M-TRAINING FACES 18
FIG 4.6 K-EIGEN FACES 18
FIG 4.7 EIGEN FACE REPRESENTATION 19
FIG 4.8 WHAT PCA DOES 19
FIG 4.9 DIMENSIONALITY REDUCTION 20
FIG 4.10 REPRESENTATION OF MEAN IMAGE 20
FIG 6.1 EMGU CV ARCHITECTURE 29

CHAPTER 1 INTRODUCTION

Human face recognition has drawn considerable attention from the researchers in recent
years. An automatic face recognition system will find many applications in areas such as
human-computer interfaces, model-based video coding and security control systems. In
addition, face recognition has the potential of being a non-intrusive form of biometric
identification.
The difficulties of face recognition lie in the inherent variability arising from face
characteristics (age, gender and race), geometry (distance and viewpoint), image quality
(resolution, illumination, signal to noise ratio), and image content (background, occlusion
and disguise). Because of such complexity, most face recognition systems to date assume
a well-controlled environment and recognize only near frontal faces. However, these
constraints need to be relaxed in practice. Also, in applications such as video database
search, a person's face can appear in arbitrary backgrounds with unknown size and
orientation. Thus there is a need for robust face recognition systems to handle these
uncertainties.
People have an amazing ability to recognize and remember thousands of faces. The face is
an important part of who you are and how people identify you. While humans have had the
innate ability to recognize and distinguish faces for millions of years, computers are
just catching up. Face recognition is a fascinating problem with important commercial
applications such as mug shot matching, crowd surveillance and witness face
reconstruction. In computer vision, most of the popular face recognition algorithms have
been biologically motivated. Using these models, researchers can quantify the similarity
between faces; images whose projections are close in face space are likely to be from the
same individual. Results of these models can be compared with human perception to
determine whether distance in face space corresponds to the human notion of facial
similarity. Biometrics is used for that purpose.

1.1 What is biometrics?

A biometric is a unique, measurable characteristic of a human being that can be used to
automatically recognize an individual or verify an individual's identity.
Biometrics can measure both physiological and behavioral characteristics.
Physiological biometrics (based on measurements and data derived from direct
measurement of a part of the human body) include:
a. Finger scan
b. Facial recognition
c. Iris scan
d. Retina scan
e. Hand scan.
Behavioral biometrics (based on measurements and data derived from an
action) include:
a. Voice scan
b. Signature scan
c. Keystrokes scan.
A biometric system refers to the integrated hardware and software used to
conduct biometric identification and verification.

1.2 Why choose face recognition over other biometrics?


It is non-intrusive and requires no physical interaction on behalf of the user.
It is accurate and allows for high enrollment and verification rates.
It does not require an expert to interpret the comparisons.
It can use your existing hardware infrastructure; existing cameras and
image capture devices will work with no problem.
You can use existing images without having to re-enroll every user
(e.g. passports, ID cards, driver's licenses, etc.).
It is the only biometric that allows passive identification in one-to-many
environments (e.g. identifying a terrorist in a busy airport terminal).

1.3 What is face recognition system?


In clear terms, a face recognition system is a system which turns your face into
computer code so that it can be compared with thousands of other faces.
In order for a face recognition system to work, it has to know what a basic
face looks like.
A face recognition system is based on the ability to first recognize faces, which is
a technological feat in itself, and then measure the various features of each
face.
If you look into a mirror you can see that your face has certain
distinguishable landmarks, sometimes called nodal points.
There are about 80 nodal points on a human face, such as:
a. distance between eyes
b. width of nose
c. depth of eye sockets
d. cheekbones
e. jaw line
f. chin
These nodal points are used to create a numerical code, a string of numbers
that represents the face in the database (called a faceprint).
Only 14-22 nodal points are needed to complete the recognition process.

The security system deals with detecting faces in live video, recognizing them, and
sounding an alarm in case of a security breach. The system uses OpenCV as the image
processing tool and a server system as hardware. Since our system is a real-time one, we
need to select an accurate and fast algorithm. Among the several algorithms available, the
most promising algorithm for face detection is Viola-Jones using AdaBoost (~95%
accuracy), and for recognition it is PCA Eigenfaces (~75% accuracy).

CHAPTER 2: HISTORY OF FACE RECOGNITION

1960s
First semi-automated system
The first semi-automated facial recognition programs were created by Woody Bledsoe,
Helen Chan Wolf, and Charles Bisson. Their programs required the administrator to
locate features such as the eyes, ears, nose, and mouth on the photograph. It then
calculated distances and ratios to a common reference point which was then compared to
reference data.

1970s

Goldstein, Harmon, and Lesk


Used 21 specific subjective markers, such as hair color and lip thickness, to automate the
recognition. The measurements and locations needed to be manually computed, causing
the program to require a lot of labor time.

1988

Kirby and Sirovich

Applied principal component analysis, a standard linear algebra technique, to the face
recognition problem. Considered a milestone because it showed that less than one
hundred values were required to accurately code a suitably aligned and normalized face.

CHAPTER 3- OVERVIEW OF THE SYSTEM

Overview
Real Time
Viola Jones using AdaBoost (~95% accuracy)
PCA Eigen Faces (~ 75% accuracy)
Server Hardware

BLOCK DIAGRAM

[Block diagram: a web camera connected via the USB port feeds the AMD FX4100 server;
the monitor is driven through the HDMI port; the outputs are a buzzer and speech out.]

CHAPTER 4- DESCRIPTION

4.1 Flow Chart

Video frame -> Haar cascade (Viola-Jones) -> Face detection -> PCA Eigenface (with
training images) -> Face recognition -> Yes/No (Face / Not Face)

4.2 Face Detection


Face detection is a computer vision technology that determines the locations and sizes of
human faces in arbitrary (digital) images. It detects facial features and ignores anything
else, such as buildings, trees and bodies. Face detection can be regarded as a specific case
of object-class detection. In object-class detection, the task is to find the locations and
sizes of all objects in a digital image that belong to a given class. Examples include upper
torsos, pedestrians, and cars.
There are many ways to detect a face in a scene - easier and harder ones. Here is a list of
the most common approaches in face detection:

Finding faces in images with controlled background
Finding faces by color
Finding faces by motion
Using a mixture of the above
Finding faces in unconstrained scenes:
Neural Net approach
Neural Nets using statistical cluster information
Model-based Face Tracking
Weak classifier cascades

We use the Viola-Jones method for face detection because it gives about 95% accuracy.

4.2.1 How Face Detection Works


OpenCV's face detector uses a method that Paul Viola and Michael Jones published in
2001. Usually called simply the Viola-Jones method, or even just Viola-Jones, this
approach to detecting objects in images combines four key concepts:
Simple rectangular features, called Haar features
An Integral Image for rapid feature detection
The AdaBoost machine-learning method
A cascaded classifier to combine many features efficiently

Fig: 4.1 Examples of the Haar features used in OpenCV

The features that Viola and Jones used are based on Haar wavelets. Haar wavelets are
single wavelength square waves (one high interval and one low interval). In two
dimensions, a square wave is a pair of adjacent rectangles - one light and one dark.
The actual rectangle combinations used for visual object detection are not true Haar
wavelets. Instead, they contain rectangle combinations better suited to visual recognition
tasks. Because of that difference, these features are called Haar features, or Haar-like
features, rather than Haar wavelets. Figure 4.1 shows the features that OpenCV uses.

Fig: 4.2 Use of Haar cascade in face

The presence of a Haar feature is determined by subtracting the average dark-region pixel
value from the average light-region pixel value. If the difference is above a threshold (set
during learning), that feature is said to be present.
To determine the presence or absence of hundreds of Haar features at every image
location and at several scales efficiently, Viola and Jones used a technique called an
Integral Image. In general, "integrating" means adding small units together. In this case,
the small units are pixel values. The integral value for each pixel is the sum of all the
pixels above it and to its left. Starting at the top left and traversing to the right and down,
the entire image can be integrated with a few integer operations per pixel. After
integration, the value at each pixel location, (x,y), contains the sum of all pixel values
within a rectangular region that has one corner at the top left of the image and the other at
location (x,y). To find the average pixel value in this rectangle, you'd only need to divide
the value at (x,y) by the rectangle's area.

Fig: 4.3.1 Computation of Integral Image

But what if you want to know the summed values for some other rectangle, one that
doesn't have one corner at the upper left of the image? Figure 4.3 shows the solution to
that problem. Suppose you want the summed values in D. You can think of that as being
the sum of pixel values in the combined rectangle, A+B+C+D, minus the sums in
rectangles A+B and A+C, plus the sum of pixel values in A. In other words,
D = A+B+C+D - (A+B) - (A+C) + A.

Conveniently, A+B+C+D is the Integral Image's value at location 4, A+B is the value at
location 2, A+C is the value at location 3, and A is the value at location 1. So, with an
Integral Image, you can find the sum of pixel values for any rectangle in the original
image with just three integer operations: (x4, y4) - (x2, y2) - (x3, y3) + (x1, y1).
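To make the rectangle-sum trick concrete, here is a minimal C# sketch of building an integral
image and summing an arbitrary rectangle with four lookups. It is illustrative only; the array
layout and method names are our own and are not part of the project code.

static class IntegralImageDemo
{
    // Build the integral image: each cell holds the sum of all pixels above it and to its left.
    // A one-cell zero border keeps the indexing simple.
    static long[,] BuildIntegral(byte[,] pixels)
    {
        int h = pixels.GetLength(0), w = pixels.GetLength(1);
        long[,] integral = new long[h + 1, w + 1];
        for (int y = 1; y <= h; y++)
            for (int x = 1; x <= w; x++)
                integral[y, x] = pixels[y - 1, x - 1]
                               + integral[y - 1, x]        // sum of the region above
                               + integral[y, x - 1]        // sum of the region to the left
                               - integral[y - 1, x - 1];   // corner region was counted twice
        return integral;
    }

    // Sum of the rectangle with top-left corner (x, y), width w and height h:
    // value(4) - value(2) - value(3) + value(1), exactly as described in the text.
    static long RectSum(long[,] integral, int x, int y, int w, int h)
    {
        return integral[y + h, x + w] - integral[y, x + w]
             - integral[y + h, x] + integral[y, x];
    }
}

Dividing RectSum by (w * h) gives the average pixel value of a region, which is what the
light-region and dark-region averages of a Haar feature are computed from.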
To select the specific Haar features to use, and to set threshold levels, Viola and Jones use
a machine-learning method called AdaBoost. AdaBoost combines many "weak"
classifiers to create one "strong" classifier. "Weak" here means the classifier only gets the
right answer a little more often than random guessing would. That's not very good. But if
you had a whole lot of these weak classifiers, and each one "pushed" the final answer a
little bit in the right direction, you'd have a strong, combined force for arriving at the
correct solution. AdaBoost selects a set of weak classifiers to combine and assigns a
weight to each. This weighted combination is the strong classifier.
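As a rough illustration of how that weighted combination behaves at detection time, the sketch
below evaluates a list of weak classifiers and compares the weighted vote against half of the
total weight, a common AdaBoost decision rule. The class and field names are hypothetical; the
real training and feature evaluation happen inside OpenCV.

using System;
using System.Collections.Generic;

class WeakClassifier
{
    public Func<long[,], bool> Test;   // e.g. "is this Haar feature's light/dark difference above its threshold?"
    public double Weight;              // the weight AdaBoost assigned to this weak classifier
}

static class StrongClassifierDemo
{
    // A "strong" classifier: each weak classifier pushes the answer a little in its direction.
    public static bool Classify(long[,] integralWindow, List<WeakClassifier> weak)
    {
        double score = 0, total = 0;
        foreach (var c in weak)
        {
            total += c.Weight;
            if (c.Test(integralWindow))
                score += c.Weight;
        }
        return score >= 0.5 * total;   // accept when the weighted votes are strong enough
    }
}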

Fig 4.3.2 the classifier cascade is a chain of filters. Image sub regions that make it through the
entire cascade are classified as "Face." All others are classified as "Not Face."

Viola and Jones combined a series of AdaBoost classifiers as a filter chain, shown in
Figure 4.3.2, that is especially efficient for classifying image regions. Each filter is a
separate AdaBoost classifier with a fairly small number of weak classifiers.

The acceptance threshold at each level is set low enough to pass all, or nearly all, face
examples in the training set. The filters at each level are trained to classify training
images that passed all previous stages. (The training set is a large database of faces,
maybe a thousand or so.) During use, if any one of these filters fails to pass an image
region, that region is immediately classified as "Not Face." When a filter passes an image
region, it goes to the next filter in the chain. Image regions that pass through all filters in
the chain are classified as "Face." Viola and Jones dubbed this filtering chain a cascade.


The order of filters in the cascade is based on the importance weighting that AdaBoost
assigns. The more heavily weighted filters come first, to eliminate non-face image regions
as quickly as possible.
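The cascade itself can then be pictured as a loop over such boosted stages, rejecting a window
the moment any stage says "Not Face". The sketch below reuses the hypothetical
StrongClassifierDemo above and is not the actual OpenCV implementation.

using System.Collections.Generic;

static class CascadeDemo
{
    // Stages are ordered by importance, so most non-face windows are discarded
    // within the first one or two stages.
    public static bool IsFace(long[,] integralWindow, List<List<WeakClassifier>> stages)
    {
        foreach (var stage in stages)
        {
            if (!StrongClassifierDemo.Classify(integralWindow, stage))
                return false;   // classified "Not Face" -> stop immediately
        }
        return true;            // passed every filter in the chain -> "Face"
    }
}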

4.3 Face Recognition


Face recognition is the task of identifying an already detected object as a known or
unknown face and, in more advanced cases, telling exactly whose face it is.

Fig: 4.4 Face recognition schematic

A. Recognition algorithms can be divided into two main approaches:


1. Geometric: which looks at distinguishing features.
2. Photometric: which is a statistical approach that distills an image into values and
compares the values with templates to eliminate variances.
B. Popular recognition algorithms include:
1. Principal Component Analysis using Eigenfaces,
2. Linear Discriminant Analysis,
3. Elastic Bunch Graph Matching using the Fisherface algorithm,
4. The Hidden Markov model, and
5. The neuronal motivated dynamic link matching.

4.3.1 PCA - Eigen Face algorithm for Face Recognition


The PCA-based Eigenface method is the most basic and simplest of the efficient face
recognition algorithms and is therefore a great place for beginners to start learning face
recognition.

No face recognition algorithm is yet 100% accurate. It may reach 100% accuracy at times,
but not always, so no existing face recognition algorithm is 100% foolproof. That is why it
is a very hot topic of research today: to optimize face recognition so that it gives
near-perfect accuracy in real-time, critical environments.

Secondly, the PCA-based Eigenfaces method is not 100% accurate either; on average it
reaches about 70% to 75% accuracy. However, it works well enough to be used in a beginner
or hobbyist robotics/computer vision project, because even though there are other, better
algorithms for face recognition, they are still not 100% accurate. And those other
recognition algorithms, though better than PCA-based Eigenfaces, carry a bigger overhead
of coding effort to implement in our project.

4.3.2 Working of PCA-based Eigen faces method

The task of facial recognition is discriminating input signals (image data) into several
classes (persons). The input signals are highly noisy (e.g. the noise is caused by differing
lighting conditions, pose etc.), yet the input images are not completely random and in
spite of their differences there are patterns which occur in any input signal. Such patterns,
which can be observed in all signals, could be - in the domain of facial recognition - the
presence of some objects (eyes, nose, mouth) in any face as well as relative distances
between these objects. These characteristic features are called eigenfaces in the facial
recognition domain (or principal components generally). They can be extracted out of
original image data by means of a mathematical tool called Principal Component
Analysis (PCA).

By means of PCA one can transform each original image of the training set into a
corresponding eigenface. An important feature of PCA is that one can reconstruct any
original image from the training set by combining the eigenfaces. Remember that
eigenfaces are nothing less than characteristic features of the faces. Therefore one could
say that the original face image can be reconstructed from eigenfaces if one adds up all
the eigenfaces (features) in the right proportion. Each eigenface represents only certain
features of the face, which may or may not be present in the original image. If the feature
is present in the original image to a higher degree, the share of the corresponding
eigenface in the sum of the eigenfaces should be greater. If, on the contrary, the particular
feature is not (or almost not) present in the original image, then the corresponding
eigenface should contribute a smaller part (or none at all) to the sum of eigenfaces. So, in
order to reconstruct the original image from the eigenfaces, one has to build a kind of
weighted sum of all eigenfaces. That is, the reconstructed original image is equal to a sum
of all eigenfaces, with each eigenface having a certain weight. This weight specifies, to
what degree the specific feature (eigenface) is present in the original image.

If one uses all the eigenfaces extracted from original images, one can reconstruct the
original images from the eigenfaces exactly. But one can also use only a part of the
eigenfaces. Then the reconstructed image is an approximation of the original image.
However, one can ensure that losses due to omitting some of the eigenfaces can be
minimized. This happens by choosing only the most important features (eigenfaces).
Omission of eigenfaces is necessary due to scarcity of computational resources.
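The weighted-sum reconstruction described above can be written down in a few lines. This is a
purely illustrative sketch (array names and dimensions are ours): the mean face and the
eigenfaces would come from PCA over the training set, and the weights from projecting a face
onto those eigenfaces.

// Reconstruct a face as "mean image + weighted sum of the K chosen eigenfaces".
static float[] Reconstruct(float[] meanFace, float[][] eigenFaces, float[] weights)
{
    float[] result = (float[])meanFace.Clone();
    for (int k = 0; k < eigenFaces.Length; k++)          // only the K most important eigenfaces are kept
        for (int p = 0; p < result.Length; p++)
            result[p] += weights[k] * eigenFaces[k][p];  // each feature contributes in proportion to its weight
    return result;                                       // approximation of the original image
}

The fewer eigenfaces are kept, the rougher the approximation, which is exactly the trade-off
discussed above.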

How does this relate to facial recognition? The clue is that it is possible not only to
reconstruct a face from the eigenfaces given a set of weights, but also to go the opposite
way: to extract the weights from the eigenfaces and the face to be recognized. These
weights tell nothing less than the amount by which the face in question differs from the
typical faces represented by the eigenfaces. Therefore, using these weights one can
determine two important things:

1. Determine whether the image in question is a face at all. If the weights of the
image differ too much from the weights of face images (i.e. images which we know for sure
are faces), the image probably is not a face.

2. Similar faces (images) possess similar features (eigenfaces) to similar degrees
(weights). If one extracts weights from all the images available, the images can be
grouped into clusters; that is, all images having similar weights are likely to be similar
faces.
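A minimal sketch of that second point, assuming the projection weights have already been
computed for the probe image and for every image in the database. All names here are
illustrative, but the logic mirrors what the EigenObjectRecognizer class used later in this
report does.

using System;

static class EigenMatcherDemo
{
    // Smallest Euclidean distance in weight space wins; a distance above the
    // threshold is reported as "unknown" (empty label).
    public static string Identify(float[] probeWeights, float[][] knownWeights,
                                  string[] labels, double threshold)
    {
        int best = 0;
        double bestDist = double.MaxValue;
        for (int i = 0; i < knownWeights.Length; i++)
        {
            double d = 0;
            for (int k = 0; k < probeWeights.Length; k++)
            {
                double diff = probeWeights[k] - knownWeights[i][k];
                d += diff * diff;
            }
            d = Math.Sqrt(d);
            if (d < bestDist) { bestDist = d; best = i; }
        }
        return bestDist < threshold ? labels[best] : "";
    }
}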

4.3.3 Computation by PCA-Eigen Face Method


The initial condition while doing PCA is that the training set and the known face image must
be the same size. The PCA Eigenface method converts each of these images into a vector and
works on these vector forms.

Fig 4.5 M, training faces

Fig 4.6 K, Eigen Faces

The PCA is used to generate K eigenfaces for a training set of M images, where K < M,
thereby reducing the number of values (from M to K) needed to identify an unknown face.

4.3.3.1 PCA and its relation to Face Recognition?

Converts a database of M face images into a list of K variables called eigenfaces (K < M).
The first principal component is the most dominant; each succeeding component captures the
next most dominant variation.

Fig 4.7 Eigenfaces: the first picture carries most of the characteristics of the training
images, and this decreases as we move to the following pictures

Fig 4.8 What PCA does

Calculating the covariance matrix of the full image vectors results in an N² x N² matrix,
where N is the image dimension (for 50x50-pixel images this gives a 2500x2500 matrix).
This causes the system to slow down terribly or run out of memory. So we discard the
noisy components of the eigenfaces: only K components are selected and the others are
discarded.

Fig 4.9 Dimensionality Reduction

So each variable in the original data set can be represented by the K principal components.

Fig 4.10 Representation of the mean image

4.3.3.3 Mathematical Analysis of PCA-Eigen Face

To perform PCA several steps are undertaken:


Stage 1: Subtract the Mean of the data from each variable (our adjusted data)
Stage 2: Calculate and form a covariance Matrix
Stage 3: Calculate Eigenvectors and Eigenvalues from the covariance Matrix
Stage 4: Choose a Feature Vector (a fancy name for a matrix of vectors)
Stage 5: Multiply the transposed Feature Vectors by the transposed adjusted data

STAGE 1: Mean Subtraction


This step makes the calculation of our covariance matrix a little simpler. Note that this is
not the subtraction of the overall mean from each of our values, since for covariance we
need at least two dimensions of data. It is in fact the subtraction of the mean of each row
from each element in that row.
(Alternatively, the mean of each column could be subtracted from each element in that
column; however this would change the way we calculate the covariance matrix.)

STAGE 2: Covariance Matrix


The basic covariance equation for two-dimensional data is:

cov(x, y) = sum((x_i - x̄)(y_i - ȳ)) / (n - 1)

which is similar to the formula for variance, except that the change of x is taken with
respect to the change in y rather than solely with respect to x. Here x_i represents a
pixel value, x̄ is the mean of all x values, and n is the total number of values.
The covariance matrix formed from the image data represents how much the dimensions vary
from the mean with respect to each other. A covariance matrix for n dimensions is the
n x n matrix whose entry (i, j) is cov(Dim_i, Dim_j).

The easiest way to explain this is by an example, the easiest of which is the 3x3 matrix
for three dimensions x, y and z:

C = | cov(x,x)  cov(x,y)  cov(x,z) |
    | cov(y,x)  cov(y,y)  cov(y,z) |
    | cov(z,x)  cov(z,y)  cov(z,z) |

With larger matrices this can become more complicated, and the use of computational
algorithms becomes essential.

STAGE 3: Eigenvectors and Eigenvalues

Eigenvalues and eigenvectors arise from matrix multiplication, but as a special case: an
eigenvector is a vector which, when multiplied by the covariance matrix, is only scaled,
not rotated. This makes the covariance matrix the equivalent of a transformation matrix.
It is easier to show with an example:

| 2  3 |   | 3 |   | 12 |         | 3 |
| 2  1 | x | 2 | = |  8 |  =  4 x | 2 |

Eigenvectors can be scaled: twice the vector will still produce the same type of result,
since a vector is a direction and all you are changing is the scale, not the direction.

Eigenvectors are usually scaled to have a length of 1, so (3, 2) becomes
(3/sqrt(13), 2/sqrt(13)).

The eigenvalue is closely related to the eigenvector used and is the value by which the
original vector was scaled; in the example the eigenvalue is 4.

STAGE 4: Feature Vector

Usually the results of the eigenvalue and eigenvector calculations are not as clean as in
the example above; in most cases the eigenvectors provided are already scaled to a length
of 1.

Once the eigenvectors are found from the covariance matrix, the next step is to order them
by eigenvalue, highest to lowest. This gives the components in order of significance. Here
the data can be compressed and the weaker vectors removed, producing a lossy compression
method; the data lost is deemed to be insignificant.

STAGE 5: Transposition

The final stage in PCA is to take the transpose of the feature vector matrix and multiply it
on the left of the transposed adjusted data set (the adjusted data set is from Stage 1 where
the mean was subtracted from the data).

The EigenObjectRecognizer class performs all of this and then feeds the transposed data
as a training set. When it is passed an image to recognize it performs PCA and compares
the generated Eigenvalues and Eigenvectors to the ones from the training set and then
produces a match if one has been found or a negative match if no match is found.
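As a rough end-to-end illustration of Stages 1, 2 and 5, the sketch below mean-adjusts a small
data matrix, forms its covariance matrix, and projects a sample onto a chosen set of feature
vectors. It is deliberately simplified and uses illustrative names; the eigen-decomposition of
Stage 3 is left to a library routine (in this project it happens inside EigenObjectRecognizer
via OpenCV).

using System.Linq;

static class PcaDemo
{
    // Stage 1: subtract the mean of each row from every element in that row.
    public static void SubtractRowMeans(double[][] data)
    {
        foreach (var row in data)
        {
            double mean = row.Average();
            for (int j = 0; j < row.Length; j++) row[j] -= mean;
        }
    }

    // Stage 2: covariance matrix of the mean-adjusted data (each row is one dimension).
    public static double[,] Covariance(double[][] adjusted)
    {
        int dims = adjusted.Length, n = adjusted[0].Length;
        var cov = new double[dims, dims];
        for (int i = 0; i < dims; i++)
            for (int j = 0; j < dims; j++)
            {
                double sum = 0;
                for (int k = 0; k < n; k++) sum += adjusted[i][k] * adjusted[j][k];
                cov[i, j] = sum / (n - 1);
            }
        return cov;
    }

    // Stages 4-5: keep the K strongest eigenvectors (the feature vector) and
    // project a mean-adjusted sample onto them to obtain its weights.
    public static double[] Project(double[][] featureVectors, double[] adjustedSample)
    {
        var weights = new double[featureVectors.Length];
        for (int k = 0; k < featureVectors.Length; k++)
            for (int p = 0; p < adjustedSample.Length; p++)
                weights[k] += featureVectors[k][p] * adjustedSample[p];
        return weights;
    }
}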

4.3.3.4 Recognition of Unknown Face

CHAPTER 5- REQUIREMENTS

HARDWARE REQUIREMENTS

AMD BULLDOZER FX-4100
BIOSTAR MOTHERBOARD
GRAPHICS CARD HD Radeon 7770 OC
COOLING SYSTEMS
1. GPU COOLING
2. PROCESSOR COOLING
3. MOTHERBOARD COOLING
WEB CAMERA
MONITOR
KEYBOARD
MOUSE
BUZZER

SOFTWARE REQUIREMENTS

OPEN CV LIBRARY
EMGU CV LIBRARY
C#, C++
WINDOWS PLATFORM, LINUX PLATFORM

CHAPTER 6-IMPLEMENTATION DETAILS

The implementation section consists of two units: software and hardware.

6.1 Software Section

Programming Language: C#, C++
Library: Emgu CV, OpenCV

OpenCV (Open Source Computer Vision Library) is a library of programming functions mainly
aimed at real-time computer vision, developed by Intel.
Emgu CV is a cross-platform .NET wrapper to the OpenCV image processing library.

6.1.1 C#

C# (pronounced C Sharp) is no doubt the language of choice in the .NET environment. It is
a whole new language, free of the backward-compatibility curse, with a bunch of exciting,
promising features. It is an object-oriented programming language and has, at its core, many
similarities to Java, C++ and VB. In fact, C# combines the power and efficiency of C++,
the simple and clean OO design of Java and the language simplification of Visual Basic.

6.1.2 Open CV

What Is OpenCV? OpenCV [OpenCV] is an open source (see http://opensource.org)
computer vision library available from http://SourceForge.net/projects/opencvlibrary. The
library is written in C and C++ and runs under Linux, Windows and Mac OS X. There is
active development on interfaces for Python, Ruby, Matlab, and other languages.

OpenCV was designed for computational efficiency and with a strong focus on real-time
applications. OpenCV is written in optimized C and can take advantage of multicore
processors. If you desire further automatic optimization on Intel architectures [Intel], you
can buy Intel's Integrated Performance Primitives (IPP) libraries [IPP], which consist of
low-level optimized routines in many different algorithmic areas. OpenCV automatically
uses the appropriate IPP library at runtime if that library is installed.

One of OpenCV's goals is to provide a simple-to-use computer vision infrastructure that
helps people build fairly sophisticated vision applications quickly. The OpenCV library
contains over 500 functions that span many areas in vision, including factory product
inspection, medical imaging, security, user interface, camera calibration, stereo vision,
and robotics. Because computer vision and machine learning often go hand-in-hand,
OpenCV also contains a full, general-purpose Machine Learning Library (MLL). This
sub-library is focused on statistical pattern recognition and clustering. The MLL is highly
useful for the vision tasks that are at the core of OpenCV's mission, but it is general
enough to be used for any machine learning problem.

6.1.3 Emgu CV

Emgu CV is a cross-platform .NET wrapper to the OpenCV image processing library, allowing
OpenCV functions to be called from .NET compatible languages such as C#, VB, VC++,
IronPython etc. The wrapper can be compiled in Mono and runs on Windows, Linux, Mac OS X,
iPhone, iPad and Android devices.

A Comparison of Emgu CV Versions

Name                           | Emgu CV (Open Source)   | Emgu CV (Commercial Optimized) | Emgu CV for iOS (Commercial)          | Emgu CV for Android (Beta)
OS                             | Windows, Linux, Mac OSX | Windows                        | iOS (iPhone, iPad, iPod Touch)        | Android
Supported CPU Architecture     | i386, x64               | i386, x64                      | armeabi, armeabi-v7, i386 (Simulator) | armeabi, armeabi-v7a, x86
GPU Processing                 | Yes                     | Yes                            | No                                    | No
Machine Learning               | Yes                     | Yes                            | Yes                                   | Yes
Tesseract OCR                  | Yes                     | Yes                            | Yes                                   | Yes
Intel TBB (multi-thread)       | No                      | Yes                            | No                                    | No
Intel IPP (high performance)   | No                      | Yes                            | No                                    | No
Intel C++ Compiler (fast code) | No                      | Yes                            | No                                    | No
Exception Handling             | Yes                     | Yes                            | Yes                                   | Yes
Debugger Visualizer            | Yes                     | Yes                            | No                                    | No
Emgu.CV.UI                     | Yes                     | Yes                            | No                                    | No
License                        | GPL                     | Commercial License             | Commercial License                    | Commercial License

Table 6.1 Comparison of Emgu CV versions

Fig:6.1 Emgu CV architecture

6.1.4 Implementation Procedure

The software implementation can be divided into four processes:

Camera Capture
Face Detection
Face Recognition
Alarm Out

6.1.4.1 Camera Capture

Camera Capture is basically

1. Take images from a web camera continuously
2. Show them in an Emgu CV Image Box
3. The application should start when the "Start" button is pressed and pause when it is
pressed again, and vice versa

STEP-1: open Visual Studio 2010 and select File-> New->Project as follows:

STEP-2: in the Visual C# Project menu, Select "Windows Forms Application" and name
the project "Camera Capture", and Click "OK"

STEP-3: Let's first add the Emgu references to our project (though you can add them at any
time later, you must add the references before debugging). Select the Browse tab in the
window that pops up, go to Emgu CV's bin folder as in the Level-0 tutorial, select the
following 3 .dll files (Emgu.CV.dll, Emgu.CV.UI.dll and Emgu.Util.dll) and click OK to
continue.

STEP-5: Rename Form1.cs to CameraCapture.cs and change its Text field to "Camera
Output". Add the Emgu CV tools to your Visual Studio, because we will be using those
tools, such as the Image Box. Add a button to the form and do some more required
"housekeeping" as below:

Image Box Properties:

Name: CamImageBox

Border Style: Fixed single

Button properties:

(Name): btnStart

Text: Start!

Then Debug and Save.
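Putting these steps together, the capture logic looks roughly like the sketch below. The control
names (btnStart, CamImageBox) are the ones set up in the housekeeping above; everything else is a
minimal sketch against the Emgu CV 2.x API used elsewhere in this report, not the project's exact
source file.

using System;
using System.Windows.Forms;
using Emgu.CV;
using Emgu.CV.Structure;

public partial class CameraCapture : Form
{
    private Capture grabber;            // Emgu CV capture device (default web camera)
    private bool capturing = false;

    public CameraCapture() { InitializeComponent(); }

    private void btnStart_Click(object sender, EventArgs e)
    {
        if (grabber == null)
        {
            grabber = new Capture();             // open the default camera
            Application.Idle += ShowFrame;       // grab a frame whenever the UI is idle
        }
        capturing = !capturing;                  // toggle start/pause
        btnStart.Text = capturing ? "Pause" : "Start!";
    }

    private void ShowFrame(object sender, EventArgs e)
    {
        if (!capturing) return;
        Image<Bgr, byte> frame = grabber.QueryFrame();   // take an image from the web camera
        CamImageBox.Image = frame;                       // show it in the Emgu CV ImageBox
    }
}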

6.1.4.2 Face Detection

STEP 1: DECLARING THE CLASSIFIER - Declare an object of class HaarCascade

private HaarCascade haar;

STEP 2: LOAD THE HaarCascade XML file

A classifier uses data stored in an XML file to decide how to classify each image
location. So naturally, Haar will need some XML file to load trained data from.

You'll need to tell the classifier (Haar object in this case) where to find this data file you
want it to use. It's better to locate the XML file we want to use and make sure our path to
it is correct, before we code the rest of our face-detection program.

haar = new HaarCascade("haarcascade_frontalface_alt_tree.xml");

STEP 3: SET THE IMAGE SOURCE FOR FACE DETECTION

STEP 4: INSERT THE FACE DETECTION CODE:

var faces = grayframe.DetectHaarCascade(haar, 1.4, 4,
    HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,
    new Size(25, 25))[0];   // MCvAvgComp[]

STEP 5: DEBUG THE PROGRAM
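Tying steps 1-4 together, a single detection pass over a grabbed frame looks roughly like this
(variable names are illustrative; the 1.4 scale factor, 4 minimum neighbours and 25x25 minimum
window come from step 4):

// Inside the form class; requires Emgu.CV, Emgu.CV.Structure, Emgu.CV.CvEnum and System.Drawing.
private HaarCascade haar = new HaarCascade("haarcascade_frontalface_alt_tree.xml");

private void DetectAndMark(Image<Bgr, byte> frame)
{
    // The detector works on a grayscale copy of the frame.
    Image<Gray, byte> grayframe = frame.Convert<Gray, byte>();

    var faces = grayframe.DetectHaarCascade(
        haar, 1.4, 4,
        HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,
        new Size(25, 25))[0];                    // [0]: detections for the first channel

    // Draw a rectangle around every face found in this frame.
    foreach (MCvAvgComp face in faces)
        frame.Draw(face.rect, new Bgr(Color.Red), 2);
}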

6.1.4.3 Face Recognition

STEP 1: ENTERING THE CRITERIA FOR FACE RECOGNITION

MCvTermCriteria termCrit = new MCvTermCriteria(ContTrain, 0.001); int thrs = 1000;

STEP 2 ADD EIGEN FACE RECOGNIZER CODE

EigenObjectRecognizer recognizer = new EigenObjectRecognizer(


trainingImages.ToArray(),
labels.ToArray(),
thrs,
ref termCrit);

STEP 3: DRAW THE LABEL FOR EACH FACE DETECTED AND RECOGNIZED

currentFrame.Draw(name, ref font, new Point(f.rect.X - 2, f.rect.Y - 2),
    new Bgr(Color.LightGreen));
if (name != "")
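Taken together with the detection loop, the recognition pass for one detected face rectangle f
is roughly the following. It is a condensed sketch of the full listing in section 6.1.5
(trainingImages, labels, txtThreshold and ContTrain are fields and controls of the main form
there).

// Crop the detected face, recognize it against the trained set, and label it.
MCvTermCriteria termCrit = new MCvTermCriteria(ContTrain, 0.001);
int thrs = int.Parse(txtThreshold.Text);                 // eigen-distance threshold from the UI

EigenObjectRecognizer recognizer = new EigenObjectRecognizer(
    trainingImages.ToArray(), labels.ToArray(), thrs, ref termCrit);

Image<Gray, byte> result = currentFrame.Copy(f.rect)
    .Convert<Gray, byte>()
    .Resize(100, 100, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC);

string name = recognizer.Recognize(result);              // "" means the face was not recognized
currentFrame.Draw(name, ref font,
    new Point(f.rect.X - 2, f.rect.Y - 2), new Bgr(Color.LightGreen));

if (name != "")                                          // a person from the database is on camera
{
    // raise the alarm: buzzer and speech out, as shown in the next section
}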

6.1.4.4 Alarm Out

STEP 1 ADDING THE BUZZER OUT

Console.Beep(2000, 1000);

STEP 2 ADDING SPEECH OUT

Define speech synthesizer

new SpeechSynthesizer().Speak("Security Alert Person Identified ");

6.1.5 Program Code


Main Form.cs
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Windows.Forms;
using Emgu.CV;
using Emgu.CV.Structure;
using Emgu.CV.CvEnum;
using System.IO;
using System.Diagnostics;
using System.Speech.Synthesis;

namespace MultiFaceRec
{
public partial class FrmPrincipal : Form
{
//Declararation of all variables, vectors and haarcascades
Image<Bgr, Byte> currentFrame;
Capture grabber;
HaarCascade face;
HaarCascade eye;
MCvFont font = new MCvFont(FONT.CV_FONT_HERSHEY_TRIPLEX, 0.5d, 0.5d);
Image<Gray, byte> result, TrainedFace = null;
Image<Gray, byte> gray = null;
List<Image<Gray, byte>> trainingImages = new List<Image<Gray, byte>>();
List<string> labels= new List<string>();
List<string> NamePersons = new List<string>();
int ContTrain, NumLabels, t;
string name, names = null;

public FrmPrincipal()
{
InitializeComponent();
//Load haarcascades for face detection
face = new HaarCascade("haarcascade_frontalface_default.xml");
eye = new HaarCascade("haarcascade_eye.xml");
try
{
//Load of previus trainned faces and labels for each image
string Labelsinfo = File.ReadAllText(Application.StartupPath +
"/TrainedFaces/TrainedLabels.txt");
string[] Labels = Labelsinfo.Split('%');
NumLabels = Convert.ToInt16(Labels[0]);
ContTrain = NumLabels;
string LoadFaces;

for (int tf = 1; tf < NumLabels+1; tf++)
{
LoadFaces = "face" + tf + ".bmp";
trainingImages.Add(new Image<Gray,
byte>(Application.StartupPath + "/TrainedFaces/" + LoadFaces));
labels.Add(Labels[tf]);
}

}
catch(Exception e)
{
//MessageBox.Show(e.ToString());
MessageBox.Show("Nothing in binary database, please add at least a
face(Simply train the prototype with the Add Face Button).", "Triained faces
load", MessageBoxButtons.OK, MessageBoxIcon.Exclamation);
            }
        }

private void button1_Click(object sender, EventArgs e)


{
//Initialize the capture device
grabber = new Capture();
grabber.QueryFrame();
//Initialize the FrameGraber event
Application.Idle += new EventHandler(FrameGrabber);
button1.Enabled = false;
}

private void button2_Click(object sender, System.EventArgs e)


{
try
{
//Trained face counter
ContTrain = ContTrain + 1;

//Get a gray frame from capture device


gray = grabber.QueryGrayFrame().Resize(320, 240,
Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC);

//Face Detector
MCvAvgComp[][] facesDetected = gray.DetectHaarCascade(
face,
1.2,
10,
Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,
new Size(20, 20));

//Action for each element detected


foreach (MCvAvgComp f in facesDetected[0])
{
TrainedFace = currentFrame.Copy(f.rect).Convert<Gray, byte>();
break;
}

                //resize face detected image for force to compare the same size with the
                //test image with cubic interpolation type method
TrainedFace = result.Resize(100, 100,
Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC);
trainingImages.Add(TrainedFace);
labels.Add(textBox1.Text);

//Show face added in gray scale


imageBox1.Image = TrainedFace;

                //Write the number of trained faces in a text file for further load
                File.WriteAllText(Application.StartupPath +
                    "/TrainedFaces/TrainedLabels.txt", trainingImages.ToArray().Length.ToString() + "%");

                //Write the labels of trained faces in a text file for further load
for (int i = 1; i < trainingImages.ToArray().Length + 1; i++)
{
trainingImages.ToArray()[i - 1].Save(Application.StartupPath +
"/TrainedFaces/face" + i + ".bmp");
File.AppendAllText(Application.StartupPath +
"/TrainedFaces/TrainedLabels.txt", labels.ToArray()[i - 1] + "%");
}

MessageBox.Show(textBox1.Text + "s face detected and added :)",


"Training OK", MessageBoxButtons.OK, MessageBoxIcon.Information);
}
catch
{
MessageBox.Show("Enable the face detection first", "Training
Fail", MessageBoxButtons.OK, MessageBoxIcon.Exclamation);
}
}

void FrameGrabber(object sender, EventArgs e)


{
label3.Text = "0";
//label4.Text = "";
NamePersons.Add("");

//Get the current frame form capture device


currentFrame = grabber.QueryFrame().Resize(320, 240,
Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC);

//Convert it to Grayscale
gray = currentFrame.Convert<Gray, Byte>();

//Face Detector
MCvAvgComp[][] facesDetected = gray.DetectHaarCascade(
face,
1.2,
10,
Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,
new Size(20, 20));

//Action for each element detected


foreach (MCvAvgComp f in facesDetected[0])
{
t = t + 1;

result = currentFrame.Copy(f.rect).Convert<Gray,
byte>().Resize(100, 100, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC);
//draw the face detected in the 0th (gray) channel with blue color
currentFrame.Draw(f.rect, new Bgr(Color.Red), 2);

if (trainingImages.ToArray().Length != 0)
{
                    //TermCriteria for face recognition with number of trained images like maxIteration
                    MCvTermCriteria termCrit = new MCvTermCriteria(ContTrain, 0.001);
                    int thrs = 1000;

try
{
thrs = int.Parse(txtThreshold.Text);
}
catch (Exception ex)
{
// MessageBox.Show("Enter integer as threshold value");
}

//Eigen face recognizer


EigenObjectRecognizer recognizer = new EigenObjectRecognizer(
trainingImages.ToArray(),
labels.ToArray(),
thrs,
ref termCrit);

name = recognizer.Recognize(result);

//Draw the label for each face detected and recognized


                    currentFrame.Draw(name, ref font, new Point(f.rect.X - 2, f.rect.Y - 2), new Bgr(Color.LightGreen));

                    if (name != "")
                    {
                        Console.Beep(2000, 1000);

                        //add sound
                        new SpeechSynthesizer().Speak("Security Alert Person Identified");
                    }
                }

                NamePersons[t - 1] = name;
                NamePersons.Add("");

//Set the number of faces detected on the scene


label3.Text = facesDetected[0].Length.ToString();

/*
//Set the region of interest on the faces

gray.ROI = f.rect;
MCvAvgComp[][] eyesDetected = gray.DetectHaarCascade(
eye,
1.1,
10,
Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,
new Size(20, 20));
gray.ROI = Rectangle.Empty;

foreach (MCvAvgComp ey in eyesDetected[0])


{
Rectangle eyeRect = ey.rect;
eyeRect.Offset(f.rect.X, f.rect.Y);
currentFrame.Draw(eyeRect, new Bgr(Color.Blue), 2);
}
*/

}
t = 0;

//Names concatenation of persons recognized


for (int nnn = 0; nnn < facesDetected[0].Length; nnn++)
{
names = names + NamePersons[nnn] + ", ";
}
//Show the faces procesed and recognized
imageBoxFrameGrabber.Image = currentFrame;

label4.Text = names;

names = "";
{

NamePersons.Clear();

//Clear the list(vector) of names

private void button3_Click(object sender, EventArgs e)


{
Process.Start("copyright Anand Raj & Team and Open Source");
}

        private void textBox1_TextChanged(object sender, EventArgs e)
        {
        }

        private void FrmPrincipal_Load(object sender, EventArgs e)
        {
        }

        private void groupBox1_Enter(object sender, EventArgs e)
        {
        }

        private void textBox2_TextChanged(object sender, EventArgs e)
        {
        }
    }
}

Eigen Object Recogniser.cs


using System;
using System.Diagnostics;
using Emgu.CV.Structure;

namespace Emgu.CV
{
/// <summary>
/// An object recognizer using PCA (Principle Components Analysis)
/// </summary>
[Serializable]
public class EigenObjectRecognizer
{
private Image<Gray, Single>[] _eigenImages;
private Image<Gray, Single> _avgImage;
private Matrix<float>[] _eigenValues;
private string[] _labels;
private double _eigenDistanceThreshold;

/// <summary>
/// Get the eigen vectors that form the eigen space
/// </summary>
/// <remarks>The set method is primary used for deserialization, do not
attemps to set it unless you know what you are doing</remarks>
public Image<Gray, Single>[] EigenImages
{
get { return _eigenImages; }
set { _eigenImages = value; }
}

/// <summary>
/// Get or set the labels for the corresponding training image
/// </summary>
public String[] Labels
{
get { return _labels; }
set { _labels = value; }
}

/// <summary>
/// Get or set the eigen distance threshold.
/// The smaller the number, the more likely an examined image will be
treated as unrecognized object.
/// Set it to a huge number (e.g. 5000) and the recognizer will always
treated the examined image as one of the known object.
/// </summary>
public double EigenDistanceThreshold
{
get { return _eigenDistanceThreshold; }
set { _eigenDistanceThreshold = value; }
}

/// <summary>
/// Get the average Image.
/// </summary>
/// <remarks>The set method is primary used for deserialization, do not
attemps to set it unless you know what you are doing</remarks>
public Image<Gray, Single> AverageImage
{
get { return _avgImage; }
set { _avgImage = value; }
}

/// <summary>
/// Get the eigen values of each of the training image
/// </summary>
/// <remarks>The set method is primary used for deserialization, do not
attemps to set it unless you know what you are doing</remarks>
public Matrix<float>[] EigenValues
{
get { return _eigenValues; }
set { _eigenValues = value; }
}

private EigenObjectRecognizer()
{
}

/// <summary>
/// Create an object recognizer using the specific tranning data and
parameters, it will always return the most similar object
/// </summary>
/// <param name="images">The images used for training, each of them should
be the same size. It's recommended the images are histogram normalized</param>
/// <param name="termCrit">The criteria for recognizer training</param>
public EigenObjectRecognizer(Image<Gray, Byte>[] images, ref MCvTermCriteria
termCrit)
: this(images, GenerateLabels(images.Length), ref termCrit)
{
}

private static String[] GenerateLabels(int size)


{
String[] labels = new string[size];
for (int i = 0; i < size; i++)
labels[i] = i.ToString();
return labels;
}

/// <summary>
/// Create an object recognizer using the specific tranning data and
parameters, it will always return the most similar object
/// </summary>
/// <param name="images">The images used for training, each of them should
be the same size. It's recommended the images are histogram normalized</param>
/// <param name="labels">The labels corresponding to the images</param>
/// <param name="termCrit">The criteria for recognizer training</param>
public EigenObjectRecognizer(Image<Gray, Byte>[] images, String[] labels,
ref MCvTermCriteria termCrit)
: this(images, labels, 0, ref termCrit)
{
}
/// <summary>
/// Create an object recognizer using the specific tranning data and
parameters
/// </summary>
/// <param name="images">The images used for training, each of them should
be the same size. It's recommended the images are histogram normalized</param>
/// <param name="labels">The labels corresponding to the images</param>
/// <param name="eigenDistanceThreshold">
/// The eigen distance threshold, (0, ~1000].
/// The smaller the number, the more likely an examined image will be
treated as unrecognized object.
/// If the threshold is &lt; 0, the recognizer will always treated the
examined image as one of the known object.
/// </param>
/// <param name="termCrit">The criteria for recognizer training</param>
public EigenObjectRecognizer(Image<Gray, Byte>[] images, String[] labels,
double eigenDistanceThreshold, ref MCvTermCriteria termCrit)
{
Debug.Assert(images.Length == labels.Length, "The number of images should
equals the number of labels");
Debug.Assert(eigenDistanceThreshold >= 0.0, "Eigen-distance threshold
should always >= 0.0");

CalcEigenObjects(images, ref termCrit, out _eigenImages, out _avgImage);

/*
_avgImage.SerializationCompressionRatio = 9;

foreach (Image<Gray, Single> img in _eigenImages)


//Set the compression ration to best compression. The serialized
object can therefore save spaces
img.SerializationCompressionRatio = 9;
*/

_eigenValues = Array.ConvertAll<Image<Gray, Byte>, Matrix<float>>(images,


delegate(Image<Gray, Byte> img)
{
return new Matrix<float>(EigenDecomposite(img, _eigenImages,
_avgImage));
});

_labels = labels;

_eigenDistanceThreshold = eigenDistanceThreshold;
}

#region static methods


/// <summary>
/// Caculate the eigen images for the specific traning image
/// </summary>
/// <param name="trainingImages">The images used for training </param>
/// <param name="termCrit">The criteria for tranning</param>
/// <param name="eigenImages">The resulting eigen images</param>
/// <param name="avg">The resulting average image</param>
public static void CalcEigenObjects(Image<Gray, Byte>[] trainingImages, ref
MCvTermCriteria termCrit, out Image<Gray, Single>[] eigenImages, out Image<Gray,
Single> avg)
{
int width = trainingImages[0].Width;
int height = trainingImages[0].Height;

IntPtr[] inObjs = Array.ConvertAll<Image<Gray, Byte>,
IntPtr>(trainingImages, delegate(Image<Gray, Byte> img) { return img.Ptr; });

if (termCrit.max_iter <= 0 || termCrit.max_iter > trainingImages.Length)


termCrit.max_iter = trainingImages.Length;

int maxEigenObjs = termCrit.max_iter;

#region initialize eigen images


eigenImages = new Image<Gray, float>[maxEigenObjs];
for (int i = 0; i < eigenImages.Length; i++)
eigenImages[i] = new Image<Gray, float>(width, height);
IntPtr[] eigObjs = Array.ConvertAll<Image<Gray, Single>,
IntPtr>(eigenImages, delegate(Image<Gray, Single> img) { return img.Ptr; });
#endregion

avg = new Image<Gray, Single>(width, height);

CvInvoke.cvCalcEigenObjects(
inObjs,
ref termCrit,
eigObjs,
null,
avg.Ptr);
}

/// <summary>
/// Decompose the image as eigen values, using the specific eigen vectors
/// </summary>
/// <param name="src">The image to be decomposed</param>
/// <param name="eigenImages">The eigen images</param>
/// <param name="avg">The average images</param>
/// <returns>Eigen values of the decomposed image</returns>
public static float[] EigenDecomposite(Image<Gray, Byte> src, Image<Gray,
Single>[] eigenImages, Image<Gray, Single> avg)
{
return CvInvoke.cvEigenDecomposite(
src.Ptr,
Array.ConvertAll<Image<Gray, Single>, IntPtr>(eigenImages,
delegate(Image<Gray, Single> img) { return img.Ptr; }),
avg.Ptr);
}
#endregion

/// <summary>
/// Given the eigen value, reconstruct the projected image
/// </summary>
/// <param name="eigenValue">The eigen values</param>
/// <returns>The projected image</returns>
public Image<Gray, Byte> EigenProjection(float[] eigenValue)
{
Image<Gray, Byte> res = new Image<Gray, byte>(_avgImage.Width,
_avgImage.Height);
CvInvoke.cvEigenProjection(
Array.ConvertAll<Image<Gray, Single>, IntPtr>(_eigenImages,
delegate(Image<Gray, Single> img) { return img.Ptr; }),
eigenValue,
_avgImage.Ptr,
res.Ptr);
return res;
}

/// <summary>
/// Get the Euclidean eigen-distance between <paramref name="image"/> and
every other image in the database
/// </summary>
/// <param name="image">The image to be compared from the training
images</param>
/// <returns>An array of eigen distance from every image in the training
images</returns>
public float[] GetEigenDistances(Image<Gray, Byte> image)
{
using (Matrix<float> eigenValue = new
Matrix<float>(EigenDecomposite(image, _eigenImages, _avgImage)))
return Array.ConvertAll<Matrix<float>, float>(_eigenValues,
delegate(Matrix<float> eigenValueI)
{
return (float)CvInvoke.cvNorm(eigenValue.Ptr, eigenValueI.Ptr,
Emgu.CV.CvEnum.NORM_TYPE.CV_L2, IntPtr.Zero);
});
}

/// <summary>
/// Given the <paramref name="image"/> to be examined, find in the database
the most similar object, return the index and the eigen distance
/// </summary>
/// <param name="image">The image to be searched from the database</param>
/// <param name="index">The index of the most similar object</param>
/// <param name="eigenDistance">The eigen distance of the most similar
object</param>
/// <param name="label">The label of the specific image</param>
public void FindMostSimilarObject(Image<Gray, Byte> image, out int index,
out float eigenDistance, out String label)
{
float[] dist = GetEigenDistances(image);

index = 0;
eigenDistance = dist[0];
for (int i = 1; i < dist.Length; i++)
{
if (dist[i] < eigenDistance)
{
index = i;
eigenDistance = dist[i];
}
}
label = Labels[index];
}

/// <summary>
/// Try to recognize the image and return its label
/// </summary>
/// <param name="image">The image to be recognized</param>
/// <returns>
/// String.Empty, if not recognized;
/// Label of the corresponding image, otherwise
/// </returns>
public String Recognize(Image<Gray, Byte> image)
{
int index;
float eigenDistance;
String label;
FindMostSimilarObject(image, out index, out eigenDistance, out label);

return (_eigenDistanceThreshold <= 0 || eigenDistance <
_eigenDistanceThreshold ) ? _labels[index] : String.Empty;
}
}
}

Code using OpenCV

#include <stdio.h>
#if defined WIN32 || defined _WIN32
#include <conio.h> // For _kbhit() on Windows
#include <direct.h> // For mkdir(path) on Windows
#define snprintf sprintf_s // Visual Studio on Windows comes
with sprintf_s() instead of snprintf()
#else
#include <stdio.h> // For getchar() on Linux
#include <termios.h> // For kbhit() on Linux
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h> // For mkdir(path, options) on Linux
#endif
#include <vector>
#include <string>
//#include <string.h>
#include "cv.h"
#include "cvaux.h"
#include "highgui.h"

#ifndef BOOL
#define BOOL bool
#endif

using namespace std;

// Haar Cascade file, used for Face Detection.


const char *faceCascadeFilename = "haarcascade_frontalface_alt.xml";

int SAVE_EIGENFACE_IMAGES = 1; // Set to 0 if you dont want


images of the Eigenvectors saved to files (for debugging).
//#define USE_MAHALANOBIS_DISTANCE // You might get better
recognition accuracy if you enable this.

// Global variables
IplImage ** faceImgArr = 0; // array of face images
CvMat * personNumTruthMat = 0; // array of person numbers
//#define MAX_NAME_LENGTH 256 // Give each name a fixed
size for easier code.
//char **personNames = 0; // array of person names
(indexed by the person number).
vector<string> personNames; // array of person names
(indexed by the person number).
int faceWidth = 120; // Default dimensions for faces in the face
recognition database.

int faceHeight = 90; // " " " "
" " " "
int nPersons = 0; // the number of people in the
training set.
int nTrainFaces = 0; // the number of training images
int nEigens = 0; // the number of eigenvalues
IplImage * pAvgTrainImg = 0; // the average image
IplImage ** eigenVectArr = 0; // eigenvectors
CvMat * eigenValMat = 0; // eigenvalues
CvMat * projectedTrainFaceMat = 0; // projected training faces

CvCapture* camera = 0; // The camera device.

// Function prototypes
void printUsage();
void learn(const char *szFileTrain);
void doPCA();
void storeTrainingData();
int loadTrainingData(CvMat ** pTrainPersonNumMat);
int findNearestNeighbor(float * projectedTestFace);
int findNearestNeighbor(float * projectedTestFace, float
*pConfidence);
int loadFaceImgArray(const char * filename);
void recognizeFileList(const char *szFileTest);
void recognizeFromCam(void);
IplImage* getCameraFrame(void);
IplImage* convertImageToGreyscale(const IplImage *imageSrc);
IplImage* cropImage(const IplImage *img, const CvRect region);
IplImage* resizeImage(const IplImage *origImg, int newWidth, int
newHeight);
IplImage* convertFloatImageToUcharImage(const IplImage *srcImg);
void saveFloatImage(const char *filename, const IplImage *srcImg);
CvRect detectFaceInImage(const IplImage *inputImg, const
CvHaarClassifierCascade* cascade );
CvMat* retrainOnline(void);

// Show how to use this program from the command-line.


void printUsage()
{
printf("Usage: OnlineFaceRecogntion [<command>] \n"
" Valid commands are: \n"
" train <train_file> \n"
" test <test_file> \n"
" (if no args are supplied, then online camera mode is
enabled).\n"
);
}

// Startup routine.
int main( int argc, char** argv )
{
printUsage();

if( argc >= 2 && strcmp(argv[1], "train") == 0 ) {


char *szFileTrain;
if (argc == 3)
szFileTrain = argv[2]; // use the given arg
else {
printf("ERROR: No training file given.\n");
return 1;
}
learn(szFileTrain);
}
else if( argc >= 2 && strcmp(argv[1], "test") == 0) {
char *szFileTest;
if (argc == 3)
szFileTest = argv[2]; // use the given arg
else {
printf("ERROR: No testing file given.\n");
return 1;
}
recognizeFileList(szFileTest);
}
else {
recognizeFromCam();
}
return 0;
}

#if defined WIN32 || defined _WIN32


// Wrappers of kbhit() and getch() for Windows:
#define changeKeyboardMode
#define kbhit _kbhit
#else
// Create an equivalent to kbhit() and getch() for Linux,

#define VK_ESCAPE 0x1B // Escape character

// If 'dir' is 1, get the Linux terminal to return the 1st


keypress instead of waiting for an ENTER key.
// If 'dir' is 0, will reset the terminal back to the original
settings.
void changeKeyboardMode(int dir)
{
static struct termios oldt, newt;

if ( dir == 1 ) {
tcgetattr( STDIN_FILENO, &oldt);
newt = oldt;
newt.c_lflag &= ~( ICANON | ECHO );
tcsetattr( STDIN_FILENO, TCSANOW, &newt);
}
else
tcsetattr( STDIN_FILENO, TCSANOW, &oldt);
}

// Get the next keypress.


int kbhit(void)
{
struct timeval tv;
fd_set rdfs;

tv.tv_sec = 0;
tv.tv_usec = 0;
FD_ZERO(&rdfs);
FD_SET (STDIN_FILENO, &rdfs);

select(STDIN_FILENO+1, &rdfs, NULL, NULL, &tv);


return FD_ISSET(STDIN_FILENO, &rdfs);
}

// Use getchar() on Linux instead of getch().


#define getch() getchar()
#endif

// Save all the eigenvectors as images, so that they can be checked.


void storeEigenfaceImages()
{
// Store the average image to a file
printf("Saving the image of the average face as
'out_averageImage.bmp'.\n");
cvSaveImage("out_averageImage.bmp", pAvgTrainImg);
// Create a large image made of many eigenface images.
// Must also convert each eigenface image to a normal 8-bit
UCHAR image instead of a 32-bit float image.
printf("Saving the %d eigenvector images as
'out_eigenfaces.bmp'\n", nEigens);
if (nEigens > 0) {
// Put all the eigenfaces next to each other.
int COLUMNS = 8; // Put upto 8 images on a row.
int nCols = min(nEigens, COLUMNS);
int nRows = 1 + (nEigens / COLUMNS); // Put the rest
on new rows.
int w = eigenVectArr[0]->width;
int h = eigenVectArr[0]->height;
CvSize size;
size = cvSize(nCols * w, nRows * h);
IplImage *bigImg = cvCreateImage(size, IPL_DEPTH_8U, 1);
// 8-bit Greyscale UCHAR image
for (int i=0; i<nEigens; i++) {
// Get the eigenface image.
IplImage *byteImg =
convertFloatImageToUcharImage(eigenVectArr[i]);
// Paste it into the correct position.
int x = w * (i % COLUMNS);
int y = h * (i / COLUMNS);
CvRect ROI = cvRect(x, y, w, h);
cvSetImageROI(bigImg, ROI);
cvCopyImage(byteImg, bigImg);
cvResetImageROI(bigImg);
cvReleaseImage(&byteImg);
}
cvSaveImage("out_eigenfaces.bmp", bigImg);
cvReleaseImage(&bigImg);
}
}

// Train from the data in the given text file, and store the trained
data into the file 'facedata.xml'.
void learn(const char *szFileTrain)
{
int i, offset;

// load training data


printf("Loading the training images in '%s'\n", szFileTrain);
nTrainFaces = loadFaceImgArray(szFileTrain);
printf("Got %d training images.\n", nTrainFaces);
if( nTrainFaces < 2 )
{
fprintf(stderr,
"Need 2 or more training faces\n"
"Input file contains only %d\n", nTrainFaces);
return;
}

// do PCA on the training faces


doPCA();

// project the training images onto the PCA subspace


projectedTrainFaceMat = cvCreateMat( nTrainFaces, nEigens,
CV_32FC1 );
offset = projectedTrainFaceMat->step / sizeof(float);
for(i=0; i<nTrainFaces; i++)
{
//int offset = i * nEigens;
cvEigenDecomposite(
faceImgArr[i],
nEigens,
eigenVectArr,
0, 0,
pAvgTrainImg,
//projectedTrainFaceMat->data.fl + i*nEigens);
projectedTrainFaceMat->data.fl + i*offset);
}

// store the recognition data as an xml file


storeTrainingData();

// Save all the eigenvectors as images, so that they can be


checked.
if (SAVE_EIGENFACE_IMAGES) {
storeEigenfaceImages();
        }
}

// Open the training data from the file 'facedata.xml'.


int loadTrainingData(CvMat ** pTrainPersonNumMat)
{
CvFileStorage * fileStorage;
int i;

// create a file-storage interface


fileStorage = cvOpenFileStorage( "facedata.xml", 0,
CV_STORAGE_READ );
if( !fileStorage ) {

printf("Can't open training database file
'facedata.xml'.\n");
return 0;
}

// Load the person names.


personNames.clear(); // Make sure it starts as empty.
nPersons = cvReadIntByName( fileStorage, 0, "nPersons", 0 );
if (nPersons == 0) {
printf("No people found in the training database
'facedata.xml'.\n");
return 0;
}
// Load each person's name.
for (i=0; i<nPersons; i++) {
string sPersonName;
char varname[200];
snprintf( varname, sizeof(varname)-1, "personName_%d",
(i+1) );
sPersonName = cvReadStringByName(fileStorage, 0, varname
);
personNames.push_back( sPersonName );
}

// Load the data


nEigens = cvReadIntByName(fileStorage, 0, "nEigens", 0);
nTrainFaces = cvReadIntByName(fileStorage, 0, "nTrainFaces",
0);
*pTrainPersonNumMat = (CvMat *)cvReadByName(fileStorage, 0,
"trainPersonNumMat", 0);
eigenValMat = (CvMat *)cvReadByName(fileStorage, 0,
"eigenValMat", 0);
projectedTrainFaceMat = (CvMat *)cvReadByName(fileStorage, 0,
"projectedTrainFaceMat", 0);
pAvgTrainImg = (IplImage *)cvReadByName(fileStorage, 0,
"avgTrainImg", 0);
eigenVectArr = (IplImage **)cvAlloc(nTrainFaces*sizeof(IplImage
*));
for(i=0; i<nEigens; i++)
{
char varname[200];
snprintf( varname, sizeof(varname)-1, "eigenVect_%d", i
);
eigenVectArr[i] = (IplImage *)cvReadByName(fileStorage,
0, varname, 0);
}

// release the file-storage interface


cvReleaseFileStorage( &fileStorage );

printf("Training data loaded (%d training images of %d


people):\n", nTrainFaces, nPersons);
printf("People: ");
if (nPersons > 0)
printf("<%s>", personNames[0].c_str());
for (i=1; i<nPersons; i++) {
printf(", <%s>", personNames[i].c_str());
}
printf(".\n");

return 1;
}

// Save the training data to the file 'facedata.xml'.


void storeTrainingData()
{
CvFileStorage * fileStorage;
int i;

// create a file-storage interface


fileStorage = cvOpenFileStorage( "facedata.xml", 0,
CV_STORAGE_WRITE );

// Store the person names.


cvWriteInt( fileStorage, "nPersons", nPersons );
for (i=0; i<nPersons; i++) {
char varname[200];
snprintf( varname, sizeof(varname)-1, "personName_%d",
(i+1) );
cvWriteString(fileStorage, varname,
personNames[i].c_str(), 0);
}

// store all the data


cvWriteInt( fileStorage, "nEigens", nEigens );
cvWriteInt( fileStorage, "nTrainFaces", nTrainFaces );
cvWrite(fileStorage, "trainPersonNumMat", personNumTruthMat,
cvAttrList(0,0));
cvWrite(fileStorage, "eigenValMat", eigenValMat,
cvAttrList(0,0));
cvWrite(fileStorage, "projectedTrainFaceMat",
projectedTrainFaceMat, cvAttrList(0,0));
cvWrite(fileStorage, "avgTrainImg", pAvgTrainImg,
cvAttrList(0,0));
for(i=0; i<nEigens; i++)
{
char varname[200];
snprintf( varname, sizeof(varname)-1, "eigenVect_%d", i
);
cvWrite(fileStorage, varname, eigenVectArr[i],
cvAttrList(0,0));
}

// release the file-storage interface


cvReleaseFileStorage( &fileStorage );
}

// Find the most likely person based on a detection. Returns the
// index, and stores the confidence value into pConfidence.
int findNearestNeighbor(float * projectedTestFace, float
*pConfidence)
{
//double leastDistSq = 1e12;
double leastDistSq = DBL_MAX;
int i, iTrain, iNearest = 0;
for(iTrain=0; iTrain<nTrainFaces; iTrain++)
{
double distSq=0;

for(i=0; i<nEigens; i++)


{
float d_i = projectedTestFace[i] -
projectedTrainFaceMat->data.fl[iTrain*nEigens + i];
#ifdef USE_MAHALANOBIS_DISTANCE
distSq += d_i*d_i / eigenValMat->data.fl[i]; // Mahalanobis distance (might give better results than Euclidean distance)
#else
distSq += d_i*d_i; // Euclidean distance.
#endif
}

if(distSq < leastDistSq)


{
leastDistSq = distSq;
iNearest = iTrain;
}
}

// Return the confidence level based on the Euclidean distance,
// so that similar images should give a confidence between 0.5 to 1.0,
// and very different images should give a confidence between 0.0 to 0.5.
*pConfidence = 1.0f - sqrt( leastDistSq / (float)(nTrainFaces * nEigens) ) / 255.0f;

// Return the found index.


return iNearest;
}

// Do the Principal Component Analysis, finding the average image


// and the eigenfaces that represent any image in the given dataset.
void doPCA()
{
int i;
CvTermCriteria calcLimit;
CvSize faceImgSize;

// set the number of eigenvalues to use


nEigens = nTrainFaces-1;

// allocate the eigenvector images


faceImgSize.width = faceImgArr[0]->width;
faceImgSize.height = faceImgArr[0]->height;
eigenVectArr = (IplImage**)cvAlloc(sizeof(IplImage*) *
nEigens);
for(i=0; i<nEigens; i++)
eigenVectArr[i] = cvCreateImage(faceImgSize,
IPL_DEPTH_32F, 1);

// allocate the eigenvalue array


eigenValMat = cvCreateMat( 1, nEigens, CV_32FC1 );

// allocate the averaged image


pAvgTrainImg = cvCreateImage(faceImgSize, IPL_DEPTH_32F, 1);

// set the PCA termination criterion


calcLimit = cvTermCriteria( CV_TERMCRIT_ITER, nEigens, 1);

// compute average image, eigenvalues, and eigenvectors


cvCalcEigenObjects(
nTrainFaces,
(void*)faceImgArr,
(void*)eigenVectArr,
CV_EIGOBJ_NO_CALLBACK,
0,
0,
&calcLimit,
pAvgTrainImg,
eigenValMat->data.fl);

cvNormalize(eigenValMat, eigenValMat, 1, 0, CV_L1, 0);


}

// Read the names & image filenames of people from a text file, and
load all those images listed.
int loadFaceImgArray(const char * filename)
{
FILE * imgListFile = 0;
char imgFilename[512];
int iFace, nFaces=0;
int i;

// open the input file


if( !(imgListFile = fopen(filename, "r")) )
{
fprintf(stderr, "Can\'t open file %s\n", filename);
return 0;
}

// count the number of faces


while( fgets(imgFilename, sizeof(imgFilename)-1, imgListFile) )
++nFaces;
rewind(imgListFile);

// allocate the face-image array and person number matrix


faceImgArr = (IplImage **)cvAlloc(
nFaces*sizeof(IplImage *) );
personNumTruthMat = cvCreateMat( 1, nFaces, CV_32SC1 );

personNames.clear(); // Make sure it starts as empty.


nPersons = 0;

// store the face images in an array


for(iFace=0; iFace<nFaces; iFace++)
{
char personName[256];
string sPersonName;
int personNumber;
// Read person number (beginning with 1), their name and the image filename.
fscanf(imgListFile, "%d %s %s", &personNumber, personName, imgFilename);
sPersonName = personName;
//printf("Got %d: %d, <%s>, <%s>.\n", iFace, personNumber, personName, imgFilename);

// Check if a new person is being loaded.


if (personNumber > nPersons) {
// Allocate memory for the extra person (or possibly multiple), using this new person's name.
for (i=nPersons; i < personNumber; i++) {
personNames.push_back( sPersonName );
}
nPersons = personNumber;
//printf("Got new person <%s> -> nPersons = %d [%d]\n", sPersonName.c_str(), nPersons, personNames.size());
}

// Keep the data


personNumTruthMat->data.i[iFace] = personNumber;

// load the face image


faceImgArr[iFace] = cvLoadImage(imgFilename,
CV_LOAD_IMAGE_GRAYSCALE);

if( !faceImgArr[iFace] )
{
fprintf(stderr, "Can\'t load image from %s\n",
imgFilename);
return 0;
}
}

fclose(imgListFile);

printf("Data loaded from '%s': (%d images of %d people).\n",


filename, nFaces, nPersons);
printf("People: ");
if (nPersons > 0)
printf("<%s>", personNames[0].c_str());
for (i=1; i<nPersons; i++) {
printf(", <%s>", personNames[i].c_str());
}
printf(".\n");

return nFaces;
}

// Recognize the face in each of the test images given, and compare
// the results with the truth.
void recognizeFileList(const char *szFileTest)
{
int i, nTestFaces = 0; // the number of test images

CvMat * trainPersonNumMat = 0; // the person numbers during training
float * projectedTestFace = 0;
const char *answer;
int nCorrect = 0;
int nWrong = 0;
double timeFaceRecognizeStart;
double tallyFaceRecognizeTime;
float confidence;

// load test images and ground truth for person number


nTestFaces = loadFaceImgArray(szFileTest);
printf("%d test faces loaded\n", nTestFaces);

// load the saved training data


if( !loadTrainingData( &trainPersonNumMat ) ) return;

// project the test images onto the PCA subspace


projectedTestFace = (float *)cvAlloc( nEigens*sizeof(float) );
timeFaceRecognizeStart = (double)cvGetTickCount(); // Record the timing.
for(i=0; i<nTestFaces; i++)
{
int iNearest, nearest, truth;

// project the test image onto the PCA subspace


cvEigenDecomposite(
faceImgArr[i],
nEigens,
eigenVectArr,
0, 0,
pAvgTrainImg,
projectedTestFace);

iNearest = findNearestNeighbor(projectedTestFace,
&confidence);
truth = personNumTruthMat->data.i[i];
nearest = trainPersonNumMat->data.i[iNearest];

if (nearest == truth) {
answer = "Correct";
nCorrect++;
}
else {
answer = "WRONG!";
nWrong++;
}
printf("nearest = %d, Truth = %d (%s). Confidence =
%f\n", nearest, truth, answer, confidence);
}
tallyFaceRecognizeTime = (double)cvGetTickCount() -
timeFaceRecognizeStart;
if (nCorrect+nWrong > 0) {
printf("TOTAL ACCURACY: %d%% out of %d tests.\n",
nCorrect * 100/(nCorrect+nWrong), (nCorrect+nWrong));
printf("TOTAL TIME: %.1fms average.\n",
tallyFaceRecognizeTime/((double)cvGetTickFrequency() * 1000.0 *
(nCorrect+nWrong) ) );
}
}

// Grab the next camera frame. Waits until the next frame is ready,
// and provides direct access to it, so do NOT modify the returned image or free it!
// Will automatically initialize the camera on the first frame.
IplImage* getCameraFrame(void)
{
IplImage *frame;

// If the camera hasn't been initialized, then open it.


if (!camera) {
printf("Acessing the camera ...\n");
camera = cvCaptureFromCAM( 0 );
if (!camera) {
printf("ERROR in getCameraFrame(): Couldn't access
the camera.\n");
exit(1);
}
// Try to set the camera resolution
cvSetCaptureProperty( camera, CV_CAP_PROP_FRAME_WIDTH,
320 );
cvSetCaptureProperty( camera, CV_CAP_PROP_FRAME_HEIGHT,
240 );
// Wait a little, so that the camera can auto-adjust itself.
#if defined WIN32 || defined _WIN32
Sleep(1000); // (in milliseconds)
#endif
frame = cvQueryFrame( camera ); // get the first frame, to make sure the camera is initialized.
if (frame) {
printf("Got a camera using a resolution of %dx%d.\n", (int)cvGetCaptureProperty( camera, CV_CAP_PROP_FRAME_WIDTH), (int)cvGetCaptureProperty( camera, CV_CAP_PROP_FRAME_HEIGHT) );
}
}

frame = cvQueryFrame( camera );


if (!frame) {
fprintf(stderr, "ERROR in recognizeFromCam(): Could not access the camera or video file.\n");
exit(1);
//return NULL;
}
return frame;
}

// Return a new image that is always greyscale, whether the input image was RGB or Greyscale.
// Remember to free the returned image using cvReleaseImage() when finished.
IplImage* convertImageToGreyscale(const IplImage *imageSrc)
{
IplImage *imageGrey;
// Either convert the image to greyscale, or make a copy of the existing greyscale image.
// This is to make sure that the user can always call cvReleaseImage() on the output, whether it was greyscale or not.
if (imageSrc->nChannels == 3) {
imageGrey = cvCreateImage( cvGetSize(imageSrc),
IPL_DEPTH_8U, 1 );
cvCvtColor( imageSrc, imageGrey, CV_BGR2GRAY );
}
else {
imageGrey = cvCloneImage(imageSrc);
}
return imageGrey;
}

// Creates a new image copy that is of a desired size.


// Remember to free the new image later.
IplImage* resizeImage(const IplImage *origImg, int newWidth, int
newHeight)
{
IplImage *outImg = 0;
int origWidth;
int origHeight;
if (origImg) {
origWidth = origImg->width;
origHeight = origImg->height;
}
if (newWidth <= 0 || newHeight <= 0 || origImg == 0 ||
origWidth <= 0 || origHeight <= 0) {
printf("ERROR in resizeImage: Bad desired image size of
%dx%d\n.", newWidth, newHeight);
exit(1);
}

// Scale the image to the new dimensions, even if the aspect ratio will be changed.
outImg = cvCreateImage(cvSize(newWidth, newHeight), origImg->depth, origImg->nChannels);
if (newWidth > origImg->width && newHeight > origImg->height) {
// Make the image larger
cvResetImageROI((IplImage*)origImg);
cvResize(origImg, outImg, CV_INTER_LINEAR); // CV_INTER_CUBIC or CV_INTER_LINEAR is good for enlarging.
}
else {
// Make the image smaller
cvResetImageROI((IplImage*)origImg);
cvResize(origImg, outImg, CV_INTER_AREA); // CV_INTER_AREA is good for shrinking / decimation, but bad at enlarging.
}

return outImg;
}

// Returns a new image that is a cropped version of the original image.
IplImage* cropImage(const IplImage *img, const CvRect region)
{
IplImage *imageTmp;
IplImage *imageRGB;
CvSize size;
size.height = img->height;
size.width = img->width;

if (img->depth != IPL_DEPTH_8U) {
printf("ERROR in cropImage: Unknown image depth of %d given in cropImage() instead of 8 bits per pixel.\n", img->depth);
exit(1);
}

// First create a new (color or greyscale) IPL Image and copy the contents of img into it.
imageTmp = cvCreateImage(size, IPL_DEPTH_8U, img->nChannels);
cvCopy(img, imageTmp, NULL);

// Create a new image of the detected region.
// Set region of interest to that surrounding the face.
cvSetImageROI(imageTmp, region);
// Copy region of interest (i.e. face) into a new IplImage (imageRGB) and return it.
size.width = region.width;
size.height = region.height;
imageRGB = cvCreateImage(size, IPL_DEPTH_8U, img->nChannels);
cvCopy(imageTmp, imageRGB, NULL); // Copy just the region.

cvReleaseImage( &imageTmp );
return imageRGB;
}

// Get an 8-bit equivalent of the 32-bit Float image.
// Returns a new image, so remember to call 'cvReleaseImage()' on the result.
IplImage* convertFloatImageToUcharImage(const IplImage *srcImg)
{
IplImage *dstImg = 0;
if ((srcImg) && (srcImg->width > 0 && srcImg->height > 0)) {

// Spread the 32bit floating point pixels to fit within the 8bit pixel range.
double minVal, maxVal;
cvMinMaxLoc(srcImg, &minVal, &maxVal);

//cout << "FloatImage:(minV=" << minVal << ", maxV=" <<


maxVal << ")." << endl;

// Deal with NaN and extreme values, since the DFT seems to give some NaN results.
if (cvIsNaN(minVal) || minVal < -1e30)
minVal = -1e30;
if (cvIsNaN(maxVal) || maxVal > 1e30)
maxVal = 1e30;
if (maxVal-minVal == 0.0f)
maxVal = minVal + 0.001; // remove potential divide-by-zero errors.
// Convert the format
dstImg = cvCreateImage(cvSize(srcImg->width, srcImg->height), 8, 1);
cvConvertScale(srcImg, dstImg, 255.0 / (maxVal - minVal), - minVal * 255.0 / (maxVal-minVal));
}
return dstImg;
}

// Store a greyscale floating-point CvMat image into a BMP/JPG/GIF/PNG image,
// since cvSaveImage() can only handle 8bit images (not 32bit float images).
void saveFloatImage(const char *filename, const IplImage *srcImg)
{
//cout << "Saving Float Image '" << filename << "' (" <<
srcImg->width << "," << srcImg->height << "). " << endl;
IplImage *byteImg = convertFloatImageToUcharImage(srcImg);
cvSaveImage(filename, byteImg);
cvReleaseImage(&byteImg);
}

// Perform face detection on the input image, using the given Haar cascade classifier.
// Returns a rectangle for the detected region in the given image.
CvRect detectFaceInImage(const IplImage *inputImg, const
CvHaarClassifierCascade* cascade )
{
const CvSize minFeatureSize = cvSize(20, 20);
const int flags = CV_HAAR_FIND_BIGGEST_OBJECT |
CV_HAAR_DO_ROUGH_SEARCH; // Only search for 1 face.
const float search_scale_factor = 1.1f;
IplImage *detectImg;
IplImage *greyImg = 0;
CvMemStorage* storage;
CvRect rc;
double t;
CvSeq* rects;
int i;

storage = cvCreateMemStorage(0);
cvClearMemStorage( storage );

// If the image is color, use a greyscale copy of the image.
detectImg = (IplImage*)inputImg; // Assume the input image is to be used.
if (inputImg->nChannels > 1)
{
greyImg = cvCreateImage(cvSize(inputImg->width, inputImg->height), IPL_DEPTH_8U, 1 );
cvCvtColor( inputImg, greyImg, CV_BGR2GRAY );
detectImg = greyImg; // Use the greyscale version as the input.
}

// Detect all the faces.


t = (double)cvGetTickCount();
rects = cvHaarDetectObjects( detectImg,
(CvHaarClassifierCascade*)cascade, storage,
search_scale_factor, 3, flags, minFeatureSize
);
t = (double)cvGetTickCount() - t;
printf("[Face Detection took %d ms and found %d objects]\n",
cvRound( t/((double)cvGetTickFrequency()*1000.0) ), rects->total );

// Get the first detected face (the biggest).


if (rects->total > 0) {
rc = *(CvRect*)cvGetSeqElem( rects, 0 );
}
else
rc = cvRect(-1,-1,-1,-1); // Couldn't find the face.

//cvReleaseHaarClassifierCascade( &cascade );
//cvReleaseImage( &detectImg );
if (greyImg)
cvReleaseImage( &greyImg );
cvReleaseMemStorage( &storage );

return rc; // Return the biggest face found, or (-1,-1,-1,-1).


}

// Re-train the new face rec database without shutting down.
// Depending on the number of images in the training set and number of people, it might take 30 seconds or so.
CvMat* retrainOnline(void)
{
CvMat *trainPersonNumMat;
int i;

// Free & Re-initialize the global variables.


if (faceImgArr) {
for (i=0; i<nTrainFaces; i++) {
if (faceImgArr[i])
cvReleaseImage( &faceImgArr[i] );
}
}
cvFree( &faceImgArr ); // array of face images
cvFree( &personNumTruthMat ); // array of person numbers
personNames.clear(); // array of person names
(indexed by the person number).
nPersons = 0; // the number of people in the training set.
nTrainFaces = 0; // the number of training images
nEigens = 0; // the number of eigenvalues
cvReleaseImage( &pAvgTrainImg ); // the average image
for (i=0; i<nTrainFaces; i++) {
if (eigenVectArr[i])
cvReleaseImage( &eigenVectArr[i] );
}
cvFree( &eigenVectArr ); // eigenvectors
cvFree( &eigenValMat ); // eigenvalues
cvFree( &projectedTrainFaceMat ); // projected training faces

// Retrain from the data in the files


printf("Retraining with the new person ...\n");
learn("train.txt");
printf("Done retraining.\n");

// Load the previously saved training data


if( !loadTrainingData( &trainPersonNumMat ) ) {
printf("ERROR in recognizeFromCam(): Couldn't load the
training data!\n");
exit(1);
}

return trainPersonNumMat;
}

// Continuously recognize the person in the camera.


void recognizeFromCam(void)
{
int i;
CvMat * trainPersonNumMat; // the person numbers during training
float * projectedTestFace;
double timeFaceRecognizeStart;
double tallyFaceRecognizeTime;
CvHaarClassifierCascade* faceCascade;
char cstr[256];
BOOL saveNextFaces = FALSE;
char newPersonName[256];
int newPersonFaces;

trainPersonNumMat = 0; // the person numbers during training


projectedTestFace = 0;
saveNextFaces = FALSE;
newPersonFaces = 0;

printf("Recognizing person in the camera ...\n");

// Load the previously saved training data


if( loadTrainingData( &trainPersonNumMat ) ) {
faceWidth = pAvgTrainImg->width;
faceHeight = pAvgTrainImg->height;
}
else {
//printf("ERROR in recognizeFromCam(): Couldn't load the
training data!\n");
//exit(1);
}

// Project the test images onto the PCA subspace


projectedTestFace = (float *)cvAlloc( nEigens*sizeof(float) );

// Create a GUI window for the user to see the camera image.
cvNamedWindow("Input", CV_WINDOW_AUTOSIZE);

// Make sure there is a "data" folder, for storing the new


person.
#if defined WIN32 || defined _WIN32
mkdir("data");
#else
// For Linux, make the folder Read-Write-Executable for this user & group but only Readable for others.
mkdir("data", S_IRWXU | S_IRWXG | S_IROTH);
#endif

// Load the HaarCascade classifier for face detection.


faceCascade =
(CvHaarClassifierCascade*)cvLoad(faceCascadeFilename, 0, 0, 0 );
if( !faceCascade ) {
printf("ERROR in recognizeFromCam(): Could not load Haar
cascade Face detection classifier in '%s'.\n", faceCascadeFilename);
exit(1);
}

// Tell the Linux terminal to return the 1st keypress instead of waiting for an ENTER key.
changeKeyboardMode(1);

timeFaceRecognizeStart = (double)cvGetTickCount(); // Record the timing.

while (1)
{
int iNearest, nearest, truth;
IplImage *camImg;
IplImage *greyImg;
IplImage *faceImg;
IplImage *sizedImg;
IplImage *equalizedImg;
IplImage *processedFaceImg;
CvRect faceRect;
IplImage *shownImg;
int keyPressed = 0;
FILE *trainFile;
float confidence;

// Handle non-blocking keyboard input in the console.


if (kbhit())
keyPressed = getch();

if (keyPressed == VK_ESCAPE) { // Check if the user hit the 'Escape' key.
break; // Stop processing input.
}
switch (keyPressed) {
case 'n': // Add a new person to the training set.
// Train from the following images.
printf("Enter your name: ");
strcpy(newPersonName, "newPerson");

// Read a string from the console. Waits until they hit ENTER.
changeKeyboardMode(0);
fgets(newPersonName, sizeof(newPersonName)-1,
stdin);
changeKeyboardMode(1);
// Remove 1 or 2 newline characters if they were appended (eg: Linux).
i = strlen(newPersonName);

if (i > 0 && (newPersonName[i-1] == 10 ||
newPersonName[i-1] == 13)) {
newPersonName[i-1] = 0;
i--;
}
if (i > 0 && (newPersonName[i-1] == 10 ||
newPersonName[i-1] == 13)) {
newPersonName[i-1] = 0;
i--;
}

if (i > 0) {
printf("Collecting all images until you hit 't', to start Training the images as '%s' ...\n", newPersonName);
newPersonFaces = 0; // restart training a new person
saveNextFaces = TRUE;
}
else {
printf("Did not get a valid name from you, so will ignore it. Hit 'n' to retry.\n");
}
break;
case 't': // Start training
saveNextFaces = FALSE; // stop saving next faces.
// Store the saved data into the training file.
printf("Storing the training data for new person '%s'.\n", newPersonName);
// Append the new person to the end of the training data.
trainFile = fopen("train.txt", "a");
for (i=0; i<newPersonFaces; i++) {
snprintf(cstr, sizeof(cstr)-1, "data/%d_%s%d.pgm", nPersons+1, newPersonName, i+1);
fprintf(trainFile, "%d %s %s\n", nPersons+1, newPersonName, cstr);
}
fclose(trainFile);

// Now there is one more person in the database, ready for retraining.
//nPersons++;

//break;
//case 'r':

// Re-initialize the local data.


projectedTestFace = 0;
saveNextFaces = FALSE;
newPersonFaces = 0;

// Retrain from the new database without shutting down.
// Depending on the number of images in the training set and number of people, it might take 30 seconds or so.
cvFree( &trainPersonNumMat ); // Free the
previous data before getting new data
trainPersonNumMat = retrainOnline();
// Project the test images onto the PCA
subspace
cvFree(&projectedTestFace); // Free the
previous data before getting new data
projectedTestFace = (float *)cvAlloc(
nEigens*sizeof(float) );

printf("Recognizing person in the camera


...\n");
continue; // Begin with the next frame.
break;
}

// Get the camera frame


camImg = getCameraFrame();
if (!camImg) {
printf("ERROR in recognizeFromCam(): Bad input image!\n");
exit(1);
}
// Make sure the image is greyscale, since Eigenfaces is only done on greyscale images.
greyImg = convertImageToGreyscale(camImg);

// Perform face detection on the input image, using the given Haar cascade classifier.
faceRect = detectFaceInImage(greyImg, faceCascade );
// Make sure a valid face was detected.
if (faceRect.width > 0) {
faceImg = cropImage(greyImg, faceRect); // Get the detected face image.
// Make sure the image is the same dimensions as the training images.
sizedImg = resizeImage(faceImg, faceWidth, faceHeight);
// Give the image a standard brightness and contrast, in case it was too dark or low contrast.
equalizedImg = cvCreateImage(cvGetSize(sizedImg), 8, 1); // Create an empty greyscale image.
cvEqualizeHist(sizedImg, equalizedImg);
processedFaceImg = equalizedImg;
if (!processedFaceImg) {
printf("ERROR in recognizeFromCam(): Don't have input image!\n");
exit(1);
}

// If the face rec database has been loaded, then try to recognize the person currently detected.
if (nEigens > 0) {
// Project the test image onto the PCA subspace.
cvEigenDecomposite(
processedFaceImg,
nEigens,
eigenVectArr,
0, 0,
pAvgTrainImg,
projectedTestFace);

// Check which person it is most likely to be.
iNearest = findNearestNeighbor(projectedTestFace, &confidence);
nearest = trainPersonNumMat->data.i[iNearest];
if (confidence > 0.82)
{

printf("Most likely person in camera: '%s'


(confidence=%f).\n", personNames[nearest-1].c_str(), confidence);
}
else
{
printf("UnAuthrized Person \n");
}

}//endif nEigens

// Possibly save the processed face to the training set.
if (saveNextFaces) {
// MAYBE GET IT TO ONLY TRAIN SOME IMAGES ?
// Use a different filename each time.
snprintf(cstr, sizeof(cstr)-1, "data/%d_%s%d.pgm", nPersons+1, newPersonName, newPersonFaces+1);
printf("Storing the current face of '%s' into image '%s'.\n", newPersonName, cstr);
cvSaveImage(cstr, processedFaceImg, NULL);
newPersonFaces++;
}

// Free the resources used for this frame.


cvReleaseImage( &greyImg );
cvReleaseImage( &faceImg );
cvReleaseImage( &sizedImg );
cvReleaseImage( &equalizedImg );
}

// Show the data on the screen.


shownImg = cvCloneImage(camImg);
if (faceRect.width > 0) { // Check if a face was detected.
// Show the detected face region.
cvRectangle(shownImg, cvPoint(faceRect.x, faceRect.y), cvPoint(faceRect.x + faceRect.width-1, faceRect.y + faceRect.height-1), CV_RGB(0,255,0), 1, 8, 0);
if (nEigens > 0) { // Check if the face recognition database is loaded and a person was recognized.
// Show the name of the recognized person, overlayed on the image below their face.
CvFont font;
cvInitFont(&font, CV_FONT_HERSHEY_PLAIN, 1.0, 1.0, 0, 1, CV_AA);
CvScalar textColor = CV_RGB(0,255,255); // light blue text
char text[256];
if (confidence > 0.82)
{

snprintf(text, sizeof(text)-1, "Name: '%s'",


personNames[nearest-1].c_str());
cvPutText(shownImg, text, cvPoint(faceRect.x,
faceRect.y + faceRect.height + 15), &font, textColor);
snprintf(text, sizeof(text)-1, "Confidence:
%f", confidence);
cvPutText(shownImg, text, cvPoint(faceRect.x,
faceRect.y + faceRect.height + 30), &font, textColor);
}

else
{
snprintf(text, sizeof(text)-1, "Unauthorised Person:
'%s'");
cvPutText(shownImg, text, cvPoint(faceRect.x,
faceRect.y + faceRect.height + 15), &font, textColor);
snprintf(text, sizeof(text)-1, "Confidence:
%f", confidence);
cvPutText(shownImg, text, cvPoint(faceRect.x,
faceRect.y + faceRect.height + 30), &font, textColor);

}
}
}

// Display the image.


cvShowImage("Input", shownImg);

// Give some time for OpenCV to draw the GUI and check if the user has pressed something in the GUI window.
keyPressed = cvWaitKey(10);
if (keyPressed == VK_ESCAPE) { // Check if the user hit the 'Escape' key in the GUI window.
break; // Stop processing input.
}

cvReleaseImage( &shownImg );
}
tallyFaceRecognizeTime = (double)cvGetTickCount() -
timeFaceRecognizeStart;

// Reset the Linux terminal back to the original settings.


changeKeyboardMode(0);

// Free the camera and memory resources used.


cvReleaseCapture( &camera );
cvReleaseHaarClassifierCascade( &faceCascade );
}

6.2 Hardware Section

6.2.1 Hardware Requirements


The video capture device should support a resolution of 320 x 240 and at least 3-5 frames/sec (a minimal capture-configuration sketch follows this list).
More frames/sec leads to better performance.
For slightly greater distances there is a strong correlation between camera quality and recognition accuracy.
An adequate GPU and processor speed are key components.
The face recognition system does not work well on a standard off-the-shelf PC.
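
As a minimal sketch of how these capture settings are requested, using the same OpenCV 2.x C API as the recognition code above; the width, height and FPS values are only hints, and the webcam driver may substitute the nearest mode it actually supports:

// Minimal capture-configuration sketch (OpenCV C API).
#include <opencv/highgui.h>
#include <stdio.h>

int main(void)
{
    CvCapture *cap = cvCaptureFromCAM(0);              // open the default camera
    if (!cap) { printf("No camera found.\n"); return 1; }
    cvSetCaptureProperty(cap, CV_CAP_PROP_FRAME_WIDTH,  320);
    cvSetCaptureProperty(cap, CV_CAP_PROP_FRAME_HEIGHT, 240);
    cvSetCaptureProperty(cap, CV_CAP_PROP_FPS, 15);    // requested, not guaranteed
    IplImage *frame = cvQueryFrame(cap);               // grab one frame to confirm the mode
    if (frame)
        printf("Capturing at %dx%d.\n", frame->width, frame->height);
    cvReleaseCapture(&cap);
    return 0;
}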

6.2.2 HARDWARE USED


AMD BULLDOZER FX-4100
BIOSTAR MOTHERBOARD
GRAPHICS CARD HD Radeon 7770 OC
COOLING SYSTEMS
1. GPU COOLING
2. PROCESSOR COOLING
3. MOTHERBOARD COOLING
WEB CAMERA
MONITOR
KEY BOARD
MOUSE
BUZZER

6.2.2.1 AMD BULLDOZER FX-4100

Factory clock 3.6 GHz
Overclock stable up to 4 GHz
4 physical cores
Bulldozer architecture
95 W power rating (TDP)
32 nm technology
The FX series holds the Guinness World Record for the highest CPU clock frequency

6.2.2.2 BIOSTAR MOTHERBOARD

AM3+ Socket

Ethernet port

HDMI / DVI / VGA Ports

SATA Ports (4)

Supports DDR3 RAM

PCI Ports (2)/PCI Express

USB connectors (2)/USB port (4)

Supports HD Audio /5.1ch

6.2.2.3 GRAPHICS CARD HD Radeon 7770 OC

Manufactured by ATI
Core clock 1.1 GHz
1 GB GDDR5 memory
DirectX 11.1
OpenGL support
Vapour-X cooling technology

6.2.2.4 COOLING SYSTEMS
Computers run hot when processing large volumes of data, as is the case here with face detection and recognition. For increased efficiency we deploy specialised cooling methods such as a copper-tube water assembly or more efficient commercially available variants. Our processor, rated at around 95 W and working 24x7, would reach extreme temperatures, so it needs a better cooling system than the conventional convection-type air cooler supplied with it; the graphics card has its own separate cooling arrangement, the Vapour-X technology, whose details are covered below.
1. GPU Cooling
2. Processor Cooling

GPU Cooling

In the Vapour-X device used on ATI cards, the cooling scheme normally deployed for laptop processors, a water-filled copper-tube assembly, is applied to the GPU core. As gamers know, an overheated graphics card is a nasty problem: the processor is ready to go on, but the card will not let us continue. This technique delivers constant cooling to the GPU core, so there is no need to stop and let the card cool down. The heated fluid rises through the tube by capillary action and is cooled by a conventional heat-sink and fan assembly; the working principle is essentially that of heat pipes.

Processor cooling

The processor needs to be kept at the prescribed temperature to ensure maximum life and proper working. The picture below shows a conventional laptop processor cooling system; laptops are designed to work with maximum efficiency yet have very little space for a heat-sink assembly, so a bulky heat sink of the kind used in a desktop is not an option.

A laptop processor cooling system (copper-water usually)

Liquid cooling
To ensure maximum cooling efficiency we use liquid cooling as an alternative to conventional cooling; since the system will be up and running 24x7, a reliable cooling system is a must. We deploy the Antec Kühler H2O 620 liquid cooler, the best-rated cooler in the under-5k price range. A photo of the assembly is given overleaf.

6.2.2.5 Implementation Challenges

Implementing on the BeagleBoard


Environmental Factors

Implementing on the BeagleBoard

In the early stages of project planning we were inclined towards the well-known BeagleBoard development platform: an ARM Cortex-A8 processor, 512 MB of RAM, an HDMI port and an audio jack, so everything we needed was already built in, and we chose the BeagleBoard as the implementation platform. However, when we began to actually implement and debug the code on the board, performance was disappointing: there was a lag of 2-5 seconds between images.
We therefore had to rethink our strategy. We decided instead to implement the code on a custom-built PC assembled by ourselves, and this approach succeeded; the resulting machine runs 24x7 without a hitch.
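
For reference, the per-frame lag we observed can be quantified with the same tick-count routines already used in the recognition code; the fragment below is only an illustrative sketch, and processOneFrame() is a hypothetical stand-in for the detection-plus-recognition work done on each frame:

// Illustrative sketch: measure how long one detection + recognition pass takes.
// processOneFrame() is a hypothetical stand-in for the per-frame work in recognizeFromCam().
void timeOneFrame(IplImage *frame)
{
    double t0 = (double)cvGetTickCount();
    processOneFrame(frame);                            // hypothetical: detect + recognise this frame
    double ms = ((double)cvGetTickCount() - t0) / (cvGetTickFrequency() * 1000.0);
    printf("Frame processed in %.1f ms (%.1f fps equivalent)\n", ms, 1000.0 / ms);
}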

Environmental challenges
In our code we perform histogram equalisation so that brightness variations in images taken, say, at different times of day or against different backgrounds do not affect the recognition patterns. Even so, environmental factors, chiefly light intensity, can make recognition efficiency drop below 50%, even when more than 55 pictures of the subject to be recognised are used for training; since greyscale images are used for both detection and recognition, brightness plays a major role (the preprocessing involved is sketched below). The alternatives are to use a different algorithm such as AAM (Active Appearance Models), or to set up a booth in the clearance area where passengers check in and out, so that the luminance is under our control.
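
The preprocessing mentioned above can be summarised in a short sketch, assuming the OpenCV C API used throughout; the 92 x 112 training size and the function name preprocessFace() are hypothetical examples (the real code takes the size from the average training image):

// Minimal preprocessing sketch, mirroring the steps in recognizeFromCam():
// greyscale conversion, resize to the training size, then histogram equalisation
// to normalise overall brightness and contrast.
IplImage* preprocessFace(const IplImage *colourFrame)
{
    IplImage *grey      = cvCreateImage(cvGetSize(colourFrame), IPL_DEPTH_8U, 1);
    cvCvtColor(colourFrame, grey, CV_BGR2GRAY);
    IplImage *sized     = cvCreateImage(cvSize(92, 112), IPL_DEPTH_8U, 1); // hypothetical size
    cvResize(grey, sized, CV_INTER_AREA);
    IplImage *equalised = cvCreateImage(cvGetSize(sized), IPL_DEPTH_8U, 1);
    cvEqualizeHist(sized, equalised);
    cvReleaseImage(&grey);
    cvReleaseImage(&sized);
    return equalised; // caller releases the result
}

Histogram equalisation only normalises the overall grey-level distribution of the face region; it cannot compensate for strongly directional lighting, which is why illumination still degrades accuracy as described above.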

CHAPTER 7 - TEST AND TRIAL

CHAPTER 8 APPLICATIONS OF THE PROPOSED SYSTEM

8.1 Law enforcement and justice solutions:


Today's law enforcement agencies are looking for innovative technologies to help
them stay one step ahead of the world's ever-advancing criminals.
Two such solutions are CABS (the computerized arrest and booking system) and ChildBase protection, software for global law enforcement agencies that helps protect and recover missing and sexually exploited children, particularly as it relates to child pornography.

8.2 CABS:
Store all offence-related details in one easy-to-use system -- data is entered once and only once.
Integrate with any database -- including other detachments and other applications
(RMS, CAD, Jail Management systems, and "most-wanted" databases) .
Link victims to offenders -- to aid in criminal analysis and investigations
Capture and store digital images of the offender -- encode all mug shots, marks,
tattoos, and scars
Perform rapid and accurate searches -- on all data and image fields for crime
statistics and reporting
Produce digital line-ups -- using any stored image in minutes
Identify previous offenders -- pre-integrated with advanced biometric face
recognition software.

8.3 Child base protection:


Child Base is an application that helps protect and recover missing and sexually-
exploited children, particularly those children victimized through child abuse
images.

8.4 Identification solutions:


With regard to primary identification documents (passports, driver's licenses, and ID
cards), the use of face recognition for identification programs has several advantages
over other biometric technologies.
Leverage your existing identification infrastructure. This includes using existing
photo databases and the existing enrolment technology (e.g. cameras and capture
stations); and
Increase the public's cooperation by using a process (taking a picture of one's face)
that is already accepted and expected;
Integrate with terrorist watch lists, including regional, national, and international
"most-wanted" databases.

8.5 Homeland defence:

Since the terrorist events of September 11, 2001, the world has paid much more
attention to the idea of Homeland Defence, and both governments and private
industries alike are committed to the cause of national defence.
This includes everything from preventing terrorists from boarding aircraft, to
protecting critical infrastructure from attack or tampering (e.g. dams, bridges,
water reservoirs, energy plants, etc.), to the identification of known terrorists.

8.6 Airport security:


Airport and other transportation terminal security is not a new thing. People have
long had to pass through metal detectors before they boarded a plane, been subject
to questioning by security personnel, and restricted from entering "secure" areas.
What has changed is the vigilance in which these security efforts are being
applied.
The use of biometric identification can enhance security efforts already underway
at most airports and other major transportation hubs (seaports, train stations, etc.).
This includes the identification of known terrorists before they get onto an
airplane or into a secure location.

8.7 Immigration:
Most countries do not want to be perceived as being a "weak link" when it comes
to accepting immigrants and refugees, particularly if that individual uses the new
country as a staging ground for multi-national criminal and terrorist activities.
Consequently, governments around the world are examining their immigration
policies and procedures.
Biometric technology, particularly face recognition software, can enhance the
effectiveness of immigration and customs personnel. After all, to the human eye it
is often difficult to determine a person's identity by looking at a photo, especially
if the person has aged, is of a different ethnic background, has altered their hair
style, shaved their beard, etc.

CHAPTER 9- CONCLUSION

The human face plays an important role in our social interactions, conveying people's identities. Using the human face as a key to security, biometric face recognition technology has received significant attention in the past several years due to its potential for a wide variety of applications, both law-enforcement and otherwise.

Compared with other biometric systems based on palm prints, fingerprints and iris recognition, face recognition has a significant advantage in being a non-contact process. Face images can be captured at a distance without touching the person being identified, and identification does not require interaction with the person. In addition, face recognition serves a crime-deterrent purpose, because face images that have been recorded and archived can later help identify a person.

Our system can identify human faces in a picture or a video. We apply it as a security system at an airport, to detect offenders travelling under a fake identity or holding multiple passports; with a system like ours these things can be done easily. The scope of the system is not limited to this, and it could be further improved.

REFERENCES

"A Real-time Face Detection and Recognition System", IEEE, 2012

Gary Bradski and Adrian Kaehler, "Learning OpenCV: Computer Vision with the OpenCV Library"

Faraz Rasheed, "C# School"

Paul Viola and Michael J. Jones, "Robust Real-Time Face Detection"

Andrew King, "A Survey of Methods for Face Detection"

APPENDIX -1

COST ESTIMATION

Component Name Quantity Price


Samsung LED monitor 1 5250/-
Cooler Master Elite cabinet 1 2500/-
SMPS 500 W 1 3500/-
DDR3 RAM 4 GB (2 x 2 GB) 2 1200/-
AMD FX-4100 1 6850/-
Motherboard 1 5250/-
ATI Radeon 7770 OC 1 9998/-
Antec Kühler H2O 620 1 4560/-
Logitech webcam 1 1500/-
Visual Studio 25737.63/-
Windows 8 3499/-


