See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/261703721
All content following this page was uploaded by Anand Raj on 18 April 2014.
ACKNOWLEDGEMENT
It is with great pleasure and pride that we present this report. At this moment of
triumph, it would be unfair to neglect all those who helped us in the successful completion
of this project.
First of all, we thank God Almighty for His everlasting love and for the blessings and
courage He gave us, which made it possible to see through the turbulence and set us on
the right path. We would like to thank our Principal, Dr. Z. A. Zoya, for providing the
proper ambience to go on with the project. We would also like to thank our Head of the
Department, Mrs. Ananda Resmi, for all the help and guidance that she provided. We are
grateful to our project coordinator, Mr. Sudheer V. R., Assistant Professor, Department of
Electronics and Communication, for his guidance and wholehearted support.
We take this opportunity to thank our friends, who were always a source of
encouragement.
ABSTRACT
Here we try to understand the working and implementation of a multiple face detection
and recognition system, strictly at an undergraduate level of understanding. This covers
familiarization with the topic, details of the creation of a working model, and the testing
of the same. The scope of the discussion is to understand the working of the face
recognition system and its implementation model. The project is designed to improve
automated security systems: it identifies people in real time and sounds an alarm when a
person is recognized as dangerous by law enforcement agencies or by the user himself.
In essence, by implementing this system we can alert the user when a person recorded in
the database comes under our surveillance camera.
TABLE OF CONTENTS
ACKNOWLEDGEMENT
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
CHAPTER 2 HISTORY
CHAPTER 3 OVERVIEW OF THE SYSTEM
CHAPTER 4 DESCRIPTION
CHAPTER 5 REQUIREMENTS
CHAPTER 6 IMPLEMENTATION DETAILS
CHAPTER 7 TEST AND TRIAL
CHAPTER 8 APPLICATION OF THE PROPOSED SYSTEM
CHAPTER 9 CONCLUSION
REFERENCE
APPENDIX
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
Human face recognition has drawn considerable attention from the researchers in recent
years. An automatic face recognition system will find many applications in areas such as
human-computer interfaces, model-based video coding and security control systems. In
addition, face recognition has the potential of being a non-intrusive form of biometric
identification.
The difficulties of face recognition lie in the inherent variability arising from face
characteristics (age, gender and race), geometry (distance and viewpoint), image quality
(resolution, illumination, signal to noise ratio), and image content (background, occlusion
and disguise). Because of such complexity, most face recognition systems to date assume
a well-controlled environment and recognize only near frontal faces. However, these
constraints need to be relaxed in practice. Also, in applications such as video database
search, a person's face can appear against arbitrary backgrounds with unknown size and
orientation. Thus there is a need for robust face recognition systems that can handle these
uncertainties.
People have an amazing ability to recognize and remember thousands of faces. The face
is an important part of who you are and how people identify you. While humans have had
the innate ability to recognize and distinguish faces for millions of years, computers are
only just catching up.
Face recognition is a fascinating problem with important commercial applications such as
mug-shot matching, crowd surveillance and witness face reconstruction. In computer
vision, most of the popular face recognition algorithms have been biologically motivated.
Using these models, researchers can quantify the similarity between faces: images whose
projections are close in face space are likely to be from the same individual. Comparing
the results of these models with human perception shows whether distance in face space
corresponds to the human notion of facial similarity. Biometrics is used for that purpose.
The security system detects faces in live video, recognizes them, and sounds an alarm in
case of a security breach. The system uses OpenCV as the image processing tool and a
server system as the hardware. Since our system is a real-time one, we need to select an
accurate and fast algorithm. Of the several algorithms available, the most promising for
face detection is Viola-Jones using AdaBoost (~95% accuracy), and for recognition, PCA
Eigenfaces (~75% accuracy).
CHAPTER 2: HISTORY OF FACE RECOGNITION
1960s
First semi-automated system
The first semi-automated facial recognition programs were created by Woody Bledsoe,
Helen Chan Wolf, and Charles Bisson. Their programs required the administrator to
locate features such as the eyes, ears, nose, and mouth on the photograph. It then
calculated distances and ratios to a common reference point which was then compared to
reference data.
1970s
Goldstein, Harmon, and Lesk used 21 specific subjective markers, such as hair color and
lip thickness, to automate the recognition, though the measurements still had to be
computed by hand.
1988
Kirby and Sirovich applied principal component analysis, a standard linear algebra
technique, to the face recognition problem. It is considered a milestone because it showed
that fewer than one hundred values were required to accurately code a suitably aligned
and normalized face.
CHAPTER 3 OVERVIEW OF THE SYSTEM
Overview
The system operates in real time, using Viola-Jones with AdaBoost (~95% accuracy) for
face detection, PCA Eigenfaces (~75% accuracy) for recognition, and server hardware.
BLOCK DIAGRAM
[Figure: a web camera feeds the AMD FX4100 server over a USB port; the outputs are a
monitor on the HDMI port, a buzzer, and speech out.]
CHAPTER 4 DESCRIPTION
Face Detection
[Figure: the detector takes an image region and outputs a yes/no face decision.]
Approaches to finding faces:
- Finding faces in images with a controlled background
- Finding faces by color
- Finding faces by motion
- Using a mixture of the above
Approaches to finding faces in unconstrained scenes:
- Neural net approach
- Neural nets using statistical cluster information
- Model-based face tracking
- Weak classifier cascades
We use the Viola-Jones method (a weak classifier cascade) for face detection because it
gives about 95% accuracy.
The features that Viola and Jones used are based on Haar wavelets. Haar wavelets are
single wavelength square waves (one high interval and one low interval). In two
dimensions, a square wave is a pair of adjacent rectangles - one light and one dark.
The actual rectangle combinations used for visual object detection are not true Haar
wavelets. Instead, they contain rectangle combinations better suited to visual recognition
tasks. Because of that difference, these features are called Haar features, or Haar-like
features, rather than Haar wavelets. Figure 4.1 shows the features that OpenCV uses.
The presence of a Haar feature is determined by subtracting the average dark-region pixel
value from the average light-region pixel value. If the difference is above a threshold (set
during learning), that feature is said to be present.
To determine the presence or absence of hundreds of Haar features at every image
location and at several scales efficiently, Viola and Jones used a technique called an
Integral Image. In general, "integrating" means adding small units together. In this case,
the small units are pixel values. The integral value for each pixel is the sum of all the
pixels above it and to its left. Starting at the top left and traversing to the right and down,
the entire image can be integrated with a few integer operations per pixel. After
integration, the value at each pixel location (x,y) contains the sum of all pixel values
within a rectangular region that has one corner at the top left of the image and the other at
location (x,y). To find the average pixel value in this rectangle, you'd only need to divide
the value at (x,y) by the rectangle's area.
But what if you want to know the summed values for some other rectangle, one that
doesn't have one corner at the upper left of the image? Figure 4.3 shows the solution to
that problem. Suppose you want the summed values in D. You can think of that as being
the sum of pixel values in the combined rectangle, A+B+C+D, minus the sums in
rectangles A+B and A+C, plus the sum of pixel values in A. In other words,
D = A+B+C+D - (A+B) - (A+C) + A.
Conveniently, A+B+C+D is the Integral Image's value at location 4, A+B is the value at
location 2, A+C is the value at location 3, and A is the value at location 1. So, with an
Integral Image, you can find the sum of pixel values for any rectangle in the original
image with just three integer operations: (x4, y4) - (x2, y2) - (x3, y3) + (x1, y1).
To select the specific Haar features to use, and to set threshold levels, Viola and Jones use
a machine-learning method called AdaBoost. AdaBoost combines many "weak"
classifiers to create one "strong" classifier. "Weak" here means the classifier only gets the
right answer a little more often than random guessing would. That's not very good. But if
you had a whole lot of these weak classifiers, and each one "pushed" the final answer a
little bit in the right direction, you'd have a strong, combined force for arriving at the
correct solution. AdaBoost selects a set of weak classifiers to combine and assigns a
weight to each. This weighted combination is the strong classifier.
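The weighted-vote idea can be illustrated with a toy sketch. The thresholds and weights below are made up for illustration; this is not the trained OpenCV classifier:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy sketch: each weak classifier votes +1 ("face") or -1 ("not face") on a
// single scalar feature, barely better than chance on its own. AdaBoost gives
// each weak classifier a weight alpha; the strong classifier is the sign of
// the weighted vote.
struct Weak {
    std::function<int(double)> h;  // weak decision rule
    double alpha;                  // weight assigned by AdaBoost
};

int strongClassify(const std::vector<Weak>& weaks, double feature) {
    double vote = 0.0;
    for (const auto& w : weaks)
        vote += w.alpha * w.h(feature);
    return vote >= 0.0 ? +1 : -1;  // sign of the weighted combination
}
```

Even when the individual weak votes disagree, the more heavily weighted ones "push" the combined decision in the right direction.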
Fig 4.3.2: The classifier cascade is a chain of filters. Image subregions that make it through
the entire cascade are classified as "Face"; all others are classified as "Not Face."
Viola and Jones combined a series of AdaBoost classifiers as a filter chain, shown in
Fig 4.3.2, that is especially efficient for classifying image regions. Each filter is a separate
AdaBoost classifier with a fairly small number of weak classifiers.
The acceptance threshold at each level is set low enough to pass all, or nearly all, face
examples in the training set. The filters at each level are trained to classify training
images that passed all previous stages. (The training set is a large database of faces,
maybe a thousand or so.) During use, if any one of these filters fails to pass an image
region, that region is immediately classified as "Not Face." When a filter passes an image
region, it goes to the next filter in the chain. Image regions that pass through all filters in
the chain are classified as "Face." Viola and Jones dubbed this filtering chain a cascade.
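The cascade's early-rejection behaviour can be sketched as follows. The stages here are hypothetical cheap tests standing in for trained AdaBoost stages:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Sketch of the cascade: each stage is a cheap classifier tuned to pass
// (almost) all true faces. The first stage that rejects a region ends the
// evaluation immediately, which is what makes the cascade fast on the many
// non-face regions of an image.
bool cascadeClassify(const std::vector<std::function<bool(double)>>& stages,
                     double region) {
    for (const auto& stage : stages)
        if (!stage(region))
            return false;  // early rejection: "Not Face"
    return true;           // survived every filter: "Face"
}
```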
The order of filters in the cascade is based on the importance weighting that AdaBoost
assigns. The more heavily weighted filters come first, to eliminate non-face image regions
as quickly as possible.
Fig 4.4: How a face is recognized (schematic)
No face recognition algorithm is yet 100% accurate: an algorithm may get a perfect result
in one situation and fail in another, so no existing face recognition algorithm is foolproof.
That is why optimizing face recognition to give near-perfect accuracy in real-time,
critical environments is such a hot topic of research today.
Secondly, the PCA-based Eigenfaces method is not 100% accurate either; on average it
reaches roughly 70% to 75% accuracy. However, it works well enough to be used in a
beginner or hobbyist robotics/computer vision project, because even the better existing
face recognition algorithms are still not 100% accurate. And those other recognition
algorithms, though better than PCA-based Eigenfaces, carry a much bigger overhead of
coding effort to implement in our project.
The task of facial recognition is discriminating input signals (image data) into several
classes (persons). The input signals are highly noisy (e.g. the noise is caused by differing
lighting conditions, pose etc.), yet the input images are not completely random and in
spite of their differences there are patterns which occur in any input signal. Such patterns,
which can be observed in all signals, could be - in the domain of facial recognition - the
presence of some objects (eyes, nose, mouth) in any face as well as relative distances
between these objects. These characteristic features are called eigenfaces in the facial
recognition domain (or principal components generally). They can be extracted out of
original image data by means of a mathematical tool called Principal Component
Analysis (PCA).
By means of PCA one can transform each original image of the training set into a
corresponding eigenface. An important feature of PCA is that one can reconstruct any
original image from the training set by combining the eigenfaces. Remember that
eigenfaces are nothing less than characteristic features of the faces. Therefore one could
say that the original face image can be reconstructed from eigenfaces if one adds up all
the eigenfaces (features) in the right proportion. Each eigenface represents only certain
features of the face, which may or may not be present in the original image. If the feature
is present in the original image to a higher degree, the share of the corresponding
eigenface in the sum of the eigenfaces should be greater. If, on the contrary, the particular
feature is not (or almost not) present in the original image, then the corresponding
eigenface should contribute a smaller part (or none at all) to the sum of eigenfaces. So, in
order to reconstruct the original image from the eigenfaces, one has to build a kind of
weighted sum of all eigenfaces. That is, the reconstructed original image is equal to a sum
of all eigenfaces, with each eigenface having a certain weight. This weight specifies, to
what degree the specific feature (eigenface) is present in the original image.
If one uses all the eigenfaces extracted from original images, one can reconstruct the
original images from the eigenfaces exactly. But one can also use only a part of the
eigenfaces. Then the reconstructed image is an approximation of the original image.
However, one can ensure that losses due to omitting some of the eigenfaces can be
minimized. This happens by choosing only the most important features (eigenfaces).
Omission of eigenfaces is necessary due to scarcity of computational resources.
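The weighted-sum idea can be sketched with tiny "images" represented as plain vectors. This assumes the eigenfaces are orthonormal and is only an illustration, not the production recognizer:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Vec = std::vector<double>;  // a flattened face image

double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Project a face onto orthonormal eigenfaces: each weight says how strongly
// that characteristic feature is present in the face.
Vec faceWeights(const Vec& face, const Vec& mean, const std::vector<Vec>& eigenfaces) {
    Vec centered(face.size());
    for (size_t i = 0; i < face.size(); ++i) centered[i] = face[i] - mean[i];
    Vec w;
    for (const auto& e : eigenfaces) w.push_back(dot(centered, e));
    return w;
}

// Reconstruction = mean face + weighted sum of eigenfaces. With all
// eigenfaces this is exact; with only the top K it is an approximation.
Vec reconstruct(const Vec& w, const Vec& mean, const std::vector<Vec>& eigenfaces) {
    Vec out = mean;
    for (size_t k = 0; k < eigenfaces.size(); ++k)
        for (size_t i = 0; i < out.size(); ++i)
            out[i] += w[k] * eigenfaces[k][i];
    return out;
}
```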
How does this relate to facial recognition? The clue is that it is possible not only to
extract the face from eigenfaces given a set of weights, but also to go the opposite way.
This opposite way would be to extract the weights from eigenfaces and the face to be
recognized. These weights tell nothing less than the amount by which the face in question
differs from the typical faces represented by the eigenfaces. Therefore, using these weights
one can determine two important things:
1. Whether the image in question is a face at all. If the weights of the image differ too
much from the weights of face images (i.e. images from which we know for sure that
they are faces), the image probably is not a face.
PCA is used to generate K eigenfaces for a training set of M images, where K < M,
thereby reducing the number of values (from M to K) needed to identify an unknown face.
- It converts a database of M face images into a list of K variables called eigenfaces
(K < M).
- The first principal component is the most dominant; each succeeding component shows
the next most dominant features.
Fig 4.7: Eigenfaces. The first picture has almost all the characteristics of the training
images; this decreases as we move to the following pictures.
Calculating the covariance matrix of the raw images results in an N² x N² matrix, where
N² is the number of pixels (50x50 pixels), so it results in a 2500x2500 matrix. This causes
the system to slow down terribly or run out of memory. So we discard the noise region of
the eigenfaces: only K components are selected and the others are discarded.
4.3.3.3 Mathematical Analysis of PCA-Eigen Face
The covariance of two variables x and y over n samples is
cov(x, y) = Σ (x_i - x̄)(y_i - ȳ) / (n - 1)
which is similar to the formula for variance; however, the change of x is measured with
respect to the change in y rather than solely the change of x with respect to x. Here x_i
represents a pixel value, x̄ is the mean of all x values, and n is the total number of values.
The covariance matrix formed from the image data represents how much the dimensions
vary from the mean with respect to each other. The covariance matrix C is defined
entry-wise as C(i, j) = cov(Dim_i, Dim_j), where Dim_i is the i-th dimension of the data.
The easiest way to explain this is with an example, the simplest being a 3x3 matrix.
With larger matrices this becomes more complicated, and the use of computational
algorithms becomes essential.
Eigenvectors can be scaled: multiplying a vector by 2 (or any factor) still produces the
same type of result, because a vector is a direction, and all you are changing is its scale,
not its direction.
The eigenvalue is closely related to the eigenvector used: it is the value by which the
original vector was scaled. In the example, the eigenvalue is 4.
STAGE 4: Feature Vector
Usually the results of eigenvalue and eigenvector computations are not as clean as in the
example above; in most cases the resulting vectors are scaled to a length of 1.
Once the eigenvectors are found from the covariance matrix, the next step is to order them
by eigenvalue, highest to lowest. This gives the components in order of significance. Here
the data can be compressed: the weaker vectors are removed, producing a lossy
compression method in which the data lost is deemed insignificant.
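The ordering-and-truncation step can be sketched as below. This is a minimal illustration; `EigenPair` is a name introduced here, not part of OpenCV:

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using EigenPair = std::pair<double, Vec>;  // (eigenvalue, eigenvector)

// Order eigenvectors by eigenvalue, highest first, and keep only the top K.
// The discarded components carry the least variance, so dropping them is a
// lossy but nearly insignificant compression of the face data.
std::vector<EigenPair> topComponents(std::vector<EigenPair> pairs, size_t k) {
    std::sort(pairs.begin(), pairs.end(),
              [](const EigenPair& a, const EigenPair& b) {
                  return a.first > b.first;  // descending eigenvalue
              });
    if (pairs.size() > k) pairs.resize(k);
    return pairs;
}
```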
STAGE 5: Transposition
The final stage in PCA is to take the transpose of the feature vector matrix and multiply it
on the left of the transposed adjusted data set (the adjusted data set is from Stage 1 where
the mean was subtracted from the data).
The EigenObjectRecognizer class performs all of this and then feeds the transposed data
as a training set. When it is passed an image to recognize it performs PCA and compares
the generated Eigenvalues and Eigenvectors to the ones from the training set and then
produces a match if one has been found or a negative match if no match is found.
4.3.3.4 Recognition of Unknown Face
CHAPTER 5 REQUIREMENTS
HARDWARE REQUIREMENTS
AMD BULLDOZER FX-4100
BIOSTAR MOTHERBOARD
GRAPHICS CARD HD RADEON 7770 OC
COOLING SYSTEMS
1. GPU COOLING
2. PROCESSOR COOLING
3. MOTHERBOARD COOLING
WEB CAMERA
MONITOR
KEYBOARD
MOUSE
BUZZER
SOFTWARE REQUIREMENTS
OPENCV LIBRARY
EMGU CV LIBRARY
C#, C++
WINDOWS PLATFORM, LINUX PLATFORM
CHAPTER 6 IMPLEMENTATION DETAILS
6.1.1 C#
6.1.2 Open CV
OpenCV was designed for computational efficiency and with a strong focus on real- time
applications. OpenCV is written in optimized C and can take advantage of multicore
processors. If you desire further automatic optimization on Intel architectures [Intel],
you can buy Intel's Integrated Performance Primitives (IPP) libraries [IPP], which consist
of low-level optimized routines in many different algorithmic areas. OpenCV automatically
uses the appropriate IPP library at runtime if that library is installed.
6.1.3 Emgu CV
Emgu CV is a cross-platform .NET wrapper for the OpenCV image processing library,
allowing OpenCV functions to be called from .NET-compatible languages such as C#,
VB, VC++, IronPython, etc. The wrapper can be compiled in Mono and runs on Windows,
Linux, Mac OS X, iPhone, iPad and Android devices.
[Table: Emgu CV feature availability by platform (Windows, Linux, Mac OS X, iOS
(iPhone, iPad, iPod Touch), Android) for GPU processing, machine learning, Tesseract
OCR, the Intel C++ compiler (fast code), exception handling, the debugger visualizer,
and Emgu.CV.UI; support varies by platform.]
Fig 6.1: Emgu CV architecture
6.1.4 Implementation Procedure
Camera Capture
Face detection
Face Recognition
Alarm Out
STEP-1: Open Visual Studio 2010 and select File -> New -> Project as follows:
STEP-2: In the Visual C# project menu, select "Windows Forms Application", name
the project "Camera Capture", and click "OK".
STEP-3: Let's first add the Emgu references to our project (though you can add them at
any time later, you must add the references before debugging). Select the Browse tab in
the window that pops up, go to Emgu CV's bin folder as in the Level-0 tutorial, select the
following three .dll files (Emgu.CV.dll, Emgu.CV.UI.dll and Emgu.Util.dll), and click
OK to continue.
STEP-5: Rename Form1.cs to CameraCapture.cs and change its Text field to "Camera
Output". Add the Emgu CV tools to your Visual Studio, because we will be using those
tools, such as the ImageBox. Add a button to the form and do some more required
"housekeeping" as below:
Name: CamImageBox
Button properties:
(Name): btnStart
Text: Start!
A classifier uses data stored in an XML file to decide how to classify each image
location. So naturally, Haar will need some XML file to load trained data from.
You'll need to tell the classifier (Haar object in this case) where to find this data file you
want it to use. It's better to locate the XML file we want to use and make sure our path to
it is correct, before we code the rest of our face-detection program.
haar = new HaarCascade("haarcascade_frontalface_alt_tree.xml");
STEP-3: DRAW THE LABEL FOR EACH FACE DETECTED AND RECOGNIZED
Console.Beep(2000, 1000);
namespace MultiFaceRec
{
public partial class FrmPrincipal : Form
{
//Declaration of all variables, vectors and Haar cascades
Image<Bgr, Byte> currentFrame;
Capture grabber;
HaarCascade face;
HaarCascade eye;
MCvFont font = new MCvFont(FONT.CV_FONT_HERSHEY_TRIPLEX, 0.5d, 0.5d);
Image<Gray, byte> result, TrainedFace = null;
Image<Gray, byte> gray = null;
List<Image<Gray, byte>> trainingImages = new List<Image<Gray, byte>>();
List<string> labels= new List<string>();
List<string> NamePersons = new List<string>();
int ContTrain, NumLabels, t;
string name, names = null;
public FrmPrincipal()
{
InitializeComponent();
//Load haarcascades for face detection
face = new HaarCascade("haarcascade_frontalface_default.xml");
eye = new HaarCascade("haarcascade_eye.xml");
try
{
//Load previously trained faces and labels for each image
string Labelsinfo = File.ReadAllText(Application.StartupPath +
"/TrainedFaces/TrainedLabels.txt");
string[] Labels = Labelsinfo.Split('%');
NumLabels = Convert.ToInt16(Labels[0]);
ContTrain = NumLabels;
string LoadFaces;
for (int tf = 1; tf < NumLabels+1; tf++)
{
LoadFaces = "face" + tf + ".bmp";
trainingImages.Add(new Image<Gray,
byte>(Application.StartupPath + "/TrainedFaces/" + LoadFaces));
labels.Add(Labels[tf]);
}
}
catch(Exception e)
{
//MessageBox.Show(e.ToString());
MessageBox.Show("Nothing in binary database, please add at least a
face (simply train the prototype with the Add Face button).", "Trained faces
load", MessageBoxButtons.OK, MessageBoxIcon.Exclamation);
}
//Face Detector
MCvAvgComp[][] facesDetected = gray.DetectHaarCascade(
face,
1.2,
10,
Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,
new Size(20, 20));
//resize the detected face image to force comparison at the same size
//as the test image, using cubic interpolation
TrainedFace = result.Resize(100, 100,
Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC);
trainingImages.Add(TrainedFace);
labels.Add(textBox1.Text);
//Convert it to Grayscale
gray = currentFrame.Convert<Gray, Byte>();
//Face Detector
MCvAvgComp[][] facesDetected = gray.DetectHaarCascade(
face,
1.2,
10,
Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,
new Size(20, 20));
result = currentFrame.Copy(f.rect).Convert<Gray,
byte>().Resize(100, 100, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC);
//draw the face detected in the 0th (gray) channel with blue color
currentFrame.Draw(f.rect, new Bgr(Color.Red), 2);
if (trainingImages.ToArray().Length != 0)
{
//TermCriteria for face recognition with numbers of trained
images like maxIteration
MCvTermCriteria termCrit = new MCvTermCriteria(ContTrain,
0.001); int thrs = 1000;
try
{
thrs = int.Parse(txtThreshold.Text);
}
catch (Exception ex)
{
// MessageBox.Show("Enter integer as threshold value");
}
name = recognizer.Recognize(result);
//add sound
NamePersons[t - 1] = name;
NamePersons.Add("");
/*
//Set the region of interest on the faces
gray.ROI = f.rect;
MCvAvgComp[][] eyesDetected = gray.DetectHaarCascade(
eye,
1.1,
10,
Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,
new Size(20, 20));
gray.ROI = Rectangle.Empty;
*/
}
t = 0;
label4.Text = names;
names = "";
{
NamePersons.Clear();
private void textBox2_TextChanged(object sender, EventArgs e)
{
}
}
namespace Emgu.CV
{
/// <summary>
/// An object recognizer using PCA (Principal Component Analysis)
/// </summary>
[Serializable]
public class EigenObjectRecognizer
{
private Image<Gray, Single>[] _eigenImages;
private Image<Gray, Single> _avgImage;
private Matrix<float>[] _eigenValues;
private string[] _labels;
private double _eigenDistanceThreshold;
/// <summary>
/// Get the eigen vectors that form the eigen space
/// </summary>
/// <remarks>The set method is primarily used for deserialization; do not
/// attempt to set it unless you know what you are doing</remarks>
public Image<Gray, Single>[] EigenImages
{
get { return _eigenImages; }
set { _eigenImages = value; }
}
/// <summary>
/// Get or set the labels for the corresponding training image
/// </summary>
public String[] Labels
{
get { return _labels; }
set { _labels = value; }
}
/// <summary>
/// Get or set the eigen distance threshold.
/// The smaller the number, the more likely an examined image will be
/// treated as an unrecognized object.
/// Set it to a huge number (e.g. 5000) and the recognizer will always
/// treat the examined image as one of the known objects.
/// </summary>
public double EigenDistanceThreshold
{
get { return _eigenDistanceThreshold; }
set { _eigenDistanceThreshold = value; }
}
/// <summary>
/// Get the average Image.
/// </summary>
/// <remarks>The set method is primarily used for deserialization; do not
/// attempt to set it unless you know what you are doing</remarks>
public Image<Gray, Single> AverageImage
{
get { return _avgImage; }
set { _avgImage = value; }
}
/// <summary>
/// Get the eigen values of each of the training image
/// </summary>
/// <remarks>The set method is primarily used for deserialization; do not
/// attempt to set it unless you know what you are doing</remarks>
public Matrix<float>[] EigenValues
{
get { return _eigenValues; }
set { _eigenValues = value; }
}
private EigenObjectRecognizer()
{
}
/// <summary>
/// Create an object recognizer using the specific training data and
/// parameters; it will always return the most similar object
/// </summary>
/// <param name="images">The images used for training, each of them should
/// be the same size. It's recommended the images are histogram normalized</param>
/// <param name="termCrit">The criteria for recognizer training</param>
public EigenObjectRecognizer(Image<Gray, Byte>[] images, ref MCvTermCriteria
termCrit)
: this(images, GenerateLabels(images.Length), ref termCrit)
{
}
/// <summary>
/// Create an object recognizer using the specific training data and
/// parameters; it will always return the most similar object
/// </summary>
/// <param name="images">The images used for training, each of them should
/// be the same size. It's recommended the images are histogram normalized</param>
/// <param name="labels">The labels corresponding to the images</param>
/// <param name="termCrit">The criteria for recognizer training</param>
public EigenObjectRecognizer(Image<Gray, Byte>[] images, String[] labels,
ref MCvTermCriteria termCrit)
: this(images, labels, 0, ref termCrit)
{
}
/// <summary>
/// Create an object recognizer using the specific training data and
/// parameters
/// </summary>
/// <param name="images">The images used for training, each of them should
/// be the same size. It's recommended the images are histogram normalized</param>
/// <param name="labels">The labels corresponding to the images</param>
/// <param name="eigenDistanceThreshold">
/// The eigen distance threshold, (0, ~1000].
/// The smaller the number, the more likely an examined image will be
/// treated as an unrecognized object.
/// If the threshold is &lt; 0, the recognizer will always treat the
/// examined image as one of the known objects.
/// </param>
/// <param name="termCrit">The criteria for recognizer training</param>
public EigenObjectRecognizer(Image<Gray, Byte>[] images, String[] labels,
double eigenDistanceThreshold, ref MCvTermCriteria termCrit)
{
Debug.Assert(images.Length == labels.Length, "The number of images should equal the number of labels");
Debug.Assert(eigenDistanceThreshold >= 0.0, "Eigen-distance threshold should always be >= 0.0");
//_avgImage.SerializationCompressionRatio = 9;
_labels = labels;
_eigenDistanceThreshold = eigenDistanceThreshold;
}
IntPtr[] inObjs = Array.ConvertAll<Image<Gray, Byte>,
IntPtr>(trainingImages, delegate(Image<Gray, Byte> img) { return img.Ptr; });
CvInvoke.cvCalcEigenObjects(
inObjs,
ref termCrit,
eigObjs,
null,
avg.Ptr);
}
/// <summary>
/// Decompose the image as eigen values, using the specific eigen vectors
/// </summary>
/// <param name="src">The image to be decomposed</param>
/// <param name="eigenImages">The eigen images</param>
/// <param name="avg">The average images</param>
/// <returns>Eigen values of the decomposed image</returns>
public static float[] EigenDecomposite(Image<Gray, Byte> src, Image<Gray,
Single>[] eigenImages, Image<Gray, Single> avg)
{
return CvInvoke.cvEigenDecomposite(
src.Ptr,
Array.ConvertAll<Image<Gray, Single>, IntPtr>(eigenImages,
delegate(Image<Gray, Single> img) { return img.Ptr; }),
avg.Ptr);
}
#endregion
/// <summary>
/// Given the eigen value, reconstruct the projected image
/// </summary>
/// <param name="eigenValue">The eigen values</param>
/// <returns>The projected image</returns>
public Image<Gray, Byte> EigenProjection(float[] eigenValue)
{
Image<Gray, Byte> res = new Image<Gray, byte>(_avgImage.Width,
_avgImage.Height);
CvInvoke.cvEigenProjection(
Array.ConvertAll<Image<Gray, Single>, IntPtr>(_eigenImages,
delegate(Image<Gray, Single> img) { return img.Ptr; }),
eigenValue,
_avgImage.Ptr,
res.Ptr);
return res;
}
/// <summary>
/// Get the Euclidean eigen-distance between <paramref name="image"/> and
/// every other image in the database
/// </summary>
/// <param name="image">The image to be compared with the training
/// images</param>
/// <returns>An array of eigen distances from every image in the training
/// images</returns>
public float[] GetEigenDistances(Image<Gray, Byte> image)
{
using (Matrix<float> eigenValue = new
Matrix<float>(EigenDecomposite(image, _eigenImages, _avgImage)))
return Array.ConvertAll<Matrix<float>, float>(_eigenValues,
delegate(Matrix<float> eigenValueI)
{
return (float)CvInvoke.cvNorm(eigenValue.Ptr, eigenValueI.Ptr,
Emgu.CV.CvEnum.NORM_TYPE.CV_L2, IntPtr.Zero);
});
}
/// <summary>
/// Given the <paramref name="image"/> to be examined, find in the database
/// the most similar object; return the index and the eigen distance
/// </summary>
/// <param name="image">The image to be searched from the database</param>
/// <param name="index">The index of the most similar object</param>
/// <param name="eigenDistance">The eigen distance of the most similar
/// object</param>
/// <param name="label">The label of the specific image</param>
public void FindMostSimilarObject(Image<Gray, Byte> image, out int index,
out float eigenDistance, out String label)
{
float[] dist = GetEigenDistances(image);
index = 0;
eigenDistance = dist[0];
for (int i = 1; i < dist.Length; i++)
{
if (dist[i] < eigenDistance)
{
index = i;
eigenDistance = dist[i];
}
}
label = Labels[index];
}
/// <summary>
/// Try to recognize the image and return its label
/// </summary>
/// <param name="image">The image to be recognized</param>
/// <returns>
/// String.Empty, if not recognized;
/// Label of the corresponding image, otherwise
/// </returns>
public String Recognize(Image<Gray, Byte> image)
{
    int index;
    float eigenDistance;
    String label;
    FindMostSimilarObject(image, out index, out eigenDistance, out label);
    return (_eigenDistanceThreshold <= 0 || eigenDistance < _eigenDistanceThreshold) ? _labels[index] : String.Empty;
}
}
}
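Taken together, GetEigenDistances, FindMostSimilarObject and Recognize implement a Euclidean nearest-neighbour search over eigen-coefficient vectors, gated by an optional distance threshold. The same logic can be sketched in plain C++ on bare coefficient vectors (the vector/label layout here is our own simplification, not the EmguCV API):

```cpp
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// L2 (Euclidean) distance between two coefficient vectors, the same
// quantity cvNorm(..., CV_L2) computes in the listing above.
float euclideanDistance(const std::vector<float>& a, const std::vector<float>& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Linear scan for the nearest training vector. Returns its label, or an
// empty string when the best distance is not below the threshold
// (threshold <= 0 disables the check, mirroring Recognize above).
std::string recognize(const std::vector<float>& query,
                      const std::vector<std::vector<float> >& train,
                      const std::vector<std::string>& labels,
                      float threshold) {
    std::size_t best = 0;
    float bestDist = euclideanDistance(query, train[0]);
    for (std::size_t i = 1; i < train.size(); ++i) {
        float d = euclideanDistance(query, train[i]);
        if (d < bestDist) { best = i; bestDist = d; }
    }
    return (threshold <= 0.0f || bestDist < threshold) ? labels[best] : std::string();
}
```

As in the C# version, a threshold of zero makes the recogniser always return the closest label, while a positive threshold rejects faces that are too far from every training image.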
#include <stdio.h>
#if defined WIN32 || defined _WIN32
#include <conio.h>          // For _kbhit() on Windows
#include <direct.h>         // For mkdir(path) on Windows
#define snprintf sprintf_s  // Visual Studio on Windows comes with sprintf_s() instead of snprintf()
#else
#include <termios.h>        // For kbhit() on Linux
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>       // For mkdir(path, options) on Linux
#endif
#include <vector>
#include <string>
#include "cv.h"
#include "cvaux.h"
#include "highgui.h"
#ifndef BOOL
#define BOOL bool
#endif
// Global variables
IplImage ** faceImgArr = 0;          // array of face images
CvMat * personNumTruthMat = 0;       // array of person numbers
//#define MAX_NAME_LENGTH 256        // Give each name a fixed size for easier code.
//char **personNames = 0;            // array of person names (indexed by the person number).
vector<string> personNames;          // array of person names (indexed by the person number)
int faceWidth = 120;                 // default face width for the face recognition database
int faceHeight = 90;                 // default face height for the face recognition database
int nPersons = 0;                    // the number of people in the training set
int nTrainFaces = 0;                 // the number of training images
int nEigens = 0;                     // the number of eigenvalues
IplImage * pAvgTrainImg = 0;         // the average image
IplImage ** eigenVectArr = 0;        // eigenvectors
CvMat * eigenValMat = 0;             // eigenvalues
CvMat * projectedTrainFaceMat = 0;   // projected training faces
// Function prototypes
void printUsage();
void learn(const char *szFileTrain);
void doPCA();
void storeTrainingData();
int loadTrainingData(CvMat ** pTrainPersonNumMat);
int findNearestNeighbor(float * projectedTestFace);
int findNearestNeighbor(float * projectedTestFace, float *pConfidence);
int loadFaceImgArray(const char * filename);
void recognizeFileList(const char *szFileTest);
void recognizeFromCam(void);
IplImage* getCameraFrame(void);
IplImage* convertImageToGreyscale(const IplImage *imageSrc);
IplImage* cropImage(const IplImage *img, const CvRect region);
IplImage* resizeImage(const IplImage *origImg, int newWidth, int newHeight);
IplImage* convertFloatImageToUcharImage(const IplImage *srcImg);
void saveFloatImage(const char *filename, const IplImage *srcImg);
CvRect detectFaceInImage(const IplImage *inputImg, const CvHaarClassifierCascade* cascade);
CvMat* retrainOnline(void);
// Startup routine.
int main( int argc, char** argv )
{
    printUsage();
    // ... (listing abridged: terminal setup for non-blocking keyboard input on Linux;
    // dir, oldt, newt, tv and rdfs are declared in the elided portion)
    if ( dir == 1 ) {
        tcgetattr( STDIN_FILENO, &oldt);
        newt = oldt;
        newt.c_lflag &= ~( ICANON | ECHO );
        tcsetattr( STDIN_FILENO, TCSANOW, &newt);
    }
    else
        tcsetattr( STDIN_FILENO, TCSANOW, &oldt);
}
// ... (listing abridged)
tv.tv_sec = 0;
tv.tv_usec = 0;
FD_ZERO(&rdfs);
FD_SET(STDIN_FILENO, &rdfs);
// Train from the data in the given text file, and store the trained data into the file 'facedata.xml'.
void learn(const char *szFileTrain)
{
    int i, offset;
    // ... (listing abridged)
}

// Fragment of loadTrainingData(): bail out if the stored training data cannot be opened.
        printf("Can't open training database file 'facedata.xml'.\n");
        return 0;
    }
    return 1;
}
// Read the names & image filenames of people from a text file, and load all those images listed.
int loadFaceImgArray(const char * filename)
{
    FILE * imgListFile = 0;
    char imgFilename[512];
    int iFace, nFaces=0;
    int i;
    // ... (listing abridged: open the list file and loop over its entries)
        if( !faceImgArr[iFace] )
        {
            fprintf(stderr, "Can't load image from %s\n", imgFilename);
            return 0;
        }
    }
    fclose(imgListFile);
    return nFaces;
}
// Recognize the face in each of the test images given, and compare the results with the truth.
void recognizeFileList(const char *szFileTest)
{
    int i, nTestFaces = 0;             // the number of test images
    CvMat * trainPersonNumMat = 0;     // the person numbers during training
    float * projectedTestFace = 0;
    const char *answer;
    int nCorrect = 0;
    int nWrong = 0;
    double timeFaceRecognizeStart;
    double tallyFaceRecognizeTime;
    float confidence;
    // ... (listing abridged: load the training data and project each test face into eigenspace)
        iNearest = findNearestNeighbor(projectedTestFace, &confidence);
        truth = personNumTruthMat->data.i[i];
        nearest = trainPersonNumMat->data.i[iNearest];
        if (nearest == truth) {
            answer = "Correct";
            nCorrect++;
        }
        else {
            answer = "WRONG!";
            nWrong++;
        }
        printf("nearest = %d, Truth = %d (%s). Confidence = %f\n", nearest, truth, answer, confidence);
    }
    tallyFaceRecognizeTime = (double)cvGetTickCount() - timeFaceRecognizeStart;
    if (nCorrect+nWrong > 0) {
        printf("TOTAL ACCURACY: %d%% out of %d tests.\n", nCorrect * 100/(nCorrect+nWrong), (nCorrect+nWrong));
        printf("TOTAL TIME: %.1fms average.\n", tallyFaceRecognizeTime/((double)cvGetTickFrequency() * 1000.0 * (nCorrect+nWrong)));
    }
}
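The findNearestNeighbor(projectedTestFace, &confidence) call above scans the projected training faces for the smallest squared Euclidean distance. A plain-C++ sketch of one plausible implementation follows; the confidence value (best distance normalised by problem size and the 8-bit pixel range) is a heuristic assumption on our part, not a calibrated probability:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Nearest-neighbour search in eigenspace over projected training faces.
// Writes a rough [0..1] confidence score through pConfidence when non-null.
int findNearest(const std::vector<std::vector<float> >& trainFaces,
                const std::vector<float>& testFace,
                float* pConfidence) {
    int iNearest = 0;
    double leastDistSq = 1e300;
    for (std::size_t iTrain = 0; iTrain < trainFaces.size(); ++iTrain) {
        double distSq = 0.0;
        for (std::size_t i = 0; i < testFace.size(); ++i) {
            double d = testFace[i] - trainFaces[iTrain][i];
            distSq += d * d;   // accumulate squared L2 distance
        }
        if (distSq < leastDistSq) {
            leastDistSq = distSq;
            iNearest = (int)iTrain;
        }
    }
    if (pConfidence) {
        // Heuristic: average per-coefficient distance, scaled by the 0..255 pixel range.
        double n = (double)(trainFaces.size() * testFace.size());
        *pConfidence = 1.0f - (float)(std::sqrt(leastDistSq / n) / 255.0);
    }
    return iNearest;
}
```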
// Grab the next camera frame. Waits until the next frame is ready,
// and provides direct access to it, so do NOT modify the returned image or free it!
// Will automatically initialize the camera on the first frame.
IplImage* getCameraFrame(void)
{
    IplImage *frame;
    // ... (listing abridged)
    return outImg;
}

// Fragment of cropImage(): reject unsupported image depths.
    if (img->depth != IPL_DEPTH_8U) {
        printf("ERROR in cropImage: Unknown image depth of %d given in cropImage() instead of 8 bits per pixel.\n", img->depth);
        exit(1);
    }
    // ... (listing abridged)
    cvReleaseImage( &imageTmp );
    return imageRGB;
}
// Fragment of convertFloatImageToUcharImage():
    // Deal with NaN and extreme values, since the DFT seems to give some NaN results.
    if (cvIsNaN(minVal) || minVal < -1e30)
        minVal = -1e30;
    if (cvIsNaN(maxVal) || maxVal > 1e30)
        maxVal = 1e30;
    if (maxVal - minVal == 0.0f)
        maxVal = minVal + 0.001;   // remove potential divide-by-zero errors

        // Convert the format.
        dstImg = cvCreateImage(cvSize(srcImg->width, srcImg->height), 8, 1);
        cvConvertScale(srcImg, dstImg, 255.0 / (maxVal - minVal), -minVal * 255.0 / (maxVal - minVal));
    }
    return dstImg;
}
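The cvConvertScale call above applies the affine map v' = v * 255/(maxVal-minVal) - minVal * 255/(maxVal-minVal), stretching the float range onto 0..255. The same mapping can be sketched on a plain float buffer (our own array-based stand-in, not the IplImage data layout):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Map float pixel values in [minVal, maxVal] linearly onto 0..255, mirroring
// cvConvertScale(src, dst, 255/(max-min), -min*255/(max-min)) above.
std::vector<std::uint8_t> floatToUchar(const std::vector<float>& src,
                                       float minVal, float maxVal) {
    if (maxVal - minVal == 0.0f)
        maxVal = minVal + 0.001f;              // avoid divide-by-zero, as above
    const float scale = 255.0f / (maxVal - minVal);
    const float shift = -minVal * scale;
    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        float v = src[i] * scale + shift;
        // Clamp, since values outside [minVal, maxVal] would over/underflow.
        dst[i] = (std::uint8_t)std::min(255.0f, std::max(0.0f, v));
    }
    return dst;
}
```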
// Perform face detection on the input image, using the given Haar cascade classifier.
// Returns a rectangle for the detected region in the given image.
CvRect detectFaceInImage(const IplImage *inputImg, const CvHaarClassifierCascade* cascade )
{
    const CvSize minFeatureSize = cvSize(20, 20);
    const int flags = CV_HAAR_FIND_BIGGEST_OBJECT | CV_HAAR_DO_ROUGH_SEARCH;   // Only search for 1 face.
    const float search_scale_factor = 1.1f;
    IplImage *detectImg;
    IplImage *greyImg = 0;
    CvMemStorage* storage;
    CvRect rc;
    double t;
    CvSeq* rects;
    int i;
    storage = cvCreateMemStorage(0);
    cvClearMemStorage( storage );
    // ... (listing abridged: run cvHaarDetectObjects and store the biggest face in rc)
    //cvReleaseHaarClassifierCascade( &cascade );
    //cvReleaseImage( &detectImg );
    if (greyImg)
        cvReleaseImage( &greyImg );
    cvReleaseMemStorage( &storage );
    return rc;
}
// Fragment of recognizeFromCam():
// Create a GUI window for the user to see the camera image.
cvNamedWindow("Input", CV_WINDOW_AUTOSIZE);
while (1)
{
    int iNearest, nearest, truth;
    IplImage *camImg;
    IplImage *greyImg;
    IplImage *faceImg;
    IplImage *sizedImg;
    IplImage *equalizedImg;
    IplImage *processedFaceImg;
    CvRect faceRect;
    IplImage *shownImg;
    int keyPressed = 0;
    FILE *trainFile;
    float confidence;
    // ... (listing abridged: grab a frame and read keyboard input into newPersonName)
            // Strip a trailing line feed or carriage return from the typed name.
            if (i > 0 && (newPersonName[i-1] == 10 || newPersonName[i-1] == 13)) {
                newPersonName[i-1] = 0;
                i--;
            }
            if (i > 0 && (newPersonName[i-1] == 10 || newPersonName[i-1] == 13)) {
                newPersonName[i-1] = 0;
                i--;
            }
            if (i > 0) {
                printf("Collecting all images until you hit 't', to start Training the images as '%s' ...\n", newPersonName);
                newPersonFaces = 0;      // restart training a new person
                saveNextFaces = TRUE;
            }
            else {
                printf("Did not get a valid name from you, so will ignore it. Hit 'n' to retry.\n");
            }
            break;
        case 't':   // Start training
            saveNextFaces = FALSE;   // stop saving next faces
            // Store the saved data into the training file.
            printf("Storing the training data for new person '%s'.\n", newPersonName);
            // Append the new person to the end of the training data.
            trainFile = fopen("train.txt", "a");
            for (i=0; i<newPersonFaces; i++) {
                snprintf(cstr, sizeof(cstr)-1, "data/%d_%s%d.pgm", nPersons+1, newPersonName, i+1);
                fprintf(trainFile, "%d %s %s\n", nPersons+1, newPersonName, cstr);
            }
            fclose(trainFile);
            //break;
        //case 'r':
        }//endif nEigens
        else
        {
            snprintf(text, sizeof(text)-1, "Unauthorised Person");
            cvPutText(shownImg, text, cvPoint(faceRect.x, faceRect.y + faceRect.height + 15), &font, textColor);
            snprintf(text, sizeof(text)-1, "Confidence: %f", confidence);
            cvPutText(shownImg, text, cvPoint(faceRect.x, faceRect.y + faceRect.height + 30), &font, textColor);
        }
    }
    // Give some time for OpenCV to draw the GUI and check if the user has pressed something in the GUI window.
    keyPressed = cvWaitKey(10);
    if (keyPressed == VK_ESCAPE) {   // Check if the user hit the 'Escape' key in the GUI window.
        break;   // Stop processing input.
    }
    cvReleaseImage( &shownImg );
}
tallyFaceRecognizeTime = (double)cvGetTickCount() - timeFaceRecognizeStart;
6.2 Hardware Section
6.2.2.1 AMD BULLDOZER FX-4100
6.2.2.2 BIOSTAR MOTHERBOARD
AM3+ socket
Ethernet port
6.2.2.3 GRAPHICS CARD: Radeon HD 7770 OC
ATI manufactured
1.1 GHz clock
1 GB GDDR5
DirectX 11.1
OpenGL support
Vapour-X technology
6.2.2.4 COOLING SYSTEMS
As we are all aware, computers run very hot when processing large volumes of data, as in the case of face detection and recognition. For increased efficiency we deploy specialised cooling methods such as a copper-tube water assembly or more efficient commercially available variants. Our processor, rated at around 95 W and working 24x7, would otherwise reach extremely high temperatures, so we need a special cooling system rather than the conventional convection-type air cooler that ships with the processor. The graphics card has its own separate cooling arrangement, the Vapour-X technology, whose details we will look into as we go.
1. GPU Cooling
2. Processor Cooling
GPU Cooling
This is the opened view of the Vapour-X cooling device by ATI. Here the water-and-copper-tube assembly conventionally used to cool laptop processors is deployed to cool the GPU core. As all gamers have experienced, an overheated graphics card is a real nuisance: the processor is ready to go on, but the card won't let us game. This technique ensures constant cooling of the GPU core, so we do not need to stop for the card to cool down. The heated fluid rises through the tube by capillary action and is cooled by a conventional heat-sink-and-fan assembly. The working principle is essentially that of a heat pipe.
Processor cooling
The processor needs to be kept at the prescribed temperature to ensure maximum life and proper working. The picture below shows a conventional laptop processor cooling system: since laptops are expected to work at maximum efficiency yet have very little space for a heat-sink assembly, a bulky heat sink as in a desktop is not an option.
Liquid cooling
To ensure maximum cooling efficiency we use liquid cooling as an alternative to conventional cooling; since the system will be up and running 24x7, a reliable cooling system is a must. We deploy the Antec Kühler H2O 620 liquid cooler, as it was the highest-rated cooler in the under-5k price range. A photo of the assembly is given overleaf.
6.2.2.5 Implementation Challenges
In the early stages of project planning we were inclined towards the well-known development board, the BeagleBoard. The platform offers an ARM Cortex-A8, 512 MB RAM, and an HDMI port along with an audio jack, so everything we wanted was built in on board; hence we chose the BeagleBoard as the implementation platform. Later, when we began to actually implement and debug the code on the BeagleBoard, the board did not live up to expectations: there was a 2-5 second lag in the images.
So we had to rethink our strategy. We then thought out of the box and came up with the radical idea of implementing the code on a custom-built PC assembled by us, and by God's grace we succeeded. We built a PC that would work 24x7 without a hitch.
Environmental challenges
In our code we perform histogram matching so that brightness variations in images taken, say, at different times of day or against different backgrounds do not affect our recognition patterns. Even so, environmental factors, mainly light intensity, can make the recognition efficiency drop below 50%, even though we took more than 55 pictures of each subject we want to recognize. Since we use greyscale images for detection and recognition, brightness plays a large role. The alternatives are to use a different algorithm such as AAM, or to install a booth in the clearance area where passengers check in and out, giving us control over the luminance.
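The brightness normalisation mentioned above is done in the OpenCV listings with cvEqualizeHist. The underlying idea, remapping grey levels through the cumulative histogram, can be sketched on a raw 8-bit buffer (a minimal version of the standard CDF remap; OpenCV's implementation differs slightly in its handling of the lowest occupied bin):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Spread 8-bit grey levels across the full 0..255 range by remapping each
// pixel through the cumulative histogram (CDF) of the image.
std::vector<std::uint8_t> equalizeHist(const std::vector<std::uint8_t>& src) {
    std::size_t hist[256] = {0};
    for (std::size_t i = 0; i < src.size(); ++i)
        ++hist[src[i]];

    // Turn the cumulative histogram into a 256-entry lookup table.
    std::uint8_t lut[256];
    std::size_t cdf = 0;
    const double scale = 255.0 / (double)src.size();
    for (int v = 0; v < 256; ++v) {
        cdf += hist[v];
        lut[v] = (std::uint8_t)(cdf * scale + 0.5);
    }

    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = lut[src[i]];
    return dst;
}
```

Applied to every cropped face before training and before recognition, this makes two shots of the same person taken under dim and bright light produce much closer eigenspace coefficients.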
CHAPTER 7 - TEST AND TRIAL
CHAPTER 8 - APPLICATIONS OF THE PROPOSED SYSTEM
8.2 CABS:
Store all offence-related details in one easy-to-use system -- data is entered once and only once.
Integrate with any database -- including other detachments and other applications (RMS, CAD, jail management systems, and "most-wanted" databases).
Link victims to offenders -- to aid in criminal analysis and investigations.
Capture and store digital images of the offender -- encode all mug shots, marks, tattoos, and scars.
Perform rapid and accurate searches -- on all data and image fields, for crime statistics and reporting.
Produce digital line-ups -- using any stored image, in minutes.
Identify previous offenders -- pre-integrated with advanced biometric face recognition software.
Since the terrorist events of September 11, 2001, the world has paid much more
attention to the idea of Homeland Defence, and both governments and private
industries alike are committed to the cause of national defence.
This includes everything from preventing terrorists from boarding aircraft, to
protecting critical infrastructure from attack or tampering (e.g. dams, bridges,
water reservoirs, energy plants, etc.), to the identification of known terrorists.
8.7 Immigration:
Most countries do not want to be perceived as being a "weak link" when it comes
to accepting immigrants and refugees, particularly if that individual uses the new
country as a staging ground for multi-national criminal and terrorist activities.
Consequently, governments around the world are examining their immigration
policies and procedures.
Biometric technology, particularly face recognition software, can enhance the effectiveness of immigration and customs personnel. After all, to the human eye it is often difficult to determine a person's identity by looking at a photo, especially if the person has aged, is of a different ethnic background, has altered their hair style, shaved their beard, etc.
CHAPTER 9 - CONCLUSION
The human face plays an important role in our social interactions, conveying people's identities. Using the human face as a key to security, biometric face recognition technology has received significant attention in the past several years due to its potential for a wide variety of applications, both law-enforcement and otherwise.
Compared with other biometric systems using palm prints, fingerprints, and iris recognition, face recognition has a significant advantage: its non-contact process. Face images can be captured at a distance without touching the person being identified, and identification does not require interaction with the person. In addition, face recognition also serves a crime-deterrent purpose, because face images that have been recorded and archived can later help identify a person.
Our system can identify a human face in a picture or a video. We add a security system at an airport to detect offenders travelling under a fake identity, or to detect multiple passports held by one person; with a system like ours we could easily do these things. But the scope of this system is not limited to this; it could be further improved.
REFERENCES
Gary Bradski and Adrian Kaehler, Learning OpenCV: Computer Vision with the OpenCV Library, O'Reilly Media, 2008.
APPENDIX - 1
COST ESTIMATION