


TOPIC

Certificate
Acknowledgement
Summary
List of Figures
Chapter 1 Introduction
Chapter 2 Different Color Tones
Chapter 3 Algorithm
Chapter 4 Matlab Functions Used
Chapter 5 Threshold Conditions
Chapter 6 Matlab Code
Chapter 7 Experimental Results
Chapter 8 Conclusion
References


This is to certify that the work titled SKIN SEGMENTATION BASED FACE DETECTION submitted by Anurag Gupta in partial fulfillment for the award of degree of B. Tech of Jaypee Institute of Information Technology University, Noida has been carried out under my supervision. This work has not been submitted partially or wholly to any other University or Institute for the award of this or any other degree or diploma.

Signature of Supervisor :
Name of Supervisor : Mrs. Bhawna Gupta
Designation : Senior Lecturer
Date : 19th November 2009


It is indeed a privileged opportunity to express my sincere gratitude to all those who have helped me in completing this project. In completing this work, I avail the opportunity to express my thanks with deep sense of gratitude to my respected supervisor Mrs. Bhawna Gupta whose technical and analytical acumen prompted the genesis of this project. Her esteemed individual guidance and helpful suggestions for improving the presentation of this thesis are certainly invaluable. I am especially grateful to the Faculty of Electronics & Communication Department for all the help provided by them and the resources that they made available without which the project would not have reached its current stage. I would like to thank my parents whose good wishes and silent blessings always remained with me throughout the course of the project. I would also like to thank all the fellow students and all those who have directly or indirectly helped me during this project.

Name of Student : Anurag Gupta
Signature of Student :
Date : 19th November 2009


Face detection is the task of finding the location of human faces in images. This report describes the application of a face detection algorithm to that task. It gives a brief introduction to face detection and its importance, describes the various color modes of images, and then explains in detail the algorithm used and its outcome. A few examples are discussed along with the reasons for not obtaining full accuracy. A brief description of the Matlab functions used is also given.

Signature of Student
Name : Anurag Gupta
Date : 19th November 2009

Signature of Supervisor
Name : Mrs. Bhawna Gupta
Date : 19th November 2009


Fig 2.1 RGB colors
Fig 2.2 RGB Color cube
Fig 2.3 YCbCr Cube
Fig 2.4 HSV mode
Fig 2.5 Double cone model of HSV mode
Fig 3.1 Expected Outcome
Fig 7.1 Original Image
Fig 7.2 Binary mask for YCbCr mode
Fig 7.3 Binary mask for HSV mode
Fig 7.4 Binary mask for RGB mode
Fig 7.5 Original Image
Fig 7.6 Binary mask for YCbCr mode
Fig 7.7 Binary mask for HSV mode
Fig 7.8 Binary mask for RGB mode
Fig 7.9 Cb vs Cr plot


The goal of this project is to take a color digital image and indicate the location of faces in that image. Detection of faces in a digital image has gained much importance in the last decade, with applications in fields such as law enforcement and security. Although detecting a face is an extremely simple task for the human eye, automating it on a computer requires the use of various image processing techniques. Locating and tracking human faces is a prerequisite for face recognition and facial expression analysis, although those stages often simply assume that a normalized face image is available. In order to locate a human face, the system needs to capture an image using a camera and a frame-grabber, and then process the image to locate any faces present in it.

1.1 What is face detection

Face detection is a computer-based technology that determines the location of a face (or faces) in an image regardless of size, color and illumination, ignoring all other constituents of the image. Face detection can be regarded as a specific case of object-class detection. In object-class detection, the task is to find the locations and sizes of all objects in an image that belong to a given class.

1.2 Why face detection is important

Human face perception is currently an active research area in the computer vision community. Face detection is an important area of image processing, being the first step towards face recognition, video surveillance and image database management. It has important applications in biometrics. The task of face detection is trivial for humans, but it is a challenge to enable computers to carry out the same task. Given an image, a face detection algorithm will locate all the faces in the image. There exist different approaches for face detection, the main ones being feature based and image based algorithms. Feature based algorithms use methods like edge detection, skin color and symmetry analysis, while image based algorithms use neural networks.


Different color tones of an image

The study of skin color classification has gained increasing attention in recent years due to the active research in content-based image representation. For instance, the ability to locate an image object as a face can be exploited for image coding, editing, indexing or other user interactivity purposes. Moreover, face localization also provides a good stepping stone in facial expression studies.

An image can be represented in various color tones (color spaces). Each color tone describes a pixel with its own set of parameters, and an image can be converted from one color tone to another. Some of them are RGB, HSV, YCbCr, CMYK (Cyan, Magenta, Yellow, Key/black) and TSL (Tint, Saturation, Lightness). Our emphasis in this project is on the RGB, YCbCr and HSV modes.

2.1 RGB Mode (Red Green Blue)

The RGB color model is an additive color model in which red, green, and blue light are added together in various ways to reproduce a broad array of colors. The name of the model comes from the initials of the three additive primary colors, red, green, and blue. Spectral components of these colors combine additively to produce a resultant color. It is one of the most widely used color spaces for processing and storing digital image data. Zero intensity for each component gives the darkest color (no light, considered black), and full intensity of each gives white. The quality of this white depends on the nature of the primary light sources, but if they are properly balanced, the result is a neutral white matching the system's white point. When the intensities are different, the result is a colorized hue, more or less saturated depending on the difference between the strongest and weakest of the intensities of the primary colors employed.

Fig 2.1 RGB colors

The RGB model can be represented by a 3-dimensional cube with one corner at the origin and the R, G and B components along the three axes. At the origin, R=G=B=0, representing black, while the opposite corner R=G=B=1 represents white. On the color cube red is (1, 0, 0), green is (0, 1, 0) and blue is (0, 0, 1). In a 24-bit color graphics system with 8 bits per color channel, red is (255, 0, 0), green is (0, 255, 0) and blue is (0, 0, 255).

Fig 2.2 RGB Color cube

2.2 YCbCr Mode (Luminance, Blue difference, Red difference)

YCbCr color space has been defined in response to increasing demands for digital algorithms in handling video information, and has since become a widely used model in digital video. It separates RGB (Red-Green-Blue) into luminance and chrominance components: Y is the luminance, and Cb and Cr are the blue-difference (blue minus luminance) and red-difference (red minus luminance) chroma components. YCbCr is not an absolute color space. It is a way of encoding RGB information. The actual color displayed depends on the actual RGB colorants used to display the original image. YCbCr mode is used for image compression work.


Y has an excursion of 219 and an offset of +16. This coding places black at code 16 and white at code 235. Cb and Cr have an excursion of ±112 and an offset of +128, producing a range from 16 to 240 inclusive (code 128 represents zero).
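The excursions and offsets above can be checked with a small hand-written conversion. This is a minimal sketch that assumes the standard ITU-R BT.601 studio-range coefficients; the helper name rgb_to_ycbcr is made up for illustration and is not part of the project code, which uses the toolbox function rgb2ycbcr instead.

```matlab
% Illustrative BT.601 studio-range conversion (assumed coefficients).
% rgb: 1-by-3 vector of 8-bit R, G, B values; result: [Y Cb Cr].
% Columns of the matrix hold the Y, Cb and Cr weights respectively.
rgb_to_ycbcr = @(rgb) [16 128 128] + (double(rgb) / 255) * ...
    [ 65.481  -37.797  112.000;
     128.553  -74.203  -93.786;
      24.966  112.000  -18.214];

black = rgb_to_ycbcr([0 0 0]);        % Y at its offset 16, chroma at 128
white = rgb_to_ycbcr([255 255 255]);  % Y = 16 + 219, chroma still 128
blue  = rgb_to_ycbcr([0 0 255]);      % Cb at its upper limit 128 + 112
red   = rgb_to_ycbcr([255 0 0]);      % Cr at its upper limit 128 + 112
```

The Y weights sum to 219 and each chroma row of weights sums to zero, which is why every gray level lands on Cb = Cr = 128.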

Fig 2.3 YCbCr Cube

2.3 HSV/HSI mode (Hue Saturation Value/Intensity)

HSV stands for hue, saturation and value. Hue is one of the main properties of a color. It represents a pure color (without any tint or shade) and corresponds to a position on the color wheel.

Fig 2.4 HSV mode


Saturation is the difference of a color against its own brightness. The saturation of a color is determined by a combination of light intensity and how much it is distributed across the spectrum of different wavelengths. The purest color is achieved by using just one wavelength at a high intensity, such as in laser light. If the intensity drops, so does the saturation. Intensity is the brightness or dullness of a hue. One may lower the intensity by adding white or black.

Many applications use the HSI color model. Machine vision uses HSI color space in identifying the color of different objects. Image processing applications such as histogram operations, intensity transformations and convolutions operate only on an intensity image. These operations are performed with much ease on an image in the HSI color space.

Fig 2.5 Double cone model of HSV mode


The hue (H) is represented as the angle, varying from 0° to 360°. Saturation (S) corresponds to the radius, varying from 0 to 1. Intensity (I) varies along the z axis with 0 being black and 1 being white. When S = 0, the color is a gray value of intensity I. When S = 1, the color is on the boundary of the top cone base. The greater the saturation, the farther the color is from white/gray/black (depending on the intensity). Adjusting the hue will vary the color from red at 0°, through green at 120° and blue at 240°, and back to red at 360°. When I = 0, the color is black and therefore H is undefined. When S = 0, the color is grayscale and H is also undefined. By adjusting I, a color can be made darker or lighter. By maintaining S = 1 and adjusting I, shades of that color are created.
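The hue angles quoted above can be reproduced with Matlab's rgb2hsv. This is a minimal sketch; note that rgb2hsv returns hue scaled to the interval 0 to 1 and works with value (V) rather than intensity (I), so the hue is multiplied by 360 here to obtain degrees. The variable names are illustrative only.

```matlab
primaries = [1 0 0; 0 1 0; 0 0 1];   % red, green, blue as RGB rows in [0,1]
hsvMap = rgb2hsv(primaries);         % columns: H, S, V, each in [0,1]
hueDeg = 360 * hsvMap(:,1);          % hue as an angle: red, green, blue

gray = rgb2hsv([0.5 0.5 0.5]);       % a gray level: saturation is 0, so
                                     % the returned hue carries no meaning
```

For the three primaries the saturation and value columns are all 1, so only the hue column distinguishes them.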

There are various algorithms for detecting faces, including skin color based algorithms. Two major algorithms for face detection are:

1. Human Face Detection in Cluttered Color Images Using Skin Color and Edge Information
2. Face Detection Based on Skin Colors Using Neural Networks

Color is an important feature of human faces. Using skin-color as a feature for tracking a face has several advantages. Color processing is much faster than processing other facial features. Under certain lighting conditions, color is orientation invariant. This property makes motion estimation much easier because only a translation model is needed for motion estimation.


The algorithm used in this project is Skin Segmentation Based Face Detection. The steps involved in the algorithm are:
1. Skin Pixel Classification: Different color spaces used in skin detection include HSV, RGB, and YCbCr. The image is converted from RGB mode to HSV and YCbCr modes. The values of the individual parameters in all three color modes are compared with the threshold values of each mode for skin pixel classification. Based on these threshold values a pixel is classified as skin (pixel=1) or not skin (pixel=0) in a new skin matrix. Then the skin tone pixels from all three modes are combined to get the skin tone region of the image.

2. Connectivity Analysis: Using the skin detected image, we know whether a pixel is a skin pixel or not, but cannot say anything about whether a pixel belongs to a face or not. We have to group pixels that are connected to each other geometrically. We group the skin pixels in the image based on an 8-connected neighborhood, i.e. if a skin pixel has another skin pixel in any of its 8 neighboring places, then both pixels belong to the same region. At this stage, we have different regions and we have to classify each of these regions as a human face or not. This is done by finding the ratio of height to width of the region and comparing it with given threshold values, as well as the percentage of skin in the rectangular area defined by the above parameters. To find the height, we locate the first and the last rows of the skin matrix which contain skin pixels of a connected segment. To obtain the width, we locate the leftmost and rightmost columns in the skin matrix which contain skin pixels of the connected segment. We can imagine a rectangle at the boundary of each connected segment.

3. Edge Information: Inside the rectangle obtained from the connectivity analysis, we determine the percentage of skin in that region and compare it with the threshold value. If all the above conditions are satisfied then we mark the region as a face.
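The connectivity step can be illustrated on a tiny binary matrix. This is a minimal sketch that assumes the Image Processing Toolbox function bwlabel is available; the matrix BW and the variable names are made up for illustration. It shows why the choice of an 8-connected neighborhood matters and how the bounding rows and columns of one segment can be read off.

```matlab
% Toy skin matrix: two pixels touching only diagonally, plus a vertical pair.
BW = [1 0 0 0;
      0 1 0 0;
      0 0 0 1;
      0 0 0 1];

[L8, num8] = bwlabel(BW, 8);   % diagonal neighbors are connected
[L4, num4] = bwlabel(BW, 4);   % diagonal neighbors are not connected

% Bounding rows and columns of the 8-connected segment containing pixel (1,1)
[rows, cols] = find(L8 == L8(1,1));
len = max(rows) - min(rows);   % last row minus first row, as in the report
wid = max(cols) - min(cols);   % last column minus first column
```

With 8-connectivity the diagonal pair forms a single segment, while with 4-connectivity it splits into two, so the segment count rises from 2 to 3.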

Proposed algorithm

1. Convert the input RGB image (rgb(i,j)) into an HSV image (hsv(i,j)) and a YCbCr image (ycbcr(i,j))
2. Extract the skin tone pixels from all 3 image modes
3. Calculate the connected components for each segment
4. Calculate the ratio of length : width for each segment
5. Calculate the percentage of skin in the rectangle around each segment
6. Mark the region as a face


Expected outcome

Fig 3.1 Expected outcome



1. IMREAD:- A = IMREAD(FILENAME,FMT) reads a grayscale or color image from the file specified by the string FILENAME. If the file is not in the current directory, or in a directory on the MATLAB path, specify the full pathname. The return value A is an array containing the image data. If the file contains a grayscale image, A is an M-by-N array. If the file contains a true color image, A is an M-by-N-by-3 array. For TIFF files containing color images that use the CMYK color space, A is an M-by-N-by-4 array. The class of A depends on the bits-per-sample of the image data, rounded to the next byte boundary. For example, IMREAD returns 24-bit color data as an array of uint8 data because the sample size for each color component is 8 bits.

2. RGB2HSV:- H = RGB2HSV(M) converts an RGB color map to an HSV color map. Each map is a matrix with any number of rows, exactly three columns, and elements in the interval 0 to 1. The columns of the input matrix, M, represent intensity of red, green and blue, respectively. The columns of the resulting output matrix, H, represent hue, saturation and color value, respectively. HSV = RGB2HSV(RGB) converts the RGB image RGB (3-D array) to the equivalent HSV image HSV (3-D array).


3. RGB2YCBCR:- YCBCRMAP = RGB2YCBCR(MAP) converts the RGB values in MAP to the YCBCR color space. MAP must be an M-by-3 array. YCBCRMAP is an M-by-3 matrix that contains the YCBCR luminance (Y) and chrominance (Cb and Cr) color values as columns. Each row represents the equivalent color to the corresponding row in the RGB color map. YCBCR = RGB2YCBCR(RGB) converts the true color image RGB to the equivalent image in the YCBCR color space. RGB must be an M-by-N-by-3 array.

4. BWLABEL:- L = BWLABEL(BW,N) returns a matrix L, of the same size as BW, containing labels for the connected components in BW. N can have a value of either 4 or 8, where 4 specifies 4-connected objects and 8 specifies 8-connected objects; if the argument is omitted, it defaults to 8. The elements of L are integer values greater than or equal to 0. The pixels labeled 0 are the background. The pixels labeled 1 make up one object, the pixels labeled 2 make up a second object, and so on. [L,NUM] = BWLABEL(BW,N) returns in NUM the number of connected objects found in BW.



5.1 RGB Mode

Threshold conditions used to extract the skin region from an image in RGB color tone are:

R > 95 and G > 40 and B > 20
max{R,G,B} - min{R,G,B} > 15
|R - G| > 15 and R > G and R > B

where R, G and B are the individual components of each pixel. All three conditions have to be satisfied simultaneously for a pixel to be classified as skin; otherwise it is classified as non-skin.
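The three RGB conditions can also be evaluated for a whole image at once with logical matrices instead of nested loops. This is a minimal sketch on a synthetic 2-by-2 image whose pixel values are made up for illustration; the channels are converted to double so that the subtraction R - G cannot saturate at zero as it would for uint8 data.

```matlab
% Synthetic 2-by-2 RGB image (uint8), built channel by channel.
img = uint8(cat(3, [200 100; 90 250], ...   % red channel
                   [120 100; 60 250], ...   % green channel
                   [ 90 100; 40 250]));     % blue channel

R = double(img(:,:,1));
G = double(img(:,:,2));
B = double(img(:,:,3));

rule1 = R > 95 & G > 40 & B > 20;
rule2 = max(max(R, G), B) - min(min(R, G), B) > 15;
rule3 = abs(R - G) > 15 & R > G & R > B;

rgb_skin = rule1 & rule2 & rule3;   % logical skin mask, same size as the image
```

Only the top-left pixel (200, 120, 90) passes all three rules. The two gray pixels fail the max-min spread test, and the bottom-left pixel fails R > 95.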

5.2 YCbCr Mode

Threshold conditions used to extract the skin region from an image in YCbCr mode are:

133 < Cr < 177
90 < Cb < 130

where Cr and Cb represent the red-difference and blue-difference components of each pixel.
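As a small illustration, the chroma thresholds can be applied to whole Cb and Cr planes in one expression. This sketch uses made-up chroma values in place of the planes that rgb2ycbcr would return for a real image.

```matlab
% Synthetic chroma planes (uint8), standing in for img1(:,:,2) and img1(:,:,3)
Cb = uint8([110 128;  95 140]);
Cr = uint8([150 128; 176 150]);

% Strict inequalities, exactly as stated in the threshold conditions
ycbcr_skin = Cr > 133 & Cr < 177 & Cb > 90 & Cb < 130;
```

Here the first column passes both tests, the neutral pixel (Cb = Cr = 128) fails the Cr test, and the last pixel fails the Cb test.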

5.3 HSV mode

In HSV color mode, a pixel is classified as skin if:

0 < H < 50
0.23 < S < 0.68

where H (in degrees) and S are the values of hue and saturation for each pixel.
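A short sketch of the HSV test follows, using a made-up 1-by-2 image with one skin-like and one bluish pixel. It relies on rgb2hsv returning hue in the interval 0 to 1, so the hue is scaled by 360 before the comparison in degrees.

```matlab
% 1-by-2 RGB image with values in [0,1]: pixel 1 = (0.8, 0.6, 0.5),
% pixel 2 = (0.2, 0.4, 0.9)
rgbImg = cat(3, [0.8 0.2], [0.6 0.4], [0.5 0.9]);

hsvImg = rgb2hsv(rgbImg);
H = 360 * hsvImg(:,:,1);    % hue in degrees
S = hsvImg(:,:,2);          % saturation in [0,1]

hsv_skin = H > 0 & H < 50 & S > 0.23 & S < 0.68;
```

The first pixel has a hue of about 20 degrees and a saturation of 0.375, so it is accepted. The second has a hue well above 200 degrees and is rejected.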


5.4 Golden Ratio

To check whether a skin region has a length : width ratio in the range expected of a face, we define a term called the Golden Ratio:

GR = (1 + sqrt(5)) / 2

We also define a tolerance level:

T = 0.65

A region is categorized as a face if:

GR - T < Ratio (length : width) < GR + T
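Numerically, GR is about 1.618, so the accepted ratios run from roughly 0.968 to 2.268. The check can be sketched as below; the helper name isFaceShaped and the sample dimensions are illustrative assumptions, not part of the project code.

```matlab
GR = (1 + sqrt(5)) / 2;     % golden ratio, about 1.618
T  = 0.65;                  % tolerance

% True when the length : width ratio lies strictly inside (GR - T, GR + T)
isFaceShaped = @(len, wid) (len / wid > GR - T) & (len / wid < GR + T);

tall   = isFaceShaped(160, 100);   % ratio 1.6, inside the range
wide   = isFaceShaped(100, 160);   % ratio 0.625, below the range
square = isFaceShaped(100, 100);   % ratio 1.0, just inside the lower bound
```

Because the tolerance is wide, a square region is still accepted, which is one reason the skin percentage test is applied as well.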

5.5 Skin Threshold

Inside the rectangular region that satisfies the above conditions, the percentage of skin must be above a certain level for the region to be marked as a face. This threshold value is 0.56 (56%).
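The skin percentage of a segment can be sketched as follows, on a made-up 4-by-4 segment. One assumption to note: this sketch counts every pixel of the bounding rectangle (inclusive of both edges), whereas the project code divides by length times width computed as last minus first, so the two values differ slightly for small regions.

```matlab
% Toy connected segment: a rounded blob inside a 4-by-4 bounding rectangle
region = [0 1 1 0;
          1 1 1 1;
          1 1 1 1;
          0 1 1 0];

[rows, cols] = find(region);
rect = region(min(rows):max(rows), min(cols):max(cols));   % bounding rectangle

skinFraction = nnz(rect) / numel(rect);   % skin pixels / rectangle pixels
passes = skinFraction > 0.56;             % compare with the skin threshold
```

Twelve of the sixteen rectangle pixels are skin, giving a fraction of 0.75, which is above the threshold.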



clc; close all;
img = imread('test01.jpg');      % img is an m-by-n-by-3 uint8 matrix
imshow(img), title('original');
m = size(img,1);                 % number of rows
n = size(img,2);                 % number of columns
p = size(img,3);                 % 3 values (RGB) for each pixel
img1 = rgb2ycbcr(img);
ycbcr_skin = zeros(m,n);
% r, g and b are m-by-n matrices holding the red, green and blue values of each pixel
r = img(:,:,1);
g = img(:,:,2);
b = img(:,:,3);
% y, cb and cr values of each pixel after converting the image to ycbcr color tone
y = img1(:,:,1);
cb = img1(:,:,2);
cr = img1(:,:,3);
%figure; imshow(img1);           % display the ycbcr image
% threshold values of cb and cr for the skin region
cr1 = 137;                       % note: Section 5.2 states 133 as the lower Cr bound
cr2 = 177;
cb1 = 90;
cb2 = 130;
% compare every pixel with the cb and cr thresholds and assign 1 to skin pixels
for i = 1:m
    for j = 1:n
        if (cr(i,j)>cr1 && cr(i,j)<cr2 && cb(i,j)>cb1 && cb(i,j)<cb2)
            ycbcr_skin(i,j) = 1;
        end
    end
end
% matrix with the original (red channel) value for skin pixels and 0 elsewhere
final_ycbcr = zeros(m,n);
for i = 1:m
    for j = 1:n
        if (ycbcr_skin(i,j)==1)
            final_ycbcr(i,j) = img(i,j,1);
        end
    end
end
figure;
subplot(221), imshow(final_ycbcr), title('binary mask for ycbcr mode');
%%%%%%% for the hsv mode
img1 = rgb2hsv(img);
m = size(img1,1);
n = size(img1,2);


h = zeros(m,n);
s = zeros(m,n);
v = zeros(m,n);
for i = 1:m
    for j = 1:n
        h(i,j) = 360*img1(i,j,1);    % rgb2hsv returns hue in [0,1]; convert to degrees
    end
end
s = img1(:,:,2);
v = img1(:,:,3);
max(max(s))
hsv_skin = zeros(m,n);
for i = 1:m
    for j = 1:n
        if (h(i,j)>0 && h(i,j)<50 && .23<s(i,j) && .68>s(i,j))
            hsv_skin(i,j) = 1;
        end
    end
end
final_hsv = zeros(m,n);
for i = 1:m
    for j = 1:n
        if (hsv_skin(i,j)==1)
            final_hsv(i,j) = img(i,j,1);
        end
    end
end
subplot(222), imshow(hsv_skin), title('binary mask for hsv mode');
%%%%% from rgb mode
%r1=180; %r2=249; %g1=130; %g2=195; %b1=120; %b2=215;
rgb_skin = zeros(m,n);
final_rgb = zeros(m,n);
for i = 1:m
    for j = 1:n
        if (r(i,j)>95 && g(i,j)>40 && b(i,j)>20)
            if (max(max(r(i,j),g(i,j)),b(i,j)) - min(min(r(i,j),g(i,j)),b(i,j)) > 15)
                % |R-G| > 15 as in Section 5.1; uint8 subtraction is safe here because r > g is also required
                if (abs(r(i,j)-g(i,j))>15 && r(i,j)>g(i,j) && r(i,j)>b(i,j))
                    rgb_skin(i,j) = 1;
                end
            end
        end
    end
end
for i = 1:m
    for j = 1:n
        if (rgb_skin(i,j)==1)
            final_rgb(i,j) = img(i,j,1);
        end
    end
end
subplot(223), imshow(final_rgb), title('binary mask for rgb mode');
% combine the skin masks of the three color modes
combo_skin = zeros(m,n);
for i = 1:m
    for j = 1:n
        combo_skin(i,j) = ycbcr_skin(i,j)+rgb_skin(i,j)+hsv_skin(i,j);
    end
end
noofskinpixels = 0;
final_combo = zeros(m,n);
for i = 1:m
    for j = 1:n
        if combo_skin(i,j)~=0
            final_combo(i,j) = img(i,j,1);
            noofskinpixels = noofskinpixels+1;
        end
    end
end
subplot(224), imshow(final_combo), title('binary mask of all 3 color modes');

%%%%% to calculate the number of connected components in the final skin tone image obtained
L = bwlabel(combo_skin,8);       % function names are case sensitive in Matlab
L1 = max(max(L))
impixelinfo;                     % replaces the obsolete pixval('ON')

%%%% to count the number of pixels in each connected segment
count = zeros(1,max(max(L)));
for k = 1:L1
    for i = 1:m
        for j = 1:n
            if (k==L(i,j))
                count(k) = count(k)+1;
            end
        end
    end
end
count

% this loop is to see which portion of the image is selected at a particular value of k
k = 61;
tempimage = zeros(m,n);
for i = 1:m
    for j = 1:n
        if (L(i,j)==k)
            tempimage(i,j) = 1;
        end
    end
end
%figure; imshow(tempimage);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% ratio is the array to store the ratio of length and width of each segment
ratio = zeros(1,L1);
x1 = zeros(1,L1);
x2 = zeros(1,L1);
y1 = zeros(1,L1);
y2 = zeros(1,L1);
length = zeros(1,L1);
width = zeros(1,L1);
for k = 1:L1
    % row-by-row scan: first and last rows containing pixels of segment k
    temp = 0;
    for i = 1:m
        for j = 1:n
            if (k==L(i,j))
                temp = temp+1;
                if (temp==1)
                    x1(k) = i;
                end
                if (temp==count(k))
                    x2(k) = i;
                end
            end
        end
    end
    % column-by-column scan: leftmost and rightmost columns of segment k
    temp = 0;
    for j = 1:n
        for i = 1:m
            if (k==L(i,j))
                temp = temp+1;
                if (temp==1)
                    y1(k) = j;
                end
                if (temp==count(k))
                    y2(k) = j;
                end
            end
        end
    end
    length(k) = x2(k)-x1(k);
    width(k) = y2(k)-y1(k);
    if (length(k) ~= 0 && width(k) ~= 0)
        ratio(k) = length(k)/width(k);
    end
end
ratio
temp = 0;


% fraction of skin pixels inside the bounding rectangle of each segment
skinthresh = zeros(1,L1);
for k = 1:L1
    if (length(k)~=0 && width(k)~=0)
        skinthresh(k) = count(k)/(length(k).*width(k));
    end
end
skinthresh

%%% we define a golden ratio (gr) and tolerance (t) to check whether a given segment has
%%% the dimensions of a face or not
finaloutput = zeros(m,n);
gr = (1+sqrt(5))/2
t = .65;
gr1 = gr+t
gr2 = gr-t
figure;
imshow(img);
hold('on')
for k = 1:L1
    if (gr2<=ratio(k) && ratio(k)<=gr1 && skinthresh(k)>=.56 && count(k)>100)
        % draw a red rectangle around the segment: [x y width height]
        rectangle('position',[y1(k),x1(k),width(k),length(k)],'EdgeColor','red');
        k
        count(k)
        ratio(k)
        for i = 1:m
            for j = 1:n
                if (L(i,j)==k)
                    finaloutput(i,j) = 1;
                end
            end
        end
    end
end
figure;
imshow(finaloutput);
%%% below is code for the cb vs cr graph for the skin region
cb1 = zeros(1,noofskinpixels+1);
cr1 = zeros(1,noofskinpixels+1);
y1 = zeros(1,noofskinpixels+1);
temp = 1;
for i = 1:m
    for j = 1:n
        if (combo_skin(i,j)~=0)      % every pixel flagged as skin by at least one mode
            cb1(temp) = cb(i,j);
            cr1(temp) = cr(i,j);
            y1(temp) = y(i,j);
            temp = temp+1;
        end
    end
end


%figure; stem(cb1,'b.'); title('cb');
%figure; stem(cr1,'b.'); title('cr');
%figure; stem(y1,'b.'); title('y');
figure
stem(cb1,cr1,'r.');              % Cb vs Cr values of the skin pixels (Fig 7.9)
%figure; stem(cb1,'r.'); hold('on'); stem(cr1,'g.'); stem(y,'b.');



We will now discuss the results obtained from the three different color modes.

Fig 7.1 - Original Image

Fig 7.2 - Binary mask for YCbCr mode

Fig 7.3 - Binary mask for HSV mode


Fig 7.4 - Binary mask for RGB mode

Fig 7.5 - Original Image

Fig 7.6 - Binary Mask for YCbCr Mode

Fig 7.7 - Binary Mask for HSV Mode


Fig 7.8 - Binary Mask for RGB Mode

Using the combination of the three color tones, we are able to extract almost all of the skin region from an image and pick out the faces. The accuracy of the algorithm is nevertheless not 100%, because of structural components such as beards and glasses and because of occlusion (a face may be partially covered by another object).

Below is a plot of Cb and Cr values of an image. The highly clustered region represents the skin region. There is a strong correlation between Cb and Cr values for skin pixels.

Fig 7.9 Cb vs Cr plot

The outputs demonstrate that the system works well for images in which the faces are full, upright, and facing towards the front. The false positives primarily occurred on large non-face body parts such as arms and legs. The false negatives were typically due to an obstructed face or to variations that caused separations in the face, such as sunglasses.



A face detection algorithm is proposed that combines skin pixels from different color tones. The algorithm is able to detect faces at different scales in a given image, as well as slightly tilted faces. However, it has a hard time breaking apart overlapping faces if they are too near each other vertically. Future work could include the use of rotated eigen images or the implementation of a neural network or a linear classifier as a secondary detection scheme.



V. Vezhnevets, V. Sazonov and A. Andreeva: A survey on pixel-based skin color detection techniques, Graphics Media Laboratory, Faculty of Computational Mathematics and Cybernetics, Moscow State University, Moscow, Russia.

V. Nabiyev and A. Günay: Towards a biometric purpose image filter according to skin detection, Department of Computer Engineering, KTU, Trabzon, Turkey.

K. Sandeep and A. Rajagopalan: Human face detection in cluttered color images using skin color and edge information, Department of Electrical Engineering, IIT Madras.