
Discovering digital identities through face recognition on mobile devices

Gerry Hendrickx
Faculty of Engineering, Department of Computer Science
Katholieke Universiteit Leuven
Hendrickx.gerry@gmail.com

Abstract
This paper covers the current status and past work on the creation of an iPhone application. The application uses face recognition provided by Face.com [1] to recognize faces captured with the camera of the iPhone. The recognized faces are named, giving the user access to information about the person from different social networks: his digital identity. This paper describes the research, the related work, the user interface prototyping and the current state of the implementation. It covers three iterations of paper prototyping, in which a user interface is established, and the initial implementation of the application.

1. INTRODUCTION
The internet has revolutionized the way people interact with each other. Different social networks have been created to support different levels of communication. The presence of a person on these networks is called his digital identity. The goal of this thesis is to find an efficient way to discover the digital identity of a person by using face recognition and a smartphone. This paper describes the process of creating a face recognition application for the iPhone. The application uses face recognition to offer the user extra information about the persons seen through the camera. This extra information comes from various social networks, the most important being Facebook, Twitter and Google+. It aims to offer users access to data and information that is publicly available on the internet. This can be used in a private context, to enhance conversations and find common ground with a discussion partner, or in an academic context, to easily find information such as the slides or publications of the speaker at an event you are attending. The app will be created for iOS because the SDK has offered a built-in face detection mechanism since iOS 5 [2]. Android, the other option, does not have this feature built in.

A brainstorm and a survey resulted in a list of requirements for the face recognition application. The survey asked which information users would like to get if they were able to recognize a person by using a smartphone. It was answered by 34 persons. 14 of the 34 voters would respect the privacy of the recognized person and thus would not want any information. 9 wanted contact information, 6 wanted links to the social network profiles of the recognized person and 3 wanted the last status update on Facebook or Twitter. The 2 remaining votes went to pictures of the recognized person and the location where they last met. There was a strong need for privacy, so a policy was decided upon. The app could be limited to recognizing the user's Facebook friends, but the need to recognize your Facebook friends is lower than the need to recognize strangers. To broaden the scope of recognizable people, the other users of the application will also be recognizable. The general policy will be: if you use the application, you can be recognized. An eye for an eye.

The brainstorm resulted in a list of functionalities and characteristics. First of all, the application should work fast. Holding your smartphone up and pointed at another person is quite cumbersome, so the faster the recognition works, the smaller this problem becomes. The information about the person should be displayed in augmented reality (AR) [3], a technology that augments a user's view of the world. AR can add extra information to the screen based on GPS data or image processing; it could place information about a person around his face in real time. The functionality requirements from the poll and brainstorm, and thus the goals to achieve efficient discovery of digital identities using face recognition, are the following:

- Detection and recognition of all the faces in the field of view. Once a face is recognized, the name and other options (such as available social networks) appear on screen with the face.
- Contact info is fetched from the networks and can be saved in the contacts app.
- Quick access to all social networks, along with some basic information such as the last status update or tweet. The information available will differ from person to person.
- An extra layer of privacy settings: a user can link his networks and choose to enable or disable them. When the user gets recognized, only his enabled networks show up. (A minimal data-model sketch of this privacy mechanism follows the list.)
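The last requirement above suggests a small per-user data model. The sketch below is one possible shape for it, not the thesis's actual design; the class and property names are illustrative only.

```objc
#import <Foundation/Foundation.h>

// Hypothetical sketch of the "link networks, enable or disable them"
// requirement. All names are illustrative, not taken from the thesis.
@interface DIProfile : NSObject
// Networks the user linked, keyed by name, e.g. @"facebook" -> profile URL.
@property (nonatomic, strong) NSMutableDictionary *linkedNetworks;
// Networks the user switched off; hidden when he is recognized.
@property (nonatomic, strong) NSMutableSet *disabledNetworks;
- (NSDictionary *)visibleNetworks;
@end

@implementation DIProfile
- (id)init {
    if ((self = [super init])) {
        _linkedNetworks = [NSMutableDictionary dictionary];
        _disabledNetworks = [NSMutableSet set];
    }
    return self;
}
// Only the enabled networks are shown to someone who recognizes this user.
- (NSDictionary *)visibleNetworks {
    NSMutableDictionary *visible = [self.linkedNetworks mutableCopy];
    [visible removeObjectsForKeys:[self.disabledNetworks allObjects]];
    return visible;
}
@end
```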

2. RELATED WORK

For this master's thesis we searched for related mobile applications. No existing application was found that does exactly the same as this project. However, some similarities were found with the following applications:

- Viewdle [5]: Viewdle is a company focusing on face recognition. They have several ongoing projects and have already created an application for Android, called Viewdle Social Camera, which recognizes faces in images based on Facebook. Viewdle has a face recognition iOS SDK available.
- Animetrics [6]: Animetrics creates applications to help government and law enforcement agencies. It has multiple products, such as FaceR MobileID, which can be used to get the name of any random person along with the percentage of the match, and FaceR CredentialME, which can be used for authentication on your own smartphone: it recognizes your face and, if it matches, unlocks your data. Animetrics also focuses on face recognition for home security. However, they do not seem to have a public API, since their focus is not on the commercial market.
- TAT Augmented ID [7]: TAT Augmented ID is a concept app. It recognizes faces in real time and uses augmented reality to display icons around the face. The concept is the same as ours, but the resulting user interface is different. Section 3 discusses why a fully augmented user interface is not preferred on mobile devices.

Another non-commercial related work is a 2010 master's thesis at the Katholieke Universiteit Leuven [10]. The author created a head-mounted-display-based application to recognize faces and retrieve information. From his work we learned that HMDs are not a practical setup (his app required a backpack with a laptop and heavy headgear) and that the technology used (OpenGL) is a cumbersome way to develop. Using iOS simplifies both aspects. The author used Face.com as the face recognition API and was very satisfied with it.

A comparison of face recognition APIs was made in order to find the one most suited to our goals. A quick summary of the positive and negative points:

- Viewdle: As said above, we tried to contact Viewdle to get more information about the API. Sadly, they did not respond, so Viewdle is not an option.
- Face.com: Face.com offers a well-documented REST API. It offers Facebook and Twitter integration and a private namespace that allows the application to apply the previously stated privacy policy. There is a rate limit on the free version of Face.com.
- Betaface [8]: The only API that works with both images and video. However, it is Windows-only, has not been used with iOS yet and is not free.
- PittPatt [9]: PittPatt was a very promising service, but sadly it was acquired by Google. The service cannot be used at this time.

Face.com turned out to be not only the sole remaining option, but also the best one found. It has an iOS SDK and social media integration, both very useful in the context of this application.

3. PAPER PROTOTYPING
Paper prototyping is the process of designing the user interface based on quick drawings of all its different parts [11]. Because the parts are made of paper, it is easy to quickly evaluate and adapt the interface. The prototyping phase consisted of three iterations: the interface decision, the interface evaluation and the expert opinion.

3.1. Phase one: interface decisions


The first phase of the paper prototyping was to decide which of the three candidate interfaces would be used. The interfaces were:

- Interface 1: A box of information attached to the head of the recognized person. This makes the best use of augmented reality, but you have to keep your phone pointed at the person in order to read his information. See figure 1a.
- Interface 2: A sidebar with information, taking about a quarter of the screen. This way users can lower their phone once a person is selected, but can still use the camera if they want. See figure 1b.
- Interface 3: A full screen information window. This makes minimal use of augmented reality but offers a practical way to view the information. Users see the name of the recognized person in the camera view and, once it is tapped, are referred to the full screen information window. See figure 1c.

These interfaces were evaluated with 11 test subjects, aged 18 to 23, with mixed smartphone experience. The tests used the think-aloud technique, meaning the subjects had to say what they thought was going to happen when they clicked a button, while the interviewer played computer and changed the screens. The same simple scenario was given for all interfaces, in which the test subject needed to recognize a person and find information about him. After the test, a small custom-made questionnaire was given to poll the interface preference.

None of the users picked interface 1 as their favourite. The fact that you have to keep your phone pointed at the person in order to read and browse through the information proved to be a big disadvantage. The choice between interfaces 2 and 3 was not unanimous: 27% chose interface 2 and 73% chose interface 3. Interface 3 was therefore chosen and elaborated for the second phase of paper prototyping. People liked the second idea, where you could still see the camera, but then realized that if interface 1 taught us that you would not keep your camera pointed at the crowd, you would not do so in interface 2 either, so the camera would end up showing your table or trousers. This reduces the usefulness of the camera feed. The smartphone owners also pointed out that using only part of the screen would leave too little room to display readable information.

Figure 1: The different interfaces: (a) Interface 1, (b) Interface 2, (c) Interface 3.
3.2. Phase two: interface evaluation


For the second phase, 10 test subjects between the ages of 20 and 23 were used, again with mixed smartphone experience. An extended scenario was created that explored the full scope of the application's functionality. The testers needed to recognize people, adjust settings, link social networks to their profile, indicate false positives and manage the recognition history.


The think-aloud technique was applied once again. At the end of each prototype test, the test subject filled in a USE questionnaire [4]. This questionnaire consists of 30 questions, divided into 4 categories, polling different aspects of the evaluated application. The questions are answered on a 7-point Likert rating scale, 1 representing "totally disagree" and 7 "totally agree". The categories are usefulness, ease of use, ease of learning and satisfaction.

3.2.1. Usefulness. The overall results were good. The question about whether the application is useful received a median score of 5 out of 7, with no result below 4. People seem to understand why the app is useful, and it does what they expect. The scores on whether it meets the users' needs were divided, ranging from 1 to 7, because some users (especially those with privacy concerns or without a smartphone) did not see the usefulness of the application. The target audience, the smartphone users, did see its usefulness, resulting in higher scores in this section.

3.2.2. Ease of use. The ease of use questions made clear that buttons needed to be added or rearranged. Users complained about the number of screen transitions it took to get from one screen to another in the application, and this showed in the results. The question whether using the application is effortless received scores ranging from 2 to 7, with a median of 5. The paths through the application should be revised to find and correct missing links and improve the navigation. For instance, a home button leading to the main screen will be added on several screens, instead of users having to navigate back through all the previous screens. Otherwise, using the application was effortless for all iPhone users, because the user interface was built from the standard iOS interface parts.

3.2.3. Ease of learning. None of the ease of learning questions scored below 5 on the 7-point Likert rating scale. This is also due to the standard iOS interface, which Apple designed to be easy to work with and easy to learn.

3.2.4. Satisfaction. Most people were satisfied with the functionality offered by the application and how it was presented in the user interface. Especially the iPhone users were very enthusiastic, calling it an innovative, futuristic application. The question whether the user was satisfied with the application received scores ranging from 5 to 6, with a median of 6 out of 7. Non-smartphone users were more skeptical and did not see the need for such an application. Aside from this, the application was fun to use and the target audience was satisfied.

3.2.5. Positive and negative aspects. The users were asked to list the positive and negative aspects of the application. The positive aspects were the iOS style of working and the functionality and concept of the application. The negative aspects were more user interface related, such as too few home buttons and the suggested method to indicate a false positive. The button for this was placed on the full screen information window of a person. Everybody agreed that this was too late, because all the data of the wrongfully tagged person would by then already be displayed. The incorrect-tag button should therefore be placed on the camera view. Some useful ideas were suggested as well, such as enabling the user to follow a recognized person on Twitter.

3.3. Phase three: expert opinions


For this phase, the supervisor and 6 advisers of the thesis took the paper prototype test. The prototype was adjusted to the results of the second iteration: more home buttons were added and the incorrect-tag button was moved to the camera view. The test subjects all had extensive experience in the field of human-computer interaction and can thus be regarded as experts. They took the tests, filled in the same questionnaire and gave their opinion on several aspects of the program. A small summary:

- There were concerns about the image quality of the different iPhones. Tests should be done to determine from what distance a person can be recognized.
- The application should be modular enough. In a rapidly evolving web 2.0 world, social networks may need to be added to or removed from the application. If all the networks are implemented as modules, this becomes a simpler task (see the protocol sketch after this list).
- The incorrect-tag button could be implemented in the same way the iPhoto application asks the user whether an image is tagged correctly.
- The social network information should not just be static: the user should be able to interact directly from within the application. If this is not possible, it would be better to refer the user directly to the Facebook or Twitter app.
- More info could be displayed in the full screen information window. Instead of showing only links to all networks, the general information about the person could already be displayed there.
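To make the modularity suggestion concrete, each network could sit behind a single protocol. This is a sketch of one possible interface, not the thesis's design; all names are hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical module interface: every social network (Facebook, Twitter,
// Google+, Mendeley, Slideshare, ...) sits behind the same protocol, so
// networks can be added or removed without touching the rest of the app.
@protocol DISocialNetworkModule <NSObject>
- (NSString *)networkName;              // e.g. @"Twitter"
- (BOOL)isLinkedForCurrentUser;         // did the user link this network?
// Fetch the public profile data this module exposes for a recognized user.
- (void)fetchProfileForUserID:(NSString *)userID
                   completion:(void (^)(NSDictionary *profile,
                                        NSError *error))completion;
@end
```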

When asked which social networks they would like to see in the application, nearly everybody said Facebook, Twitter and Google+. In an academic context, they would like to see Mendeley and Slideshare.

4. IMPLEMENTATION
Apart from some suggestions, the third paper prototype was received positively. The next step in the process is the implementation. The application is currently in development and a small base is working. The main focus so far has been on the crucial functionality, the face recognition. It is important to get this part up and running as early as possible, because the entire application depends on it. So far the application is able to track faces using the iOS 5 face detection. A temporary box frames each face and follows it as the camera or the person moves. This functionality was used to test the quality of the face detection API. As figure 2 shows, the face detection algorithm of iOS 5 can detect multiple faces at once, and at such depth that the smallest detected face is barely bigger than a button. These are great results: the algorithm proved fast, reliable and detailed enough for our purpose. Detection of even smaller faces is not necessary, because boxes smaller than a button would become too hard to tap.

Figure 2: Face tracking results.
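The detection step described above relies on Core Image's CIDetector, introduced in iOS 5. A minimal sketch of that call, with the drawing of the boxes left to the app:

```objc
#import <UIKit/UIKit.h>
#import <CoreImage/CoreImage.h>

// Minimal sketch of the iOS 5 face detection used above. CIDetector
// returns the bounds of every face it finds in a still image; drawing
// and tracking the boxes on screen is up to the app.
static NSArray *DetectedFaceBounds(UIImage *image)
{
    CIImage *ciImage = [CIImage imageWithCGImage:image.CGImage];
    CIDetector *detector =
        [CIDetector detectorOfType:CIDetectorTypeFace
                           context:nil
                           options:@{CIDetectorAccuracy : CIDetectorAccuracyHigh}];
    NSMutableArray *bounds = [NSMutableArray array];
    for (CIFaceFeature *face in [detector featuresInImage:ciImage]) {
        // Note: face.bounds uses Core Image coordinates (origin at the
        // bottom left), so it must be flipped before placing a UIView box.
        [bounds addObject:[NSValue valueWithCGRect:face.bounds]];
    }
    return bounds;
}
```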

The next step was the face recognition, which is handled by Face.com. An iOS SDK is available on the website [12]. This SDK contains the functionality to send images to the Face.com servers and receive a JSON response with the recognized faces. It also covers the necessary Facebook login, as Face.com requires the user to log in with his Facebook account. This login is only needed once. One problem is that Face.com only accepts images, not video. To be able to test the face recognition as fast as possible, a recognize button was added to the camera view. Once it is clicked, a snapshot is taken with the camera. This snapshot is sent to the servers of Face.com and analyzed for faces. The JSON response is parsed, and the percentage of the match and the list of best matches can be extracted from it. At the moment only one face can be recognized at a time, because no algorithm is available yet to decide which part of the response should be matched to which face on the camera. This is temporarily solved by limiting the program to one face at a time. Figure 3 shows the current status of the application: a face is recognized and its Facebook ID is printed above it.

Figure 3: Face recognized and matched with the correct Facebook ID.
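The Face.com service has since shut down, so its exact response format can no longer be verified. The sketch below assumes the photos/tags/uids structure its documentation described, with a confidence percentage per candidate, and parses it with NSJSONSerialization (available since iOS 5); treat the key names as assumptions.

```objc
#import <Foundation/Foundation.h>

// Hedged sketch: pick the most confident match out of a Face.com
// faces.recognize JSON reply. The key names ("photos", "tags", "uids",
// "confidence", "uid") follow the shape Face.com's documentation
// described and are assumptions, not verified against a live service.
static NSString *BestMatchUID(NSData *jsonData)
{
    NSDictionary *reply = [NSJSONSerialization JSONObjectWithData:jsonData
                                                          options:0
                                                            error:NULL];
    NSArray *tags = [[[reply objectForKey:@"photos"] lastObject]
                       objectForKey:@"tags"];
    NSDictionary *best = nil;
    for (NSDictionary *tag in tags) {
        for (NSDictionary *candidate in [tag objectForKey:@"uids"]) {
            // Keep the candidate with the highest confidence percentage.
            if (best == nil ||
                [[candidate objectForKey:@"confidence"] intValue] >
                [[best objectForKey:@"confidence"] intValue]) {
                best = candidate;
            }
        }
    }
    return [best objectForKey:@"uid"]; // e.g. a namespaced Facebook ID
}
```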

5. NEXT STEPS AND FUTURE WORK


The next step in development is the further development of the user interface. Now that we have a basic implementation of the main functionality, it is important to finish the mock-up of the application. This way, a dummy implementation of several screens can be used to test the interface in several iterations with the digital prototype. While these tests take place, the underlying functionality can be extended and implemented in parallel.

Several big problems need to be solved. The biggest is matching the faces detected by the iOS 5 face detection with the faces recognized by Face.com. Because Face.com recognizes faces in still images, a way needs to be found to match its results to the faces on screen. If the user moves the camera to other people after pressing the recognize button, the results from Face.com will no longer match the faces on screen. The solution in mind is an algorithm that matches the Face.com results with the face detection based on proportions: if we succeed in finding a correlation between, for instance, the eyes and the nose of a person in both services, it should be possible to determine which detected face matches which Face.com result. Another way to match the faces to the reply is to keep track of the faces based on their coordinates on the screen. A suitable algorithm for this problem still needs to be found.

Another, smaller problem is the use of the Face.com SDK. It has a limited Facebook Graph API built into it, which cannot be used to fetch the name behind an ID or to get status updates. Therefore the real Facebook iOS SDK should be used. To prevent the app from working with two separate APIs, the Face.com SDK needs to be adapted so that it uses the real Facebook iOS SDK instead of the limited Graph API.
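As a concrete starting point for the coordinate-based option mentioned above, each Face.com result rectangle could simply be paired with the nearest iOS-detected rectangle. This sketch assumes both rectangles have already been converted into the snapshot's coordinate space; it is not the algorithm the thesis settled on.

```objc
#import <UIKit/UIKit.h>

// Sketch of the coordinate-based matching idea: pair a Face.com result
// rectangle with the nearest rectangle found by the iOS face detection.
// Assumes both rectangles were already converted to the snapshot's
// coordinate space; this is a starting point, not the final algorithm.
static NSUInteger NearestDetectedFace(CGRect recognized, NSArray *detectedBounds)
{
    NSUInteger best = NSNotFound;
    CGFloat bestDistance = CGFLOAT_MAX;
    CGPoint a = CGPointMake(CGRectGetMidX(recognized), CGRectGetMidY(recognized));
    for (NSUInteger i = 0; i < [detectedBounds count]; i++) {
        CGRect candidate = [[detectedBounds objectAtIndex:i] CGRectValue];
        CGPoint b = CGPointMake(CGRectGetMidX(candidate), CGRectGetMidY(candidate));
        CGFloat distance = hypot(a.x - b.x, a.y - b.y);
        if (distance < bestDistance) {   // keep the closest detected face
            bestDistance = distance;
            best = i;
        }
    }
    return best;
}
```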


6. CONCLUSION
This master's thesis is still a work in progress. We already have good results from paper prototyping, and the core of the application has been implemented. In the following months, the remaining problems will have to be solved, and further user testing is required to make the application match its goal: a fast, new way to discover people using the newest technologies and networks.

References

[1] http://www.face.com
[2] https://developer.apple.com/library/mac/documentation/CoreImage/Reference/CIDetector_Ref/Reference/Reference.html
[3] R. Azuma, Y. Baillot, R. Behringer, S. Feiner, S. Julier, B. MacIntyre, "Recent advances in augmented reality", IEEE Computer Graphics and Applications, 21(6):34-47, 2001.
[4] Arnold M. Lund, "Measuring Usability with the USE Questionnaire", Usability and User Experience, vol. 8, no. 2, October 2001. http://www.stcsig.org/usability/newsletter/0110_measuring_with_use.html
[5] http://www.viewdle.com/
[6] http://www.animetrics.com/Products/FACER.php
[7] http://www.youtube.com/watch?v=tb0pMeg1UN0
[8] http://www.betaface.com/
[9] http://www.pittpatt.com/
[10] Niels Buekers, "Social Annotations in Augmented Reality", master's thesis, Katholieke Universiteit Leuven, 2010-2011.
[11] Erik Duval, "Paper prototyping", http://www.slideshare.net/erik.duval/paper-prototyping-12082416, last checked April 29, 2012.
[12] Sergiomtz Losa, "FaceWrapper for iPhone", https://github.com/sergiomtzlosa/faceWrapper-iphone
