Age and Height

Recent advances in age and height estimation from still images and video
Rama Chellappa
Center for Automation Research, UMIACS Electrical and Computer Engineering University of Maryland, College Park, MD 20740 Email: rama@umiacs.umd.edu
Pavan Turaga
Center for Automation Research, UMIACS University of Maryland, College Park, MD 20740 Email: pturaga@umiacs.umd.edu
Abstract: Soft-biometrics such as gender, age, race, etc. have been found to be useful characterizations that enable fast prefiltering and organization of data for biometric applications. In this paper, we focus on two useful soft-biometrics: age and height. We discuss their utility and the factors involved in their estimation from images and videos. In this context, we highlight the role played by geometric constraints such as multi-view geometry and shape-space geometry. Then, we present methods based on these geometric constraints for age and height estimation. These methods provide a principled means of fusing image-formation models, multi-view geometric constraints, and robust statistical methods for inference.
I. INTRODUCTION

Soft-biometrics such as gender, age, race, etc. are gaining importance in biometric systems. In large corpora of images, they enable fast pre-filtering for further analysis, and also allow data to be organized so that fast retrieval methods can be designed. As evidenced in most soft-biometric estimation methods, tools from machine learning are often used to learn and estimate the desired entities. In this paper, we focus on two useful soft-biometrics: age and height. In this context, we highlight the role played by geometric constraints such as multi-view geometry and shape-space geometry. We argue that it is important to devise algorithms that not only exploit 3D geometric constraints, but also obey the intrinsic geometry of the data itself. The knowledge of these constraints then paves the way towards designing robust statistical algorithms for estimation of these quantities.

In height estimation, the problem is compounded by the distortion of many geometric scene properties under the imaging transformation of a perspective camera. Useful properties such as the size of an object, length ratios, and parallelism between line segments are not preserved in a projected image. One approach to mitigate these problems would be to recover the real 3D world from a 2D projected image, which in most cases needs far more information than a single image with unknown camera calibration can provide. Techniques that can estimate heights with respect to a reference height with minimal calibration information will be effective in real applications. Toward this end, we show the utility of video data to obtain this calibration information directly with minimal manual intervention. Further, the motion of humans is exploited to obtain parallel lines in the form of point trajectories on the human. These lines enable the estimation of vanishing lines and vertical vanishing points that allow the height-estimation problem to be solved as shown in Figure 1.
Fig. 1. (a) The basic geometry of the scene. $l$ is the vanishing line of the reference plane; lines $b_r t_r$ and $bt$ are two parallel lines on the reference plane; $p$ is the vanishing point in the direction of $b_r t_r$ and $bt$. (b) An example of applying the algorithm to a real image. The red lines represent the reference height and the target height respectively; $i$ is the same position illustrated in (a). Taken from [1].
For age estimation, studies in neuroscience have shown that facial geometry is a strong factor that influences age perception [2]. In [2], it is shown that shape-averaged faces are perceived to be younger. Further, the distance from the average is a strong indicator of the apparent age of the person. Exaggerating the regions where a given face differs strongly in shape from a shape-averaged face results in a caricature [3], [4]. Young faces exhibit distinct growth-related anthropometric trends. Anthropometric variations in adults are less distinctive than in children, but adults nevertheless exhibit drifts in facial features surrounding the mouth, eyebrows, etc. This is illustrated in Figure 2, where distinct geometric changes can be observed as a person ages. To develop appropriate statistical inference methodologies, one needs to understand (a) what the space of these geometric landmarks is, and (b) what the appropriate statistical models and distance metrics in this space are. We show that an affine-invariant representation of facial landmark geometry can be analytically modeled using a Grassmann manifold. We describe the warping of one face to another by a smooth geodesic flow on the Grassmann manifold. These warping parameters are then shown to contain age-specific information, which can prove useful for estimating the apparent age of a person.
Fig. 2. Facial geometric variation across ages: (a) Age 2, (b) Age 10, (c) Age 14, (d) Age 18, (e) Age 29, (f) Age 43. Samples shown correspond to individual 2 from the FG-Net dataset.
A. Related Work

a) Metrology for Height Estimation: Several techniques have been developed for metrology from static images. The vanishing point of parallel lines has proven to be a useful feature in recovering 3D scenes [5], [6]. Criminisi et al. [7] outlined a method for recovering affine scene structure from a single perspective view using vanishing lines and points. Without knowledge of the camera's internal calibration, they computed area and length ratios on any plane parallel to the reference plane. Gurdjos et al. [8] showed how to estimate the heights of soccer players from videos given vanishing points/lines. The major limitation of these approaches is their dependence on manual supervision. As an extension, Liang et al. [9] estimated affine height measurements by recovering the planar homography of the reference plane between two views separated by a (near) pure translation.

Several automatic mensuration algorithms have been developed to take advantage of tracking results from video sequences. The calibration technique proposed by Lv et al. [10] uses the constraints that walking pedestrians are essentially perpendicular to the ground plane and that their heights remain constant. Renno et al. [11] used projected sizes of pedestrians to estimate the vanishing line of a ground plane. Stauffer et al. [12] proposed a self-calibration method based on linearly approximating measurable projected properties. Bose and Grimson [13] proposed a method that uses constant-velocity trajectories of objects to derive vanishing lines for recovering the reference plane and planar rectification. Their algorithm relies on the additional constraint provided by the constant-velocity assumption, which does not always hold in surveillance sequences.

b) Age estimation: A recent survey of age estimation from facial images can be found in [14]. Research in modeling aging can be divided into two main classes: physics-based models and data-driven models.
The first class concerns itself with computational models to describe the physical process of aging. Examples include the work of Pittenger and Shaw [15], who studied facial growth as a viscal-elastic event defined on the craniofacial complex. Mark et al. [16] studied geometric invariants that characterize cardioidal strain transformations and their relation to the perception of growth. Todd et al. [17] treated the human head as a fluid-filled spherical object and proposed the revised cardioidal strain model to account for craniofacial growth. More recently, Ramanathan and Chellappa [18] applied these models in conjunction with anthropometric data to identify different growth parameters for different parts of the face. Physics-based approaches such
as these have mostly found use in synthesis applications such as age progression and regression, where it is important to synthesize realistic younger- or older-looking faces from a given face.

In the data-driven approaches, modeling of age progression is typically done by estimating functional forms of the aging process or by learning classifiers from training data. Examples include the work of [19], which proposed methods for classifying face images as those of babies, young adults, or senior adults. Facial anthropometric measurements were used to distinguish babies from adults. Adult faces were further classified into young or senior adults using texture analysis. Ramanathan and Chellappa [20] proposed a Bayesian age-difference classifier built on a probabilistic eigenspaces framework to perform face verification across age progression. Several regression-based methods have been proposed to estimate the perceived age of a face from images. Lanitis et al. [21], [22] constructed an aging function based on a parametric model for human faces and performed automatic age progression, age estimation, and face recognition across aging. Fu et al. [23] combined dimensionality reduction methods such as PCA, LLE, LPP, OLPP, etc. with regression. Guo et al. [24] proposed robust regression followed by local adjustments for age estimation and showed that local adjustments improve performance. These approaches differ mainly in the features used and in the choice of regression method.

c) Organization: In Section II, we discuss methods derived by exploiting 3D geometric constraints for height estimation from videos. In Section III, we discuss methods derived by exploiting the geometry of shape spaces for age estimation. Experimental results are presented in the corresponding sections. Concluding remarks are presented in Section IV.

II. HUMAN HEIGHT ESTIMATION FROM VIDEO

The height of a human is a soft-biometric that can be estimated up to a reasonable degree of accuracy using the geometry of the scene. This field is broadly called video metrology, and we describe here how the basic geometric principles of video metrology can be used for human height estimation. One of the factors that makes human height estimation different from standard video metrology is the non-rigid deformation of the human body. For rigid objects such as vehicles, buildings, etc., geometric features such as orthogonal lines and parallel lines can be easily detected. Calibration information derived from these geometric features, however, makes the procedure highly reliant on the accuracy with which the features are detected. In the case of video sequences, motion information can be used to estimate the minimal calibration of scenes. Exploitation of motion information offers enhanced flexibility. Tracked moving humans in video sequences provide a way of estimating the vanishing line of the ground plane and the vertical vanishing point.

Perspective projection introduces challenges in video mensuration applications. One can easily take measurements on images of objects, but little can be inferred from them without understanding how the objects have been mapped from world coordinates to image coordinates. Given a typical pinhole camera, Criminisi et al. [7] proved the following geometric fact: given the vanishing line of a reference plane (in our case, the ground plane), a vanishing point for another reference direction (not parallel to the plane; in our case, the vertical direction), and a reference height, one can estimate any object's height if the object and the reference object are both on the reference plane. We call this the minimal calibration condition.

Denote the vanishing line of the reference plane (ground plane) by $l$, the vertical vanishing point by $p$, and the length of a reference line segment $t_r b_r$ by $h_r$. We want to measure the height $h$ of the segment $tb$. The intersection $v$ can be obtained as $v = (b_r \times b) \times l$ and the intersection $i$ as $i = (t \times v) \times (p \times b_r)$, where $\times$ denotes the cross product of two vectors. The segment $b_r i$ is the projection of the segment $tb$ onto $t_r b_r$. Therefore, the world height ratio $h/h_r$ can be derived using cross-ratio invariance [25] and the 1D homography [7] as

$$\frac{h}{h_r} = \frac{d(i, b_r)\, d(t_r, p)}{d(p, i)\, d(t_r, b_r)} \qquad (1)$$
where $i = (p \times b_r) \times (t \times (l \times (b_r \times b)))$. The height of any object can be estimated in a similar manner. Figure 1 illustrates the computation of the height ratio using the above geometric method.

The minimal calibration condition includes three components: the vanishing line of the ground plane, the vertical vanishing point, and the reference height(s). Traditional approaches rely on manually labeled parallel lines to obtain vanishing points, based on the following well-known geometric properties:
(i) The vanishing line (the image of the line at infinity) $l = (l_1, l_2, l_3)^T$ of a ground plane has two degrees of freedom and can be determined as the line through two or more vanishing points of the plane.
(ii) In a perspective view, a set of parallel lines passes through a common point in the image plane, which is called their vanishing point. The vertical point is the vanishing point of all the vertical world lines on the image plane.
Therefore, two sets of parallel lines (non-parallel to each other) on the ground plane and one set of vertical lines are the minimum requirements to determine the minimal calibration.

Instead of detecting parallel line sets from man-made structures [26], [27], [28], our approach focuses on automatically extracting parallel lines through motion information obtained from video. This can be done by exploiting the fact that most human motion occurs on the ground plane. For a non-rigid object such as a human, the trajectories of some points are parallel to the ground plane, while those of others are not, due to deformation. Using this fact, the human-height estimation procedure consists of the following key steps:
- Using the motion information of the tracked human in an uncalibrated video sequence, we automatically recover useful geometric properties of the scene, including the vanishing line of the ground plane and the vertical vanishing point.
- We apply the Expectation Maximization (EM) algorithm to simultaneously cluster tracked trajectories of feature points into groups and estimate the vanishing point belonging to each group. Trajectory outliers are also detected.
- With the estimated minimal calibration information, we apply the single view metrology algorithm [7] to estimate the height of a target object in each frame. We also present an uncertainty analysis for the height measurement.
- Finally, error covariances of the height estimates from single frames are incorporated when calculating the final height estimate.
The entire procedure is illustrated by an example of walking pedestrians in Figure 3.
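As a rough illustration of the single-view height computation in Eq. (1), the following sketch evaluates the height ratio from homogeneous image points and checks it on a synthetic pinhole camera. The function names (`dehom`, `dist`, `height_ratio`, `demo`), the camera intrinsics, the camera pose, and the scene coordinates are all assumptions made for this example; they are not taken from [1] or [7].

```python
import numpy as np

def dehom(q):
    """Convert a homogeneous image point (x, y, w) to Euclidean (x/w, y/w)."""
    q = np.asarray(q, dtype=float)
    return q[:2] / q[2]

def dist(a, b):
    """Euclidean image distance d(a, b) between two homogeneous points."""
    return float(np.linalg.norm(dehom(a) - dehom(b)))

def height_ratio(l, p, b_r, t_r, b, t):
    """Return h / h_r from the vanishing line l, the vertical vanishing
    point p, the reference segment (b_r, t_r) and the target segment (b, t)."""
    v = np.cross(np.cross(b_r, b), l)               # v = (b_r x b) x l
    i = np.cross(np.cross(t, v), np.cross(p, b_r))  # i = (t x v) x (p x b_r)
    return dist(i, b_r) * dist(t_r, p) / (dist(p, i) * dist(t_r, b_r))

def demo(h=1.6, h_r=1.8, tilt=0.3):
    """Synthetic check with a hypothetical camera tilted down by `tilt` radians.
    World frame: ground plane z = 0, z axis pointing up."""
    c, s = np.cos(tilt), np.sin(tilt)
    K = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])                 # assumed intrinsics
    R = np.array([[1.0, 0.0, 0.0],
                  [0.0, -s, -c],
                  [0.0, c, -s]])                    # rows: camera axes in world frame
    C = np.array([0.0, -8.0, 3.0])                  # assumed camera centre
    P = K @ np.hstack([R, (-R @ C).reshape(3, 1)])  # 3 x 4 projection matrix

    def project(X):
        return P @ np.append(X, 1.0)

    # Vanishing points of the world x, y, z directions are the first
    # three columns of P; the ground-plane vanishing line joins the first two.
    l = np.cross(P[:, 0], P[:, 1])
    p = P[:, 2]
    b_r, t_r = project([-1.0, 0.0, 0.0]), project([-1.0, 0.0, h_r])
    b, t = project([1.5, 2.0, 0.0]), project([1.5, 2.0, h])
    return height_ratio(l, p, b_r, t_r, b, t)
```

In this noise-free setting, `demo()` should agree with the true ratio `h / h_r` up to floating-point error, because the cross ratio is preserved by any projective camera. With real tracks, $l$ and $p$ would come from the estimated vanishing geometry rather than from a known projection matrix.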
Fig. 3. The procedure for measuring objects' heights from one frame. (a) The original image. (b) The motion blobs obtained, indicating the moving objects in (a). (c) The estimated principal axes (yellow) of the moving objects (step 1). (d) The principal axes (red) after local tuning (step 2). (e) The thick red line depicts the vanishing line of the ground plane; the yellow segment is the reference object. Please zoom in to view details. Taken from [1].
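The two geometric properties used for minimal calibration (parallel lines meet at a vanishing point, and the vanishing line passes through the vanishing points of the plane) can be sketched in a few lines. This is a plain algebraic least-squares stand-in, not the EM-based clustering with outlier detection described in [1]; the function names `line_through`, `vanishing_point` and `vanishing_line` are hypothetical.

```python
import numpy as np

def line_through(p1, p2):
    """Homogeneous line through two image points given as (x, y)."""
    return np.cross([p1[0], p1[1], 1.0], [p2[0], p2[1], 1.0])

def vanishing_point(lines):
    """Algebraic least-squares intersection of several homogeneous lines.

    Each line a satisfies a . x = 0 for a point x lying on it, so the common
    point is the right singular vector of the stacked (row-normalised) line
    matrix with the smallest singular value.
    """
    A = np.asarray(lines, dtype=float)
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]  # homogeneous point, defined up to scale

def vanishing_line(vp1, vp2):
    """Line through two homogeneous vanishing points of the same plane."""
    return np.cross(vp1, vp2)
```

For example, the image trajectories of two feature points that move parallel to the ground plane could each be fitted with a line and passed to `vanishing_point`. A second group moving in a different direction would give a second vanishing point, and `vanishing_line` then joins the two.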
The algorithmic details of this approach can be found in [1]. Here we briefly present the results obtained using this procedure on the Honeywell dataset¹. In this dataset, two static cameras with different view angles are used to acquire a pair of videos while a subject walks along a pre-specified path in a hallway, as illustrated in Fig. 4. There are nine subjects, each appearing in at least one and at most four walking sessions. The subject changes part of his/her clothing between walking sessions. In total, 30 pairs of video sequences are acquired, all with different clothing. The two cameras share a common reference object, the black board marked by the red lines in Fig. 4(a). However, we do not know its exact height; hence all our height estimates are subject to an unknown scale factor.

Fig. 4(b) presents the height estimation results. Its x-axis uses the cropped image of the subject in one view to represent each walking session, and its y-axis shows the estimated height values with two dots, a red dot for view 1 and a blue dot for view 2. For almost all 30 walking sessions, the subjects' heights are measured consistently across the two views. Suppose that $h_1$ and $h_2$ are the measurements from view 1 and view 2 for the same walking session. We define the inter-view measurement error percentage as $h_v = |h_1 - h_2| / (h_1 + h_2) \times 100\%$, which cancels the unknown scale factor. Over all 30 sessions, the mean $h_v$ value is 0.76% and the standard deviation is only 0.54%. Even for the same subject in different sessions, as marked by the green boxes in Fig. 4(b), the estimated height values are quite consistent, regardless of the view and clothing. This is evidenced in Fig. 4(c), in which the inter-session measurement error percentage for one subject is defined as $h_s = \mathrm{std}(h) / \mathrm{mean}(h) \times 100\%$, where $\mathrm{mean}(h)$ and $\mathrm{std}(h)$ are the mean and standard deviation of the measurements from both views of all walking sessions belonging to the subject.

A simple recognition experiment based only on the height information was performed, using the sequences in view 1 as gallery and those in view 2 as probe, or vice versa. Figure 4(d) displays the cumulative matching curve (CMC). The recognition accuracy is 46.7% for the top match, 81.6% within the top 5 matches, and 95% within the top 9 matches. Of course, height information alone is insufficient for recognition, but it can serve as a soft-biometric for further analysis.

¹ The video sequences in this dataset were acquired by Honeywell Corporation under the HSARPA contract 433829 monitored by the Office of Naval Research (ONR).

Fig. 4. (a) The two views of one walking session in the Honeywell dataset. (b) The estimated height values. (c) The inter-session measurement error percentage. (d) The CMC curve. Taken from [1].

III. AGE ESTIMATION FROM GEOMETRIC CUES

Young faces exhibit distinct growth-related anthropometric trends. Anthropometric variations in adults are less distinctive than in children, but adults nevertheless exhibit drifts in facial features surrounding the mouth, eyebrows, etc. [29]. This is illustrated in Figure 2, where distinct geometric changes can be observed as a person ages. Here, we discuss how geometric cues can be exploited using an analytic manifold formulation of shape spaces to enable robust age-estimation techniques.

The shape observed in an image of a face is a perspective projection of the 3D locations of the landmarks. Standard approaches to describing shapes involve extracting features such as shape context [30]. These approaches extract coarse features that correspond to the average properties of the shape. They are particularly useful when landmarks on shapes cannot be reliably located across different images or do not necessarily correspond to physically meaningful parts of the object. However, in the case of faces, there exist physically meaningful locations such as the eyes, mouth, nose, etc., which can be reliably located on most faces [31]. This suggests the use of a representation that exploits the entire information offered by the location of landmarks instead of relying on coarse features. There exist several automatic methods to locate facial landmarks which work well
on constrained images such as passport photos (cf. [32]). It is in constrained scenarios such as these that the methods proposed here are applicable.

The affine shape space [33] is useful to account for small changes in camera location or in the pose of the subject. The affine transforms of a shape can be derived from the base shape simply by multiplying the shape matrix $L$ by a $2 \times 2$ full-rank matrix on the right. For example, let $A$ be a $2 \times 2$ affine transformation matrix, i.e., $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$. Then, all affine transforms of the base shape $L_{base}$ can be expressed as $L_{affine}(A) = L_{base} A^T$. Note that multiplication by a full-rank matrix on the right preserves the column space of the matrix $L_{base}$, i.e., $\mathrm{span}(L_{base})$ is invariant to affine transforms of the shape. Subspaces such as these can be identified as points on a Grassmann manifold. The geometric properties of the Grassmann manifold are well known [34], [35]. The knowledge of tangent spaces of analytic manifolds provides methods to perform statistical computations such as classification, regression, etc. in a principled manner. This was used to estimate the age of a face from landmark data in [36].

Here, we briefly review the results obtained. We evaluate the strength of the Riemannian framework on age-estimation tasks on the FG-Net dataset. For this dataset, 68 fiducial points are available with each face. Some sample images from this dataset are shown in Figure 5. Given a face and its landmarks, we extract the tall-thin orthonormal representation using the SVD. Given the $m \times 2$ matrix of landmarks $L = [(x_1, y_1); (x_2, y_2); \ldots; (x_m, y_m)]$, we compute its rank-2 SVD $L = U \Sigma V^T$. The affine-invariant Grassmann representation of $L$ is then given by $Y_L = U$. Now, given several examples $Y_i$ with corresponding ages $y_i$, the aging function $y = f(Y)$ is estimated using the differential geometric properties of the Grassmann manifold. Details of this procedure can be found in [36].

Fig. 5. Sample images from the FG-Net dataset.

Two metrics have been proposed in the literature for quantifying the performance of age-estimation algorithms. The first criterion measures the mean absolute error (MAE) in age estimation across the entire dataset, i.e., $\mathrm{MAE} = \frac{1}{N} \sum_i |l_i - \hat{l}_i|$, where $N$ is the size of the dataset, $l_i$ is the true age of the $i$th person being tested, and $\hat{l}_i$ is the assigned age. The second metric is the cumulative match score, defined as $CS(j) = N_{e \le j} / N \times 100\%$, where $N_{e \le j}$ is the number of test images on which the absolute error in age estimation is within $j$ years.

For the FG-Net dataset, we performed leave-one-person-out testing, as has recently been suggested [37]. In this mode, all images corresponding to the same person are used for testing and the remaining images are used for training. The results of the proposed framework on the FG-Net dataset are shown in Table I. The lowest MAE, 5.89 years, was obtained by using an SVM with a polynomial kernel on velocity vectors. The table also shows a comparison with other recently published methods. The cumulative scores of the proposed methods are shown in Figure 6. We see that more than 90% of the faces are classified within 15 years of their true age.
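The affine-invariant representation described in this section can be sketched with NumPy as follows. The function names are hypothetical, and landmarks are assumed to be already centred, since only the $2 \times 2$ linear part of the affine map is modelled above. The log and exponential maps use the standard closed-form expressions for the Grassmann manifold given in [34]; whether [36] computes its warping velocities in exactly this way is not stated here, so treat that part as an assumption.

```python
import numpy as np

def grassmann_point(landmarks):
    """Tall-thin orthonormal basis Y_L = U (m x 2) of an m x 2 landmark matrix."""
    L = np.asarray(landmarks, dtype=float)
    U, _, _ = np.linalg.svd(L, full_matrices=False)  # thin SVD, L = U S V^T
    return U

def projector(Y):
    """Projection matrix onto span(Y); identical for any basis of the subspace."""
    return Y @ Y.T

def grassmann_log(Y1, Y2):
    """Tangent vector (velocity) at Y1 of the geodesic reaching span(Y2) at t = 1."""
    M = Y1.T @ Y2
    W = (Y2 - Y1 @ M) @ np.linalg.inv(M)  # part of Y2 orthogonal to span(Y1)
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.arctan(s)) @ Vt

def grassmann_exp(Y1, H):
    """Follow the geodesic from Y1 with velocity H for unit time."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    return Y1 @ Vt.T @ np.diag(np.cos(s)) @ Vt + U @ np.diag(np.sin(s)) @ Vt
```

Comparing `projector(grassmann_point(L))` with `projector(grassmann_point(L @ A.T))` for a full-rank `A` is a quick way to check the invariance claimed above. The flattened entries of `grassmann_log(Y1, Y2)` could then serve as a feature vector for a standard regressor, which is one plausible reading of "velocity vectors".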
TABLE I
COMPARISON OF MEAN ABSOLUTE ERROR (MAE) USING PROPOSED METHODS WITH STATE-OF-THE-ART ON THE FG-NET DATASET. TAKEN FROM [36].

Group                 Method           MAE (years)
Warping velocities    Ridge            7.57
Warping velocities    SVR              5.89
Warping velocities    RVM              6.69
Other algorithms      AAS [22]         14.83
Other algorithms      WAS [37]         8.06
Other algorithms      Ages [37]        6.77
Other algorithms      Ages_lda [37]    6.22
Other algorithms      QM [21]          6.55
Other algorithms      MLP [21]         6.98
Other algorithms      RUN1 [38]        5.78
Other algorithms      LARR [24]        5.07
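The two evaluation metrics and the leave-one-person-out protocol used for Table I are simple enough to sketch in plain Python. The function names are hypothetical, and "within $j$ years" is interpreted here as an absolute error of at most $j$, which is an assumption about the boundary case.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """MAE = (1/N) * sum_i |l_i - l_hat_i|."""
    errors = [abs(t - p) for t, p in zip(true_ages, predicted_ages)]
    return sum(errors) / len(errors)

def cumulative_score(true_ages, predicted_ages, j):
    """CS(j): percentage of test images with absolute error of at most j years."""
    within = sum(1 for t, p in zip(true_ages, predicted_ages) if abs(t - p) <= j)
    return 100.0 * within / len(true_ages)

def leave_one_person_out(person_ids):
    """Yield (person, train_indices, test_indices), one split per person.

    All images of the held-out person form the test set; every other
    image is used for training.
    """
    for person in sorted(set(person_ids)):
        test = [k for k, pid in enumerate(person_ids) if pid == person]
        train = [k for k, pid in enumerate(person_ids) if pid != person]
        yield person, train, test
```

Pooling the predictions from all splits and passing them to `mean_absolute_error` and `cumulative_score` would give dataset-level figures of the kind reported in Table I and Fig. 6.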
Fig. 6. Cumulative scores on the FG-Net data using velocity parameters with a polynomial kernel (x-axis: error level; y-axis: cumulative score; curves: VelSVR, VelRidge, VelRVM). Taken from [36].

The above method used standard regression methods based on SVMs and RVMs without any modifications. This serves to illustrate the robustness of the geometric features. These features can further be used in conjunction with more robust regression methods to yield improved performance. Next, we show the improvements obtained by using the robust RVM presented in [39]. This robust RVM first removes statistical outliers in the data under the assumption that the outliers are sparse. This is followed by Bayesian modeling of the regression parameters, yielding a robust Bayesian RVM (RB-RVM). We refer the reader to [39] for a detailed discussion of the regression methodology.

Using the same geometric feature as before, we use RB-RVM to categorize the whole dataset into inliers and outliers. The algorithm found 90 outliers; some of the inliers and outliers are shown in Figure 7. With this knowledge of the inliers and the outliers, we perform the leave-one-person-out test again. Table II shows the mean absolute error (MAE) of age prediction for the inliers and the outliers separately. The small prediction error for the inliers and the large prediction error for the outliers indicate that the inlier vs. outlier categorization by RB-RVM was good. Table II also shows that the prediction error of the RB-RVM for the whole dataset is lower than that of the RVM.
Fig. 7. Some inliers and outliers found by RB-RVM. Most of the outliers are images of older subjects, like Outliers A and B. This is because there are fewer samples of older subjects in the FG-Net database. Outlier C has an extreme pose variation from the usual frontal faces of the database; hence, it is an outlier. The facial geometry of Outlier D is very similar to that of younger subjects, with a big forehead and small chin, so it is classified as an outlier. Taken from [39].

TABLE II
MEAN ABSOLUTE ERROR (MAE) OF AGE PREDICTION FOR THE INLIERS, OUTLIERS, AND THE WHOLE DATASET USING RB-RVM. SINCE RVM DOES NOT DIFFERENTIATE BETWEEN INLIERS AND OUTLIERS, WE ONLY SHOW ITS PREDICTION ERROR FOR THE WHOLE DATASET. TAKEN FROM [39].

Method    Inlier MAE    Outlier MAE    All MAE
RB-RVM    4.61          25.87          6.52
RVM       N.A.          N.A.           6.80

IV. CONCLUSIONS

In this paper, we have highlighted some geometric constraints that arise in age and height estimation tasks. The knowledge of these geometric constraints gives rise to robust and accurate statistical inference methodologies. Physical models such as 3D body models or age-related facial growth models can be further incorporated into these geometrically constrained inference methodologies. This can form the basis for future work.

ACKNOWLEDGMENT

The authors are thankful to Drs. J. Shao and S. K. Zhou for their contributions to the height estimation section, and Dr. N. Ramanathan and Mr. K. Mitra for their contributions to the age estimation section.

REFERENCES
[1] J. Shao, S. K. Zhou, and R. Chellappa, "Robust height estimation of moving objects from uncalibrated videos," vol. 19, pp. 2221-2232, August 2010.
[2] A. J. O'Toole, T. Price, T. Vetter, J. C. Bartlett, and V. Blanz, "Three-dimensional shape and two-dimensional surface textures of human faces: The role of averages in attractiveness and age," Image and Vision Computing Journal, vol. 18, no. 1, pp. 9-19, 1999.
[3] S. E. Brennan, "The caricature generator," Leonardo, vol. 18, no. 3, pp. 170-178, 1985.
[4] P. J. Benson and D. I. Perrett, "Synthesizing continuous-tone caricatures," Image and Vision Computing, vol. 9, no. 2, pp. 123-129, 1991.
[5] B. Caprile and V. Torre, "Using vanishing points for camera calibration," Intl. J. of Comp. Vis., vol. 4, pp. 127-140, 1990.
[6] R. Collins and J. R. Beveridge, "Matching perspective views of coplanar structures using projective unwarping and similarity matching," in IEEE Conference on Computer Vision and Pattern Recognition, June 1993, pp. 240-245.
[7] A. Criminisi, I. Reid, and A. Zisserman, "Single view metrology," Intl. J. of Comp. Vis., vol. 40, pp. 123-148, 2000.
[8] P. Gurdjos and R. Payrissat, "About conditions for recovering the metric structure of perpendicular planes from the single ground plane to image homography," in Intl. Conf. on Pattern Recognition, vol. 1, 2000, pp. 358-361.
[9] B. Liang, Z. Chen, and N. Pears, "Uncalibrated two-view metrology," in 17th International Conference on Pattern Recognition, Aug 2004, pp. 96-99.
[10] F. Lv, T. Zhao, and R. Nevatia, "Self-calibration of a camera from video of a walking human," in Proc. of 16th International Conference on Pattern Recognition, vol. 1, Québec, Canada, August 2002, pp. 562-567.
[11] J. Renno, J. Orwell, and G. Jones, "Learning surveillance tracking models for the self-calibrated ground plane," in Proc. of British Machine Vision Conference, Cardiff, September 2002, pp. 607-616.
[12] C. Stauffer, K. Tieu, and L. Lee, "Robust automated planar normalization of tracking data," in Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, Nice, France, October 2003.
[13] B. Bose and E. Grimson, "Ground plane rectification by tracking moving objects," in Proc. of Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, 2003.
[14] Y. Fu, G. Guo, and T. S. Huang, "Age synthesis and estimation via faces: A survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, pp. 1955-1976, 2010.
[15] J. B. Pittenger and R. E. Shaw, "Aging faces as viscal-elastic events: Implications for a theory of nonrigid shape perception," Journal of Experimental Psychology: Human Perception and Performance, vol. 1, no. 4, pp. 374-382, 1975.
[16] L. S. Mark, J. T. Todd, and R. E. Shaw, "Perception of growth: A geometric analysis of how different styles of change are distinguished," Journal of Experimental Psychology: Human Perception and Performance, vol. 7, no. 4, pp. 855-868, 1981.
[17] J. T. Todd, L. S. Mark, R. E. Shaw, and J. B. Pittenger, "The perception of human growth," Scientific American, vol. 242, no. 2, pp. 132-144, 1980.
[18] N. Ramanathan and R. Chellappa, "Modeling age progression in young faces," IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 387-394, 2006.
[19] Y. H. Kwon and N. Vitoria Lobo, "Age classification from facial images," Computer Vision and Image Understanding, vol. 74, no. 1, pp. 1-21, 1999.
[20] N. Ramanathan and R. Chellappa, "Face verification across age progression," IEEE Trans. on Image Processing, vol. 15, no. 11, pp. 3349-3361, 2006.
[21] A. Lanitis, C. Taylor, and T. Cootes, "Toward automatic simulation of aging effects on face images," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 24, no. 4, pp. 442-455, 2002.
[22] A. Lanitis, C. Draganova, and C. Christodoulo, "Comparing different classifiers for automatic age estimation," IEEE Trans. Systems, Man and Cybernetics, vol. 34, no. 1, pp. 621-628, 2004.
[23] Y. Fu, Y. Xu, and T. S. Huang, "Estimating human age by manifold analysis of face pictures and regression on aging features," International Conference on Multimedia and Expo, pp. 1383-1386, 2007.
[24] G. D. Guo, Y. Fu, C. R. Dyer, and T. S. Huang, "Image-based human age estimation by manifold learning and locally adjusted robust regression," IEEE Trans. on Image Processing, vol. 17, no. 7, pp. 1178-1188, July 2008.
[25] R. Hartley and A. Zisserman, Multiple view geometry in computer vision. Cambridge University Press, 2000.
[26] C. Rother, "A new approach for vanishing point detection in architectural environments," in Proc. of 11th British Machine Vision Conference, 2000, pp. 382-391.
[27] J. Kosecka and W. Zhang, "Video compass," in European Conf. on Comp. Vision. Springer Verlag, 2002, pp. 476-491.
[28] M. Bosse, R. Rikoski, J. Leonard, and S. Teller, "Vanishing points and 3D lines from omnidirectional video," in Proceedings of the International Conference on Image Processing, vol. III, Rochester, New York, September 2002, pp. 513-516.
[29] A. M. Alberta, K. Ricanek, and E. Patterson, "A review of the literature on the aging adult skull and face: Implications for forensic science research and applications," Forensic Science International, vol. 172, no. 1, pp. 1-9, 2007.
[30] S. Belongie, J. Malik, and J. Puzicha, "Shape matching and object recognition using shape contexts," PAMI, vol. 24, no. 4, pp. 509-522, 2002.
[31] J. Shi, A. Samal, and D. Marx, "How effective are landmarks and their geometry for face recognition?" Computer Vision and Image Understanding, vol. 102, no. 2, pp. 117-133, 2006.
[32] T. F. Cootes, C. J. Taylor, D. H. Cooper, and J. Graham, "Active shape models - their training and application," Computer Vision and Image Understanding, vol. 61, no. 1, pp. 38-59, 1995.
[33] C. R. Goodall and K. V. Mardia, "Projective shape analysis," Journal of Computational and Graphical Statistics, vol. 8, no. 2, 1999.
[34] A. Edelman, T. A. Arias, and S. T. Smith, "The geometry of algorithms with orthogonality constraints," SIAM Journal Matrix Analysis and Application, vol. 20, no. 2, pp. 303-353, 1999.
[35] Y. Chikuse, Statistics on special manifolds, Lecture Notes in Statistics. Springer, New York, 2003.
[36] P. Turaga, S. Biswas, and R. Chellappa, "The role of geometry in age estimation," in ICASSP, 2010, pp. 946-949.
[37] X. Geng, Z. Zhou, and K. Smith-Miles, "Automatic age estimation based on facial aging patterns," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 29, no. 12, pp. 2234-2240, 2007.
[38] S. Yan, H. Wang, X. Tang, and T. S. Huang, "Learning auto-structured regressor from uncertain nonnegative labels," IEEE International Conference on Computer Vision, no. 7, pp. 1-8, 2007.
[39] K. Mitra, A. Veeraraghavan, and R. Chellappa, "Robust RVM regression using sparse outlier model," in IEEE Conference on Computer Vision and Pattern Recognition, 2010, pp. 1887-1894.