Automatic Age Estimation from Face Images via Deep Ranking.

BMVC(2015)

Abstract
This paper focuses on automatic age estimation (AAE) from face images, which amounts to determining the exact age or age group of a face image according to features extracted from the face. Although great effort has been devoted to AAE [1, 4, 6], it remains a challenging problem. The difficulties are due to large facial appearance variations resulting from a number of factors, e.g., aging and facial expressions. AAE algorithms need to overcome heterogeneity in facial appearance changes to provide accurate age estimates. To this end, we propose a generic, deep network model for AAE (see Figure 1). Given a face image, our network first extracts features from the face through a 3-layer scattering network (ScatNet) [2], then reduces the feature dimension by principal component analysis (PCA), and finally predicts the age via category-wise rankers constructed as a 3-layer fully connected network. The contributions are: (1) our ranking method is pointwise and thus scales easily to large datasets; (2) our deep ranking model is general and can be applied to age estimation from faces with large facial appearance variations as a result of aging or facial expression changes; and (3) we show that the high-level concepts learned from large-scale neutral faces can be transferred to estimating ages from faces under expression changes, leading to improved performance. Our model has the following characteristics: (1) The scattering features are invariant to translation and small deformations. ScatNet is a deep convolutional network with specific characteristics. It uses predefined wavelets and computes scattering representations via a cascade of wavelet transforms and modulus pooling operators from shallow to deep layers. With the nonlinear modulus and averaging operators, ScatNet can produce representations that are discriminative as well as invariant to translation and small deformations.
Because ScatNet provides invariant representations for discriminative feature extraction, only the weights of the fully connected layers are learned in our network model, which considerably reduces the training time. (2) The rank labels encoded in the network exploit the ordering relation among labels. Each category-wise ranker is an ordinal regression ranker. We encode the age rank based on the reduction framework [5]. Given a set of training samples $X = \{(x_i, y_i),\ i = 1, \dots, N\}$, let $x_i \in \mathbb{R}^D$ be the input image and $y_i \in \{1, \dots, K\}$ be its rank label, where $K$ is the number of age ranks. For rank $k$, we separate $X$ into two subsets, $X_k^+$ and $X_k^-$, as follows:

$$X_k^+ = \{(x_i, +1) \mid y_i > k\}, \qquad X_k^- = \{(x_i, -1) \mid y_i \le k\}. \tag{1}$$
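The split in Eq. (1) can be illustrated with a minimal NumPy sketch. This is not the authors' code. The function names `make_binary_subsets` and `decode_rank` are hypothetical, and the sketch assumes thresholds k = 1, ..., K-1, since the subset of samples with y_i > K is always empty. The decoding step, which counts how many binary rankers answer "greater than k", is the usual rule in reduction-based ordinal regression. The abstract does not state it, so it is shown here only as an assumption.

```python
import numpy as np

def make_binary_subsets(y, num_ranks):
    """Binary targets per threshold k: +1 if y_i > k, else -1 (Eq. 1).

    Returns an array of shape (K-1, N); row k-1 holds the labels for
    threshold k. Using only k = 1..K-1 is an assumption of this sketch.
    """
    y = np.asarray(y)
    thresholds = np.arange(1, num_ranks)  # k = 1, ..., K-1
    return np.where(y[None, :] > thresholds[:, None], 1, -1)

def decode_rank(binary_outputs):
    """Assumed decoding rule: rank = 1 + #{k : ranker k predicts y > k}."""
    binary_outputs = np.asarray(binary_outputs)
    return 1 + np.sum(binary_outputs > 0, axis=0)

# Toy example with K = 4 age ranks and N = 4 samples.
y = np.array([1, 3, 2, 4])
labels = make_binary_subsets(y, num_ranks=4)
recovered = decode_rank(labels)
```

With perfect binary rankers, decoding the encoded labels recovers the original ranks, which is a quick sanity check that the encoding preserves the ordering information.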