An orthogonal facial feature learning method based on a convolutional-deconvolutional network

SUN Wenyun, SONG Yu, and CHEN Changsheng

Shenzhen Key Laboratory of Media Security, Guangdong Key Laboratory of Intelligent Information Processing, College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060, Guangdong Province, P.R. China

artificial intelligence; computer neural network; deep learning; facial expression recognition; facial image analysis; orthogonal facial feature; reconstruction loss; classification loss; correlation minimization loss

DOI: 10.3724/SP.J.1249.2020.05474

Abstract

Identity and expression are two important groups of features in facial image analysis, related to the face recognition and facial expression recognition tasks. The traditional supervised orthogonal facial feature learning (SOFFL) method can learn both features jointly when expression and identity labels are given, but its high data requirement limits its application. We propose an unsupervised orthogonal facial feature learning (UOFFL) method with a lower data requirement. It is a unified framework for extracting orthogonal facial features, based on the assumption that the face image space contains only two kinds of variation: identity and expression. Combining a reconstruction loss, a classification loss, and a correlation minimization loss, a deep convolutional-deconvolutional neural network jointly learns and extracts identity and expression features from aligned facial images. The classification loss is used to learn the expression feature; the correlation minimization loss increases the independence between the identity and expression features; the reconstruction loss ensures that the two features together retain the complete information of the face. The method is evaluated on the large-scale synthesized facial expression dataset (LSFED) and the constrained Radboud faces dataset (RaFD). When the Euclidean distance in the learned identity feature space is used for the face verification task, the performance is close to that of supervised face recognition methods such as joint Bayesian. When identity labels are missing, UOFFL learns the identity feature using only the expression labels, and it is unsupervised in this sense. Compared with the earlier SOFFL method, it relieves the dependence on identity labels and therefore applies to a wider range of scenarios.
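The combination of the three losses, and the distance used for verification, can be illustrated with a minimal NumPy sketch. The abstract does not give the exact form of any term, so everything here is an assumption made for illustration: mean squared error for the reconstruction loss, softmax cross-entropy for the classification loss, the mean squared Pearson correlation between identity and expression feature dimensions for the correlation minimization loss, and the hypothetical weights `w_rec`, `w_cls`, and `w_cor`. The function names are likewise hypothetical and are not taken from the paper.

```python
import numpy as np

def reconstruction_loss(x, x_hat):
    """Assumed form: mean squared error between images and reconstructions."""
    return float(np.mean((x - x_hat) ** 2))

def classification_loss(logits, labels):
    """Assumed form: softmax cross-entropy on expression logits.

    logits has shape (batch, classes); labels holds integer class ids.
    """
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(labels)), labels]))

def correlation_loss(f_id, f_expr, eps=1e-8):
    """Assumed form: mean squared Pearson correlation over a batch,
    taken across every (identity dim, expression dim) pair."""
    a = (f_id - f_id.mean(axis=0)) / (f_id.std(axis=0) + eps)
    b = (f_expr - f_expr.mean(axis=0)) / (f_expr.std(axis=0) + eps)
    corr = a.T @ b / len(a)  # shape: (d_id, d_expr)
    return float(np.mean(corr ** 2))

def total_loss(x, x_hat, logits, labels, f_id, f_expr,
               w_rec=1.0, w_cls=1.0, w_cor=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (w_rec * reconstruction_loss(x, x_hat)
            + w_cls * classification_loss(logits, labels)
            + w_cor * correlation_loss(f_id, f_expr))

def identity_distance(f_a, f_b):
    """Euclidean distance between two identity feature vectors."""
    return float(np.linalg.norm(f_a - f_b))
```

In such a setup, a verification decision would compare `identity_distance` against a threshold chosen on held-out pairs; no identity labels enter `total_loss`, which matches the abstract's claim that only expression labels are needed during training.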
