FacER: Contrastive Attention based Expression Recognition via Smartphone Earpiece Speaker

INFOCOM (2023)

Abstract
Facial expression recognition has enormous potential for downstream applications by revealing users' emotional status when interacting with digital content. Previous studies rely on cameras or wearable sensors for expression recognition. However, these approaches raise considerable privacy concerns or impose the burden of extra devices. Moreover, the recognition performance of camera-based methods deteriorates when users are wearing masks. In this paper, we propose FacER, an active acoustic facial expression recognition system. As a software solution on a smartphone, FacER avoids the extra costs of external microphone arrays. Facial expression features are extracted by modeling how near-ultrasound signals emitted from the earpiece speaker echo off the 3D facial contour. Besides isolating a range of background noises, FacER is designed to identify different expressions from various users with a limited set of training data. To achieve this, we propose a contrastive external attention-based model to learn consistent expression features across different users. Extensive experiments with 20 volunteers, both with and without masks, show that FacER can recognize 6 common facial expressions with more than 85% accuracy, outperforming the state-of-the-art acoustic sensing approach by 10% in various real-life scenarios. FacER provides a more robust solution for recognizing facial expressions in a convenient and usable manner.
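The abstract names the model's two key ingredients, external attention and contrastive learning, without giving implementation details. As a rough, hypothetical sketch of how such a model could be wired together, the PyTorch code below pairs an external-attention block (a learnable memory realized as two linear layers with double normalization, following the general external-attention formulation from the literature) with a supervised contrastive loss over expression labels. Every class name, dimension, and hyperparameter here is an illustrative assumption, not FacER's published implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExternalAttention(nn.Module):
    """External attention: tokens attend to a small learnable memory shared
    across all samples, biasing the model toward features that are consistent
    across users rather than user-specific. (Hypothetical sketch.)"""

    def __init__(self, dim: int, mem_size: int = 64):
        super().__init__()
        self.mk = nn.Linear(dim, mem_size, bias=False)  # key memory unit
        self.mv = nn.Linear(mem_size, dim, bias=False)  # value memory unit

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim), e.g. per-frame acoustic echo features
        attn = self.mk(x)                                      # (batch, tokens, mem)
        attn = F.softmax(attn, dim=1)                          # normalize over tokens
        attn = attn / (attn.sum(dim=-1, keepdim=True) + 1e-9)  # double normalization
        return self.mv(attn)                                   # (batch, tokens, dim)


def supervised_contrastive_loss(z: torch.Tensor, labels: torch.Tensor,
                                tau: float = 0.1) -> torch.Tensor:
    """Pull embeddings of the same expression together (even across users) and
    push different expressions apart. Assumes each expression class appears at
    least twice in the batch."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau                               # pairwise cosine similarities
    pos = labels.unsqueeze(0) == labels.unsqueeze(1)    # same-expression pairs
    pos.fill_diagonal_(False)                           # exclude self-pairs
    sim = sim - torch.eye(len(z), device=z.device) * 1e9  # mask self-similarity
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    return -log_prob[pos].mean()
```

In a training loop of this shape, the echo features for each expression segment would pass through the attention block, be pooled into a single embedding, and feed both this contrastive loss and a standard classification head; the shared memory and the cross-user positive pairs are what would encourage user-invariant expression features.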
Keywords
Acoustic sensing, expression recognition, contrastive learning, attention, domain adaptation, smartphone