Context-Aware Emotion Recognition in the Wild Using Spatio-Temporal and Temporal-Pyramid Models.

Nhu-Tai Do, Soo-Hyung Kim, Hyung Jeong Yang, Guee-Sang Lee, Soonja Yeom
Published in: Sensors (Basel, Switzerland) (2021)
Emotion recognition plays an important role in human-computer interaction. Recent studies have focused on video emotion recognition in the wild and have run into difficulties related to occlusion, illumination, complex behavior over time, and auditory cues. State-of-the-art methods combine multiple modalities, such as frame-level, spatiotemporal, and audio approaches. However, such methods have difficulty exploiting long-term dependencies in temporal information, capturing contextual information, and integrating multi-modal information. In this paper, we introduce a flexible multi-modal system for video-based emotion recognition in the wild. Our system tracks and votes on significant faces corresponding to persons of interest in a video to classify seven basic emotions. The key contribution of this study is the use of face feature extraction with context-aware and statistical information for emotion recognition. We also build two model architectures to exploit long-term dependencies in temporal information: a temporal-pyramid model and a spatiotemporal model with a "Conv2D+LSTM+3DCNN+Classify" architecture. Finally, we propose a best selection ensemble to improve the accuracy of multi-modal fusion. The best selection ensemble selects the combination of spatiotemporal and temporal-pyramid models that achieves the highest accuracy in classifying the seven basic emotions. In our experiments, we benchmark the system on the AFEW dataset and obtain high accuracy.
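The abstract does not say how the best selection ensemble searches for its combination, so the following is only a minimal sketch under stated assumptions: each model outputs per-video class probabilities, a combination is fused by simple averaging, and every non-empty subset of models is scored on a held-out validation set. The function names `ensemble_accuracy` and `best_selection`, and the averaging rule itself, are hypothetical and not taken from the paper.

```python
from itertools import combinations


def ensemble_accuracy(prob_sets, labels):
    """Accuracy obtained by averaging the probability vectors of several models.

    prob_sets: list of per-model predictions; each is a list (one entry per
               video) of class-probability lists.
    labels:    list of integer ground-truth class indices.
    """
    correct = 0
    for i, label in enumerate(labels):
        n_classes = len(prob_sets[0][i])
        # Assumed fusion rule: unweighted mean of the models' probabilities.
        avg = [
            sum(model[i][c] for model in prob_sets) / len(prob_sets)
            for c in range(n_classes)
        ]
        predicted = max(range(n_classes), key=avg.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


def best_selection(model_probs, labels):
    """Exhaustively score every non-empty subset of models and keep the best.

    model_probs: dict mapping a (hypothetical) model name to its predictions.
    Returns (tuple_of_model_names, validation_accuracy).
    Ties are resolved in favour of the subset found first (smaller subsets).
    """
    names = sorted(model_probs)
    best_combo, best_acc = (), -1.0
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            acc = ensemble_accuracy([model_probs[n] for n in combo], labels)
            if acc > best_acc:
                best_combo, best_acc = combo, acc
    return best_combo, best_acc
```

An exhaustive search like this is only practical for a handful of candidate models, since the number of subsets grows exponentially; selecting on a validation split rather than on the test split avoids reporting an optimistically biased accuracy.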
Keyphrases
  • autism spectrum disorder
  • depressive symptoms
  • health information
  • borderline personality disorder
  • machine learning
  • neural network
  • healthcare
  • convolutional neural network
  • working memory
  • social media
  • monte carlo