Multimodal Human Intention Perception for Human Robot Interaction

 

One of the key abilities for effective human-robot interaction is a robot's understanding of human intention. In this project, we focus on recognizing human facial and body expressions from multimodal input signals, i.e., visual and audio.


Face Detection

Before working on facial expressions, we begin by improving an existing face detection technique for real-time performance. In this work, we propose combining a margin around the region of interest (ROI) with template matching to improve both the detection rate and the time performance.
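As a rough illustration of this idea (not the exact procedure from our papers), the sketch below wraps an OpenCV Haar-cascade detector: the last detected face box is expanded by a margin to form the search ROI for the next frame, and template matching over the same ROI serves as a fallback when the detector misses. The margin, detector parameters, and matching threshold are illustrative values only.

```python
import cv2

# Illustrative parameters only; the values used in the published work may differ.
MARGIN = 0.5            # fraction of the last box size added around it to form the ROI
MATCH_THRESHOLD = 0.7   # template-matching score required to accept a fallback match

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def expand_roi(box, frame_shape, margin=MARGIN):
    """Grow the last face box by a margin so the next search covers only a small ROI."""
    x, y, w, h = box
    dx, dy = int(w * margin), int(h * margin)
    x0, y0 = max(0, x - dx), max(0, y - dy)
    x1 = min(frame_shape[1], x + w + dx)
    y1 = min(frame_shape[0], y + h + dy)
    return x0, y0, x1, y1

def track(frames):
    """Yield (frame, face_box) pairs; face_box is None when the face is lost."""
    last_box, template = None, None
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if last_box is None:
            search, (ox, oy) = gray, (0, 0)                  # cold start: scan the whole frame
        else:
            x0, y0, x1, y1 = expand_roi(last_box, gray.shape)
            search, (ox, oy) = gray[y0:y1, x0:x1], (x0, y0)  # scan only the margin-expanded ROI
        faces = detector.detectMultiScale(search, scaleFactor=1.1, minNeighbors=5)
        if len(faces) > 0:
            x, y, w, h = faces[0]
            last_box = (x + ox, y + oy, w, h)
            template = gray[y + oy:y + oy + h, x + ox:x + ox + w]  # refresh the template
        elif template is not None:
            # Detector missed: fall back to template matching inside the same ROI.
            scores = cv2.matchTemplate(search, template, cv2.TM_CCOEFF_NORMED)
            _, best, _, loc = cv2.minMaxLoc(scores)
            if best >= MATCH_THRESHOLD:
                last_box = (loc[0] + ox, loc[1] + oy, template.shape[1], template.shape[0])
            else:
                last_box, template = None, None              # track lost: rescan the full frame next
        yield frame, last_box
```

Restricting both the detector and the fallback matcher to the margin-expanded ROI is where the speed-up comes from; a full-frame rescan is only needed when the track is lost.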

Hybrid Margin-based Region of Interest (MROI) Face Detection and Tracking System

We have collected real-life videos from the public domain to evaluate the performance of our proposed algorithms.

People

Data/Code

Publications

Videos

  • We proposed a hybrid model with margin-based real-time face detection and tracking, and tested six different implementations of real-time face detection and tracking on 10 video sources. The following videos demonstrate the effectiveness of combining template matching with a margin-based region of interest (ROI) in improving face detection rate and speed; each video compares the six implementations on one source video. The right frame is the source video annotated with performance data, and the left frame shows the data being processed in each frame. More details are given in the published paper “Hybrid Model with Margin-Based Real-Time Face Detection and Tracking”.

    Description of the annotations in the analysis video



  • We extended the hybrid margin-based region of interest face detection and tracking system with a state-of-the-art CNN-based main routine (a sketch of this swap is given below) and expanded the dataset to 20 real-life videos. The system was able to improve the performance of the state-of-the-art face detection system. More details are given in the published paper “Using Margin-based Region of Interest Technique with Multi-Task Convolutional Neural Network and Template Matching for Robust Face Detection and Tracking System”.
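As an illustration of how the CNN main routine slots in, the fragment below swaps the cascade detector in the earlier sketch for MTCNN. It assumes the Python `mtcnn` package, and the confidence threshold is an illustrative value; the margin-expanded ROI and the template-matching fallback stay as in the previous sketch.

```python
from mtcnn import MTCNN  # assumes the `mtcnn` Python package; other MTCNN ports expose similar calls

cnn_detector = MTCNN()

def detect_faces_in_roi(rgb_roi, min_confidence=0.9):
    """CNN main routine: run MTCNN only on the margin-expanded ROI, not the full frame."""
    results = cnn_detector.detect_faces(rgb_roi)   # each result has 'box': [x, y, w, h]
    return [tuple(r["box"]) for r in results if r["confidence"] >= min_confidence]
```

Since the CNN detector is heavier than a cascade, confining it to the small ROI matters even more here; the fallback behaviour is otherwise unchanged.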



 

Visual-Based Facial Expression Recognition

Armed with improved real-time face detection performance, we progressed to visual-based facial expression recognition, exploring both feature-based and deep learning approaches. In the course of developing a well-performing facial expression recognition system, we experimented with several publicly available datasets and compiled a well-organized dataset of our own.
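For the deep learning side, a minimal sketch of an expression classifier is shown below, assuming grayscale 48×48 face crops and eight emotion classes; the layer sizes and training setup are illustrative and are not the architecture from our publications.

```python
import tensorflow as tf

NUM_CLASSES = 8            # eight emotion classes, as in the CK+UBD experiments
INPUT_SHAPE = (48, 48, 1)  # illustrative assumption: grayscale 48x48 face crops

# A small illustrative CNN classifier; the actual architecture is described in the publications.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=INPUT_SHAPE),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_faces, train_labels, validation_data=(val_faces, val_labels), epochs=...)
```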

People

Data/Code

Publications

Videos

This video shows the result of real-time visual-based facial expression recognition using our recognition model trained on our dataset, CK+UBD. The model shows good real-time performance in recognizing expressions of eight (8) emotions.
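A minimal real-time inference loop is sketched below to illustrate how such a model is applied frame by frame. The label list, crop size, and preprocessing are illustrative assumptions, and `detector` / `model` stand in for any face detector and trained expression classifier (e.g. the sketches above).

```python
import cv2
import numpy as np

# Illustrative label order only; it does not reflect the class indices used in our experiments.
EMOTIONS = ["anger", "contempt", "disgust", "fear",
            "happiness", "neutral", "sadness", "surprise"]

def recognize_stream(capture, detector, model):
    """Detect a face in each frame, crop it, and classify the expression with the trained model."""
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = detector.detectMultiScale(gray, 1.1, 5)
        for (x, y, w, h) in faces:
            face = cv2.resize(gray[y:y+h, x:x+w], (48, 48)).astype("float32") / 255.0
            probs = model.predict(face.reshape(1, 48, 48, 1), verbose=0)[0]
            label = EMOTIONS[int(np.argmax(probs))]
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
            cv2.putText(frame, label, (x, y - 8),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
        cv2.imshow("expression", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

# Example usage: recognize_stream(cv2.VideoCapture(0), detector, model)
```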



Multi-modal Human Emotion Recognition

People