Communication Scene Analysis Based on Probabilistic Modeling of Human Gaze Behavior

Speaker: Kazuhiro Otsuka, NTT

Date: Friday, April 28, 2006

Time: 2:00 PM to 3:00 PM

Location: Patil Seminar Room (32-G449)

Host: Rob Miller, MIT CSAIL

Face-to-face conversation is one of the most basic forms of communication in our lives. In recent years, communication scene analysis has emerged as an attractive way of creating innovative multimedia applications for teleconferencing, archiving / summarizing meetings, and social agents / robots.

In this talk, I will describe our research aimed at automatically analyzing small-group conversations in face-to-face settings and identifying the basic structures of conversations from human nonverbal behavior extracted from video sequences. The novel features of our study include i) a focus on the eye gaze of participants during conversations as an indicator of addressing / listening behavior, ii) a probabilistic conversation model for inferring gaze directions and conversation structures, such as monologue and dialogue, from observed utterances and head directions, and iii) visual head tracking using monocular cameras for measuring head directions. I will also discuss two applications of our conversation analysis: automatic video editing of meeting scenes, and the development of measures for quantifying interpersonal influence and characterizing the roles of participants.

I will also describe other research topics at NTT Communication Science Laboratories, such as “t-Room”, a prototype system for immersive teleconferencing.

Speaker bio: Kazuhiro Otsuka is a research scientist at NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation, Japan. He received the B.E. and M.E. degrees from Yokohama National University, Japan, in 1993 and 1995, respectively. Since joining NTT in 1995, his work has focused on computer vision, video analysis, and communication scene analysis. He is currently a Ph.D. candidate at Nagoya University, Japan.

