Attention and Activities in First Person Vision

Speaker: Yin Li, College of Computing - Georgia Tech

Date: Tuesday, February 21, 2017

Time: 11:00 AM to 12:00 PM (Eastern Time)

Public: Yes

Location: 32-D463

Host: Bolei Zhou, CSAIL MIT

Contact: Bolei Zhou

Advances in sensor miniaturization, low-power computing, and battery life have enabled the first generation of mainstream wearable cameras. Millions of hours of video have been captured by these devices, creating a record of our daily visual experiences at an unprecedented scale. This has created a major opportunity to develop new capabilities and products based on First Person Vision (FPV)--the automatic analysis of videos captured from wearable cameras. Meanwhile, vision technology is at a tipping point. Major progress has been made over the last few years in both visual recognition and 3D reconstruction. The stage is set for a grand challenge of activity recognition in FPV. My research focuses on understanding naturalistic daily activities of the camera wearer in FPV to advance both computer vision and mobile health.

In the first part of this talk, I will demonstrate that first person video has the unique property of encoding the intentions and goals of the camera wearer. I will introduce a set of first person visual cues that captures the users' intent and can be used to predict their point of gaze and the actions they are performing during activities of daily living. Our methods are demonstrated using a benchmark dataset that I helped to create. In the second part, I will describe a novel approach to measure children's social behaviors during naturalistic face-to-face interactions with an adult partner, who is wearing a camera. I will show that first person video can support fine-grained coding of gaze (differentiating looks to eyes vs. face), which is valuable for autism research. Going further, I will present a method for automatically detecting moments of eye contact. This is joint work with Zhefan Ye, Sarah Edmunds, Dr. Alireza Fathi, Dr. Agata Rozga and Dr. Wendy Stone.

Bio: Yin Li is currently a doctoral candidate in the School of Interactive Computing at the Georgia Institute of Technology. His research interests lie at the intersection of computer vision and mobile health. Specifically, he creates methods and systems to automatically analyze first person videos, known as First Person Vision (FPV). He has particular interests in recognizing the person's activities and developing FPV for health care applications. He is the co-recipient of the best student paper awards at MobiHealth 2014 and IEEE Face & Gesture 2015. His work has been covered by MIT Tech Review, WIRED UK and New Scientist.

See other events that are part of the Vision Seminar Series 2017.

Created by Bolei Zhou on Saturday, February 18, 2017 at 12:33 AM.