Multi-sensory perception from top to down

Speaker: Anna Min, CSAIL

Date: Monday, September 16, 2024

Time: 4:00 PM to 4:30 PM (all times are in the Eastern Time Zone)

Public: Yes

Location: 32-G882, Hewlett Room

Event Type: Seminar

Host: Thien Le, CSAIL

Contact: Thien Le, thienle@csail.mit.edu

Reminders to: mitml@mit.edu, lids-seminars@mit.edu, seminars@csail.mit.edu

Reminder Subject: TALK: Multi-sensory perception from top to down

Abstract: Human sensory experiences, such as vision, hearing, touch, and smell, serve as natural interfaces for perceiving and reasoning about the world around us. Understanding 3D environments is crucial for applications like video processing, robotics, and augmented reality. This work explores how material properties and microgeometry can be learned through cross-modal associations between sight, sound, and touch. I will introduce a method that leverages in-the-wild online videos to study interactable audio generation via dense visual cues. Additionally, I will share recent advancements in multimodal scene understanding and discuss future directions for the field.

Bio: Anna is a senior undergraduate at Tsinghua University. Her previous research focused on multimodal perception, particularly audio and vision. She is an intern in Jim Glass's group.

Research Areas:
AI & Machine Learning

This event is part of ML Tea.

Created by Thien Le on Wednesday, September 11, 2024 at 7:00 PM.