Thesis Defense: Representing Unstructured Environments for Robotic Manipulation: Toward Generalization, Dexterity and Robustness

Speaker: Wei Gao, MIT CSAIL

Date: Wednesday, August 04, 2021

Time: 1:00 PM to 2:00 PM (Note: all times are in the Eastern Time Zone)

Public: Yes


Event Type: Thesis Defense

Host: Prof. Russ Tedrake, CSAIL MIT

Contact: Wei Gao

Speaker URL: None

Reminder Subject: TALK: Thesis Defense: Representing Unstructured Environments for Robotic Manipulation


We would like robot manipulators that can handle a diverse range of objects and environments and perform challenging manipulation tasks, while being robust enough that deployment at scale is feasible. This thesis aims at such a generalizable, dexterous and robust manipulation pipeline. At the core of our approach is the representation of the environment. In particular, how should we represent the unstructured world such that it is useful for 1) developing a capable manipulation pipeline and 2) performing a thorough robustness evaluation of that pipeline?

To answer question 1), we propose the keypoint affordance, a novel object representation consisting of 3D semantic keypoints. Existing works typically use 6 Degree-of-Freedom (DOF) poses to represent the manipulated objects. However, representing an object with a parameterized transformation defined on a fixed template cannot handle large shape mismatches among different objects. In contrast, our keypoint representation captures task-related geometric information while ignoring irrelevant details, which enables generalization to unknown objects. We implement perception, planning and feedback control modules on top of the keypoint representation and integrate them into a fully functional perception-to-action manipulation pipeline.

The second part of this thesis studies the robustness of the pipeline and attempts to answer question 2). Because a parametric (pose-based) object representation is infeasible, we do not have a continuous input domain for investigating how object geometry impacts robustness, and existing methods require such a domain. To address this challenge, we model the factors that affect robustness as a structured distribution over variables (e.g. the camera pose) combined with an empirical distribution that describes visual properties (e.g. the object geometry/texture). We then formulate the robustness evaluation as a failure rate estimation problem on this combined distribution and propose an efficient graph-based algorithm to solve it. Our formulation is applied to the developed manipulation pipeline, and it can benefit many other cyber-physical systems, such as autonomous cars.
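
To make the failure rate formulation concrete, the following is a minimal sketch of a plain Monte Carlo baseline, not the graph-based algorithm proposed in the thesis. It assumes a structured variable (here a hypothetical camera yaw drawn from a uniform range) and an empirical distribution given as a finite list of objects. The names estimate_failure_rate, sample_camera_yaw, toy_trial and the object list are all illustrative assumptions, and toy_trial stands in for running the real manipulation pipeline.

```python
import random


def estimate_failure_rate(run_trial, sample_structured, empirical_items,
                          n_samples=1000, seed=0):
    """Naive Monte Carlo estimate of the failure rate.

    run_trial(structured, item) returns True on success, False on failure.
    sample_structured(rng) draws the structured variables (e.g. camera pose).
    empirical_items is a finite set of samples (e.g. scanned objects).
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not empirical_items:
        raise ValueError("empirical_items must be non-empty")

    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        structured = sample_structured(rng)   # structured distribution
        item = rng.choice(empirical_items)    # empirical distribution
        if not run_trial(structured, item):
            failures += 1
    return failures / n_samples


# Hypothetical stand-ins for illustration only.
def sample_camera_yaw(rng):
    # Assumed: camera yaw in radians, uniform over a fixed range.
    return rng.uniform(-1.0, 1.0)


objects = [
    {"name": "mug_a", "width": 0.08},
    {"name": "mug_b", "width": 0.12},
    {"name": "mug_c", "width": 0.09},
]


def toy_trial(yaw, obj):
    # Assumed toy rule: succeed for moderate yaw and narrow objects.
    return abs(yaw) < 0.8 and obj["width"] < 0.10


rate = estimate_failure_rate(toy_trial, sample_camera_yaw, objects)
print(f"estimated failure rate: {rate:.3f}")
```

A baseline like this needs many trials when failures are rare, which is one reason a more efficient estimation algorithm is of interest.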

This event is not part of a series.

Created by Wei Gao on Friday, July 30, 2021 at 9:32 AM.