Thesis Defense: Efficient and Robust Algorithms for Practical Machine Learning
Speaker: Yujia Bao, MIT CSAIL
Date: Friday, May 06, 2022
Time: 12:00 PM to 1:00 PM Note: all times are in the Eastern Time Zone
Public: Yes
Location: 32-D463 (Stata Center - Star Conference Room)
Event Type: Thesis Defense
Host: Regina Barzilay, MIT CSAIL
Contact: Yujia Bao, yujia@csail.mit.edu
Speaker URL: https://people.csail.mit.edu/yujia/
This seminar series is online for everyone.
MIT Community members may attend in person.
For remote access to this event:
https://mit.zoom.us/j/4492242635
Thesis Advisors: Regina Barzilay
Thesis Committee: Dina Katabi, Pulkit Agrawal
Abstract:
Machine learning models are biased when trained on biased datasets. While many approaches have been proposed to mitigate these biases, they often require human experts to identify and annotate the biases a priori. This thesis proposes three efficient algorithms for learning robust models. These algorithms do not require explicit annotations of the biases, enabling practical machine learning.
First, we introduce an algorithm that operates on data collected from multiple environments, across which the correlations between unstable (bias) features and the label may vary. While these biases are not explicitly annotated, we show that when a classifier trained on one environment makes predictions on examples from a different environment, its mistakes are informative of the unstable correlations. We leverage these mistakes to create groups of examples whose interpolation yields a distribution with only stable correlations.
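The cross-environment idea can be illustrated with a toy sketch. This is not the thesis's actual algorithm; the synthetic environments, `make_env`, and the plain logistic-regression helper are all invented here for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_env(n, bias_strength):
    """Synthetic environment: x_s is a stable signal for the label y,
    while x_u (the unstable feature) agrees with y only
    `bias_strength` of the time."""
    y = rng.integers(0, 2, n)
    x_s = y + 0.3 * rng.normal(size=n)
    agree = rng.random(n) < bias_strength
    x_u = np.where(agree, y, 1 - y) + 0.3 * rng.normal(size=n)
    return np.column_stack([x_s, x_u]), y

def fit_logreg(X, y, steps=500, lr=0.5):
    """Plain gradient-descent logistic regression with a bias term."""
    Xb = np.column_stack([X, np.ones(len(X))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict(w, X):
    Xb = np.column_stack([X, np.ones(len(X))])
    return (Xb @ w > 0).astype(int)

# Env 1: the unstable feature agrees with the label 95% of the time;
# Env 2: that spurious agreement drops to 60%.
X1, y1 = make_env(2000, 0.95)
X2, y2 = make_env(2000, 0.60)

w = fit_logreg(X1, y1)        # classifier picks up the unstable correlation
pred2 = predict(w, X2)
mistakes = pred2 != y2        # its errors on the other environment
# Partition env-2 examples by (prediction, label): mistakes concentrate
# where the unstable feature disagrees with the label, so each cell is a
# group, and interpolating groups that share a label dilutes the unstable
# correlation while preserving the stable one.
groups = {(p, t): np.flatnonzero((pred2 == p) & (y2 == t))
          for p in (0, 1) for t in (0, 1)}
```

The key observation the sketch reproduces: the env-1 classifier is accurate on its own environment, yet its env-2 mistakes are systematic rather than random, which is what makes them useful for grouping.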
We then consider the setting where we lack access to multiple environments, a common scenario for new or resource-limited tasks. We show that in real-world applications, related tasks often share similar biases. Based on this observation, we propose an algorithm that infers bias features from a resource-rich source task and transfers this knowledge to the target task.
Finally, we propose an algorithm for automatic bias detection where we are only given a set of input-label pairs. Our algorithm learns to split the dataset so that classifiers trained on the training split cannot generalize to the testing split. The performance gap provides a proxy for measuring the degree of bias in the learned features and can therefore be used to identify unknown biases.
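The thesis's algorithm learns the split adversarially; the hand-rolled sketch below (synthetic data and helper functions are invented here, not taken from the thesis) only illustrates why the train-test performance gap is a useful proxy for bias: a split aligned with a spurious shortcut produces a far larger gap than a random split.

```python
import numpy as np

rng = np.random.default_rng(1)

# Dataset with a stable signal and a spurious shortcut feature that
# matches the label on 90% of examples.
n = 4000
y = rng.integers(0, 2, n)
x_stable = y + 0.5 * rng.normal(size=n)
match = rng.random(n) < 0.9
x_shortcut = np.where(match, y, 1 - y) + 0.1 * rng.normal(size=n)
X = np.column_stack([x_stable, x_shortcut])

def fit_logreg(X, y, steps=500, lr=0.5):
    """Plain gradient-descent logistic regression with a bias term."""
    Xb = np.column_stack([X, np.ones(len(X))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def generalization_gap(train_idx, test_idx):
    """Train-minus-test accuracy for a given train/test split."""
    w = fit_logreg(X[train_idx], y[train_idx])
    Xb = np.column_stack([X, np.ones(len(X))])
    acc = lambda idx: ((Xb[idx] @ w > 0).astype(int) == y[idx]).mean()
    return acc(train_idx) - acc(test_idx)

perm = rng.permutation(n)
random_gap = generalization_gap(perm[: n // 2], perm[n // 2:])
# A split aligned with the shortcut: train where it matches the label,
# test where it does not. The trained model leans on the shortcut,
# so its accuracy collapses on the test split and the gap is large.
biased_gap = generalization_gap(np.flatnonzero(match), np.flatnonzero(~match))
```

Here the biased split is known in advance; the thesis's contribution is to discover such a split automatically by maximizing this gap over candidate splits.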
Research Areas:
AI & Machine Learning
Created by Yujia Bao at Wednesday, March 16, 2022 at 4:33 PM.