Efficient reinforcement learning via singular value decomposition and reward shaping
Speaker: Clement Gehring, MIT EECS, MIT CSAIL
Date: Monday, March 28, 2022
Time: 10:30 AM to 11:30 AM (Eastern Time)
Public: Yes
Location: Building 32, G449 (Patil/Kiva)
Event Type: Thesis Defense
Contact: Clement Gehring, gehring@csail.mit.edu
Reminders to: seminars@csail.mit.edu
Live stream: https://youtu.be/daVvdmcfMtM
Abstract:
Reinforcement learning (RL) provides a general framework for data-driven decision making. However, the very generality that makes this approach applicable to a wide range of problems is also responsible for its well-known inefficiencies. In this talk, I will give an overview of RL and discuss informally why solving RL problems is fundamentally hard and, thus, often results in inefficient algorithms. We'll then see how certain common problem properties can be leveraged to improve both computational and data efficiency while still producing algorithms applicable to a large class of problems we might care about. Specifically, this talk focuses on leveraging low-rank structure and classical planning heuristics. We first give an overview of what low-rank structure is and how one would find it. We then show how low-rank structure in the state features can be exploited through a careful application of incremental singular value decomposition (SVD). With these tools, we show how to efficiently learn and use linear action models, which enables us to plan entirely in the low-rank space. We then discuss how prior knowledge about deterministic planning can be used to guide learning in a wide class of planning problems through reward shaping. We show how this enables us to efficiently learn improved heuristics that are specialized to a domain but still capable of generalizing to unseen planning problems.
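As a rough illustration of the reward-shaping idea mentioned in the abstract, the sketch below uses potential-based reward shaping, a standard formulation that the abstract does not explicitly name and that may differ from the method presented in the talk. The 1-D chain domain, the `heuristic` function, and every name here are hypothetical and chosen only for illustration.

```python
from typing import Callable

GOAL = 5  # hypothetical goal position on a 1-D chain of integer states


def heuristic(state: int) -> float:
    """Hypothetical planning heuristic: remaining distance to the goal."""
    return float(abs(GOAL - state))


def potential(state: int) -> float:
    """Shaping potential: states closer to the goal get a higher value."""
    return -heuristic(state)


def shaped_reward(
    reward: float,
    state: int,
    next_state: int,
    phi: Callable[[int], float],
    gamma: float = 1.0,
) -> float:
    """Add the potential-based shaping term gamma * phi(s') - phi(s)."""
    return reward + gamma * phi(next_state) - phi(state)


# Every step costs -1 in this toy problem (an assumption of the sketch).
toward = shaped_reward(-1.0, 0, 1, potential)  # step toward the goal: 0.0
away = shaped_reward(-1.0, 2, 1, potential)    # step away from the goal: -2.0
```

With this choice of potential, a step toward the goal is no longer penalized while a step away is penalized twice as much, so the heuristic guides learning. Because the shaping terms telescope along a trajectory, the shaped return differs from the original return only by the potentials of the first and last states when `gamma` is 1.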
Research Areas: AI & Machine Learning
Created by Clement Gehring on Wednesday, March 23, 2022 at 2:40 PM.