Efficient reinforcement learning via singular value decomposition and reward shaping

Speaker: Clement Gehring, MIT EECS, MIT CSAIL

Date: Monday, March 28, 2022

Time: 10:30 AM to 11:30 AM (Note: all times are in the Eastern Time Zone)

Public: Yes

Location: Building 32, G449 (Patil/Kiva)

Event Type: Thesis Defense

Room Description:

Host:

Contact: Clement Gehring, gehring@csail.mit.edu

Relevant URL:

Speaker URL: None

Reminders to: seminars@csail.mit.edu

Reminder Subject: TALK: Efficient reinforcement learning via singular value decomposition and reward shaping

Live stream: https://youtu.be/daVvdmcfMtM

Abstract:
Reinforcement learning (RL) provides a general framework for data-driven decision making. However, the very generality that makes this approach applicable to a wide range of problems is also responsible for its well-known inefficiencies. In this talk, I will give an overview of RL and discuss informally why solving RL problems is fundamentally hard and thus often results in inefficient algorithms. We will then see how certain common problem properties can be leveraged to improve both computational and data efficiency while still producing algorithms applicable to a large class of problems we might care about. Specifically, this talk focuses on leveraging low-rank structure and classical planning heuristics. We first approach the subject with an overview of what low-rank structure is and how one would find it. We then show how low-rank structure in the state features can be exploited through a careful application of incremental singular value decomposition (SVD). With these tools, we show how to efficiently learn and use linear action models, which enables us to plan entirely in the low-rank space. We then discuss how prior knowledge about deterministic planning can be used to guide learning in a wide class of planning problems through reward shaping. We show how this enables us to efficiently learn improved heuristics that are specialized to a domain but still capable of generalizing to unseen planning problems.
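
As a rough illustration of the reward-shaping idea mentioned in the abstract, the sketch below uses potential-based shaping, a common form in which a heuristic supplies a potential over states. The abstract does not say which form of shaping the talk uses, so this choice is an assumption, as are the grid-world states, the Manhattan-distance heuristic, and every function name here.

```python
from typing import Callable, Tuple

State = Tuple[int, int]  # hypothetical grid-world state: (row, column)


def manhattan_heuristic(state: State, goal: State) -> int:
    """Planning-style heuristic: estimated number of steps left to the goal."""
    return abs(state[0] - goal[0]) + abs(state[1] - goal[1])


def shaped_reward(
    reward: float,
    state: State,
    next_state: State,
    potential: Callable[[State], float],
    gamma: float = 0.99,
) -> float:
    """Add the potential-based shaping term gamma * phi(s') - phi(s)."""
    return reward + gamma * potential(next_state) - potential(state)


GOAL: State = (3, 3)


def potential(state: State) -> float:
    # Assumed potential: negated heuristic, so states closer to the goal
    # have higher potential.
    return -float(manhattan_heuristic(state, GOAL))


# With a step cost of -1 and gamma = 1, a step toward the goal is shaped
# to 0 and a step away from it is shaped to -2.
step_toward = shaped_reward(-1.0, (0, 0), (0, 1), potential, gamma=1.0)
step_away = shaped_reward(-1.0, (0, 1), (0, 0), potential, gamma=1.0)
print(step_toward, step_away)
```

In this toy setting the shaping term rewards progress that the heuristic recognizes, while the original step cost is left in place.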

Research Areas:
AI & Machine Learning

Impact Areas:

This event is not part of a series.

Created by Clement Gehring on Wednesday, March 23, 2022 at 2:40 PM.