CSAIL Seminar: Reinforcement Learning Systems at DeepMind

Speaker: David Budden, DeepMind

Date: Tuesday, September 18, 2018

Time: 4:00 PM to 5:00 PM

Location: 32-D463

Event Type: Seminar

Host: Nir Shavit, MIT

Abstract: The many recent successes of deep reinforcement learning have resulted from innovation not just in algorithm design but also in the co-development of systems capable of scaling to thousands of machines and leveraging specialized hardware.

In this seminar I will cover three topics:

1. A brief introduction to off-policy reinforcement learning and policy gradient methods

2. Recent algorithmic improvements underlying the D4PG agent for continuous control and robotics, e.g. distributional RL and prioritized experience replay

3. Architectures and open research questions in distributing agents across many machines

Constant iteration between algorithm design and systems engineering is a hallmark of the Research Engineering role at DeepMind, and through this seminar I also hope to give a flavor of what this entails day-to-day.

Bio: David Budden is a Research Engineering Team Lead and Tech Lead for DeepMind's Machine Learning team. Before joining DeepMind, he worked as a postdoc in CSAIL with Prof. Nir Shavit.

David's research interests include generative models, few-shot imitation, and self-supervised learning. His main passion, however, is the intersection of machine learning research and systems engineering. David prepared and teaches DeepMind's internal training courses on distributed machine learning, and helped develop many of their engineering systems (e.g. Control Suite, ApeX) and state-of-the-art reinforcement learning agents (e.g. D4PG, DQfD).

