Yoram Singer: Memory-Efficient Adaptive Optimization for Humungous-Scale Learning

Speaker: Yoram Singer, Princeton University

Date: Tuesday, April 23, 2019

Time: 4:00 PM to 5:00 PM

Public: Yes

Location: Patil/Kiva G449 (Gates Bldg, Stata)

Event Type: Seminar

Host: Aleksander Madry

Contact: Deborah Goodwin, 617.324.7303, dlehto@csail.mit.edu

Speaker URL: None

Speaker Photo:

Reminders to: seminars@csail.mit.edu, theory-seminars@csail.mit.edu

Reminder Subject: TALK: Yoram Singer: Memory-Efficient Adaptive Optimization for Humungous-Scale Learning

Adaptive gradient-based optimizers such as AdaGrad and Adam are among the methods of choice in modern machine learning. These methods maintain second-order statistics for each model parameter, thus doubling the memory footprint of the optimizer. In behemoth-size applications, this memory overhead restricts both the size of the model that can be trained and the number of examples per mini-batch. We describe a novel, simple, and flexible adaptive optimization method with sublinear memory cost that retains the benefits of per-parameter adaptivity while allowing for larger models and mini-batches. We give convergence guarantees for our method and demonstrate its effectiveness in training some of the largest deep models used at Google.
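To make the memory trade-off concrete, here is a toy NumPy sketch (not the speaker's exact algorithm) contrasting standard AdaGrad, whose accumulator matches the parameter's shape, with a hypothetical factored variant that keeps only per-row and per-column statistics of an (m, n) matrix parameter, i.e. m + n numbers instead of m * n:

```python
import numpy as np

def adagrad_step(param, grad, accum, lr=0.1, eps=1e-8):
    # Standard AdaGrad: the accumulator has the same shape as the
    # parameter, doubling the optimizer's memory footprint.
    accum = accum + grad ** 2
    param = param - lr * grad / (np.sqrt(accum) + eps)
    return param, accum

def factored_adagrad_step(param, grad, row_acc, col_acc, lr=0.1, eps=1e-8):
    # Sublinear-memory sketch for an (m, n) parameter: store only
    # per-row and per-column maxima of the squared-gradient statistics.
    # The effective accumulator for entry (i, j) is
    # min(row_acc[i], col_acc[j]), an upper bound on the true sum of
    # squared gradients, so per-parameter step sizes stay conservative.
    est = np.minimum.outer(row_acc, col_acc) + grad ** 2
    row_acc = est.max(axis=1)   # m numbers
    col_acc = est.max(axis=0)   # n numbers
    param = param - lr * grad / (np.sqrt(est) + eps)
    return param, row_acc, col_acc
```

The factored step trades a looser (over-)estimate of each entry's statistic for an accumulator whose size grows with m + n rather than m * n, which is the kind of memory saving the abstract alludes to for very large embedding and softmax layers.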

Yoram Singer is the head of the Principles Of Effective Machine-learning (POEM) research group at Google Brain and a professor of Computer Science at Princeton University. He was a member of the technical staff at AT&T Research from 1995 to 1999 and an associate professor at the Hebrew University from 1999 to 2007. He is a fellow of AAAI, and his research on machine-learning algorithms has received several awards.


See other events that are part of the Theory of Computation Seminar (ToC) 2019.

Created by Deborah Goodwin on Thursday, April 18, 2019 at 11:00 AM.