Yoram Singer: Memory-Efficient Adaptive Optimization for Humungous-Scale Learning

Speaker: Yoram Singer, Princeton University

Date: Tuesday, April 23, 2019

Time: 4:00 PM to 5:00 PM

Public: Yes

Location: Patil/Kiva Seminar Room, 32-G449 (Stata Center, Gates Tower)

Event Type: Seminar

Host: Aleksander Madry

Contact: Deborah Goodwin, 617.324.7303, dlehto@csail.mit.edu

Reminders to: seminars@csail.mit.edu, theory-seminars@csail.mit.edu

Reminder Subject: TALK: Yoram Singer: Memory-Efficient Adaptive Optimization for Humungous-Scale Learning

Abstract:
Adaptive gradient-based optimizers such as AdaGrad and Adam are among the methods of choice in modern machine learning. These methods maintain second-order statistics of each model parameter, thus doubling the memory footprint of the optimizer. In behemoth-size applications, this memory overhead restricts the size of the model being used as well as the number of examples in a mini-batch. We describe a novel, simple, and flexible adaptive optimization method with sublinear memory cost that retains the benefits of per-parameter adaptivity while allowing for larger models and mini-batches. We give convergence guarantees for our method and demonstrate its effectiveness in training some of the largest deep models used at Google.
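
The abstract does not spell out the construction, but the talk appears to correspond to the SM3 optimizer described by Anil, Gupta, Koren, and Singer (2019). As a minimal sketch of how sublinear memory can be achieved, assuming row and column cover sets for a single 2-D weight matrix, the idea can be written in a few lines of NumPy. The function name sm3_like_step and the hyperparameter values below are illustrative, not taken from the talk:

import numpy as np

def sm3_like_step(w, g, row_acc, col_acc, lr=0.1, eps=1e-8):
    # Per-entry estimate of the accumulated squared gradients,
    # recovered from the row/column accumulators and refreshed with
    # the current gradient. Persistent optimizer state is m + n
    # numbers instead of the m * n an AdaGrad second-moment slot needs.
    nu = np.minimum(row_acc[:, None], col_acc[None, :]) + g ** 2
    # Standard AdaGrad-style per-parameter adaptive step.
    w -= lr * g / np.sqrt(nu + eps)
    # Fold the refreshed estimates back into the sublinear state.
    row_acc[:] = nu.max(axis=1)
    col_acc[:] = nu.max(axis=0)

# Usage: optimizer state for a 4 x 3 weight matrix is 4 + 3 numbers.
w = np.zeros((4, 3))
row_acc, col_acc = np.zeros(4), np.zeros(3)
g = np.random.randn(4, 3)  # in practice, the gradient from backprop
sm3_like_step(w, g, row_acc, col_acc)

Note that the dense estimate nu is transient and no larger than the gradient itself, which must be materialized anyway; only the persistent per-parameter state shrinks from m * n to m + n.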

Bio:
Yoram Singer is the head of the Principles Of Effective Machine-learning (POEM) research group at Google Brain and a professor of Computer Science at Princeton University. He was a member of the technical staff at AT&T Research from 1995 through 1999 and an associate professor at the Hebrew University from 1999 through 2007. He is a Fellow of AAAI, and his research on machine-learning algorithms has received several awards.

See other events that are part of the Theory of Computation Seminar (ToC) 2019.

Created by Deborah Goodwin on Thursday, April 18, 2019 at 11:00 AM.