A picture of the energy landscape of deep neural networks

Speaker: Pratik Chaudhari, UCLA

Date: Monday, June 12, 2017

Time: 4:00 PM to 5:00 PM (all times are in the Eastern Time Zone)

Public: Yes

Location: 32-D507

Host: Bolei Zhou, MIT CSAIL

Contact: Bolei Zhou, bzhou@csail.mit.edu

Reminders to: seminars@csail.mit.edu, vision-meeting@csail.mit.edu

Abstract:
Stochastic gradient descent (SGD) is the gold standard of optimization in deep learning. It does not, however, exploit the special structure and geometry of the loss functions we wish to optimize, viz. those of deep neural networks. In this talk, we will focus on the geometry of the energy landscape at local minima, with the aim of understanding the generalization properties of deep networks.

In practice, optima discovered by SGD have a large proportion of almost-zero eigenvalues in the Hessian, with very few that are appreciably positive or negative. We will first leverage this observation to construct an algorithm named Entropy-SGD that maximizes a local version of the free energy. Such a loss function favors flat regions of the energy landscape, which are robust to perturbations and hence generalize better, while simultaneously avoiding sharp valleys that generalize poorly, even if they are deep. We will discuss connections of this algorithm with belief propagation and robust ensemble learning. Furthermore, we will establish a tight connection between such non-convex optimization algorithms and nonlinear partial differential equations. Empirical validation on CNNs and RNNs shows that Entropy-SGD and related algorithms compare favorably to state-of-the-art techniques in terms of both generalization error and training time.

arXiv: https://arxiv.org/abs/1611.01838, https://arxiv.org/abs/1704.04932
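
As a rough illustration of the idea in the abstract, the toy sketch below shows one way an update that favors flat regions can be organized: an inner loop of noisy gradient steps explores the neighborhood of the current weights, and an outer step moves the weights toward the running average of that exploration. This is a minimal sketch under stated assumptions, not the speaker's implementation. The function name `entropy_sgd_sketch`, all parameter names and default values, and the quadratic toy loss are hypothetical choices made for illustration only.

```python
import numpy as np


def entropy_sgd_sketch(grad_f, x0, outer_steps=50, inner_steps=20,
                       lr=0.5, inner_lr=0.1, gamma=1.0,
                       noise=1e-2, avg=0.75, seed=0):
    """Toy local-averaging optimizer (all names and defaults are assumptions).

    grad_f : callable returning the gradient of the loss at a point
    gamma  : strength of the coupling that keeps the inner loop near x
    noise  : scale of the Gaussian noise injected in the inner loop
    avg    : weight of the newest sample in the running average
    """
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    for _ in range(outer_steps):
        x_prime = x.copy()  # inner iterate, restarted at the current weights
        mu = x.copy()       # running average of the inner iterates
        for _ in range(inner_steps):
            # Gradient of the loss plus a quadratic pull back toward x.
            g = grad_f(x_prime) + gamma * (x_prime - x)
            # Noisy (Langevin-style) gradient step.
            x_prime = (x_prime - inner_lr * g
                       + np.sqrt(inner_lr) * noise * rng.standard_normal(x.shape))
            mu = (1.0 - avg) * mu + avg * x_prime
        # Outer step: move the weights toward the neighborhood average.
        x = x - lr * gamma * (x - mu)
    return x


# Toy usage: quadratic loss f(x) = 0.5 * ||x||^2, whose gradient is x.
x_final = entropy_sgd_sketch(lambda x: x, x0=[3.0, -2.0])
print(x_final)
```

On this convex toy loss the sketch simply drifts toward the minimum at the origin; the preference for wide valleys over sharp ones only becomes visible on non-convex losses, which the sketch does not attempt to demonstrate.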

Bio:
Pratik Chaudhari is a PhD candidate in Computer Science at UCLA. With his advisor Stefano Soatto, he focuses on optimization algorithms for deep networks. He holds Master's and Engineer's degrees in Aeronautics and Astronautics from MIT, where he worked with Emilio Frazzoli on stochastic estimation and randomized motion planning algorithms for urban autonomous driving.

This event is part of the Vision Seminar Series 2017.

Created by Bolei Zhou on Friday, June 09, 2017 at 10:10 AM.