A picture of the energy landscape of deep neural networks
Speaker: Pratik Chaudhari, UCLA
Date: Monday, June 12, 2017
Time: 4:00 PM to 5:00 PM (Eastern Time)
Public: Yes
Location: 32-D507
Host: Bolei Zhou, MIT CSAIL
Contact: Bolei Zhou, bzhou@csail.mit.edu
Abstract:
Stochastic gradient descent (SGD) is the gold standard of optimization in deep learning. It does not, however, exploit the special structure and geometry of the loss functions we wish to optimize, viz. those of deep neural networks. In this talk, we will focus on the geometry of the energy landscape at local minima, with the aim of understanding the generalization properties of deep networks.
In practice, the Hessian at optima discovered by SGD has a large proportion of almost-zero eigenvalues, with very few positive or negative eigenvalues. We will first leverage this observation to construct an algorithm named Entropy-SGD that maximizes a local version of the free energy. Such a loss function favors flat regions of the energy landscape, which are robust to perturbations and hence more generalizable, while simultaneously avoiding sharp, poorly generalizable (although possibly deep) valleys. We will discuss connections of this algorithm with belief propagation and robust ensemble learning. Furthermore, we will establish a tight connection between such non-convex optimization algorithms and nonlinear partial differential equations. Empirical validation on CNNs and RNNs shows that Entropy-SGD and related algorithms compare favorably to state-of-the-art techniques in terms of both generalization error and training time.
arXiv: https://arxiv.org/abs/1611.01838, https://arxiv.org/abs/1704.04932
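For readers who want a concrete picture before the talk, the idea of optimizing a local free energy rather than the raw loss can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the speaker's implementation: an inner Langevin-style loop explores the neighborhood of the current weights `x` under the modified loss f(x') + (gamma/2)·||x - x'||², keeps a running average `mu` of the iterates, and the outer step moves `x` along gamma·(x - mu). The function names, the toy quadratic loss, and all hyperparameter values (`gamma`, `inner_lr`, `noise`, `alpha`) are hypothetical choices made for this example.

```python
import numpy as np


def local_entropy_grad(grad_f, x, gamma=1.0, inner_steps=100, inner_lr=0.1,
                       noise=1e-3, alpha=0.75, rng=None):
    """Estimate the gradient of the negative local entropy at x.

    Assumption: the estimate is gamma * (x - mu), where mu is a running
    average of noisy gradient-descent iterates on
    f(x') + gamma / 2 * ||x - x'||^2 started from x.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.asarray(x, dtype=float)
    x_prime = x.copy()
    mu = x.copy()
    for _ in range(inner_steps):
        # Gradient of the modified loss with respect to x'.
        d = grad_f(x_prime) - gamma * (x - x_prime)
        # Langevin-style update: gradient step plus small Gaussian noise.
        x_prime = (x_prime - inner_lr * d
                   + np.sqrt(inner_lr) * noise * rng.standard_normal(x.shape))
        # Exponential running average of the inner iterates.
        mu = (1 - alpha) * mu + alpha * x_prime
    return gamma * (x - mu)


def entropy_sgd(grad_f, x0, steps=30, lr=0.5, rng=None, **kwargs):
    """Outer loop: descend along the estimated local-entropy gradient."""
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x = x - lr * local_entropy_grad(grad_f, x, rng=rng, **kwargs)
    return x


if __name__ == "__main__":
    # Toy loss f(z) = 0.5 * z**2, so grad_f(z) = z (illustrative only).
    print(entropy_sgd(lambda z: z, np.array([2.0])))
```

On this toy quadratic the inner loop settles near gamma·x / (1 + gamma), so the outer update is a damped step toward the minimum at zero. With a very small `noise` value the inner loop behaves almost like plain gradient descent on the modified loss; a larger value would sample the neighborhood more broadly, which is where the preference for wide valleys comes from.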
Bio:
Pratik Chaudhari is a PhD candidate in Computer Science at UCLA. With his advisor Stefano Soatto, he focuses on optimization algorithms for deep networks. He holds Master's and Engineer's degrees in Aeronautics and Astronautics from MIT where he worked on stochastic estimation and randomized motion planning algorithms for urban autonomous driving with Emilio Frazzoli.
Created by Bolei Zhou on Friday, June 09, 2017 at 10:10 AM.