Deep Unfolding: Deriving Novel Deep Network Architectures from Model-based Inference Methods

Speaker: John Hershey, Mitsubishi Electric Research Laboratories (MERL)

Date: Tuesday, April 28, 2015

Time: 4:00 PM to 5:00 PM Note: all times are in the Eastern Time Zone

Refreshments: 3:45 PM

Public: Yes

Location: 32-G449 (Stata Center - Patil/Kiva Conference Room)

Host: Najim Dehak and Jim Glass, MIT CSAIL

Contact: Marcia G. Davidson, 617-253-3049, marcia@csail.mit.edu

Model-based methods and deep neural networks have both been tremendously successful paradigms in machine learning. In model-based methods, problem domain knowledge can be built into the constraints of the model, typically at the expense of difficulties during inference. In contrast, deterministic deep neural networks are constructed in such a way that inference is straightforward, but their architectures are rather generic and it can be unclear how to incorporate problem domain knowledge. This work aims to obtain some of the advantages of both approaches. To do so, we start with a model-based approach and unfold the iterations of its inference method to form a layer-wise structure. This results in novel neural-network-like architectures that incorporate our model-based constraints, but can be trained discriminatively to perform fast and accurate inference. This framework allows us to view conventional sigmoid networks as a special case of unfolding Markov random field inference, and leads to other interesting generalizations.
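The unfolding idea described above can be illustrated with a minimal sketch: an iterative fixed-point inference update is replicated across K layers, with the parameters untied per layer so they can be trained discriminatively. The sigmoid update below is a stand-in chosen to echo the connection to unfolded Markov random field inference; the function names and the random initialization are illustrative assumptions, not the speaker's actual implementation.

```python
import numpy as np

def inference_step(x, W, b):
    """One iteration of a (hypothetical) fixed-point inference update.
    A sigmoid update is used here, echoing the view of sigmoid networks
    as a special case of unfolded MRF inference."""
    return 1.0 / (1.0 + np.exp(-(W @ x + b)))

def unfolded_network(x0, params):
    """Unfold K iterations of the update into a layer-wise structure.
    Each layer gets its own (W_k, b_k), so after unfolding the
    parameters can be trained discriminatively, layer by layer."""
    x = x0
    for W_k, b_k in params:
        x = inference_step(x, W_k, b_k)
    return x

# Illustrative setup: 3 unfolded iterations on a 4-dimensional state.
rng = np.random.default_rng(0)
K, d = 3, 4
params = [(0.1 * rng.standard_normal((d, d)), np.zeros(d)) for _ in range(K)]
out = unfolded_network(np.ones(d), params)
```

Untying the parameters across layers is the step that turns a fixed inference algorithm into a trainable network: with tied parameters the loop is exactly K iterations of the original method.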

We show how it can be applied to other models, such as non-negative matrix factorization, to obtain a new kind of non-negative deep neural network that can be trained using a multiplicative backpropagation-style update algorithm. In speech enhancement experiments, we show that our approach is competitive with conventional neural networks while using fewer parameters.
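As a rough sketch of the NMF case, the classical Lee–Seung multiplicative update for the activations H can itself be unfolded into layers, with a separate basis matrix per layer. This is only a forward-pass illustration under the Euclidean NMF objective; the layer weights here are random nonnegative matrices (an assumption for the demo), and no discriminative training is shown.

```python
import numpy as np

def nmf_layer(H, V, W, eps=1e-8):
    """One multiplicative NMF update for the activations H
    (Euclidean objective, Lee-Seung form): H <- H * (W^T V) / (W^T W H).
    Because the update is multiplicative, nonnegativity is preserved,
    which is what makes the unfolded network 'non-negative'."""
    numerator = W.T @ V
    denominator = W.T @ (W @ H) + eps  # eps guards against division by zero
    return H * (numerator / denominator)

def deep_nmf(V, Ws, H0):
    """Unfold K multiplicative updates; untying W per layer yields a
    layer-wise non-negative structure whose weights could then be
    trained with a multiplicative backpropagation-style rule."""
    H = H0
    for W in Ws:
        H = nmf_layer(H, V, W)
    return H

# Illustrative setup: factor a random 6x5 nonnegative matrix with rank 3,
# unfolding 4 update iterations with per-layer bases.
rng = np.random.default_rng(0)
m, r, n, K = 6, 3, 5, 4
V = rng.random((m, n))
Ws = [rng.random((m, r)) + 0.1 for _ in range(K)]  # strictly positive bases
H = deep_nmf(V, Ws, np.ones((r, n)))
```

With all Ws tied to a single W, the loop reduces to K ordinary NMF iterations; untying them is the unfolding step.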

This event is not part of a series.

Created by Marcia G. Davidson on Monday, April 27, 2015 at 2:35 PM.