Thesis Defense: Model Selection in Compositional Spaces

Speaker: Roger Grosse , MIT CSAIL

Date: Tuesday, December 17, 2013

Time: 11:00 AM to 12:00 PM Note: all times are in the Eastern Time Zone

Refreshments: 10:45 AM

Public: Yes

Location: 32-G449

Event Type:

Room Description:

Host: William Freeman

Contact: Maysoon Hamdiyyah, 617-253-6693, maysoon@csail.mit.edu

Relevant URL:

Speaker URL: None

Speaker Photo:
None

Reminders to: seminars@csail.mit.edu, vgn@csail.mit.edu

Reminder Subject: TALK: Thesis Defense: Model Selection in Compositional Spaces

Many complex probabilistic models are built by "composing" simpler models -- using one model to generate parameters or latent variables of another model. This allows us to express complex distributions over the observed data and to share statistical structure between different parts of a model. In this thesis, we present a space of matrix decomposition models defined by the composition of a small number of motifs of probabilistic modeling, including clustering, low rank factorizations, and binary latent factor models. This compositional structure is represented by a context-free grammar whose production rules correspond to these motifs. By exploiting the structure of this grammar, we can generically and efficiently infer latent components and estimate predictive likelihood for nearly 2500 model structures using a small toolbox of reusable algorithms. Using a greedy search over this grammar, we automatically choose the decomposition structure from raw data by evaluating only a small fraction of all models. The proposed method typically finds the correct structure for synthetic data and backs off gracefully to simpler models under heavy noise. It learns sensible structures for datasets as diverse as image patches, motion capture, 20 Questions, and U.S. Senate votes, all using exactly the same code.

We next consider the problem of marginal likelihood estimation in the components of the grammar. We present a novel method for obtaining ground truth marginal likelihood values for simulated data by combining forward and reverse sequential Monte Carlo estimators. This enables the rigorous quantitative comparison of marginal likelihood estimators on realistic problem sizes. We evaluate a wide variety of marginal likelihood estimators for the production rules of the grammar and give practical guidelines for marginal likelihood estimation in matrix factorization models. Empirically, the most successful methods are based on some form of "annealing," where one defines a sequence of distributions to bridge between a tractable initial distribution and the intractable target distribution. We present a theoretical framework for analyzing annealing paths, and based on this framework, we present a novel annealing path based on averaging the moments of the initial and target distributions. This path performs well empirically at estimating partition functions of restricted Boltzmann machines.

Research Areas:

Impact Areas:

This event is not part of a series.

Created by Maysoon Hamdiyyah Email at Thursday, December 12, 2013 at 2:25 PM.