Thesis Defense: Model Selection in Compositional Spaces
, MIT CSAIL
Date: Tuesday, December 17, 2013
Time: 11:00 AM to 12:00 PM Note: all times are in the Eastern Time Zone
Refreshments: 10:45 AM
Host: William Freeman
Contact: Maysoon Hamdiyyah, 617-253-6693, email@example.com
Speaker URL: None
TALK: Thesis Defense: Model Selection in Compositional Spaces
Many complex probabilistic models are built by "composing" simpler models -- using one model to generate parameters or latent variables of another model. This allows us to express complex distributions over the observed data and to share statistical structure between different parts of a model. In this thesis, we present a space of matrix decomposition models defined by the composition of a small number of motifs of probabilistic modeling, including clustering, low rank factorizations, and binary latent factor models. This compositional structure is represented by a context-free grammar whose production rules correspond to these motifs. By exploiting the structure of this grammar, we can generically and efficiently infer latent components and estimate predictive likelihood for nearly 2500 model structures using a small toolbox of reusable algorithms. Using a greedy search over this grammar, we automatically choose the decomposition structure from raw data by evaluating only a small fraction of all models. The proposed method typically finds the correct structure for synthetic data and backs off gracefully to simpler models under heavy noise. It learns sensible structures for datasets as diverse as image patches, motion capture, 20 Questions, and U.S. Senate votes, all using exactly the same code.
We next consider the problem of marginal likelihood estimation in the components of the grammar. We present a novel method for obtaining ground truth marginal likelihood values for simulated data by combining forward and reverse sequential Monte Carlo estimators. This enables the rigorous quantitative comparison of marginal likelihood estimators on realistic problem sizes. We evaluate a wide variety of marginal likelihood estimators for the production rules of the grammar and give practical guidelines for marginal likelihood estimation in matrix factorization models. Empirically, the most successful methods are based on some form of "annealing," where one defines a sequence of distributions to bridge between a tractable initial distribution and the intractable target distribution. We present a theoretical framework for analyzing annealing paths, and based on this framework, we present a novel annealing path based on averaging the moments of the initial and target distributions. This path performs well empirically at estimating partition functions of restricted Boltzmann machines.
Created by Maysoon Hamdiyyah at Thursday, December 12, 2013 at 2:25 PM.