Capturing and visualizing high order statistics with counting grids: From extreme image reconstruction to viral load regression to text skimming

Speaker: Nebojsa Jojic, Microsoft Research

Date: Wednesday, October 09, 2013

Time: 11:00 AM to 12:00 PM (all times are in the Eastern Time Zone)

Refreshments: 10:45 AM

Public: Yes

Location: 32-G449 (Kiva/Patil)

Host: William T. Freeman

Contact: Hossein Mobahi, 6172536693, hmobahi@csail.mit.edu

Reminders to: seminars@csail.mit.edu

A counting grid is a simple generative model of bags of features. It consists of cells, each holding a feature distribution, and it generates the features for each bag by first choosing a window into the grid at random and then filling the bag with features sampled from that window. A vision researcher may recognize this generative process as the preprocessing step in many vision tasks, in which an image region is modeled as a bag of features: after features are extracted from the image region, their locations are forgotten and we are left with a disordered jumble.

I will discuss the inverse process and its uses. Suppose I am given such bags coming from many overlapping regions in one image. Suppose I am not given the region-bag correspondences, and suppose I am not even given the original image. Can I still figure out where the regions must have come from, where within these regions the bags’ features originated, and what the image must have looked like (or rather, what the spatial arrangement of all features in the image was)? It turns out that this can be done surprisingly well, which raises the question of whether such a map of features can be constructed for other data types, that is, from bags of features that do not naturally come from a 2-D spatial arrangement, such as bag-of-words representations of language or molecular concentrations. I will address these questions, as well as the applications that came out of answering them, including classification and recognition tasks in vision, predicting viral load levels in HIV patients, and summarizing large collections of cooking recipes, science articles, and IMDb movies in a way that allows skim reading through thousands of abstracts in a minute while searching for several articles of interest in parallel.
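
The generative process described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the speaker's implementation: the names make_grid and sample_bag, the grid and window sizes, the randomly initialized cell distributions, the uniform choice of cells within a window, and the wrap-around of windows at the grid edges are all hypothetical choices made for the example.

```python
import random


def make_grid(rows, cols, num_features, rng):
    """Build a grid in which every cell holds a normalized feature distribution.

    The distributions are random here (an assumption for illustration);
    in practice they would be learned from data.
    """
    grid = []
    for _ in range(rows):
        row = []
        for _ in range(cols):
            weights = [rng.random() for _ in range(num_features)]
            total = sum(weights)
            row.append([w / total for w in weights])
        grid.append(row)
    return grid


def sample_bag(grid, window, bag_size, rng):
    """Generate one bag: choose a window at random, then sample features from it."""
    rows, cols = len(grid), len(grid[0])
    win_rows, win_cols = window
    num_features = len(grid[0][0])

    # Step 1: choose the window's top-left corner uniformly at random.
    top = rng.randrange(rows)
    left = rng.randrange(cols)

    counts = [0] * num_features
    for _ in range(bag_size):
        # Step 2: pick a cell inside the window, wrapping around the grid
        # edges (the wrap-around is an assumption of this sketch).
        r = (top + rng.randrange(win_rows)) % rows
        c = (left + rng.randrange(win_cols)) % cols
        # Step 3: draw one feature from that cell's distribution.
        feature = rng.choices(range(num_features), weights=grid[r][c])[0]
        counts[feature] += 1

    # The bag keeps only feature counts; positions inside the window are lost.
    return (top, left), counts


rng = random.Random(0)
grid = make_grid(8, 8, 5, rng)
location, bag = sample_bag(grid, (3, 3), 20, rng)
```

The returned bag is just a vector of feature counts. The inverse problem raised in the talk is to recover the hidden window location, and ultimately the grid itself, from many such bags.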

This event is not part of a series.

Created by Hossein Mobahi on Thursday, September 26, 2013 at 2:23 PM.