Thesis Defense: Linguistically Motivated Models for Lightly-Supervised Dependency Parsing

Speaker: Tahira Naseem, MIT CSAIL

Date: Wednesday, January 29, 2014

Time: 10:00 AM to 11:00 AM (Eastern Time)

Public: Yes

Location: 32-D463

Host: Regina Barzilay, MIT CSAIL

Contact: Marcia G. Davidson, 617-253-3049

Today, the top-performing parsing algorithms rely on the availability of annotated data for learning the syntactic structure of a language. Unfortunately, syntactically annotated texts are available only for a handful of languages. The research presented in this thesis aims at developing parsing models that can perform effectively in a lightly-supervised training regime. In particular, we focus on formulating linguistically aware models of dependency parsing that can exploit readily available sources of linguistic knowledge, such as language universals and typological features. This type of linguistic knowledge can be used to motivate model design and/or to guide the inference procedure.

We propose three alternative approaches for incorporating linguistic information into a lightly-supervised training setup. First, we show that linguistic information can be used in the form of rules on top of standard unsupervised parsing models to guide the inference procedure. Next, we show that a linguistically aware model design greatly facilitates crosslingual parser transfer by leveraging syntactic connections between languages. Finally, we propose a corpus-level Bayesian framework that allows multiple linguistic views of data in a single model. Our models consistently outperform existing unsupervised and transfer-based parsers across a diverse set of languages.

Thesis Advisor: Regina Barzilay
Thesis Committee: Tommi Jaakkola and Ryan McDonald

This event is not part of a series.

Created by Marcia G. Davidson on Friday, January 24, 2014 at 6:36 PM.