Sequence Prediction with Neural Segmental Models

Speaker: Hao Tang, Toyota Technological Institute at Chicago

Date: Tuesday, February 14, 2017

Time: 3:00 PM to 4:00 PM (Note: all times are in the Eastern Time Zone)

Public: Yes

Location: 32-G882 (Hewlett Room - Stata Center)

Host: Jim Glass, MIT CSAIL

Contact: Marcia G. Davidson, 617-253-3049, marcia@csail.mit.edu

Reminders to: seminars@csail.mit.edu, sls-seminars@csail.mit.edu

Reminder Subject: TALK: Sequence Prediction with Neural Segmental Models

Segments that span contiguous parts of the input, such as phonemes in speech, named entities in sentences, and actions in videos, occur frequently in sequence prediction problems. Recent work has shown that segmental models, a class of models that explicitly hypothesize segments, can significantly improve accuracy. However, segmental models suffer from slow decoding, which hampers the use of computationally expensive features. In addition, training segmental models requires detailed manual annotation, which makes collecting datasets expensive.

In the first part of the talk, I will introduce discriminative segmental cascades, a multi-pass framework that allows us to improve accuracy by adding higher-order features and neural segmental features while maintaining efficiency. I will also show how the cascades can be used to speed up inference and training. In the second part of the talk, I will discuss end-to-end training for segmental models with various loss functions. I will address the difficulty of end-to-end training from random initialization by comparing it to two-stage training. Finally, I will show how end-to-end training can eliminate the need for detailed manual annotation.

Hao Tang is a Ph.D. candidate at Toyota Technological Institute at Chicago. His main interests are in machine learning and its application to speech recognition, with a particular interest in discriminative training and segmental models. His work on segmental models was nominated for the Best Paper award at ASRU 2015, and an application of such models to fingerspelling recognition earned a Best Student Paper Award at ICASSP 2016. He received a B.S. degree in Computer Science and an M.S. degree in Electrical Engineering from National Taiwan University in 2007 and 2010, respectively.

This event is not part of a series.

Created by Marcia G. Davidson on Tuesday, February 07, 2017 at 11:28 AM.