Towards Open-domain Spoken Dialogue Systems

Speaker: Steve Young, Cambridge University

Date: Monday, October 28, 2013

Time: 4:30 PM to 5:30 PM (all times are in the Eastern Time Zone)

Refreshments: 4:15 PM

Public: Yes

Location: 32-155

Host: Jim Glass and Victor Zue, MIT CSAIL

Contact: Marcia G. Davidson, 617-253-3049

In contrast to traditional rule-based approaches to building spoken dialogue systems, recent research has shown that it is possible to implement all of the required functionality with statistical models trained by a combination of supervised learning and reinforcement learning. This new approach to spoken dialogue is based on the mathematics of partially observable Markov decision processes (POMDPs), in which user inputs are treated as observations of some underlying hidden state, the system maintains a belief state (a probability distribution over the possible hidden states), and system responses are determined by a policy that maps belief states into actions.
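
The POMDP idea above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the talk: the three-value toy domain, the observation probabilities, the confidence threshold, and the function names are all hypothetical. It also simplifies in two ways: the user's goal is assumed not to change during the dialogue (so there is no transition model), and the policy is a hand-written threshold rule standing in for one that would be learned by reinforcement learning.

```python
from typing import Dict

# Hypothetical toy domain: the hidden state is the food type the user wants.
STATES = ["italian", "chinese", "indian"]


def observation_prob(obs: str, state: str) -> float:
    """Assumed observation model P(obs | state).

    The speech recogniser is assumed to report the true goal with
    probability 0.7 and each of the two other values with probability 0.15.
    """
    return 0.7 if obs == state else 0.15


def update_belief(belief: Dict[str, float], obs: str) -> Dict[str, float]:
    """Bayesian belief update for a hidden state that does not change."""
    unnormalised = {s: observation_prob(obs, s) * p for s, p in belief.items()}
    total = sum(unnormalised.values())
    return {s: v / total for s, v in unnormalised.items()}


def policy(belief: Dict[str, float], threshold: float = 0.8) -> str:
    """Map a belief state to a system action.

    Hand-written stand-in for a learned policy: inform when the most
    likely goal is confident enough, otherwise ask for confirmation.
    """
    best = max(belief, key=belief.get)
    if belief[best] >= threshold:
        return f"inform({best})"
    return f"confirm({best})"


# Start from a uniform belief and process two noisy user inputs.
belief = {s: 1.0 / len(STATES) for s in STATES}
actions = []
for obs in ["italian", "italian"]:
    belief = update_belief(belief, obs)
    actions.append(policy(belief))

print(actions)
```

With these assumed numbers, the belief in "italian" rises to 0.7 after the first input, so the system asks for confirmation, and to roughly 0.92 after the second, so it then provides information. The point of the sketch is only that the system acts on a distribution over what the user might want rather than on a single recognised hypothesis.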

Virtually all current spoken dialogue systems are designed to operate in a specific, carefully defined domain such as restaurant information, appointment booking, or product installation support. However, if voice is to become a significant input modality for accessing web-based information and services, then techniques will be needed to enable spoken dialogue systems to operate within open domains.

The first part of the talk will briefly review the basic ideas of POMDP dialogue systems as currently applied to closed domains. Unlike many other areas of machine learning, spoken dialogue systems always have a user on hand to provide supervision. Based on this idea, the second part of the talk will describe a number of techniques by which implicit user supervision can allow a spoken dialogue system to adapt online to extended domains.

BIOGRAPHY: Steve Young is Professor of Information Engineering and Senior Pro-Vice Chancellor at Cambridge University. His main research interests lie in the area of spoken language systems including speech recognition, speech synthesis and dialogue management. He is the inventor and original author of the HTK Toolkit for building hidden Markov model-based recognition systems, and he co-developed the HTK large vocabulary speech recognition system. More recently he has worked on statistical dialogue systems and pioneered the use of Partially Observable Markov Decision Processes for modelling them.

He is a Fellow of the Royal Academy of Engineering, the International Speech Communication Association, the Institution of Engineering and Technology, and the Institute of Electrical and Electronics Engineers. In 2004, he was a recipient of an IEEE Signal Processing Society Technical Achievement Award; in 2010, he received the ISCA Medal for Scientific Achievement; and in 2013, he received the European Signal Processing Society Individual Technical Achievement Award.


This CSAIL seminar series, organized in cooperation with the Siri team at Apple, invites leading researchers in human language technology (HLT) to give lectures that introduce the fundamentals of spoken language systems, assess the current state of the art, outline challenges, and speculate on how they can be met. Lectures occur 2-3 times per semester and should be accessible to undergraduates with some technical background.

This talk is part of the Human Language Technology Distinguished Lecture Series 2013/2014.

Created by Marcia G. Davidson on Thursday, October 10, 2013 at 11:40 AM.