Score Normalization and System Combination for Keyword Spotting in Speech

Speaker: Damianos Karakos, Raytheon BBN Technologies

Date: Thursday, November 21, 2013

Time: 4:00 PM to 5:00 PM (Eastern Time)

Refreshments: 3:45 PM

Public: Yes

Location: 32-G882 (Stata Center - Hewlett Room)

Host: Jim Glass, MIT CSAIL

Contact: Marcia G. Davidson, 617-253-3049, marcia@csail.mit.edu

Reminders to: seminars@csail.mit.edu, sls-seminars@csail.mit.edu

Keyword spotting, the task of finding words or phrases of interest in audio, is related to but still quite different from speech recognition, where a verbatim transcript is desired. Because keyword spotting aims to extract specific, content-bearing words or phrases, it is crucial to use a performance measure that does not weight all word tokens equally. One such measure is the Actual Term Weighted Value (ATWV), which has been used in the IARPA-funded Babel project; this talk focuses on techniques that take its maximization as the optimization objective. Two such techniques are score normalization and system combination. Score normalization converts the scores of different keywords so that they are commensurate with each other and correspond more closely to the probability of being correct than raw posteriors do. System combination merges the detections of multiple systems, thus combining the strengths of different detection modalities, tokenizations, or models. BBN applied both techniques successfully in the official evaluation of the Babel project in March/April 2013, resulting in large gains, on the order of 8-10 points (absolute), in five different languages.

(This work was done in collaboration with Richard M. Schwartz; the contribution of BBN colleagues S. Tsakalidis, I. Bulyko, L. Zhang, S. Ranjan, T. Ng, R. Hsiao, G. Saikumar, L. Nguyen, J. Makhoul, as well as other members of the BABELON team, is gratefully acknowledged.)
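
For readers unfamiliar with the ATWV measure named in the abstract, the following is a minimal sketch of how a term-weighted value is commonly computed, based on the publicly documented NIST spoken term detection definition rather than on anything specific to the talk. The weight of 999.9, the one-trial-per-second convention, and all names in the code (KeywordCounts, term_weighted_value, speech_seconds) are assumptions made for illustration. ATWV is this quantity evaluated at the system's actual decision threshold, that is, on the detections the system chose to output.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumption: the false-alarm weight used in the NIST STD definition.
BETA = 999.9


@dataclass
class KeywordCounts:
    """Per-keyword tallies after thresholding (hypothetical container)."""
    n_true: int         # reference occurrences of the keyword
    n_correct: int      # output detections that match a reference occurrence
    n_false_alarm: int  # output detections that match no reference occurrence


def term_weighted_value(counts: Iterable[KeywordCounts],
                        speech_seconds: float,
                        beta: float = BETA) -> float:
    """Return 1 minus the average over keywords of (P_miss + beta * P_fa).

    Keywords with no reference occurrences are skipped, since their
    miss probability is undefined.
    """
    scored = [c for c in counts if c.n_true > 0]
    if not scored:
        raise ValueError("no keyword has reference occurrences")

    total_cost = 0.0
    for c in scored:
        p_miss = 1.0 - c.n_correct / c.n_true
        # Assumption: one non-target trial per second of speech.
        p_fa = c.n_false_alarm / (speech_seconds - c.n_true)
        total_cost += p_miss + beta * p_fa
    return 1.0 - total_cost / len(scored)


if __name__ == "__main__":
    example = [
        KeywordCounts(n_true=100, n_correct=100, n_false_alarm=0),  # frequent
        KeywordCounts(n_true=2, n_correct=1, n_false_alarm=0),      # rare
    ]
    print(term_weighted_value(example, speech_seconds=3600.0))
```

Because the cost is averaged per keyword rather than per occurrence, a single miss on a rare keyword moves the value as much as many misses on a frequent one, which illustrates the abstract's point that the measure does not weight all word tokens equally.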

Damianos Karakos has been a Research Scientist with Raytheon BBN Technologies since June 2012. He received his PhD in Electrical Engineering from the University of Maryland in 2002. He was a postdoctoral fellow in the Department of Electrical Engineering and the Center for Language and Speech Processing at Johns Hopkins University between 2003 and 2007. He became an Assistant Research Professor in 2007 and, additionally, a Research Scientist with the Human Language Technology Center of Excellence at JHU in 2011. His research interests lie in the general area of statistical pattern recognition, with a focus on speech and language applications.

This event is not part of a series.

Created by Marcia G. Davidson on Friday, November 15, 2013 at 4:36 PM.