Neural Network Bottleneck Features for Language Identification

Speaker: Pavel Matejka, BBN

Date: Monday, February 24, 2014

Time: 3:00 PM to 4:00 PM (all times are in the Eastern Time Zone)

Refreshments: 2:45 PM

Public: Yes

Location: 32-G882 Stata Center - Hewlett Room

Host: Najim Dehak and Jim Glass, MIT CSAIL

Contact: Marcia G. Davidson, 617-253-3049, marcia@csail.mit.edu

This talk presents the application of Neural Network Bottleneck (BN) features to Language Identification (LID). BN features are generally used for Large Vocabulary Speech Recognition in conjunction with conventional acoustic features, such as MFCC or PLP. We compare the BN features to several common types of acoustic features used in current state-of-the-art LID systems. The test set is from the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state-of-the-art detection capabilities on audio from highly degraded radio communication channels. On this type of noisy data, we show that, on average, the BN features provide a 45% relative improvement in the Cavg or Equal Error Rate (EER) metrics across several test duration conditions, with respect to our single best acoustic features.

Pavel Matejka received his Ph.D. degree (2009) in signal, image, and speech processing from the Department of Computer Graphics and Multimedia, Faculty of Information Technology, Brno University of Technology (BUT). He has been a postdoctoral researcher at Raytheon BBN Technologies since March 2012. He participated in the European Commission's projects M4, AMI, AMIDA, and MOBIO, in the language identification project sponsored by the US Air Force European Office of Aerospace Research and Development (EOARD), and in the language and speaker recognition projects sponsored by the Czech Ministry of Defense. He currently works on robust language and speaker identification for the DARPA RATS (Robust Automatic Transcription of Speech) program. As a member of the BUT team, he took part in all NIST language recognition evaluations (since 2003) and speaker recognition evaluations (since 2006), with excellent results. His research interests include robust feature extraction, speaker verification, language identification, and speech recognition, in particular phone recognition and neural networks.

This event is not part of a series.

Created by Marcia G. Davidson on Wednesday, February 19, 2014 at 11:23 AM.