Acoustic Factor Analysis for Robust Speaker Verification

Speaker: Taufiq Hasan, University of Texas at Dallas

Date: Tuesday, April 09, 2013

Time: 4:00 PM to 5:00 PM (Note: all times are in the Eastern Time Zone)

Refreshments: 5:45 PM

Public: Yes

Location: 32-D463 (Stata Center - Star Conference Room)

Host: Jim Glass and Najim Dehak, MIT CSAIL

Contact: Marcia Davidson, 617-253-3049, marcia@csail.mit.edu

Reminders to: seminars@csail.mit.edu, sls-seminars@csail.mit.edu

Reminder Subject: TALK: Acoustic Factor Analysis for Robust Speaker Verification

Variability due to channel and noise degradation is a pertinent problem in speaker recognition and verification. Conventional methods for mismatch compensation operate on utterance models, which are concatenated Gaussian Mixture Model (GMM) mean vectors adapted from a Universal Background Model (UBM). The key recent advancement in this domain is the development of low-dimensional i-vector representations of utterance models, which are easily classified by traditional machine learning techniques. Motivated by the approximate low-rank covariance structure of acoustic features, we propose a mixture of factor analysis models for the front-end. These models, termed Acoustic Factor Analysis (AFA), perform feature compensation/compaction in each mixture component before the i-vector extraction stage, effectively removing nuisance dimensions from the feature space. We demonstrate that the proposed approach and its variants achieve improved performance in noisy and channel-mismatched conditions, and can potentially replace GMMs in speaker recognition systems.

Biography

Taufiq Hasan is currently completing his Ph.D. in Electrical Engineering at The University of Texas at Dallas, and is a member of the Center for Robust Speech Systems (CRSS). He is developing acoustic modeling techniques for robust speaker recognition in noisy and channel-degraded conditions. He has also worked on front-end compensation strategies, speech enhancement, and audio/visual signal processing. He led the CRSS team for the 2012 NIST Speaker Recognition Evaluation (SRE), which resulted in a high-accuracy speaker recognition system and multiple research publications. As a summer intern at BOSCH, he was a co-inventor on a patent. He received the ISCA student grant for his paper at Odyssey 2012. Prior to his Ph.D., he received B.S. and M.S. degrees in Electrical and Electronic Engineering from Bangladesh University of Engineering and Technology (BUET), Dhaka, Bangladesh. He also briefly served as a Lecturer in the Electrical & Electronic Engineering Department at United International University, Dhaka.

This event is not part of a series.

Created by Marcia G. Davidson on Wednesday, June 19, 2013 at 6:25 AM.