DNN Uncertainty Propagation Using GMM-Derived Uncertainty Features for Noise Robust ASR

Bibliographic Details
Published in: IEEE Signal Processing Letters, Vol. 25, no. 3, pp. 338-342
Main Authors: Nathwani, Karan; Vincent, Emmanuel; Illina, Irina
Format: Journal Article
Language: English
Published: IEEE 01-03-2018
Institute of Electrical and Electronics Engineers
Description
Summary: The uncertainty decoding framework is known to improve the performance of deep neural network (DNN)-based automatic speech recognition (ASR) in noisy environments. It operates by estimating the statistical uncertainty about the input features and propagating it to the output senone posteriors by sampling. Unfortunately, this approximate propagation scheme limits the performance improvement. In this letter, we exploit the fact that uncertainty propagation can be achieved in closed form for Gaussian mixture acoustic models (GMMs). We introduce new GMM-derived (GMMD) uncertainty features for robust DNN-based acoustic model training and decoding. The GMMD features are computed as the difference between the GMM log-likelihoods obtained with and without uncertainty. They are concatenated with conventional acoustic features and used as inputs to the DNN. We evaluate the resulting ASR performance on the CHiME-2 and CHiME-3 datasets. The proposed features improve performance on both datasets, for conventional decoding as well as for uncertainty decoding with different uncertainty estimation/propagation techniques.
ISSN: 1070-9908, 1558-2361
DOI: 10.1109/LSP.2018.2791534
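
The feature computation described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes diagonal-covariance GMMs, a Gaussian feature uncertainty given as a per-dimension variance, and one GMMD feature per GMM; under those assumptions, closed-form propagation amounts to adding the uncertainty variance to each component variance. The function names `gmm_loglik` and `gmmd_uncertainty_features` and all numeric values are hypothetical.

```python
import numpy as np


def gmm_loglik(x, weights, means, variances, unc_var=None):
    """Log-likelihood of one frame x (D,) under a diagonal-covariance GMM.

    weights: (K,), means: (K, D), variances: (K, D).
    If unc_var (D,) is given, it is added to every component variance,
    which is the closed-form way to account for Gaussian feature
    uncertainty (assumption of this sketch).
    """
    var = variances if unc_var is None else variances + unc_var
    # Per-component Gaussian log-density, summed over feature dimensions.
    log_comp = -0.5 * (np.log(2.0 * np.pi * var) + (x - means) ** 2 / var).sum(axis=1)
    a = np.log(weights) + log_comp
    m = a.max()  # log-sum-exp for numerical stability
    return m + np.log(np.exp(a - m).sum())


def gmmd_uncertainty_features(x, unc_var, gmms):
    """One feature per GMM: log-likelihood with uncertainty minus without."""
    return np.array([
        gmm_loglik(x, w, mu, var, unc_var=unc_var) - gmm_loglik(x, w, mu, var)
        for (w, mu, var) in gmms
    ])


# Toy usage with made-up dimensions: 3-dim features, 4 GMMs of 2 components.
rng = np.random.default_rng(0)
D, K, S = 3, 2, 4
gmms = [(np.full(K, 1.0 / K), rng.normal(size=(K, D)), np.ones((K, D)))
        for _ in range(S)]
x = rng.normal(size=D)          # conventional acoustic feature vector
unc_var = np.full(D, 0.5)       # hypothetical estimated uncertainty variance

feats = gmmd_uncertainty_features(x, unc_var, gmms)
dnn_input = np.concatenate([x, feats])  # concatenated DNN input, shape (D + S,)
```

When the uncertainty variance is zero, the two log-likelihoods coincide and every feature is zero, so in this sketch the features carry information only for frames where an uncertainty estimate is available.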