Search Results - "Seltzer, Michael L"
1
Deep beamforming networks for multi-channel speech recognition
Published in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-03-2016) “…Despite the significant progress in speech recognition enabled by deep neural networks, poor performance persists in some scenarios. In this work, we focus on…”
Conference Proceeding Journal Article
2
Deep Neural Networks for Single-Channel Multi-Talker Speech Recognition
Published in IEEE/ACM transactions on audio, speech, and language processing (01-10-2015) “…We investigate techniques based on deep neural networks (DNNs) for attacking the single-channel multi-talker speech recognition problem. Our proposed approach…”
Journal Article
3
Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning
Published in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-04-2015) “…In this paper, we propose two approaches to improve deep neural network (DNN) acoustic models for speech recognition in reverberant environments. Both methods…”
Conference Proceeding
4
Reconstruction of missing features for robust speech recognition
Published in Speech communication (01-09-2004) “…Speech recognition systems perform poorly in the presence of corrupting noise. Missing feature methods attempt to compensate for the noise by removing noise…”
Journal Article
5
A study on data augmentation of reverberant speech for robust speech recognition
Published in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-03-2017) “…The environmental robustness of DNN-based acoustic models can be significantly improved by using multi-condition training data. However, as data collection is…”
Conference Proceeding
6
A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition
Published in Speech communication (01-09-2004) “…Missing feature methods of noise compensation for speech recognition operate by first identifying components of a spectrographic representation of speech that…”
Journal Article
7
An investigation of deep neural networks for noise robust speech recognition
Published in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (01-05-2013) “…Recently, a new acoustic model based on deep neural networks (DNN) has been introduced. While the DNN has generated significant improvements over GMM-based…”
Conference Proceeding
8
Transformer-Based Acoustic Modeling for Hybrid Speech Recognition
Published in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2020) “…We propose and evaluate transformer-based acoustic models (AMs) for hybrid speech recognition. Several modeling choices are discussed in this work, including…”
Conference Proceeding
9
Alignment Restricted Streaming Recurrent Neural Network Transducer
Published in 2021 IEEE Spoken Language Technology Workshop (SLT) (19-01-2021) “…There is a growing interest in the speech community in developing Recurrent Neural Network Transducer (RNN-T) models for automatic speech recognition (ASR)…”
Conference Proceeding
10
Toward Human Parity in Conversational Speech Recognition
Published in IEEE/ACM transactions on audio, speech, and language processing (01-12-2017) “…Conversational speech recognition has served as a flagship speech recognition task since the release of the Switchboard corpus in the 1990s. In this paper, we…”
Journal Article
11
Training Wideband Acoustic Models Using Mixed-Bandwidth Training Data for Speech Recognition
Published in IEEE transactions on audio, speech, and language processing (01-01-2007) “…One serious difficulty in the deployment of wideband speech recognition systems for new tasks is the expense in both time and cost of obtaining sufficient…”
Journal Article
12
Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer
Published in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (06-06-2021) “…Recurrent Neural Network Transducer (RNN-T), like most end-to-end speech recognition model architectures, has an implicit neural network language model (NNLM)…”
Conference Proceeding
13
Aipnet: Generative Adversarial Pre-Training of Accent-Invariant Networks for End-To-End Speech Recognition
Published in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2020) “…As one of the major sources in speech variability, accents have posed a grand challenge to the robustness of speech recognition systems. In this paper, our…”
Conference Proceeding
14
Factorized Blank Thresholding for Improved Runtime Efficiency of Neural Transducers
Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023) “…We show how factoring the RNN-T's output distribution can significantly reduce the computation cost and power consumption for on-device ASR inference with no…”
Conference Proceeding
15
G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR
Published in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2020) “…Grapheme-based acoustic modeling has recently been shown to outperform phoneme-based approaches in both hybrid and end-to-end automatic speech recognition…”
Conference Proceeding
16
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023) “…End-to-end multilingual ASR has become more appealing because of several reasons such as simplifying the training and deployment process and positive…”
Conference Proceeding
17
Memory-Efficient Speech Recognition on Smart Devices
Published in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (06-06-2021) “…Recurrent transducer models have emerged as a promising solution for speech recognition on the current and next generation smart devices. The transducer models…”
Conference Proceeding
18
Improving fast-slow Encoder based Transducer with Streaming Deliberation
Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023) “…This paper introduces a fast-slow encoder based transducer with streaming deliberation for end-to-end automatic speech recognition. We aim to improve the…”
Conference Proceeding
19
Deep neural network features and semi-supervised training for low resource speech recognition
Published in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (01-05-2013) “…We propose a new technique for training deep neural networks (DNNs) as data-driven feature front-ends for large vocabulary continuous speech recognition…”
Conference Proceeding
20
Neural-FST Class Language Model for End-to-End Speech Recognition
Published in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (23-05-2022) “…We propose Neural-FST Class Language Model (NFCLM) for end-to-end speech recognition, a novel method that combines neural network language models (NNLMs) and…”
Conference Proceeding