Search Results - "Seltzer, Michael L"

1
Deep beamforming networks for multi-channel speech recognition by Xiong Xiao, Watanabe, Shinji, Erdogan, Hakan, Liang Lu, Hershey, John, Seltzer, Michael L., Guoguo Chen, Yu Zhang, Mandel, Michael, Dong Yu

Published in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-03-2016)
“…Despite the significant progress in speech recognition enabled by deep neural networks, poor performance persists in some scenarios. In this work, we focus on…”

Conference Proceeding Journal Article

2
Deep Neural Networks for Single-Channel Multi-Talker Speech Recognition by Chao Weng, Dong Yu, Seltzer, Michael L., Droppo, Jasha

Published in IEEE/ACM transactions on audio, speech, and language processing (01-10-2015)
“…We investigate techniques based on deep neural networks (DNNs) for attacking the single-channel multi-talker speech recognition problem. Our proposed approach…”

Journal Article

3
Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning by Giri, Ritwik, Seltzer, Michael L., Droppo, Jasha, Dong Yu

Published in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-04-2015)
“…In this paper, we propose two approaches to improve deep neural network (DNN) acoustic models for speech recognition in reverberant environments. Both methods…”

Conference Proceeding

4
Reconstruction of missing features for robust speech recognition by Raj, Bhiksha, Seltzer, Michael L., Stern, Richard M.

Published in Speech communication (01-09-2004)
“…Speech recognition systems perform poorly in the presence of corrupting noise. Missing feature methods attempt to compensate for the noise by removing noise…”

Journal Article

5
A study on data augmentation of reverberant speech for robust speech recognition by Ko, Tom, Peddinti, Vijayaditya, Povey, Daniel, Seltzer, Michael L., Khudanpur, Sanjeev

Published in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-03-2017)
“…The environmental robustness of DNN-based acoustic models can be significantly improved by using multi-condition training data. However, as data collection is…”

Conference Proceeding

6
A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition by Seltzer, Michael L., Raj, Bhiksha, Stern, Richard M.

Published in Speech communication (01-09-2004)
“…Missing feature methods of noise compensation for speech recognition operate by first identifying components of a spectrographic representation of speech that…”

Journal Article

7
An investigation of deep neural networks for noise robust speech recognition by Seltzer, Michael L., Dong Yu, Yongqiang Wang

Published in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (01-05-2013)
“…Recently, a new acoustic model based on deep neural networks (DNN) has been introduced. While the DNN has generated significant improvements over GMM-based…”

Conference Proceeding

8
Transformer-Based Acoustic Modeling for Hybrid Speech Recognition by Wang, Yongqiang, Mohamed, Abdelrahman, Le, Duc, Liu, Chunxi, Xiao, Alex, Mahadeokar, Jay, Huang, Hongzhao, Tjandra, Andros, Zhang, Xiaohui, Zhang, Frank, Fuegen, Christian, Zweig, Geoffrey, Seltzer, Michael L.

Published in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2020)
“…We propose and evaluate transformer-based acoustic models (AMs) for hybrid speech recognition. Several modeling choices are discussed in this work, including…”

Conference Proceeding

9
Alignment Restricted Streaming Recurrent Neural Network Transducer by Mahadeokar, Jay, Shangguan, Yuan, Le, Duc, Keren, Gil, Su, Hang, Le, Thong, Yeh, Ching-Feng, Fuegen, Christian, Seltzer, Michael L.

Published in 2021 IEEE Spoken Language Technology Workshop (SLT) (19-01-2021)
“…There is a growing interest in the speech community in developing Recurrent Neural Network Transducer (RNN-T) models for automatic speech recognition (ASR)…”

Conference Proceeding

10
Toward Human Parity in Conversational Speech Recognition by Xiong, Wayne, Droppo, Jasha, Xuedong Huang, Seide, Frank, Seltzer, Michael L., Stolcke, Andreas, Dong Yu, Zweig, Geoffrey

Published in IEEE/ACM transactions on audio, speech, and language processing (01-12-2017)
“…Conversational speech recognition has served as a flagship speech recognition task since the release of the Switchboard corpus in the 1990s. In this paper, we…”

Journal Article

11
Training Wideband Acoustic Models Using Mixed-Bandwidth Training Data for Speech Recognition by Seltzer, M.L., Acero, A.

Published in IEEE transactions on audio, speech, and language processing (01-01-2007)
“…One serious difficulty in the deployment of wideband speech recognition systems for new tasks is the expense in both time and cost of obtaining sufficient…”

Journal Article

12
Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer by Kim, Suyoun, Shangguan, Yuan, Mahadeokar, Jay, Bruguier, Antoine, Fuegen, Christian, Seltzer, Michael L., Le, Duc

Published in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (06-06-2021)
“…Recurrent Neural Network Transducer (RNN-T), like most end-to-end speech recognition model architectures, has an implicit neural network language model (NNLM)…”

Conference Proceeding

13
AIPNet: Generative Adversarial Pre-Training of Accent-Invariant Networks for End-To-End Speech Recognition by Chen, Yi-Chen, Yang, Zhaojun, Yeh, Ching-Feng, Jain, Mahaveer, Seltzer, Michael L.

Published in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2020)
“…As one of the major sources in speech variability, accents have posed a grand challenge to the robustness of speech recognition systems. In this paper, our…”

Conference Proceeding

14
Factorized Blank Thresholding for Improved Runtime Efficiency of Neural Transducers by Le, Duc, Seide, Frank, Wang, Yuhao, Li, Yang, Schubert, Kjell, Kalinli, Ozlem, Seltzer, Michael L.

Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023)
“…We show how factoring the RNN-T's output distribution can significantly reduce the computation cost and power consumption for on-device ASR inference with no…”

Conference Proceeding

15
G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR by Le, Duc, Koehler, Thilo, Fuegen, Christian, Seltzer, Michael L.

Published in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2020)
“…Grapheme-based acoustic modeling has recently been shown to outperform phoneme-based approaches in both hybrid and end-to-end automatic speech recognition…”

Conference Proceeding

16
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities by Tjandra, Andros, Singhal, Nayan, Zhang, David, Kalinli, Ozlem, Mohamed, Abdelrahman, Le, Duc, Seltzer, Michael L.

Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023)
“…End-to-end multilingual ASR has become more appealing because of several reasons such as simplifying the training and deployment process and positive…”

Conference Proceeding

17
Memory-Efficient Speech Recognition on Smart Devices by Venkatesh, Ganesh, Valliappan, Alagappan, Mahadeokar, Jay, Shangguan, Yuan, Fuegen, Christian, Seltzer, Michael L., Chandra, Vikas

Published in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (06-06-2021)
“…Recurrent transducer models have emerged as a promising solution for speech recognition on the current and next generation smart devices. The transducer models…”

Conference Proceeding

18
Improving fast-slow Encoder based Transducer with Streaming Deliberation by Li, Ke, Mahadeokar, Jay, Guo, Jinxi, Shi, Yangyang, Keren, Gil, Kalinli, Ozlem, Seltzer, Michael L., Le, Duc

Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023)
“…This paper introduces a fast-slow encoder based transducer with streaming deliberation for end-to-end automatic speech recognition. We aim to improve the…”

Conference Proceeding

19
Deep neural network features and semi-supervised training for low resource speech recognition by Thomas, Samuel, Seltzer, Michael L., Church, Kenneth, Hermansky, Hynek

Published in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (01-05-2013)
“…We propose a new technique for training deep neural networks (DNNs) as data-driven feature front-ends for large vocabulary continuous speech recognition…”

Conference Proceeding

20
Neural-FST Class Language Model for End-to-End Speech Recognition by Bruguier, Antoine, Le, Duc, Prabhavalkar, Rohit, Li, Dangna, Liu, Zhe, Wang, Bo, Chang, Eun, Peng, Fuchun, Kalinli, Ozlem, Seltzer, Michael L.

Published in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (23-05-2022)
“…We propose Neural-FST Class Language Model (NFCLM) for end-to-end speech recognition, a novel method that combines neural network language models (NNLMs) and…”

Conference Proceeding


