Search Results - "Saon, George"

1
Deep Convolutional Neural Networks for Large-scale Speech Tasks by Sainath, Tara N., Kingsbury, Brian, Saon, George, Soltau, Hagen, Mohamed, Abdel-rahman, Dahl, George, Ramabhadran, Bhuvana

Published in Neural networks (01-04-2015)
“…Convolutional Neural Networks (CNNs) are an alternative type of neural network that can be used to reduce spectral variations and model spectral correlations…”

Journal Article

2
Advancing RNN Transducer Technology for Speech Recognition by Saon, George, Tuske, Zoltan, Bolanos, Daniel, Kingsbury, Brian

Published in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-01-2021)
“…We investigate a set of techniques for RNN Transducers (RNN-Ts) that were instrumental in lowering the word error rate on three different tasks (Switchboard…”

Conference Proceeding

3
Building Competitive Direct Acoustics-to-Word Models for English Conversational Speech Recognition by Audhkhasi, Kartik, Kingsbury, Brian, Ramabhadran, Bhuvana, Saon, George, Picheny, Michael

Published in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-04-2018)
“…Direct acoustics-to-word (A2W) models in the end-to-end paradigm have received increasing attention compared to conventional subword based automatic speech…”

Conference Proceeding

4
Analyzing convolutional neural networks for speech activity detection in mismatched acoustic conditions by Thomas, Samuel, Ganapathy, Sriram, Saon, George, Soltau, Hagen

Published in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2014)
“…Convolutional neural networks (CNN) are extensions to deep neural networks (DNN) which are used as alternate acoustic models with state-of-the-art performances…”

Conference Proceeding

5
Joint training of convolutional and non-convolutional neural networks by Soltau, Hagen, Saon, George, Sainath, Tara N.

Published in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2014)
“…We describe a simple modification of neural networks which consists in extending the commonly used linear layer structure to an arbitrary graph structure. This…”

Conference Proceeding

6
Speaker adaptation of neural network acoustic models using i-vectors by Saon, George, Soltau, Hagen, Nahamoo, David, Picheny, Michael

Published in 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (01-12-2013)
“…We propose to adapt deep neural network (DNN) acoustic models to a target speaker by supplying speaker identity vectors (i-vectors) as input features to the…”

Conference Proceeding

7
Alignment-Length Synchronous Decoding for RNN Transducer by Saon, George, Tuske, Zoltan, Audhkhasi, Kartik

Published in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2020)
“…We present a beam decoding strategy for recurrent neural network transducers which has the characteristic that all competing hypotheses within the beam have…”

Conference Proceeding

8
Diagonal State Space Augmented Transformers for Speech Recognition by Saon, George, Gupta, Ankit, Cui, Xiaodong

Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023)
“…We improve on the popular conformer architecture by replacing the depthwise temporal convolutions with diagonal state space (DSS) models. DSS is a recently…”

Conference Proceeding

9
Boosting systems for large vocabulary continuous speech recognition by Saon, George, Soltau, Hagen

Published in Speech communication (01-02-2012)
“…► We apply the Adaboost algorithm to large vocabulary continuous speech recognition. ► Acoustic models are trained sequentially on re-weighted data. ► Phonetic…”

Journal Article

10
Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems by Udagawa, Takuma, Suzuki, Masayuki, Kurata, Gakuto, Muraoka, Masayasu, Saon, George

Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14-04-2024)
“…Transferring the knowledge of large language models (LLMs) is a promising technique to incorporate linguistic knowledge into end-to-end automatic speech…”

Conference Proceeding

11
Semi-Autoregressive Streaming ASR with Label Context by Arora, Siddhant, Saon, George, Watanabe, Shinji, Kingsbury, Brian

Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14-04-2024)
“…Non-autoregressive (NAR) modeling has gained significant interest in speech processing since these models achieve dramatically lower inference time than…”

Conference Proceeding

12
Sequence Noise Injected Training for End-to-end Speech Recognition by Saon, George, Tuske, Zoltan, Audhkhasi, Kartik, Kingsbury, Brian

Published in ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2019)
“…We present a simple noise injection algorithm for training end-to-end ASR models which consists in adding to the spectra of training utterances the scaled…”

Conference Proceeding

13
Towards Reducing the Need for Speech Training Data to Build Spoken Language Understanding Systems by Thomas, Samuel, Kuo, Hong-Kwang J., Kingsbury, Brian, Saon, George

Published in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (23-05-2022)
“…The lack of speech data annotated with labels required for spoken language understanding (SLU) is often a major hurdle in building end-to-end (E2E) systems…”

Conference Proceeding

14
Distributed Deep Learning Strategies for Automatic Speech Recognition by Zhang, Wei, Cui, Xiaodong, Finkler, Ulrich, Kingsbury, Brian, Saon, George, Kung, David, Picheny, Michael

Published in ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2019)
“…In this paper, we propose and investigate a variety of distributed deep learning strategies for automatic speech recognition (ASR) and evaluate them with a…”

Conference Proceeding

15
Integrating Text Inputs for Training and Adapting RNN Transducer ASR Models by Thomas, Samuel, Kingsbury, Brian, Saon, George, Kuo, Hong-Kwang J.

Published in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (23-05-2022)
“…Compared to hybrid automatic speech recognition (ASR) systems that use a modular architecture in which each component can be independently adapted to a new…”

Conference Proceeding

16
A nonmonotone learning rate strategy for SGD training of deep neural networks by Keskar, Nitish Shirish, Saon, George

Published in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-04-2015)
“…The algorithm of choice for cross-entropy training of deep neural network (DNN) acoustic models is mini-batch stochastic gradient descent (SGD). One of the…”

Conference Proceeding

17
A comparison of two optimization techniques for sequence discriminative training of deep neural networks by Saon, George, Soltau, Hagen

Published in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2014)
“…We compare two optimization methods for lattice-based sequence discriminative training of neural network acoustic models: distributed Hessian-free (DHF) and…”

Conference Proceeding

18
Bayesian Sensing Hidden Markov Models: Deep Learning for Speech and Language Processing by Saon, George, Chien, Jen-Tzung

Published in IEEE transactions on audio, speech, and language processing (2012)

Journal Article

19
Speech Recognition Using Biologically-Inspired Neural Networks by Bohnstingl, Thomas, Garg, Ayush, Wozniak, Stanislaw, Saon, George, Eleftheriou, Evangelos, Pantazi, Angeliki

Published in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (23-05-2022)
“…Automatic speech recognition systems (ASR), such as the recurrent neural network transducer (RNN-T), have reached close to human-like performance and are…”

Conference Proceeding

20
Multi-Speaker Data Augmentation for Improved end-to-end Automatic Speech Recognition by Thomas, Samuel, Kuo, Hong-Kwang J., Saon, George, Kingsbury, Brian

Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023)
“…Publicly available datasets traditionally used to train E2E ASR models for conversational telephone speech recognition are based on clean, short duration,…”

Conference Proceeding
