Search Results - "SAON, George"
-
1
Deep Convolutional Neural Networks for Large-scale Speech Tasks
Published in Neural networks (01-04-2015) “…Convolutional Neural Networks (CNNs) are an alternative type of neural network that can be used to reduce spectral variations and model spectral correlations…”
Journal Article -
2
Advancing RNN Transducer Technology for Speech Recognition
Published in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-01-2021) “…We investigate a set of techniques for RNN Transducers (RNN-Ts) that were instrumental in lowering the word error rate on three different tasks (Switchboard…”
Conference Proceeding -
3
Building Competitive Direct Acoustics-to-Word Models for English Conversational Speech Recognition
Published in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-04-2018) “…Direct acoustics-to-word (A2W) models in the end-to-end paradigm have received increasing attention compared to conventional subword based automatic speech…”
Conference Proceeding -
4
Analyzing convolutional neural networks for speech activity detection in mismatched acoustic conditions
Published in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2014) “…Convolutional neural networks (CNN) are extensions to deep neural networks (DNN) which are used as alternate acoustic models with state-of-the-art performances…”
Conference Proceeding -
5
Joint training of convolutional and non-convolutional neural networks
Published in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2014) “…We describe a simple modification of neural networks which consists in extending the commonly used linear layer structure to an arbitrary graph structure. This…”
Conference Proceeding -
6
Speaker adaptation of neural network acoustic models using i-vectors
Published in 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (01-12-2013) “…We propose to adapt deep neural network (DNN) acoustic models to a target speaker by supplying speaker identity vectors (i-vectors) as input features to the…”
Conference Proceeding -
7
Alignment-Length Synchronous Decoding for RNN Transducer
Published in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2020) “…We present a beam decoding strategy for recurrent neural network transducers which has the characteristic that all competing hypotheses within the beam have…”
Conference Proceeding -
8
Diagonal State Space Augmented Transformers for Speech Recognition
Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023) “…We improve on the popular conformer architecture by replacing the depthwise temporal convolutions with diagonal state space (DSS) models. DSS is a recently…”
Conference Proceeding -
9
Boosting systems for large vocabulary continuous speech recognition
Published in Speech communication (01-02-2012) “…► We apply the Adaboost algorithm to large vocabulary continuous speech recognition. ► Acoustic models are trained sequentially on re-weighted data. ► Phonetic…”
Journal Article -
10
Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems
Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14-04-2024) “…Transferring the knowledge of large language models (LLMs) is a promising technique to incorporate linguistic knowledge into end-to-end automatic speech…”
Conference Proceeding -
11
Semi-Autoregressive Streaming ASR with Label Context
Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14-04-2024) “…Non-autoregressive (NAR) modeling has gained significant interest in speech processing since these models achieve dramatically lower inference time than…”
Conference Proceeding -
12
Sequence Noise Injected Training for End-to-end Speech Recognition
Published in ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2019) “…We present a simple noise injection algorithm for training end-to-end ASR models which consists in adding to the spectra of training utterances the scaled…”
Conference Proceeding -
13
Towards Reducing the Need for Speech Training Data to Build Spoken Language Understanding Systems
Published in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (23-05-2022) “…The lack of speech data annotated with labels required for spoken language understanding (SLU) is often a major hurdle in building end-to-end (E2E) systems…”
Conference Proceeding -
14
Distributed Deep Learning Strategies for Automatic Speech Recognition
Published in ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2019) “…In this paper, we propose and investigate a variety of distributed deep learning strategies for automatic speech recognition (ASR) and evaluate them with a…”
Conference Proceeding -
15
Integrating Text Inputs for Training and Adapting RNN Transducer ASR Models
Published in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (23-05-2022) “…Compared to hybrid automatic speech recognition (ASR) systems that use a modular architecture in which each component can be independently adapted to a new…”
Conference Proceeding -
16
A nonmonotone learning rate strategy for SGD training of deep neural networks
Published in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-04-2015) “…The algorithm of choice for cross-entropy training of deep neural network (DNN) acoustic models is mini-batch stochastic gradient descent (SGD). One of the…”
Conference Proceeding -
17
A comparison of two optimization techniques for sequence discriminative training of deep neural networks
Published in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2014) “…We compare two optimization methods for lattice-based sequence discriminative training of neural network acoustic models: distributed Hessian-free (DHF) and…”
Conference Proceeding -
18
-
19
Speech Recognition Using Biologically-Inspired Neural Networks
Published in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (23-05-2022) “…Automatic speech recognition systems (ASR), such as the recurrent neural network transducer (RNN-T), have reached close to human-like performance and are…”
Conference Proceeding -
20
Multi-Speaker Data Augmentation for Improved end-to-end Automatic Speech Recognition
Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023) “…Publicly available datasets traditionally used to train E2E ASR models for conversational telephone speech recognition are based on clean, short duration,…”
Conference Proceeding