Search Results - "Hank Liao"

1
Speaker adaptation of context dependent deep neural networks by Liao, Hank

Published in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (01-05-2013)
“…There has been little work on examining how deep neural networks may be adapted to speakers for improved speech recognition accuracy. Past work has examined…”

Conference Proceeding

2
A Comparison of End-to-End Models for Long-Form Speech Recognition by Chiu, Chung-Cheng, Han, Wei, Zhang, Yu, Pang, Ruoming, Kishchenko, Sergey, Nguyen, Patrick, Narayanan, Arun, Liao, Hank, Zhang, Shuyuan, Kannan, Anjuli, Prabhavalkar, Rohit, Chen, Zhifeng, Sainath, Tara, Wu, Yonghui

Published in 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (01-12-2019)
“…End-to-end automatic speech recognition (ASR) models, including both attention-based models and the recurrent neural network transducer (RNN-T), have shown…”

Conference Proceeding

3
Reducing the computational complexity for whole word models by Soltau, Hagen, Liao, Hank, Sak, Hasim

Published in 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (01-12-2017)
“…In a previous study, we demonstrated the feasibility to build a competitive, greatly simplified, large vocabulary continuous speech recognition system with…”

Conference Proceeding

4
Conformer is All You Need for Visual Speech Recognition by Chang, Oscar, Liao, Hank, Serdyuk, Dmitriy, Shah, Ankit, Siohan, Olivier

Published in Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) (14-04-2024)
“…Visual speech recognition models extract visual features in a hierarchical manner. At the lower level, there is a visual front-end with a limited temporal…”

Conference Proceeding

5
RADMM: Recurrent Adaptive Mixture Model with Applications to Domain Robust Language Modeling by Irie, Kazuki, Kumar, Shankar, Nirschl, Michael, Liao, Hank

Published in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-04-2018)
“…We present a new architecture and a training strategy for an adaptive mixture of experts with applications to domain robust language modeling. The proposed…”

Conference Proceeding

6
End-to-End Multi-Person Audio/Visual Automatic Speech Recognition by Braga, Otavio, Makino, Takaki, Siohan, Olivier, Liao, Hank

Published in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2020)
“…Traditionally, audio-visual automatic speech recognition has been studied under the assumption that the speaking face on the visual signal is the face matching…”

Conference Proceeding

7
USM-SCD: Multilingual Speaker Change Detection Based on Large Pretrained Foundation Models by Zhao, Guanlong, Wang, Yongqiang, Pelecanos, Jason, Zhang, Yu, Liao, Hank, Huang, Yiling, Lu, Han, Wang, Quan

Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14-04-2024)
“…We introduce a multilingual speaker change detection model (USM-SCD) that can simultaneously detect speaker turns and perform ASR for 96 languages. This model…”

Conference Proceeding

8
Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription by Liao, Hank, McDermott, Erik, Senior, Andrew

Published in 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (01-12-2013)
“…YouTube is a highly visited video sharing website where over one billion people watch six billion hours of video every month. Improving accessibility to these…”

Conference Proceeding

9
Recurrent Neural Network Transducer for Audio-Visual Speech Recognition by Makino, Takaki, Liao, Hank, Assael, Yannis, Shillingford, Brendan, Garcia, Basilio, Braga, Otavio, Siohan, Olivier

Published in 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (01-12-2019)
“…This work presents a large-scale audio-visual speech recognition system based on a recurrent neural network transducer (RNN-T) architecture. To support the…”

Conference Proceeding

10
GMM-free DNN acoustic model training by Senior, Andrew, Heigold, Georg, Bacchiani, Michiel, Liao, Hank

Published in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2014)
“…While deep neural networks (DNNs) have become the dominant acoustic model (AM) for speech recognition systems, they are still dependent on Gaussian mixture…”

Conference Proceeding

11
Exemplar-based large vocabulary speech recognition using k-nearest neighbors by Xu, Yanbo, Siohan, Olivier, Simcha, David, Kumar, Sanjiv, Liao, Hank

Published in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-04-2015)
“…This paper describes a large scale exemplar-based acoustic modeling approach for large vocabulary continuous speech recognition. We construct an index of…”

Conference Proceeding

12
On Robustness to Missing Video for Audiovisual Speech Recognition by Chang, Oscar, Braga, Otavio, Liao, Hank, Serdyuk, Dmitriy, Siohan, Olivier

Published 13-12-2023
“…It has been shown that learning audiovisual features can lead to improved speech recognition performance over audio-only features, especially for noisy speech…”

Journal Article

13
Conformers are All You Need for Visual Speech Recognition by Chang, Oscar, Liao, Hank, Serdyuk, Dmitriy, Shah, Ankit, Siohan, Olivier

Published 16-02-2023
“…Visual speech recognition models extract visual features in a hierarchical manner. At the lower level, there is a visual front-end with a limited temporal…”

Journal Article

14
Lattice rescoring strategies for long short term memory language models in speech recognition by Kumar, Shankar, Nirschl, Michael, Holtmann-Rice, Daniel, Liao, Hank, Suresh, Ananda Theertha, Yu, Felix

Published in 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (01-12-2017)
“…Recurrent neural network (RNN) language models (LMs) and Long Short Term Memory (LSTM) LMs, a variant of RNN LMs, have been shown to outperform traditional…”

Conference Proceeding

15
End-to-End Multi-Person Audio/Visual Automatic Speech Recognition by Braga, Otavio, Makino, Takaki, Siohan, Olivier, Liao, Hank

Published 11-05-2022
“…Traditionally, audio-visual automatic speech recognition has been studied under the assumption that the speaking face on the visual signal is the face matching…”

Journal Article

16
DiarizationLM: Speaker Diarization Post-Processing with Large Language Models by Wang, Quan, Huang, Yiling, Zhao, Guanlong, Clark, Evan, Xia, Wei, Liao, Hank

Published 07-01-2024
“…In this paper, we introduce DiarizationLM, a framework to leverage large language models (LLM) to post-process the outputs from a speaker diarization system…”

Journal Article

17
Towards Word-Level End-to-End Neural Speaker Diarization with Auxiliary Network by Huang, Yiling, Wang, Weiran, Zhao, Guanlong, Liao, Hank, Xia, Wei, Wang, Quan

Published 15-09-2023
“…While standard speaker diarization attempts to answer the question "who spoken when", most of relevant applications in reality are more interested in…”

Journal Article

18
USM-SCD: Multilingual Speaker Change Detection Based on Large Pretrained Foundation Models by Zhao, Guanlong, Wang, Yongqiang, Pelecanos, Jason, Zhang, Yu, Liao, Hank, Huang, Yiling, Lu, Han, Wang, Quan

Published 14-09-2023
“…We introduce a multilingual speaker change detection model (USM-SCD) that can simultaneously detect speaker turns and perform ASR for 96 languages. This model…”

Journal Article

19
Adversarial Training for Multilingual Acoustic Modeling by Hu, Ke, Sak, Hasim, Liao, Hank

Published 17-06-2019
“…Multilingual training has been shown to improve acoustic modeling performance by sharing and transferring knowledge in modeling different languages. Knowledge…”

Journal Article

20
Neural Language Modeling with Visual Features by Anastasopoulos, Antonios, Kumar, Shankar, Liao, Hank

Published 07-03-2019
“…Multimodal language models attempt to incorporate non-linguistic features for the language modeling task. In this work, we extend a standard recurrent neural…”

Journal Article
