Search Results - "Shon, Suwon"
1
Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification
Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (01-08-2019) “…There are a number of studies about extraction of bottleneck (BN) features from deep neural networks (DNNs) trained to discriminate speakers, pass-phrases, and…”
Journal Article
2
Deep Neural Network based learning and transferring mid-level audio features for acoustic scene classification
Published in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-03-2017) “…Deep Neural Network (DNN) based transfer learning has been shown to be effective in Visual Object Classification (VOC) for complementing the deficit of target…”
Conference Proceeding
3
Robust speaker direction estimation with microphone array using NMF for smart TV interaction
Published in 2015 IEEE International Conference on Consumer Electronics (ICCE) (01-01-2015) “…This paper proposes a robust speaker direction estimation method based on a microphone array for voice based interaction with smart TV. The proposed method…”
Conference Proceeding
4
MIT-QCRI Arabic dialect identification system for the 2017 multi-genre broadcast challenge
Published in 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (01-12-2017) “…In order to successfully annotate the Arabic speech content found in open-domain media broadcasts, it is essential to be able to process a diverse set of…”
Conference Proceeding
5
Sudden noise source localization system for intelligent automobile application with acoustic sensors
Published in 2012 IEEE International Conference on Consumer Electronics (ICCE) (01-01-2012) “…This paper suggests an automotive application for finding direction of sudden noise source in driving situation. The system applies sound source localization…”
Conference Proceeding
6
Domain Mismatch Robust Acoustic Scene Classification Using Channel Information Conversion
Published in ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2019) “…In recent acoustic scene classification (ASC) research field, training and test device channel mismatch have become an issue for the real world implementation…”
Conference Proceeding
7
Noise-tolerant Audio-visual Online Person Verification Using an Attention-based Neural Network Fusion
Published in ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2019) “…In this paper, we present a multi-modal online person verification system using both speech and visual signals. Inspired by neuroscientific findings on the…”
Conference Proceeding
8
ADI17: A Fine-Grained Arabic Dialect Identification Dataset
Published in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2020) “…In this paper, we describe a method to collect dialectal speech from YouTube videos to create a large-scale Dialect Identification (DID) dataset. Using this…”
Conference Proceeding
9
Improving ASR Contextual Biasing with Guided Attention
Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14-04-2024) “…In this paper, we propose a Guided Attention (GA) auxiliary training loss, which improves the effectiveness and robustness of automatic speech recognition…”
Conference Proceeding
10
SLUE: New Benchmark Tasks For Spoken Language Understanding Evaluation on Natural Speech
Published in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (23-05-2022) “…Progress in speech processing has been facilitated by shared datasets and benchmarks. Historically these have focused on automatic speech recognition (ASR),…”
Conference Proceeding
11
Domain Attentive Fusion for End-to-end Dialect Identification with Unknown Target Domain
Published in ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2019) “…End-to-end deep learning language or dialect identification systems operate on the spectrogram or other acoustic feature and directly generate identification…”
Conference Proceeding
12
Exploiting Convolutional Neural Networks for Phonotactic Based Dialect Identification
Published in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-04-2018) “…In this paper, we investigate different approaches for Dialect Identification (DID) in Arabic broadcast speech. Dialects differ in their inventory of…”
Conference Proceeding
13
Generative Context-Aware Fine-Tuning of Self-Supervised Speech Models
Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14-04-2024) “…When performing tasks like automatic speech recognition or spoken language understanding for a given utterance, access to preceding text or audio provides…”
Conference Proceeding
14
Context-Aware Fine-Tuning of Self-Supervised Speech Models
Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023) “…Self-supervised pre-trained transformers have improved the state of the art on a variety of speech tasks. Due to the quadratic time and space complexity of…”
Conference Proceeding
15
Generalized cross-correlation based noise robust abnormal acoustic event localization utilizing non-negative matrix factorization
Published in 2014 11th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) (01-08-2014) “…In this paper, robust sound source localization for surveillance system is presented. In particular, we propose an algorithm for abnormal acoustic event…”
Conference Proceeding
16
Maximum likelihood Linear Dimension Reduction of heteroscedastic feature for robust Speaker Recognition
Published in 2015 12th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) (01-08-2015) “…This paper analyzes heteroscedasticity in i-vector for robust forensics and surveillance speaker recognition system. Linear Discriminant Analysis (LDA), a…”
Conference Proceeding
17
Abnormal acoustic event localization based on selective frequency bin in high noise environment for audio surveillance
Published in 2013 10th IEEE International Conference on Advanced Video and Signal Based Surveillance (01-08-2013) “…In this paper, a method for source localization for surveillance system is presented. In particular, we propose an algorithm for abnormal acoustic event…”
Conference Proceeding
18
Motion primitives for designing flexible gesture set in Human-Robot Interface
Published in 2011 11th International Conference on Control, Automation and Systems (01-10-2011) “…This paper proposes motion primitives for designing a gesture set in a gesture recognition system as Human-Robot Interface (HRI). Based on statistical analyses…”
Conference Proceeding
19
The MGB-5 Challenge: Recognition and Dialect Identification of Dialectal Arabic Speech
Published in 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (01-12-2019) “…This paper describes the fifth edition of the Multi-Genre Broadcast Challenge (MGB-5), an evaluation focused on Arabic speech recognition and dialect…”
Conference Proceeding
20
Frame-Level Speaker Embeddings for Text-Independent Speaker Recognition and Analysis of End-to-End Model
Published in 2018 IEEE Spoken Language Technology Workshop (SLT) (01-12-2018) “…In this paper, we propose a Convolutional Neural Network (CNN) based speaker recognition model for extracting robust speaker embeddings. The embedding can be…”
Conference Proceeding