Search Results - "Hori, Chiori"
1
Attention-Based Multimodal Fusion for Video Description
Published in 2017 IEEE International Conference on Computer Vision (ICCV) (01-10-2017) “…Current methods for video description are based on encoder-decoder sentence generation using recurrent neural networks (RNNs). Recent work has demonstrated the…”
Conference Proceeding -
2
End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features
Published in ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2019) “…In order for machines interacting with the real world to have conversations with users about the objects and events around them, they need to understand…”
Conference Proceeding -
3
Overview of the sixth dialog system technology challenge: DSTC6
Published in Computer speech & language (01-05-2019) “…• DSTC6: Dialog Challenge to improve performance of end-to-end dialog systems using Neural Network models and dialog breakdown detection. • Track 1, End-to-End…”
Journal Article -
4
Sparse representation based on a bag of spectral exemplars for acoustic event detection
Published in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2014) “…Acoustic event detection is an important step for audio content analysis and retrieval. Traditional detection techniques model the acoustic events on…”
Conference Proceeding -
5
Minimum word error training of long short-term memory recurrent neural network language models for speech recognition
Published in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-03-2016) “…This paper describes minimum word error (MWE) training of recurrent neural network language models (RNNLMs) for speech recognition. RNNLMs are usually trained…”
Conference Proceeding Journal Article -
6
A-STAR: Toward translating Asian spoken languages
Published in Computer speech & language (01-02-2013) “…► The first Asian network-based speech-to-speech translation system developed by the A-STAR consortium. ► A-STAR field testing experiments was carried out in…”
Journal Article -
7
Multilingual Speech-to-Speech Translation System: VoiceTra
Published in 2013 IEEE 14th International Conference on Mobile Data Management (01-06-2013) “…This study presents an overview of VoiceTra, which was developed by NICT and released as the world's first network-based multilingual speech-to-speech…”
Conference Proceeding -
8
Audio Visual Scene-Aware Dialog
Published in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (01-06-2019) “…We introduce the task of scene-aware dialog. Our goal is to generate a complete and natural response to a question about a scene, given video and audio of the…”
Conference Proceeding -
9
Early and late integration of audio features for automatic video description
Published in 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (01-12-2017) “…This paper presents our approach to improve video captioning by integrating audio and video features. Video captioning is the task of generating a textual…”
Conference Proceeding -
10
Automatic evaluation of end-to-end dialog systems with adequacy-fluency metrics
Published in Computer speech & language (01-05-2019) “…• Automatic metric for evaluating natural language generated sentences for dialog systems. • Integration of adequacy and fluency information to jointly evaluate…”
Journal Article -
11
NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization
Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14-04-2024) “…Head-related transfer functions (HRTFs) are important for immersive audio, and their spatial interpolation has been studied to upsample finite measurements…”
Conference Proceeding -
12
Generation or Replication: Auscultating Audio Latent Diffusion Models
Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14-04-2024) “…The introduction of audio latent diffusion models possessing the ability to generate realistic sound clips on demand from a text description has the potential…”
Conference Proceeding -
13
WI-FI based Indoor Monitoring Enhanced by Multimodal Fusion
Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14-04-2024) “…Indoor monitoring systems are in high demand to protect vulnerable people, especially when they are alone at home, in nursing homes, hospitals, etc. Although…”
Conference Proceeding -
14
Overview of the seventh Dialog System Technology Challenge: DSTC7
Published in Computer speech & language (01-07-2020) “…• DSTC7: Dialog Challenge to build more robust and accurate end-to-end dialog systems. • Track 1, Sentence selection for multiple domains, including variations…”
Journal Article -
15
Spatio-Temporal Ranked-Attention Networks for Video Captioning
Published in 2020 IEEE Winter Conference on Applications of Computer Vision (WACV) (01-03-2020) “…Generating video descriptions automatically is a challenging task that involves a complex interplay between spatio-temporal visual features and language…”
Conference Proceeding -
16
Adversarial training and decoding strategies for end-to-end neural conversation models
Published in Computer speech & language (01-03-2019) “…• An advanced end to end conversation system for the 6-th edition of Dialog System Technology Challenge (DSTC6). • Applying sequence adversarial training with…”
Journal Article -
17
Leveraging social Q&A collections for improving complex question answering
Published in Computer speech & language (01-01-2015) “…• The proposed approach leverages social Q&A collections to improve automatic complex QA system. • There is no need to manually collect training Q&A pairs that…”
Journal Article -
18
A cloud robotics approach towards dialogue-oriented robot speech
Published in Advanced robotics (03-04-2015) “…Robot utterances generally sound monotonous, unnatural and unfriendly because their Text-to-Speech systems are not optimized for communication but for text…”
Journal Article -
19
Superpositional HMM-Based Intonation Synthesis Using a Functional F0 Model
Published in Journal of signal processing systems (01-02-2016) “…This paper addresses intonation synthesis combining both statistical and generative models to manipulate fundamental frequency (F0) contours in the…”
Journal Article -
20
Efficient WFST-Based One-Pass Decoding With On-The-Fly Hypothesis Rescoring in Extremely Large Vocabulary Continuous Speech Recognition
Published in IEEE transactions on audio, speech, and language processing (01-05-2007) “…This paper proposes a novel one-pass search algorithm with on-the-fly composition of weighted finite-state transducers (WFSTs) for large-vocabulary…”
Journal Article