Search Results - "Kastner, Kyle"
-
1
Deep learning-based point-scanning super-resolution imaging
Published in Nature methods (01-04-2021)“…Point-scanning imaging systems are among the most widely used tools for high-resolution cellular and tissue imaging, benefiting from arbitrarily defined pixel…”
Journal Article -
2
Representation Mixing for TTS Synthesis
Published in ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2019)“…Recent character and phoneme-based parametric TTS systems using deep learning have shown strong performance in natural speech generation. However, the choice…”
Conference Proceeding -
3
Understanding Shared Speech-Text Representations
Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023)“…Recently, a number of approaches to train speech models by incorporating text into end-to-end models have been developed, with Maestro advancing…”
Conference Proceeding -
4
Deep Learning‐Based Point‐Scanning Super‐Resolution Imaging
Published in The FASEB journal (01-04-2020)“…Abstract only…”
Journal Article -
5
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data
Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14-04-2024)“…Collecting high-quality studio recordings of audio is challenging, which limits the language coverage of text-to-speech (TTS) systems. This paper proposes a…”
Conference Proceeding -
6
ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation
Published in 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (01-06-2016)“…We propose a structured prediction architecture, which exploits the local generic features extracted by Convolutional Neural Networks and the capacity of…”
Conference Proceeding -
7
R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS
Published 30-06-2022“…This paper introduces R-MelNet, a two-part autoregressive architecture with a frontend based on the first tier of MelNet and a backend WaveRNN-style audio…”
Journal Article -
8
Zero-shot Cross-lingual Voice Transfer for TTS
Published 20-09-2024“…In this paper, we introduce a zero-shot Voice Transfer (VT) module that can be seamlessly integrated into a multi-lingual Text-to-speech (TTS) system to…”
Journal Article -
9
Understanding Shared Speech-Text Representations
Published 27-04-2023“…Recently, a number of approaches to train speech models by incorporating text into end-to-end models have been developed, with Maestro advancing…”
Journal Article -
10
Adversarial training of Keyword Spotting to Minimize TTS Data Overfitting
Published 19-08-2024“…The keyword spotting (KWS) problem requires large amounts of real speech training data to achieve high accuracy across diverse populations. Utilizing large…”
Journal Article -
11
Utilizing TTS Synthesized Data for Efficient Development of Keyword Spotting Model
Published 26-07-2024“…This paper explores the use of TTS synthesized training data for KWS (keyword spotting) task while minimizing development cost and time. Keyword spotting…”
Journal Article -
12
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data
Published 29-02-2024“…Collecting high-quality studio recordings of audio is challenging, which limits the language coverage of text-to-speech (TTS) systems. This paper proposes a…”
Journal Article -
13
High-precision Voice Search Query Correction via Retrievable Speech-text Embedings
Published 08-01-2024“…Automatic speech recognition (ASR) systems can suffer from poor recall for various reasons, such as noisy audio, lack of sufficient training data, etc…”
Journal Article -
14
MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling
Published 16-12-2021“…Musical expression requires control of both what notes are played, and how they are performed. Conventional audio synthesizers provide detailed expressive…”
Journal Article -
15
Planning in Dynamic Environments with Conditional Autoregressive Models
Published 25-11-2018“…We demonstrate the use of conditional autoregressive generative models (van den Oord et al., 2016a) over a discrete latent space (van den Oord et al., 2017b)…”
Journal Article -
16
Harmonic Recomposition using Conditional Autoregressive Modeling
Published 18-11-2018“…We demonstrate a conditional autoregressive pipeline for efficient music recomposition, based on methods presented in van den Oord et al.(2017). Recomposition…”
Journal Article -
17
Representation Mixing for TTS Synthesis
Published 17-11-2018“…Recent character and phoneme-based parametric TTS systems using deep learning have shown strong performance in natural speech generation. However, the choice…”
Journal Article -
18
Blindfold Baselines for Embodied QA
Published 12-11-2018“…We explore blindfold (question-only) baselines for Embodied Question Answering. The EmbodiedQA task requires an agent to answer a question by intelligently…”
Journal Article -
19
Learning Distributed Representations from Reviews for Collaborative Filtering
Published 18-06-2018“…Recent work has shown that collaborative filter-based recommender systems can be improved by incorporating side information, such as natural language reviews,…”
Journal Article -
20
Learning to Discover Sparse Graphical Models
Published 20-05-2016“…We consider structure discovery of undirected graphical models from observational data. Inferring likely structures from few examples is a complex task often…”
Journal Article