Search Results - "Timo Gerkmann"
-
1
Bayesian Estimation of Clean Speech Spectral Coefficients Given a Priori Knowledge of the Phase
Published in IEEE transactions on signal processing (15-08-2014) “…While most short-time discrete Fourier transform-based single-channel speech enhancement algorithms only modify the noisy spectral amplitude, in recent years…”
Journal Article -
2
STFT Phase Reconstruction in Voiced Speech for an Improved Single-Channel Speech Enhancement
Published in IEEE/ACM transactions on audio, speech, and language processing (01-12-2014) “…The enhancement of speech which is corrupted by noise is commonly performed in the short-time discrete Fourier transform domain. In case only a single…”
Journal Article -
3
On MMSE-Based Estimation of Amplitude and Complex Speech Spectral Coefficients Under Phase-Uncertainty
Published in IEEE/ACM transactions on audio, speech, and language processing (01-12-2016) “…Among the most commonly used single-channel approaches for the enhancement of noise corrupted speech are Bayesian estimators of clean speech coefficients in…”
Journal Article -
4
A neural network-supported two-stage algorithm for lightweight dereverberation on hearing devices
Published in EURASIP journal on audio, speech, and music processing (01-05-2023) “…A two-stage lightweight online dereverberation algorithm for hearing devices is presented in this paper. The approach combines a multi-channel multi-frame…”
Journal Article -
5
Causal Diffusion Models for Generalized Speech Enhancement
Published in IEEE open journal of signal processing (2024) “…In this work, we present a causal speech enhancement system that is designed to handle different types of corruptions. This paper is an extended version of our…”
Journal Article -
6
Front-end technologies for robust ASR in reverberant environments—spectral enhancement-based dereverberation and auditory modulation filterbank features
Published in EURASIP journal on advances in signal processing (05-08-2015) “…This paper presents extended techniques aiming at the improvement of automatic speech recognition (ASR) in single-channel scenarios in the context of the…”
Journal Article -
7
A Survey on Probabilistic Models in Human Perception and Machines
Published in Frontiers in robotics and AI (07-07-2020) “…Extracting information from noisy signals is of fundamental importance for both biological and artificial perceptual systems. To provide tractable solutions to…”
Journal Article -
8
Noise power estimation based on the probability of speech presence
Published in 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (01-01-2011) “…In this paper, we analyze the minimum mean square error (MMSE) based spectral noise power estimator [1] and present an improvement. We will show that the MMSE…”
Conference Proceeding -
9
A Multi-Phase Gammatone Filterbank for Speech Separation Via Tasnet
Published in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2020) “…In this work, we investigate if the learned encoder of the end-to-end convolutional time domain audio separation network (Conv-TasNet) is the key to its recent…”
Conference Proceeding -
10
Insights Into Deep Non-Linear Filters for Improved Multi-Channel Speech Enhancement
Published in IEEE/ACM transactions on audio, speech, and language processing (01-01-2023) “…The key advantage of using multiple microphones for speech enhancement is that spatial filtering can be used to complement the tempo-spectral processing. In a…”
Journal Article -
11
Multi-Channel Speech Separation Using Spatially Selective Deep Non-Linear Filters
Published in IEEE/ACM transactions on audio, speech, and language processing (2024) “…In a multi-channel separation task with multiple speakers, we aim to recover all individual speech signals from the mixture. In contrast to single-channel…”
Journal Article -
12
Phase-aware deep speech enhancement: It's all about the frame length
Published in JASA express letters (01-10-2022) “…Algorithmic latency in speech processing is dominated by the frame length used for Fourier analysis, which in turn limits the achievable performance of…”
Journal Article -
13
SNR-Based Features and Diverse Training Data for Robust DNN-Based Speech Enhancement
Published in IEEE/ACM transactions on audio, speech, and language processing (2021) “…In this paper, we address the generalization of deep neural network (DNN) based speech enhancement to unseen noise conditions for the case that training data…”
Journal Article -
14
Spatially Selective Deep Non-Linear Filters For Speaker Extraction
Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023) “…In a scenario with multiple persons talking simultaneously, the spatial characteristics of the signals are the most distinct feature for extracting the target…”
Conference Proceeding -
15
Nonlinear Spatial Filtering in Multichannel Speech Enhancement
Published in IEEE/ACM transactions on audio, speech, and language processing (2021) “…The majority of multichannel speech enhancement algorithms are two-step procedures that first apply a linear spatial filter, a so-called beamformer, and…”
Journal Article -
16
Speech Enhancement and Dereverberation with Diffusion-based Generative Models
Published in IEEE/ACM transactions on audio, speech, and language processing (01-01-2023) “…In this work, we build upon our previous publication and use diffusion-based generative models for speech enhancement. We present a detailed overview of the…”
Journal Article -
17
Uncertainty Estimation in Deep Speech Enhancement Using Complex Gaussian Mixture Models
Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04-06-2023) “…Single-channel deep speech enhancement approaches often estimate a single multiplicative mask to extract clean speech without a measure of its accuracy…”
Conference Proceeding -
18
Unbiased MMSE-Based Noise Power Estimation With Low Complexity and Low Tracking Delay
Published in IEEE transactions on audio, speech, and language processing (01-05-2012) “…Recently, it has been proposed to estimate the noise power spectral density by means of minimum mean-square error (MMSE) optimal estimation. We show that the…”
Journal Article -
19
DriftRec: Adapting Diffusion Models to Blind JPEG Restoration
Published in IEEE transactions on image processing (2024) “…In this work, we utilize the high-fidelity generation abilities of diffusion models to solve blind JPEG restoration at high compression levels. We propose an…”
Journal Article -
20
Distilling Hubert with LSTMs via Decoupled Knowledge Distillation
Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14-04-2024) “…Much research effort is being applied to the task of compressing the knowledge of self-supervised models, which are powerful, yet large and memory consuming…”
Conference Proceeding