Search Results - "Zmolikova, Katerina"

  1.

    SpeakerBeam: Speaker Aware Neural Network for Target Speaker Extraction in Speech Mixtures by Zmolikova, Katerina, Delcroix, Marc, Kinoshita, Keisuke, Ochiai, Tsubasa, Nakatani, Tomohiro, Burget, Lukas, Cernocky, Jan

    “…The processing of speech corrupted by interfering overlapping speakers is one of the challenging problems with regards to today's automatic speech recognition…”
    Journal Article
  2.

    Analysis and interpretation of joint source separation and sound event detection in domestic environments by de Benito-Gorrón, Diego, Zmolikova, Katerina, Toledano, Doroteo T

    Published in PLoS ONE (05-07-2024)
    “…In recent years, the relation between Sound Event Detection (SED) and Source Separation (SSep) has received a growing interest, in particular, with the aim to…”
    Journal Article
  3.

    Masked Spectrogram Prediction for Unsupervised Domain Adaptation in Speech Enhancement by Zmolikova, Katerina, Pedersen, Michael Syskind, Jensen, Jesper

    “…Supervised learning-based speech enhancement methods often work remarkably well in acoustic situations represented in the training corpus but generalize poorly…”
    Journal Article
  4.

    Single Channel Target Speaker Extraction and Recognition with Speaker Beam by Delcroix, Marc, Zmolikova, Katerina, Kinoshita, Keisuke, Ogawa, Atsunori, Nakatani, Tomohiro

    “…This paper addresses the problem of single channel speech recognition of a target speaker in a mixture of speech signals. We propose to exploit auxiliary…”
    Conference Proceeding
  5.

    Improving Speaker Discrimination of Target Speech Extraction With Time-Domain Speakerbeam by Delcroix, Marc, Ochiai, Tsubasa, Zmolikova, Katerina, Kinoshita, Keisuke, Tawara, Naohiro, Nakatani, Tomohiro, Araki, Shoko

    “…Target speech extraction, which extracts a single target source in a mixture given clues about the target speaker, has attracted increasing attention. We have…”
    Conference Proceeding
  6.

    Speaker Activity Driven Neural Speech Extraction by Delcroix, Marc, Zmolikova, Katerina, Ochiai, Tsubasa, Kinoshita, Keisuke, Nakatani, Tomohiro

    “…Target speech extraction, which extracts the speech of a target speaker in a mixture given auxiliary speaker clues, has recently received increased interest…”
    Conference Proceeding
  7.

    Compact Network for Speakerbeam Target Speaker Extraction by Delcroix, Marc, Zmolikova, Katerina, Ochiai, Tsubasa, Kinoshita, Keisuke, Araki, Shoko, Nakatani, Tomohiro

    “…Speech separation that separates a mixture of speech signals into each of its sources has been an active research topic for a long time and has seen recent…”
    Conference Proceeding
  8.

    Sequence summarizing neural network for speaker adaptation by Vesely, Karel, Watanabe, Shinji, Zmolikova, Katerina, Karafiat, Martin, Burget, Lukas, Cernocky, Jan Honza

    “…In this paper, we propose a DNN adaptation technique, where the i-vector extractor is replaced by a Sequence Summarizing Neural Network (SSNN). Similarly to…”
    Conference Proceeding; Journal Article
  9.

    Jointly Trained Transformers Models for Spoken Language Translation by Vydana, Hari Krishna, Karafiat, Martin, Zmolikova, Katerina, Burget, Lukas, Cernocky, Honza

    “…End-to-End and cascade (ASR-MT) spoken language translation (SLT) systems are reaching comparable performances, however, a large degradation is observed when…”
    Conference Proceeding
  10.

    BUT System for the Second DIHARD Speech Diarization Challenge by Landini, Federico, Wang, Shuai, Diez, Mireia, Burget, Lukas, Matejka, Pavel, Zmolikova, Katerina, Mosner, Ladislav, Silnova, Anna, Plchot, Oldrich, Novotny, Ondrej, Zeinali, Hossein, Rohdin, Johan

    “…This paper describes the winning systems developed by the BUT team for the four tracks of the Second DIHARD Speech Diarization Challenge. For tracks 1 and 2…”
    Conference Proceeding
  11.

    Optimization of Speaker-Aware Multichannel Speech Extraction with ASR Criterion by Zmolikova, Katerina, Delcroix, Marc, Kinoshita, Keisuke, Higuchi, Takuya, Nakatani, Tomohiro, Cernocky, Jan

    “…This paper addresses the problem of recognizing speech corrupted by overlapping speakers in a multichannel setting. To extract a target speaker from the…”
    Conference Proceeding
  12.

    Textless Streaming Speech-to-Speech Translation using Semantic Speech Tokens by Zhao, Jinzheng, Moritz, Niko, Lakomkin, Egor, Xie, Ruiming, Xiu, Zhiping, Zmolikova, Katerina, Ahmed, Zeeshan, Gaur, Yashesh, Le, Duc, Fuegen, Christian

    Published 04-10-2024
    “…Cascaded speech-to-speech translation systems often suffer from the error accumulation problem and high latency, which is a result of cascaded modules whose…”
    Journal Article
  13.

    Neural Target Speech Extraction: An Overview by Zmolikova, Katerina, Delcroix, Marc, Ochiai, Tsubasa, Kinoshita, Keisuke, Černocký, Jan, Yu, Dong

    Published 31-01-2023
    “…Humans can listen to a target speaker even in challenging acoustic conditions that have noise, reverberation, and interfering speakers. This phenomenon is…”
    Journal Article
  14.

    Source Separation for Sound Event Detection in Domestic Environments using Jointly Trained Models by de Benito-Gorron, Diego, Zmolikova, Katerina, Toledano, Doroteo T.

    “…Sound Event Detection and Source Separation are closely related tasks: whereas the first aims to find the time boundaries of acoustic events inside a…”
    Conference Proceeding
  15.

    Speaker activity driven neural speech extraction by Delcroix, Marc, Zmolikova, Katerina, Ochiai, Tsubasa, Kinoshita, Keisuke, Nakatani, Tomohiro

    Published 14-01-2021
    “…Target speech extraction, which extracts the speech of a target speaker in a mixture given auxiliary speaker clues, has recently received increased interest…”
    Journal Article
  16.

    Listen only to me! How well can target speech extraction handle false alarms? by Delcroix, Marc, Kinoshita, Keisuke, Ochiai, Tsubasa, Zmolikova, Katerina, Sato, Hiroshi, Nakatani, Tomohiro

    Published 10-04-2022
    “…Target speech extraction (TSE) extracts the speech of a target speaker in a mixture given auxiliary clues characterizing the speaker, such as an enrollment…”
    Journal Article
  17.

    Analysis of Impact of Emotions on Target Speech Extraction and Speech Separation by Svec, Jan, Zmolikova, Katerina, Kocour, Martin, Delcroix, Marc, Ochiai, Tsubasa, Mosner, Ladislav, Cernocky, Jan Honza

    “…Recently, the performance of blind speech separation (BSS) and target speech extraction (TSE) has greatly progressed. Most works, however, focus on relatively…”
    Conference Proceeding
  18.

    Integration of Variational Autoencoder and Spatial Clustering for Adaptive Multi-Channel Neural Speech Separation by Zmolikova, Katerina, Delcroix, Marc, Burget, Lukas, Nakatani, Tomohiro, Cernocky, Jan Honza

    “…In this paper, we propose a method combining variational autoencoder model of speech with a spatial clustering approach for multi-channel speech separation…”
    Conference Proceeding
  19.

    Integration of variational autoencoder and spatial clustering for adaptive multi-channel neural speech separation by Zmolikova, Katerina, Delcroix, Marc, Burget, Lukáš, Nakatani, Tomohiro, Černocký, Jan "Honza"

    Published 24-11-2020
    “…In this paper, we propose a method combining variational autoencoder model of speech with a spatial clustering approach for multi-channel speech separation…”
    Journal Article
  20.

    Learning speaker representation for neural network based multichannel speaker extraction by Zmolikova, Katerina, Delcroix, Marc, Kinoshita, Keisuke, Higuchi, Takuya, Ogawa, Atsunori, Nakatani, Tomohiro

    “…Recently, schemes employing deep neural networks (DNNs) for extracting speech from noisy observation have demonstrated great potential for noise robust…”
    Conference Proceeding