Search Results - "Shvetsova, Nina"

  • Showing 1 - 17 of 17 results
  1.

    Anomaly Detection in Medical Imaging With Deep Perceptual Autoencoders by Shvetsova, Nina, Bakker, Bart, Fedulova, Irina, Schulz, Heinrich, Dylov, Dmitry V.

    Published in IEEE access (2021)
    “…Anomaly detection is the problem of recognizing abnormal inputs based on the seen examples of normal data. Despite recent advances of deep learning in…”
    Journal Article
  2.

    Everything at Once - Multi-modal Fusion Transformer for Video Retrieval by Shvetsova, Nina, Chen, Brian, Rouditchenko, Andrew, Thomas, Samuel, Kingsbury, Brian, Feris, Rogerio, Harwath, David, Glass, James, Kuehne, Hilde

    “…Multi-modal learning from video data has seen increased attention recently as it allows training of semantically meaningful embeddings without human…”
    Conference Proceeding
  3.
  4.
  5.

    In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval by Shvetsova, Nina, Kukleva, Anna, Schiele, Bernt, Kuehne, Hilde

    Published 16-09-2023
    “…Large-scale noisy web image-text datasets have been proven to be efficient for learning robust vision-language models. However, when transferring them to the…”
    Journal Article
  6.

    Learning by Sorting: Self-supervised Learning with Group Ordering Constraints by Shvetsova, Nina, Petersen, Felix, Kukleva, Anna, Schiele, Bernt, Kuehne, Hilde

    Published 05-01-2023
    “…Contrastive learning has become an important tool in learning representations from unlabeled data mainly relying on the idea of minimizing distance between…”
    Journal Article
  7.

    VL-Taboo: An Analysis of Attribute-based Zero-shot Capabilities of Vision-Language Models by Vogel, Felix, Shvetsova, Nina, Karlinsky, Leonid, Kuehne, Hilde

    Published 12-09-2022
    “…Vision-language models trained on large, randomly collected data had significant impact in many areas since they appeared. But as they show great performance…”
    Journal Article
  8.

    HowToCaption: Prompting LLMs to Transform Video Annotations at Scale by Shvetsova, Nina, Kukleva, Anna, Hong, Xudong, Rupprecht, Christian, Schiele, Bernt, Kuehne, Hilde

    Published 07-10-2023
    “…Instructional videos are a common source for learning text-video or even multimodal representations by leveraging subtitles extracted with automatic speech…”
    Journal Article
  9.

    Preserving Modality Structure Improves Multi-Modal Learning by Sirnam, Swetha, Rizve, Mamshad Nayeem, Shvetsova, Nina, Kuehne, Hilde, Shah, Mubarak

    Published 24-08-2023
    “…Self-supervised learning on large-scale multi-modal datasets allows learning semantically meaningful embeddings in a joint multi-modal representation space…”
    Journal Article
  10.

    What, When, and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions by Chen, Brian, Shvetsova, Nina, Rouditchenko, Andrew, Kondermann, Daniel, Thomas, Samuel, Chang, Shih-Fu, Feris, Rogerio, Glass, James, Kuehne, Hilde

    “…Spatio-temporal grounding describes the task of localizing events in space and time, e.g., in video data, based on verbal descriptions only. Models for this…”
    Conference Proceeding
  11.

    Augmentation Learning for Semi-Supervised Classification by Frommknecht, Tim, Zipf, Pedro Alves, Fan, Quanfu, Shvetsova, Nina, Kuehne, Hilde

    Published 03-08-2022
    “…Recently, a number of new Semi-Supervised Learning methods have emerged. As the accuracy for ImageNet and similar datasets increased over time, the performance…”
    Journal Article
  12.

    What, when, and where? -- Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions by Chen, Brian, Shvetsova, Nina, Rouditchenko, Andrew, Kondermann, Daniel, Thomas, Samuel, Chang, Shih-Fu, Feris, Rogerio, Glass, James, Kuehne, Hilde

    Published 29-03-2023
    “…Spatio-temporal grounding describes the task of localizing events in space and time, e.g., in video data, based on verbal descriptions only. Models for this…”
    Journal Article
  13.

    MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge by Lin, Wei, Karlinsky, Leonid, Shvetsova, Nina, Possegger, Horst, Kozinski, Mateusz, Panda, Rameswar, Feris, Rogerio, Kuehne, Hilde, Bischof, Horst

    Published 15-03-2023
    “…Large scale Vision-Language (VL) models have shown tremendous success in aligning representations between visual and text modalities. This enables remarkable…”
    Journal Article
  14.

    Anomaly Detection in Medical Imaging with Deep Perceptual Autoencoders by Shvetsova, Nina, Bakker, Bart, Fedulova, Irina, Schulz, Heinrich, Dylov, Dmitry V

    Published 13-09-2021
    “…IEEE Access, vol. 9, pp. 118571-118583, 2021. Anomaly detection is the problem of recognizing abnormal inputs based on the seen examples of normal data. Despite…”
    Journal Article
  15.

    Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval by Shvetsova, Nina, Chen, Brian, Rouditchenko, Andrew, Thomas, Samuel, Kingsbury, Brian, Feris, Rogerio, Harwath, David, Glass, James, Kuehne, Hilde

    Published 08-12-2021
    “…Multi-modal learning from video data has seen increased attention recently as it allows to train semantically meaningful embeddings without human annotation…”
    Journal Article
  16.

    Routing with Self-Attention for Multimodal Capsule Networks by Duarte, Kevin, Chen, Brian, Shvetsova, Nina, Rouditchenko, Andrew, Thomas, Samuel, Liu, Alexander, Harwath, David, Glass, James, Kuehne, Hilde, Shah, Mubarak

    Published 01-12-2021
    “…The task of multimodal learning has seen a growing interest recently as it allows for training neural architectures based on different modalities such as…”
    Journal Article
  17.

    C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval by Rouditchenko, Andrew, Chuang, Yung-Sung, Shvetsova, Nina, Thomas, Samuel, Feris, Rogerio, Kingsbury, Brian, Karlinsky, Leonid, Harwath, David, Kuehne, Hilde, Glass, James

    Published 07-10-2022
    “…Multilingual text-video retrieval methods have improved significantly in recent years, but the performance for other languages lags behind English. We propose…”
    Journal Article