Search Results - "Albanie, Samuel"
-
1
Squeeze-and-Excitation Networks
Published in IEEE transactions on pattern analysis and machine intelligence (01-08-2020) “…The central building block of convolutional neural networks (CNNs) is the convolution operator, which enables networks to construct informative features by…”
Journal Article -
2
Seeing Voices and Hearing Faces: Cross-Modal Biometric Matching
Published in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (01-06-2018) “…We introduce a seemingly impossible task: given only an audio clip of someone speaking, decide which of two face images is the speaker. In this paper we study…”
Conference Proceeding -
3
Disentangled Speech Embeddings Using Cross-Modal Self-Supervision
Published in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01-05-2020) “…The objective of this paper is to learn representations of speaker identity without access to manually annotated data. To do so, we develop a self-supervised…”
Conference Proceeding -
4
Unsupervised Learning of Landmarks by Descriptor Vector Exchange
Published in 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (01-10-2019) “…Equivariance to random image transformations is an effective method to learn landmarks of object categories, such as the eyes and the nose in faces, without…”
Conference Proceeding -
5
Adaptive Cross-Modal Prototypes for Cross-Domain Visual-Language Retrieval
Published in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (01-06-2021) “…In this paper, we study the task of visual-text retrieval in the highly practical setting in which labelled visual data with paired text descriptions are…”
Conference Proceeding -
6
All you need are a few pixels: semantic segmentation with PixelPick
Published in 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) (01-10-2021) “…A central challenge for the task of semantic segmentation is the prohibitive cost of obtaining dense pixel-level annotations to supervise model training. In…”
Conference Proceeding -
7
TeachText: CrossModal text-video retrieval through generalized distillation
Published in Artificial intelligence (01-01-2025) “…In recent years, considerable progress on the task of text-video retrieval has been achieved by leveraging large-scale pretraining on visual and audio datasets…”
Journal Article -
8
Scaling Up Sign Spotting Through Sign Language Dictionaries
Published in International journal of computer vision (01-06-2022) “…The focus of this work is sign spotting – given a video of an isolated sign, our task is to identify whether and where it has been signed in a continuous,…”
Journal Article -
9
Self-Supervised Learning of Geometrically Stable Features Through Probabilistic Introspection
Published in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (01-06-2018) “…Self-supervision can dramatically cut back the amount of manually-labelled data required to train deep neural networks. While self-supervision has usually been…”
Conference Proceeding -
10
TeachText: CrossModal Generalized Distillation for Text-Video Retrieval
Published in 2021 IEEE/CVF International Conference on Computer Vision (ICCV) (01-10-2021) “…In recent years, considerable progress on the task of text-video retrieval has been achieved by leveraging large-scale pretraining on visual and audio datasets…”
Conference Proceeding -
11
Cross Modal Retrieval with Querybank Normalisation
Published in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (01-06-2022) “…Profiting from large-scale training datasets, advances in neural architecture design and efficient inference, joint embeddings have become the dominant…”
Conference Proceeding -
12
Movement change detected by optical flow precedes, but does not predict, tail-biting in pigs
Published in Livestock science (01-10-2020) “…• Optical flow was used to monitor pig movement before an outbreak of tail-biting. • There was increased group movement in the three days prior to…”
Journal Article -
13
Read and Attend: Temporal Localisation in Sign Language Videos
Published in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (01-06-2021) “…The objective of this work is to annotate sign instances across a broad vocabulary in continuous sign language. We train a Transformer model to ingest a…”
Conference Proceeding -
14
Sign Language Video Retrieval with Free-Form Textual Queries
Published in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (01-06-2022) “…Systems that can efficiently search collections of sign language videos have been highlighted as a useful application of sign language technology. However, the…”
Conference Proceeding -
15
Audio Retrieval With Natural Language Queries: A Benchmark Study
Published in IEEE transactions on multimedia (2023) “…The objectives of this work are cross-modal text-audio and audio-text retrieval, in which the goal is to retrieve the audio content from a pool of candidates…”
Journal Article -
16
Unsupervised Salient Object Detection with Spectral Cluster Voting
Published in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (01-06-2022) “…In this paper, we tackle the challenging task of unsupervised salient object detection (SOD) by leveraging spectral clustering on self-supervised features. We…”
Conference Proceeding -
17
A Sound Approach: Using Large Language Models to Generate Audio Descriptions for Egocentric Text-Audio Retrieval
Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14-04-2024) “…Video databases from the internet are a valuable source of text-audio retrieval datasets. However, given that sound and vision streams represent different…”
Conference Proceeding -
18
Sign Language Segmentation with Temporal Convolutional Networks
Published in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (06-06-2021) “…The objective of this work is to determine the location of temporal boundaries between signs in continuous sign language videos. Our approach employs 3D…”
Conference Proceeding -
19
Aligning Subtitles in Sign Language Videos
Published in 2021 IEEE/CVF International Conference on Computer Vision (ICCV) (01-10-2021) “…The goal of this work is to temporally align asynchronous subtitles in sign language videos. In particular, we focus on sign-language interpreted TV broadcast…”
Conference Proceeding -
20
QUERYD: A Video Dataset with High-Quality Text and Audio Narrations
Published in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (06-06-2021) “…We introduce QuerYD, a new large-scale dataset for retrieval and event localisation in video. A unique feature of our dataset is the availability of two audio…”
Conference Proceeding