Search Results - "Min, Kyle"

1
TASED-Net: Temporally-Aggregating Spatial Encoder-Decoder Network for Video Saliency Detection by Min, Kyle, Corso, Jason

Published in 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (01-10-2019)
“…TASED-Net is a 3D fully-convolutional network architecture for video saliency detection. It consists of two building blocks: first, the encoder network…”

Conference Proceeding

2
Hierarchical Novelty Detection for Visual Object Recognition by Lee, Kibok, Lee, Kimin, Min, Kyle, Zhang, Yuting, Shin, Jinwoo, Lee, Honglak

Published in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (01-06-2018)
“…Deep neural networks have achieved impressive success in large-scale visual object recognition tasks with a predefined set of classes. However, recognizing…”

Conference Proceeding

3
SViTT: Temporal Learning of Sparse Video-Text Transformers by Li, Yi, Min, Kyle, Tripathi, Subarna, Vasconcelos, Nuno

Published in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (01-06-2023)
“…Do video-text transformers learn to model temporal relationships across frames? Despite their immense capacity and the abundance of multimodal training data,…”

Conference Proceeding

4
Integrating Human Gaze into Attention for Egocentric Activity Recognition by Min, Kyle, Corso, Jason J.

Published in 2021 IEEE Winter Conference on Applications of Computer Vision (WACV) (01-01-2021)
“…It is well known that human gaze carries significant information about visual attention. However, there are three main difficulties in incorporating the gaze…”

Conference Proceeding

5
Unbiased Scene Graph Generation in Videos by Nag, Sayak, Min, Kyle, Tripathi, Subarna, Roy-Chowdhury, Amit K.

Published in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (01-06-2023)
“…The task of dynamic scene graph generation (SGG) from videos is complicated and challenging due to the inherent dynamics of a scene, temporal fluctuation of…”

Conference Proceeding

6
STHG: Spatial-Temporal Heterogeneous Graph Learning for Advanced Audio-Visual Diarization by Min, Kyle

Published 18-06-2023
“…This report introduces our novel method named STHG for the Audio-Visual Diarization task of the Ego4D Challenge 2023. Our key innovation is that we model all…”

Journal Article

7
Intel Labs at Ego4D Challenge 2022: A Better Baseline for Audio-Visual Diarization by Min, Kyle

Published 14-10-2022
“…This report describes our approach for the Audio-Visual Diarization (AVD) task of the Ego4D Challenge 2022. Specifically, we present multiple technical…”

Journal Article

8
SViTT-Ego: A Sparse Video-Text Transformer for Egocentric Video by Valdez, Hector A, Min, Kyle, Tripathi, Subarna

Published 12-06-2024
“…Pretraining egocentric vision-language models has become essential to improving downstream egocentric video-text tasks. These egocentric foundation models…”

Journal Article

9
R.A.C.E.: Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion Model by Kim, Changhoon, Min, Kyle, Yang, Yezhou

Published 25-05-2024
“…In the evolving landscape of text-to-image (T2I) diffusion models, the remarkable capability to generate high-quality images from textual descriptions faces…”

Journal Article

10
Ego-VPA: Egocentric Video Understanding with Parameter-efficient Adaptation by Wu, Tz-Ying, Min, Kyle, Tripathi, Subarna, Vasconcelos, Nuno

Published 28-07-2024
“…Video understanding typically requires fine-tuning the large backbone when adapting to new domains. In this paper, we leverage the egocentric video foundation…”

Journal Article

11
WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models by Kim, Changhoon, Min, Kyle, Patel, Maitreya, Cheng, Sheng, Yang, Yezhou

Published in 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (16-06-2024)
“…The rapid advancement of generative models, facilitating the creation of hyper-realistic images from textual descriptions, has concurrently escalated critical…”

Conference Proceeding

12
Contrastive Language Video Time Pre-training by Liu, Hengyue, Min, Kyle, Valdez, Hector A, Tripathi, Subarna

Published 03-06-2024
“…We introduce LAVITI, a novel approach to learning language, video, and temporal representations in long-form videos via contrastive learning. Different from…”

Journal Article

13
Integrating Human Gaze into Attention for Egocentric Activity Recognition by Min, Kyle, Corso, Jason J

Published 08-11-2020
“…It is well known that human gaze carries significant information about visual attention. However, there are three main difficulties in incorporating the gaze…”

Journal Article

14
Adversarial Background-Aware Loss for Weakly-supervised Temporal Activity Localization by Min, Kyle, Corso, Jason J

Published 13-07-2020
“…Temporally localizing activities within untrimmed videos has been extensively studied in recent years. Despite recent advances, existing methods for…”

Journal Article

15
WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models by Kim, Changhoon, Min, Kyle, Patel, Maitreya, Cheng, Sheng, Yang, Yezhou

Published 07-06-2023
“…The rapid advancement of generative models, facilitating the creation of hyper-realistic images from textual descriptions, has concurrently escalated critical…”

Journal Article

16
SViTT: Temporal Learning of Sparse Video-Text Transformers by Li, Yi, Min, Kyle, Tripathi, Subarna, Vasconcelos, Nuno

Published 18-04-2023
“…Do video-text transformers learn to model temporal relationships across frames? Despite their immense capacity and the abundance of multimodal training data,…”

Journal Article

17
Unbiased Scene Graph Generation in Videos by Nag, Sayak, Min, Kyle, Tripathi, Subarna, Chowdhury, Amit K. Roy

Published 03-04-2023
“…The task of dynamic scene graph generation (SGG) from videos is complicated and challenging due to the inherent dynamics of a scene, temporal fluctuation of…”

Journal Article

18
TASED-Net: Temporally-Aggregating Spatial Encoder-Decoder Network for Video Saliency Detection by Min, Kyle, Corso, Jason J

Published 15-08-2019
“…TASED-Net is a 3D fully-convolutional network architecture for video saliency detection. It consists of two building blocks: first, the encoder network…”

Journal Article

19
Action Scene Graphs for Long-Form Understanding of Egocentric Videos by Rodin, Ivan, Furnari, Antonino, Min, Kyle, Tripathi, Subarna, Farinella, Giovanni Maria

Published in 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (16-06-2024)
“…We present Egocentric Action Scene Graphs (EASGs), a new representation for long-form understanding of egocentric videos. EASGs extend standard…”

Conference Proceeding

20
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection by Min, Kyle, Roy, Sourya, Tripathi, Subarna, Guha, Tanaya, Majumdar, Somdeb

Published 15-07-2022
“…Active speaker detection (ASD) in videos with multiple speakers is a challenging task as it requires learning effective audiovisual features and…”

Journal Article
