Search Results - "Yamazaki, Kashu"
1
Deep reinforcement learning in computer vision: a comprehensive survey
Published in The Artificial intelligence review (01-04-2022)“…Deep reinforcement learning augments the reinforcement learning framework and utilizes the powerful representation of deep neural networks. Recent works have…”
Journal Article
2
Spiking Neural Networks and Their Applications: A Review
Published in Brain sciences (30-06-2022)“…The past decade has witnessed the great success of deep neural networks in various domains. However, deep neural networks are very resource-intensive in terms…”
Journal Article
3
AerialFormer: Multi-Resolution Transformer for Aerial Image Segmentation
Published in Remote sensing (Basel, Switzerland) (01-08-2024)“…When performing remote sensing image segmentation, practitioners often encounter various challenges, such as a strong imbalance in the foreground–background,…”
Journal Article
4
Narrow Band Active Contour Attention Model for Medical Segmentation
Published in Diagnostics (Basel) (31-07-2021)“…Medical image segmentation is one of the most challenging tasks in medical image analysis and widely developed for many clinical applications. While deep…”
Journal Article
5
ABN: Agent-Aware Boundary Networks for Temporal Action Proposal Generation
Published in IEEE access (2021)“…Temporal action proposal generation (TAPG) aims to estimate temporal intervals of actions in untrimmed videos, which is a challenging yet plays an important…”
Journal Article
6
Development of a Soft Robot Based Photodynamic Therapy for Pancreatic Cancer
Published in IEEE/ASME transactions on mechatronics (01-12-2021)“…Photodynamic therapy (PDT) has received increased attention over the past decades with the potential to noninvasively treat cancer via light exposure. Due to…”
Journal Article
7
Towards Multi-Modal Explainable Video Understanding
Published 01-01-2023“…This thesis presents a novel approach to video understanding by emulating human perceptual processes and creating an explainable and coherent storytelling…”
Dissertation
8
AOE-Net: Entities Interactions Modeling with Adaptive Attention Mechanism for Temporal Action Proposals Generation
Published in International journal of computer vision (2023)“…Temporal action proposal generation (TAPG) is a challenging task, which requires localizing action intervals in an untrimmed video. Intuitively, we as humans,…”
Journal Article
9
CLIP-TSA: Clip-Assisted Temporal Self-Attention for Weakly-Supervised Video Anomaly Detection
Published in 2023 IEEE International Conference on Image Processing (ICIP) (08-10-2023)“…Video anomaly detection (VAD) - commonly formulated as a multiple-instance learning problem in a weakly-supervised manner due to its labor-intensive nature -…”
Conference Proceeding
10
VLCAP: Vision-Language with Contrastive Learning for Coherent Video Paragraph Captioning
Published in 2022 IEEE International Conference on Image Processing (ICIP) (16-10-2022)“…In this paper, we leverage the human perceiving process, that involves vision and language interaction, to generate a coherent paragraph description of…”
Conference Proceeding
11
A Multi-task Contextual Atrous Residual Network for Brain Tumor Detection & Segmentation
Published in 2020 25th International Conference on Pattern Recognition (ICPR) (10-01-2021)“…In recent years, deep neural networks have achieved state-of-the-art performance in a variety of recognition and segmentation tasks in medical imaging…”
Conference Proceeding
12
Contextual Explainable Video Representation: Human Perception-based Understanding
Published in 2022 56th Asilomar Conference on Signals, Systems, and Computers (31-10-2022)“…Video understanding is a growing field and a subject of intense research, which includes many interesting tasks to understanding both spatial and temporal…”
Conference Proceeding
13
Offset Curves Loss for Imbalanced Problem in Medical Segmentation
Published in 2020 25th International Conference on Pattern Recognition (ICPR) (10-01-2021)“…Medical image segmentation has played an important role in medical analysis and widely developed for many clinical applications. Deep learning-based approaches…”
Conference Proceeding
14
HENASY: Learning to Assemble Scene-Entities for Egocentric Video-Language Model
Published 01-06-2024“…Current video-language models (VLMs) rely extensively on instance-level alignment between video and language modalities, which presents two major limitations:…”
Journal Article
15
CLIP-TSA: CLIP-Assisted Temporal Self-Attention for Weakly-Supervised Video Anomaly Detection
Published 09-12-2022“…Video anomaly detection (VAD) -- commonly formulated as a multiple-instance learning problem in a weakly-supervised manner due to its labor-intensive nature --…”
Journal Article
16
VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
Published 28-11-2022“…Video paragraph captioning aims to generate a multi-sentence description of an untrimmed video with several temporal event locations in coherent storytelling…”
Journal Article
17
Open-Fusion: Real-time Open-Vocabulary 3D Mapping and Queryable Scene Representation
Published in 2024 IEEE International Conference on Robotics and Automation (ICRA) (13-05-2024)“…Precise 3D environmental mapping with semantics is essential in robotics. Existing methods often rely on pre-defined concepts during training or are…”
Conference Proceeding
18
DNA: Deformable Neural Articulations Network for Template-free Dynamic 3D Human Reconstruction from Monocular RGB-D Video
Published in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (01-06-2023)“…In this paper, we present a novel Deformable Neural Articulations Network (DNA-Net), which is a template-free learning-based method for dynamic 3D human…”
Conference Proceeding
19
Open-Fusion: Real-time Open-Vocabulary 3D Mapping and Queryable Scene Representation
Published 05-10-2023“…Precise 3D environmental mapping is pivotal in robotics. Existing methods often rely on predefined concepts during training or are time-intensive when…”
Journal Article
20
Contextual Explainable Video Representation: Human Perception-based Understanding
Published 12-12-2022“…Video understanding is a growing field and a subject of intense research, which includes many interesting tasks to understanding both spatial and temporal…”
Journal Article