Event-Based Optical Flow via Transforming Into Motion-Dependent View

Bibliographic Details
Published in: IEEE Transactions on Image Processing, Vol. 33, pp. 5327-5339
Main Authors: Wan, Zengyu, Tan, Ganchao, Wang, Yang, Zhai, Wei, Cao, Yang, Zha, Zheng-Jun
Format: Journal Article
Language: English
Published: United States, IEEE, 01-01-2024
Description
Summary: Event cameras respond to temporal dynamics, which helps resolve ambiguities in spatio-temporal changes for optical flow estimation. However, the unique spatio-temporal distribution of events makes feature extraction challenging, and constructing the motion representation directly from the orthogonal view is less than ideal because appearance and motion are entangled. This paper proposes transforming the orthogonal view into a motion-dependent one to enhance event-based motion representation, and presents a Motion View-based Network (MV-Net) for practical optical flow estimation. Specifically, the motion-dependent view transformation is achieved through the Event View Transformation Module, which captures the relationship between the steepest temporal changes and the motion direction and incorporates these temporal cues into the view transformation process for feature gathering. The module has two phases: an extraction phase, which extracts temporal evolution clues with a central difference operation, and a perception phase, which captures the motion pattern with evolution-guided deformable convolution. In addition, MV-Net uses an eccentric downsampling process to avoid the response weakening caused by event sparsity in the downsampling stage. The whole network is trained end-to-end in a self-supervised manner, and evaluations on four challenging datasets show that the proposed model outperforms state-of-the-art (SOTA) methods.
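The central difference operation named in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, purely for illustration, that events have already been accumulated into a list of T time bins, each an H x W grid of values, and the function name temporal_central_difference is hypothetical. The sketch applies the standard central difference along the time axis, giving a per-pixel estimate of temporal change for each interior bin.

```python
def temporal_central_difference(voxel):
    """Central difference along the time axis of a binned event volume.

    voxel: list of T frames, each an H x W list of numbers (assumed layout,
    not taken from the paper). Returns T - 2 frames, one per interior bin,
    where each value is (next - previous) / 2.
    """
    num_bins = len(voxel)
    if num_bins < 3:
        raise ValueError("need at least 3 time bins for a central difference")

    result = []
    for t in range(1, num_bins - 1):
        prev_frame, next_frame = voxel[t - 1], voxel[t + 1]
        # Pair up rows, then pixels, of the neighbouring time bins.
        diff = [
            [(nxt - prv) / 2.0 for prv, nxt in zip(prev_row, next_row)]
            for prev_row, next_row in zip(prev_frame, next_frame)
        ]
        result.append(diff)
    return result


# Three time bins of a 1 x 2 grid (hypothetical values).
example = [
    [[0, 1]],
    [[5, 5]],
    [[2, 5]],
]
print(temporal_central_difference(example))
```

In the described module, cues of this kind are what guide the deformable convolution in the perception phase; that step is not sketched here because the summary does not specify it.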
ISSN: 1057-7149, 1941-0042
DOI: 10.1109/TIP.2024.3426469