Spatial-Temporal Contextual Aggregation Siamese Network for UAV Tracking


Bibliographic Details
Published in: Drones (Basel) Vol. 8; no. 9; p. 433
Main Authors: Chen, Qiqi, Wang, Xuan, Liu, Faxue, Zuo, Yujia, Liu, Chenglong
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 01-09-2024
Description
Summary: In recent years, many studies have used Siamese networks (SNs) for UAV tracking. However, SNs have two problems in this setting. First, their only information sources are a fixed template patch and the current search frame. The static template cannot perceive how the target's features change over time, and shallow feature extraction with linear sequential mapping severely limits feature expressiveness. As a result, many existing SNs struggle with the challenges of UAV tracking, such as scale variation and viewpoint change caused by changes in the UAV's height and angle, and background clutter and occlusion caused by complex aerial backgrounds. Second, SN trackers for UAV tracking still struggle to extract features that are both lightweight and effective. A tracker with a heavyweight backbone is impractical given the limited computing power of the UAV platform. Therefore, we propose a lightweight spatial-temporal contextual Siamese tracking system for UAV tracking (SiamST). SiamST improves UAV tracking performance by augmenting the horizontal spatial information and introducing vertical temporal information into the Siamese network. Specifically, a high-order multiscale spatial module is designed to extract multiscale, long-range, high-order spatial information, and a temporal template transformer introduces temporal contextual information for dynamic template updating. Evaluation and comparison of SiamST against many state-of-the-art trackers on three UAV benchmarks show that it is efficient and lightweight.
ISSN:2504-446X
DOI:10.3390/drones8090433