SeLT: Sonar Echo Image Recognition for Small Targets Using Lightweight Swin Transformer

Bibliographic Details
Published in: OCEANS 2024 - Singapore, pp. 1-5
Main Authors: Xia, Sijia, Hou, Mengyang, Han, Yina, Xiao, Ziyuan, Guo, Zihao, Liu, Qingyu, Ma, Yuanliang
Format: Conference Proceeding
Language: English
Published: IEEE 15-04-2024
Description
Summary: Underwater sonar echo images that contain targets are relatively scarce, which usually limits recognition performance when a high-capacity (even state-of-the-art) network is employed. To address this issue, we propose SeLT, a lightweight adaptation of Swin-T that uses lightweight feature extraction and feature coding modules. Specifically, we reduce the number of stacked Swin Transformer blocks and replace the MLP in each block with a lightweight channel attention module. This eases the requirements for training data and computing resources and greatly accelerates model training. Extensive experiments demonstrate that, compared to the original Swin-T, our model achieves higher recognition performance (a 1.9% increase in AUC) with fewer parameters (reduced by 71%) and lower computational complexity (reduced by 73%).
DOI: 10.1109/OCEANS51537.2024.10682372
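
Illustrative note: the summary states that the MLP in each Swin Transformer block is replaced with a lightweight channel attention module, but this record does not say which design is used. The sketch below only illustrates why such a swap can shrink a block. It assumes a squeeze-and-excitation-style channel attention with a hypothetical reduction ratio of 16, and compares its parameter count with a standard Transformer MLP (expansion ratio 4) at 96 channels, the first-stage width of Swin-T. The function names are made up for this example, and the numbers are not the paper's reported figures.

```python
def mlp_params(channels: int, expansion: int = 4) -> int:
    """Parameters of a two-layer Transformer MLP: C -> expansion*C -> C, with biases."""
    hidden = channels * expansion
    fc1 = channels * hidden + hidden      # weights + bias of the first linear layer
    fc2 = hidden * channels + channels    # weights + bias of the second linear layer
    return fc1 + fc2


def channel_attention_params(channels: int, reduction: int = 16) -> int:
    """Parameters of an SE-style channel attention: C -> C/reduction -> C, with biases.

    Assumption: the record does not specify the attention design or the
    reduction ratio; both are hypothetical choices for illustration.
    """
    hidden = max(channels // reduction, 1)
    fc1 = channels * hidden + hidden      # squeeze projection
    fc2 = hidden * channels + channels    # excitation projection
    return fc1 + fc2


if __name__ == "__main__":
    c = 96  # first-stage channel width of Swin-T
    mlp = mlp_params(c)
    ca = channel_attention_params(c)
    print(f"MLP parameters per block:               {mlp}")
    print(f"Channel attention parameters per block: {ca}")
    print(f"Reduction within this sub-module:       {1 - ca / mlp:.1%}")
```

The comparison covers a single sub-module only. The 71% overall parameter reduction quoted in the summary also reflects the reduced number of stacked blocks and cannot be derived from this sketch.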