Hi-ResNet: Edge Detail Enhancement for High-Resolution Remote Sensing Segmentation
Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol. 17, pp. 15024-15040
Format: Journal Article
Language: English
Published: Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2024
Summary: High-resolution remote sensing (HRS) semantic segmentation extracts key objects from high-resolution coverage areas. However, objects of the same category within HRS images generally show significant differences in scale and shape across diverse geographical environments, making it difficult to fit the data distribution. In addition, complex background environments make objects of different categories appear similar, causing a substantial number of objects to be misclassified as background. These issues make existing learning algorithms suboptimal. In this work, we address these problems by proposing a high-resolution remote sensing network (Hi-ResNet) with an efficient network structure consisting, sequentially, of a funnel module, a multibranch module built from stacks of information aggregation (IA) blocks, and a feature refinement module, together with a class-agnostic edge-aware (CEA) loss. Specifically, the proposed funnel module downsamples the initial input image, reducing computational cost while extracting high-resolution semantic information. The processed feature maps are then downsampled incrementally into multiresolution branches to capture image features at different scales. Furthermore, by combining window multihead self-attention, squeeze-and-excitation attention, and depthwise convolution, the lightweight, efficient IA blocks distinguish image features of the same class across varying scales and shapes. Finally, the feature refinement module integrates the CEA loss function, which disambiguates interclass objects with similar shapes and increases the data-distribution distance for correct predictions. With effective pretraining strategies, we demonstrate the superiority of Hi-ResNet over prevalent existing methods on three HRS segmentation benchmarks.
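Of the mechanisms the IA blocks combine, squeeze-and-excitation attention is the simplest to illustrate: channels are "squeezed" to per-channel statistics by global average pooling, then "excited" through a small bottleneck whose sigmoid output rescales each channel. The sketch below shows the generic SE mechanism only, not the authors' implementation; the weights `w1` and `w2` are random placeholders and the reduction ratio `r` is an assumed hyperparameter.

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Squeeze-and-excitation channel attention on a (C, H, W) feature map.

    Squeeze: global average pool per channel.
    Excite: two-layer bottleneck (ReLU, then sigmoid) yielding a gate in
    (0, 1) per channel, which rescales the corresponding channel of x.
    """
    s = x.mean(axis=(1, 2))                 # squeeze: (C,)
    h = np.maximum(0.0, w1 @ s)             # reduce:  (C // r,)
    g = 1.0 / (1.0 + np.exp(-(w2 @ h)))     # gate:    (C,), each in (0, 1)
    return x * g[:, None, None]             # rescale each channel of x

# Toy usage with placeholder weights (illustration only).
rng = np.random.default_rng(0)
C, r = 8, 4                                 # channels, reduction ratio
x = rng.standard_normal((C, 5, 5))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
y = squeeze_excite(x, w1, w2)
```

Because the gate lies strictly between 0 and 1, the block can only attenuate channels, never amplify them; in a trained network the gate values are driven by the learned bottleneck weights rather than random placeholders.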
ISSN: 1939-1404, 2151-1535
DOI: 10.1109/JSTARS.2024.3444773