Hi-ResNet: Edge Detail Enhancement for High-Resolution Remote Sensing Segmentation
Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol. 17, pp. 15024-15040
Format: Journal Article
Language: English
Published: Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2024
Summary: High-resolution remote sensing (HRS) semantic segmentation extracts key objects from high-resolution coverage areas. However, objects of the same category within HRS images generally show significant differences in scale and shape across diverse geographical environments, making it difficult to fit the data distribution. In addition, complex background environments make objects of different categories appear similar, causing a substantial number of objects to be misclassified as background. These issues make existing learning algorithms suboptimal. In this work, we address these problems by proposing a high-resolution remote sensing network (Hi-ResNet) with an efficient network structure consisting, sequentially, of a funnel module, a multibranch module built from stacks of information aggregation (IA) blocks, and a feature refinement module, together with a class-agnostic edge-aware (CEA) loss. Specifically, the proposed funnel module downsamples the initial input image, reducing computational cost while extracting high-resolution semantic information. The processed feature maps are then downsampled incrementally into multiresolution branches to capture image features at different scales. Furthermore, by combining window multihead self-attention, squeeze-and-excitation attention, and depthwise convolution, the lightweight, efficient IA blocks distinguish image features of the same class across varying scales and shapes. Finally, the feature refinement module integrates the CEA loss function, which disambiguates interclass objects with similar shapes and increases the data-distribution distance for correct predictions. With effective pretraining strategies, we demonstrate the superiority of Hi-ResNet over prevalent existing methods on three HRS segmentation benchmarks.
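Of the mechanisms the IA blocks combine, squeeze-and-excitation attention is the simplest to illustrate: channels are "squeezed" to per-channel statistics by global average pooling, then "excited" through a small bottleneck whose sigmoid output rescales each channel. The sketch below shows the generic SE mechanism only, not the authors' implementation; the weights `w1` and `w2` are random placeholders and the reduction ratio `r` is an assumed hyperparameter.

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Squeeze-and-excitation channel attention on a (C, H, W) feature map.

    Squeeze: global average pool per channel.
    Excite: two-layer bottleneck (ReLU, then sigmoid) yielding a gate in
    (0, 1) per channel, which rescales the corresponding channel of x.
    """
    s = x.mean(axis=(1, 2))                 # squeeze: (C,)
    h = np.maximum(0.0, w1 @ s)             # reduce:  (C // r,)
    g = 1.0 / (1.0 + np.exp(-(w2 @ h)))     # gate:    (C,), each in (0, 1)
    return x * g[:, None, None]             # rescale each channel of x

# Toy usage with placeholder weights (illustration only).
rng = np.random.default_rng(0)
C, r = 8, 4                                 # channels, reduction ratio
x = rng.standard_normal((C, 5, 5))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
y = squeeze_excite(x, w1, w2)
```

Because the gate lies strictly between 0 and 1, the block can only attenuate channels, never amplify them; in a trained network the gate values are driven by the learned bottleneck weights rather than random placeholders.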
ISSN: 1939-1404, 2151-1535
DOI: 10.1109/JSTARS.2024.3444773