FSAU-Net: a network for extracting buildings from remote sensing imagery using feature self-attention
Published in: International Journal of Remote Sensing, Vol. 44, No. 5, pp. 1643-1664
Main Authors:
Format: Journal Article
Language: English
Published: London: Taylor & Francis, 04-03-2023
Summary: Convolutional neural networks (CNNs) extract semantic features from images by stacking convolutional operators, which easily causes semantic information loss and leads to hollows and edge inaccuracies in building extraction. Therefore, a feature self-attention U-block network (FSAU-Net) is proposed. The network focuses on target feature self-attention in the encoding stage: the feature self-attention (FSA) module distinguishes buildings from non-buildings by weighting the extracted features themselves. Spatial attention (SA) is introduced in the decoder stage to focus on the spatial locations of features; SA generates spatial location features from the spatial relationships among the features to highlight building areas. Skip connections fuse the shallow features generated in the encoder stage with the deep features generated in the decoder stage to reduce building information loss.

We validate the superiority of FSAU-Net on the WHU and Inria datasets (0.3 m resolution) and the Massachusetts dataset (1.0 m resolution), experimentally showing IoU of 91.73%, 80.73% and 78.46% and precision of 93.60%, 90.71% and 86.37%, respectively. In addition, we set up ablation experiments by adding the FSA, Squeeze-and-Excitation (SE) and Efficient Channel Attention (ECA) modules to UNet and ResNet101: UNet+FSA improves IoU by 3.15%, 2.72% and 1.77% over UNet, UNet+SE and UNet+ECA, respectively, and ResNet101+FSA improves IoU by 2.06%, 1.17% and 0.90% over ResNet101, ResNet101+SE and ResNet101+ECA, respectively, demonstrating the superiority of the proposed FSA module. FSAU-Net improves IoU by 3.18%, 2.75% and 1.80% over UNet, UNet+SE and UNet+ECA, respectively, and by 2.11%, 1.22% and 0.95% over ResNet101, ResNet101+SE and ResNet101+ECA, respectively, demonstrating the superiority of the proposed FSAU-Net model. The TensorFlow implementation is available at https://github.com/HMH12456/FSAU-Net-master.git.
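To make the two attention mechanisms in the summary concrete, below is a minimal NumPy sketch of the general pattern they describe: an FSA-style gate that re-weights feature channels using statistics of the features themselves, and an SA-style gate that re-weights spatial locations. This is an illustrative sketch only, not the paper's actual TensorFlow implementation (see the linked repository); the pooling and gating choices here are assumptions.

```python
import numpy as np

def _sigmoid(v):
    """Numerically plain logistic gate mapping scores into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-v))

def feature_self_attention(x):
    """Sketch of feature self-attention (FSA)-style weighting.

    Features are re-weighted by scores derived from the features
    themselves: a global average pool gives one descriptor per channel,
    which is gated and broadcast back over the spatial grid.
    x: feature map of shape (H, W, C).
    """
    desc = x.mean(axis=(0, 1))          # (C,) per-channel descriptor
    gate = _sigmoid(desc)               # (C,) self-derived channel weights
    return x * gate                     # broadcast over H and W

def spatial_attention(x):
    """Sketch of spatial attention (SA)-style weighting.

    A single-channel map is pooled across channels and gated, so each
    spatial location (e.g. a likely building pixel) gets its own weight.
    x: feature map of shape (H, W, C).
    """
    pooled = x.mean(axis=-1, keepdims=True)   # (H, W, 1) spatial map
    attn = _sigmoid(pooled)                   # (H, W, 1) location weights
    return x * attn                           # broadcast over channels
```

In a U-shaped network like the one the summary describes, the FSA gate would sit on encoder feature maps and the SA gate on decoder feature maps, with skip connections concatenating the two streams at matching resolutions.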
ISSN: 0143-1161, 1366-5901
DOI: 10.1080/01431161.2023.2177125