Dynamic Multicontext Segmentation of Remote Sensing Images Based on Convolutional Networks

Semantic segmentation requires methods capable of learning high-level features while dealing with large volume of data. Toward such goal, convolutional networks can learn specific and adaptable features based on the data. However, these networks are not capable of processing a whole remote sensing i...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on geoscience and remote sensing Vol. 57; no. 10; pp. 7503 - 7520
Main Authors: Nogueira, Keiller, Dalla Mura, Mauro, Chanussot, Jocelyn, Schwartz, William Robson, dos Santos, Jefersson Alex
Format: Journal Article
Language:English
Published: New York IEEE 01-10-2019
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Institute of Electrical and Electronics Engineers
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Semantic segmentation requires methods capable of learning high-level features while dealing with large volume of data. Toward such goal, convolutional networks can learn specific and adaptable features based on the data. However, these networks are not capable of processing a whole remote sensing image, given its huge size. To overcome such limitation, the image is processed using fixed size patches. The definition of the input patch size is usually performed empirically (evaluating several sizes) or imposed (by network constraint). Both strategies suffer from drawbacks and could not lead to the best patch size. To alleviate this problem, several works exploited multicontext information by combining networks or layers. This process increases the number of parameters, resulting in a more difficult model to train. In this paper, we propose a novel technique to perform semantic segmentation of remote sensing images that exploits a multicontext paradigm without increasing the number of parameters while defining, in training time, the best patch size. The main idea is to train a dilated network with distinct patch sizes, allowing it to capture multicontext characteristics from heterogeneous contexts. While processing these varying patches, the network provides a score for each patch size, helping in the definition of the best size for the current scenario. A systematic evaluation of the proposed algorithm is conducted using four high-resolution remote sensing data sets with very distinct properties. Our results show that the proposed algorithm provides improvements in pixelwise classification accuracy when compared to the state-of-the-art methods.
ISSN:0196-2892
1558-0644
DOI:10.1109/TGRS.2019.2913861