Foundation Models for Generalist Geospatial Artificial Intelligence
Format: Journal Article
Language: English
Published: 28-10-2023
Summary: Significant progress in the development of highly adaptable and reusable Artificial Intelligence (AI) models is expected to have a major impact on Earth science and remote sensing. Foundation models are pre-trained on large unlabeled datasets through self-supervision and then fine-tuned for various downstream tasks with small labeled datasets. This paper introduces a first-of-its-kind framework for the efficient pre-training and fine-tuning of foundation models on extensive geospatial data. We have used this framework to create Prithvi, a transformer-based geospatial foundation model pre-trained on more than 1 TB of multispectral satellite imagery from the Harmonized Landsat and Sentinel-2 (HLS) dataset. Our study demonstrates the efficacy of the framework by successfully fine-tuning Prithvi on a range of Earth observation tasks that previous work on foundation models has not tackled: multi-temporal cloud gap imputation, flood mapping, wildfire scar segmentation, and multi-temporal crop segmentation. Our experiments show that the pre-trained model accelerates fine-tuning compared to training from randomly initialized weights. In addition, the pre-trained Prithvi compares well against the state of the art, e.g., outperforming a conditional GAN model in multi-temporal cloud imputation by up to 5 percentage points (or 5.7% relative) in the structural similarity index (SSIM). Finally, because labeled data is scarce in Earth observation, we gradually reduce the amount of labeled data available for fine-tuning to evaluate data efficiency, and demonstrate that it can be reduced significantly without affecting the model's accuracy. The pre-trained 100-million-parameter model and the corresponding fine-tuning workflows have been released publicly as open-source contributions to the global Earth sciences community through Hugging Face.
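
The pre-train-then-fine-tune recipe summarized above is the core of the framework. A minimal PyTorch sketch of the idea follows, assuming an MAE-style masked-reconstruction objective for the self-supervised stage (the record only says "transformer-based" and "self-supervision", so the objective, shapes, and sizes here are illustrative toys, not the paper's method):

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Toy stand-ins: 8 "scenes", each split into 16 patches of 48 features
    # (e.g. 4x4 pixels x 3 spectral bands); real HLS inputs are far larger.
    B, N, D, E = 8, 16, 48, 64
    patches = torch.randn(B, N, D)

    encoder = nn.Sequential(
        nn.Linear(D, E),
        nn.TransformerEncoderLayer(d_model=E, nhead=4, batch_first=True),
    )
    decoder = nn.Linear(E, D)  # maps embeddings back to raw patch values

    # --- Stage 1: self-supervised pre-training (no labels needed) ---
    opt = torch.optim.Adam(
        list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3
    )
    for step in range(5):
        mask = torch.rand(B, N) < 0.75      # hide 75% of the patches
        corrupted = patches.clone()
        corrupted[mask] = 0.0               # zero out the hidden patches
        recon = decoder(encoder(corrupted))
        loss = ((recon - patches)[mask] ** 2).mean()  # score masked patches only
        opt.zero_grad(); loss.backward(); opt.step()

    # --- Stage 2: supervised fine-tuning with a small labeled set ---
    num_classes = 2                         # e.g. flood / no-flood per patch
    head = nn.Linear(E, num_classes)
    labels = torch.randint(0, num_classes, (B, N))
    ft_opt = torch.optim.Adam(
        list(encoder.parameters()) + list(head.parameters()), lr=1e-4
    )
    for step in range(5):
        logits = head(encoder(patches))
        ft_loss = nn.functional.cross_entropy(
            logits.reshape(-1, num_classes), labels.reshape(-1)
        )
        ft_opt.zero_grad(); ft_loss.backward(); ft_opt.step()

Starting stage 2 from the pre-trained encoder, rather than from random weights, is exactly the fine-tuning speed-up the summary reports.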
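
The "5 percentage points (or 5.7% relative)" figures also pin down the unstated baseline: an absolute gain of 0.05 that amounts to 5.7% in relative terms implies a baseline SSIM near 0.88. A quick consistency check (the baseline value is inferred here, not reported in the record):

    # Consistency check for "up to 5pp (or 5.7%)" SSIM improvement.
    gain_abs = 0.05    # 5 percentage points on the SSIM scale [0, 1]
    gain_rel = 0.057   # 5.7% relative to the baseline model
    baseline = gain_abs / gain_rel
    print(f"implied baseline SSIM ~ {baseline:.3f}")             # ~0.877
    print(f"implied Prithvi  SSIM ~ {baseline + gain_abs:.3f}")  # ~0.927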
DOI: 10.48550/arxiv.2310.18660
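
Since the record notes that the checkpoint and fine-tuning workflows are public on Hugging Face, a download can be scripted with the standard huggingface_hub client. The repository id and filename below are illustrative assumptions (check the published model card for the actual names):

    from huggingface_hub import hf_hub_download
    import torch

    # ASSUMPTION: repo id and filename are placeholders, not taken from this record.
    ckpt_path = hf_hub_download(
        repo_id="ibm-nasa-geospatial/Prithvi-100M",
        filename="Prithvi_100M.pt",
    )
    state_dict = torch.load(ckpt_path, map_location="cpu")
    print(f"loaded checkpoint with {len(state_dict)} entries from {ckpt_path}")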