Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Feature Guided Masked Autoencoder for Self-supervised Learning in Remote Sensing

Yi Wang, Hugo Hernández Hernández, Conrad M Albrecht, Xiao Xiang Zhu

2023-10-28
Tasks: Image Classification, Multi-Label Image Classification, Self-Supervised Learning

Abstract

Self-supervised learning guided by masked image modelling, such as Masked AutoEncoder (MAE), has attracted wide attention for pretraining vision transformers in remote sensing. However, MAE tends to excessively focus on pixel details, thereby limiting the model's capacity for semantic understanding, in particular for noisy SAR images. In this paper, we explore spectral and spatial remote sensing image features as improved MAE-reconstruction targets. We first conduct a study on reconstructing various image features, all performing comparably well or better than raw pixels. Based on such observations, we propose Feature Guided Masked Autoencoder (FG-MAE): reconstructing a combination of Histograms of Oriented Gradients (HOG) and Normalized Difference Indices (NDI) for multispectral images, and reconstructing HOG for SAR images. Experimental results on three downstream tasks illustrate the effectiveness of FG-MAE with a particular boost for SAR imagery. Furthermore, we demonstrate the well-inherited scalability of FG-MAE and release a first series of pretrained vision transformers for medium-resolution SAR and multispectral images.
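
To make the two feature types named in the abstract concrete, here is a minimal, dependency-free sketch of how an NDI map and a HOG-style orientation histogram could be computed for one image patch. It is not the authors' implementation: the function names (`ndi`, `hog_cell`), the 9-bin unsigned histogram, the central-difference gradients, and the choice of bands are all illustrative assumptions, and the way FG-MAE combines the two targets is not shown.

```python
import math

def ndi(band_a, band_b, eps=1e-6):
    """Normalized difference index, (a - b) / (a + b), computed per pixel.

    With band_a = near-infrared and band_b = red this is the familiar NDVI;
    which band pairs FG-MAE uses is not stated in the abstract (assumption).
    """
    return [[(a - b) / (a + b + eps) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(band_a, band_b)]

def hog_cell(img, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations
    (0 to 180 degrees) for a single cell, L2-normalized.

    Simplified on purpose: one cell, no block normalization, border
    pixels skipped. The bin count is a hypothetical default.
    """
    height, width = len(img), len(img[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = img[y][x + 1] - img[y][x - 1]   # horizontal central difference
            gy = img[y + 1][x] - img[y - 1][x]   # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle / bin_width) % bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist)) + 1e-6
    return [v / norm for v in hist]

# Toy 4x4 patch whose intensity increases from left to right.
ramp = [[float(x) for x in range(4)] for _ in range(4)]
hog_target = hog_cell(ramp)            # all gradient energy falls in bin 0
ndvi_like = ndi([[0.6]], [[0.2]])      # roughly 0.5 for this single pixel
```

In a masked-autoencoder setting, features like these would replace raw pixel values as the regression target for the masked patches; the sketch only covers the target computation.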

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Multi-Label Image Classification | BigEarthNet-S1 (official test set) | mAP (micro) | 82.7 | FG-MAE (ViT-S/16) |
| Multi-Label Image Classification | BigEarthNet-S1 (official test set) | mAP (micro) | 81.3 | MAE (ViT-S/16) |
| Multi-Label Image Classification | BigEarthNet-S1 (official test set) | mAP (micro) | 79.5 | ViT-S/16 |
| Multi-Label Image Classification | BigEarthNet (official test set) | F1 Score | 80.8 | FG-MAE (ViT-S/16) |
| Multi-Label Image Classification | BigEarthNet (official test set) | mAP (micro) | 89.3 | FG-MAE (ViT-S/16) |
| Multi-Label Image Classification | BigEarthNet (official test set) | F1 Score | 79.9 | MAE (ViT-S/16) |
| Multi-Label Image Classification | BigEarthNet (official test set) | mAP (micro) | 88.6 | MAE (ViT-S/16) |
| Multi-Label Image Classification | BigEarthNet (official test set) | F1 Score | 78.9 | ViT-S/16 |
| Multi-Label Image Classification | BigEarthNet (official test set) | mAP (micro) | 87.8 | ViT-S/16 |
| Image Classification | EuroSAT-SAR | Overall Accuracy | 85.9 | FG-MAE (ViT-S/16) |
| Image Classification | EuroSAT-SAR | Overall Accuracy | 81 | MAE (ViT-S/16) |
| Image Classification | EuroSAT-SAR | Overall Accuracy | 78.4 | ViT-S/16 |
| Image Classification | BigEarthNet-S1 (official test set) | mAP (micro) | 82.7 | FG-MAE (ViT-S/16) |
| Image Classification | BigEarthNet-S1 (official test set) | mAP (micro) | 81.3 | MAE (ViT-S/16) |
| Image Classification | BigEarthNet-S1 (official test set) | mAP (micro) | 79.5 | ViT-S/16 |
| Image Classification | BigEarthNet (official test set) | F1 Score | 80.8 | FG-MAE (ViT-S/16) |
| Image Classification | BigEarthNet (official test set) | mAP (micro) | 89.3 | FG-MAE (ViT-S/16) |
| Image Classification | BigEarthNet (official test set) | F1 Score | 79.9 | MAE (ViT-S/16) |
| Image Classification | BigEarthNet (official test set) | mAP (micro) | 88.6 | MAE (ViT-S/16) |
| Image Classification | BigEarthNet (official test set) | F1 Score | 78.9 | ViT-S/16 |
| Image Classification | BigEarthNet (official test set) | mAP (micro) | 87.8 | ViT-S/16 |
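
The size of the improvement is easier to see as differences between models. The short sketch below recomputes the FG-MAE gains from the values listed above; the dictionary layout and the helper name `gains` are illustrative choices, not part of the source, and duplicated rows are entered once.

```python
# Scores copied from the results listed above (all models use ViT-S/16).
results = {
    ("BigEarthNet-S1", "mAP (micro)"): {"FG-MAE": 82.7, "MAE": 81.3, "ViT-S/16": 79.5},
    ("BigEarthNet", "F1 Score"): {"FG-MAE": 80.8, "MAE": 79.9, "ViT-S/16": 78.9},
    ("BigEarthNet", "mAP (micro)"): {"FG-MAE": 89.3, "MAE": 88.6, "ViT-S/16": 87.8},
    ("EuroSAT-SAR", "Overall Accuracy"): {"FG-MAE": 85.9, "MAE": 81.0, "ViT-S/16": 78.4},
}

def gains(table, model="FG-MAE", baseline="MAE"):
    """Absolute difference in points between `model` and `baseline` per benchmark."""
    return {key: round(scores[model] - scores[baseline], 1)
            for key, scores in table.items()}

for (dataset, metric), delta in gains(results).items():
    print(f"{dataset:15s} {metric:17s} +{delta} over MAE")
```

Against MAE, the differences are 1.4 points on BigEarthNet-S1 and 4.9 points on EuroSAT-SAR, compared with 0.9 (F1) and 0.7 (mAP) on BigEarthNet, which is consistent with the abstract's remark about a particular boost for SAR imagery.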

Related Papers

Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
Federated Learning for Commercial Image Sources (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
A Semi-Supervised Learning Method for the Identification of Bad Exposures in Large Imaging Surveys (2025-07-17)
Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking (2025-07-15)
Transferring Styles for Reduced Texture Bias and Improved Robustness in Semantic Segmentation Networks (2025-07-14)