Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Extending global-local view alignment for self-supervised learning with remote sensing imagery

Xinye Wanyan, Sachith Seneviratne, Shuchang Shen, Michael Kirley

2023-03-12 · Image Classification · Representation Learning · Multi-Label Image Classification · Self-Supervised Learning · Linear-Probe Classification · Contrastive Learning · Change Detection · Knowledge Distillation · Multi-Label Classification

Paper · PDF · Code (official)

Abstract

Since a large number of high-quality remote sensing images are readily accessible, exploiting this corpus of images with less manual annotation draws increasing attention. Self-supervised models acquire general feature representations by formulating a pretext task that generates pseudo-labels for massive unlabeled data to provide supervision for training. While prior studies have explored multiple self-supervised learning techniques in the remote sensing domain, pretext tasks based on local-global view alignment remain underexplored, despite achieving state-of-the-art results on natural imagery. Inspired by DINO, which employs an effective representation learning structure with knowledge distillation based on global-local view alignment, we formulate two pretext tasks for self-supervised learning on remote sensing imagery (SSLRS). Using these tasks, we explore the effectiveness of positive temporal contrast as well as multi-sized views on SSLRS. We extend DINO and propose DINO-MC, which uses local views of various-sized crops instead of a single fixed size in order to alleviate the limited variation in object size observed in remote sensing imagery. Our experiments demonstrate that even when pre-trained on only 10% of the dataset, DINO-MC performs on par with or better than existing state-of-the-art SSLRS methods on multiple remote sensing tasks, while using fewer computational resources. All code, models, and results are released at https://github.com/WennyXY/DINO-MC.
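The abstract describes two core ingredients: DINO-style global-local view alignment via knowledge distillation, and DINO-MC's modification of sampling local crops at several sizes rather than one fixed size. The following is a minimal numpy sketch of those two ideas; the crop-size set and function names are illustrative assumptions, not taken from the paper's released code.

```python
import numpy as np

rng = np.random.default_rng(0)


def sample_local_crop_sizes(n_local=6, sizes=(32, 48, 64, 96)):
    # DINO-MC's key change, per the abstract: local views are cropped at
    # several sizes rather than a single fixed size, to compensate for the
    # limited variation in object size seen in remote sensing imagery.
    # The concrete size set here is a placeholder, not from the paper.
    return rng.choice(sizes, size=n_local).tolist()


def softmax(z, temp):
    z = (z - z.max(axis=-1, keepdims=True)) / temp
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def dino_loss(student_out, teacher_out, t_s=0.1, t_t=0.04):
    # DINO-style distillation: cross-entropy between the teacher's sharpened
    # distribution on a global view and the student's distribution on another
    # (global or local) view. In DINO the teacher has no gradients and is an
    # exponential moving average of the student.
    p_t = softmax(teacher_out, t_t)
    log_p_s = np.log(softmax(student_out, t_s) + 1e-12)
    return float(-(p_t * log_p_s).sum(axis=-1).mean())
```

With identical (all-zero) student and teacher logits the loss reduces to the entropy of a uniform distribution over the output dimension, which makes it easy to sanity-check the implementation.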

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Multi-Label Image Classification | BigEarthNet-10% | mean average precision | 84.2 | DINO-MC |
| Multi-Label Image Classification | BigEarthNet | mAP (micro) | 88.75 | DINO-MC |
| Image Classification | EuroSAT | Accuracy (%) | 98.78 | DINO-MC (Wide ResNet) |
| Image Classification | EuroSAT | Accuracy (%) | 95.7 | DINO-MC (WRN linear eval) |
| Image Classification | BigEarthNet-10% | mean average precision | 84.2 | DINO-MC |
| Image Classification | BigEarthNet | mAP (micro) | 88.75 | DINO-MC |
| Change Detection | OSCD - 13ch | F1 | 52.7 | DINO-MC (WRN-50) |
| Change Detection | OSCD - 13ch | Precision | 49.99 | DINO-MC (WRN-50) |
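The "linear eval" entry refers to linear-probe evaluation: the pre-trained backbone is frozen and only a linear classifier is trained on its features. The sketch below illustrates that protocol with synthetic stand-in features (random data in place of backbone outputs); it is not the paper's evaluation code.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins for frozen backbone features: 200 samples, 16-dim,
# 3 classes, with labels generated from a hidden linear scoring rule.
X = rng.normal(size=(200, 16))
y = (X @ rng.normal(size=(16, 3))).argmax(axis=1)


def train_linear_probe(X, y, n_classes, lr=0.5, steps=500):
    # Linear probing: the feature extractor stays frozen; only this linear
    # head (softmax regression, trained by gradient descent) is fitted.
    W = np.zeros((X.shape[1], n_classes))
    Y = np.eye(n_classes)[y]
    for _ in range(steps):
        logits = X @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        W -= lr * X.T @ (p - Y) / len(X)
    return W


W = train_linear_probe(X, y, n_classes=3)
acc = float(((X @ W).argmax(axis=1) == y).mean())
```

Because the labels are themselves linear in the features, the probe fits them well; in the real protocol, probe accuracy measures how linearly separable the frozen self-supervised features are.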

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper (2025-07-20)
Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
Federated Learning for Commercial Image Sources (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)