
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Composed Image Retrieval for Remote Sensing

Bill Psomas, Ioannis Kakogeorgiou, Nikos Efthymiadis, Giorgos Tolias, Ondrej Chum, Yannis Avrithis, Konstantinos Karantzalos

2024-05-24 · Composed Image Retrieval (CoIR) · Descriptive · Retrieval · Zero-Shot Composed Image Retrieval (ZS-CIR) · Language Modelling · Image Retrieval

Abstract

This work introduces composed image retrieval to remote sensing. It allows querying a large image archive with image examples modified by a textual description, which enriches the descriptive power over unimodal queries, whether visual or textual. The textual part can modify various attributes, such as shape, color, or context. A novel method fusing image-to-image and text-to-image similarity is introduced. We demonstrate that a vision-language model possesses sufficient descriptive power and that no further learning step or training data is necessary. We present a new evaluation benchmark focused on color, context, density, existence, quantity, and shape modifications. Our work not only sets the state of the art for this task but also serves as a foundational step in addressing a gap in the field of remote sensing image retrieval. Code at: https://github.com/billpsomas/rscir
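
The abstract describes fusing image-to-image and text-to-image similarity without any training. A minimal sketch of that general idea follows. It is not the authors' implementation (see the linked repository for that). It assumes the image and text embeddings already come from a vision-language model with a shared embedding space, and the function names, the min-max normalization, and the mixing weight `lam` are all hypothetical choices made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_scores(query, archive):
    """Cosine similarity between one query vector and every archive vector."""
    q = l2_normalize(query)
    return [sum(a * b for a, b in zip(q, l2_normalize(item))) for item in archive]


def min_max(scores):
    """Rescale scores to [0, 1] so both modalities are on a comparable range.

    This normalization is an assumption of the sketch, not taken from the paper.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fused_ranking(image_query, text_query, archive, lam=0.5):
    """Rank archive items by a convex combination of the two similarities.

    lam = 0 uses only the image query, lam = 1 uses only the text query.
    """
    img_sim = min_max(cosine_scores(image_query, archive))
    txt_sim = min_max(cosine_scores(text_query, archive))
    fused = [(1 - lam) * i + lam * t for i, t in zip(img_sim, txt_sim)]
    return sorted(range(len(archive)), key=lambda k: fused[k], reverse=True)


# Toy 2-D embeddings standing in for real model outputs.
archive = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image_query = [1.0, 0.0]  # resembles archive item 0
text_query = [0.0, 1.0]   # resembles archive item 1

ranking = fused_ranking(image_query, text_query, archive, lam=0.5)
print(ranking)
```

In this toy example the item that partially matches both the image and the text query is ranked first, while setting `lam` to 0 or 1 reduces the ranking to a unimodal image-only or text-only query.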

Results

Task                             Dataset     Metric  Value  Model
Image Retrieval                  PatternCom  mAP     30.19  WeiCom (RemoteCLIP)
Image Retrieval                  PatternCom  mAP     24.83  WeiCom (CLIP)
Composed Image Retrieval (CoIR)  PatternCom  mAP     30.19  WeiCom (RemoteCLIP)
Composed Image Retrieval (CoIR)  PatternCom  mAP     24.83  WeiCom (CLIP)

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
DiffRhythm+: Controllable and Flexible Full-Length Song Generation with Preference Optimization (2025-07-17)
From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals (2025-07-17)
A Survey of Context Engineering for Large Language Models (2025-07-17)
MCoT-RE: Multi-Faceted Chain-of-Thought and Re-Ranking for Training-Free Zero-Shot Composed Image Retrieval (2025-07-17)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)