Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Adaptive Cross-Modal Few-Shot Learning

Chen Xing, Negar Rostamzadeh, Boris N. Oreshkin, Pedro O. Pinheiro

2019-02-19 | NeurIPS 2019 12
Few-Shot Learning, Meta-Learning, Image Classification, Few-Shot Image Classification, General Classification

Abstract

Metric-based meta-learning techniques have been applied successfully to few-shot classification problems. In this paper, we propose to leverage cross-modal information to enhance metric-based few-shot learning methods. Visual and semantic feature spaces have different structures by definition. For certain concepts, visual features might be richer and more discriminative than textual ones, while for others the inverse might be true. Moreover, when the support from visual information is limited in image classification, semantic representations (learned from unsupervised text corpora) can provide strong prior knowledge and context to help learning. Based on these two intuitions, we propose a mechanism that adaptively combines information from both modalities according to the new image categories to be learned. Through a series of experiments, we show that with this adaptive combination of the two modalities, our model outperforms current uni-modality few-shot learning methods and modality-alignment methods by a large margin on all benchmarks and few-shot scenarios tested. Experiments also show that our model can effectively adjust its focus between the two modalities. The improvement in performance is particularly large when the number of shots is very small.
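
The abstract describes the adaptive combination only in words. The following is a minimal sketch of one plausible reading, not the authors' implementation: each class prototype is taken to be a convex combination of a visual prototype (the mean of the support embeddings) and a semantic embedding of the class label, weighted by a per-class coefficient from a sigmoid gate. All function names, the linear gate, its dependence on the semantic embedding, and the toy vectors are assumptions made for illustration. The sketch also assumes the semantic embeddings have already been projected into the same space as the visual features.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def mean_vector(vectors: Sequence[Sequence[float]]) -> Vector:
    """Average one class's support embeddings into a visual prototype."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def adaptive_prototype(
    visual_proto: Sequence[float],
    semantic_emb: Sequence[float],
    gate_weights: Sequence[float],
    gate_bias: float = 0.0,
) -> Tuple[Vector, float]:
    """Mix the two modalities with a per-class coefficient lam in (0, 1).

    Assumption: the gate is a linear score of the semantic embedding
    passed through a sigmoid. lam near 1 favours the visual prototype,
    lam near 0 favours the semantic embedding.
    """
    score = sum(w * s for w, s in zip(gate_weights, semantic_emb)) + gate_bias
    lam = sigmoid(score)
    mixed = [lam * v + (1.0 - lam) * s for v, s in zip(visual_proto, semantic_emb)]
    return mixed, lam


def nearest_prototype(query: Sequence[float], prototypes: Dict[str, Vector]) -> str:
    """Metric-based classification: pick the class with the closest prototype."""
    def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))


# Toy 2-way episode with made-up 2-D embeddings (hypothetical data).
support = {"cat": [[0.0, 0.0], [2.0, 0.0]], "dog": [[4.0, 4.0]]}
semantic = {"cat": [1.0, 2.0], "dog": [4.0, 6.0]}
gate_weights = [0.0, 0.0]  # untrained gate: both modalities weighted equally

prototypes = {}
for label, embeddings in support.items():
    prototypes[label], _ = adaptive_prototype(
        mean_vector(embeddings), semantic[label], gate_weights
    )

print(nearest_prototype([1.2, 0.8], prototypes))
```

In a trained model the gate parameters would be learned, so that classes with few or uninformative images lean more heavily on the semantic embedding. That would be consistent with the abstract's observation that gains are largest when the number of shots is very small.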

Results

Task | Dataset | Metric | Value | Model
Image Classification | Mini-Imagenet 5-way (5-shot) | Accuracy | 78.1 | AM3-TADAM
Image Classification | Mini-Imagenet 5-way (1-shot) | Accuracy | 65.3 | AM3-TADAM
Image Classification | Mini-Imagenet 5-way (10-shot) | Accuracy | 81.57 | AM3-TADAM
Image Classification | Tiered ImageNet 5-way (1-shot) | Accuracy | 69.08 | AM3-TADAM
Image Classification | Tiered ImageNet 5-way (5-shot) | Accuracy | 82.58 | AM3-TADAM
Few-Shot Image Classification | Mini-Imagenet 5-way (5-shot) | Accuracy | 78.1 | AM3-TADAM
Few-Shot Image Classification | Mini-Imagenet 5-way (1-shot) | Accuracy | 65.3 | AM3-TADAM
Few-Shot Image Classification | Mini-Imagenet 5-way (10-shot) | Accuracy | 81.57 | AM3-TADAM
Few-Shot Image Classification | Tiered ImageNet 5-way (1-shot) | Accuracy | 69.08 | AM3-TADAM
Few-Shot Image Classification | Tiered ImageNet 5-way (5-shot) | Accuracy | 82.58 | AM3-TADAM

Related Papers

Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
Federated Learning for Commercial Image Sources (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
Are encoders able to learn landmarkers for warm-starting of Hyperparameter Optimization? (2025-07-16)
Imbalanced Regression Pipeline Recommendation (2025-07-16)