LMPT: Prompt Tuning with Class-Specific Embedding Loss for Long-tailed Multi-Label Visual Recognition

Peng Xia, Di Xu, Ming Hu, Lie Ju, ZongYuan Ge

2023-05-08 | Long-tail Learning

Abstract

Long-tailed multi-label visual recognition (LTML) is highly challenging due to label co-occurrence and imbalanced data distribution. In this work, we propose a unified framework for LTML, namely prompt tuning with class-specific embedding loss (LMPT), which captures the semantic feature interactions between categories by combining text and image modality data and improves performance on head and tail classes simultaneously. Specifically, LMPT introduces an embedding loss function with class-aware soft margin and re-weighting to learn class-specific contexts with the benefit of textual descriptions (captions), which helps establish semantic relationships between classes, especially between head and tail classes. Furthermore, to account for class imbalance, the distribution-balanced loss is adopted as the classification loss function to further improve performance on the tail classes without compromising head classes. Extensive experiments on the VOC-LT and COCO-LT datasets demonstrate that our method significantly surpasses previous state-of-the-art methods and zero-shot CLIP in LTML. Our code is publicly available at https://github.com/richard-peng-xia/LMPT.
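The abstract does not give the formula for the embedding loss, so the following is only an illustrative sketch of what a soft-margin loss with class-dependent margins and re-weighting could look like. Everything in it is an assumption rather than the authors' formulation: the function name `class_aware_margin_loss`, the softplus form, the rule that rarer classes get larger margins, the inverse-frequency weights, and the `base_margin` and `power` parameters. The official repository linked above is the reference for the real loss.

```python
import numpy as np

def class_aware_margin_loss(sim, targets, class_counts, base_margin=0.5, power=0.25):
    """Soft-margin embedding loss with per-class margins and weights (illustrative).

    sim:          (N, C) similarities between image and class-prompt embeddings
    targets:      (N, C) binary multi-label matrix (1 = class present)
    class_counts: (C,)   number of training images per class
    """
    sim = np.asarray(sim, dtype=float)
    targets = np.asarray(targets, dtype=float)
    counts = np.asarray(class_counts, dtype=float)

    # Assumption: the rarest class gets base_margin, more frequent classes get less.
    margins = base_margin * (counts.min() / counts) ** power

    # Assumption: inverse-frequency weights, normalized to have mean 1.
    weights = 1.0 / counts
    weights = weights / weights.mean()

    # +1 for positive labels, -1 for negative labels.
    signs = 2.0 * targets - 1.0

    # Softplus of (margin - signed similarity); logaddexp(0, x) = log(1 + exp(x)).
    per_element = np.logaddexp(0.0, margins - signs * sim)

    # Broadcast class weights over the batch and average.
    return float((weights * per_element).mean())


# Toy usage: two images, two classes (one head class, one tail class).
targets = np.array([[1, 0], [0, 1]])
counts = np.array([100, 5])
aligned = np.array([[0.9, -0.8], [-0.7, 0.6]])  # similarities agree with labels
loss_aligned = class_aware_margin_loss(aligned, targets, counts)
loss_flipped = class_aware_margin_loss(-aligned, targets, counts)
```

In this toy setup the loss is smaller when similarities agree with the labels, and errors on the tail class cost more than errors on the head class because of the larger margin and weight.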

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Image Classification | COCO-MLT | Average mAP | 66.19 | LMPT (ViT-B/16) |
| Image Classification | COCO-MLT | Average mAP | 58.97 | LMPT (ResNet-50) |
| Image Classification | VOC-MLT | Average mAP | 87.88 | LMPT (ViT-B/16) |
| Image Classification | VOC-MLT | Average mAP | 85.44 | LMPT (ResNet-50) |
| Few-Shot Image Classification | COCO-MLT | Average mAP | 66.19 | LMPT (ViT-B/16) |
| Few-Shot Image Classification | COCO-MLT | Average mAP | 58.97 | LMPT (ResNet-50) |
| Few-Shot Image Classification | VOC-MLT | Average mAP | 87.88 | LMPT (ViT-B/16) |
| Few-Shot Image Classification | VOC-MLT | Average mAP | 85.44 | LMPT (ResNet-50) |
| Generalized Few-Shot Classification | COCO-MLT | Average mAP | 66.19 | LMPT (ViT-B/16) |
| Generalized Few-Shot Classification | COCO-MLT | Average mAP | 58.97 | LMPT (ResNet-50) |
| Generalized Few-Shot Classification | VOC-MLT | Average mAP | 87.88 | LMPT (ViT-B/16) |
| Generalized Few-Shot Classification | VOC-MLT | Average mAP | 85.44 | LMPT (ResNet-50) |
| Long-tail Learning | COCO-MLT | Average mAP | 66.19 | LMPT (ViT-B/16) |
| Long-tail Learning | COCO-MLT | Average mAP | 58.97 | LMPT (ResNet-50) |
| Long-tail Learning | VOC-MLT | Average mAP | 87.88 | LMPT (ViT-B/16) |
| Long-tail Learning | VOC-MLT | Average mAP | 85.44 | LMPT (ResNet-50) |
| Generalized Few-Shot Learning | COCO-MLT | Average mAP | 66.19 | LMPT (ViT-B/16) |
| Generalized Few-Shot Learning | COCO-MLT | Average mAP | 58.97 | LMPT (ResNet-50) |
| Generalized Few-Shot Learning | VOC-MLT | Average mAP | 87.88 | LMPT (ViT-B/16) |
| Generalized Few-Shot Learning | VOC-MLT | Average mAP | 85.44 | LMPT (ResNet-50) |
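The page does not state the evaluation protocol behind the "Average mAP" values. As a rough guide, the sketch below shows the common definition of mean average precision for multi-label classification: average precision is computed per class from ranked scores, then averaged over classes. It assumes this is the quantity reported (on a 0 to 100 scale); the function names and toy data are hypothetical.

```python
import numpy as np

def average_precision(scores, labels):
    """Average precision for one class: mean of precision@k over the ranks of positives."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    if labels.sum() == 0:
        return float("nan")  # AP is undefined for a class with no positives
    order = np.argsort(-scores, kind="stable")  # rank samples by descending score
    hits = labels[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float((precision_at_k * hits).sum() / hits.sum())

def mean_average_precision(scores, labels):
    """Mean of per-class AP over the columns of (N, C) score and label matrices."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    aps = [average_precision(scores[:, c], labels[:, c]) for c in range(scores.shape[1])]
    return float(np.nanmean(aps))  # skip classes without positives


# Toy usage: three images, two classes.
scores = np.array([[0.9, 0.2],
                   [0.8, 0.7],
                   [0.1, 0.6]])
labels = np.array([[1, 0],
                   [0, 1],
                   [1, 1]])
ap_class0 = average_precision(scores[:, 0], labels[:, 0])  # (1/1 + 2/3) / 2
ap_class1 = average_precision(scores[:, 1], labels[:, 1])  # (1/1 + 2/2) / 2
map_percent = 100.0 * mean_average_precision(scores, labels)
```

Multiplying by 100 puts the result on the same scale as the table. Long-tailed benchmarks often also report mAP separately for head, medium, and tail classes, which this sketch does not cover.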

Related Papers

Mitigating Spurious Correlations with Causal Logit Perturbation (2025-05-21)
LIFT+: Lightweight Fine-Tuning for Long-Tail Learning (2025-04-17)
Improving Visual Prompt Tuning by Gaussian Neighborhood Minimization for Long-Tailed Visual Recognition (2024-10-28)
Learning from Neighbors: Category Extrapolation for Long-Tail Learning (2024-10-21)
Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition (2024-10-08)
AUCSeg: AUC-oriented Pixel-level Long-tail Semantic Segmentation (2024-09-30)
Representation Norm Amplification for Out-of-Distribution Detection in Long-Tail Learning (2024-08-20)
LTRL: Boosting Long-tail Recognition via Reflective Learning (2024-07-17)