Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Enhanced Long-Tailed Recognition with Contrastive CutMix Augmentation

Haolin Pan, Yong Guo, Mianjie Yu, Jian Chen

2024-07-06 | Image Classification | Long-tail Learning | Data Augmentation | Contrastive Learning

Abstract

Real-world data often follows a long-tailed distribution, where a few head classes account for most of the data and a large number of tail classes contain only very limited samples. In practice, deep models often generalize poorly on tail classes because of this imbalance. To tackle this, data augmentation has become an effective remedy, synthesizing new samples for tail classes. One popular approach is CutMix, which explicitly mixes images of tail classes with images of other classes and constructs the labels according to the ratio of the areas cropped from the two images. However, the area-based labels entirely ignore the inherent semantic information of the augmented samples, often leading to misleading training signals. To address this issue, we propose Contrastive CutMix (ConCutMix), which constructs augmented samples with semantically consistent labels to boost the performance of long-tailed recognition. Specifically, we compute the similarities between samples in the semantic space learned by contrastive learning, and use them to rectify the area-based labels. Experiments show that ConCutMix significantly improves the accuracy on tail classes as well as the overall performance. For example, based on ResNeXt-50, we improve the overall accuracy on ImageNet-LT by 3.0%, thanks to a significant improvement of 3.3% on tail classes. We highlight that the improvement also generalizes well to other benchmarks and models. Our code and pretrained models are available at https://github.com/PanHaulin/ConCutMix.
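
The abstract describes two steps: CutMix builds a label from the ratio of pasted area, and ConCutMix rectifies that label using similarities in a contrastively learned semantic space. The sketch below illustrates the idea only; it is not the authors' implementation. The abstract does not give the rectification formula, so the blending rule, the `alpha` and `temperature` parameters, and the use of per-class centers (`class_centers`) are all assumptions introduced here for illustration.

```python
import numpy as np


def rand_bbox(height, width, lam, rng):
    """Sample a box covering roughly (1 - lam) of the image, as in standard CutMix."""
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    cy, cx = int(rng.integers(height)), int(rng.integers(width))
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    return y1, y2, x1, x2


def cutmix(img_a, img_b, label_a, label_b, num_classes, lam, rng):
    """Paste a patch of img_b into img_a; label weights follow the pasted area."""
    h, w = img_a.shape[:2]
    y1, y2, x1, x2 = rand_bbox(h, w, lam, rng)
    mixed = img_a.copy()
    mixed[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]
    # Fraction of the image actually replaced (the box may be clipped at the border).
    area_b = (y2 - y1) * (x2 - x1) / (h * w)
    area_label = np.zeros(num_classes)
    area_label[label_a] += 1.0 - area_b
    area_label[label_b] += area_b
    return mixed, area_label


def rectify_label(area_label, embedding, class_centers, alpha=0.5, temperature=0.1):
    """Blend the area-based label with a similarity-based label.

    ASSUMPTION: this convex blend and the softmax over cosine similarities to
    class centers are a hypothetical stand-in for the paper's rectification rule.
    """
    eps = 1e-12
    emb = embedding / (np.linalg.norm(embedding) + eps)
    centers = class_centers / (np.linalg.norm(class_centers, axis=1, keepdims=True) + eps)
    logits = centers @ emb / temperature
    logits = logits - logits.max()  # numerical stability
    semantic_label = np.exp(logits) / np.exp(logits).sum()
    return (1.0 - alpha) * area_label + alpha * semantic_label
```

In this sketch, `embedding` stands for the mixed image's feature from a contrastively trained encoder. When that feature lies close to the center of the pasted class, the rectified label gives that class more weight than its pasted area alone would.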

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Classification | CIFAR-100-LT (ρ=50) | Error Rate | 42.6 | ConCutMix |
| Image Classification | CIFAR-10-LT (ρ=50) | Error Rate | 12 | ConCutMix |
| Image Classification | CIFAR-100-LT (ρ=100) | Error Rate | 46.84 | ConCutMix |
| Image Classification | CIFAR-10-LT (ρ=100) | Error Rate | 13.93 | ConCutMix |
| Few-Shot Image Classification | CIFAR-100-LT (ρ=50) | Error Rate | 42.6 | ConCutMix |
| Few-Shot Image Classification | CIFAR-10-LT (ρ=50) | Error Rate | 12 | ConCutMix |
| Few-Shot Image Classification | CIFAR-100-LT (ρ=100) | Error Rate | 46.84 | ConCutMix |
| Few-Shot Image Classification | CIFAR-10-LT (ρ=100) | Error Rate | 13.93 | ConCutMix |
| Generalized Few-Shot Classification | CIFAR-100-LT (ρ=50) | Error Rate | 42.6 | ConCutMix |
| Generalized Few-Shot Classification | CIFAR-10-LT (ρ=50) | Error Rate | 12 | ConCutMix |
| Generalized Few-Shot Classification | CIFAR-100-LT (ρ=100) | Error Rate | 46.84 | ConCutMix |
| Generalized Few-Shot Classification | CIFAR-10-LT (ρ=100) | Error Rate | 13.93 | ConCutMix |
| Long-tail Learning | CIFAR-100-LT (ρ=50) | Error Rate | 42.6 | ConCutMix |
| Long-tail Learning | CIFAR-10-LT (ρ=50) | Error Rate | 12 | ConCutMix |
| Long-tail Learning | CIFAR-100-LT (ρ=100) | Error Rate | 46.84 | ConCutMix |
| Long-tail Learning | CIFAR-10-LT (ρ=100) | Error Rate | 13.93 | ConCutMix |
| Generalized Few-Shot Learning | CIFAR-100-LT (ρ=50) | Error Rate | 42.6 | ConCutMix |
| Generalized Few-Shot Learning | CIFAR-10-LT (ρ=50) | Error Rate | 12 | ConCutMix |
| Generalized Few-Shot Learning | CIFAR-100-LT (ρ=100) | Error Rate | 46.84 | ConCutMix |
| Generalized Few-Shot Learning | CIFAR-10-LT (ρ=100) | Error Rate | 13.93 | ConCutMix |

Related Papers

Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
Federated Learning for Commercial Image Sources (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management (2025-07-17)
Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images (2025-07-17)
SemCSE: Semantic Contrastive Sentence Embeddings Using LLM-Generated Summaries For Scientific Abstracts (2025-07-17)