Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts

Jiang-Xin Shi, Tong Wei, Zhi Zhou, Jie-Jing Shao, Xin-Yan Han, Yu-Feng Li

2023-09-18 · Image Classification · Long-tail Learning · Long-tail learning with class descriptors · Fine-Grained Image Classification

Paper · PDF · Code (official)

Abstract

The fine-tuning paradigm for long-tail learning tasks has sparked significant interest since the emergence of foundation models. Nonetheless, how fine-tuning impacts performance in long-tail learning has not been explicitly quantified. In this paper, we show that heavy fine-tuning can lead to non-negligible performance deterioration on tail classes, and that lightweight fine-tuning is more effective. We attribute this to inconsistent class conditional distributions induced by heavy fine-tuning. Building on this observation, we develop LIFT, a low-complexity and accurate long-tail learning algorithm that enables fast prediction and compact models through adaptive lightweight fine-tuning. Experiments verify that, compared with state-of-the-art approaches, both the training time and the number of learned parameters are significantly reduced while predictive performance improves. The implementation code is available at https://github.com/shijxcs/LIFT.
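To make the abstract's central contrast concrete, here is a minimal sketch of the general idea of lightweight fine-tuning: freeze the pretrained backbone and train only a small classification head on its fixed features. This is an illustrative toy (random-projection "backbone", synthetic head/tail data), not the paper's LIFT implementation.

```python
import numpy as np

# Illustrative sketch of lightweight fine-tuning: the backbone is frozen and
# only a small linear head is trained. This is NOT the paper's LIFT method;
# the "backbone" here is a fixed random projection standing in for a
# pretrained encoder, and the data is a synthetic head/tail toy set.

rng = np.random.default_rng(0)

def frozen_backbone(x):
    # Stand-in for a frozen foundation-model encoder: a fixed projection
    # whose weights are never updated during fine-tuning.
    W = np.random.default_rng(42).normal(size=(x.shape[1], 64))
    return np.tanh(x @ W)

# Toy long-tailed data: class 0 is the "head" (90 samples), class 1 the "tail" (10).
x_head = rng.normal(loc=+1.0, size=(90, 16))
x_tail = rng.normal(loc=-1.0, size=(10, 16))
X = np.vstack([x_head, x_tail])
y = np.array([0] * 90 + [1] * 10)

feats = frozen_backbone(X)  # features are fixed; no backbone gradients flow

# Train only the lightweight head: logistic regression by gradient descent.
w = np.zeros(feats.shape[1])
b = 0.0
lr = 0.1
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))  # sigmoid probabilities
    grad = p - y                                # gradient of log loss w.r.t. logits
    w -= lr * feats.T @ grad / len(y)
    b -= lr * grad.mean()

pred = (feats @ w + b > 0).astype(int)
acc = (pred == y).mean()
print(f"trainable parameters (head only): {w.size + 1}")
print(f"toy training accuracy: {acc:.2f}")
```

Only 65 parameters are updated here, versus every backbone weight under heavy fine-tuning; the paper's point is that this kind of parameter-light adaptation can preserve the pretrained model's class conditionals rather than distort them.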

Results

The same ten results are reported identically under five task leaderboards: Image Classification, Few-Shot Image Classification, Generalized Few-Shot Classification, Long-tail Learning, and Generalized Few-Shot Learning.

| Dataset              | Metric         | Value | Model                                      |
| Places-LT            | Top-1 Accuracy | 53.7  | LIFT (ViT-L/14)                            |
| Places-LT            | Top-1 Accuracy | 52.2  | LIFT (ViT-B/16)                            |
| ImageNet-LT          | Top-1 Accuracy | 82.9  | LIFT (ViT-L/14)                            |
| ImageNet-LT          | Top-1 Accuracy | 78.3  | LIFT (ViT-B/16)                            |
| CIFAR-100-LT (ρ=10)  | Error Rate     | 8.7   | LIFT (ViT-B/16, ImageNet-21K pre-training) |
| CIFAR-100-LT (ρ=10)  | Error Rate     | 15.1  | LIFT (ViT-B/16, CLIP)                      |
| CIFAR-100-LT (ρ=50)  | Error Rate     | 9.8   | LIFT (ViT-B/16, ImageNet-21K pre-training) |
| CIFAR-100-LT (ρ=50)  | Error Rate     | 16.9  | LIFT (ViT-B/16, CLIP)                      |
| CIFAR-100-LT (ρ=100) | Error Rate     | 10.9  | LIFT (ViT-B/16, ImageNet-21K pre-training) |
| CIFAR-100-LT (ρ=100) | Error Rate     | 18.3  | LIFT (ViT-B/16, CLIP)                      |
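The ρ in "CIFAR-100-LT (ρ=100)" is the imbalance ratio, the ratio of the largest to the smallest per-class sample count. In the long-tail literature this is commonly realized with exponentially decaying class sizes; the sketch below follows that common convention (it is not taken from this paper's code, and the 500-sample head class is the standard CIFAR-100 per-class count).

```python
# Common construction of a long-tailed split with imbalance ratio rho:
# per-class counts decay exponentially so that rho = n_max / n_min.
# This is a convention from the long-tail literature, not this paper's code.

def long_tail_counts(n_max: int, num_classes: int, rho: float) -> list[int]:
    """Exponentially decaying per-class sample counts."""
    return [int(n_max * rho ** (-k / (num_classes - 1)))
            for k in range(num_classes)]

counts = long_tail_counts(n_max=500, num_classes=100, rho=100)
print(counts[0], counts[-1])          # → 500 5  (head vs. tail class size)
print(round(counts[0] / counts[-1]))  # → 100    (recovers rho)
```

So ρ=10, 50, and 100 in the table correspond to progressively more severe imbalance, with the tail class shrinking from 50 to 10 to 5 samples.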

Related Papers

- Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
- Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
- Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
- Federated Learning for Commercial Image Sources (2025-07-17)
- MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
- Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking (2025-07-15)
- Transferring Styles for Reduced Texture Bias and Improved Robustness in Semantic Segmentation Networks (2025-07-14)
- FedGSCA: Medical Federated Learning with Global Sample Selector and Client Adaptive Adjuster under Label Noise (2025-07-13)