Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation

Eyal Michaeli, Ohad Fried

2024-06-20
Tasks: Few-Shot Learning, Data Augmentation, Mitigating Contextual Bias, Fine-Grained Image Classification

Abstract

Fine-grained visual classification (FGVC) involves classifying closely related sub-classes. This task is difficult due to the subtle differences between classes and the high intra-class variance. Moreover, FGVC datasets are typically small and challenging to gather, thus highlighting a significant need for effective data augmentation. Recent advancements in text-to-image diffusion models offer new possibilities for augmenting classification datasets. While these models have been used to generate training data for classification tasks, their effectiveness in full-dataset training of FGVC models remains under-explored. Recent techniques that rely on Text2Image generation or Img2Img methods often struggle to generate images that accurately represent the class while modifying them to a degree that significantly increases the dataset's diversity. To address these challenges, we present SaSPA: Structure and Subject Preserving Augmentation. Contrary to recent methods, our method does not use real images as guidance, thereby increasing generation flexibility and promoting greater diversity. To ensure accurate class representation, we employ conditioning mechanisms, specifically by conditioning on image edges and subject representation. We conduct extensive experiments and benchmark SaSPA against both traditional and recent generative data augmentation methods. SaSPA consistently outperforms all established baselines across multiple settings, including full dataset training, contextual bias, and few-shot classification. Additionally, our results reveal interesting patterns in using synthetic data for FGVC models; for instance, we find a relationship between the amount of real data used and the optimal proportion of synthetic data. Code is available at https://github.com/EyalMichaeli/SaSPA-Aug.
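The abstract's remark about the proportion of synthetic data can be illustrated with a minimal sketch. It assumes synthetic images are mixed in by replacing each real sample with one of its synthetic counterparts with some fixed probability. The function name `mix_real_and_synthetic`, the parameter `alpha`, and the file names are hypothetical and are not taken from the SaSPA code base.

```python
import random


def mix_real_and_synthetic(real_items, synthetic_by_item, alpha, seed=0):
    """Build one epoch's sample list, swapping some real items for synthetic ones.

    alpha is a hypothetical replacement probability (an assumption of this
    sketch): each real item that has synthetic counterparts is replaced by
    one of them with probability alpha.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    rng = random.Random(seed)
    epoch = []
    for item in real_items:
        candidates = synthetic_by_item.get(item, [])
        # Items without any synthetic counterpart are always kept as-is.
        if candidates and rng.random() < alpha:
            epoch.append(rng.choice(candidates))
        else:
            epoch.append(item)
    return epoch


real = ["a.jpg", "b.jpg", "c.jpg"]
synthetic = {"a.jpg": ["a_s0.png"], "b.jpg": ["b_s0.png", "b_s1.png"]}
print(mix_real_and_synthetic(real, synthetic, alpha=0.5))
```

Under this assumption, a sweep over `alpha` for different amounts of real data would be one way to look for the relationship the abstract mentions.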

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Few-Shot Learning | Stanford Cars | 12-shot Accuracy | 88.8 | SaSPA + CAL |
| Few-Shot Learning | Stanford Cars | 16-shot Accuracy | 91 | SaSPA + CAL |
| Few-Shot Learning | Stanford Cars | 4-shot Accuracy | 66.7 | SaSPA + CAL |
| Few-Shot Learning | Stanford Cars | 8-shot Accuracy | 82.6 | SaSPA + CAL |
| Few-Shot Learning | FGVC Aircraft | 12-shot Accuracy | 75.4 | SaSPA + CAL |
| Few-Shot Learning | FGVC Aircraft | 16-shot Accuracy | 78.9 | SaSPA + CAL |
| Few-Shot Learning | FGVC Aircraft | 4-shot Accuracy | 52.2 | SaSPA + CAL |
| Few-Shot Learning | FGVC Aircraft | 8-shot Accuracy | 67.2 | SaSPA + CAL |
| Few-Shot Learning | FGVC Aircraft | Harmonic mean | 52.2 | SaSPA + CAL |
| Few-Shot Learning | DTD | 12-shot Accuracy | 58.1 | SaSPA + CAL |
| Few-Shot Learning | DTD | 16-shot Accuracy | 60.2 | SaSPA + CAL |
| Few-Shot Learning | DTD | 4-shot Accuracy | 48.3 | SaSPA + CAL |
| Few-Shot Learning | DTD | 8-shot Accuracy | 54.8 | SaSPA + CAL |
| Image Classification | FGVC Aircraft | Accuracy | 94.5 | SaSPA + CAL |
| Image Classification | Stanford Cars | Accuracy | 95.72 | SaSPA + CAL |
| Fine-Grained Image Classification | FGVC Aircraft | Accuracy | 94.5 | SaSPA + CAL |
| Fine-Grained Image Classification | Stanford Cars | Accuracy | 95.72 | SaSPA + CAL |
| Meta-Learning | Stanford Cars | 12-shot Accuracy | 88.8 | SaSPA + CAL |
| Meta-Learning | Stanford Cars | 16-shot Accuracy | 91 | SaSPA + CAL |
| Meta-Learning | Stanford Cars | 4-shot Accuracy | 66.7 | SaSPA + CAL |
| Meta-Learning | Stanford Cars | 8-shot Accuracy | 82.6 | SaSPA + CAL |
| Meta-Learning | FGVC Aircraft | 12-shot Accuracy | 75.4 | SaSPA + CAL |
| Meta-Learning | FGVC Aircraft | 16-shot Accuracy | 78.9 | SaSPA + CAL |
| Meta-Learning | FGVC Aircraft | 4-shot Accuracy | 52.2 | SaSPA + CAL |
| Meta-Learning | FGVC Aircraft | 8-shot Accuracy | 67.2 | SaSPA + CAL |
| Meta-Learning | FGVC Aircraft | Harmonic mean | 52.2 | SaSPA + CAL |
| Meta-Learning | DTD | 12-shot Accuracy | 58.1 | SaSPA + CAL |
| Meta-Learning | DTD | 16-shot Accuracy | 60.2 | SaSPA + CAL |
| Meta-Learning | DTD | 4-shot Accuracy | 48.3 | SaSPA + CAL |
| Meta-Learning | DTD | 8-shot Accuracy | 54.8 | SaSPA + CAL |
| Classification | FGVC Aircraft | OOD Accuracy (%) | 41.5 | CAL + SaSPA |
| Classification | FGVC Aircraft | Top-1 Accuracy (%) | 73 | CAL + SaSPA |
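
The k-shot metrics above refer to training with k labeled images per class. This page does not describe how the few-shot subsets were drawn, so the following is only a minimal sketch that assumes uniform random sampling without replacement within each class. The helper name `sample_k_shot` and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def sample_k_shot(labeled_items, k, seed=0):
    """Return a subset with at most k items per class.

    labeled_items is an iterable of (item, label) pairs. Sampling is uniform
    without replacement inside each class (an assumption of this sketch).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in labeled_items:
        by_class[label].append(item)
    subset = []
    # Iterate over classes in sorted order so a given seed is reproducible.
    for label in sorted(by_class):
        items = by_class[label]
        chosen = rng.sample(items, min(k, len(items)))
        subset.extend((item, label) for item in chosen)
    return subset


# Toy dataset: two classes with six images each, reduced to a 4-shot subset.
data = [(f"img_{c}_{i}.jpg", c) for c in ("sedan", "coupe") for i in range(6)]
four_shot = sample_k_shot(data, k=4)
print(len(four_shot))
```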

Related Papers

GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)
Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management (2025-07-17)
Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
Data Augmentation in Time Series Forecasting through Inverted Framework (2025-07-15)
Iceberg: Enhancing HLS Modeling with Synthetic Data (2025-07-14)
AI-Enhanced Pediatric Pneumonia Detection: A CNN-Based Approach Using Data Augmentation and Generative Adversarial Networks (GANs) (2025-07-13)
FreeAudio: Training-Free Timing Planning for Controllable Long-Form Text-to-Audio Generation (2025-07-11)