Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation

Eyal Michaeli, Ohad Fried

2024-06-20

Few-Shot Learning, Data Augmentation, Mitigating Contextual Bias, Fine-Grained Image Classification

Abstract

Fine-grained visual classification (FGVC) involves classifying closely related sub-classes. This task is difficult due to the subtle differences between classes and the high intra-class variance. Moreover, FGVC datasets are typically small and challenging to gather, thus highlighting a significant need for effective data augmentation. Recent advancements in text-to-image diffusion models offer new possibilities for augmenting classification datasets. While these models have been used to generate training data for classification tasks, their effectiveness in full-dataset training of FGVC models remains under-explored. Recent techniques that rely on Text2Image generation or Img2Img methods often struggle to generate images that accurately represent the class while modifying them to a degree that significantly increases the dataset's diversity. To address these challenges, we present SaSPA: Structure and Subject Preserving Augmentation. Contrary to recent methods, our method does not use real images as guidance, thereby increasing generation flexibility and promoting greater diversity. To ensure accurate class representation, we employ conditioning mechanisms, specifically by conditioning on image edges and subject representation. We conduct extensive experiments and benchmark SaSPA against both traditional and recent generative data augmentation methods. SaSPA consistently outperforms all established baselines across multiple settings, including full dataset training, contextual bias, and few-shot classification. Additionally, our results reveal interesting patterns in using synthetic data for FGVC models; for instance, we find a relationship between the amount of real data used and the optimal proportion of synthetic data. Code is available at https://github.com/EyalMichaeli/SaSPA-Aug.
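
The abstract's remark about the "optimal proportion of synthetic data" can be made concrete with a small sketch. The code below is not taken from the SaSPA repository; it is a minimal illustration, under our own assumptions, of one common way to mix real and generated samples at training time: each time a real sample is requested, it is swapped for one of its synthetic variants with a fixed probability. The class name `MixedAugmentationDataset`, the parameter `synthetic_ratio`, and the file names are all hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


class MixedAugmentationDataset:
    """Hypothetical wrapper: serves real samples and, with probability
    `synthetic_ratio`, swaps in a synthetic variant of the same sample.
    This is an illustrative sketch, not the authors' implementation."""

    def __init__(
        self,
        real: Sequence[Tuple[str, int]],   # (image path, class label) pairs
        synthetic: Dict[int, List[str]],   # real index -> synthetic image paths
        synthetic_ratio: float,            # assumed knob: chance of using a synthetic image
        seed: int = 0,
    ) -> None:
        if not 0.0 <= synthetic_ratio <= 1.0:
            raise ValueError("synthetic_ratio must be in [0, 1]")
        self.real = real
        self.synthetic = synthetic
        self.synthetic_ratio = synthetic_ratio
        self.rng = random.Random(seed)

    def __len__(self) -> int:
        # The dataset size stays equal to the number of real samples.
        return len(self.real)

    def __getitem__(self, index: int) -> Tuple[str, int]:
        path, label = self.real[index]
        variants = self.synthetic.get(index, [])
        # Swap only when a synthetic variant exists for this sample;
        # the label always comes from the real sample.
        if variants and self.rng.random() < self.synthetic_ratio:
            path = self.rng.choice(variants)
        return path, label


# Usage with made-up file names: sample 0 has two synthetic variants, sample 1 has none.
real_samples = [("car_0.jpg", 0), ("car_1.jpg", 1)]
synthetic_samples = {0: ["car_0_gen_a.jpg", "car_0_gen_b.jpg"]}
dataset = MixedAugmentationDataset(real_samples, synthetic_samples, synthetic_ratio=0.5)
sample_path, sample_label = dataset[0]
```

Under this scheme, `synthetic_ratio` is the quantity one would sweep to study how the best proportion of synthetic data changes with the amount of real data available.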

Results

Task	Dataset	Metric	Value	Model
Few-Shot Learning	Stanford Cars	12-shot Accuracy	88.8	SaSPA + CAL
Few-Shot Learning	Stanford Cars	16-shot Accuracy	91	SaSPA + CAL
Few-Shot Learning	Stanford Cars	4-shot Accuracy	66.7	SaSPA + CAL
Few-Shot Learning	Stanford Cars	8-shot Accuracy	82.6	SaSPA + CAL
Few-Shot Learning	FGVC Aircraft	12-shot Accuracy	75.4	SaSPA + CAL
Few-Shot Learning	FGVC Aircraft	16-shot Accuracy	78.9	SaSPA + CAL
Few-Shot Learning	FGVC Aircraft	4-shot Accuracy	52.2	SaSPA + CAL
Few-Shot Learning	FGVC Aircraft	8-shot Accuracy	67.2	SaSPA + CAL
Few-Shot Learning	FGVC Aircraft	Harmonic mean	52.2	SaSPA + CAL
Few-Shot Learning	DTD	12-shot Accuracy	58.1	SaSPA + CAL
Few-Shot Learning	DTD	16-shot Accuracy	60.2	SaSPA + CAL
Few-Shot Learning	DTD	4-shot Accuracy	48.3	SaSPA + CAL
Few-Shot Learning	DTD	8-shot Accuracy	54.8	SaSPA + CAL
Image Classification	FGVC Aircraft	Accuracy	94.5	SaSPA + CAL
Image Classification	Stanford Cars	Accuracy	95.72	SaSPA + CAL
Fine-Grained Image Classification	FGVC Aircraft	Accuracy	94.5	SaSPA + CAL
Fine-Grained Image Classification	Stanford Cars	Accuracy	95.72	SaSPA + CAL
Meta-Learning	Stanford Cars	12-shot Accuracy	88.8	SaSPA + CAL
Meta-Learning	Stanford Cars	16-shot Accuracy	91	SaSPA + CAL
Meta-Learning	Stanford Cars	4-shot Accuracy	66.7	SaSPA + CAL
Meta-Learning	Stanford Cars	8-shot Accuracy	82.6	SaSPA + CAL
Meta-Learning	FGVC Aircraft	12-shot Accuracy	75.4	SaSPA + CAL
Meta-Learning	FGVC Aircraft	16-shot Accuracy	78.9	SaSPA + CAL
Meta-Learning	FGVC Aircraft	4-shot Accuracy	52.2	SaSPA + CAL
Meta-Learning	FGVC Aircraft	8-shot Accuracy	67.2	SaSPA + CAL
Meta-Learning	FGVC Aircraft	Harmonic mean	52.2	SaSPA + CAL
Meta-Learning	DTD	12-shot Accuracy	58.1	SaSPA + CAL
Meta-Learning	DTD	16-shot Accuracy	60.2	SaSPA + CAL
Meta-Learning	DTD	4-shot Accuracy	48.3	SaSPA + CAL
Meta-Learning	DTD	8-shot Accuracy	54.8	SaSPA + CAL
Classification	FGVC Aircraft	OOD Accuracy (%)	41.5	CAL + SaSPA
Classification	FGVC Aircraft	Top-1 Accuracy (%)	73	CAL + SaSPA
