Evaluation of Output Embeddings for Fine-Grained Image Classification

Zeynep Akata, Scott Reed, Daniel Walter, Honglak Lee, Bernt Schiele

2014-09-30, CVPR 2015

Tasks: Image Classification, Few-Shot Image Classification, Zero-Shot Action Recognition, General Classification, Classification, Zero-Shot Learning, Fine-Grained Image Classification

Abstract

Image classification has advanced significantly in recent years with the availability of large-scale image sets. However, fine-grained classification remains a major challenge due to the annotation cost of large numbers of fine-grained categories. This project shows that compelling classification performance can be achieved on such categories even without labeled training data. Given image and class embeddings, we learn a compatibility function such that matching embeddings are assigned a higher score than mismatching ones; zero-shot classification of an image proceeds by finding the label yielding the highest joint compatibility score. We use state-of-the-art image features and focus on different supervised attributes and unsupervised output embeddings either derived from hierarchies or learned from unlabeled text corpora. We establish a substantially improved state-of-the-art on the Animals with Attributes and Caltech-UCSD Birds datasets. Most encouragingly, we demonstrate that purely unsupervised output embeddings (learned from Wikipedia and improved with fine-grained text) achieve compelling results, even outperforming the previous supervised state-of-the-art. By combining different output embeddings, we further improve results.
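
The zero-shot prediction rule described in the abstract (score an image against every class embedding, then pick the highest-scoring label) can be illustrated with a minimal sketch. The abstract does not specify the form of the compatibility function, so the bilinear form below, along with the names `compatibility_scores`, `zero_shot_predict`, `W`, and the toy vectors, are assumptions made for illustration only. The matrix `W` would normally be learned from seen classes; here it is set to the identity purely so the example runs.

```python
import numpy as np


def compatibility_scores(image_emb, W, class_embs):
    """Score one image embedding against every class embedding.

    Assumed bilinear form: score(x, y) = image_emb^T @ W @ class_emb_y.
    image_emb:  shape (d_img,)
    W:          shape (d_img, d_cls), hypothetical learned matrix
    class_embs: shape (n_classes, d_cls), one output embedding per class
    """
    return class_embs @ (W.T @ image_emb)


def zero_shot_predict(image_emb, W, class_embs):
    """Return the index of the class with the highest compatibility score."""
    return int(np.argmax(compatibility_scores(image_emb, W, class_embs)))


# Toy data (not from the paper): three unseen classes, 3-dimensional embeddings.
W = np.eye(3)  # placeholder for a learned compatibility matrix
class_embs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
image_emb = np.array([0.1, 0.9, 0.2])

scores = compatibility_scores(image_emb, W, class_embs)
predicted = zero_shot_predict(image_emb, W, class_embs)
```

Because the class embeddings come from side information (attributes, hierarchies, or text) rather than labeled images, the same scoring step applies to classes that had no training images.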

Results

Task	Dataset	Metric	Value	Model
Image Classification	CUB 200 50-way (0-shot)	Accuracy	50.1	SJE Akata et al. (2015)
Few-Shot Image Classification	CUB 200 50-way (0-shot)	Accuracy	50.1	SJE Akata et al. (2015)
Zero-Shot Action Recognition	UCF101	Top-1 Accuracy	12	SJE(Attribute)
Zero-Shot Action Recognition	UCF101	Top-1 Accuracy	9.9	SJE(Word Embedding)
Zero-Shot Action Recognition	Kinetics	Top-1 Accuracy	22.3	SJE(Word Embedding)
Zero-Shot Action Recognition	Kinetics	Top-5 Accuracy	48.2	SJE(Word Embedding)
Zero-Shot Action Recognition	HMDB51	Top-1 Accuracy	13.3	SJE(Word Embedding)
Zero-Shot Action Recognition	Olympics	Top-1 Accuracy	47.5	SJE(Attribute)
Zero-Shot Action Recognition	Olympics	Top-1 Accuracy	28.6	SJE(Word Embedding)
