Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Evaluation of Output Embeddings for Fine-Grained Image Classification

Zeynep Akata, Scott Reed, Daniel Walter, Honglak Lee, Bernt Schiele

2014-09-30 · CVPR 2015

Tasks: Image Classification, Few-Shot Image Classification, Zero-Shot Action Recognition, General Classification, Classification, Zero-Shot Learning, Fine-Grained Image Classification

Abstract

Image classification has advanced significantly in recent years with the availability of large-scale image sets. However, fine-grained classification remains a major challenge due to the annotation cost of large numbers of fine-grained categories. This work shows that compelling classification performance can be achieved on such categories even without labeled training data. Given image and class embeddings, we learn a compatibility function such that matching embeddings are assigned a higher score than mismatching ones; zero-shot classification of an image proceeds by finding the label yielding the highest joint compatibility score. We use state-of-the-art image features and focus on different supervised attributes and unsupervised output embeddings either derived from hierarchies or learned from unlabeled text corpora. We establish a substantially improved state-of-the-art on the Animals with Attributes and Caltech-UCSD Birds datasets. Most encouragingly, we demonstrate that purely unsupervised output embeddings (learned from Wikipedia and improved with fine-grained text) achieve compelling results, even outperforming the previous supervised state-of-the-art. By combining different output embeddings, we further improve results.
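The scoring and prediction steps described in the abstract can be illustrated with a minimal sketch. It assumes a bilinear compatibility function F(x, y) = θ(x)ᵀ W φ(y), a max-margin ranking loss, and a weighted sum for combining output embeddings; the abstract does not spell out any of these forms, so treat them as illustrative assumptions. All names (`compatibility`, `predict`, `ranking_loss`, `combined_scores`, `margin`, `alphas`) are hypothetical and not taken from the authors' code.

```python
import numpy as np


def compatibility(theta_x, W, phi_y):
    """Score one image embedding against every class embedding.

    Assumed bilinear form: F(x, y) = theta(x)^T W phi(y).
    theta_x: (d_img,)            image embedding
    W:       (d_img, d_cls)      learned compatibility matrix
    phi_y:   (n_classes, d_cls)  one output embedding per class
    Returns an array of shape (n_classes,).
    """
    return phi_y @ (W.T @ theta_x)


def predict(theta_x, W, phi_y):
    """Zero-shot prediction: index of the class with the highest score."""
    return int(np.argmax(compatibility(theta_x, W, phi_y)))


def ranking_loss(theta_x, W, phi_y, true_idx, margin=1.0):
    """Hinge-style ranking loss (an assumed objective).

    The loss is zero only when the matching class outscores every
    mismatching class by at least `margin`.
    """
    scores = compatibility(theta_x, W, phi_y)
    violations = margin + scores - scores[true_idx]
    violations[true_idx] = 0.0  # the true class cannot violate itself
    return float(max(0.0, violations.max()))


def combined_scores(theta_x, Ws, phis, alphas):
    """Combine several output embeddings by a weighted sum of their scores.

    Ws, phis, alphas are parallel lists: one compatibility matrix, one
    class-embedding matrix, and one (hypothetical) weight per embedding type.
    """
    return sum(a * compatibility(theta_x, W, phi)
               for a, W, phi in zip(alphas, Ws, phis))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta_x = rng.normal(size=8)       # stand-in for an image feature
    phi_y = rng.normal(size=(5, 4))    # stand-in embeddings for 5 unseen classes
    W = rng.normal(size=(8, 4))        # would normally be learned on seen classes
    print("predicted class index:", predict(theta_x, W, phi_y))
```

At test time only the class embeddings change: `phi_y` holds embeddings of classes never seen during training, which is what lets the same `W` be used for zero-shot prediction.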

Results

Task | Dataset | Metric | Value | Model
Image Classification | CUB 200 50-way (0-shot) | Accuracy | 50.1 | SJE Akata et al. (2015)
Few-Shot Image Classification | CUB 200 50-way (0-shot) | Accuracy | 50.1 | SJE Akata et al. (2015)
Zero-Shot Action Recognition | UCF101 | Top-1 Accuracy | 12 | SJE(Attribute)
Zero-Shot Action Recognition | UCF101 | Top-1 Accuracy | 9.9 | SJE(Word Embedding)
Zero-Shot Action Recognition | Kinetics | Top-1 Accuracy | 22.3 | SJE(Word Embedding)
Zero-Shot Action Recognition | Kinetics | Top-5 Accuracy | 48.2 | SJE(Word Embedding)
Zero-Shot Action Recognition | HMDB51 | Top-1 Accuracy | 13.3 | SJE(Word Embedding)
Zero-Shot Action Recognition | Olympics | Top-1 Accuracy | 47.5 | SJE(Attribute)
Zero-Shot Action Recognition | Olympics | Top-1 Accuracy | 28.6 | SJE(Word Embedding)
