Label-Embedding for Image Classification

Zeynep Akata, Florent Perronnin, Zaid Harchaoui, Cordelia Schmid

2015-03-30

Image Classification, Attribute, Zero-Shot Action Recognition, General Classification, Classification, Zero-Shot Learning, Multi-label zero-shot learning


Abstract

Attributes act as intermediate representations that enable parameter sharing between classes, a must when training data is scarce. We propose to view attribute-based image classification as a label-embedding problem: each class is embedded in the space of attribute vectors. We introduce a function that measures the compatibility between an image and a label embedding. The parameters of this function are learned on a training set of labeled samples to ensure that, given an image, the correct classes rank higher than the incorrect ones. Results on the Animals With Attributes and Caltech-UCSD-Birds datasets show that the proposed framework outperforms the standard Direct Attribute Prediction baseline in a zero-shot learning scenario. Label embedding enjoys a built-in ability to leverage alternative sources of information, such as class hierarchies or textual descriptions, instead of or in addition to attributes. Moreover, label embedding encompasses the whole range of learning settings, from zero-shot learning to regular learning with a large number of labeled examples.
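The compatibility function and ranking criterion described in the abstract can be illustrated with a minimal sketch. It assumes a bilinear score theta(x)^T W phi(y) between image features theta(x) and a class attribute vector phi(y), and uses a plain unweighted pairwise hinge as a stand-in for the ranking objective; the abstract does not specify either detail. The names `compatibility`, `rank_classes`, and `ranking_hinge_loss`, and all toy values, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def compatibility(theta_x: Vector, W: Matrix, phi_y: Vector) -> float:
    """Assumed bilinear score theta(x)^T W phi(y) for one image and one class."""
    return sum(
        theta_x[i] * W[i][j] * phi_y[j]
        for i in range(len(theta_x))
        for j in range(len(phi_y))
    )


def rank_classes(theta_x: Vector, W: Matrix,
                 class_embeddings: Dict[str, Vector]) -> List[str]:
    """Sort class names from most to least compatible with the image."""
    return sorted(
        class_embeddings,
        key=lambda name: compatibility(theta_x, W, class_embeddings[name]),
        reverse=True,
    )


def ranking_hinge_loss(theta_x: Vector, W: Matrix,
                       class_embeddings: Dict[str, Vector],
                       true_class: str, margin: float = 1.0) -> float:
    """Unweighted pairwise hinge: penalize every incorrect class that scores
    within `margin` of the correct class (an illustrative simplification)."""
    true_score = compatibility(theta_x, W, class_embeddings[true_class])
    return sum(
        max(0.0, margin + compatibility(theta_x, W, emb) - true_score)
        for name, emb in class_embeddings.items()
        if name != true_class
    )


# Toy data (hypothetical): 3-d image features, 2 attributes (striped, aquatic).
theta_x = [1.0, 0.0, 2.0]
W = [[1.0, 0.0],
     [0.0, 1.0],
     [0.5, -0.5]]
classes = {
    "zebra": [1.0, 0.0],
    "dolphin": [0.0, 1.0],
    "seal": [0.5, 1.0],  # could be a class with no training images
}

ranking = rank_classes(theta_x, W, classes)
loss_if_zebra = ranking_hinge_loss(theta_x, W, classes, "zebra")
loss_if_dolphin = ranking_hinge_loss(theta_x, W, classes, "dolphin")
```

Because a class enters the score only through its attribute vector, a class such as "seal" can be ranked without any training images of its own, provided its attribute vector is known. This is what makes the zero-shot setting possible in this formulation.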

Results

Task	Dataset	Metric	Value	Model
Zero-Shot Learning	Open Images V4	MAP	40.5	LabelEM
Zero-Shot Action Recognition	Kinetics	Top-1 Accuracy	23.4	ALE
Zero-Shot Action Recognition	Kinetics	Top-5 Accuracy	50.3	ALE

Related Papers

Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations, 2025-07-18
Adversarial attacks to image classification systems using evolutionary algorithms, 2025-07-17
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy, 2025-07-17
Federated Learning for Commercial Image Sources, 2025-07-17
MUPAX: Multidimensional Problem Agnostic eXplainable AI, 2025-07-17
GLAD: Generalizable Tuning for Vision-Language Models, 2025-07-17
MGFFD-VLM: Multi-Granularity Prompt Learning for Face Forgery Detection with VLM, 2025-07-16
Non-Adaptive Adversarial Face Generation, 2025-07-16