Unsupervised Representation Learning by Sorting Sequences

Hsin-Ying Lee, Jia-Bin Huang, Maneesh Singh, Ming-Hsuan Yang

2017-08-03ICCV 2017 10Image Classification Representation Learning Action Recognition Temporal Action Localization object-detection Object Detection Self-Supervised Action Recognition

Paper PDF Code(official)

Abstract

We present an unsupervised representation learning approach using videos without semantic labels. We leverage the temporal coherence as a supervisory signal by formulating representation learning as a sequence sorting task. We take temporally shuffled frames (i.e., in non-chronological order) as inputs and train a convolutional neural network to sort the shuffled sequences. Similar to comparison-based sorting algorithms, we propose to extract features from all frame pairs and aggregate them to predict the correct order. As sorting shuffled image sequence requires an understanding of the statistical temporal structure of images, training with such a proxy task allows us to learn rich and generalizable visual representation. We validate the effectiveness of the learned representation using our method as pre-training on high-level recognition problems. The experimental results show that our method compares favorably against state-of-the-art methods on action recognition, image classification and object detection tasks.

Results

Task	Dataset	Metric	Value	Model
Activity Recognition	HMDB51	Top-1 Accuracy	23.8	OPN (VGG-M-2048)
Action Recognition	HMDB51	Top-1 Accuracy	23.8	OPN (VGG-M-2048)

Related Papers

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper2025-07-20 Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations2025-07-18 Adversarial attacks to image classification systems using evolutionary algorithms2025-07-17 Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy2025-07-17 Federated Learning for Commercial Image Sources2025-07-17 MUPAX: Multidimensional Problem Agnostic eXplainable AI2025-07-17 Spectral Bellman Method: Unifying Representation and Exploration in RL2025-07-17 Boosting Team Modeling through Tempo-Relational Representation Learning2025-07-17