TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Few-Shot Video Classification via Temporal Alignment

Few-Shot Video Classification via Temporal Alignment

Kaidi Cao, Jingwei Ji, Zhangjie Cao, Chien-Yi Chang, Juan Carlos Niebles

2019-06-27CVPR 2020 6Few-Shot LearningFew Shot Action RecognitionVideo ClassificationGeneral ClassificationAction RecognitionClassification
PaperPDF

Abstract

There is a growing interest in learning a model which could recognize novel classes with only a few labeled examples. In this paper, we propose Temporal Alignment Module (TAM), a novel few-shot learning framework that can learn to classify a previous unseen video. While most previous works neglect long-term temporal ordering information, our proposed model explicitly leverages the temporal ordering information in video data through temporal alignment. This leads to strong data-efficiency for few-shot learning. In concrete, TAM calculates the distance value of query video with respect to novel class proxies by averaging the per frame distances along its alignment path. We introduce continuous relaxation to TAM so the model can be learned in an end-to-end fashion to directly optimize the few-shot learning objective. We evaluate TAM on two challenging real-world datasets, Kinetics and Something-Something-V2, and show that our model leads to significant improvement of few-shot video classification over a wide range of competitive baselines.

Results

TaskDatasetMetricValueModel
Activity RecognitionSomething-Something V2Top-1 Accuracy52.3TAM (5-shot)
Activity RecognitionKinetics-100Accuracy85.8OTAM
Activity RecognitionSomething-Something-1001:1 Accuracy52.3OTAM
Action RecognitionSomething-Something V2Top-1 Accuracy52.3TAM (5-shot)
Action RecognitionKinetics-100Accuracy85.8OTAM
Action RecognitionSomething-Something-1001:1 Accuracy52.3OTAM

Related Papers

GLAD: Generalizable Tuning for Vision-Language Models2025-07-17A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains2025-07-17Adversarial attacks to image classification systems using evolutionary algorithms2025-07-17Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation2025-07-16Safeguarding Federated Learning-based Road Condition Classification2025-07-16AI-Enhanced Pediatric Pneumonia Detection: A CNN-Based Approach Using Data Augmentation and Generative Adversarial Networks (GANs)2025-07-13Doodle Your Keypoints: Sketch-Based Few-Shot Keypoint Detection2025-07-10An Enhanced Privacy-preserving Federated Few-shot Learning Framework for Respiratory Disease Diagnosis2025-07-10