
Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points

Fabien Baradel, Christian Wolf, Julien Mille, Graham W. Taylor

2018-02-22 · CVPR 2018

Tasks: Skeleton Based Action Recognition · Human Activity Recognition · Action Recognition · Temporal Action Localization · Activity Recognition · Activity Prediction

Links: Paper · PDF · Code (official)

Abstract

We propose a method for human activity recognition from RGB data that does not rely on any pose information at test time and does not explicitly compute pose information internally. Instead, a visual attention module learns to predict glimpse sequences in each frame. These glimpses correspond to interest points in the scene that are relevant to the classified activities. No spatial coherence is forced on the glimpse locations, which gives the module the liberty to explore different points at each frame and to better optimize the process of scrutinizing visual information. Tracking and sequentially integrating this kind of unstructured data is a challenge, which we address by separating the set of glimpses from a set of recurrent tracking/recognition workers. These workers receive glimpses and jointly perform motion tracking and activity prediction. The glimpses are soft-assigned to the workers, optimizing the coherence of the assignments in space, time and feature space using an external memory module. No hard decisions are taken, i.e. each glimpse point is assigned to all existing workers, albeit with different importance. Our method outperforms the state of the art on the largest human activity recognition dataset available to date, NTU RGB+D, and on the smaller Northwestern-UCLA Multiview Action 3D dataset. Our code is publicly available at https://github.com/fabienbaradel/glimpse_clouds.
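The abstract describes the soft-assignment step only in prose. Below is a minimal, hypothetical PyTorch sketch of that idea: each glimpse is scored against every recurrent worker, and each worker updates its state on a weighted sum of glimpses, so no hard routing decision is ever taken. The shapes, the dot-product scoring, and the GRU workers are illustrative assumptions, not the authors' exact implementation; the external memory module and the attention module that predicts glimpse locations are omitted.

```python
# Sketch of soft-assigning glimpse features to recurrent "workers",
# loosely following the mechanism described in the abstract.
# All dimensions and the scoring function are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlimpseWorkers(nn.Module):
    def __init__(self, feat_dim=256, hidden_dim=256, num_workers=3):
        super().__init__()
        self.workers = nn.ModuleList(
            nn.GRUCell(feat_dim, hidden_dim) for _ in range(num_workers)
        )
        # Project worker states into glimpse space so a dot product
        # can score glimpse/worker affinity.
        self.key = nn.Linear(hidden_dim, feat_dim)

    def forward(self, glimpses, states):
        # glimpses: (batch, num_glimpses, feat_dim) for one frame
        # states:   list of num_workers tensors, each (batch, hidden_dim)
        keys = torch.stack([self.key(h) for h in states], dim=1)  # (B, W, D)
        scores = torch.einsum("bgd,bwd->bgw", glimpses, keys)     # (B, G, W)
        # Soft assignment: every glimpse contributes to every worker,
        # with weights summing to 1 over workers (no hard decisions).
        weights = F.softmax(scores, dim=2)
        new_states = []
        for w, cell in enumerate(self.workers):
            # Weighted sum of the glimpses routed to worker w.
            inp = (weights[:, :, w:w + 1] * glimpses).sum(dim=1)  # (B, D)
            new_states.append(cell(inp, states[w]))
        return new_states, weights

# Usage: 2 clips, 4 glimpses per frame, 3 workers.
model = GlimpseWorkers()
g = torch.randn(2, 4, 256)
h = [torch.zeros(2, 256) for _ in range(3)]
h, w = model(g, h)  # w: (2, 4, 3) soft glimpse-to-worker assignments
```

Running this per frame accumulates each worker's hidden state over time; a classifier on the worker states would then produce the activity prediction.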

Results

Task                 | Dataset   | Metric        | Value | Model
Activity Recognition | NTU RGB+D | Accuracy (CS) | 86.6  | Glimpse Clouds (RGB only)
Activity Recognition | NTU RGB+D | Accuracy (CV) | 93.2  | Glimpse Clouds (RGB only)
Action Recognition   | NTU RGB+D | Accuracy (CS) | 86.6  | Glimpse Clouds (RGB only)
Action Recognition   | NTU RGB+D | Accuracy (CV) | 93.2  | Glimpse Clouds (RGB only)

(CS = cross-subject split, CV = cross-view split.)

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
ZKP-FedEval: Verifiable and Privacy-Preserving Federated Evaluation using Zero-Knowledge Proofs (2025-07-15)
RLHGNN: Reinforcement Learning-driven Heterogeneous Graph Neural Network for Next Activity Prediction in Business Processes (2025-07-03)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
SEZ-HARN: Self-Explainable Zero-shot Human Activity Recognition Network (2025-06-25)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)