Representation Flow for Action Recognition

AJ Piergiovanni, Michael S. Ryoo

2018-10-02, CVPR 2019

Tasks: Activity Recognition In Videos, Action Classification, Optical Flow Estimation, Video Classification, Video Understanding, Action Recognition, Action Recognition In Videos, Temporal Action Localization, Activity Recognition

Abstract

In this paper, we propose a convolutional layer inspired by optical flow algorithms to learn motion representations. Our representation flow layer is a fully-differentiable layer designed to capture the 'flow' of any representation channel within a convolutional neural network for action recognition. Its parameters for iterative flow optimization are learned in an end-to-end fashion together with the other CNN model parameters, maximizing the action recognition performance. Furthermore, we newly introduce the concept of learning 'flow of flow' representations by stacking multiple representation flow layers. We conducted extensive experimental evaluations, confirming its advantages over previous recognition models using traditional optical flows in both computational speed and performance. Code/models available here: https://piergiaj.github.io/rep-flow-site/
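The abstract does not say which optical flow algorithm the layer follows, so the sketch below is only an illustration of what an iterative flow computation between two single-channel feature maps might look like. It assumes a TV-L1-style update, and the names `forward_grad`, `divergence`, and `representation_flow` as well as the scalars `lam`, `theta`, and `tau` are hypothetical. The scalars are fixed here as stand-ins for the parameters that the abstract describes as learned end-to-end; NumPy is not differentiable, so a trainable version would need an autodiff framework.

```python
import numpy as np

def forward_grad(a):
    """Forward differences along x (columns) and y (rows); zero at the far border."""
    gx = np.zeros_like(a)
    gy = np.zeros_like(a)
    gx[:, :-1] = a[:, 1:] - a[:, :-1]
    gy[:-1, :] = a[1:, :] - a[:-1, :]
    return gx, gy

def divergence(px, py):
    """Backward-difference divergence paired with forward_grad."""
    d = np.zeros_like(px)
    d[:, 0] += px[:, 0]
    d[:, 1:-1] += px[:, 1:-1] - px[:, :-2]
    d[:, -1] -= px[:, -2]
    d[0, :] += py[0, :]
    d[1:-1, :] += py[1:-1, :] - py[:-2, :]
    d[-1, :] -= py[-2, :]
    return d

def representation_flow(f1, f2, n_iter=10, lam=0.15, theta=0.3, tau=0.25, eps=1e-8):
    """Assumed TV-L1-style iterations between two 2-D feature maps.

    lam, theta, tau are fixed scalars here (an assumption); in a learned
    layer they would be trainable. Returns an array of shape (2, H, W).
    """
    f1 = np.asarray(f1, dtype=np.float64)
    f2 = np.asarray(f2, dtype=np.float64)
    gx, gy = forward_grad(f2)
    grad_sq = gx ** 2 + gy ** 2 + eps      # eps avoids division by zero
    rho_c = f2 - f1                        # residual at zero flow
    u1, u2 = np.zeros_like(f1), np.zeros_like(f1)
    p11, p12 = np.zeros_like(f1), np.zeros_like(f1)
    p21, p22 = np.zeros_like(f1), np.zeros_like(f1)
    for _ in range(n_iter):
        rho = rho_c + gx * u1 + gy * u2    # linearized residual
        thr = lam * theta * grad_sq
        # Thresholding step: move along the gradient of f2.
        step = np.where(rho < -thr, lam * theta,
                        np.where(rho > thr, -lam * theta, -rho / grad_sq))
        v1, v2 = u1 + step * gx, u2 + step * gy
        # Smoothing step driven by the dual variables p.
        u1 = v1 + theta * divergence(p11, p12)
        u2 = v2 + theta * divergence(p21, p22)
        u1x, u1y = forward_grad(u1)
        u2x, u2y = forward_grad(u2)
        n1 = 1.0 + (tau / theta) * np.sqrt(u1x ** 2 + u1y ** 2)
        n2 = 1.0 + (tau / theta) * np.sqrt(u2x ** 2 + u2y ** 2)
        p11, p12 = (p11 + (tau / theta) * u1x) / n1, (p12 + (tau / theta) * u1y) / n1
        p21, p22 = (p21 + (tau / theta) * u2x) / n2, (p22 + (tau / theta) * u2y) / n2
    return np.stack([u1, u2])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b, c = rng.random((3, 6, 8))
    flow_ab = representation_flow(a, b)
    flow_bc = representation_flow(b, c)
    # "Flow of flow": treat one flow channel as a representation and apply the step again.
    flow_of_flow = representation_flow(flow_ab[0], flow_bc[0])
    print(flow_ab.shape, flow_of_flow.shape)
```

Under these assumptions, stacking amounts to feeding the output channels of one flow computation into another, which is one way to read the 'flow of flow' idea; how the actual layer handles multiple channels and learned kernels is not stated in the abstract.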

Results

Task	Dataset	Metric	Value	Model
Video	Kinetics-400	Acc@1	77.9	RepFlow-50 ([2+1]D CNN, FcF, Non-local block)
Activity Recognition	HMDB-51	Average accuracy of 3 splits	81.1	RepFlow-50 ([2+1]D CNN, FcF, Non-local block)
Action Recognition	HMDB-51	Average accuracy of 3 splits	81.1	RepFlow-50 ([2+1]D CNN, FcF, Non-local block)

Related Papers

Channel-wise Motion Features for Efficient Motion Segmentation (2025-07-17)
VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
UGC-VideoCaptioner: An Omni UGC Video Detail Caption Model and New Benchmarks (2025-07-15)
ZKP-FedEval: Verifiable and Privacy-Preserving Federated Evaluation using Zero-Knowledge Proofs (2025-07-15)
EmbRACE-3K: Embodied Reasoning and Action in Complex Environments (2025-07-14)
Chat with AI: The Surprising Turn of Real-time Video Communication from Human to AI (2025-07-14)