Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Dance with Flow: Two-in-One Stream Action Detection

Jiaojiao Zhao, Cees G. M. Snoek

2019-04-01 | CVPR 2019 6 | Action Detection, Optical Flow Estimation, Vocal Bursts Valence Prediction

Abstract

The goal of this paper is to detect the spatio-temporal extent of an action. The two-stream detection network based on RGB and flow provides state-of-the-art accuracy at the expense of a large model size and heavy computation. We propose to embed RGB and optical flow into a single two-in-one stream network with new layers. A motion condition layer extracts motion information from flow images, which is leveraged by the motion modulation layer to generate transformation parameters for modulating the low-level RGB features. The method is easily embedded in existing appearance- or two-stream action detection networks, and trained end-to-end. Experiments demonstrate that leveraging the motion condition to modulate RGB features improves detection accuracy. With only half the computation and parameters of the state-of-the-art two-stream methods, our two-in-one stream still achieves impressive results on UCF101-24, UCF Sports and J-HMDB.
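
To make the modulation idea in the abstract concrete, here is a minimal sketch in plain Python. It assumes the "transformation parameters" are an element-wise scale and shift applied to the RGB features, which the abstract does not state. The functions `motion_condition`, `motion_modulation`, and `modulate`, and the weights `w_scale` and `w_shift`, are hypothetical stand-ins for learned layers, not the authors' implementation.

```python
from typing import List, Tuple

Vector = List[float]


def motion_condition(flow: Vector) -> Vector:
    """Hypothetical motion condition layer: a ReLU over flow values.

    In a real network this would be a learned layer applied to the flow
    image; the ReLU is only a placeholder.
    """
    return [max(0.0, v) for v in flow]


def motion_modulation(condition: Vector, w_scale: float, w_shift: float) -> Tuple[Vector, Vector]:
    """Hypothetical motion modulation layer.

    Maps the motion condition to per-element transformation parameters.
    Assumption: the parameters are a scale and a shift, each produced by a
    single scalar weight here instead of a learned convolution.
    """
    scale = [w_scale * c for c in condition]
    shift = [w_shift * c for c in condition]
    return scale, shift


def modulate(rgb: Vector, scale: Vector, shift: Vector) -> Vector:
    """Apply the assumed element-wise affine transform to low-level RGB features."""
    return [s * f + b for f, s, b in zip(rgb, scale, shift)]


# Toy low-level RGB features and flow values at three spatial positions.
rgb_features = [1.0, 2.0, 3.0]
flow_values = [0.5, -1.0, 2.0]

condition = motion_condition(flow_values)
scale, shift = motion_modulation(condition, w_scale=2.0, w_shift=1.0)
modulated = modulate(rgb_features, scale, shift)
print(modulated)
```

In this toy setup a position with no positive flow response gets zero scale and zero shift, so its RGB feature is suppressed, while positions with strong motion are amplified. A single stream then processes the modulated features, rather than running separate RGB and flow networks.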

Results

Task                 | Dataset    | Metric          | Value | Model
Activity Recognition | UCF101     | 3-fold Accuracy | 92    | two-in-one two stream
Action Detection     | UCF101-24  | Video-mAP 0.2   | 78.48 | Two-in-one Two Stream
Action Detection     | UCF101-24  | Video-mAP 0.5   | 50.3  | Two-in-one Two Stream
Action Detection     | UCF101-24  | Video-mAP 0.2   | 75.48 | Two-in-one
Action Detection     | UCF101-24  | Video-mAP 0.5   | 48.31 | Two-in-one
Action Detection     | UCF Sports | Video-mAP 0.5   | 96.52 | Two-in-one Two Stream
Action Detection     | UCF Sports | Video-mAP 0.5   | 92.74 | Two-in-one
Action Detection     | J-HMDB     | Video-mAP 0.5   | 74.74 | Two-in-one Two Stream
Action Detection     | J-HMDB     | Video-mAP 0.5   | 57.96 | Two-in-one
Action Recognition   | UCF101     | 3-fold Accuracy | 92    | two-in-one two stream

Related Papers

Channel-wise Motion Features for Efficient Motion Segmentation (2025-07-17)
An Efficient Approach for Muscle Segmentation and 3D Reconstruction Using Keypoint Tracking in MRI Scan (2025-07-11)
Learning to Track Any Points from Human Motion (2025-07-08)
TLB-VFI: Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation (2025-07-07)
MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation (2025-06-29)
EndoFlow-SLAM: Real-Time Endoscopic SLAM with Flow-Constrained Gaussian Splatting (2025-06-26)
WAFT: Warping-Alone Field Transforms for Optical Flow (2025-06-26)
CBF-AFA: Chunk-Based Multi-SSL Fusion for Automatic Fluency Assessment (2025-06-25)