Deep Feature Flow for Video Recognition

Xizhou Zhu, Yuwen Xiong, Jifeng Dai, Lu Yuan, Yichen Wei

Published: 2016-11-23
Venue: CVPR 2017 (July)
Tasks: Video Recognition, Video Semantic Segmentation
Abstract

Deep convolutional neural networks have achieved great success on image recognition tasks. Yet, it is non-trivial to transfer the state-of-the-art image recognition networks to videos, as per-frame evaluation is too slow and unaffordable. We present deep feature flow, a fast and accurate framework for video recognition. It runs the expensive convolutional sub-network only on sparse key frames and propagates their deep feature maps to other frames via a flow field. It achieves significant speedup because flow computation is relatively fast. The end-to-end training of the whole architecture significantly boosts the recognition accuracy. Deep feature flow is flexible and general. It is validated on two recent large-scale video datasets. It makes a large step towards practical video recognition.
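
The key-frame idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several assumptions that the abstract does not state: key frames are chosen at a fixed interval, features are propagated by bilinear sampling along the flow field, and a feature map is a single-channel 2D list to keep the example dependency-free. The names `feature_net`, `flow_net`, `task_net`, and `key_interval` are hypothetical placeholders for the expensive sub-network, the flow estimator, the task head, and the key-frame spacing.

```python
import math


def warp_features(key_feat, flow):
    """Bilinearly sample a 2D feature map (list of rows) along a flow field.

    flow[y][x] = (dx, dy): offset telling location (x, y) of the current
    frame where to read from in the key-frame feature map (assumed convention).
    """
    h, w = len(key_feat), len(key_feat[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Clamp the sampling position to the borders of the map.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            wx, wy = sx - x0, sy - y0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = key_feat[y0][x0] * (1 - wx) + key_feat[y0][x1] * wx
            bottom = key_feat[y1][x0] * (1 - wx) + key_feat[y1][x1] * wx
            out[y][x] = top * (1 - wy) + bottom * wy
    return out


def run_video(frames, feature_net, flow_net, task_net, key_interval=10):
    """Run feature_net on key frames only; warp its output for other frames."""
    outputs = []
    key_frame = key_feat = None
    for i, frame in enumerate(frames):
        if i % key_interval == 0:
            # Key frame: pay for the expensive feature extractor.
            key_frame, key_feat = frame, feature_net(frame)
            feat = key_feat
        else:
            # Non-key frame: estimate flow and reuse the key-frame features.
            feat = warp_features(key_feat, flow_net(key_frame, frame))
        outputs.append(task_net(feat))
    return outputs
```

With `key_interval=10`, the hypothetical `feature_net` runs on one frame in ten, and the remaining frames cost only a flow estimate plus a warp. That is the source of the speedup the abstract describes, provided the flow estimator is much cheaper than the feature extractor.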

Results

Task                          Dataset         Metric  Value  Model
Scene Parsing                 Cityscapes val  mIoU    69.2   DFF [22]
Video Semantic Segmentation   Cityscapes val  mIoU    69.2   DFF [22]
Scene Understanding           Cityscapes val  mIoU    69.2   DFF [22]
2D Semantic Segmentation      Cityscapes val  mIoU    69.2   DFF [22]

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
Memory-Augmented SAM2 for Training-Free Surgical Video Segmentation (2025-07-13)
MUVOD: A Novel Multi-view Video Object Segmentation Dataset and A Benchmark for 3D Segmentation (2025-07-10)
Decoupled Seg Tokens Make Stronger Reasoning Video Segmenter and Grounder (2025-06-28)
CogGen: A Learner-Centered Generative AI Architecture for Intelligent Tutoring with Programming Video (2025-06-25)
Leader360V: The Large-scale, Real-world 360 Video Dataset for Multi-task Learning in Diverse Environment (2025-06-17)
A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects (2025-06-16)