


ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search

Lumin Xu, Yingda Guan, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang, Xiaogang Wang

2021-05-21 | CVPR 2021 | Neural Architecture Search, Pose Estimation

Abstract

Human pose estimation has achieved significant progress in recent years. However, most recent methods focus on improving accuracy with complicated models and ignore real-time efficiency. To achieve a better trade-off between accuracy and efficiency, we propose a novel neural architecture search (NAS) method, termed ViPNAS, that searches networks at both the spatial and temporal levels for fast online video pose estimation. At the spatial level, we carefully design the search space along five dimensions: network depth, width, kernel size, group number, and attention. At the temporal level, we search over a series of temporal feature fusions to optimize the total accuracy and speed across multiple video frames. To the best of our knowledge, we are the first to search for temporal feature fusion and automatic computation allocation in videos. Extensive experiments on the challenging COCO2017 and PoseTrack2018 datasets demonstrate the effectiveness of our approach. Our discovered model family, S-ViPNAS and T-ViPNAS, achieves significantly higher inference speed (real-time on CPU) without sacrificing accuracy compared to previous state-of-the-art methods.
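To make the spatial search space concrete, here is a minimal sketch of how a five-dimensional space (depth, width, kernel size, group number, attention) could be represented and sampled. It is not the authors' implementation: the candidate values, the names `SEARCH_SPACE`, `BlockConfig`, `sample_block`, and `count_configs`, and the use of uniform random sampling are all assumptions made for illustration. The abstract names the five dimensions but does not list the choices or the search strategy.

```python
import itertools
import random
from dataclasses import dataclass

# Hypothetical candidate values for each of the five dimensions named in
# the abstract. The real choices used by ViPNAS are not given on this page.
SEARCH_SPACE = {
    "depth": [2, 3, 4],
    "width": [32, 64, 128],
    "kernel_size": [3, 5, 7],
    "groups": [1, 8, 16],
    "attention": [False, True],
}


@dataclass(frozen=True)
class BlockConfig:
    """One point in the (assumed) spatial search space."""
    depth: int
    width: int
    kernel_size: int
    groups: int
    attention: bool


def is_valid(cfg: BlockConfig) -> bool:
    # A grouped convolution needs the channel count divisible by the groups.
    return cfg.width % cfg.groups == 0


def sample_block(rng: random.Random) -> BlockConfig:
    """Draw one valid configuration uniformly at random (assumed strategy)."""
    while True:
        cfg = BlockConfig(**{k: rng.choice(v) for k, v in SEARCH_SPACE.items()})
        if is_valid(cfg):
            return cfg


def count_configs() -> int:
    """Enumerate the whole space and count the valid configurations."""
    keys = list(SEARCH_SPACE)
    combos = itertools.product(*(SEARCH_SPACE[k] for k in keys))
    return sum(is_valid(BlockConfig(**dict(zip(keys, c)))) for c in combos)


if __name__ == "__main__":
    print(sample_block(random.Random(0)))
    print("valid configurations per block:", count_configs())
```

Even with only a few choices per dimension the space grows multiplicatively per block, which is the usual motivation for an automated search rather than manual tuning.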

Results

Task             Dataset        Metric  Value  Model
Pose Estimation  COCO test-dev  AP      73.9   S-ViPNAS-HRNetW32
Pose Estimation  COCO test-dev  AP50    91.7   S-ViPNAS-HRNetW32
Pose Estimation  COCO test-dev  AP75    82     S-ViPNAS-HRNetW32
Pose Estimation  COCO test-dev  APL     79.5   S-ViPNAS-HRNetW32
Pose Estimation  COCO test-dev  APM     70.5   S-ViPNAS-HRNetW32
Pose Estimation  COCO test-dev  AR      80.4   S-ViPNAS-HRNetW32
Pose Estimation  COCO test-dev  AP      70.3   S-ViPNAS-Res50
Pose Estimation  COCO test-dev  AP50    90.7   S-ViPNAS-Res50
Pose Estimation  COCO test-dev  AP75    78.8   S-ViPNAS-Res50
Pose Estimation  COCO test-dev  APL     75.5   S-ViPNAS-Res50
Pose Estimation  COCO test-dev  APM     67.3   S-ViPNAS-Res50
Pose Estimation  COCO test-dev  AR      77.3   S-ViPNAS-Res50
