TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Hierarchical binary CNNs for landmark localization with li...

Hierarchical binary CNNs for landmark localization with limited resources

Adrian Bulat, Georgios Tzimiropoulos

2018-08-14Face AlignmentBinarization3D Face AlignmentPose Estimation
PaperPDFCode(official)

Abstract

Our goal is to design architectures that retain the groundbreaking performance of Convolutional Neural Networks (CNNs) for landmark localization and at the same time are lightweight, compact and suitable for applications with limited computational resources. To this end, we make the following contributions: (a) we are the first to study the effect of neural network binarization on localization tasks, namely human pose estimation and face alignment. We exhaustively evaluate various design choices, identify performance bottlenecks, and more importantly propose multiple orthogonal ways to boost performance. (b) Based on our analysis, we propose a novel hierarchical, parallel and multi-scale residual architecture that yields large performance improvement over the standard bottleneck block while having the same number of parameters, thus bridging the gap between the original network and its binarized counterpart. (c) We perform a large number of ablation studies that shed light on the properties and the performance of the proposed block. (d) We present results for experiments on the most challenging datasets for human pose estimation and face alignment, reporting in many cases state-of-the-art performance. (e) We further provide additional results for the problem of facial part segmentation. Code can be downloaded from https://www.adrianbulat.com/binary-cnn-landmark

Results

TaskDatasetMetricValueModel
Facial Recognition and ModellingAFLW-LFPANME3.02Ours
Pose EstimationMPII Human PosePCKh-0.581.3x
Face ReconstructionAFLW-LFPANME3.02Ours
3DMPII Human PosePCKh-0.581.3x
3DAFLW-LFPANME3.02Ours
3DAFLW2000-3DNME3.26Ours
3D Face ModellingAFLW-LFPANME3.02Ours
3D Face ReconstructionAFLW-LFPANME3.02Ours
1 Image, 2*2 StitchiMPII Human PosePCKh-0.581.3x

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning2025-07-17Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark2025-07-17DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model2025-07-17From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation2025-07-17AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability2025-07-17SpatialTrackerV2: 3D Point Tracking Made Easy2025-07-16SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation2025-07-16Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation2025-07-16