Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Cascaded Dual Vision Transformer for Accurate Facial Landmark Detection

Ziqiang Dang, Jianfang Li, Lin Liu

2024-11-08 | Facial Landmark Detection

Abstract

Facial landmark detection is a fundamental problem in computer vision for many downstream applications. This paper introduces a new facial landmark detector based on vision transformers, which consists of two unique designs: Dual Vision Transformer (D-ViT) and Long Skip Connections (LSC). Based on the observation that the channel dimension of feature maps essentially represents the linear bases of the heatmap space, we propose learning the interconnections between these linear bases to model the inherent geometric relations among landmarks via Channel-split ViT. We integrate such channel-split ViT into the standard vision transformer (i.e., spatial-split ViT), forming our Dual Vision Transformer to constitute the prediction blocks. We also suggest using long skip connections to deliver low-level image features to all prediction blocks, thereby preventing useful information from being discarded by intermediate supervision. Extensive experiments are conducted to evaluate the performance of our proposal on the widely used benchmarks, i.e., WFLW, COFW, and 300W, demonstrating that our model outperforms the previous SOTAs across all three benchmarks.
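
The difference between the two token layouts named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a single attention head, random matrices standing in for learned projections, one token per pixel rather than per patch, and a residual sum as the way the two branches are combined. The names `spatial_split`, `channel_split`, `dual_block`, and `stacked_blocks` are hypothetical, as is adding the low-level features to each block's input to mimic a long skip connection.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, rng):
    """Single-head self-attention over the rows of `tokens` (n_tokens, dim).
    Random projections stand in for learned weights (assumption)."""
    dim = tokens.shape[1]
    w_q, w_k, w_v = (rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(3))
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(dim), axis=-1)  # (n_tokens, n_tokens)
    return attn @ v, attn

def spatial_split(feat, rng):
    """Tokens are spatial positions; each token is a C-dimensional vector."""
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T            # (H*W, C)
    out, attn = self_attention(tokens, rng)
    return out.T.reshape(c, h, w), attn

def channel_split(feat, rng):
    """Tokens are channels; each token is a flattened H*W map."""
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w)              # (C, H*W)
    out, attn = self_attention(tokens, rng)
    return out.reshape(c, h, w), attn

def dual_block(feat, rng):
    """Hypothetical combination: spatial branch, then channel branch, with residuals."""
    s, _ = spatial_split(feat, rng)
    x = feat + s
    ch, _ = channel_split(x, rng)
    return x + ch

def stacked_blocks(low_level, n_blocks, rng):
    """Hypothetical long skip connection: every block after the first
    receives the low-level features added to the previous block's output."""
    x = dual_block(low_level, rng)
    for _ in range(n_blocks - 1):
        x = dual_block(x + low_level, rng)
    return x

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))            # toy feature map: C=8, H=W=4
_, attn_s = spatial_split(feat, rng)
_, attn_c = channel_split(feat, rng)
print(attn_s.shape, attn_c.shape)                # (16, 16) (8, 8)
```

In the spatial layout the attention matrix relates positions to positions, while in the channel layout it relates channels to channels, which is the sense in which the channel branch can model relations among the "linear bases" the abstract refers to.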

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Facial Recognition and Modelling | WFLW | AUC@10 (inter-ocular) | 63.7 | D-ViT |
| Facial Recognition and Modelling | WFLW | FR@10 (inter-ocular) | 1.76 | D-ViT |
| Facial Recognition and Modelling | WFLW | NME | 3.75 | D-ViT |
| Facial Recognition and Modelling | WFLW | NME (inter-ocular) | 3.75 | D-ViT |
| Facial Recognition and Modelling | 300W | NME | 2.85 | D-ViT |
| Facial Recognition and Modelling | COFW | NME (inter-pupil) | 4.13 | D-ViT |
| Facial Landmark Detection | WFLW | AUC@10 (inter-ocular) | 63.7 | D-ViT |
| Facial Landmark Detection | WFLW | FR@10 (inter-ocular) | 1.76 | D-ViT |
| Facial Landmark Detection | WFLW | NME | 3.75 | D-ViT |
| Facial Landmark Detection | WFLW | NME (inter-ocular) | 3.75 | D-ViT |
| Facial Landmark Detection | 300W | NME | 2.85 | D-ViT |
| Facial Landmark Detection | COFW | NME (inter-pupil) | 4.13 | D-ViT |
| Face Reconstruction | WFLW | AUC@10 (inter-ocular) | 63.7 | D-ViT |
| Face Reconstruction | WFLW | FR@10 (inter-ocular) | 1.76 | D-ViT |
| Face Reconstruction | WFLW | NME | 3.75 | D-ViT |
| Face Reconstruction | WFLW | NME (inter-ocular) | 3.75 | D-ViT |
| Face Reconstruction | 300W | NME | 2.85 | D-ViT |
| Face Reconstruction | COFW | NME (inter-pupil) | 4.13 | D-ViT |
| 3D | WFLW | AUC@10 (inter-ocular) | 63.7 | D-ViT |
| 3D | WFLW | FR@10 (inter-ocular) | 1.76 | D-ViT |
| 3D | WFLW | NME | 3.75 | D-ViT |
| 3D | WFLW | NME (inter-ocular) | 3.75 | D-ViT |
| 3D | 300W | NME | 2.85 | D-ViT |
| 3D | COFW | NME (inter-pupil) | 4.13 | D-ViT |
| 3D Face Modelling | WFLW | AUC@10 (inter-ocular) | 63.7 | D-ViT |
| 3D Face Modelling | WFLW | FR@10 (inter-ocular) | 1.76 | D-ViT |
| 3D Face Modelling | WFLW | NME | 3.75 | D-ViT |
| 3D Face Modelling | WFLW | NME (inter-ocular) | 3.75 | D-ViT |
| 3D Face Modelling | 300W | NME | 2.85 | D-ViT |
| 3D Face Modelling | COFW | NME (inter-pupil) | 4.13 | D-ViT |
| 3D Face Reconstruction | WFLW | AUC@10 (inter-ocular) | 63.7 | D-ViT |
| 3D Face Reconstruction | WFLW | FR@10 (inter-ocular) | 1.76 | D-ViT |
| 3D Face Reconstruction | WFLW | NME | 3.75 | D-ViT |
| 3D Face Reconstruction | WFLW | NME (inter-ocular) | 3.75 | D-ViT |
| 3D Face Reconstruction | 300W | NME | 2.85 | D-ViT |
| 3D Face Reconstruction | COFW | NME (inter-pupil) | 4.13 | D-ViT |
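
For readers unfamiliar with the metrics reported above, the following sketch shows how NME, FR@10, and AUC@10 are commonly computed in the facial landmark literature. It is an illustration under stated assumptions, not the evaluation code behind these numbers: the function names are hypothetical, the normalizing distance is taken between two ground-truth landmarks whose indices (`left_idx`, `right_idx`) the caller must supply for the dataset in use, and values are returned as fractions (multiply by 100 to compare with percentages).

```python
import numpy as np

def nme_per_image(pred, gt, left_idx, right_idx):
    """Normalized mean error per image.
    pred, gt: arrays of shape (n_images, n_landmarks, 2).
    The normalizer is the distance between two ground-truth landmarks,
    e.g. outer eye corners (inter-ocular) or pupil centers (inter-pupil)."""
    norm = np.linalg.norm(gt[:, left_idx] - gt[:, right_idx], axis=-1)  # (n_images,)
    err = np.linalg.norm(pred - gt, axis=-1).mean(axis=-1)              # (n_images,)
    return err / norm

def failure_rate(nme, threshold=0.10):
    """Fraction of images whose NME exceeds the threshold (FR@10 for 0.10)."""
    return float((nme > threshold).mean())

def auc(nme, threshold=0.10, steps=1000):
    """Area under the cumulative error distribution up to the threshold,
    normalized by the threshold so the result lies in [0, 1]."""
    xs = np.linspace(0.0, threshold, steps)
    ced = np.array([(nme <= x).mean() for x in xs])
    area = ((ced[1:] + ced[:-1]) / 2.0 * np.diff(xs)).sum()  # trapezoidal rule
    return float(area / threshold)

# Toy data: two images, three landmarks, normalizing distance of 10 pixels.
gt = np.array([[[0.0, 0.0], [10.0, 0.0], [5.0, 5.0]]] * 2)
pred = gt.copy()
pred[0] += np.array([0.3, 0.4])   # every landmark off by 0.5 px
pred[1] += np.array([1.2, 1.6])   # every landmark off by 2.0 px

nme = nme_per_image(pred, gt, left_idx=0, right_idx=1)
print(nme.mean(), failure_rate(nme), auc(nme))
```

On this toy input the per-image NME values are roughly 0.05 and 0.20, so one of the two images counts as a failure at the 10% threshold. Lower is better for NME and FR, and higher is better for AUC.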

Related Papers

MOL: Joint Estimation of Micro-Expression, Optical Flow, and Landmark via Transformer-Graph-Style Convolution (2025-06-17)
Semantic Style Transfer for Enhancing Animal Facial Landmark Detection (2025-05-08)
ORFormer: Occlusion-Robust Transformer for Accurate Facial Landmark Detection (2024-12-17)
Precise Facial Landmark Detection by Dynamic Semantic Aggregation Transformer (2024-12-01)
POPoS: Improving Efficient and Robust Facial Landmark Detection with Parallel Optimal Position Search (2024-10-12)
Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach (2024-10-08)
Real-Time Drowsiness Detection Using Eye Aspect Ratio and Facial Landmark Detection (2024-08-11)
Efficient Facial Landmark Detection for Embedded Systems (2024-07-14)