Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

LOTR: Face Landmark Localization Using Localization Transformer

Ukrit Watchareeruetai, Benjaphan Sommana, Sanjana Jain, Pavit Noinongyao, Ankush Ganguly, Aubin Samacoits, Samuel W. F. Earp, Nakarin Sritrakool

2021-09-21 | Face Alignment | Face Recognition

Abstract

This paper presents a novel Transformer-based facial landmark localization network named Localization Transformer (LOTR). The proposed framework is a direct coordinate regression approach leveraging a Transformer network to better utilize the spatial information in the feature map. An LOTR model consists of three main modules: 1) a visual backbone that converts an input image into a feature map, 2) a Transformer module that improves the feature representation from the visual backbone, and 3) a landmark prediction head that directly predicts the landmark coordinates from the Transformer's representation. Given cropped-and-aligned face images, the proposed LOTR can be trained end-to-end without requiring any post-processing steps. This paper also introduces the smooth-Wing loss function, which addresses the gradient discontinuity of the Wing loss, leading to better convergence than standard loss functions such as L1, L2, and Wing loss. Experimental results on the JD landmark dataset provided by the First Grand Challenge of 106-Point Facial Landmark Localization indicate the superiority of LOTR over the existing methods on the leaderboard and two recent heatmap-based approaches. On the WFLW dataset, the proposed LOTR framework demonstrates promising results compared with several state-of-the-art methods. Additionally, we report the improvement in state-of-the-art face recognition performance when using our proposed LOTRs for face alignment.
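To make the "gradient discontinuity of the Wing loss" concrete, here is a minimal sketch of the standard Wing loss and its derivative for a single coordinate error. The parameter values `w=10.0` and `eps=2.0` are illustrative assumptions, not settings taken from this paper, and the sketch does not implement smooth-Wing, whose exact form is not given in the abstract.

```python
import math


def wing_loss(x, w=10.0, eps=2.0):
    """Standard Wing loss for one coordinate error x (w, eps are assumed values)."""
    ax = abs(x)
    if ax < w:
        # Logarithmic region: amplifies small and medium errors.
        return w * math.log(1.0 + ax / eps)
    # Linear region; c makes the two pieces meet at |x| = w.
    c = w - w * math.log(1.0 + w / eps)
    return ax - c


def wing_grad(x, w=10.0, eps=2.0):
    """Analytic derivative of wing_loss with respect to x, for x != 0."""
    ax = abs(x)
    sign = 1.0 if x > 0 else -1.0
    if ax < w:
        return sign * w / (eps + ax)
    return sign


if __name__ == "__main__":
    tiny = 1e-9
    # The gradient flips sign abruptly at x = 0 instead of passing through zero.
    print("grad just left of 0: ", wing_grad(-tiny))
    print("grad just right of 0:", wing_grad(tiny))
    # The loss value is continuous at |x| = w, but the slope is not.
    print("grad just below w:   ", wing_grad(10.0 - tiny))
    print("grad just above w:   ", wing_grad(10.0 + tiny))
```

With these assumed parameters the derivative jumps between -w/eps and +w/eps around zero, and from w/(eps + w) to 1 at |x| = w. These are the kinds of discontinuity a smoothed variant would need to remove.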

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Facial Recognition and Modelling | WFLW | AUC@10 (inter-ocular) | 60.14 | LOTR-HR |
| Facial Recognition and Modelling | WFLW | FR@10 (inter-ocular) | 3.52 | LOTR-HR |
| Facial Recognition and Modelling | WFLW | NME (inter-ocular) | 4.31 | LOTR-HR |
| Face Reconstruction | WFLW | AUC@10 (inter-ocular) | 60.14 | LOTR-HR |
| Face Reconstruction | WFLW | FR@10 (inter-ocular) | 3.52 | LOTR-HR |
| Face Reconstruction | WFLW | NME (inter-ocular) | 4.31 | LOTR-HR |
| 3D | WFLW | AUC@10 (inter-ocular) | 60.14 | LOTR-HR |
| 3D | WFLW | FR@10 (inter-ocular) | 3.52 | LOTR-HR |
| 3D | WFLW | NME (inter-ocular) | 4.31 | LOTR-HR |
| 3D Face Modelling | WFLW | AUC@10 (inter-ocular) | 60.14 | LOTR-HR |
| 3D Face Modelling | WFLW | FR@10 (inter-ocular) | 3.52 | LOTR-HR |
| 3D Face Modelling | WFLW | NME (inter-ocular) | 4.31 | LOTR-HR |
| 3D Face Reconstruction | WFLW | AUC@10 (inter-ocular) | 60.14 | LOTR-HR |
| 3D Face Reconstruction | WFLW | FR@10 (inter-ocular) | 3.52 | LOTR-HR |
| 3D Face Reconstruction | WFLW | NME (inter-ocular) | 4.31 | LOTR-HR |
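
The three metrics in the table are commonly computed as sketched below. This is a minimal illustration under stated assumptions, not the evaluation code used for these results: the function names, the eye-corner indices, and the trapezoidal integration are all hypothetical choices, and the values are fractions here, whereas the table appears to report percentages.

```python
import math


def nme_inter_ocular(pred, gt, left_eye_idx, right_eye_idx):
    """Mean landmark error for one face, divided by the ground-truth
    distance between the two eye landmarks at the given (assumed) indices."""
    inter_ocular = math.dist(gt[left_eye_idx], gt[right_eye_idx])
    errors = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(errors) / (len(errors) * inter_ocular)


def failure_rate(nmes, threshold=0.10):
    """Fraction of faces whose NME exceeds the threshold (FR@10 for 0.10)."""
    return sum(1 for e in nmes if e > threshold) / len(nmes)


def auc(nmes, threshold=0.10, steps=1000):
    """Area under the cumulative error distribution from 0 to threshold,
    normalized by the threshold so a perfect model scores 1.0."""
    xs = [threshold * i / steps for i in range(steps + 1)]
    ced = [sum(1 for e in nmes if e <= t) / len(nmes) for t in xs]
    area = sum((ced[i] + ced[i + 1]) / 2.0 * (xs[i + 1] - xs[i])
               for i in range(steps))
    return area / threshold


if __name__ == "__main__":
    # Toy face with three landmarks; indices 0 and 1 stand in for the eyes.
    gt = [(0.0, 0.0), (10.0, 0.0), (5.0, 6.0)]
    pred = [(0.0, 0.0), (10.0, 0.0), (5.0, 5.0)]
    print("NME:", nme_inter_ocular(pred, gt, 0, 1))

    per_image_nme = [0.02, 0.05, 0.20, 0.12]
    print("FR@10:", failure_rate(per_image_nme))
    print("AUC@10:", auc(per_image_nme))
```

Lower is better for NME and FR@10, and higher is better for AUC@10.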

Related Papers

ProxyFusion: Face Feature Aggregation Through Sparse Experts (2025-09-24)
Non-Adaptive Adversarial Face Generation (2025-07-16)
Attributes Shape the Embedding Space of Face Recognition Models (2025-07-15)
Face mask detection project report. (2025-07-02)
On the Burstiness of Faces in Set (2025-06-25)
Identifying Physically Realizable Triggers for Backdoored Face Recognition Networks (2025-06-24)
SELFI: Selective Fusion of Identity for Generalizable Deepfake Detection (2025-06-21)
FaceLiVT: Face Recognition using Linear Vision Transformer with Structural Reparameterization For Mobile Device (2025-06-12)