Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Recurrence without Recurrence: Stable Video Landmark Detection with Deep Equilibrium Models

Paul Micaelli, Arash Vahdat, Hongxu Yin, Jan Kautz, Pavlo Molchanov

2023-04-02 · CVPR 2023 · Face Alignment

Abstract

Cascaded computation, whereby predictions are recurrently refined over several stages, has been a persistent theme throughout the development of landmark detection models. In this work, we show that the recently proposed Deep Equilibrium Model (DEQ) can be naturally adapted to this form of computation. Our Landmark DEQ (LDEQ) achieves state-of-the-art performance on the challenging WFLW facial landmark dataset, reaching $3.92$ NME with fewer parameters and a training memory cost of $\mathcal{O}(1)$ in the number of recurrent modules. Furthermore, we show that DEQs are particularly suited for landmark detection in videos. In this setting, it is typical to train on still images due to the lack of labelled videos. This can lead to a "flickering" effect at inference time on video, whereby a model can rapidly oscillate between different plausible solutions across consecutive frames. By rephrasing DEQs as a constrained optimization, we emulate recurrence at inference time, despite not having access to temporal data at training time. This Recurrence without Recurrence (RwR) paradigm helps reduce landmark flicker, which we demonstrate by introducing a new metric, normalized mean flicker (NMF), and contributing a new facial landmark video dataset (WFLW-V) targeting landmark uncertainty. On the WFLW-V hard subset made up of $500$ videos, our LDEQ with RwR improves the NME and NMF by $10\%$ and $13\%$, respectively, compared to the strongest previously published model using a hand-tuned conventional filter.
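The abstract relies on the defining property of a DEQ: its output is a fixed point $z^* = f(z^*, x)$ of a single layer, found by a solver rather than by stacking stages. The sketch below illustrates that idea with a toy scalar map and plain fixed-point iteration. The names `layer` and `solve_fixed_point`, the `tanh` map, and the weight `w` are hypothetical choices made for illustration; they are not the LDEQ architecture, and the warm start shown at the end is a generic fixed-point trick, not the paper's RwR objective.

```python
import math

def layer(z, x, w=0.5):
    """Toy contractive map f(z, x) = tanh(w * z + x), applied elementwise.

    Hypothetical stand-in for a DEQ layer; |w| < 1 keeps it a contraction.
    """
    return [math.tanh(w * zi + xi) for zi, xi in zip(z, x)]

def solve_fixed_point(f, x, z0, tol=1e-8, max_iter=200):
    """Iterate z <- f(z, x) until the largest update falls below tol.

    Returns the approximate fixed point and the number of iterations used.
    """
    z = list(z0)
    for step in range(1, max_iter + 1):
        z_next = f(z, x)
        residual = max(abs(a - b) for a, b in zip(z_next, z))
        z = z_next
        if residual < tol:
            break
    return z, step

# "Frame 1": solve from a zero initialization.
x1 = [0.3, -0.2]
z1, steps1 = solve_fixed_point(layer, x1, [0.0, 0.0])

# "Frame 2": a slightly different input, solved cold and warm-started from z1.
x2 = [0.31, -0.19]
z2_cold, steps_cold = solve_fixed_point(layer, x2, [0.0, 0.0])
z2_warm, steps_warm = solve_fixed_point(layer, x2, z1)
```

In this toy setting both starts reach the same fixed point for the second input, and the warm start needs no more iterations than the cold one because it begins closer to the solution.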

Results

Task | Dataset | Metric | Value | Model
Facial Recognition and Modelling | WFLW | AUC@10 (inter-ocular) | 62.4 | LDEQ
Facial Recognition and Modelling | WFLW | FR@10 (inter-ocular) | 2.48 | LDEQ
Facial Recognition and Modelling | WFLW | NME (inter-ocular) | 3.92 | LDEQ
Face Reconstruction | WFLW | AUC@10 (inter-ocular) | 62.4 | LDEQ
Face Reconstruction | WFLW | FR@10 (inter-ocular) | 2.48 | LDEQ
Face Reconstruction | WFLW | NME (inter-ocular) | 3.92 | LDEQ
3D | WFLW | AUC@10 (inter-ocular) | 62.4 | LDEQ
3D | WFLW | FR@10 (inter-ocular) | 2.48 | LDEQ
3D | WFLW | NME (inter-ocular) | 3.92 | LDEQ
3D Face Modelling | WFLW | AUC@10 (inter-ocular) | 62.4 | LDEQ
3D Face Modelling | WFLW | FR@10 (inter-ocular) | 2.48 | LDEQ
3D Face Modelling | WFLW | NME (inter-ocular) | 3.92 | LDEQ
3D Face Reconstruction | WFLW | AUC@10 (inter-ocular) | 62.4 | LDEQ
3D Face Reconstruction | WFLW | FR@10 (inter-ocular) | 2.48 | LDEQ
3D Face Reconstruction | WFLW | NME (inter-ocular) | 3.92 | LDEQ
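For readers unfamiliar with the three metrics in the table, the following is a minimal sketch of how they are commonly computed in the face alignment literature. It assumes NME is the mean landmark error divided by the inter-ocular distance and expressed in percent, FR@10 is the percentage of images whose NME exceeds 10, and AUC@10 is the normalized area under the cumulative error distribution up to 10. The function names and the eye-corner index arguments are hypothetical; the exact evaluation script behind the reported numbers may differ in details.

```python
import math

def nme_inter_ocular(pred, gt, left_eye_idx, right_eye_idx):
    """Mean per-landmark L2 error, normalized by the ground-truth distance
    between the two given eye landmarks, in percent."""
    d = math.dist(gt[left_eye_idx], gt[right_eye_idx])
    errors = [math.dist(p, g) for p, g in zip(pred, gt)]
    return 100.0 * sum(errors) / (len(errors) * d)

def failure_rate(nmes, threshold=10.0):
    """Percentage of images whose NME is above the threshold."""
    return 100.0 * sum(n > threshold for n in nmes) / len(nmes)

def auc(nmes, threshold=10.0):
    """Area under the cumulative error distribution on [0, threshold],
    scaled to [0, 100]. An image with NME n <= threshold contributes
    (threshold - n) to the integral; images above it contribute nothing."""
    area = sum(max(0.0, threshold - n) for n in nmes) / len(nmes)
    return 100.0 * area / threshold

# Toy example: three landmarks, with landmarks 0 and 1 used as eye corners.
gt = [(0.0, 0.0), (10.0, 0.0), (5.0, 5.0)]
pred = [(0.0, 1.0), (10.0, 0.0), (5.0, 5.0)]
single_nme = nme_inter_ocular(pred, gt, left_eye_idx=0, right_eye_idx=1)

# Toy per-image NME values for the dataset-level metrics.
per_image_nme = [2.0, 4.0, 12.0, 20.0]
fr10 = failure_rate(per_image_nme)
auc10 = auc(per_image_nme)
```

Lower is better for NME and FR@10, and higher is better for AUC@10, which matches how the LDEQ numbers in the table are read.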

Related Papers

Towards Large-Scale Pose-Invariant Face Recognition Using Face Defrontalization (2025-06-04)
HonestFace: Towards Honest Face Restoration with One-Step Diffusion Model (2025-05-24)
Multimodal Emotion Coupling via Speech-to-Facial and Bodily Gestures in Dyadic Interaction (2025-05-08)
SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users (2025-04-14)
Mitigating Knowledge Discrepancies among Multiple Datasets for Task-agnostic Unified Face Alignment (2025-03-28)
Learning Person-Specific Animatable Face Models from In-the-Wild Images via a Shared Base Model (2025-01-01)
Impact of Face Alignment on Face Image Quality (2024-12-16)
Precise Facial Landmark Detection by Dynamic Semantic Aggregation Transformer (2024-12-01)