Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior

Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong

2023-01-06 | CVPR 2023 | regression | 3D Face Animation

Abstract

Speech-driven 3D facial animation has been widely studied, yet there is still a gap to achieving realism and vividness due to the highly ill-posed nature and scarcity of audio-visual data. Existing works typically formulate the cross-modal mapping into a regression task, which suffers from the regression-to-mean problem leading to over-smoothed facial motions. In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of the learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty. The codebook is learned by self-reconstruction over real facial motions and thus embedded with realistic facial motion priors. Over the discrete motion space, a temporal autoregressive model is employed to sequentially synthesize facial motions from the input speech signal, which guarantees lip-sync as well as plausible facial expressions. We demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively. Also, a user study further justifies our superiority in perceptual quality.
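
The "code query" idea in the abstract can be illustrated with a generic nearest-neighbor codebook lookup, as used in vector quantization. The sketch below is not the authors' implementation: the function names, the toy two-dimensional codebook, and the use of squared Euclidean distance are all assumptions made for illustration. It only shows how a continuous feature can be mapped to one entry of a finite codebook, so that the output is always drawn from a fixed set of learned entries rather than averaged between them.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(feature: Vector, codebook: List[Vector]) -> Tuple[int, Vector]:
    """Return the index and value of the codebook entry nearest to `feature`."""
    index = min(
        range(len(codebook)),
        key=lambda i: squared_distance(feature, codebook[i]),
    )
    return index, codebook[index]


def quantize_sequence(features: List[Vector], codebook: List[Vector]) -> List[int]:
    """Map each frame's feature to the index of its nearest codebook entry."""
    return [quantize(f, codebook)[0] for f in features]


# Hypothetical toy codebook with three 2-D entries (illustrative values only).
TOY_CODEBOOK: List[Vector] = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

if __name__ == "__main__":
    frames = [(0.1, 0.1), (0.9, 0.1), (0.2, 0.9)]
    print(quantize_sequence(frames, TOY_CODEBOOK))
```

In the paper's setting the codebook is learned by self-reconstruction over real facial motions and the codes are predicted autoregressively from speech; the lookup above covers only the discrete-selection step.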

Results

Task | Dataset | Metric | Value | Model
3D Human Pose Estimation | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | FDD | 4.117 | CodeTalker
3D Human Pose Estimation | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | Lip Vertex Error | 4.7914 | CodeTalker
3D Human Pose Estimation | BEAT2 | MSE | 8.026 | CodeTalker
Pose Estimation | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | FDD | 4.117 | CodeTalker
Pose Estimation | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | Lip Vertex Error | 4.7914 | CodeTalker
Pose Estimation | BEAT2 | MSE | 8.026 | CodeTalker
3D | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | FDD | 4.117 | CodeTalker
3D | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | Lip Vertex Error | 4.7914 | CodeTalker
3D | BEAT2 | MSE | 8.026 | CodeTalker
3D Face Animation | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | FDD | 4.117 | CodeTalker
3D Face Animation | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | Lip Vertex Error | 4.7914 | CodeTalker
3D Face Animation | BEAT2 | MSE | 8.026 | CodeTalker
2D Human Pose Estimation | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | FDD | 4.117 | CodeTalker
2D Human Pose Estimation | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | Lip Vertex Error | 4.7914 | CodeTalker
2D Human Pose Estimation | BEAT2 | MSE | 8.026 | CodeTalker
3D Absolute Human Pose Estimation | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | FDD | 4.117 | CodeTalker
3D Absolute Human Pose Estimation | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | Lip Vertex Error | 4.7914 | CodeTalker
3D Absolute Human Pose Estimation | BEAT2 | MSE | 8.026 | CodeTalker
1 Image, 2*2 Stitchi | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | FDD | 4.117 | CodeTalker
1 Image, 2*2 Stitchi | Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 | Lip Vertex Error | 4.7914 | CodeTalker
1 Image, 2*2 Stitchi | BEAT2 | MSE | 8.026 | CodeTalker