Generate Like Experts: Multi-Stage Font Generation by Incorporating Font Transfer Process into Diffusion Models

Bin Fu, Fanghua Yu, Anran Liu, Zixuan Wang, Jie Wen, Junjun He, Yu Qiao

2024-01-01 | CVPR 2024 | Font Generation | Disentanglement

Abstract

Few-shot font generation (FFG) produces stylized font images from a limited number of reference samples, which can significantly reduce the labor cost of manual font design. Most existing FFG methods follow the style-content disentanglement paradigm and employ a Generative Adversarial Network (GAN) to generate target fonts by combining the decoupled content and style representations. These methods generate the complicated structure and the detailed style simultaneously, which may be sub-optimal for the FFG task. Inspired by most manual font design processes of expert designers, in this paper we model font generation as a multi-stage generative process. Specifically, because the injected noise and the data distribution in diffusion models can be well separated into different sub-spaces, we are able to incorporate the font transfer process into these models. Based on this observation, we generalize diffusion methods to model the font generative process by separating the reverse diffusion process into three stages with different functions. The structure construction stage first generates the structure information for the target character based on the source image, and the font transfer stage subsequently transforms the source font to the target font. Finally, the font refinement stage enhances the appearance and local details of the target font images. Based on this multi-stage generative process, we construct our font generation framework, named MSD-Font, with a dual-network approach to generate font images. The superior performance demonstrates the effectiveness of our model. The code is available at: https://github.com/fubinfb/MSD-Font.
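
The three-stage split of the reverse diffusion process described in the abstract can be illustrated with a small sketch. This is not the MSD-Font implementation: the function names, the boundary parameters `structure_until` and `transfer_until`, and the step counts are all hypothetical, chosen only to show how a reverse timestep could be assigned to one of the three stages. The abstract does not state where the stage boundaries lie or how the two networks are divided among the stages, so the sketch leaves both open.

```python
from enum import Enum


class Stage(Enum):
    STRUCTURE_CONSTRUCTION = "structure construction"
    FONT_TRANSFER = "font transfer"
    FONT_REFINEMENT = "font refinement"


def stage_for_timestep(t: int, total_steps: int,
                       structure_until: int, transfer_until: int) -> Stage:
    """Assign reverse-diffusion timestep t to a stage.

    The reverse process runs from t = total_steps - 1 (most noise) to t = 0.
    Hypothetical boundaries (assumptions, not taken from the paper):
      structure construction: structure_until <= t < total_steps
      font transfer:          transfer_until  <= t < structure_until
      font refinement:        0               <= t < transfer_until
    """
    if not 0 < transfer_until < structure_until < total_steps:
        raise ValueError("need 0 < transfer_until < structure_until < total_steps")
    if not 0 <= t < total_steps:
        raise ValueError("t must lie in [0, total_steps)")
    if t >= structure_until:
        return Stage.STRUCTURE_CONSTRUCTION
    if t >= transfer_until:
        return Stage.FONT_TRANSFER
    return Stage.FONT_REFINEMENT


def stage_schedule(total_steps: int, structure_until: int,
                   transfer_until: int) -> list[Stage]:
    """Return the stages in the order the reverse process visits them."""
    return [
        stage_for_timestep(t, total_steps, structure_until, transfer_until)
        for t in range(total_steps - 1, -1, -1)
    ]


if __name__ == "__main__":
    # Toy numbers for illustration only.
    for stage in stage_schedule(total_steps=10, structure_until=7, transfer_until=3):
        print(stage.value)
```

In a sampler, a lookup like `stage_for_timestep` would decide, at each denoising step, which conditioning or network to apply; that mapping is left out here because the abstract does not specify it.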
