Papers With Code

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


ColorizeDiffusion v2: Enhancing Reference-based Sketch Colorization Through Separating Utilities

Dingkun Yan, Xinrui Wang, Yusuke Iwasawa, Yutaka Matsuo, Suguru Saito, Jiaxian Guo

2025-04-09 · Colorization

Abstract

Reference-based sketch colorization methods have garnered significant attention due to their potential applications in the animation production industry. However, most existing methods are trained with image triplets of sketch, reference, and ground truth that are semantically and spatially well-aligned, while real-world references and sketches often exhibit substantial misalignment. This mismatch in data distribution between training and inference leads to overfitting, consequently resulting in spatial artifacts and significant degradation in overall colorization quality, limiting potential applications of current methods for general purposes. To address this limitation, we conduct an in-depth analysis of the "carrier", defined as the latent representation facilitating information transfer from reference to sketch. Based on this analysis, we propose a novel workflow that dynamically adapts the carrier to optimize distinct aspects of colorization. Specifically, for spatially misaligned artifacts, we introduce a split cross-attention mechanism with spatial masks, enabling region-specific reference injection within the diffusion process. To mitigate semantic neglect of sketches, we employ dedicated background and style encoders to transfer detailed reference information in the latent feature space, achieving enhanced spatial control and richer detail synthesis. Furthermore, we propose character-mask merging and background bleaching as preprocessing steps to improve foreground-background integration and background generation. Extensive qualitative and quantitative evaluations, including a user study, demonstrate the superior performance of our proposed method compared to existing approaches. An ablation study further validates the efficacy of each proposed component.
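The split cross-attention mechanism described above restricts which reference tokens each sketch region may attend to. The paper's actual architecture is not reproduced here; the following is a minimal NumPy sketch of the general idea — region-masked cross-attention, where a boolean mask (e.g. derived from foreground/background segmentation) zeroes out attention from a query position to disallowed reference tokens. All names (`split_cross_attention`, `region_mask`) are illustrative assumptions, not the authors' API.

```python
import numpy as np

def split_cross_attention(q, k, v, region_mask):
    """Region-restricted cross-attention (illustrative sketch).

    q:           (B, Nq, D) sketch/latent queries
    k, v:        (B, Nk, D) reference keys/values
    region_mask: (B, Nq, Nk) boolean; True where a query position may
                 attend to a reference token (e.g. foreground queries
                 to foreground reference tokens). Each query row must
                 allow at least one token.
    """
    d = q.shape[-1]
    # scaled dot-product scores, then mask out disallowed pairs
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d)        # (B, Nq, Nk)
    scores = np.where(region_mask, scores, -np.inf)
    # numerically stable row-wise softmax over allowed tokens only
    scores -= scores.max(axis=-1, keepdims=True)
    w = np.exp(scores)                                  # exp(-inf) -> 0
    attn = w / w.sum(axis=-1, keepdims=True)
    return attn @ v                                     # (B, Nq, D)
```

With two disjoint masks (one per region), running this twice and summing the outputs yields a "split" injection: each spatial region of the sketch receives color information only from its matched reference region, which is the behavior the abstract attributes to the spatial masks.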

Related Papers

MTSIC: Multi-stage Transformer-based GAN for Spectral Infrared Image Colorization (2025-06-21)
Self-supervised Feature Extraction for Enhanced Ball Detection on Soccer Robots (2025-06-20)
Exploiting the Exact Denoising Posterior Score in Training-Free Guidance of Diffusion Models (2025-06-16)
SSIMBaD: Sigma Scaling with SSIM-Guided Balanced Diffusion for AnimeFace Colorization (2025-06-04)
Restoring Real-World Images with an Internal Detail Enhancement Diffusion Model (2025-05-24)
Leveraging the Powerful Attention of a Pre-trained Diffusion Model for Exemplar-based Image Colorization (2025-05-21)
Controllable Image Colorization with Instance-aware Texts and Masks (2025-05-13)
ColorVein: Colorful Cancelable Vein Biometrics (2025-04-19)