Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation

Dongjun Kim, Seungjae Shin, Kyungwoo Song, Wanmo Kang, Il-Chul Moon

2021-06-10 | Density Estimation, Image Generation

Abstract

Recent advances in diffusion models bring state-of-the-art performance on image generation tasks. However, empirical results from previous research on diffusion models imply an inverse correlation between density estimation and sample generation performance. This paper presents empirical evidence that this inverse correlation arises because density estimation is driven mainly by small diffusion times, whereas sample generation depends mainly on large diffusion times. Training a score network well across the entire diffusion time range is demanding, however, because the loss scale is significantly imbalanced across diffusion times. For successful training, we therefore introduce Soft Truncation, a universally applicable training technique for diffusion models that softens the fixed, static truncation hyperparameter into a random variable. In experiments, Soft Truncation achieves state-of-the-art performance on the CIFAR-10, CelebA, CelebA-HQ 256x256, and STL-10 datasets.
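
The idea of turning the truncation hyperparameter into a random variable can be illustrated with a minimal sketch. The abstract does not specify how the truncation bound is distributed, so everything concrete here is an assumption made for illustration: the power-law prior with density proportional to eps^(-k), the uniform sampling of diffusion times above the sampled bound, and the hypothetical names `sample_truncation`, `sample_diffusion_times`, `eps_min`, `t_max`, and `k`.

```python
import random


def sample_truncation(eps_min, t_max, k=1.0, rng=random):
    """Draw a truncation bound eps in [eps_min, t_max].

    Assumed prior (not taken from the abstract): density proportional
    to eps**(-k), sampled by inverting its CDF.
    """
    u = rng.random()
    if k == 1.0:
        # Density 1/eps is log-uniform on [eps_min, t_max].
        return eps_min * (t_max / eps_min) ** u
    a = 1.0 - k
    lo, hi = eps_min ** a, t_max ** a
    return (lo + u * (hi - lo)) ** (1.0 / a)


def sample_diffusion_times(batch_size, eps_min=1e-5, t_max=1.0, k=1.0, rng=random):
    """Sample one truncation bound per mini-batch, then times above it.

    With a fixed (hard) truncation, eps would be the constant eps_min.
    Here eps is redrawn for every mini-batch, so some batches cover
    small diffusion times and others only large ones.
    """
    eps = sample_truncation(eps_min, t_max, k, rng)
    times = [rng.uniform(eps, t_max) for _ in range(batch_size)]
    return eps, times


if __name__ == "__main__":
    rng = random.Random(0)
    for step in range(3):
        eps, times = sample_diffusion_times(4, rng=rng)
        print(f"step {step}: eps={eps:.2e}, min t={min(times):.4f}")
```

In a training loop, the sampled `times` would be passed to the score-matching loss in place of times drawn from a fixed interval; the loss itself is left out because the abstract does not describe it.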

Results

Task	Dataset	Metric	Value	Model
Image Generation	STL-10	FID	7.71	UNCSN++ (RVE) + ST
Image Generation	STL-10	Inception score	13.43	UNCSN++ (RVE) + ST
Image Generation	ImageNet 32x32	FID	8.42	DDPM++ (VP, NLL) + ST
Image Generation	ImageNet 32x32	Inception score	11.82	DDPM++ (VP, NLL) + ST
Image Generation	ImageNet 32x32	bits/dimension	3.85	DDPM++ (VP, NLL) + ST
Image Generation	CelebA 64x64	FID	1.9	DDPM++ (VP, FID) + ST
Image Generation	CelebA 64x64	bits/dimension	2.1	DDPM++ (VP, FID) + ST
Image Generation	CelebA 64x64	FID	2.9	DDPM++ (VP, NLL) + ST
Image Generation	CelebA 64x64	bits/dimension	1.96	DDPM++ (VP, NLL) + ST
Image Generation	CelebA 64x64	bits/dimension	1.97	UNCSN++ (RVE) + ST
Image Generation	FFHQ 256 x 256	FID	5.54	UDM (RVE) + ST
Image Generation	LSUN Bedroom 256 x 256	FID	4.57	UDM (RVE) + ST
Image Generation	CelebA-HQ 256x256	FID	7.16	UNCSN++ (RVE) + ST
