Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation

Mengyu Chu, You Xie, Jonas Mayer, Laura Leal-Taixé, Nils Thuerey

2018-11-23 | Super-Resolution | Motion Compensation | Video Super-Resolution | Image Super-Resolution | Translation | Video Generation

Abstract

Our work explores temporal self-supervision for GAN-based video generation tasks. While adversarial training successfully yields generative models for a variety of areas, temporal relationships in the generated data are much less explored. Natural temporal changes are crucial for sequential generation tasks, e.g. video super-resolution and unpaired video translation. For the former, state-of-the-art methods often favor simpler norm losses such as $L^2$ over adversarial training. However, their averaging nature easily leads to temporally smooth results with an undesirable lack of spatial detail. For unpaired video translation, existing approaches modify the generator networks to form spatio-temporal cycle consistencies. In contrast, we focus on improving learning objectives and propose a temporally self-supervised algorithm. For both tasks, we show that temporal adversarial learning is key to achieving temporally coherent solutions without sacrificing spatial detail. We also propose a novel Ping-Pong loss to improve the long-term temporal consistency. It effectively prevents recurrent networks from accumulating artifacts temporally without depressing detailed features. Additionally, we propose a first set of metrics to quantitatively evaluate the accuracy as well as the perceptual quality of the temporal evolution. A series of user studies confirm the rankings computed with these metrics. Code, data, models, and results are provided at https://github.com/thunil/TecoGAN. The project page https://ge.in.tum.de/publications/2019-tecogan-chu/ contains supplemental materials.
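The abstract names the Ping-Pong loss but does not define it. The sketch below illustrates one common reading of the idea, and every detail in it is an assumption rather than the authors' implementation: a recurrent generator is run over a sequence extended to play forward and then backward (a_1 ... a_n, a_(n-1) ... a_1), and the loss penalizes the L2 distance between the two outputs produced for the same input frame. The names `toy_generator`, `run_recurrent`, and `ping_pong_loss` are hypothetical, and frames are plain lists of floats to keep the example dependency-free.

```python
import math


def toy_generator(frame, prev_output):
    """Hypothetical recurrent generator: blends the input frame with the previous output."""
    return [0.8 * x + 0.2 * p for x, p in zip(frame, prev_output)]


def run_recurrent(frames, generator):
    """Apply the generator frame by frame, feeding each output into the next step."""
    outputs = []
    prev = [0.0] * len(frames[0])  # assumed all-zero initial state
    for frame in frames:
        prev = generator(frame, prev)
        outputs.append(prev)
    return outputs


def ping_pong_loss(frames, generator):
    """Sum of L2 distances between forward-pass and backward-pass outputs (assumed form)."""
    n = len(frames)
    # Ping-pong ordering: a_1 ... a_n followed by a_(n-1) ... a_1.
    sequence = frames + frames[-2::-1]
    outputs = run_recurrent(sequence, generator)
    loss = 0.0
    for t in range(n - 1):
        forward = outputs[t]
        backward = outputs[2 * n - 2 - t]  # same input frame, visited on the way back
        loss += math.sqrt(sum((f - b) ** 2 for f, b in zip(forward, backward)))
    return loss


if __name__ == "__main__":
    frames = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
    print(ping_pong_loss(frames, toy_generator))
```

A generator that ignores its recurrent state produces identical outputs in both directions, so the loss is zero. A generator whose output drifts as state accumulates yields a positive loss, which is the kind of temporal artifact accumulation the abstract says the loss is meant to prevent.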

Results

Dataset                                   Metric  Value  Model
MSU Video Upscalers: Quality Enhancement  PSNR    26.6   TecoGAN
MSU Video Upscalers: Quality Enhancement  SSIM    0.933  TecoGAN
MSU Video Upscalers: Quality Enhancement  VMAF    61.2   TecoGAN
Vid4 - 4x upscaling                       PSNR    25.89  TecoGAN⊖
Vid4 - 4x upscaling                       PSNR    25.57  TecoGAN

The page lists these same five results under each of the following task labels: Super-Resolution; 3D Human Pose Estimation; Video; Pose Estimation; 3D; 3D Face Animation; 2D Human Pose Estimation; 3D Absolute Human Pose Estimation; Video Super-Resolution; 3D Object Super-Resolution; 1 Image, 2*2 Stitchi.
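For readers unfamiliar with the PSNR metric reported in the results, the following is a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE). The page does not say how the benchmark values were computed (for example, which color channel was used or how frames were averaged), so the function name `psnr`, the flat pixel lists, and the 8-bit peak value of 255 are assumptions made for illustration only.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized flat pixel lists."""
    if not reference or len(reference) != len(test):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    reference = [100.0, 100.0, 100.0, 100.0]
    upscaled = [110.0, 90.0, 110.0, 90.0]
    print(round(psnr(reference, upscaled), 2))
```

Higher values mean a smaller pixel-wise error. PSNR measures only that pixel-wise error, which is why the results also report SSIM and VMAF.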

Related Papers

SpectraLift: Physics-Guided Spectral-Inversion Network for Self-Supervised Hyperspectral Image Super-Resolution (2025-07-17)
A Translation of Probabilistic Event Calculus into Markov Decision Processes (2025-07-17)
World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving (2025-07-17)
Leveraging Pre-Trained Visual Models for AI-Generated Video Detection (2025-07-17)
Taming Diffusion Transformer for Real-Time Mobile Video Generation (2025-07-17)
LoViC: Efficient Long Video Generation with Context Compression (2025-07-17)
Function-to-Style Guidance of LLMs for Code Translation (2025-07-15)
IM-LUT: Interpolation Mixing Look-Up Tables for Image Super-Resolution (2025-07-14)