Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Learning Joint Spatial-Temporal Transformations for Video Inpainting

Yanhong Zeng, Jianlong Fu, Hongyang Chao

Published: 2020-07-20 | Venue: ECCV 2020 8 | Tasks: Seeing Beyond the Visible, Video Inpainting

Abstract

High-quality video inpainting that completes missing regions in video frames is a promising yet challenging task. State-of-the-art approaches adopt attention models to complete a frame by searching missing contents from reference frames, and further complete whole videos frame by frame. However, these approaches can suffer from inconsistent attention results along spatial and temporal dimensions, which often leads to blurriness and temporal artifacts in videos. In this paper, we propose to learn a joint Spatial-Temporal Transformer Network (STTN) for video inpainting. Specifically, we simultaneously fill missing regions in all input frames by self-attention, and propose to optimize STTN by a spatial-temporal adversarial loss. To show the superiority of the proposed model, we conduct both quantitative and qualitative evaluations by using standard stationary masks and more realistic moving object masks. Demo videos are available at https://github.com/researchmm/STTN.
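The abstract's central idea, filling all frames at once by self-attention, can be illustrated with a small sketch. This is not the authors' STTN implementation: it assumes each frame has already been cut into the same number of patches, each flattened to a vector, and it omits learned projections, multiple heads, masking, and the adversarial loss. The names `softmax` and `joint_attention` are hypothetical. The point it shows is only that patches from every frame are pooled into one set, so each patch attends across space and time jointly rather than frame by frame.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(frames):
    """Scaled dot-product self-attention over patches pooled from all frames.

    frames: list of frames; each frame is a list of patch vectors (lists of
    floats). Assumption: every frame holds the same number of patches.
    Returns the attended patches regrouped into the same frame layout.
    """
    # Pool patches from every frame into one token sequence (space and time).
    tokens = [patch for frame in frames for patch in frame]
    dim = len(tokens[0])
    scale = math.sqrt(dim)

    attended = []
    for query in tokens:
        # Similarity of this patch to every patch in every frame.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        # Weighted sum of all patches, one output dimension at a time.
        attended.append(
            [sum(w * value[d] for w, value in zip(weights, tokens)) for d in range(dim)]
        )

    # Regroup the flat token list back into per-frame lists.
    per_frame = len(frames[0])
    return [attended[i:i + per_frame] for i in range(0, len(attended), per_frame)]


if __name__ == "__main__":
    # Two frames, two 2-dimensional patches each (toy values).
    video = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [0.0, 0.0]]]
    for frame in joint_attention(video):
        print(frame)
```

Because every output patch is a convex combination of all input patches, a patch in one frame can borrow content from any other frame in a single step, which is the property the abstract contrasts with frame-by-frame completion.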

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| 3D | DAVIS | Ewarp | 0.1449 | STTN |
| 3D | DAVIS | PSNR | 30.67 | STTN |
| 3D | DAVIS | SSIM | 0.956 | STTN |
| 3D | DAVIS | VFID | 0.149 | STTN |
| 3D | YouTube-VOS 2018 | Ewarp | 0.0907 | STTN |
| 3D | YouTube-VOS 2018 | PSNR | 32.34 | STTN |
| 3D | YouTube-VOS 2018 | SSIM | 0.9655 | STTN |
| 3D | YouTube-VOS 2018 | VFID | 0.053 | STTN |
| 3D | HQVI (240p) | LPIPS | 0.0528 | STTN |
| 3D | HQVI (240p) | PSNR | 29.64 | STTN |
| 3D | HQVI (240p) | SSIM | 0.9339 | STTN |
| 3D | HQVI (240p) | VFID | 0.2594 | STTN |
| Video Inpainting | DAVIS | Ewarp | 0.1449 | STTN |
| Video Inpainting | DAVIS | PSNR | 30.67 | STTN |
| Video Inpainting | DAVIS | SSIM | 0.956 | STTN |
| Video Inpainting | DAVIS | VFID | 0.149 | STTN |
| Video Inpainting | YouTube-VOS 2018 | Ewarp | 0.0907 | STTN |
| Video Inpainting | YouTube-VOS 2018 | PSNR | 32.34 | STTN |
| Video Inpainting | YouTube-VOS 2018 | SSIM | 0.9655 | STTN |
| Video Inpainting | YouTube-VOS 2018 | VFID | 0.053 | STTN |
| Video Inpainting | HQVI (240p) | LPIPS | 0.0528 | STTN |
| Video Inpainting | HQVI (240p) | PSNR | 29.64 | STTN |
| Video Inpainting | HQVI (240p) | SSIM | 0.9339 | STTN |
| Video Inpainting | HQVI (240p) | VFID | 0.2594 | STTN |
| Seeing Beyond the Visible | KITTI360-EXAverage | PSNR | 18.73 | STTN |
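For readers unfamiliar with the PSNR figures above, the standard definition is PSNR = 10 · log10(MAX² / MSE), in decibels, where higher is better. The sketch below computes it for two flat sequences of pixel values. It assumes 8-bit intensities (`max_value=255.0`); the function name `psnr` is illustrative, and the exact protocol behind the reported numbers (per-frame averaging, colour space, whether only the masked region is scored) is not stated on this page.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Assumption: pixel values lie in [0, max_value]; 255.0 suits 8-bit images.
    """
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [10.0, 20.0, 30.0, 40.0]
    inpainted = [11.0, 21.0, 29.0, 39.0]  # every pixel off by 1, so MSE = 1
    print(round(psnr(ground_truth, inpainted), 2))
```

For a video, one common choice is to compute this per frame and average over frames, but that is an assumption here rather than something the results table specifies.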

Related Papers

Video Virtual Try-on with Conditional Diffusion Transformer Inpainter (2025-06-26)
Let Your Video Listen to Your Music! (2025-06-23)
VideoPDE: Unified Generative PDE Solving via Video Inpainting Diffusion Models (2025-06-16)
Follow-Your-Creation: Empowering 4D Creation through Video Inpainting (2025-06-05)
DreamDance: Animating Character Art via Inpainting Stable Gaussian Worlds (2025-05-30)
Aquarius: A Family of Industry-Level Video Generation Models for Marketing Scenarios (2025-05-14)
DiTPainter: Efficient Video Inpainting with Diffusion Transformers (2025-04-22)
Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting (2025-04-15)