Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Deficiency-Aware Masked Transformer for Video Inpainting

Yongsheng Yu, Heng Fan, Libo Zhang

Date: 2023-07-17
Tasks: Optical Flow Estimation, Hallucination, Image Inpainting, Video Inpainting

Abstract

Recent video inpainting methods have made remarkable progress by utilizing explicit guidance, such as optical flow, to propagate cross-frame pixels. However, there are cases where cross-frame recurrence of the masked video is not available, resulting in a deficiency. In such situations, instead of borrowing pixels from other frames, the focus of the model shifts towards addressing the inverse problem. In this paper, we introduce a dual-modality-compatible inpainting framework called Deficiency-aware Masked Transformer (DMT), which offers three key advantages. Firstly, we pretrain an image inpainting model, DMT_img, to serve as a prior for distilling the video model DMT_vid, thereby benefiting the hallucination of deficiency cases. Secondly, the self-attention module selectively incorporates spatiotemporal tokens to accelerate inference and remove noise signals. Thirdly, a simple yet effective Receptive Field Contextualizer is integrated into DMT, further improving performance. Extensive experiments conducted on the YouTube-VOS and DAVIS datasets demonstrate that DMT_vid significantly outperforms previous solutions. The code and video demonstrations can be found at github.com/yeates/DMT.

Results

Task              Dataset             Metric   Value   Model
3D                DAVIS               PSNR     33.82   DMT
3D                DAVIS               SSIM     0.976   DMT
3D                DAVIS               VFID     0.104   DMT
3D                YouTube-VOS 2018    PSNR     34.27   DMT
3D                YouTube-VOS 2018    SSIM     0.973   DMT
3D                YouTube-VOS 2018    VFID     0.044   DMT
Video Inpainting  DAVIS               PSNR     33.82   DMT
Video Inpainting  DAVIS               SSIM     0.976   DMT
Video Inpainting  DAVIS               VFID     0.104   DMT
Video Inpainting  YouTube-VOS 2018    PSNR     34.27   DMT
Video Inpainting  YouTube-VOS 2018    SSIM     0.973   DMT
Video Inpainting  YouTube-VOS 2018    VFID     0.044   DMT
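
For readers unfamiliar with the PSNR metric reported above, the following is a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE), measured in dB (higher is better). It is not taken from the DMT code base: the function name `psnr`, the flat pixel lists, and the 8-bit peak value of 255 are assumptions made for illustration, and this page does not state how the paper aggregates the score across frames and videos.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Hypothetical helper for illustration; max_value=255.0 assumes 8-bit pixels.
    """
    if len(reference) != len(restored):
        raise ValueError("inputs must have the same number of pixels")
    if not reference:
        raise ValueError("inputs must not be empty")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        # Identical inputs: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: four ground-truth pixels and their inpainted counterparts.
reference = [52, 55, 61, 59]
restored = [54, 55, 60, 58]
print(round(psnr(reference, restored), 2))
```

In practice the score would be computed per frame over full images and then averaged, but that aggregation step is an assumption here rather than something this page specifies.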

Related Papers

Channel-wise Motion Features for Efficient Motion Segmentation (2025-07-17)
Mitigating Object Hallucinations via Sentence-Level Early Intervention (2025-07-16)
An Efficient Approach for Muscle Segmentation and 3D Reconstruction Using Keypoint Tracking in MRI Scan (2025-07-11)
ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way (2025-07-11)
RePaintGS: Reference-Guided Gaussian Splatting for Realistic and View-Consistent 3D Scene Inpainting (2025-07-11)
Learning to Track Any Points from Human Motion (2025-07-08)
UQLM: A Python Package for Uncertainty Quantification in Large Language Models (2025-07-08)
TLB-VFI: Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation (2025-07-07)