
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context

Zizhang Li, Mengmeng Wang, Huaijin Pi, Kechun Xu, Jianbiao Mei, Yong Liu

2022-07-17 · Video Reconstruction

Abstract

Recently, the image-wise implicit neural representation of videos, NeRV, has gained popularity for its promising results and its speed compared to regular pixel-wise implicit representations. However, redundant parameters within the network structure can lead to a large model size when scaling up for desirable performance. The key reason for this phenomenon is the coupled formulation of NeRV, which outputs the spatial and temporal information of video frames directly from the frame index input. In this paper, we propose E-NeRV, which dramatically expedites NeRV by decomposing the image-wise implicit neural representation into separate spatial and temporal contexts. Under the guidance of this new formulation, our model greatly reduces the redundant model parameters while retaining the representation ability. We experimentally find that our method improves performance to a large extent with fewer parameters, resulting in more than $8\times$ faster convergence. Code is available at https://github.com/kyleleey/E-NeRV.
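The idea of separating spatial and temporal context can be illustrated with a small NumPy sketch. This is not the E-NeRV architecture from the official repository: the function names, the sinusoidal encoding, the number of frequencies, and the concatenation-based fusion are all assumptions chosen for illustration. The point it shows is only that a spatial context built from pixel coordinates can be computed once and shared across frames, while the temporal context depends on the frame index alone.

```python
import numpy as np

def positional_encoding(values, num_freqs=4):
    """Sinusoidal encoding of values in [0, 1] (hypothetical choice of encoding)."""
    values = np.asarray(values, dtype=np.float64)[..., None]
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi
    angles = values * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

def temporal_context(t, num_freqs=4):
    """Context that depends only on the normalized frame index t."""
    return positional_encoding(t, num_freqs)  # shape: (2 * num_freqs,)

def spatial_context(height, width, num_freqs=4):
    """Context that depends only on pixel coordinates, shared by all frames."""
    ys = np.linspace(0.0, 1.0, height)
    xs = np.linspace(0.0, 1.0, width)
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")
    return np.concatenate(
        [positional_encoding(grid_y, num_freqs),
         positional_encoding(grid_x, num_freqs)],
        axis=-1,
    )  # shape: (height, width, 4 * num_freqs)

def fuse(spatial, temporal):
    """Assumed fusion: broadcast the temporal vector to every location and concatenate."""
    h, w, _ = spatial.shape
    tiled = np.broadcast_to(temporal, (h, w, temporal.shape[-1]))
    return np.concatenate([spatial, tiled], axis=-1)

# The spatial context is computed once; only the temporal part changes per frame.
spatial = spatial_context(9, 16)
features_a = fuse(spatial, temporal_context(0.25))
features_b = fuse(spatial, temporal_context(0.75))
```

In a coupled formulation, by contrast, a single embedding of the frame index would have to carry both kinds of information, which is the source of redundancy the abstract describes.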

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| 3D | UVG | Average PSNR (dB) | 34.85 | E-NeRV |
| Video Reconstruction | UVG | Average PSNR (dB) | 34.85 | E-NeRV |
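For readers unfamiliar with the metric, PSNR in dB is $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$, where MAX is the peak pixel value and MSE is the mean squared error between the reference and the reconstruction. The sketch below computes it with NumPy. The helper names `psnr` and `average_psnr` are hypothetical, and averaging per-frame PSNR values is an assumption: the listing does not state how the benchmark aggregates over frames or videos.

```python
import numpy as np

def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs: PSNR is unbounded
    return float(10.0 * np.log10(max_value ** 2 / mse))

def average_psnr(reference_frames, reconstructed_frames, max_value=255.0):
    """Mean of per-frame PSNR values (assumed aggregation, see the note above)."""
    values = [
        psnr(ref, rec, max_value)
        for ref, rec in zip(reference_frames, reconstructed_frames)
    ]
    return float(np.mean(values))

# Toy example: 8-bit frames where every pixel is off by exactly one level.
reference = [np.zeros((4, 4)), np.full((4, 4), 10.0)]
reconstructed = [np.ones((4, 4)), np.full((4, 4), 11.0)]
score = average_psnr(reference, reconstructed)
```

With an MSE of 1 on 8-bit data, each frame scores $20 \log_{10}(255)$ dB, so higher values indicate a closer reconstruction.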

Related Papers

GSVR: 2D Gaussian-based Video Representation for 800+ FPS with Hybrid Deformation Field (2025-07-08)
Quanta Diffusion (2025-06-07)
Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation (2025-06-04)
Compressing Human Body Video with Interactive Semantics: A Generative Approach (2025-05-22)
Motion Matters: Compact Gaussian Streaming for Free-Viewpoint Video Reconstruction (2025-05-22)
V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation (2025-05-22)
Learning Adaptive and Temporally Causal Video Tokenization in a 1D Latent Space (2025-05-22)
Few-shot Semantic Encoding and Decoding for Video Surveillance (2025-05-12)