Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis

Jonathan Tremblay, Moustafa Meshry, Alex Evans, Jan Kautz, Alexander Keller, Sameh Khamis, Thomas Müller, Charles Loop, Nathan Morrical, Koki Nagano, Towaki Takikawa, Stan Birchfield

Published: 2022-05-14 · Task: Novel View Synthesis

Abstract

We present a large-scale synthetic dataset for novel view synthesis consisting of ~300k images rendered from nearly 2000 complex scenes using high-quality ray tracing at high resolution (1600 x 1600 pixels). The dataset is orders of magnitude larger than existing synthetic datasets for novel view synthesis, thus providing a large unified benchmark for both training and evaluation. Using 4 distinct sources of high-quality 3D meshes, the scenes of our dataset exhibit challenging variations in camera views, lighting, shape, materials, and textures. Because our dataset is too large for existing methods to process, we propose Sparse Voxel Light Field (SVLF), an efficient voxel-based light field approach for novel view synthesis that achieves comparable performance to NeRF on synthetic data, while being an order of magnitude faster to train and two orders of magnitude faster to render. SVLF achieves this speed by relying on a sparse voxel octree, careful voxel sampling (requiring only a handful of queries per ray), and reduced network structure; as well as ground truth depth maps at training time. Our dataset is generated by NViSII, a Python-based ray tracing renderer, which is designed to be simple for non-experts to use and share, flexible and powerful through its use of scripting, and able to create high-quality and physically-based rendered images. Experiments with a subset of our dataset allow us to compare standard methods like NeRF and mip-NeRF for single-scene modeling, and pixelNeRF for category-level modeling, pointing toward the need for future improvements in this area.
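To make the sampling idea concrete, here is a minimal Python sketch (not the authors' code) of SVLF-style rendering: the ray is marched through an occupancy grid and the network is queried only at occupied voxels, so each ray needs just a handful of evaluations. Everything here is a hypothetical stand-in; a dense boolean grid with uniform stepping replaces the paper's sparse voxel octree traversal, and `tiny_mlp` replaces its reduced network.

```python
# Minimal sketch (not the authors' code) of the SVLF sampling idea:
# query a small network only where the ray crosses occupied voxels,
# then alpha-composite the handful of samples. A dense boolean grid
# and uniform stepping stand in for the paper's sparse voxel octree.
import numpy as np

def ray_voxel_samples(origin, direction, occupancy,
                      voxel_size=1.0, t_max=64.0, step=0.5):
    """Return the sample points where the ray crosses occupied voxels."""
    ts = np.arange(0.0, t_max, step)
    points = origin[None, :] + ts[:, None] * direction[None, :]
    idx = np.floor(points / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < np.array(occupancy.shape)), axis=1)
    keep = np.zeros(len(ts), dtype=bool)
    keep[inside] = occupancy[idx[inside, 0], idx[inside, 1], idx[inside, 2]]
    return points[keep]  # typically only a handful of points per ray

def tiny_mlp(points):
    """Hypothetical stand-in for SVLF's reduced network: (rgb, density)."""
    rgb = np.tile([0.5, 0.5, 0.5], (len(points), 1))
    sigma = np.ones(len(points))
    return rgb, sigma

def render_ray(origin, direction, occupancy, step=0.5):
    points = ray_voxel_samples(origin, direction, occupancy, step=step)
    if len(points) == 0:
        return np.zeros(3)                      # background
    rgb, sigma = tiny_mlp(points)
    alpha = 1.0 - np.exp(-sigma * step)         # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans                     # standard volume rendering
    return (weights[:, None] * rgb).sum(axis=0)

# Toy usage: a 16^3 grid with one occupied block, one ray through it.
occ = np.zeros((16, 16, 16), dtype=bool)
occ[6:10, 6:10, 6:10] = True
print(render_ray(np.array([0.0, 8.0, 8.0]), np.array([1.0, 0.0, 0.0]), occ))
```

Per the abstract, the real method additionally uses ground-truth depth maps at training time, which this sketch omits.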

Results

Task                 | Dataset | Metric | Value  | Model
Novel View Synthesis | RTMV    | PSNR   | 14.588 | Pixel-NeRF (env: Google Scan)
Novel View Synthesis | RTMV    | SSIM   | 0.483  | Pixel-NeRF (env: Google Scan)
Novel View Synthesis | RTMV    | PSNR   | 12.149 | Pixel-NeRF (env: ABC)
Novel View Synthesis | RTMV    | SSIM   | 0.629  | Pixel-NeRF (env: ABC)
Novel View Synthesis | RTMV    | PSNR   | 12.149 | Pixel-NeRF (env: Bricks)
Novel View Synthesis | RTMV    | SSIM   | 0.523  | Pixel-NeRF (env: Bricks)
Novel View Synthesis | RTMV    | PSNR   | 12.126 | Pixel-NeRF (env: Amz. Ber.)
Novel View Synthesis | RTMV    | SSIM   | 0.318  | Pixel-NeRF (env: Amz. Ber.)
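For reference, the two metrics in the table compare a rendered view against its ground-truth image: PSNR is 10 * log10(MAX^2 / MSE) in decibels, and SSIM measures local structural similarity. A minimal sketch using scikit-image, with hypothetical stand-in images (actual RTMV views are 1600 x 1600):

```python
# Minimal sketch of how the table's metrics are typically computed with
# scikit-image; both images are hypothetical stand-ins for an RTMV
# ground-truth view and a rendered prediction (floats in [0, 1]).
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
ground_truth = rng.random((256, 256, 3)).astype(np.float32)   # stand-in GT
noise = 0.05 * rng.standard_normal(ground_truth.shape)
rendered = np.clip(ground_truth + noise, 0.0, 1.0).astype(np.float32)

psnr = peak_signal_noise_ratio(ground_truth, rendered, data_range=1.0)
ssim = structural_similarity(ground_truth, rendered,
                             data_range=1.0, channel_axis=-1)
print(f"PSNR: {psnr:.3f} dB, SSIM: {ssim:.3f}")
```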

Related Papers

Physically Based Neural LiDAR Resimulation (2025-07-15)
MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second (2025-07-14)
Cameras as Relative Positional Encoding (2025-07-14)
LighthouseGS: Indoor Structure-aware 3D Gaussian Splatting for Panorama-Style Mobile Captures (2025-07-08)
Reflections Unlock: Geometry-Aware Reflection Disentanglement in 3D Gaussian Splatting for Photorealistic Scenes Rendering (2025-07-08)
Outdoor Monocular SLAM with Global Scale-Consistent 3D Gaussian Pointmaps (2025-07-04)
Refine Any Object in Any Scene (2025-06-30)
VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding (2025-06-28)