TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/UMIFormer: Mining the Correlations between Similar Tokens ...

UMIFormer: Mining the Correlations between Similar Tokens for Multi-View 3D Reconstruction

Zhenwei Zhu, Liying Yang, Ning li, Chaohao Jiang, Yanyan Liang

2023-02-27ICCV 2023 13D Object Reconstruction3D ReconstructionSingle-View 3D Reconstruction
PaperPDFCode(official)

Abstract

In recent years, many video tasks have achieved breakthroughs by utilizing the vision transformer and establishing spatial-temporal decoupling for feature extraction. Although multi-view 3D reconstruction also faces multiple images as input, it cannot immediately inherit their success due to completely ambiguous associations between unstructured views. There is not usable prior relationship, which is similar to the temporally-coherence property in a video. To solve this problem, we propose a novel transformer network for Unstructured Multiple Images (UMIFormer). It exploits transformer blocks for decoupled intra-view encoding and designed blocks for token rectification that mine the correlation between similar tokens from different views to achieve decoupled inter-view encoding. Afterward, all tokens acquired from various branches are compressed into a fixed-size compact representation while preserving rich information for reconstruction by leveraging the similarities between tokens. We empirically demonstrate on ShapeNet and confirm that our decoupled learning method is adaptable for unstructured multiple images. Meanwhile, the experiments also verify our model outperforms existing SOTA methods by a large margin. Code will be available at https://github.com/GaryZhu1996/UMIFormer.

Results

TaskDatasetMetricValueModel
Object ReconstructionData3D−R2N23DIoU0.68UMIFormer
3D Object ReconstructionData3D−R2N23DIoU0.68UMIFormer

Related Papers

AutoPartGen: Autogressive 3D Part Generation and Discovery2025-07-17SpatialTrackerV2: 3D Point Tracking Made Easy2025-07-16BRUM: Robust 3D Vehicle Reconstruction from 360 Sparse Images2025-07-16Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation2025-07-15Binomial Self-Compensation: Mechanism and Suppression of Motion Error in Phase-Shifting Profilometry2025-07-14An Efficient Approach for Muscle Segmentation and 3D Reconstruction Using Keypoint Tracking in MRI Scan2025-07-11Review of Feed-forward 3D Reconstruction: From DUSt3R to VGGT2025-07-11DreamGrasp: Zero-Shot 3D Multi-Object Reconstruction from Partial-View Images for Robotic Manipulation2025-07-08