Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich
This paper introduces SuperGlue, a neural network that matches two sets of local features by jointly finding correspondences and rejecting non-matchable points. Assignments are estimated by solving a differentiable optimal transport problem, whose costs are predicted by a graph neural network. We introduce a flexible context aggregation mechanism based on attention, enabling SuperGlue to reason about the underlying 3D scene and feature assignments jointly. Compared to traditional, hand-designed heuristics, our technique learns priors over geometric transformations and regularities of the 3D world through end-to-end training from image pairs. SuperGlue outperforms other learned approaches and achieves state-of-the-art results on the task of pose estimation in challenging real-world indoor and outdoor environments. The proposed method performs matching in real-time on a modern GPU and can be readily integrated into modern SfM or SLAM systems. The code and trained weights are publicly available at https://github.com/magicleap/SuperGluePretrainedNetwork.
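The assignment step described in the abstract can be illustrated with a small, self-contained sketch. It is not the released SuperGlue code: it is a NumPy approximation of entropy-regularized optimal transport solved with log-domain Sinkhorn iterations, in which an extra "dustbin" row and column absorb non-matchable points. The function names (`logsumexp`, `sinkhorn_with_dustbin`), the fixed `dustbin_score`, and the toy score matrix are assumptions made for illustration; in the actual method the scores are predicted by the graph neural network.

```python
import numpy as np

def logsumexp(x, axis):
    # Numerically stable log(sum(exp(x))) along one axis.
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))

def sinkhorn_with_dustbin(scores, dustbin_score=0.0, n_iters=100):
    """Soft assignment between m and n features with a dustbin (illustrative sketch).

    scores: (m, n) array of pairwise matching scores (higher means more similar).
    Returns an (m + 1, n + 1) array; the last row and column are the dustbins.
    """
    m, n = scores.shape
    # Augment the score matrix with one dustbin row and one dustbin column.
    aug = np.full((m + 1, n + 1), dustbin_score, dtype=np.float64)
    aug[:m, :n] = scores
    # Target marginals: every real feature carries mass 1; each dustbin can
    # absorb all features of the other set.
    log_mu = np.log(np.concatenate([np.ones(m), [n]]))
    log_nu = np.log(np.concatenate([np.ones(n), [m]]))
    u = np.zeros(m + 1)
    v = np.zeros(n + 1)
    for _ in range(n_iters):
        # Alternate row and column normalization in the log domain.
        u = log_mu - logsumexp(aug + v[None, :], axis=1)
        v = log_nu - logsumexp(aug + u[:, None], axis=0)
    return np.exp(aug + u[:, None] + v[None, :])

# Toy example: features 0 and 1 of set A match features 0 and 1 of set B;
# feature 2 of set A has no good counterpart.
toy_scores = np.array([[2.0, -2.0], [-2.0, 2.0], [-2.0, -2.0]])
assignment = sinkhorn_with_dustbin(toy_scores)
matches = assignment[:3].argmax(axis=1)  # column index 2 is the dustbin
print(matches)
```

Because every step is a differentiable operation, the same iteration written in an autodiff framework would let gradients flow from the assignment back to the scores, which is what makes end-to-end training from image pairs possible.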
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Visual Localization | Aachen Day-Night v1.1 Benchmark | Acc@0.25m, 2° | 77 | SuperGlue |
| Visual Localization | Aachen Day-Night v1.1 Benchmark | Acc@0.5m, 5° | 90.6 | SuperGlue |
| Visual Localization | Aachen Day-Night v1.1 Benchmark | Acc@5m, 10° | 100 | SuperGlue |
| Pose Estimation | InLoc | DUC1-Acc@0.25m,10° | 49 | SuperGlue |
| Pose Estimation | InLoc | DUC1-Acc@0.5m,10° | 68.7 | SuperGlue |
| Pose Estimation | InLoc | DUC1-Acc@1.0m,10° | 80.8 | SuperGlue |
| Pose Estimation | InLoc | DUC2-Acc@0.25m,10° | 53.4 | SuperGlue |
| Pose Estimation | InLoc | DUC2-Acc@0.5m,10° | 77.1 | SuperGlue |
| Pose Estimation | InLoc | DUC2-Acc@1.0m,10° | 82.4 | SuperGlue |
| Visual Place Recognition | Berlin Kudamm | Recall@1 | 59.64 | SuperPoint & SuperGlue |
| Image Matching | IMC PhotoTourism | mean average accuracy @ 10 | 0.65248 | SuperGlue |
| Image Matching | ZEB | Mean AUC@5° | 31.2 | SuperGlue |