Metric: nDTW (normalized Dynamic Time Warping; higher is better)
| Rank | Model | nDTW | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | MARVAL | 66.76 | Yes | A New Path: Scaling Vision-and-Language Navigati... | 2022-10-06 | - |
| 2 | EnvEdit-PT | 64.61 | Yes | EnvEdit: Environment Editing for Vision-and-Lang... | 2022-03-29 | Code |
| 3 | HAMT | 59.94 | No | History Aware Multimodal Transformer for Vision-... | 2021-10-25 | Code |
| 4 | CLEAR-CLIP | 53.69 | Yes | How Much Can CLIP Benefit Vision-and-Language Ta... | 2021-07-13 | Code |
| 5 | Monolingual Baseline | 41.05 | No | Room-Across-Room: Multilingual Vision-and-Langua... | 2020-10-15 | Code |
| 6 | Multilingual Baseline | 36.81 | No | Room-Across-Room: Multilingual Vision-and-Langua... | 2020-10-15 | Code |
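
For readers unfamiliar with the metric, the following is a minimal sketch of how an nDTW score is commonly computed: the dynamic time warping cost between a reference path and a predicted path, normalized by the reference length and a distance threshold, then passed through a negative exponential. The function names, the point representation (tuples of coordinates), and the default threshold of 3.0 are assumptions made for illustration, not details taken from this leaderboard. The table above appears to report the score multiplied by 100.

```python
import math


def dtw_distance(reference, query):
    """Dynamic time warping cost between two paths given as sequences of points."""
    n, m = len(reference), len(query)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning reference[:i] with query[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(reference[i - 1], query[j - 1])  # Euclidean distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance along the reference only
                cost[i][j - 1],      # advance along the query only
                cost[i - 1][j - 1],  # advance along both
            )
    return cost[n][m]


def ndtw(reference, query, threshold=3.0):
    """nDTW = exp(-DTW / (|reference| * threshold)); threshold is an assumed value."""
    return math.exp(-dtw_distance(reference, query) / (len(reference) * threshold))


if __name__ == "__main__":
    ref = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    shifted = [(x, y + 3.0) for x, y in ref]  # same shape, offset by 3 units
    print(ndtw(ref, ref))      # identical paths give the maximum score of 1.0
    print(ndtw(ref, shifted))  # lower score for the offset path
```

A score of 1.0 means the predicted path matches the reference exactly, and the score decays toward 0 as the paths diverge, which is why higher values rank first in the table.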