Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

Liyiming Ke, Xiujun Li, Yonatan Bisk, Ari Holtzman, Zhe Gan, Jingjing Liu, Jianfeng Gao, Yejin Choi, Siddhartha Srinivasa

2019-03-06 · CVPR 2019 · Vision-Language Navigation · Vision and Language Navigation
Paper · PDF · Code (official)

Abstract

We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding that achieves state-of-the-art results on the Room-to-Room (R2R) Vision-and-Language navigation challenge of Anderson et al. (2018). Given a natural language instruction and photo-realistic image views of a previously unseen environment, the agent is tasked with navigating from source to target location as quickly as possible. While all current approaches make local action decisions or score entire trajectories using beam search, ours balances local and global signals when exploring an unobserved environment. Importantly, this lets us act greedily but use global signals to backtrack when necessary. Applying the FAST framework to existing state-of-the-art models achieved a 17% relative gain, an absolute 6% gain on Success rate weighted by Path Length (SPL).
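The core idea of the abstract — act greedily on local signals, but keep a frontier of partial trajectories and backtrack to the globally best one when the current path stops looking promising — can be sketched with a best-first search. This is a toy illustration, not the paper's method: the names (`fast_search`, `neighbors`, `local_score`) are hypothetical, and the real FAST Navigator scores trajectories with learned progress monitors rather than the fixed edge scores assumed here.

```python
import heapq

def fast_search(start, goal, neighbors, local_score, max_steps=100):
    """Toy frontier search with backtracking.

    Assumptions (not from the paper): `neighbors(node)` returns adjacent
    viewpoints, `local_score(a, b)` is a per-step action score, and a
    partial trajectory's global score is the sum of its local scores.
    """
    # Frontier holds (negated global score, path) so heapq pops the best.
    frontier = [(-0.0, [start])]
    visited = set()
    for _ in range(max_steps):
        if not frontier:
            break
        # Backtracking is implicit: we always resume from the globally
        # best partial trajectory, which need not extend the last one.
        neg_g, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt in neighbors(node):
            if nxt not in visited:
                g = -neg_g + local_score(node, nxt)
                heapq.heappush(frontier, (-g, path + [nxt]))
    return None

# Tiny example: the locally best first step ("A" -> "B") is a dead end,
# so the search backtracks to the frontier node "C" and reaches "G".
graph = {"A": ["B", "C"], "B": [], "C": ["G"], "G": []}
scores = {("A", "B"): 2.0, ("A", "C"): 1.0, ("C", "G"): 1.0}
path = fast_search("A", "G", graph.__getitem__, lambda a, b: scores[(a, b)])
```

A purely greedy decoder would commit to "B" and fail; a full beam search would score all trajectories. The frontier-with-backtracking middle ground is what the abstract describes.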

Results

Task | Dataset | Metric | Value | Model
Vision-Language Navigation | Room2Room | spl | 0.41 | Tactical Rewind - short
Vision and Language Navigation | VLN Challenge | error | 4.29 | Tactical Rewind - long
Vision and Language Navigation | VLN Challenge | length | 196.53 | Tactical Rewind - long
Vision and Language Navigation | VLN Challenge | oracle success | 0.9 | Tactical Rewind - long
Vision and Language Navigation | VLN Challenge | spl | 0.03 | Tactical Rewind - long
Vision and Language Navigation | VLN Challenge | success | 0.61 | Tactical Rewind - long
Vision and Language Navigation | VLN Challenge | error | 5.14 | Tactical Rewind - short
Vision and Language Navigation | VLN Challenge | length | 22.08 | Tactical Rewind - short
Vision and Language Navigation | VLN Challenge | oracle success | 0.64 | Tactical Rewind - short
Vision and Language Navigation | VLN Challenge | spl | 0.41 | Tactical Rewind - short
Vision and Language Navigation | VLN Challenge | success | 0.54 | Tactical Rewind - short
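The headline metric above, SPL (Success weighted by Path Length), was introduced by Anderson et al. (2018) and penalizes successful episodes that take longer paths than necessary:

```latex
\mathrm{SPL} = \frac{1}{N} \sum_{i=1}^{N} S_i \, \frac{l_i}{\max(p_i, l_i)}
```

where $N$ is the number of episodes, $S_i$ is a binary success indicator, $l_i$ is the shortest-path length from start to goal, and $p_i$ is the length of the agent's path. Note how it separates the two model variants here: the long-trajectory model raises success (0.61 vs. 0.54) but its path length of 196.53 collapses SPL to 0.03, while the short variant keeps SPL at 0.41.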

Related Papers

SE-VLN: A Self-Evolving Vision-Language Navigation Framework Based on Multimodal Large Language Models (2025-07-17)
Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities (2025-07-17)
NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments (2025-06-30)
VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning (2025-06-20)
Grounded Vision-Language Navigation for UAVs with Open-Vocabulary Goal Understanding (2025-06-12)
A Navigation Framework Utilizing Vision-Language Models (2025-06-11)
Generating Vision-Language Navigation Instructions Incorporated Fine-Grained Alignment Annotations (2025-06-10)
Active Test-time Vision-Language Navigation (2025-06-07)