


The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation

Chih-Yao Ma, Zuxuan Wu, Ghassan AlRegib, Caiming Xiong, Zsolt Kira

2019-03-05 | CVPR 2019 6 | Vision-Language Navigation, Visual Navigation, Vision and Language Navigation, Decision Making

Abstract

As deep learning continues to make progress for challenging perception tasks, there is increased interest in combining vision, language, and decision-making. Specifically, the Vision and Language Navigation (VLN) task involves navigating to a goal purely from language instructions and visual information without explicit knowledge of the goal. Recent successful approaches have made inroads in achieving good success rates for this task but rely on beam search, which thoroughly explores a large number of trajectories and is unrealistic for applications such as robotics. In this paper, inspired by the intuition of viewing the problem as search on a navigation graph, we propose to use a progress monitor developed in prior work as a learnable heuristic for search. We then propose two modules incorporated into an end-to-end architecture: 1) a learned mechanism to perform backtracking, which decides whether to continue moving forward or roll back to a previous state (Regret Module), and 2) a mechanism to help the agent decide which direction to go next by showing directions that are visited and their associated progress estimate (Progress Marker). Combined, the proposed approach significantly outperforms current state-of-the-art methods using greedy action selection, with a 5% absolute improvement on the test server in success rates, and more importantly 8% on success rates normalized by the path length. Our code is available at https://github.com/chihyaoma/regretful-agent.
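The idea of moving forward or rolling back based on a progress estimate can be illustrated with a toy example. The sketch below is not the authors' Regret Module, which is learned end-to-end; it swaps in a hand-written rule (roll back whenever the estimated progress drops after a forward move) and a naive policy that tries unvisited neighbors in listed order. The function name `navigate_with_rollback`, the toy graph, and the progress values are all hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List


def navigate_with_rollback(
    graph: Dict[str, List[str]],
    start: str,
    goal: str,
    progress: Callable[[str], float],
    max_steps: int = 20,
) -> List[str]:
    """Walk a graph greedily, rolling back when estimated progress drops.

    Hand-written stand-in for a learned backtracking decision (assumption).
    Returns the full trajectory, including rollback moves.
    """
    path = [start]      # every node the agent stands on, in order
    stack = [start]     # current chain of forward moves
    visited = {start}   # directions already tried (crude progress marker)

    for _ in range(max_steps):
        current = stack[-1]
        if current == goal:
            break

        # "Regret": the last forward move lowered the progress estimate.
        if len(stack) > 1 and progress(current) < progress(stack[-2]):
            stack.pop()
            path.append(stack[-1])
            continue

        candidates = [n for n in graph[current] if n not in visited]
        if not candidates:
            if len(stack) == 1:
                break               # nowhere left to go
            stack.pop()             # dead end: roll back one step
            path.append(stack[-1])
            continue

        nxt = candidates[0]         # naive policy: first untried neighbor
        visited.add(nxt)
        stack.append(nxt)
        path.append(nxt)

    return path


# Toy navigation graph and made-up progress estimates in [0, 1].
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "E"],
    "D": ["B"],
    "E": ["C"],
}
progress_estimates = {"A": 0.2, "B": 0.1, "C": 0.6, "D": 0.0, "E": 1.0}

path = navigate_with_rollback(graph, "A", "E", progress_estimates.__getitem__)
print(path)
```

In this toy run the agent first steps to B, sees the estimate fall from 0.2 to 0.1, returns to A, and then reaches the goal through C. Only one trajectory is followed at a time, which is the contrast with beam search that the abstract draws.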

Results

Task | Dataset | Metric | Value | Model
Vision and Language Navigation | VLN Challenge | error | 5.69 | The Regretful Agent (no beam search; greedy action selection)
Vision and Language Navigation | VLN Challenge | length | 13.69 | The Regretful Agent (no beam search; greedy action selection)
Vision and Language Navigation | VLN Challenge | oracle success | 0.56 | The Regretful Agent (no beam search; greedy action selection)
Vision and Language Navigation | VLN Challenge | spl | 0.4 | The Regretful Agent (no beam search; greedy action selection)
Vision and Language Navigation | VLN Challenge | success | 0.48 | The Regretful Agent (no beam search; greedy action selection)
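The "spl" row corresponds to the "success rates normalized by the path length" mentioned in the abstract. Assuming the leaderboard uses the common definition of Success weighted by Path Length, each episode contributes its success flag multiplied by the ratio of the shortest-path length to the larger of the shortest-path length and the length actually travelled, and the contributions are averaged over episodes. The sketch below computes that quantity; the function and argument names, and the example numbers, are illustrative and not taken from the benchmark's evaluation code.

```python
from typing import Sequence


def spl(
    successes: Sequence[int],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, averaged over episodes.

    successes[i]        -- 1 if episode i ended at the goal, else 0
    shortest_lengths[i] -- shortest-path distance from start to goal
    path_lengths[i]     -- distance the agent actually travelled
    """
    if not (len(successes) == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same number of episodes")
    if len(successes) == 0:
        raise ValueError("at least one episode is required")

    total = 0.0
    for s, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        denom = max(taken, shortest)
        # If start and goal coincide and the agent did not move, treat the
        # episode as perfectly efficient (an assumption for this edge case).
        ratio = 1.0 if denom == 0 else shortest / denom
        total += s * ratio
    return total / len(successes)


# Three made-up episodes: one detour, one failure, one optimal path.
score = spl([1, 0, 1], [10.0, 8.0, 5.0], [12.5, 8.0, 5.0])
print(round(score, 3))
```

A successful episode that takes a longer route than necessary scores below 1, so an agent that backtracks often pays for the extra distance in this metric even when it eventually reaches the goal.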

Related Papers

Graph-Structured Data Analysis of Component Failure in Autonomous Cargo Ships Based on Feature Fusion (2025-07-18)
SE-VLN: A Self-Evolving Vision-Language Navigation Framework Based on Multimodal Large Language Models (2025-07-17)
Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities (2025-07-17)
Higher-Order Pattern Unification Modulo Similarity Relations (2025-07-17)
Exploiting Constraint Reasoning to Build Graphical Explanations for Mixed-Integer Linear Programming (2025-07-17)
Acting and Planning with Hierarchical Operational Models on a Mobile Robot: A Study with RAE+UPOM (2025-07-15)
CogDDN: A Cognitive Demand-Driven Navigation with Decision Optimization and Dual-Process Thinking (2025-07-15)
Detección y Cuantificación de Erosión Fluvial con Visión Artificial (2025-07-15)