
Environment-agnostic Multitask Learning for Natural Language Grounded Navigation

Xin Eric Wang, Vihan Jain, Eugene Ie, William Yang Wang, Zornitsa Kozareva, Sujith Ravi

2020-03-01 · ECCV 2020 · Vision-Language Navigation
Paper · PDF · Code (official)

Abstract

Recent research efforts enable the study of natural language grounded navigation in photo-realistic environments, e.g., following natural language instructions or dialog. However, existing methods tend to overfit the training data in seen environments and fail to generalize well to previously unseen environments. To close the gap between seen and unseen environments, we aim at learning a generalized navigation model from two novel perspectives: (1) we introduce a multitask navigation model that can be seamlessly trained on both Vision-Language Navigation (VLN) and Navigation from Dialog History (NDH) tasks, which benefits from richer natural language guidance and effectively transfers knowledge across tasks; (2) we propose to learn environment-agnostic representations for the navigation policy that are invariant among the environments seen during training, thus generalizing better to unseen environments. Extensive experiments show that environment-agnostic multitask learning significantly reduces the performance gap between seen and unseen environments, and that the resulting navigation agent outperforms baselines on unseen environments by 16% (relative success rate) on VLN and 120% (goal progress) on NDH. Our submission to the CVDN leaderboard establishes a new state-of-the-art for the NDH task on the holdout test set. Code is available at https://github.com/google-research/valan.
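The second ingredient described in the abstract, representations that are invariant across the training environments, is commonly enforced with domain-adversarial training: an auxiliary classifier tries to identify which seen environment an observation came from, while a gradient reversal layer trains the shared encoder to defeat it. Below is a minimal PyTorch sketch of that pattern. The module names, feature sizes, environment count, and weight lam are illustrative assumptions, not the authors' code (their implementation lives at https://github.com/google-research/valan).

import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; multiplies gradients by -lam in backward,
    # so the encoder is pushed to *hurt* the environment classifier.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class EnvAgnosticEncoder(nn.Module):
    # Shared observation encoder plus an adversarial environment head.
    def __init__(self, obs_dim=2048, hid_dim=512, num_envs=61, lam=0.5):
        super().__init__()
        self.lam = lam
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hid_dim), nn.ReLU())
        self.env_head = nn.Linear(hid_dim, num_envs)  # predicts environment id

    def forward(self, obs):
        h = self.encoder(obs)  # features consumed by the navigation policy
        env_logits = self.env_head(GradReverse.apply(h, self.lam))
        return h, env_logits

# Toy usage: the adversarial cross-entropy is simply added to the navigation
# loss. The env head receives normal gradients (it learns to classify), while
# the encoder receives reversed ones, so h drifts toward features that carry
# no information about which training environment produced the observation.
enc = EnvAgnosticEncoder()
obs = torch.randn(8, 2048)            # e.g. pre-extracted image features
env_ids = torch.randint(0, 61, (8,))  # seen-environment label per sample
h, env_logits = enc(obs)
adv_loss = F.cross_entropy(env_logits, env_ids)
adv_loss.backward()

In the multitask setting, the same encoder (and policy) is shared across VLN and NDH batches, so the adversarial term regularizes both tasks at once.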

Results

Task | Dataset | Metric | Value | Model
Visual Navigation | Cooperative Vision-and-Dialogue Navigation | dist_to_end_reduction | 3.91 | Environment-agnostic Multitask Learning
Visual Navigation | Cooperative Vision-and-Dialogue Navigation | spl | 0.17 | Environment-agnostic Multitask Learning
Vision and Language Navigation | VLN Challenge | error | 6.03 | Environment-agnostic Multitask Learning
Vision and Language Navigation | VLN Challenge | length | 13.35 | Environment-agnostic Multitask Learning
Vision and Language Navigation | VLN Challenge | oracle success | 0.56 | Environment-agnostic Multitask Learning
Vision and Language Navigation | VLN Challenge | spl | 0.4 | Environment-agnostic Multitask Learning
Vision and Language Navigation | VLN Challenge | success | 0.45 | Environment-agnostic Multitask Learning
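For reference when reading the table: success and oracle success are success rates reported on a 0-1 scale, dist_to_end_reduction on CVDN measures goal progress (reduction in distance to the goal, in meters), and spl is Success weighted by Path Length (Anderson et al., 2018). The standard SPL definition, not given on this page, is:

$$\mathrm{SPL} = \frac{1}{N}\sum_{i=1}^{N} S_i\,\frac{\ell_i}{\max(p_i,\ \ell_i)}$$

where, for episode $i$, $S_i \in \{0, 1\}$ marks success, $\ell_i$ is the shortest-path distance from start to goal, and $p_i$ is the length of the path the agent actually took.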

Related Papers

SE-VLN: A Self-Evolving Vision-Language Navigation Framework Based on Multimodal Large Language Models (2025-07-17)
VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning (2025-06-20)
Grounded Vision-Language Navigation for UAVs with Open-Vocabulary Goal Understanding (2025-06-12)
Generating Vision-Language Navigation Instructions Incorporated Fine-Grained Alignment Annotations (2025-06-10)
Active Test-time Vision-Language Navigation (2025-06-07)
EvolveNav: Self-Improving Embodied Reasoning for LLM-Based Vision-Language Navigation (2025-06-02)
Automated Data Curation Using GPS & NLP to Generate Instruction-Action Pairs for Autonomous Vehicle Vision-Language Navigation Datasets (2025-05-06)
UAV-VLN: End-to-End Vision Language guided Navigation for UAVs (2025-04-30)