Vision-Language Navigation

1 benchmark, 81 papers

Vision-language navigation (VLN) is the task in which an embodied agent follows natural-language instructions to navigate inside real 3D environments.
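
To make the task definition concrete, here is a minimal toy sketch of one VLN episode: the agent receives an instruction, repeatedly picks a move or stops, and is judged on whether it ends at the goal. Everything in it is an assumption made for illustration: the `GRAPH` of named viewpoints, the `Episode` fields, and `keyword_policy`, which simply matches room names in the instruction and stands in for a learned model that would consume real visual observations. It is not taken from any benchmark or published method.

```python
from dataclasses import dataclass

# Hypothetical navigation graph: each viewpoint lists the viewpoints
# reachable from it. Room names stand in for visual observations.
GRAPH = {
    "hallway": ["kitchen", "bedroom"],
    "kitchen": ["hallway", "pantry"],
    "bedroom": ["hallway"],
    "pantry": ["kitchen"],
}


@dataclass
class Episode:
    instruction: str    # natural-language instruction given to the agent
    start: str          # viewpoint where the agent begins
    goal: str           # viewpoint where the agent should stop
    max_steps: int = 10


def keyword_policy(instruction, neighbors, visited):
    """Toy stand-in for a learned policy (an assumption, not a real method).

    Moves to an unvisited neighbor whose name appears in the instruction,
    preferring the one mentioned earliest; otherwise returns "STOP".
    """
    text = instruction.lower()
    candidates = [n for n in neighbors if n in text and n not in visited]
    if not candidates:
        return "STOP"
    return min(candidates, key=text.index)


def run_episode(episode):
    """Roll out the policy and report the path and whether the goal was reached."""
    current = episode.start
    path = [current]
    for _ in range(episode.max_steps):
        action = keyword_policy(episode.instruction, GRAPH[current], set(path))
        if action == "STOP":
            break
        current = action
        path.append(current)
    return path, current == episode.goal


if __name__ == "__main__":
    episode = Episode(
        instruction="Walk through the kitchen and stop in the pantry.",
        start="hallway",
        goal="pantry",
    )
    path, success = run_episode(episode)
    print(path, success)  # ['hallway', 'kitchen', 'pantry'] True
```

The loop has the same shape a real agent would have (observe, act, stop, evaluate), but a real system replaces the keyword match with a model that grounds the instruction in images from the environment.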

(Image credit: Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout)

Benchmarks

Vision-Language Navigation on Room2Room