VLN-NF: Feasibility-Aware Vision-and-Language Navigation with False-Premise Instructions

Conference: ACL 2026
arXiv: 2604.10533
Code: https://vln-nf.github.io/
Area: Robotics & Embodied AI
Keywords: Vision-Language Navigation, False Premise, NOT-FOUND, Embodied Exploration, Feasibility Awareness

TL;DR

VLN-NF is the first benchmark that requires Vision-and-Language Navigation (VLN) agents to recognize false-premise instructions and output NOT-FOUND in 3D, partially observable environments. The paper also proposes the REV-SPL evaluation metric and ROAM, a two-stage hybrid framework that reaches 6.1 REV-SPL, roughly 45% higher in relative terms than the supervised baseline (4.2).

Method

Key Designs

Dataset Construction Pipeline (Rewrite + Verify): An LLM rewriter generates semantically fluent but factually incorrect instructions by replacing the target object with a plausible alternative that is absent from the target room. A VLM verifier then checks the rewrite via open-vocabulary detection. A human audit found an error rate below 2%.
REV-SPL Evaluation Metric: Jointly evaluates navigation efficiency, exploration coverage, and the correctness of the FOUND/NOT-FOUND decision. It penalizes both premature stopping and incorrect decisions.
ROAM Two-Stage Hybrid Framework: Stage 1 uses a supervised DUET model for room-level navigation; Stage 2 uses an LLM/VLM for in-room exploration, guided by a free-space clearance prior.
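
The Rewrite + Verify pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Sample` class, the `build_false_premise` function, the candidate list, and the `detect` callback are all hypothetical stand-ins. A simple string substitution plays the role of the LLM rewriter, and a set lookup plays the role of the open-vocabulary detector.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Set


@dataclass
class Sample:
    """One original instruction plus a (hypothetical) room annotation."""
    instruction: str
    target_object: str
    room_objects: Set[str]


def build_false_premise(
    sample: Sample,
    candidates: Sequence[str],
    detect: Callable[[str], bool],
) -> Optional[str]:
    """Return a rewritten instruction whose target is absent from the room, or None."""
    for alt in candidates:
        # Rewrite step: skip alternatives that are annotated as present.
        if alt == sample.target_object or alt in sample.room_objects:
            continue
        # Verify step: reject the rewrite if the detector still finds the object.
        if detect(alt):
            continue
        return sample.instruction.replace(sample.target_object, alt)
    return None


sample = Sample(
    instruction="Go to the kitchen and pick up the kettle",
    target_object="kettle",
    room_objects={"kettle", "toaster", "sink"},
)
# Toy detector: a blender is visible in the views even though it is not annotated.
visible = {"kettle", "toaster", "sink", "blender"}
result = build_false_premise(
    sample, ["toaster", "blender", "teapot"], lambda name: name in visible
)
print(result)  # Go to the kitchen and pick up the teapot
```

In this toy run, "toaster" is rejected by the annotation check and "blender" by the verifier, which shows why a second verification pass is useful when room annotations are incomplete.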

Key Experimental Results

Method	Type	REV-SPL
DUET + VLN-NF	Supervised	4.2
NaviLLM	LLM-based	1.0
ROAM	Hybrid	6.1
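
As a quick check, the relative improvement quoted in the TL;DR can be recomputed from the table. This sketch assumes the "+45%" figure compares ROAM against the DUET + VLN-NF row; the helper name `relative_gain` is illustrative.

```python
def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# REV-SPL values copied from the results table.
scores = {"DUET + VLN-NF": 4.2, "NaviLLM": 1.0, "ROAM": 6.1}

gain = relative_gain(scores["ROAM"], scores["DUET + VLN-NF"])
print(f"{gain:.1f}%")  # 45.2%
```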

Highlights & Insights

Fills a reliability gap in VLN: the first systematic study of false-premise navigation in 3D, partially observable environments.
The two-stage decomposition strategy is transferable to other embodied tasks that require decisions under uncertainty.

Rating

Novelty: ⭐⭐⭐⭐⭐
Experimental Thoroughness: ⭐⭐⭐⭐
Writing Quality: ⭐⭐⭐⭐⭐
Value: ⭐⭐⭐⭐