📖 NLP Understanding

🧠 NeurIPS2025 · 3 paper notes

Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention: This work unrolls selective SSMs (Mamba) into an attention-equivalent form and derives generalization bounds via covering number techniques. The bounds are controlled by the spectral abscissa \(s_{\mathbf{A}}\) of the continuous-time state matrix. When \(s_{\mathbf{A}} < 0\), the bound is independent of sequence length; when \(s_{\mathbf{A}} \geq 0\), it grows exponentially with sequence length. The paper further proves that this dependence is irreducible.
Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL: This paper proposes PNLC, a method that trains a lightweight goal-conditioned value function as a "natural language critic" to guide LLM agents in multi-turn planning and self-refinement at the thought-step level. Without direct fine-tuning or inference-time search, PNLC significantly outperforms existing methods on complex interactive tasks such as web navigation, social reasoning, and persuasion, while achieving 8–10× faster inference.
Weak-to-Strong Generalization under Distribution Shifts: This paper demonstrates that naive weak-to-strong generalization fails under distribution shifts, to the point where the strong model performs worse than its weak supervisor. It proposes RAVEN, a framework that dynamically learns optimal combination weights over multiple weak models to achieve robust weak-to-strong generalization, surpassing baselines by over 30% on OOD tasks.
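
As a rough illustration of the quantity in the first note above, the spectral abscissa \(s_{\mathbf{A}}\) is the largest real part among the eigenvalues of \(\mathbf{A}\). The sketch below is not the paper's bound or method. It only assumes a diagonal state matrix and a fixed, hypothetical step size `dt` (in Mamba the step size is input-dependent), and shows how the sign of \(s_{\mathbf{A}}\) decides whether the unrolled kernel magnitudes decay or grow with the time lag. The helper names `spectral_abscissa` and `diag_kernel_magnitudes` are made up for this example.

```python
import numpy as np


def spectral_abscissa(A):
    """Return the largest real part among the eigenvalues of a square matrix."""
    return float(np.max(np.linalg.eigvals(A).real))


def diag_kernel_magnitudes(eigs, dt, T):
    """For a diagonal state matrix with eigenvalues `eigs`, return
    sum_i |exp(eigs[i] * dt * k)| for lags k = 0..T-1.

    Assumption: a fixed step size `dt`, unlike the input-dependent
    step size of a selective SSM.
    """
    k = np.arange(T)[:, None]          # lags as a column vector
    modes = np.exp(eigs[None, :] * dt * k)
    return np.abs(modes).sum(axis=1)


# Two toy diagonal state matrices (illustrative values only).
stable = np.diag([-0.5, -1.0])     # spectral abscissa below zero
unstable = np.diag([0.1, -1.0])    # spectral abscissa above zero

s_stable = spectral_abscissa(stable)
s_unstable = spectral_abscissa(unstable)

mags_stable = diag_kernel_magnitudes(np.diag(stable), dt=1.0, T=50)
mags_unstable = diag_kernel_magnitudes(np.diag(unstable), dt=1.0, T=50)

print("stable:  s_A =", s_stable, " last-lag magnitude =", mags_stable[-1])
print("unstable: s_A =", s_unstable, " last-lag magnitude =", mags_unstable[-1])
```

For a fixed `dt > 0`, the eigenvalues of \(\exp(\Delta \mathbf{A})\) are \(\exp(\Delta \lambda_i)\), so its spectral radius is \(\exp(\Delta\, s_{\mathbf{A}})\). A negative abscissa therefore makes contributions from distant tokens shrink geometrically, while a positive one makes them grow with the lag.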