👻 Hallucination Detection

🧪 ICML2025 · 3 paper notes

🔥 Top topics: LLM ×2

Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models: This paper proposes MemVR, a decoding paradigm that treats FFNs as key-value memory and uses that mechanism to reinject visual tokens as supplementary evidence at intermediate trigger layers. According to the paper, this "look-twice" mechanism mitigates hallucinations in MLLMs without adding inference overhead.
Rejecting Hallucinated State Targets during Planning: This paper systematically identifies the types of "delusional behavior" that arise in goal-conditioned decision planning when generators produce infeasible (hallucinated) goals, and designs a feasibility evaluator as an auxiliary module that identifies and rejects those goals. The evaluator is trained with off-policy learning rules, a distributional architecture, and hindsight-relabeling data augmentation. The paper reports that it significantly reduces delusional behavior and improves OOD generalization without modifying the original agent.
Steer LLM Latents for Hallucination Detection: This paper proposes the Truthfulness Separator Vector (TSV), a lightweight steering vector that reshapes the LLM representation space at inference time to increase the separation between truthful and hallucinated outputs. The paper reports performance close to full supervision with only 32 labeled exemplars.
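
To make the steering-vector idea in the last note more concrete, here is a minimal toy sketch in plain Python. It is not the TSV paper's method or training procedure: the vector is hand-picked rather than learned, the 2-D "hidden states" are made up, and every function name (`steer`, `prototype`, `truthfulness_score`) is hypothetical. It only illustrates how adding one fixed vector to latent states can make two classes easier to tell apart with a simple prototype comparison.

```python
import math


def steer(hidden, vector, scale=1.0):
    """Add a fixed steering vector to one hidden state, element-wise."""
    return [h + scale * v for h, v in zip(hidden, vector)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def prototype(states, vector):
    """Mean of the steered states for one class (a class prototype)."""
    steered = [steer(s, vector) for s in states]
    return [sum(col) / len(steered) for col in zip(*steered)]


def truthfulness_score(hidden, vector, proto_true, proto_halluc):
    """Positive when the steered state is closer to the truthful prototype."""
    steered = steer(hidden, vector)
    return cosine(steered, proto_true) - cosine(steered, proto_halluc)


# Made-up 2-D exemplars: both classes point in nearly the same direction,
# so raw cosine similarity barely distinguishes them.
truthful = [[1.0, 0.0], [1.0, 0.2]]
hallucinated = [[1.0, 0.6], [1.0, 0.8]]

# Hand-picked vector (an assumption; a real one would be learned) that
# removes the shared component and centers the classes around the origin.
vector = [-1.0, -0.4]

proto_true = prototype(truthful, vector)
proto_halluc = prototype(hallucinated, vector)

score_a = truthfulness_score([1.0, 0.1], vector, proto_true, proto_halluc)
score_b = truthfulness_score([1.0, 0.7], vector, proto_true, proto_halluc)
print(score_a > 0, score_b < 0)
```

In this toy setup the unsteered classes have a cosine similarity close to 1, while the steered prototypes point in opposite directions, so the sign of the score separates the two queries.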