Dialectic-Med: Mitigating Diagnostic Hallucinations via Counterfactual Adversarial Multi-Agent Debate
Conference: ACL 2026 | arXiv: 2604.11258 | Code: N/A | Area: Causal Inference | Keywords: Medical Hallucination, Multi-Agent Debate, Counterfactual Reasoning, Visual Falsification, Confirmation Bias
TL;DR
Dialectic-Med, inspired by Popperian falsificationism, mitigates diagnostic hallucinations through three-agent adversarial dialectical reasoning: a proposer generates diagnostic hypotheses, an opponent equipped with a visual falsification module proactively retrieves contradictory visual evidence, and a mediator maintains a weighted consensus graph. It achieves SOTA on MIMIC-CXR-VQA, VQA-RAD, and PathVQA, with a 12.5% improvement in explanation faithfulness.
Method
Key Designs
- Visual Falsification Module (VFM): Given hypothesis \(H_t\), the opponent generates counterfactual probe queries \(Q_{cf}\) and uses PubMedCLIP to compute attention maps \(M_{cf}\) that localize contradictory evidence in the image.
- Dynamic Consensus Graph: Nodes represent diagnostic hypotheses or pieces of visual evidence; edges encode support/refute logical relations with confidence weights. Cycle detection prevents hypothesis loops.
- Attack Strength Threshold Termination: The debate terminates when \(S_{attack} < \theta_{thresh}\), indicating that the current hypothesis has withstood falsification attempts.
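As an illustration only, the opponent's falsification scoring can be sketched in miniature. The paper's VFM uses PubMedCLIP attention maps \(M_{cf}\); here pre-computed embedding vectors stand in for the encoder, and \(S_{attack}\) is taken as the strongest probe-region match. All function and variable names below are assumptions, not the authors' API.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def attack_strength(probe_embeddings, region_embeddings):
    """Toy S_attack: the strongest match between any counterfactual probe
    query Q_cf and any image region (a stand-in for an attention map M_cf)."""
    return max(cosine(q, r) for q in probe_embeddings
               for r in region_embeddings)

# Usage: a probe that aligns well with some region yields a high attack score.
probes = [[1.0, 0.0], [0.5, 0.5]]
regions = [[0.9, 0.1], [0.0, 1.0]]
print(attack_strength(probes, regions))
```

A high score means the opponent found image evidence consistent with a counterfactual query, i.e., evidence against the current hypothesis.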
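The consensus graph, cycle detection, and the \(S_{attack} < \theta_{thresh}\) termination rule can be combined into a minimal debate loop. This is a hedged sketch under stated assumptions: `propose` and `score_attack` are hypothetical stand-ins for the proposer agent and the VFM-backed opponent, and the class and function names are illustrative, not the paper's implementation.

```python
class ConsensusGraph:
    """Nodes are hypotheses or evidence; edges encode support/refute
    relations with confidence weights, as in the dynamic consensus graph."""
    def __init__(self):
        self.edges = {}  # node -> list of (target, relation, weight)

    def add_edge(self, src, dst, relation, weight):
        self.edges.setdefault(src, []).append((dst, relation, weight))

    def has_cycle(self):
        # DFS with three colors: 0 = unvisited, 1 = on stack, 2 = finished.
        color = {}
        def visit(node):
            color[node] = 1
            for dst, _, _ in self.edges.get(node, []):
                c = color.get(dst, 0)
                if c == 1 or (c == 0 and visit(dst)):
                    return True
            color[node] = 2
            return False
        return any(visit(n) for n in list(self.edges) if color.get(n, 0) == 0)

def debate(propose, score_attack, theta_thresh=0.2, max_rounds=5):
    """Run debate rounds until S_attack < theta_thresh (the hypothesis has
    withstood falsification) or the round budget is exhausted."""
    graph = ConsensusGraph()
    hypothesis = propose(None)            # proposer's initial hypothesis
    for t in range(max_rounds):
        s_attack = score_attack(hypothesis)
        graph.add_edge(f"attack_{t}", hypothesis, "refute", s_attack)
        if s_attack < theta_thresh:       # termination criterion
            break
        revised = propose(hypothesis)     # proposer revises under attack
        graph.add_edge(revised, hypothesis, "support", 1.0 - s_attack)
        if graph.has_cycle():             # stop if hypotheses start looping
            break
        hypothesis = revised
    return hypothesis
```

With a score sequence that decays below \(\theta_{thresh}\), the loop returns the hypothesis that survived the opponent's attacks; the cycle check keeps the proposer from oscillating between previously refuted hypotheses.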
Key Experimental Results
- SOTA across all three medical VQA benchmarks
- 12.5% explanation faithfulness improvement
- Visual falsification is the key differentiator relative to purely semantic (language-only) debate
Highlights & Insights
- Operationalizes Popperian falsificationism as an AI system design principle: the system actively seeks disconfirming evidence rather than only confirming evidence
- VFM grounds debate in concrete image regions rather than language games
- Direct value for medical AI safety as a safeguard layer before clinical deployment
Rating
- Novelty: ⭐⭐⭐⭐⭐
- Experimental Thoroughness: ⭐⭐⭐⭐
- Writing Quality: ⭐⭐⭐⭐⭐
- Value: ⭐⭐⭐⭐⭐