🩺 Medical NLP

💬 ACL2026 · 8 paper notes

🔥 Top topics: Medical Imaging ×4 · LLM ×4 · Question Answering ×3

Beyond the Individual: Virtualizing Multi-Disciplinary Reasoning for Clinical Intake via Collaborative Agents

This paper proposes Aegle, a graph-structured multi-agent framework that virtualizes multidisciplinary team (MDT) consultation for clinical intake. By introducing decoupled parallel reasoning and dynamic topology into the outpatient interview workflow, Aegle surpasses state-of-the-art models across 53 metrics spanning 24 clinical departments.

BioHiCL: Hierarchical Multi-Label Contrastive Learning for Biomedical Retrieval with MeSH Labels

BioHiCL leverages the hierarchical multi-label annotations of MeSH (Medical Subject Headings) to provide structured supervision for dense retrievers. By aligning the embedding space with the MeSH semantic space via depth-weighted label similarity, a 0.1B model surpasses most specialized models on biomedical retrieval, sentence similarity, and question answering tasks.
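To make "depth-weighted label similarity" concrete, here is a minimal sketch under stated assumptions. It treats each label as a dotted MeSH-style tree number, expands it to all of its ancestors, and scores two label sets by a Jaccard ratio in which deeper (more specific) shared nodes count more. The weighting scheme, the function names, and the tree-number strings are illustrative assumptions, not the formula used by BioHiCL.

```python
def ancestors(tree_number):
    """Return the node itself plus every ancestor prefix of a dotted tree number."""
    parts = tree_number.split(".")
    return {".".join(parts[:i]) for i in range(1, len(parts) + 1)}


def expand(labels):
    """Union of all nodes on the paths from the root to each label."""
    nodes = set()
    for tree_number in labels:
        nodes |= ancestors(tree_number)
    return nodes


def depth(node):
    """Depth of a node: 1 for a top-level category, +1 per dotted level."""
    return node.count(".") + 1


def depth_weighted_similarity(labels_a, labels_b):
    """Jaccard-style overlap where each node is weighted by its depth (assumed scheme)."""
    a, b = expand(labels_a), expand(labels_b)
    union = a | b
    if not union:
        return 0.0
    shared = sum(depth(n) for n in a & b)
    total = sum(depth(n) for n in union)
    return shared / total


# Illustrative tree numbers: two siblings under the same parent.
sim = depth_weighted_similarity(["A01.100.200"], ["A01.100.300"])
```

A score like this could serve as a soft target for a contrastive loss, so that documents sharing deep MeSH nodes are pulled closer together than documents sharing only a top-level category.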

Calibrated? Not for Everyone: How Sexual Orientation and Religious Markers Distort LLM Accuracy and Confidence in Medical QA

This paper investigates how social identity markers (sexual orientation and religious affiliation) distort LLM accuracy and confidence calibration in medical question answering. It finds that the "homosexual" marker consistently degrades performance and induces calibration crises across 9 LLMs, and that intersectional identities produce non-additive, identity-specific harms.
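One common way to quantify the gap between confidence and accuracy is expected calibration error (ECE). This note does not state which calibration metric the paper reports, so the sketch below is an assumption about how such a comparison could be run, and the function name and toy inputs are hypothetical.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: the size-weighted mean of |accuracy - confidence| per bin."""
    n = len(confidences)
    if n == 0:
        return 0.0
    bins = [[] for _ in range(n_bins)]
    for conf, is_correct in zip(confidences, correct):
        # Confidence 1.0 falls into the last bin rather than off the end.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, is_correct))
    ece = 0.0
    for items in bins:
        if not items:
            continue
        avg_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(y for _, y in items) / len(items)
        ece += (len(items) / n) * abs(accuracy - avg_conf)
    return ece


# Toy inputs (not results from the paper): same confidences, different accuracy.
baseline_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 1])
marked_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 0, 0])
```

Computing the metric separately for prompts with and without an identity marker shows whether the model stays equally confident while its accuracy drops.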

Efficient and Effective Internal Memory Retrieval for LLM-Based Healthcare Prediction

This paper proposes the K2K framework, which treats the FFN parameter space of LLMs as a retrievable knowledge base. Clinical knowledge is injected via LoRA, activation-guided probes enable precise retrieval, and cross-attention reranking adaptively integrates multi-source internal knowledge. The result is state-of-the-art healthcare prediction without the latency of external retrieval.

HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering

This paper proposes HypEHR, a 22M-parameter Lorentz hyperbolic model that embeds medical codes, patient visits, and questions into hyperbolic space. Through hierarchy-aware regularization aligned with the ICD ontology structure, HypEHR achieves performance comparable to LLM-based approaches on the MIMIC-IV EHR question answering task.
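For readers unfamiliar with the Lorentz model, the sketch below shows the standard geodesic distance on the unit hyperboloid (curvature -1), which is the kind of quantity a hyperbolic embedding model compares. The helper names and the choice of curvature are assumptions for illustration and are not taken from HypEHR.

```python
import math


def lift(v):
    """Map a Euclidean vector onto the hyperboloid by solving for the time coordinate."""
    x0 = math.sqrt(1.0 + sum(x * x for x in v))
    return [x0] + list(v)


def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def lorentz_distance(x, y):
    """Geodesic distance arccosh(-<x, y>_L), clamped for numerical safety."""
    return math.acosh(max(1.0, -lorentz_inner(x, y)))


origin = lift([0.0, 0.0])
point = lift([math.sinh(1.0), 0.0])
d = lorentz_distance(origin, point)  # equals 1 up to floating-point error
```

Distances grow quickly away from the origin, which is why tree-like structures such as the ICD hierarchy can be embedded with low distortion in few dimensions.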

MHSafeEval: Role-Aware Interaction-Level Evaluation of Mental Health Safety in Large Language Models

This paper proposes R-MHSafe, a role-aware mental health safety taxonomy, and MHSafeEval, a closed-loop agent evaluation framework. Through adversarial multi-turn counseling interactions, the framework systematically uncovers role-dependent cumulative safety failures of LLMs in mental health counseling scenarios, revealing interaction-level harms that existing static benchmarks fail to capture.

Query Pipeline Optimization for Cancer Patient Question Answering Systems

This paper proposes CoMeta, a three-tier controllable metadata-aware RAG framework for Cancer Patient Question Answering (CPQA). It integrates Clinical Hybrid Semantic-symbolic Document Retrieval (CHSDR), which fuses real-time Boolean search via E-Utilities with MedCPT semantic retrieval, and employs Semantically Enhanced Overlapping Segmentation (SEOS) to prevent context fragmentation. On the CMMQA dataset, CoMeta improves Claude-3-Haiku answer accuracy by 5.24% over CoT and approximately 3% over naive RAG.
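The overlapping part of the segmentation idea can be sketched as a sliding window over sentences, in which consecutive segments share a few sentences so that a fact split across a boundary still appears whole in at least one segment. This covers only the overlap, not the semantic enhancement in SEOS, and the function name, window size, and overlap are hypothetical choices.

```python
def overlapping_segments(sentences, size=3, overlap=1):
    """Split a list of sentences into windows of `size` that share `overlap` sentences."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    segments = []
    for start in range(0, len(sentences), step):
        segments.append(sentences[start:start + size])
        # Stop once the current window already reaches the end of the document.
        if start + size >= len(sentences):
            break
    return segments


sentences = ["s1", "s2", "s3", "s4", "s5"]
segments = overlapping_segments(sentences, size=3, overlap=1)
# segments == [["s1", "s2", "s3"], ["s3", "s4", "s5"]]
```

A larger overlap reduces the chance of fragmenting context but increases the number of segments that must be indexed and retrieved.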

Text-Attributed Knowledge Graph Enrichment with Large Language Models for Medical Concept Representation

This paper proposes CoMed, an LLM-empowered graph learning framework. It constructs a global medical knowledge graph by combining EHR statistical evidence with type-constrained LLM inference, then enriches it into a text-attributed graph via LLM-generated node descriptions and edge rationales. A LoRA-finetuned LLaMA encoder is jointly trained with a heterogeneous GNN to learn unified medical concept embeddings, achieving significant improvements in diagnosis prediction on MIMIC-III/IV.