🗣️ Dialogue Systems

🧠 NeurIPS2025 · 5 paper notes

AC-LoRA: (Almost) Training-Free Access Control-Aware Multi-Modal LLMs: AC-LoRA is an end-to-end system that trains independent LoRA adapters for datasets with different permission levels. At inference time, it dynamically retrieves adapters based on cosine similarity and user permissions and merges their outputs without any additional training, achieving strong information isolation while matching or surpassing SOTA LoRA mixture methods in response quality.
Bridging Human and LLM Judgments: Understanding and Narrowing the Gap: This paper proposes Bridge, a statistical framework that models the latent relationship between human and LLM judgments via ordinal logistic regression. With a small number of human labels, Bridge improves the calibration and alignment of LLM judgments while supporting formal statistical hypothesis testing for systematic biases.
HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location: This paper proposes HyGen, an interference-aware LLM inference system that achieves elastic co-location of online and offline workloads through an accurate batch latency predictor, an SLO-aware performance profiler, and a prefix-sharing-maximization scheduling strategy, delivering 3.87–5.84× throughput gains while strictly guaranteeing SLO compliance.
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems: This paper proposes MetaMind, a multi-agent framework inspired by psychological metacognition theory that significantly enhances the social reasoning capabilities of LLMs through three-stage collaboration: a ToM Agent (mental state hypothesis generation), a Moral Agent (social norm-constrained refinement), and a Response Agent (response generation with self-verification). MetaMind achieves state-of-the-art performance on multiple social intelligence benchmarks, approaching human-level performance for the first time.
SciArena: An Open Evaluation Platform for Non-Verifiable Scientific Literature-Grounded Tasks: SciArena is a community-driven open evaluation platform for scientific literature tasks. It adopts a Chatbot Arena-style human preference voting paradigm to rank 47 foundation models, collecting over 20,000 votes, and releases SciArena-Eval as a meta-benchmark for assessing the ability of automated evaluation systems to judge answer quality on literature-grounded tasks.
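To illustrate how an arena-style preference leaderboard such as the one described for SciArena can turn pairwise votes into a ranking, here is a minimal sketch. It assumes a plain Elo update; the note above does not say which rating model SciArena uses, and the function names, the `k` factor, the base rating, and the sample votes are all hypothetical.

```python
from collections import defaultdict


def elo_ratings(votes, k=32.0, base=1000.0):
    """Compute Elo-style ratings from pairwise preference votes.

    votes: iterable of (model_a, model_b, winner), where winner is
    "a", "b", or "tie". All parameters are illustrative assumptions.
    """
    ratings = defaultdict(lambda: base)
    outcome = {"a": 1.0, "b": 0.0, "tie": 0.5}
    for model_a, model_b, winner in votes:
        ra, rb = ratings[model_a], ratings[model_b]
        # Probability that model_a wins under the Elo logistic model.
        expected_a = 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))
        delta = k * (outcome[winner] - expected_a)
        # Zero-sum update: what one model gains, the other loses.
        ratings[model_a] = ra + delta
        ratings[model_b] = rb - delta
    return dict(ratings)


def leaderboard(ratings):
    """Return (model, rating) pairs sorted from strongest to weakest."""
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical votes, not data from the paper.
    sample_votes = [
        ("model-x", "model-y", "a"),
        ("model-x", "model-z", "a"),
        ("model-y", "model-z", "a"),
    ]
    for name, rating in leaderboard(elo_ratings(sample_votes)):
        print(f"{name}: {rating:.1f}")
```

Sequential Elo depends on the order in which votes arrive; a Bradley-Terry fit over all votes at once is a common order-independent alternative for this kind of leaderboard.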