📄 multi_agent
🧪 ICML2026 · 6 paper notes
📌 Same area in other venues: 💬 ACL2026 (12)
🔥 Top topics: Agents ×5 · LLM ×4
- E-mem: Multi-Agent Based Episodic Context Reconstruction for LLM Agent Memory
E-mem replaces the usual memory approach of preprocessing and compressing context into embeddings or graphs with episodic reconstruction: the original context is retained, and small language model (SLM) assistants reason over it in place. The master agent handles only global planning, while each SLM assistant holds one uncompressed segment of the original text. When multi-pathway retrieval activates an assistant, it reasons locally over its segment and returns evidence. On LoCoMo, E-mem beats the prior SOTA by 7.75 F1 points while cutting token consumption by 70%.
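The division of labor described above can be illustrated with a toy sketch. Everything here is an assumption for illustration: the `Assistant` class, the `master_answer` function, and the keyword-overlap scoring are hypothetical stand-ins for the SLM reasoning and multi-pathway retrieval, not the paper's implementation.

```python
import re
from dataclasses import dataclass
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercased word tokens; a crude stand-in for real retrieval features."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class Assistant:
    """Hypothetical assistant that keeps one raw, uncompressed segment."""
    segment: str

    def relevance(self, query: str) -> int:
        # Activation signal: number of query tokens found in the segment.
        return len(tokens(query) & tokens(self.segment))

    def local_evidence(self, query: str) -> List[str]:
        # "Local reasoning" stand-in: return sentences sharing a query token.
        q = tokens(query)
        sentences = [s.strip() for s in self.segment.split(".") if s.strip()]
        return [s for s in sentences if q & tokens(s)]


def master_answer(query: str, assistants: List[Assistant], top_k: int = 1) -> List[str]:
    """Hypothetical master: activate the most relevant assistants, collect evidence."""
    ranked = sorted(assistants, key=lambda a: a.relevance(query), reverse=True)
    active = [a for a in ranked[:top_k] if a.relevance(query) > 0]
    evidence: List[str] = []
    for assistant in active:
        evidence.extend(assistant.local_evidence(query))
    return evidence


assistants = [
    Assistant("Alice adopted a dog named Rex. She walks him daily."),
    Assistant("Bob moved to Berlin in March. He works at a bakery."),
]
evidence = master_answer("Where did Bob move?", assistants)
print(evidence)
```

Only the activated assistant reads its segment, so the master sees a short evidence list instead of the full history; that is the intuition behind the token savings, though the real system's retrieval and reasoning are far richer than this sketch.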
- EngiAgent: Fully Connected Coordination of LLM Agents for Solving Open-ended Engineering Problems with Feasible Solutions
EngiAgent splits engineering problem solving across five expert agents: Analyzer, Modeler, Verifier, Solver, and Evaluator. Rather than following a fixed pipeline, a fully connected coordinator routes feedback dynamically between them. With GPT-4o, this raises the feasible solution rate on engineering tasks from 5.66% (zero-shot) and 7.55% (MM-Agent) to 64.15%, an average improvement of about 7x over the previous SOTA.
- MASPO: Joint Prompt Optimization for LLM-based Multi-Agent Systems
MASPO jointly optimizes the role prompts of an entire multi-agent chain end to end, without relying on annotations. It combines multi-granularity joint evaluation (local validity, lookahead potential, and global alignment) with an evolutionary beam search driven by misalignment cases, yielding an average improvement of about 2.9 points across 6 tasks.
- OMAC: A Holistic Optimization Framework for LLM-Based Multi-Agent Collaboration
OMAC formalizes the optimization space of multi-agent systems (MAS) along five dimensions (two functional, three structural). A dual-actor algorithm, which pairs generation by a Semantic Initializer with improvement by a Contrastive Comparator, performs supervised optimization on each dimension. OMAC then optimizes multiple dimensions jointly and iteratively, consistently outperforming baselines such as DyLAN, ADAS, and AFlow on HumanEval, MMLU, and MATH.
- RADAR: Redundancy-Aware Diffusion for Multi-Agent Communication Structure Generation
RADAR casts communication topology design for multi-agent LLM systems as a redundancy-aware discrete graph diffusion process, using effective size as the guiding signal to incrementally generate query-adaptive collaboration graphs. Across six benchmarks, it reports higher accuracy, lower token consumption, and stronger robustness.
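For readers unfamiliar with the guiding signal, effective size is Burt's structural-hole measure of how non-redundant a node's contacts are. The sketch below computes it for an unweighted, undirected graph using Borgatti's simplification, n - 2t/n, where n is the number of neighbors and t the number of ties among them. How RADAR plugs this into the diffusion process is not covered here; the agent names and the adjacency-dict representation are assumptions for illustration.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected adjacency sets (hypothetical representation)


def effective_size(graph: Graph, ego: str) -> float:
    """Effective size of `ego` in an unweighted, undirected graph: n - 2t/n."""
    alters = graph[ego] - {ego}
    n = len(alters)
    if n == 0:
        return 0.0  # convention chosen for this sketch
    # Count each tie between two alters once (ties to ego are excluded).
    t = sum(1 for a in alters for b in graph[a] if b in alters and a < b)
    return n - 2 * t / n


# A hub whose contacts never talk to each other: no redundancy.
star: Graph = {
    "planner": {"coder", "tester", "critic"},
    "coder": {"planner"},
    "tester": {"planner"},
    "critic": {"planner"},
}

# A fully connected trio: each contact duplicates the other.
triangle: Graph = {
    "planner": {"coder", "tester"},
    "coder": {"planner", "tester"},
    "tester": {"planner", "coder"},
}

print(effective_size(star, "planner"))      # 3.0
print(effective_size(triangle, "planner"))  # 1.0
```

A high effective size means an agent's links reach distinct parts of the graph, while a low value signals redundant links, which is the kind of redundancy a communication topology would want to avoid paying tokens for.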
- Systematic Failures in Collective Reasoning under Distributed Information in Multi-Agent LLMs
This paper adapts the Hidden Profile paradigm from social psychology to multi-agent LLM evaluation, building HiddenBench, a benchmark of 65 tasks. A systematic evaluation of 15 frontier LLMs shows that on tasks where a single agent given the Full Profile reaches 80.7% accuracy, multi-agent setups with distributed information reach only 30.1%. The core failure mode is that agents do not proactively elicit information that others have not disclosed. Lightweight structured communication protocols, however, significantly mitigate this across model families.
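The Hidden Profile setup is easiest to see in a toy instance. In the sketch below, every agent sees the same shared facts plus one private fact; each agent alone prefers option A, but the pooled facts favor option B. The fact lists, agent names, and count-the-facts scoring are assumptions made up for illustration and do not reflect HiddenBench's actual task format or metrics.

```python
from collections import Counter
from typing import Dict, Set, Tuple

Fact = Tuple[str, str]  # (option the fact supports, description)


def preferred(facts: Set[Fact]) -> str:
    """Pick the option supported by the most facts (ties broken alphabetically)."""
    counts = Counter(option for option, _ in facts)
    return max(sorted(counts), key=counts.__getitem__)


# Facts every agent sees: they favor A three to one.
shared: Set[Fact] = {
    ("A", "lower cost"),
    ("A", "faster delivery"),
    ("A", "known vendor"),
    ("B", "better warranty"),
}

# Facts held by exactly one agent each: all favor B.
private: Dict[str, Set[Fact]] = {
    "agent_1": {("B", "passes safety audit")},
    "agent_2": {("B", "higher reliability")},
    "agent_3": {("B", "meets new regulation")},
}

# Without sharing, each agent decides on shared + own private facts, then they vote.
votes = [preferred(shared | own) for own in private.values()]
vote_winner = Counter(votes).most_common(1)[0][0]

# With full disclosure, the group decides on the union of all facts.
pooled = shared.union(*private.values())
pooled_winner = preferred(pooled)

print(votes, vote_winner, pooled_winner)
```

Each agent sees three facts for A and two for B, so the vote settles on A, while the pooled view has four facts for B. The correct decision is only reachable if agents actively surface what the others have not yet disclosed, which is the behavior the summary above identifies as missing.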