
👥 Multi-Agent

📷 CVPR2026 · 2 paper notes

🔥 Top topics: Agents ×2

AgentDet: A Shared-Blackboard Multi-Agent Framework for Zero-/Few-Shot Object Detection: AgentDet splits zero-/few-shot object detection across four LLM agents (Scout, Pinner, Curator, and Judge) that collaborate through a "Shared Blackboard" and a patch-level "Knowledge Base" (KB). The framework breaks visual evidence into fragments stored in the KB, assembles those fragments into holistic textual clues for LLM-based box prediction, and trains only the Judge agent. It reports competitive results on PASCAL VOC and COCO for both zero-shot (ZSOD) and few-shot (FSOD) object detection.
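To make the shared-blackboard idea in the AgentDet entry concrete, here is a minimal sketch of the general pattern. It is not the paper's implementation: the `Blackboard` class, the `run_pipeline` helper, the keys `patches`, `notes`, `clue`, and `box`, and the behavior assigned to each agent are all hypothetical stand-ins, and the LLM calls are replaced by trivial deterministic functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Blackboard:
    """Shared store that every agent can read from and write to."""
    entries: Dict[str, object] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def post(self, author: str, key: str, value: object) -> None:
        # Record the value and keep a trace of who wrote which key.
        self.entries[key] = value
        self.log.append(f"{author} -> {key}")


Agent = Callable[[Blackboard], None]


def scout(bb: Blackboard) -> None:
    # Assumption: the first agent proposes candidate patches (x1, y1, x2, y2).
    bb.post("Scout", "patches", [(0, 0, 32, 32), (16, 16, 64, 64)])


def pinner(bb: Blackboard) -> None:
    # Assumption: attach one textual note to each patch.
    notes = [f"patch at {p}" for p in bb.entries["patches"]]
    bb.post("Pinner", "notes", notes)


def curator(bb: Blackboard) -> None:
    # Assumption: merge the fragment notes into one textual clue.
    bb.post("Curator", "clue", "; ".join(bb.entries["notes"]))


def judge(bb: Blackboard) -> None:
    # Stand-in for the LLM-based box prediction: return the box
    # that encloses every candidate patch.
    patches = bb.entries["patches"]
    box = (
        min(p[0] for p in patches),
        min(p[1] for p in patches),
        max(p[2] for p in patches),
        max(p[3] for p in patches),
    )
    bb.post("Judge", "box", box)


def run_pipeline(agents: List[Agent]) -> Blackboard:
    # Agents never call each other; they only communicate via the blackboard.
    bb = Blackboard()
    for agent in agents:
        agent(bb)
    return bb
```

Calling `run_pipeline([scout, pinner, curator, judge])` returns a blackboard whose `log` shows the order in which agents wrote their results. The point of the pattern is that each agent depends only on blackboard keys, so one agent (here the Judge) can be swapped or trained without touching the others.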
Visual Document Understanding and Reasoning: A Multi-Agent Collaboration Framework with Agent-Wise Adaptive Test-Time Scaling: MACT replaces the "monolithic single-model" approach to visual document QA with four agents that have distinct roles: planning, execution, judging, and answering. It allocates test-time compute adaptively according to each agent's cognitive load, rather than uniformly increasing parameters. On 15 benchmarks, it consistently ranks in the top three with fewer than 30B parameters, with an average improvement of 9.9–11.5% over the base models.
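One way to picture agent-wise test-time scaling, as described in the MACT entry, is to split a fixed sampling budget across agents in proportion to an estimated load. The sketch below is an illustration only and is not taken from the paper: the function `allocate_budget`, the notion of a numeric load score per agent, and the proportional rule are all assumptions.

```python
from typing import Dict


def allocate_budget(load: Dict[str, float], total_samples: int) -> Dict[str, int]:
    """Split `total_samples` across agents in proportion to `load`.

    Every agent gets at least one sample; the remainder is shared out
    proportionally using the largest-remainder method.
    """
    if total_samples < len(load):
        raise ValueError("need at least one sample per agent")
    total_load = sum(load.values())
    if total_load <= 0:
        raise ValueError("load scores must sum to a positive value")

    # Guarantee one sample per agent, then distribute what is left.
    budget = {name: 1 for name in load}
    remaining = total_samples - len(load)
    shares = {name: remaining * w / total_load for name, w in load.items()}
    for name in load:
        budget[name] += int(shares[name])

    # Hand out samples lost to rounding, largest fractional part first.
    leftover = total_samples - sum(budget.values())
    by_fraction = sorted(load, key=lambda n: shares[n] - int(shares[n]), reverse=True)
    for name in by_fraction[:leftover]:
        budget[name] += 1
    return budget
```

For example, with hypothetical load scores `{"planning": 3.0, "execution": 1.0, "judging": 1.0, "answering": 1.0}` and a budget of 10 samples, the planning agent receives 4 samples and each of the others receives 2. A real system would need some way to estimate the load scores, which this sketch takes as given.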