Explicit Trait Inference for Multi-Agent Coordination
Conference: ACL 2026
arXiv: 2604.19278
Code: None
Area: LLM Multi-Agent / Social Reasoning
Keywords: Multi-Agent Coordination, Trait Inference, Warmth-Competence Dimensions, Social Cognition, Game Theory
TL;DR
The paper proposes the Explicit Trait Inference (ETI) method, which allows LLM agents to reason about and track the behavioral characteristics of partners based on two psychological dimensions: warmth and competence. This approach reduces payoff loss by 45-77% in economic games and improves task performance by 3-29% on MultiAgentBench.
Background & Motivation
Background: LLM-based multi-agent systems (MAS) demonstrate potential in complex tasks but remain prone to coordination failures—issues such as goal drift, error cascades, insufficient information sharing, and behavioral misalignment limit their reliability and scalability.
Limitations of Prior Work: (1) Structured methods (e.g., CAMEL, ChatDev) organize collaboration through fixed roles but do not address how agents reason about and adapt to each other; (2) Theory of Mind (ToM) methods primarily model transient mental states (beliefs, intentions) rather than stable behavioral traits (reliability, willingness to cooperate); (3) Reputation systems track only task metrics (e.g., success rate) without capturing the "why" and "how" of behavioral patterns.
Key Challenge: The core challenge is not whether agents can perform individual actions, but whether they can coordinate effectively with one another—this requires forming stable, actionable cognitive representations of partners.
Goal: To provide a lightweight, psychology-based mechanism that enables agents to infer partner traits from interaction history and adjust their behavior accordingly.
Key Insight: The method draws on the two-dimensional warmth-competence model from social psychology (Fiske et al., 2007) and maps social evaluations into actionable coordination signals.
Core Idea: Agents are prompted to explicitly infer and maintain trait profiles of partners regarding warmth (trust/cooperation) and competence (skill/reliability) to guide delegation, communication, and strategy adjustment.
Method
Overall Architecture
ETI is a framework based on prompting and context management. After each interaction, the agent receives a structured summary containing task goals, actions, communications, and outcomes, and is prompted to infer the partner's traits. The agent generates (a) a 1-7 Likert scale rating for each trait and (b) brief evidence supporting the judgment. These profiles are appended to the context for subsequent planning and execution.
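The profile described above can be pictured as a small data structure that is rendered to text and appended to the agent's context. The following is a minimal sketch, not the authors' implementation: the class names, the `to_context` format, and the `append_profile` helper are hypothetical, and the trait names are the eight listed under Key Designs.

```python
from dataclasses import dataclass, field

# Trait names as listed under Key Designs; the grouping into tuples is illustrative.
WARMTH_TRAITS = ("goal alignment", "collaborativeness", "trustworthiness", "malevolence")
COMPETENCE_TRAITS = ("execution ability", "reliability", "adaptability", "efficiency")


@dataclass
class TraitRating:
    score: int     # 1-7 Likert rating
    evidence: str  # brief evidence supporting the judgment

    def __post_init__(self):
        if not 1 <= self.score <= 7:
            raise ValueError(f"Likert score must be in 1..7, got {self.score}")


@dataclass
class TraitProfile:
    partner: str
    ratings: dict = field(default_factory=dict)  # trait name -> TraitRating

    def rate(self, trait, score, evidence):
        """Store or overwrite the rating for one trait."""
        if trait not in WARMTH_TRAITS + COMPETENCE_TRAITS:
            raise KeyError(f"unknown trait: {trait}")
        self.ratings[trait] = TraitRating(score, evidence)

    def to_context(self):
        """Render the profile as plain text (hypothetical format)."""
        lines = [f"Trait profile for {self.partner}:"]
        for dimension, traits in (("Warmth", WARMTH_TRAITS),
                                  ("Competence", COMPETENCE_TRAITS)):
            for trait in traits:
                if trait in self.ratings:
                    r = self.ratings[trait]
                    lines.append(f"- [{dimension}] {trait}: {r.score}/7 ({r.evidence})")
        return "\n".join(lines)


def append_profile(context, profile):
    """Append the rendered profile to an existing context string."""
    return context + "\n\n" + profile.to_context()


profile = TraitProfile("agent_b")
profile.rate("reliability", 2, "missed two agreed hand-offs")
context = append_profile("Task: draft the report together.", profile)
```

In an actual ETI run the scores and evidence would come from the LLM's response to the inference prompt; here they are filled in by hand only to show the shape of the data.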
Key Designs
- Warmth-Competence Trait Framework:
  - Function: Provides agents with structured representations of partners.
  - Mechanism: Eight behaviorally anchored traits are divided into two dimensions: Warmth (goal alignment, collaborativeness, trustworthiness, malevolence) and Competence (execution ability, reliability, adaptability, efficiency). Trait definitions explicitly separate warmth and competence to prevent confusion in natural language (e.g., misjudging non-cooperation as incompetence).
  - Design Motivation: The warmth dimension addresses goal drift and unreliable cooperation; low warmth prompts agents to clarify intentions or discount unreliable inputs. The competence dimension addresses execution errors and cascading failures; low competence prompts task reallocation or increased verification.
- Inference-Planning-Execution Loop:
  - Function: Seamlessly integrates trait inference into multi-agent pipelines.
  - Mechanism: After each iteration, the agent (1) infers partner traits based on action and outcome history; (2) incorporates the structured trait profile into the context; (3) utilizes the enriched context to plan and execute the next step. Prompts instruct the model to focus on primary behavioral patterns rather than isolated events, maintaining domain agnosticism.
  - Design Motivation: The pure prompting approach requires no fine-tuning or additional data, has minimal overhead, and is applicable to any MAS architecture.
- Capability Parameterization in Economic Games:
  - Function: Provides a controlled environment with ground truth to evaluate trait inference accuracy.
  - Mechanism: Capability parameters are added to standard Prisoner's Dilemma and Stag Hunt games: a player's intended action is successfully executed only with probability \(p_i\). This allows agents to infer intentions (cooperativeness vs. selfishness) from actions and competence (success rate) from outcomes. Agents play 50 rounds against parameterized scripted opponents.
  - Design Motivation: Economic games provide simple decision problems requiring adaptive reasoning, allowing for precise assessment of trait inference accuracy.
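The capability parameter in the economic games can be illustrated with a short simulation. This is a sketch under stated assumptions rather than the paper's setup: the payoff values are textbook Prisoner's Dilemma numbers, a failed execution is assumed to flip to the other action, the observer is assumed to see both the intended and the realized action, and all function names are hypothetical.

```python
import random

COOPERATE, DEFECT = "C", "D"

# Assumed textbook payoffs (player A, player B); the paper's values are not given here.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def execute(intended, p_success, rng):
    """Return the realized action: the intended one with probability p_success.

    Assumption: a failed execution flips to the other action.
    """
    if rng.random() < p_success:
        return intended
    return DEFECT if intended == COOPERATE else COOPERATE


def play_round(intent_a, intent_b, p_a, p_b, rng):
    """Play one round with execution noise for both players."""
    real_a = execute(intent_a, p_a, rng)
    real_b = execute(intent_b, p_b, rng)
    return real_a, real_b, PAYOFFS[(real_a, real_b)]


def estimate_competence(intended, realized):
    """Fraction of rounds in which the realized action matched the intended one."""
    matches = sum(i == r for i, r in zip(intended, realized))
    return matches / len(intended)


# 50 rounds against a scripted opponent that always intends to cooperate
# but only succeeds with probability 0.7 (illustrative value).
rng = random.Random(0)
intents, outcomes = [], []
for _ in range(50):
    _real_a, real_b, _payoff = play_round(COOPERATE, COOPERATE, 1.0, 0.7, rng)
    intents.append(COOPERATE)
    outcomes.append(real_b)
estimate = estimate_competence(intents, outcomes)
```

Because the opponent's true `p_i` is known to the experimenter, an inferred competence rating can be compared against it, which is what makes the games usable as ground truth.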
Loss & Training
ETI is a pure prompting method and does not involve training. Qwen3-8B is used as the agent, with 25 independent trials in all configurations.
Key Experimental Results
Main Results
In economic games (Qwen3-8B vs. scripted opponents):
| Game | Method | Payoff Deviation ↓ | Description |
|---|---|---|---|
| Prisoner's Dilemma | CoT Baseline | High | Lacks opponent modeling |
| Prisoner's Dilemma | ETI | Reduced by 45-77% | Trait-aware decision making |
| Stag Hunt | CoT Baseline | High | Default conservative strategy |
| Stag Hunt | ETI | Significant Improvement | Accurate judgment of cooperation likelihood |
On MultiAgentBench:
| Scenario Type | Task Performance Gain (ETI) | Coordination Gain |
|---|---|---|
| Collaborative Scenarios | 3-29% | 6-42% |
| Competitive Scenarios | Improvement | Significant |
Ablation Study
| Configuration | Effect | Description |
|---|---|---|
| ETI (Informative Profile) | Optimal | Diverse trait judgments drive improvement |
| ETI (Generalized Profile) | Slight Improvement | Non-discriminative profiles are ineffective |
| No Trait Inference | Baseline | CoT focuses only on task-level reasoning |
| Trait-Predicted Behavior | Accurate | ETI profiles indeed predict agent actions |
Key Findings
- The gains of ETI do not come from "more reasoning" but from "more targeted reasoning"—generalized profiles are nearly ineffective, whereas highly informative profiles are useful.
- Trait inference was validated: the profiles generated by ETI predict the agent's subsequent behavior, which indicates that models can infer stable traits from interaction history.
- In the complex scenarios of MultiAgentBench, ETI achieved a maximum gain of 29%, suggesting that the method generalizes from controlled games to more realistic MAS settings.
- The warmth dimension is more important in collaborative scenarios (detecting unreliable collaborators), while the competence dimension is more critical in complex task scenarios (reallocating tasks).
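The way each dimension steers behavior can be made concrete with a rule-based stand-in. In ETI itself the LLM decides how to act on the profile through prompting, so the function below is only an illustration of the mapping described in the design motivation (low warmth leads to clarifying intentions or discounting inputs; low competence leads to reallocation or more verification). The function name, the threshold of 3.0, and the assumption that each dimension has already been aggregated into a single 1-7 score are all hypothetical.

```python
def coordination_adjustments(warmth, competence, low=3.0):
    """Map 1-7 dimension scores to suggested coordination adjustments.

    `low` is an illustrative threshold; how per-trait ratings are
    aggregated into a dimension score is left open here.
    """
    actions = []
    if warmth <= low:
        # Low warmth: the partner's goals or cooperation are in doubt.
        actions += ["clarify partner intentions", "discount partner inputs"]
    if competence <= low:
        # Low competence: the partner's execution is in doubt.
        actions += ["reallocate tasks", "increase verification"]
    return actions or ["proceed with current plan"]


unreliable_collaborator = coordination_adjustments(warmth=2.0, competence=6.0)
weak_executor = coordination_adjustments(warmth=6.0, competence=2.0)
```

Keeping the two checks independent mirrors the point that non-cooperation and incompetence call for different responses.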
Highlights & Insights
- Introducing the warmth-competence model from social psychology into MAS is an elegant interdisciplinary innovation: trust and coordination in human society operate based on these two dimensions, and formalizing this directly as an inter-agent reasoning framework is natural.
- The design of "behaviorally anchored" trait definitions is a valuable takeaway: using explicit behavioral descriptions (rather than abstract concepts) prevents LLMs from confusing dimensions during inference, which is applicable to any scenario requiring structured LLM judgments.
- The pure prompting implementation requires no additional training and can be added to an existing pipeline as a plug-in, which makes it practical for MAS deployment.
Limitations & Future Work
- The accuracy of trait inference depends on the social reasoning capabilities of the LLM; weaker models may generate inaccurate profiles.
- The current framework assumes that traits are relatively stable—it has limited capability in detecting strategic disguise (e.g., betraying after initial cooperation).
- While the choice of eight traits is psychologically grounded, it may not be the optimal MAS design—task-specific trait dimensions might be more effective.
- In extremely large-scale MAS (>10 agents), the context cost of maintaining trait profiles for all partners may become excessive.
Related Work & Insights
- vs. ToM Methods (Li et al., 2023): Models transient beliefs/intentions without tracking stable traits; ETI provides persistent representations across interactions.
- vs. Reputation Systems (Lou et al., 2026): Only tracks metrics like success rates without capturing behavioral motivations; ETI provides richer representations (why + how).
- vs. CoT/Reflexion: Only structures task-level reasoning without involving reasoning about others; ETI extends this to the social reasoning domain.
Rating
- Novelty: ⭐⭐⭐⭐⭐ First systematic combination of psychological trait theory and MAS
- Experimental Thoroughness: ⭐⭐⭐⭐⭐ Comprehensive validation from accuracy to causality across controlled games and real MAS
- Writing Quality: ⭐⭐⭐⭐⭐ Clear motivation and excellent interdisciplinary integration
- Value: ⭐⭐⭐⭐⭐ Provides a lightweight and effective new paradigm for LLM multi-agent coordination