MAC-AMP: A Closed-Loop Multi-Agent Collaboration System for Multi-Objective Antimicrobial Peptide Design¶
Conference: ICLR 2026
arXiv: 2602.14926
Code: GitHub
Area: Image Generation
Keywords: antimicrobial peptide design, multi-agent collaboration, closed-loop reinforcement learning, multi-objective optimization, LLM agent
TL;DR¶
This paper proposes MAC-AMP, the first closed-loop multi-agent collaboration system that reformulates antimicrobial peptide (AMP) design as a coordinated multi-agent optimization problem, achieving multi-objective optimization through AI-simulated peer review and adaptive reward design.
Background & Motivation¶
- Antimicrobial Resistance (AMR) Crisis: Directly responsible for approximately 1.14 million deaths in 2021, with projections exceeding 39 million direct deaths between 2025 and 2050.
- Limitations of Existing AMP Design Models:
    - Most optimize only for antimicrobial activity, neglecting toxicity, stability, and novelty.
    - Multi-objective optimization is unstable; static weights readily cause reward hacking or diversity collapse.
    - Outputs are typically scattered scores or text, making it difficult to convert them into reproducible learning signals.
- Limitations of Existing Multi-Agent Systems:
    - Outputs are primarily in natural language, lacking trainable optimization signals.
    - Most are open-loop systems that rely on human intervention.
Method¶
Overall Architecture¶
MAC-AMP comprises six interconnected modules: Input Module → Property Prediction → AI-Simulated Peer Review → RL Refinement → Peptide Generation → Output Module. Users need only provide the target bacterial name and an example dataset.
1. Property Prediction Module¶
Evaluates multiple AMP attributes, categorized into two types:
- Explicit Reward Signals \(S\): antimicrobial activity score \(S_a\) (MIC predictor fine-tuned on ProtBERT) and AMP likelihood score \(S_b\) (Macrel 1.5).
- Auxiliary Evidence \(V\): toxicity score \(V_a\) (ToxinPred 3.0), structural reliability \(V_b\) (OmegaFold), physicochemical properties \(V_c\) (ProtParam), and template similarity \(V_d\) (Foldseek).
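To make the split between explicit signals and auxiliary evidence concrete, the following is a minimal sketch of how one peptide's predictions might be stored. The class name `PropertyReport`, its field names, the example sequence, and all numeric values are hypothetical illustrations; none of them come from the paper or its code.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class PropertyReport:
    """Hypothetical record of one peptide's predicted properties."""

    sequence: str
    # Explicit reward signals S.
    s_activity: float        # S_a: antimicrobial activity score
    s_amp_likelihood: float  # S_b: AMP likelihood score
    # Auxiliary evidence V.
    v_toxicity: float             # V_a: toxicity score
    v_structure: float            # V_b: structural reliability
    v_template_similarity: float  # V_d: template similarity
    v_physchem: Dict[str, float] = field(default_factory=dict)  # V_c

    def explicit_signals(self) -> Dict[str, float]:
        """Return only the scores labeled as explicit reward signals."""
        return {"S_a": self.s_activity, "S_b": self.s_amp_likelihood}

    def auxiliary_evidence(self) -> Dict[str, object]:
        """Return the remaining evidence, keyed by the symbols used above."""
        return {
            "V_a": self.v_toxicity,
            "V_b": self.v_structure,
            "V_c": dict(self.v_physchem),
            "V_d": self.v_template_similarity,
        }


# Illustrative values only; they are not measurements from the paper.
report = PropertyReport(
    sequence="KKLLKKLLKKLL",
    s_activity=0.91,
    s_amp_likelihood=0.78,
    v_toxicity=0.15,
    v_structure=0.88,
    v_template_similarity=0.42,
    v_physchem={"net_charge": 6.0},
)
print(report.explicit_signals())
```

Keeping the two groups behind separate accessors mirrors the text's distinction: one group is directly usable as a numeric reward, while the other serves as supporting context.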
2. AI-Simulated Peer Review Module¶
- Three Independent Reviewer Agents (GPT-5, Gemini 2.5, Perplexity) evaluate peptides across four dimensions: efficacy, safety, developmental structure, and originality.
- Each dimension is associated with a weighted vocabulary sub-table, using the tag format \(\text{ID}(\text{State}, \text{Weight})\) for structured annotation.
- Area Chair Agent: Aggregates review results, resolves semantic conflicts, computes dimension-level meta-scores, and outputs meta-review text \(T\) and average meta-score \(S_c\).
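The tag format and the aggregation step can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the tag IDs, the three state names and their numeric values, the weighted-mean scoring rule, and the function names `parse_tags`, `dimension_score`, and `meta_review`. The paper's actual vocabulary tables and aggregation rule may differ.

```python
import re
from collections import defaultdict
from statistics import mean

# Matches tags of the form ID(State, Weight), e.g. "EFF01(positive, 0.8)".
TAG_PATTERN = re.compile(r"(\w+)\(\s*(\w+)\s*,\s*([0-9]*\.?[0-9]+)\s*\)")

# Assumed mapping from a tag's state to a numeric value in [0, 1].
STATE_VALUE = {"positive": 1.0, "neutral": 0.5, "negative": 0.0}


def parse_tags(text):
    """Return (tag_id, state, weight) tuples found in a review string."""
    return [(tid, state, float(w)) for tid, state, w in TAG_PATTERN.findall(text)]


def dimension_score(tags):
    """Weighted mean of state values; raises KeyError on an unknown state."""
    total_weight = sum(w for _, _, w in tags)
    if total_weight == 0:
        return 0.5  # no usable evidence: fall back to the neutral value
    return sum(STATE_VALUE[s] * w for _, s, w in tags) / total_weight


def meta_review(reviews):
    """Aggregate reviewer outputs, each given as {dimension: tagged text}.

    Returns (dimension-level meta-scores, average meta-score S_c).
    """
    per_dimension = defaultdict(list)
    for review in reviews:
        for dimension, text in review.items():
            tags = parse_tags(text)
            if tags:
                per_dimension[dimension].append(dimension_score(tags))
    meta = {dim: mean(scores) for dim, scores in per_dimension.items()}
    s_c = mean(list(meta.values())) if meta else 0.0
    return meta, s_c


# Two hypothetical reviewers scoring two of the four dimensions.
reviews = [
    {"efficacy": "EFF01(positive, 0.8) EFF02(negative, 0.2)",
     "safety": "SAF01(neutral, 1.0)"},
    {"efficacy": "EFF01(positive, 1.0)",
     "safety": "SAF03(negative, 1.0)"},
]
meta_scores, s_c = meta_review(reviews)
print(meta_scores, s_c)
```

The point of the structured tags is visible in the sketch: once reviews are reduced to (state, weight) pairs, the consensus becomes a number that a training loop can consume, which free-form review text alone does not provide.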
3. RL Refinement Module¶
- CS-Based Reward Design Agent: Optimizes the reward function based on observable signals and mathematical properties.
- Biomedical Reward Alignment Agent: Analyzes meta-review text and proposes revision recommendations grounded in domain knowledge.
- Candidate reward functions pass through three stages: a rule-based validator, short-term sandbox training, and Pareto optimization, which selects the reward function used for the next phase.
- Phase-Adaptive Optimization: The reward function is redesigned every 15 epochs over 3 iterations.
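The Pareto step in the pipeline above can be sketched as a non-dominated filter over the metrics each candidate reward achieves in sandbox training. This is a generic Pareto-front computation, not the paper's implementation; the candidate names, the choice of three metrics, and the numbers are hypothetical, and all metrics are assumed to be oriented so that higher is better.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every metric and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: Dict[str, Sequence[float]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    front = []
    for name, metrics in candidates.items():
        dominated = any(
            dominates(other, metrics)
            for other_name, other in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical sandbox results: (activity, 1 - toxicity, diversity).
sandbox_metrics = {
    "reward_a": (0.90, 0.80, 0.60),
    "reward_b": (0.85, 0.85, 0.70),
    "reward_c": (0.80, 0.75, 0.55),  # worse than reward_a on every metric
}
print(pareto_front(sandbox_metrics))
```

A Pareto front can contain several candidates, as it does here, so choosing a single reward function still requires a tie-breaking rule; the summary does not say which rule is used.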
4. PPO Optimization¶
Normalized advantage: \(A = \text{norm}(R - \bar{V}_\phi)\)
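Clipped surrogate loss, in the standard PPO form with probability ratio \(r_\theta = \pi_\theta / \pi_{\theta_{old}}\) and clip range \(\epsilon\):

\[L_{clip} = -\mathbb{E}\left[\min\left(r_\theta A,\ \text{clip}(r_\theta, 1-\epsilon, 1+\epsilon)\, A\right)\right]\]

Total loss, with generic weighting coefficients \(c_1\) and \(c_2\) whose values are not stated here:

\[L = L_{clip} + c_1 L_{value} + c_2 L_{ent}\]

where \(L_{value}\) is the value regression loss and \(L_{ent}\) is the entropy regularization term (typically the negative policy entropy, so that minimizing the loss encourages exploration).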
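As a rough illustration of how these terms fit together, here is a NumPy sketch of a standard PPO loss. It follows the textbook formulation rather than the paper's code: the function names, the clip range of 0.2, the coefficients 0.5 and 0.01, the squared-error value loss, and the use of negative mean entropy as the regularizer are all assumptions.

```python
import numpy as np


def normalize(x, eps=1e-8):
    """Zero-mean, unit-variance normalization (the norm(.) in the advantage)."""
    return (x - x.mean()) / (x.std() + eps)


def ppo_loss(logp_new, logp_old, rewards, values, entropy,
             clip_eps=0.2, c_value=0.5, c_ent=0.01):
    """Return (total, l_clip, l_value, l_ent) for one batch.

    All inputs are 1-D arrays of equal length. NumPy tracks no gradients,
    so `values` behaves like a detached baseline inside the advantage.
    """
    advantage = normalize(rewards - values)          # A = norm(R - V)
    ratio = np.exp(logp_new - logp_old)              # pi_new / pi_old
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    l_clip = -np.mean(np.minimum(unclipped, clipped))
    l_value = np.mean((values - rewards) ** 2)       # value regression loss
    l_ent = -np.mean(entropy)                        # negative policy entropy
    total = l_clip + c_value * l_value + c_ent * l_ent
    return total, l_clip, l_value, l_ent


# Toy batch with made-up numbers; the policy is unchanged, so ratio == 1.
logp = np.log(np.array([0.2, 0.3, 0.1, 0.4]))
rewards = np.array([1.0, 0.0, 0.5, 0.2])
values = np.array([0.5, 0.5, 0.5, 0.5])
entropy = np.ones(4)
total, l_clip, l_value, l_ent = ppo_loss(logp, logp, rewards, values, entropy)
print(total, l_clip, l_value, l_ent)
```

With an unchanged policy the ratio is 1 and the normalized advantages average to zero, so the clipped term vanishes and only the value and entropy terms contribute. Clipping only takes effect once the new policy moves away from the old one.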
Key Experimental Results¶
Main Results: Target-Specific AMP Evaluation¶
| Model | Antimicrobial Activity (↑) | AMP Likelihood (↑) | Toxicity (↓) | Structural Reliability (↑) |
|---|---|---|---|---|
| MAC-AMP | 0.943±0.008 | 0.797±0.012 | 0.154±0.008 | 0.873±0.009 |
| AMP-Designer | 0.807±0.021 | 0.811±0.011 | 0.251±0.024 | 0.817±0.017 |
| BroadAMP-GPT | 0.831±0.025 | 0.821±0.018 | 0.246±0.033 | 0.763±0.023 |
| PepGAN | 0.823±0.023 | 0.572±0.035 | 0.247±0.064 | 0.637±0.026 |
| Diff-AMP | 0.822±0.006 | 0.554±0.036 | 0.235±0.072 | 0.752±0.020 |
Results on E. coli target
Broad-Spectrum Activity Evaluation¶
| Model | E. coli | S. aureus | P. aeruginosa | K. pneumoniae | E. faecium |
|---|---|---|---|---|---|
| MAC-AMP | 0.94 | 0.81 | 0.94 | 0.98 | 0.95 |
| AMP-Designer | 0.81 | 0.81 | 0.85 | 0.96 | 0.96 |
| PepGAN | 0.82 | 0.89 | 0.91 | 0.98 | 0.96 |
Key Findings¶
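- MAC-AMP outperforms all four baselines on antimicrobial activity, toxicity, and structural reliability for the E. coli target; its AMP likelihood (0.797) is slightly below AMP-Designer (0.811) and BroadAMP-GPT (0.821).
- AMPs designed for E. coli generalize well to other Gram-negative bacteria (P. aeruginosa 0.94, K. pneumoniae 0.98), which share outer membrane structures.
- Among Gram-positive bacteria, generalization is strong for E. faecium (0.95) but weaker for S. aureus (0.81, below PepGAN's 0.89).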
- Training cost: 47.61 GPU hours, 853 API calls, API expenditure of $36.56.
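The reported figures can be put in relative terms with a few lines of arithmetic. The sketch below uses only numbers from the tables and the cost line above. The strongest baseline is picked per metric (BroadAMP-GPT for activity, Diff-AMP for toxicity, AMP-Designer for structural reliability), and the helper `relative_change` is an illustrative name.

```python
def relative_change(new: float, old: float) -> float:
    """Signed relative change of `new` with respect to `old`."""
    return (new - old) / old


# E. coli results for MAC-AMP and the strongest baseline on each metric.
mac_amp = {"activity": 0.943, "toxicity": 0.154, "structure": 0.873}
best_baseline = {"activity": 0.831, "toxicity": 0.235, "structure": 0.817}

changes = {
    metric: relative_change(mac_amp[metric], best_baseline[metric])
    for metric in mac_amp
}

# Average API cost per call from the reported totals.
cost_per_call = 36.56 / 853

for metric, change in changes.items():
    print(f"{metric}: {change:+.1%}")
print(f"cost per API call: ${cost_per_call:.4f}")
```

Relative to the strongest baseline on each metric, this works out to roughly 13% higher activity, 34% lower toxicity, and 7% higher structural reliability, at an average of about $0.043 per API call.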
Highlights & Insights¶
- First Closed-Loop Multi-Agent System: Converts natural-language review consensus into executable RL reward signals, bridging the gap between output format and training signal.
- End-to-End Interpretability: Overcomes black-box limitations through transparent logging, replay trajectories, and consensus-aware decision tracking.
- Cross-Domain Transferability: The framework's generality is validated on English table-to-text generation tasks.
- Multi-Objective Balance: Achieves multi-objective optimization through structured agent consensus rather than manually specified static weights.
Limitations & Future Work¶
- Generated peptides have not yet been validated through in vitro experiments.
- API call costs may limit large-scale deployment.
- The peer review module relies on specific commercial LLMs, constraining reproducibility.
- The phase interval (15 epochs) and iteration count (3 rounds) are hyperparameters that may require task-specific tuning.
Related Work & Insights¶
- AMP Generation: AMPGAN v2, Diff-AMP, AMP-Designer
- LLM Multi-Agent Collaboration: Virtual Lab, CAMEL, AutoGen, ReviewAgents
- LLM-Augmented RL: RLAIF, Eureka
Rating¶
- Novelty: ⭐⭐⭐⭐ — First framework to apply closed-loop multi-agent collaboration to molecular design.
- Technical Depth: ⭐⭐⭐⭐ — Modular design is sophisticated with thorough multi-level validation.
- Experimental Thoroughness: ⭐⭐⭐⭐ — Five bacterial targets, four baselines, and multi-dimensional ablation studies.
- Practical Value: ⭐⭐⭐⭐ — Extensible to other molecular design tasks.