MemoPhishAgent: Memory-Augmented Multi-Modal LLM Agent for Phishing URL Detection

Conference: ACL 2026
arXiv: 2602.21394
Code: GitHub
Area: Security AI
Keywords: phishing detection, LLM agent, episodic memory, multimodal reasoning, tool invocation

TL;DR

This paper proposes MemoPhishAgent (MPA), the first memory-augmented multimodal LLM agent specifically designed for phishing URL detection. MPA dynamically orchestrates five dedicated tools and leverages an episodic memory system to reuse historical reasoning trajectories. It achieves a 13.6% recall improvement on public benchmarks and a 20% improvement on real-world social media data, and has been deployed in production, processing approximately 60K high-risk URLs per week.

Background & Motivation

Background: Phishing attacks continue to evolve, and traditional defenses (static blacklists, handcrafted heuristic rules) provide insufficient coverage against new domains and novel techniques. Reference-based methods using brand-domain mappings improve robustness but incur high maintenance costs and respond slowly to emerging brands and subdomains.

Limitations of Prior Work: (1) Existing LLM-based approaches are mostly prompt-driven deterministic pipelines lacking adaptive evidence collection; (2) Tool usage follows fixed workflows (e.g., OCR → brand matching → domain verification) and cannot dynamically adjust based on the current evidence state; (3) The absence of a memory system prevents reuse of historical investigation experience, leading to inefficient repeated analysis of similar phishing patterns.

Key Challenge: Phishing attacks are non-stationary — attackers continuously vary their strategies — yet defense systems are memoryless and restart analysis from scratch each time.

Goal: To construct a phishing detection agent capable of dynamically adjusting evidence collection strategies, learning from historical investigations, and operating effectively in production environments.

Key Insight: Phishing detection is modeled as a multi-step reasoning process that simulates the investigative behavior of human experts, dynamically selecting tools to gather evidence.

Core Idea: Five phishing-specific multimodal tools + a ReAct reasoning loop + an episodic memory system (storing and retrieving historical reasoning trajectories) are combined to achieve adaptive, continually improving phishing detection.

Method

Overall Architecture

MPA receives a list of suspicious URLs; each URL is processed by the agent as follows: (1) five dedicated tools are dynamically selected to collect multimodal evidence (text + visual + external knowledge); (2) multi-step reasoning is performed within a ReAct loop, determining the next action based on the current evidence state; (3) the episodic memory system retrieves similar historical cases to accelerate judgment or provide exemplar guidance. The final output is a "malicious" or "benign" verdict.
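The control flow described above can be sketched as a small loop. This is an illustrative sketch only, not the authors' implementation: the two stub tools return canned text, `choose_action` is a hypothetical rule-based stand-in for the LLM's reasoning step, and the `max_steps` budget with a "benign" fallback is an assumption made to keep the example terminating.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Stub tools (assumptions): real tools would crawl, screenshot, or search.
def crawl_content(url: str) -> str:
    return "login form asking for account password" if "login" in url else "plain article text"

def extract_targets(url: str) -> str:
    return "no redirects found"

TOOLS: Dict[str, Callable[[str], str]] = {
    "crawl_content": crawl_content,
    "extract_targets": extract_targets,
}

@dataclass
class AgentState:
    url: str
    evidence: List[Tuple[str, str]] = field(default_factory=list)  # (tool name, observation)

def choose_action(state: AgentState) -> str:
    """Hypothetical stand-in for the LLM: return a tool name or a final verdict."""
    used = {tool for tool, _ in state.evidence}
    if "crawl_content" not in used:
        return "crawl_content"
    if any("password" in obs for _, obs in state.evidence):
        # Suspicious text: gather one more kind of evidence before deciding.
        if "extract_targets" not in used:
            return "extract_targets"
        return "malicious"
    return "benign"

def run_agent(url: str, max_steps: int = 5) -> str:
    """Alternate between choosing an action and recording its observation."""
    state = AgentState(url)
    for _ in range(max_steps):
        action = choose_action(state)
        if action in ("malicious", "benign"):
            return action
        state.evidence.append((action, TOOLS[action](state.url)))
    return "benign"  # assumed fallback when the step budget runs out
```

The point of the sketch is that the next tool depends on the evidence gathered so far, rather than on a fixed order of calls.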

Key Designs

Five Phishing-Specific Tools:
- Function: Provide complementary multimodal evidence.
- Mechanism: Three-dimensional coverage — multimodal evidence (Crawl Content for Markdown text extraction + Check Screenshot for full-page screenshot analysis + Check Image for fine-grained image inspection), external knowledge (Intelligent Search constructs evidence-driven search queries to retrieve up-to-date information), and nested attack surfaces (Extract Targets extracts redirect targets and sublinks for deep inspection).
- Design Motivation: General-purpose tools are ill-suited to the phishing domain; the five tools collectively cover textual, visual, link, and external-knowledge dimensions.
Episodic Memory System:
- Function: Store, retrieve, and reuse historical reasoning trajectories.
- Mechanism: An LLM extracts compact keywords from pages (e.g., "apple login," "wallet connect"), which are embedded and indexed as vectors. The top-\(k\) nearest neighbors are retrieved and used according to a three-tier strategy — full ReAct reasoning when no match is found, historical trajectories as in-context exemplars when a partial match exists, and direct majority voting when a full match is found. The memory grows richer as deployment continues.
- Design Motivation: Phishing patterns are highly repetitive (the same attack template targets different victims); the memory system transforms repeated investigations into rapid decisions.
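A minimal sketch of keyword-indexed storage and top-\(k\) retrieval follows. It makes several assumptions: the `embed` function is a toy token-count vector standing in for whatever learned text encoder the system uses, similarity is cosine, and the `Episode` fields and `EpisodicMemory` class are hypothetical names chosen for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple

def embed(keywords: str) -> Counter:
    """Toy embedding (assumption): lowercase token counts."""
    return Counter(keywords.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

@dataclass
class Episode:
    keywords: str    # compact keywords extracted from the page
    trajectory: str  # stored reasoning trace
    verdict: str     # "malicious" or "benign"

class EpisodicMemory:
    def __init__(self) -> None:
        self.episodes: List[Episode] = []

    def add(self, episode: Episode) -> None:
        self.episodes.append(episode)

    def top_k(self, keywords: str, k: int = 3) -> List[Tuple[float, Episode]]:
        """Return the k stored episodes most similar to the query keywords."""
        query = embed(keywords)
        scored = [(cosine(query, embed(e.keywords)), e) for e in self.episodes]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]

memory = EpisodicMemory()
memory.add(Episode("apple login", "crawl -> screenshot -> verdict", "malicious"))
memory.add(Episode("wallet connect", "crawl -> search -> verdict", "malicious"))
memory.add(Episode("weather forecast", "crawl -> verdict", "benign"))
hits = memory.top_k("apple id login", k=2)
```

Here the query "apple id login" ranks the stored "apple login" episode first because the two share tokens; a production system would replace the linear scan with a vector index.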
Three-Tier Memory Usage Strategy:
- Function: Balance speed and reliability.
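- Mechanism: Let \(k'\) be the number of retrieved neighbors that count as matches (this summary does not state the exact matching criterion). \(k'=0\) → full reasoning (unseen pattern); \(0 < k' < k\) → historical trajectories as in-context exemplars (partial similarity); \(k' \geq k\) → direct majority voting (high similarity).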
- Design Motivation: Prevent memory from dominating reasoning — it should serve as contextual guidance rather than a substitute for deliberation.
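The three-tier rule can be sketched as a single function. The assumptions are labeled in the code: \(k'\) is taken to be the number of top-\(k\) neighbors whose similarity meets a threshold, the threshold value 0.8 is hypothetical, and the function name and return shape are illustrative rather than taken from the paper.

```python
from collections import Counter
from typing import List, Optional, Tuple

Neighbor = Tuple[float, str]  # (similarity, stored verdict)

def memory_decision(
    neighbors: List[Neighbor], k: int, threshold: float = 0.8
) -> Tuple[str, Optional[str]]:
    """Map retrieved neighbors to a tier and, for the voting tier, a verdict."""
    # Assumption: k' counts top-k neighbors at or above the similarity threshold.
    matches = [n for n in neighbors[:k] if n[0] >= threshold]
    k_prime = len(matches)
    if k_prime == 0:
        return "full_reasoning", None   # unseen pattern: run the full ReAct loop
    if k_prime < k:
        return "exemplars", None        # partial match: pass trajectories as exemplars
    votes = Counter(verdict for _, verdict in matches)
    # Ties resolve to the verdict seen first; an odd k avoids ties for two classes.
    return "majority_vote", votes.most_common(1)[0][0]
```

For example, with \(k=3\), three neighbors above the threshold with verdicts malicious, malicious, benign yield a direct "malicious" vote, while a single strong neighbor only supplies an exemplar and leaves the verdict to the reasoning loop.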

Key Experimental Results

Main Results

| Method | TR-OP Recall | DynaPD Recall | Speed (s/URL) |
|---|---|---|---|
| MPA | 93.4% | 93.6% | 4.46 |
| PhishLLM | ~80% | ~88% | 14.2 |
| MLLM | ~82% | ~85% | 5.1 |
| URLTran | ~86% | — | 2.8 (incl. training) |

Ablation Study

| Configuration | Key Metric | Note |
|---|---|---|
| Full MPA | 93.4% Recall | All components |
| w/o memory system | −27% Recall | Memory contributes most |
| w/o tool design | Performance drop | Specialized tools outperform generic ones |
| Prompt-based baseline | Inferior | Fixed pipeline underperforms adaptive selection |

Key Findings

The episodic memory system contributes up to 27% recall improvement without additional computational overhead.
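MPA achieves the fastest inference among the LLM-based methods compared (4.46 s/URL versus 14.2 for PhishLLM and 5.1 for MLLM), as the memory system bypasses redundant analysis for repeated patterns; only URLTran reports a lower figure (2.8 s/URL).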
A 20% recall improvement on real-world social media data (SocPhish) indicates that advantages are amplified in realistic settings.
Production deployment processes ~60K high-risk URLs per week, achieving a 91.44% recall rate.
URL shorteners and platform-hosted paths (e.g., sites.google.com) are blind spots for traditional methods; MPA overcomes these via multimodal tools.
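As a back-of-the-envelope check on these figures, the snippet below computes the speedup over PhishLLM and the weekly processing time implied by the reported latency. It assumes, purely for illustration, that production latency matches the benchmark figure of 4.46 s/URL and that URLs are processed sequentially; neither assumption is stated in the source, and the PhishLLM latency is taken from the table above.

```python
# Reported per-URL latencies (seconds) and weekly production volume.
SECONDS_PER_URL = {"MPA": 4.46, "PhishLLM": 14.2, "MLLM": 5.1}
WEEKLY_URLS = 60_000

# How many times faster MPA is than PhishLLM per URL.
speedup_vs_phishllm = SECONDS_PER_URL["PhishLLM"] / SECONDS_PER_URL["MPA"]

# Total sequential processing time per week, in hours (assumes benchmark latency).
weekly_hours = WEEKLY_URLS * SECONDS_PER_URL["MPA"] / 3600

print(f"speedup vs PhishLLM: {speedup_vs_phishllm:.2f}x")
print(f"weekly sequential processing: {weekly_hours:.1f} h")
```

Under these assumptions the speedup is roughly 3.2x and the weekly load is roughly 74 hours of sequential processing, which fits within a single week even without parallelism.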

Highlights & Insights

Production deployment: This is not merely an academic contribution — the system has been deployed in Amazon's production environment to protect millions of users, lending substantial credibility.
Remarkable impact of episodic memory: A 27% recall gain with no added computation, achieved by reducing LLM calls through direct voting on repeated patterns.
Professional and complementary tool design: The five tools collect evidence across textual, visual, search, and link dimensions.
Three-tier memory strategy balances efficiency and accuracy: Full analysis for unseen patterns, rapid decisions for previously seen ones.

Limitations & Future Work

Reliance on external LLM APIs: Latency and cost associated with Claude-3-Sonnet.
Memory contamination risk: Early misclassifications stored in memory may propagate errors to subsequent decisions.
Scope limited to phishing URLs: Other security threats (e.g., malware distribution) are not covered.
Future directions include memory self-correction mechanisms, extension to broader security threat types, and replacement of API-dependent models with lightweight local alternatives.

vs. PhishLLM: PhishLLM employs LLMs for brand extraction and intent recognition but still relies on a fixed pipeline; MPA selects tools dynamically.
vs. Cao et al. (2025): That work uses multimodal LLMs for phishing detection but with a fixed evidence acquisition workflow and no memory component.
vs. General agent frameworks: General-purpose tools and reasoning are less effective; MPA's phishing-specific tools yield superior results.

Rating

Novelty: ⭐⭐⭐⭐ First memory-augmented LLM agent tailored for phishing detection, with an elegantly designed episodic memory system.
Experimental Thoroughness: ⭐⭐⭐⭐⭐ Evaluated on two public benchmarks, real-world social media data, and production deployment, with comprehensive ablation studies.
Writing Quality: ⭐⭐⭐⭐ Threat model is clearly defined; system architecture diagrams are intuitive.
Value: ⭐⭐⭐⭐⭐ Validated in a production environment; offers direct practical value for security AI.