# A-MEM: Agentic Memory for LLM Agents
**Conference:** NeurIPS 2025 · **arXiv:** 2502.12110 · **Code:** https://github.com/WujiangXu/AgenticMemory · **Area:** LLM Agent / Memory Systems · **Keywords:** Agentic Memory, Zettelkasten, Long-term Memory, LLM Agent, Knowledge Management
## TL;DR
This paper proposes A-Mem, a Zettelkasten-inspired agentic memory system for LLM agents. Each memory entry automatically generates a structured note (keywords/tags/contextual description), dynamically establishes inter-memory links, and triggers evolutionary updates to existing memories upon the insertion of new ones. A-Mem substantially outperforms baselines such as MemGPT on the LoCoMo long-conversation QA benchmark.
## Background & Motivation
Background: LLM agents require memory systems to support long-term interaction; existing systems (MemGPT, MemoryBank) provide basic storage and retrieval functionality.
Limitations of Prior Work:

- Existing memory systems rely on predefined storage structures and fixed access operations, lacking adaptive capacity.
- Approaches such as Mem0 introduce graph databases but depend on predefined schemas, precluding the flexible creation of new organizational patterns.
- Fixed workflows constrain generalization across diverse tasks.
- Memories are read-only: once stored, they are never updated and cannot evolve over time.
Key Challenge: Static memory structures vs. the need for dynamically evolving knowledge organization.
Goal: Design a flexible memory system capable of autonomous organization, dynamic linking, and continuous evolution.
Key Insight: The Zettelkasten (slip-box) method — atomic notes + flexible links + knowledge networks.
Core Idea: Enable the memory system to autonomously generate structured notes, establish links, and trigger the evolution of existing memories, analogous to the Zettelkasten approach.
## Method

### Overall Architecture
Input: Interaction records between the agent and its environment. The memory storage pipeline proceeds as follows: (1) Note Construction — an LLM generates keywords, tags, and contextual descriptions from the interaction content; (2) Link Generation — embedding-based retrieval identifies neighboring memories, and an LLM determines whether semantic links should be established; (3) Memory Evolution — newly added memories trigger updates to the context and tags of neighboring existing memories. At retrieval time, cosine similarity is used to identify top-\(k\) memories, along with their linked associated memories.
### Key Designs
- Note Construction:
    - Function: Generate atomic, multi-attribute memory notes for each interaction.
    - Mechanism: Each memory \(m_i = \{c_i, t_i, K_i, G_i, X_i, e_i, L_i\}\), where \(c_i\) is the content, \(t_i\) the timestamp, \(K_i\) the keywords, \(G_i\) the tags, \(X_i\) the contextual description, \(e_i\) the embedding, and \(L_i\) the link set. \(K_i\), \(G_i\), and \(X_i\) are generated by an LLM from the raw interaction content; the embedding \(e_i\) is obtained by encoding the concatenation of all textual attributes with a text encoder.
    - Design Motivation: Multi-faceted representations capture different dimensions of a memory (keywords = concepts, tags = categories, description = context), enabling fine-grained retrieval and organization.
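As a concrete illustration, the note structure above could be modeled as a small dataclass. This is a sketch only; the field names are mine, not the paper's, and mirror the \(m_i\) attributes:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryNote:
    """One atomic memory note m_i (illustrative field names)."""
    content: str                                   # c_i: raw interaction content
    timestamp: str                                 # t_i
    keywords: list = field(default_factory=list)   # K_i, LLM-generated
    tags: list = field(default_factory=list)       # G_i, LLM-generated
    context: str = ""                              # X_i: contextual description
    embedding: list = field(default_factory=list)  # e_i
    links: set = field(default_factory=set)        # L_i: ids of linked notes

    def text_for_embedding(self) -> str:
        # e_i is computed over the concatenation of all textual attributes
        return " ".join([self.content, self.context, *self.keywords, *self.tags])
```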
- Autonomous Link Generation:
    - Function: Automatically establish semantic links between a new memory and historical memories.
    - Mechanism: Embedding cosine similarity retrieves the top-\(k\) neighboring memories; an LLM then decides whether links should be formed based on shared attributes and contextual associations. Links form "boxes", thematic clusters analogous to those in Zettelkasten, and a single memory may belong to multiple boxes.
    - Design Motivation: Embeddings serve as an efficient initial filter, while the LLM provides precise judgments that capture subtle associations (causal, conceptual, etc.).
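The two-stage filter-then-judge mechanism can be sketched as follows. The function names are assumptions, and a trivial word-overlap heuristic stands in for the LLM judgment when no model is injected:

```python
import numpy as np

def top_k_neighbors(query_emb, memory_embs, k=10):
    """Stage 1: cosine similarity over embeddings as an efficient initial filter."""
    M = np.asarray(memory_embs, dtype=float)
    q = np.asarray(query_emb, dtype=float)
    sims = M @ q / (np.linalg.norm(M, axis=1) * np.linalg.norm(q) + 1e-9)
    return np.argsort(-sims)[:k].tolist()

def should_link(new_note: str, neighbor: str, llm=None) -> bool:
    """Stage 2: an LLM judges whether a semantic link should be formed.
    The word-overlap fallback below is a placeholder, not the paper's prompt."""
    if llm is not None:
        answer = llm(f"Should these memories be linked?\nA: {new_note}\nB: {neighbor}")
        return "yes" in answer.lower()
    return len(set(new_note.lower().split()) & set(neighbor.lower().split())) > 0
```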
- Memory Evolution:
    - Function: Automatically update the attributes of neighboring existing memories when a new memory is added.
    - Mechanism: For each existing memory \(m_j\) in the neighborhood, an LLM determines, based on the new memory and the other neighbors, whether \(m_j\)'s keywords, tags, and contextual description require updating. The updated \(m_j^*\) replaces the original entry.
    - Design Motivation: Simulates human learning, where new knowledge reshapes the understanding of prior knowledge, and incrementally builds an increasingly refined knowledge structure over time.
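The evolution loop reduces to a simple update pattern. In this sketch an injected `decide_update` callable stands in for the LLM call; it returns either a dict of attribute changes or `None`, and the updated \(m_j^*\) replaces the original entry in place (my framing, not the paper's code):

```python
def evolve_neighbors(new_note, neighbors, decide_update):
    """Memory evolution sketch: for each existing neighbor m_j, ask the
    LLM (here: `decide_update`) whether its keywords/tags/context should
    change given the new memory; apply the changes if any are returned."""
    for j, m_j in enumerate(neighbors):
        changes = decide_update(new_note, m_j)
        if changes:
            neighbors[j] = {**m_j, **changes}  # m_j* replaces m_j
    return neighbors
```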
### Loss & Training
- No training is required; the system is entirely prompt-based.
- Text embeddings are produced by all-MiniLM-L6-v2.
- Memory retrieval uses top-\(k = 10\).
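Retrieval, as described in the architecture above, combines cosine top-\(k\) search with link expansion. A minimal sketch, assuming each stored note carries `embedding` and `links` fields (my schema, not the paper's):

```python
import numpy as np

def retrieve_with_links(query_emb, notes, k=10):
    """Return indices of the top-k notes by cosine similarity,
    expanded with each hit's linked associated memories."""
    E = np.array([n["embedding"] for n in notes], dtype=float)
    q = np.asarray(query_emb, dtype=float)
    sims = E @ q / (np.linalg.norm(E, axis=1) * np.linalg.norm(q) + 1e-9)
    hits = set(np.argsort(-sims)[:k].tolist())
    for i in list(hits):                 # follow links to associated memories
        hits |= set(notes[i]["links"])
    return sorted(hits)
```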
## Key Experimental Results

### Main Results
Evaluated on the LoCoMo long-conversation QA dataset (~9K tokens/conversation, 35 sessions).
| Method | Multi-Hop F1 | Temporal F1 | Single-Hop F1 | Avg F1 Rank | Token Length |
|---|---|---|---|---|---|
| LoCoMo | 25.02 | 18.41 | 34.93 | 3.0 | ~13K |
| MemGPT | 30.36 | 17.29 | 60.16 | 2.4 | 16,987 |
| MemoryBank | 6.49 | 2.47 | 8.28 | 5.0 | 569 |
| A-Mem | 32.86 | 39.41 | 48.43 | 1.6 | 1,216 |
A-Mem leads substantially on Temporal QA (39.41 vs. MemGPT's 17.29), ranks first overall, and consumes far fewer tokens than MemGPT.
### Ablation Study
| Configuration | Observation |
|---|---|
| w/o Link Generation | Removing the linking mechanism degrades performance, confirming the importance of inter-memory connections. |
| w/o Memory Evolution | Removing evolution causes the largest drop in temporal reasoning, indicating that memory updates are critical for long-term understanding. |
| w/o Structured Attributes | Storing raw text without generating keywords/tags degrades retrieval quality. |
### Key Findings
- A-Mem achieves the largest advantage on temporal reasoning (more than 2×), as memory evolution automatically integrates temporal sequence information.
- Token consumption is only 1,216 (vs. MemGPT's 16,987), an order-of-magnitude improvement in efficiency.
- A-Mem consistently outperforms baselines across 6 backbone models (GPT-4o-mini, Qwen, Llama, etc.).
- t-SNE visualizations reveal that memories form clearly delineated thematic clusters.
## Highlights & Insights
- Agentification of memory systems: Rather than passive storage and retrieval, the system actively organizes, links, and evolves memories — a natural next step following Agentic RAG.
- Elegant integration of Zettelkasten and AI: Atomic notes, flexible links, and incremental construction represent a classical knowledge management methodology finding a fitting application in the LLM era.
- Memory evolution mechanism: Triggering updates to existing memories upon the arrival of new knowledge is the key innovation, simulating the human process of reinterpreting prior knowledge in light of new information.
- High efficiency: A-Mem achieves superior performance using only ~1.2K tokens, compared to MemGPT's ~17K.
## Limitations & Future Work
- Validation is limited to conversational QA; agent tasks such as tool use and multi-step reasoning have not been evaluated.
- The LLM invocation cost of memory evolution is not quantified — each new memory triggers \(k\) LLM calls to update existing entries.
- Scalability as the number of memories grows is insufficiently analyzed.
- Links and evolution may introduce errors due to LLM hallucinations; no error-correction mechanism is provided.
- A memory forgetting mechanism could be introduced to prevent outdated information from polluting the memory store.
## Related Work & Insights
- vs. MemGPT: MemGPT employs a cache architecture that prioritizes recent information, but its memory structure is fixed. A-Mem's linking and evolution mechanisms are substantially more flexible.
- vs. Mem0: Mem0 introduces a graph database but relies on predefined schemas, whereas A-Mem's "boxes" emerge automatically.
- vs. Agentic RAG: RAG introduces agency at the retrieval stage, but the storage layer remains static. A-Mem introduces agency at both the storage and evolution stages.
## Rating
- Novelty: ⭐⭐⭐⭐ The combination of Zettelkasten, LLM agents, and memory evolution is highly creative.
- Experimental Thoroughness: ⭐⭐⭐ Evaluated across 6 backbone models, but limited to conversational QA; agent task validation is absent.
- Writing Quality: ⭐⭐⭐⭐ Method description is clear and formal notation is well-structured.
- Value: ⭐⭐⭐⭐ Memory systems for LLM agents are an important research direction; A-Mem offers a valuable new paradigm.