A-MEM: Agentic Memory for LLM Agents

Conference: NeurIPS 2025 arXiv: 2502.12110 Code: https://github.com/WujiangXu/AgenticMemory Area: LLM Agent / Memory Systems Keywords: Agentic Memory, Zettelkasten, Long-term Memory, LLM Agent, Knowledge Management

TL;DR

This paper proposes A-Mem, a Zettelkasten-inspired agentic memory system for LLM agents. Each memory entry automatically generates a structured note (keywords/tags/contextual description), dynamically establishes inter-memory links, and triggers evolutionary updates to existing memories upon the insertion of new ones. A-Mem substantially outperforms baselines such as MemGPT on the LoCoMo long-conversation QA benchmark.

Background & Motivation

Background: LLM agents require memory systems to support long-term interaction; existing systems (MemGPT, MemoryBank) provide basic storage and retrieval functionality.

Limitations of Prior Work:

  • Existing memory systems rely on predefined storage structures and fixed access operations, lacking adaptive capacity.
  • Approaches such as Mem0 introduce graph databases but depend on predefined schemas, precluding flexible creation of new organizational patterns.
  • Fixed workflows constrain generalization across diverse tasks.
  • Memories are read-only: once stored, they are never updated and cannot evolve over time.

Key Challenge: Static memory structures vs. the need for dynamically evolving knowledge organization.

Goal: Design a flexible memory system capable of autonomous organization, dynamic linking, and continuous evolution.

Key Insight: The Zettelkasten (slip-box) method — atomic notes + flexible links + knowledge networks.

Core Idea: Enable the memory system to autonomously generate structured notes, establish links, and trigger the evolution of existing memories, analogous to the Zettelkasten approach.

Method

Overall Architecture

Input: Interaction records between the agent and its environment. The memory storage pipeline proceeds as follows: (1) Note Construction — an LLM generates keywords, tags, and contextual descriptions from the interaction content; (2) Link Generation — embedding-based retrieval identifies neighboring memories, and an LLM determines whether semantic links should be established; (3) Memory Evolution — newly added memories trigger updates to the context and tags of neighboring existing memories. At retrieval time, cosine similarity is used to identify top-\(k\) memories, along with their linked associated memories.
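The three-stage pipeline and the link-expanded retrieval path can be sketched in a few dozen lines. Everything below is a toy stand-in, not the authors' implementation: Jaccard overlap over token sets replaces embedding cosine similarity, and simple heuristics replace the LLM calls for note construction, link judgment, and memory evolution; the class and method names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Note:
    content: str
    keywords: set                            # stub for LLM-generated K_i
    links: set = field(default_factory=set)  # L_i: indices of linked notes

class AgenticMemory:
    """Toy sketch of A-Mem's store: construct note -> link -> evolve."""

    def __init__(self, k=2):
        self.notes = []
        self.k = k  # number of neighbour candidates per new note

    def _construct(self, text):
        # Stand-in for LLM note construction (keywords/tags/context).
        return Note(content=text, keywords=set(text.lower().split()))

    def _similarity(self, a, b):
        # Jaccard overlap as a toy proxy for embedding cosine similarity.
        return len(a.keywords & b.keywords) / max(1, len(a.keywords | b.keywords))

    def add(self, text):
        new = self._construct(text)
        new_id = len(self.notes)
        # Link generation: rank neighbour candidates, then a (stubbed) link decision.
        ranked = sorted(range(len(self.notes)),
                        key=lambda i: self._similarity(new, self.notes[i]),
                        reverse=True)[: self.k]
        for i in ranked:
            if self._similarity(new, self.notes[i]) > 0:  # stub for LLM judgment
                new.links.add(i)
                self.notes[i].links.add(new_id)
                # Toy evolution: neighbour absorbs the new note's keywords
                # (A-Mem instead has an LLM rewrite keywords/tags/context).
                self.notes[i].keywords |= new.keywords
        self.notes.append(new)
        return new_id

    def retrieve(self, query, k=1):
        q = self._construct(query)
        top = sorted(self.notes, key=lambda n: self._similarity(q, n),
                     reverse=True)[:k]
        # Expand the top-k hits with their linked memories, as in A-Mem.
        expanded = {id(n): n for n in top}
        for n in top:
            for i in n.links:
                expanded[id(self.notes[i])] = self.notes[i]
        return list(expanded.values())
```

Replacing the overlap heuristic with a real encoder and the stubs with LLM prompts recovers the architecture described above.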

Key Designs

  1. Note Construction:

    • Function: Generate atomic, multi-attribute memory notes for each interaction.
    • Mechanism: Each memory \(m_i = \{c_i, t_i, K_i, G_i, X_i, e_i, L_i\}\), where \(c_i\) is content, \(t_i\) is timestamp, \(K_i\) keywords, \(G_i\) tags, \(X_i\) contextual description, \(e_i\) embedding, and \(L_i\) link set. \(K_i\), \(G_i\), and \(X_i\) are generated by an LLM from the raw interaction content. The embedding \(e_i\) is obtained by encoding the concatenation of all textual attributes via a text encoder.
    • Design Motivation: Multi-faceted representations capture different dimensions of a memory (keywords = concepts, tags = categories, description = context), enabling fine-grained retrieval and organization.
  2. Autonomous Link Generation:

    • Function: Automatically establish semantic links between a new memory and historical memories.
    • Mechanism: Embedding cosine similarity retrieves top-\(k\) neighboring memories; an LLM then analyzes whether links should be formed based on shared attributes and contextual associations. Links form "boxes" — thematic clusters analogous to those in Zettelkasten — and a single memory may belong to multiple boxes.
    • Design Motivation: Embeddings serve as an efficient initial filter, while the LLM provides precise judgments that capture subtle associations (causal, conceptual, etc.).
  3. Memory Evolution:

    • Function: Automatically update the attributes of neighboring existing memories when a new memory is added.
    • Mechanism: For each existing memory \(m_j\) in the neighborhood, an LLM determines, based on the new memory and other neighbors, whether \(m_j\)'s keywords, tags, and contextual description require updating. The updated \(m_j^*\) replaces the original entry.
    • Design Motivation: Simulates human learning — new knowledge reshapes the understanding of prior knowledge — and incrementally builds an increasingly refined knowledge structure over time.
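The note structure \(m_i = \{c_i, t_i, K_i, G_i, X_i, e_i, L_i\}\) maps naturally onto a record type whose embedding is computed from the concatenation of its textual attributes. A minimal sketch, with a toy hashing embedding standing in for the real text encoder (all-MiniLM-L6-v2) and with \(K_i\)/\(G_i\)/\(X_i\) supplied directly rather than LLM-generated:

```python
import math
import time
from dataclasses import dataclass, field

def embed(text, dim=64):
    # Toy hashed bag-of-words embedding, unit-normalized; a stand-in for
    # the sentence encoder used in the paper.
    v = [0.0] * dim
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

@dataclass
class MemoryNote:
    content: str                             # c_i: raw interaction content
    timestamp: float                         # t_i
    keywords: list                           # K_i (LLM-generated in A-Mem)
    tags: list                               # G_i (LLM-generated in A-Mem)
    context: str                             # X_i: contextual description
    embedding: list = None                   # e_i
    links: set = field(default_factory=set)  # L_i

    def __post_init__(self):
        if self.embedding is None:
            # e_i encodes the concatenation of all textual attributes.
            text = " ".join([self.content, *self.keywords, *self.tags,
                             self.context])
            self.embedding = embed(text)

def cosine(a, b):
    # Vectors are unit-normalized, so the dot product is cosine similarity.
    return sum(x * y for x, y in zip(a, b))
```

Because the embedding covers keywords, tags, and the contextual description as well as the raw content, two notes can score as similar even when their surface wording differs.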

Loss & Training

  • No training is required; the system is entirely prompt-based.
  • Text embeddings are produced by all-MiniLM-L6-v2.
  • Memory retrieval uses top-\(k = 10\).
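Under these settings, retrieval reduces to a cosine top-\(k\) over the stored embeddings. A minimal stdlib sketch, assuming unit-normalized vectors (as sentence-transformer outputs typically are), so the dot product equals cosine similarity:

```python
import heapq

def retrieve_top_k(query_vec, memory_vecs, k=10):
    """Return indices of the k stored embeddings most similar to the query."""
    scored = ((sum(q * m for q, m in zip(query_vec, vec)), i)
              for i, vec in enumerate(memory_vecs))
    return [i for _, i in heapq.nlargest(k, scored)]
```

In the full system, the notes linked to these top-\(k\) hits are also pulled into the context.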

Key Experimental Results

Main Results

Evaluated on the LoCoMo long-conversation QA dataset (~9K tokens per conversation, up to 35 sessions).

Method       Multi-Hop F1   Temporal F1   Single-Hop F1   Avg Rank   Token Length
LoCoMo       25.02          18.41         34.93           3.0        ~13K
MemGPT       30.36          17.29         60.16           2.4        16,987
MemoryBank   6.49           2.47          8.28            5.0        569
A-Mem        32.86          39.41         48.43           1.6        1,216

A-Mem leads substantially on Temporal QA (39.41 vs. MemGPT's 17.29), ranks first overall, and consumes far fewer tokens than MemGPT.

Ablation Study

Configuration Observation
w/o Link Generation Removing the linking mechanism degrades performance, confirming the importance of inter-memory connections.
w/o Memory Evolution Removing evolution causes the largest drop in temporal reasoning, indicating that memory updates are critical for long-term understanding.
w/o Structured Attributes Storing raw text without generating keywords/tags degrades retrieval quality.

Key Findings

  • A-Mem achieves the largest advantage on temporal reasoning (more than 2×), as memory evolution automatically integrates temporal sequence information.
  • Token consumption is only 1,216 (vs. MemGPT's 16,987), an order-of-magnitude improvement in efficiency.
  • A-Mem consistently outperforms baselines across 6 backbone models (GPT-4o-mini, Qwen, Llama, etc.).
  • t-SNE visualizations reveal that memories form clearly delineated thematic clusters.

Highlights & Insights

  • Agentification of memory systems: Rather than passive storage and retrieval, the system actively organizes, links, and evolves memories — a natural next step following Agentic RAG.
  • Elegant integration of Zettelkasten and AI: Atomic notes, flexible links, and incremental construction represent a classical knowledge management methodology finding a fitting application in the LLM era.
  • Memory evolution mechanism: Triggering updates to existing memories upon the arrival of new knowledge is the key innovation, simulating the human process of reinterpreting prior knowledge in light of new information.
  • High efficiency: A-Mem achieves superior performance using only ~1.2K tokens, compared to MemGPT's ~17K.

Limitations & Future Work

  • Validation is limited to conversational QA; agent tasks such as tool use and multi-step reasoning have not been evaluated.
  • The LLM invocation cost of memory evolution is not quantified — each new memory triggers \(k\) LLM calls to update existing entries.
  • Scalability as the number of memories grows is insufficiently analyzed.
  • Links and evolution may introduce errors due to LLM hallucinations; no error-correction mechanism is provided.
  • A memory forgetting mechanism could be introduced to prevent outdated information from polluting the memory store.
Comparison with Related Methods

  • vs. MemGPT: MemGPT employs a cache architecture that prioritizes recent information, but its memory structure is fixed. A-Mem's linking and evolution mechanisms are substantially more flexible.
  • vs. Mem0: Mem0 introduces a graph database but relies on predefined schemas, whereas A-Mem's "boxes" emerge automatically.
  • vs. Agentic RAG: RAG introduces agency at the retrieval stage, but the storage layer remains static. A-Mem introduces agency at both the storage and evolution stages.

Rating

  • Novelty: ⭐⭐⭐⭐ The combination of Zettelkasten, LLM agents, and memory evolution is highly creative.
  • Experimental Thoroughness: ⭐⭐⭐ Evaluated across 6 backbone models, but limited to conversational QA; agent task validation is absent.
  • Writing Quality: ⭐⭐⭐⭐ Method description is clear and formal notation is well-structured.
  • Value: ⭐⭐⭐⭐ Memory systems for LLM agents are an important research direction; A-Mem offers a valuable new paradigm.