Memory
Understand This First
- Harness (Agentic) — the harness stores and loads memory entries.
- Context Window — memory competes for space in the finite window.
Context
At the agentic level, memory is persisted information that allows an agent to maintain consistency across sessions. Unlike an instruction file, which is authored by a human and describes project conventions, memory is typically accumulated from experience: learnings, corrections, and preferences discovered during previous work sessions.
Memory addresses the statelessness of models. Each conversation starts fresh; without memory, the agent repeats the same mistakes, asks the same questions, and needs the same corrections session after session. Memory gives the agent a persistent substrate for learning.
Problem
How do you prevent an agent from repeating mistakes or forgetting lessons learned in previous sessions?
A developer corrects an agent’s behavior (“don’t use library X, use library Y instead”) and the agent complies for the rest of the session. Next session, the agent uses library X again. The correction is lost because the model has no memory between sessions. Multiplied across dozens of corrections and preferences, this creates a frustrating cycle of re-education.
Forces
- Model statelessness: each session starts from zero.
- Correction fatigue: repeating the same feedback erodes trust in the workflow.
- Knowledge accumulation: real expertise grows through experience, and agents should benefit from past sessions.
- Noise risk: too much accumulated memory dilutes the context window with low-value information.
Solution
Use memory mechanisms provided by your harness to persist important learnings, corrections, and preferences across sessions. Memory entries are typically short, specific statements that capture a lesson:
- “When modifying database queries in this project, always include the tenant_id filter.”
- “The team prefers early returns over nested conditionals.”
- “The staging environment requires VPN access; don’t suggest direct connections.”
Good memory entries share several qualities:
Specificity. “Be careful with the database” is useless. “Always use parameterized queries to prevent SQL injection” is actionable.
Relevance. Memory entries should capture lessons that are likely to recur. A one-time debugging note about a transient issue is noise.
Currency. Memory entries can become stale. Periodically review and prune entries that no longer apply.
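These qualities map naturally onto a dated, plain-text memory file loaded at session start. The sketch below is a minimal, harness-agnostic illustration, assuming a hypothetical `memory.md` file and date-tagged entries; real harnesses (Claude Code's CLAUDE.md, ChatGPT's memory store) use their own formats and storage.

```python
from datetime import date, timedelta
from pathlib import Path

MEMORY_FILE = Path("memory.md")  # hypothetical location; real harnesses differ

def save_entry(text: str) -> None:
    """Append one dated, single-line lesson to the memory file."""
    with MEMORY_FILE.open("a", encoding="utf-8") as f:
        f.write(f"- [{date.today().isoformat()}] {text}\n")

def load_entries() -> list[str]:
    """Load all entries; a harness would inject these into the context."""
    if not MEMORY_FILE.exists():
        return []
    return [line.rstrip("\n")
            for line in MEMORY_FILE.open(encoding="utf-8")
            if line.startswith("- [")]

def prune_older_than(days: int) -> int:
    """Drop entries past a staleness cutoff (currency); return count removed."""
    cutoff = (date.today() - timedelta(days=days)).isoformat()
    kept, removed = [], 0
    for line in load_entries():
        entry_date = line[3:13]  # the YYYY-MM-DD between the brackets
        if entry_date >= cutoff:
            kept.append(line)
        else:
            removed += 1
    MEMORY_FILE.write_text("".join(k + "\n" for k in kept), encoding="utf-8")
    return removed
```

The date tag exists to serve the currency quality: it makes periodic pruning mechanical rather than a matter of memory archaeology.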
Memory works alongside instruction files but serves a different purpose. Instruction files are deliberately authored project documentation. Memory is the accumulation of corrections and discoveries: the notes a developer scribbles in the margins while learning a codebase.
Working examples as memory. Memory doesn’t have to be prose rules. Saving working code snippets, successful configurations, and proven recipes creates a personal knowledge library the agent can draw on in future sessions. A developer who solves a tricky OAuth flow can save the working implementation as a memory entry. Next time a similar integration arises, the agent has a tested reference point instead of generating from scratch. This turns personal expertise into reusable agent infrastructure.
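A snippet library of this kind can be as simple as one file per topic. The sketch below is illustrative only, assuming a hypothetical `memory_snippets/` directory and a naive keyword search; a real harness would use its own retrieval mechanism.

```python
from pathlib import Path

SNIPPET_DIR = Path("memory_snippets")  # hypothetical layout; adjust to your harness

def save_snippet(topic: str, code: str, note: str = "") -> Path:
    """Store a proven, working snippet under a topic name for future sessions."""
    SNIPPET_DIR.mkdir(exist_ok=True)
    path = SNIPPET_DIR / f"{topic}.md"
    path.write_text(f"# {topic}\n\n{note}\n\n{code}\n", encoding="utf-8")
    return path

def find_snippets(keyword: str) -> list[str]:
    """Return topics whose saved snippet mentions the keyword."""
    if not SNIPPET_DIR.exists():
        return []
    return [p.stem for p in SNIPPET_DIR.glob("*.md")
            if keyword.lower() in p.read_text(encoding="utf-8").lower()]
```

The point is less the mechanism than the habit: a working OAuth flow saved today is a tested reference the agent can retrieve tomorrow instead of regenerating from scratch.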
When you correct an agent and the correction will apply to future sessions, ask the agent to save it as a memory entry. Frame it as a rule: “Remember: in this project, we always X because Y.” This turns a one-time correction into a durable improvement.
How It Plays Out
A developer spends a session working with an agent on a payment processing module. During the session, she corrects the agent three times: use decimal types for currency (not floats), always log transaction IDs, and wrap payment calls in idempotency guards. She saves each correction as a memory entry. In the next session, when she asks the agent to add a new payment method, the agent applies all three conventions without being reminded.
A team notices that their agent’s memory has grown to fifty entries over several months, some referencing deprecated patterns. They spend fifteen minutes pruning the list, removing outdated entries and consolidating related ones. Output quality improves because the context window is no longer carrying stale information.
A developer who frequently builds CLI tools saves her working argument-parser boilerplate as a memory entry. Two weeks later, she starts a new project and asks the agent to set up the CLI scaffolding. The agent pulls from the saved example rather than generating from defaults, producing code that matches her preferred structure on the first try.
“Save this as a memory: in this project, always use Decimal for currency fields, never use floating point. Also remember that all API responses must include a request_id header for tracing.”
Consequences
Memory makes agents feel like they learn over time. Corrections stick. Preferences accumulate. Working examples compound. The agent becomes more useful with continued use, and teams that invest in memory curation develop agents that behave like experienced colleagues who know the project’s quirks.
The cost is curation. Memory without pruning becomes noise. Contradictory entries confuse the model. Memory entries consume context window space in every session, so bloated memory directly reduces the space available for the current task. Treat memory as a curated collection, not an append-only log.
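Because every entry is paid for in context tokens each session, it helps to make the cost visible. A rough sketch, using the common heuristic of roughly four characters per token for English text and a hypothetical budget threshold:

```python
def memory_token_estimate(entries: list[str]) -> int:
    """Very rough token estimate (~4 characters per token for English text)."""
    return sum(len(e) for e in entries) // 4

def over_budget(entries: list[str], budget_tokens: int = 2000) -> bool:
    """Flag when accumulated memory starts crowding out the working context."""
    return memory_token_estimate(entries) > budget_tokens
```

When the check trips, that is the signal to prune and consolidate rather than keep appending.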
Related Patterns
- Depends on: Harness (Agentic) — the harness stores and loads memory entries.
- Contrasts with: Instruction File — instruction files are human-authored; memory is experience-accumulated.
- Uses: Context Engineering — memory is context that persists across sessions.
- Enables: Progress Log — memory captures learnings; progress logs capture actions.
- Depends on: Context Window — memory competes for space in the finite window.
Sources
- OpenAI introduced persistent memory for ChatGPT in February 2024, making it the first major AI assistant to retain user preferences and corrections across sessions. The feature established the pattern of accumulated, user-visible memory entries that this article describes.
- Anthropic’s Claude Code introduced file-based memory through CLAUDE.md files, where project conventions and accumulated learnings are stored as plain text that loads automatically at session start. This approach treats memory as editable, version-controlled documents rather than opaque database entries.
- Mem0, founded by Taranjeet Singh and Deshraj Yadav in January 2024, built the first dedicated open-source memory layer for AI agents, providing infrastructure for storing, retrieving, and managing persistent agent memories at scale.
- The semantic, episodic, and procedural memory taxonomy that underpins modern agent memory design traces to Endel Tulving, who distinguished episodic from semantic memory in Elements of Episodic Memory (1983). Agent memory systems map directly onto his categories.