Threat Modeling a Persistent Memory Store for AI Agents
Persistent memory for AI agents solves a real problem (the goldfish-with-a-PhD problem), but it introduces a new one: a high-trust, cross-session, cross-agent data store sitting inside the LLM’s context loop. Every recall is content that flows into a prompt. Every store is content that came from somewhere: sometimes the user, sometimes the model, sometimes a tool result that originated externally.
That’s a threat model worth writing down before the data store grows up. This post is the threat model for memstore — the persistent memory system I built for Claude Code — and the controls I’m applying or planning.
Self-Improving Recall: A Feedback Loop for AI Memory
A memory system that ranks facts the same way forever is dead weight. The signal that
actually matters — did this fact help, or did it waste context — only exists during a
real conversation. Memstore’s feedback loop captures that signal in-session and feeds it
back into recall ranking, so the system gets better at surfacing useful knowledge the
more it’s used.
The problem: static recall is stale recall
Memstore’s baseline ranking uses static signals — IDF, project boosts, semantic similarity, recency, surface-aware multipliers for project-level facts. All of them are derived from the fact itself, the query, or the static metadata around them. None answers the real question: when memstore injected this fact last time, did it help the agent or did it just crowd out something better?
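To make the idea concrete, here is a minimal sketch of how in-session feedback could adjust a static score. It is not memstore's actual scoring code: the `FactStats` counters, the smoothing prior, and the 0.5 to 1.5 multiplier range are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FactStats:
    """Hypothetical per-fact feedback counters collected in-session."""
    helpful: int = 0    # times the fact was judged useful after injection
    unhelpful: int = 0  # times it was judged to have wasted context


def feedback_multiplier(stats: FactStats, prior: float = 2.0,
                        floor: float = 0.5, ceil: float = 1.5) -> float:
    """Map feedback counts to a multiplier in [floor, ceil].

    The symmetric prior keeps facts with little feedback close to 1.0,
    so a single vote cannot swing the ranking. All constants are assumed.
    """
    rate = (stats.helpful + prior) / (stats.helpful + stats.unhelpful + 2 * prior)
    return floor + (ceil - floor) * rate


def rerank(candidates, stats_by_id):
    """Rescale static scores by feedback and sort, best first.

    candidates: list of (fact_id, static_score) pairs.
    stats_by_id: dict mapping fact_id to FactStats; missing ids get no feedback.
    """
    scored = [
        (fact_id, static * feedback_multiplier(stats_by_id.get(fact_id, FactStats())))
        for fact_id, static in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With these assumed constants, a fact with no feedback keeps its static score unchanged, while a fact that was repeatedly marked unhelpful can drop below a slightly weaker candidate that consistently helped.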
Operational Lessons from 1,500 Memstore Facts
Building a persistent memory system for AI agents is one problem. Operating it at scale
is a different one. After two months of daily use, memstore holds around 1,500 active facts
across a dozen projects — project architecture, coding conventions, design decisions,
cross-cutting invariants, and roughly a thousand symbol-level descriptions generated by
running an LLM over every Go function in every codebase I work on. The system works. The
recall pipeline surfaces relevant context on every prompt without manual search. But getting
from “works” to “works well” required a series of scoring adjustments, noise filters, and
feedback mechanisms that the original design didn’t anticipate.
Fact Supersession: Version Control for Knowledge
Most memory systems for AI agents treat knowledge as a key-value store. Write a fact, overwrite it later, and the old value is gone. That works for simple preferences; “use dark mode” doesn’t need a paper trail. But knowledge that evolves over time is a different problem. When a design decision turns out to be wrong, or a project’s architecture shifts, or a dependency gets replaced, you don’t just want the current answer. You want to know what you believed before, when it changed, and ideally why. Losing that history means losing the reasoning trail, and reasoning is the expensive part. Memstore’s supersession system brings version control semantics to AI memory: facts get replaced, not erased, and the full chain of revisions is preserved.
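A rough sketch of the replace-don't-erase idea, assuming each fact carries a forward pointer to the fact that replaced it. The `Fact` and `FactStore` names, the `superseded_by` field, and the in-memory dict are hypothetical stand-ins, not memstore's real schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Fact:
    id: int
    content: str
    superseded_by: Optional[int] = None  # None means this is the current version
    reason: str = ""                     # why this fact replaced its predecessor


class FactStore:
    """In-memory illustration; a real store would persist these rows."""

    def __init__(self) -> None:
        self._facts: Dict[int, Fact] = {}
        self._next_id = 1

    def add(self, content: str, reason: str = "") -> Fact:
        fact = Fact(self._next_id, content, reason=reason)
        self._facts[fact.id] = fact
        self._next_id += 1
        return fact

    def supersede(self, old_id: int, content: str, reason: str) -> Fact:
        """Add a replacement and link the old fact to it; nothing is deleted."""
        old = self._facts[old_id]
        if old.superseded_by is not None:
            raise ValueError(f"fact {old_id} is already superseded")
        new = self.add(content, reason)
        old.superseded_by = new.id
        return new

    def current(self, fact_id: int) -> Fact:
        """Follow forward pointers to the latest version."""
        fact = self._facts[fact_id]
        while fact.superseded_by is not None:
            fact = self._facts[fact.superseded_by]
        return fact

    def history(self, fact_id: int) -> List[Fact]:
        """Return the whole revision chain containing fact_id, oldest first."""
        predecessor = {f.superseded_by: f.id for f in self._facts.values()
                       if f.superseded_by is not None}
        root = fact_id
        while root in predecessor:
            root = predecessor[root]
        chain = [self._facts[root]]
        while chain[-1].superseded_by is not None:
            chain.append(self._facts[chain[-1].superseded_by])
        return chain
```

Recall would normally query only facts whose `superseded_by` is unset, while `history` answers the "what did we believe before, and why did it change" question from any point in the chain.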
Proactive Context Injection with Claude Code Hooks
Claude Code sessions start with amnesia. Every conversation begins cold — no memory of what you worked on yesterday, what invariants matter in the file you’re about to edit, or what tasks are still pending from last week. CLAUDE.md helps by injecting static project context, and MCP tools let the model search for facts on demand. But both of those require something to go right first: either the static file happens to cover the relevant topic, or the model decides to search before acting. In practice, the model often doesn’t search. It plows ahead with what it has, and the most valuable context — the constraint you documented last Tuesday, the task you left half-finished — stays in the database unsurfaced. This post is about closing that gap with hooks: small scripts that fire at specific points in the Claude Code lifecycle and inject relevant context automatically, before the model even knows it needs it.
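As an illustration of the shape such a hook can take, here is a minimal sketch of a prompt-time hook body. It assumes the hook receives a JSON payload on stdin with `prompt` and `cwd` fields and that whatever it prints is added to the model's context; check both against the Claude Code hooks documentation before relying on them. The `FACTS` list and `search_facts` function are hypothetical stand-ins for a real memstore query, and the sample facts are invented.

```python
import json
import sys
from typing import Dict, List, TextIO

# Hypothetical sample data standing in for the memstore database.
FACTS: List[Dict[str, str]] = [
    {"project": "memstore", "text": "Recall ranking must stay deterministic in tests."},
    {"project": "memstore", "text": "Pending task: finish the supersession migration."},
    {"project": "other", "text": "Unrelated fact from another project."},
]


def search_facts(prompt: str, project: str, limit: int = 3) -> List[str]:
    """Rank same-project facts by naive keyword overlap with the prompt."""
    words = set(prompt.lower().split())
    scored = []
    for fact in FACTS:
        if fact["project"] != project:
            continue
        overlap = len(words & set(fact["text"].lower().split()))
        if overlap:
            scored.append((overlap, fact["text"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:limit]]


def main(stdin: TextIO, stdout: TextIO) -> None:
    """Read the hook payload, look up matching facts, and print them."""
    payload = json.load(stdin)
    prompt = payload.get("prompt", "")
    # Assumption: the project name is the last component of the working directory.
    project = payload.get("cwd", "").rstrip("/").split("/")[-1]
    facts = search_facts(prompt, project)
    if facts:
        stdout.write("Relevant memory:\n")
        for text in facts:
            stdout.write(f"- {text}\n")
```

A script registered as a hook would end by calling `main(sys.stdin, sys.stdout)`. Printing nothing when no fact matches keeps the hook from adding noise to prompts it has nothing to say about.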
Building Persistent Memory for AI Agents
AI coding assistants are goldfish with a PhD. They can solve complex problems within a single session — refactoring a module, debugging a race condition, designing an API — but the moment the conversation ends, everything they learned about your project evaporates. After months of building software with Claude Code, I found myself re-explaining the same project conventions, the same architectural decisions, the same mistakes we’d already caught and fixed, at the start of every session. CLAUDE.md files help, but they’re static and they don’t scale. You can’t stuff a dozen projects’ worth of design context into a single markdown file. So I built memstore: a persistent memory system that gives AI agents durable, searchable knowledge across sessions, projects, and machines.