Defense in Depth for AI Agents
The security conversation around AI agents has mostly focused on two things: keeping agents from hurting the host system, and keeping malicious tools out of the supply chain. These are real problems. Cisco documented how OpenClaw leaks credentials and executes arbitrary shell commands. Projects like NanoClaw respond by running agents in containers where bash commands can’t reach the host. Zencoder’s MCP survival guide catalogs supply chain attacks against MCP servers and recommends pinning git tags and auditing source.
Threat Modeling a Persistent Memory Store for AI Agents
Persistent memory for AI agents solves a real problem (the goldfish-with-a-PhD problem) but it introduces a new one: a high-trust, cross-session, cross-agent data store sitting inside the LLM’s context loop. Every recall is content that flows into a prompt. Every store is content that came from somewhere — sometimes the user, sometimes the model, sometimes a tool result that originated externally.
That’s a threat model worth writing down before the data store grows up. This post is the threat model for memstore — the persistent memory system I built for Claude Code — and the controls I’m applying or planning.
math-mcp: Giving LLMs a Calculator That Knows It's a Calculator
LLMs do arithmetic in their heads. Mostly they’re close. Occasionally they’re off by enough to matter — a mortgage payment that’s $90 too high, an opportunity-cost claim that’s understated by 10%, a loan payoff term that ignores the interest still accruing while you pay it down. The model doesn’t know which case it’s in. Neither do you, unless you check.
math-mcp is a small MCP server that gives the model somewhere else to send those questions. It exposes ~55 discrete tools: Go’s math standard library, gonum’s statistical aggregates, and razorpay/go-financial’s time-value-of-money functions, each wrapped as its own tool so the LLM picks them out of its tool list by name and calls them directly. One tool per function, no expression evaluator. The point is to make “I should not estimate this” the easy path.
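To make the kind of question concrete, here is a minimal sketch of two time-value-of-money calculations of the sort described above: the fixed payment on an amortizing loan, and the number of payments needed to retire a balance while interest keeps accruing. This is not math-mcp's code (the server wraps Go libraries); the function names `loan_payment` and `payoff_periods` are hypothetical, and the formulas are the standard annuity identities.

```python
import math


def loan_payment(principal: float, rate: float, periods: int) -> float:
    """Fixed payment per period for a fully amortizing loan.

    `rate` is the interest rate per period (e.g. annual rate / 12).
    """
    if rate == 0:
        # No interest: the principal is simply split evenly.
        return principal / periods
    return principal * rate / (1 - (1 + rate) ** -periods)


def payoff_periods(principal: float, rate: float, payment: float) -> float:
    """Number of periods needed to pay off `principal` at a fixed `payment`.

    Accounts for interest accruing on the remaining balance each period.
    """
    if rate == 0:
        return principal / payment
    if payment <= principal * rate:
        # The payment does not even cover one period's interest.
        raise ValueError("payment too small to ever pay off the loan")
    return -math.log(1 - rate * principal / payment) / math.log(1 + rate)


if __name__ == "__main__":
    pmt = loan_payment(200_000, 0.06 / 12, 360)
    naive = 200_000 / pmt                       # ignores accruing interest
    actual = payoff_periods(200_000, 0.06 / 12, pmt)
    print(f"payment={pmt:.2f} naive_term={naive:.1f} actual_term={actual:.1f}")
```

The gap between the naive term and the actual term is the error the excerpt describes: dividing the balance by the payment ignores the interest that accrues while the loan is being paid down.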
Operational Lessons from 1,500 Memstore Facts
Building a persistent memory system for AI agents is one problem. Operating it at scale is a different one. After two months of daily use, memstore holds around 1,500 active facts across a dozen projects: project architecture, coding conventions, design decisions, cross-cutting invariants, and roughly a thousand symbol-level descriptions generated by running an LLM over every Go function in every codebase I work on. The system works. The recall pipeline surfaces relevant context on every prompt without manual search. But getting from “works” to “works well” required a series of scoring adjustments, noise filters, and feedback mechanisms that the original design didn’t anticipate.
Proactive Context Injection with Claude Code Hooks
Claude Code sessions start with amnesia. Every conversation begins cold — no memory of what you worked on yesterday, what invariants matter in the file you’re about to edit, or what tasks are still pending from last week. CLAUDE.md helps by injecting static project context, and MCP tools let the model search for facts on demand. But both of those require something to go right first: either the static file happens to cover the relevant topic, or the model decides to search before acting. In practice, the model often doesn’t search. It plows ahead with what it has, and the most valuable context — the constraint you documented last Tuesday, the task you left half-finished — stays in the database unsurfaced. This post is about closing that gap with hooks: small scripts that fire at specific points in the Claude Code lifecycle and inject relevant context automatically, before the model even knows it needs it.
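As a rough illustration of the idea, the sketch below shows what a prompt-time hook script might look like. It rests on several assumptions: that the hook receives a JSON payload on stdin containing a `prompt` field, and that whatever the script prints to stdout is added to the model's context (my understanding of Claude Code's prompt-submission hook; check the current hook documentation before relying on it). The `FACTS` list and the keyword matching are hypothetical stand-ins for a real memory store and are not memstore's recall pipeline.

```python
import json
import re
import sys

# Hypothetical stand-in for a memory store; a real system would query a database.
FACTS = [
    {"keywords": {"migration", "schema"},
     "text": "Schema changes must ship with a down migration."},
    {"keywords": {"auth", "token"},
     "text": "Auth tokens are rotated weekly; never cache them on disk."},
]


def select_context(prompt, facts, limit=3):
    """Return the text of up to `limit` facts whose keywords appear in the prompt."""
    words = set(re.findall(r"[a-z0-9_]+", prompt.lower()))
    hits = [fact["text"] for fact in facts if fact["keywords"] & words]
    return hits[:limit]


def render(hits):
    """Format matched facts as a short block of context, or '' if none matched."""
    if not hits:
        return ""
    return "Relevant project context:\n" + "\n".join(f"- {hit}" for hit in hits)


def main():
    # Assumption: the hook payload arrives as JSON on stdin with a "prompt" field.
    payload = json.load(sys.stdin)
    output = render(select_context(payload.get("prompt", ""), FACTS))
    if output:
        # Assumption: stdout from the hook is injected into the model's context.
        print(output)


if __name__ == "__main__":
    main()
```

The script stays silent when nothing matches, so an irrelevant prompt adds no noise to the context.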
Building Persistent Memory for AI Agents
AI coding assistants are goldfish with a PhD. They can solve complex problems within a single session — refactoring a module, debugging a race condition, designing an API — but the moment the conversation ends, everything they learned about your project evaporates. After months of building software with Claude Code, I found myself re-explaining the same project conventions, the same architectural decisions, the same mistakes we’d already caught and fixed, at the start of every session. CLAUDE.md files help, but they’re static and they don’t scale. You can’t stuff a dozen projects’ worth of design context into a single markdown file. So I built memstore: a persistent memory system that gives AI agents durable, searchable knowledge across sessions, projects, and machines.