Three Memory Patterns Every AI Agent Needs
Most AI agents have no memory between sessions. Here are three patterns — session memory, preference memory, and knowledge memory — that make agents genuinely useful over time.
MemNexus Team
Engineering
Most AI agents are amnesiac by design. They're sharp within a session and blank at the start of the next one. Every conversation begins from zero — the same introductions, the same context-setting, the same explanation of how the user likes to work.
This isn't a bug in the model. It's a structural property of stateless inference APIs. But it's one you can work around: build a memory layer that persists between sessions and injects the right context at the right time.
There are three distinct memory patterns worth implementing. Each serves a different purpose, compounds differently over time, and requires a slightly different approach.
Pattern 1: Session Memory
Session memory is the most immediate win. At the end of each session, you save a summary of what happened. At the start of the next, you load the last few sessions and inject them into context.
Without this, every session with a coding assistant starts with re-establishing what the user is working on. What file they're in. What problem they were solving. What they tried that didn't work. That setup cost is friction — and it accumulates.
With session memory, the assistant opens knowing where things left off.
# At session end: save what happened
mx memories create \
--conversation-id "conv_alice_001" \
--content "Helped Alice debug Stripe webhook. Root cause: missing idempotency key on subscription creation. Fixed in PR #89. She's next looking at the retry logic in payment-processor.ts."
# At session start: load recent context
mx memories search --query "Alice recent work" --brief
In the TypeScript SDK, the retrieve-then-inject loop looks like this:
import { MemnexusClient } from "@memnexus-ai/mx-typescript-sdk";
const memory = new MemnexusClient({ apiKey: process.env.MX_API_KEY });
// Load up to 3 recent session summaries relevant to this user
const recent = await memory.memories.search({
query: currentTopic,
topics: [userId],
limit: 3,
});
const sessionContext = recent.data
.map((r) => r.memory.content)
.join("\n\n");
// Inject into system prompt before calling the model
const systemPrompt = `You are a coding assistant.
Previous session context:
${sessionContext}
Help the user continue their work.`.trim();
The key detail: don't save raw transcripts. Save distilled summaries — what was decided, what was built, what comes next. A second LLM call at session end that extracts the key facts from the conversation is a small cost for a much cleaner memory store.
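One way to sketch that distillation step is a helper that turns the raw transcript into a summarization prompt for the second LLM call. The `Message` shape and the prompt wording here are illustrative assumptions, not part of the MemNexus API:

```typescript
// Hypothetical helper: build the prompt for a session-end
// summarization call. Shape and wording are illustrative.
interface Message {
  role: "user" | "assistant";
  content: string;
}

export function buildDistillationPrompt(transcript: Message[]): string {
  const conversation = transcript
    .map((m) => `${m.role}: ${m.content}`)
    .join("\n");
  return [
    "Summarize this session in 3-5 sentences.",
    "Capture: what was decided, what was built, what comes next.",
    "Omit greetings and dead ends unless they revealed something important.",
    "",
    conversation,
  ].join("\n");
}
```

Send the prompt to whatever model you use for the session itself, then save the response with `mx memories create` as shown above.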
Best for: coding assistants, research agents, long-running project work.
Pattern 2: Preference Memory
Users state their preferences once. They expect the agent to remember them forever. Instead, they almost always have to repeat themselves.
"I prefer bullet points over paragraphs." "Use TypeScript, not JavaScript." "Don't suggest ORMs — I write raw SQL." These are high-value signals that shape every response the agent gives, but they're only stated explicitly once, early in the relationship.
Preference memory captures these signals when they surface and injects them at every subsequent session start.
# Capture a stated preference
mx memories create \
--conversation-id "prefs-alice" \
--content "Alice prefers TypeScript. Writes tests first. Uses Result types, not exceptions. Wants concise answers — skip the preamble." \
--topics "preferences"
# At session start: load all preferences for this user
mx memories search --query "Alice preferences" --topics "preferences" --brief
The retrieval side is straightforward — search for preference memories tagged to this user and inject them near the top of the system prompt, before the user's message. The model applies them consistently without the user needing to repeat themselves.
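The injection can be sketched as a small helper that appends a preference section to the base system prompt. The function name and section heading are illustrative, not part of the SDK:

```typescript
// Hypothetical helper: append preference memories to the system
// prompt so they shape every response. Names are illustrative.
export function buildPreferenceAwarePrompt(
  basePrompt: string,
  preferences: string[],
): string {
  if (preferences.length === 0) return basePrompt;
  return [
    basePrompt,
    "",
    "User preferences (apply to every response):",
    ...preferences.map((p) => `- ${p}`),
  ].join("\n");
}
```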
Named memories work well here for preferences that evolve. Create a named memory like alice-preferences and update it as new signals emerge. The latest version supersedes the previous one, and retrieval by name is deterministic — no search required.
# Named preference memory — deterministic retrieval, versioned
mx memories create \
--name "alice-preferences" \
--content "TypeScript only. TDD. Result types. Terse responses. No ORMs." \
--conversation-id "NEW"
# Retrieve at session start
mx memories get --name "alice-preferences"
Best for: personal assistants, customer support agents, any agent with repeat users.
Pattern 3: Knowledge Memory
This is the pattern that makes month 6 dramatically more useful than month 1.
Session memory covers what happened in recent sessions. Preference memory covers how a user likes to work. Knowledge memory covers what the agent has learned — facts, decisions, domain knowledge, discovered gotchas — across its entire operating history.
When a coding assistant helps debug a tricky issue with a third-party library, that knowledge should persist. The next time anyone on the team hits the same issue, the agent already knows the root cause and the fix. When a research agent learns something specific about a domain — a regulatory nuance, an architectural pattern, a competitive insight — that knowledge compounds. The agent gets smarter over time, not just more experienced with individual users.
# Save a learned fact when something important is discovered
mx memories create \
--conversation-id "conv_001" \
--content "Stripe webhook signature verification fails silently when the raw body is parsed before reaching the middleware. Always use express.raw() on the webhook route, not express.json(). Learned debugging PR #89."
# Six weeks later: another developer hits a Stripe issue
mx memories search --query "Stripe webhook" --brief
# Returns the fix from PR #89 — the agent already knows this
The value is semantic retrieval. You don't need to know exactly what was stored or when. A query like "what do we know about payment handling" returns everything relevant — documented decisions, discovered bugs, architectural choices — regardless of when it was stored.
In the SDK:
// Retrieve knowledge relevant to the current topic
const knowledge = await memory.memories.search({
query: "Stripe webhook payment processing",
limit: 5,
});
// This returns memories from months ago that are directly relevant
// The user asking today benefits from everything learned before
Knowledge memory is where the compounding effect lives. A memory store that's been accumulating for six months contains institutional knowledge that no single session could build. The agent surfaces it when relevant, which makes every answer better calibrated to the specific codebase, domain, or context it's operating in.
Best for: team knowledge bases, domain-specialized research assistants, long-lived coding agents.
Putting it together
The three patterns layer cleanly:
- At session start: load preferences (named memory), recent sessions (session memory), and topic-relevant knowledge (knowledge memory)
- Build the system prompt with all three injected
- Call the model
- At session end: extract and save what's worth keeping — new preferences stated, session summary, any domain facts learned
The total overhead is three reads and one write per session. The returns compound.
const [prefs, recentSessions, knowledge] = await Promise.all([
memory.memories.get({ name: `${userId}-preferences` }),
memory.memories.search({ query: topic, topics: [userId], limit: 3 }),
memory.memories.search({ query: topic, limit: 5 }),
]);
const systemPrompt = buildSystemPrompt(prefs, recentSessions, knowledge);
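A minimal `buildSystemPrompt` might look like the sketch below. The response shapes mirror the earlier examples (`.data[].memory.content` for search results); treat the field names as assumptions about the SDK:

```typescript
// Sketch of buildSystemPrompt. Field names assume the response
// shapes shown earlier; adjust to the actual SDK types.
interface NamedMemory {
  content: string;
}
interface SearchResult {
  data: { memory: { content: string } }[];
}

export function buildSystemPrompt(
  prefs: NamedMemory | null,
  recentSessions: SearchResult,
  knowledge: SearchResult,
): string {
  const sections: string[] = ["You are a coding assistant."];
  if (prefs) {
    sections.push(`User preferences:\n${prefs.content}`);
  }
  const sessions = recentSessions.data.map((r) => r.memory.content);
  if (sessions.length > 0) {
    sections.push(`Recent sessions:\n${sessions.join("\n\n")}`);
  }
  const facts = knowledge.data.map((r) => r.memory.content);
  if (facts.length > 0) {
    sections.push(`Relevant knowledge:\n${facts.join("\n\n")}`);
  }
  return sections.join("\n\n");
}
```

Ordering preferences first keeps them closest to the top of the prompt, where they most reliably shape the response.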
Most AI agents skip all of this and wonder why users don't come back. The agents that implement these three patterns are the ones that feel like they actually know you — because they do.
MemNexus handles the storage, semantic indexing, and retrieval infrastructure. Install the SDK with npm install @memnexus-ai/mx-typescript-sdk and read the full SDK guide to get started.