
Your AI Agent Keeps Missing Half the Story. We Fixed That.

Memory Digest assembles complete project briefings with a single command. It gathers up to 100 memories, expands the set via topic and entity graphs, and synthesizes a briefing with an LLM. No more missed context.

Claude Opus 4.6

AI, edited by Harry Mower

Tags: feature, digest, ai-agents, synthesis

You have 50 memories about the customer portal project. You ask your AI agent to "refresh your memory on the customer portal status." So the agent searches. Finds 12 relevant memories. Reads them. Presents a summary.

"Billing integration: NOT STARTED."

Except billing is done. It shipped two weeks ago. There are four PRs, three debugging sessions, and a deployment runbook all saved as memories. The agent just missed them.

This happens constantly. AI agents are excellent at targeted lookups — "what was the fix for bug #295?" — but terrible at broad synthesis. They search, hit their context window, and stop. They miss the billing work because it doesn't use the keyword "customer portal." They miss the deployment details because those were saved in a different conversation. They give you an incomplete picture and call it complete.

So we built something better.

The Idea

The problem isn't that the information doesn't exist. It's that gathering all the relevant pieces, organizing them coherently, and synthesizing them into a complete briefing requires multiple rounds of searching, fetching, and cross-referencing. Most AI agents give up after the first or second round.

We watched an agent perform 9 separate search-and-fetch operations trying to answer "what's the status of the customer portal?" It found memories about the frontend, the API routes, the database schema. But it missed the billing integration entirely because those memories mentioned "payment processing" more than "customer portal."

The agent's final answer: "Billing appears to be not started." The reality: billing was complete, tested, and deployed.

What if you could skip all those rounds of searching and just ask for a complete briefing?

Introducing Memory Digest

Memory Digest is a single command that does what AI agents struggle to do manually:

  1. Gather all relevant memories with a high-limit search
  2. Organize them by discovering related context through topic and entity connections
  3. Synthesize a coherent briefing using LLM analysis

One command. One complete answer. Nothing missed.

mx memories digest --query "customer portal project" --recent 7d

What It Looks Like in Practice

The Basic Digest

You ask for a project status update:

mx memories digest --query "authentication redesign"

MemNexus searches your entire memory base, finds all related memories (not just keyword matches — also memories sharing entities like "JWT", "auth-service", or "user-sessions"), organizes them by conversation and cross-references, and generates a structured briefing:

Authentication Redesign: Current Status

## Overview
The authentication system has been migrated from API key auth to JWT-based
authentication with refresh token rotation. Work completed across 4 PRs
between Jan 28 - Feb 3.

## Implementation Complete
- JWT token generation with 15-minute expiry
- Refresh token rotation using Redis backing store
- Backward compatibility layer for legacy API keys
- Rate limiting per user (100 req/min)

## Deployment Status
- Deployed to staging: Feb 1
- Production rollout: Feb 3, 14:00 UTC
- Monitoring shows 99.8% success rate
- Legacy API key usage down to 12% of traffic

## Outstanding Items
- Docs update pending (issue #892)
- iOS SDK needs refresh token support
- Consider removing legacy auth after March 1

Sources: 14 memories across 3 conversations

No missed billing integrations. No "appears to be not started." Just a complete, accurate briefing assembled from everything you've saved.

Four Output Formats

Different contexts need different presentations:

Structured (default) — Headers, bullet points, clear sections:

mx memories digest --query "Q1 roadmap"

Narrative — Flowing prose, chronological story:

mx memories digest --query "payment bug investigation" --digest-format narrative

This format works well for understanding how something evolved. "We initially thought it was a race condition, but after adding logging discovered it was actually..."

Timeline — Date-ordered events with status indicators:

mx memories digest --query "API migration" --digest-format timeline

Perfect for reconstructing what happened when. Each entry shows the date, what changed, and current versus superseded information.

Status Report — Executive summary focused on progress:

mx memories digest --query "customer portal" --digest-format status-report

Great for standups, weekly reviews, or handing off to teammates. Highlights completed work, in-progress items, blockers, and next steps.

Time Windows

Scope the digest to recent work:

# Last 24 hours
mx memories digest --query "project X" --recent 24h

# This week
mx memories digest --query "debugging" --recent 7d

# Last month
mx memories digest --query "infrastructure" --recent 30d

Machine-Readable Output

Feed digest results to another AI agent or tool:

# JSON output with source memory IDs
mx memories digest --query "API status" --format json

# LLM-optimized format (markdown with minimal formatting)
mx memories digest --query "deployment runbook" --format llm

The --format llm output is designed for consumption by other AI systems. Clean markdown, no ASCII art or formatting that confuses tokenization.
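If you pipe the JSON output into a script, parsing could look like the minimal sketch below. The field names ("digest", "sources") are assumptions for illustration, not the documented schema:

```python
import json

# Illustrative sketch: field names are assumptions, not the actual
# MemNexus JSON schema.
def extract_briefing(raw: str):
    payload = json.loads(raw)
    return payload["digest"], payload["sources"]

sample = '{"digest": "API v2 rollout is in progress.", "sources": ["mem_1", "mem_2"]}'
briefing, sources = extract_briefing(sample)
```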

See What Was Used

Verify which memories contributed to the digest:

mx memories digest --query "project status" --show-sources

The output includes the memory IDs that were used. If something looks wrong, you can inspect the source memories directly with mx memories get <id>.

Under the Hood

The digest pipeline runs in three stages.

Stage 1: Gather

High-limit semantic search (up to 100 memories) finds everything potentially relevant. This isn't a normal search with a 10-result cutoff. It casts a wide net.

You can filter the gather stage:

  • --recent 7d limits to the last week
  • --topics "implementation,completed" scopes to specific topics
  • --conversation-ids "conv_abc,conv_def" pulls from specific work sessions

Time: 50-70ms for the search query.
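The recency filter is conceptually simple. Here is a hypothetical sketch (not MemNexus source) of how a `--recent` window like "24h" or "7d" could be parsed and applied:

```python
from datetime import datetime, timedelta, timezone

# Illustrative sketch: parse a --recent window ("24h", "7d", "30d")
# and keep only memories newer than the cutoff.
UNITS = {"h": "hours", "d": "days"}

def parse_window(spec: str) -> timedelta:
    value, unit = int(spec[:-1]), spec[-1]
    return timedelta(**{UNITS[unit]: value})

def filter_recent(memories, spec, now=None):
    now = now or datetime.now(timezone.utc)
    cutoff = now - parse_window(spec)
    return [m for m in memories if m["created_at"] >= cutoff]
```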

Stage 2: Organize

This is where digest becomes more powerful than manual search.

The organize stage doesn't just take the raw search results. It expands context by:

Topic Graph Traversal — A memory about "payment integration" gets linked to memories about "payment-service" and "billing-api" because they share topic relationships. This catches the work an agent would miss by only searching for "customer portal."

Entity Co-Occurrence — Memories mentioning both "auth-service" and "user-sessions" get connected even if they don't share keywords. Entity relationships create bridges between related work.

Conversation Grouping — Memories from the same work session get organized together, preserving the narrative flow.

This expansion phase is why digest finds the complete picture. It doesn't just match keywords. It follows the knowledge graph.

Time: Graph traversal adds 20-50ms.
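The expansion idea can be sketched in a few lines. This is an illustrative simplification, not the MemNexus implementation: pull in any memory that shares a topic or entity with the seed results, up to the overall limit:

```python
# Illustrative sketch of Stage 2: expand seed search results by
# following shared topics and entities, so related work gets included
# even when it would rank poorly on keyword similarity.
def expand(seeds, all_memories, max_total=100):
    selected = {m["id"]: m for m in seeds}
    topics = {t for m in seeds for t in m["topics"]}
    entities = {e for m in seeds for e in m["entities"]}
    for m in all_memories:
        if len(selected) >= max_total:
            break
        if m["id"] in selected:
            continue
        if topics & set(m["topics"]) or entities & set(m["entities"]):
            selected[m["id"]] = m
    return list(selected.values())
```

In the customer portal example, a billing memory that never says "customer portal" still gets pulled in because it shares an entity like "billing-api" with the seed set.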

Stage 3: Synthesize

All the gathered and organized memories get sent to an LLM with a synthesis prompt tailored to the requested format. The model reads through everything and generates a coherent briefing.

The prompt instructs the model to:

  • Prioritize current information over superseded memories
  • Group related work logically
  • Highlight outcomes and decisions
  • Note any gaps or uncertainties
  • Cite specific source memories when making claims

Time: 2-5 seconds for LLM generation.
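The bulleted instructions above can be pictured as a prompt builder. This sketch is hypothetical (the actual prompts aren't published) and just shows the shape: fixed rules plus a per-format hint plus the gathered memories:

```python
# Illustrative sketch, not the actual MemNexus prompt.
RULES = [
    "Prioritize current information over superseded memories.",
    "Group related work logically.",
    "Highlight outcomes and decisions.",
    "Note any gaps or uncertainties.",
    "Cite specific source memories when making claims.",
]

FORMAT_HINTS = {
    "structured": "Use headers and bullet points with clear sections.",
    "narrative": "Write flowing prose telling the story chronologically.",
    "timeline": "List date-ordered events with status indicators.",
    "status-report": "Summarize completed work, in-progress items, blockers, and next steps.",
}

def build_prompt(memories, digest_format="structured"):
    body = "\n\n".join(f"[{m['id']}] {m['content']}" for m in memories)
    rules = "\n".join(f"- {r}" for r in RULES)
    return (f"Synthesize the memories below into a briefing.\n"
            f"{FORMAT_HINTS[digest_format]}\n{rules}\n\n{body}")
```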

Total pipeline latency: roughly 2-5 seconds, dominated by the LLM call.

Caching Makes It Instant

The first time you run a digest query, it takes a few seconds to generate. The second time? Near-instant.

We added a smart cache that stores completed digests. Identical queries are served from the cache; different queries generate fresh results.

The cache invalidates per-user whenever you create a new memory. This keeps digests accurate — you never get stale results that miss your most recent work.

From your perspective, caching is completely transparent. No API changes, no schema updates, no flags to set. Repeated queries are just faster.

Design Decisions

Why This Synthesis Model?

We tested several models for synthesis and chose the one that consistently produced the best structured output for technical content. It handles code snippets well, generates clean markdown, and runs fast enough that digest feels responsive.

For a feature designed around "ask once, get everything," speed and cost matter. We optimized for the 90% case.

You can request different formats (narrative, status-report) to get different synthesis styles from the same model.

Default to Structured Format

After testing with developers, structured format was the clear winner for most queries. People want to scan sections quickly, not read prose paragraphs. Bullet points, headers, and clear grouping make digests actionable.

Narrative and timeline formats are there for the cases where story and chronology matter more than scanability.

Graph Expansion vs Just Searching More

We considered just increasing the search result limit to 100 and calling it done. But that doesn't solve the core problem. More results don't help if the relevant results aren't in the top 100 by semantic similarity.

Graph expansion finds memories that wouldn't rank highly on semantic search but are connected through shared topics and entities. The billing integration memories mention "payment" more than "customer portal" — semantic search might rank them 150th. Graph expansion pulls them in because they share entities with the portal work.

This is the difference between a tool that searches harder and a tool that understands your knowledge graph.

5-Minute Cache TTL

We picked 5 minutes as the cache TTL because digest is often used in bursts. You ask for a project status, scan it, ask a follow-up question with slightly different parameters. That second query should be instant. After 5 minutes, you're probably doing something else, and stale results start to matter more than speed.

Cache invalidation on new memory creation means you never wait 5 minutes to see your latest work reflected. Write a memory, run the same digest, get the updated result immediately.
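Put together, the cache behavior described above (5-minute TTL, per-user invalidation on write) can be sketched like this. It's an illustrative in-memory version, not the production implementation:

```python
import time

# Illustrative sketch of the digest cache: 5-minute TTL, and all of a
# user's entries are dropped whenever that user creates a new memory.
TTL_SECONDS = 300

class DigestCache:
    def __init__(self):
        self._entries = {}  # (user_id, query_key) -> (digest, expires_at)

    def get(self, user_id, query_key, now=None):
        now = now if now is not None else time.time()
        hit = self._entries.get((user_id, query_key))
        if hit and hit[1] > now:
            return hit[0]
        return None

    def put(self, user_id, query_key, digest, now=None):
        now = now if now is not None else time.time()
        self._entries[(user_id, query_key)] = (digest, now + TTL_SECONDS)

    def invalidate_user(self, user_id):
        # Called on memory creation so new work shows up immediately.
        self._entries = {k: v for k, v in self._entries.items()
                         if k[0] != user_id}
```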

Use Cases

AI Agent Context Loading

Your AI coding assistant needs to understand the current state of a project:

mx memories digest --query "API v2 migration" --format llm | pbcopy

Paste the digest into your AI chat. The agent gets a complete, organized briefing instead of having to search multiple times and potentially miss context.

Monday Morning Standup Prep

You need to report what you worked on last week:

mx memories digest --query "my work" --recent 7d --digest-format status-report

One command. Complete summary. Completed items, in-progress work, blockers, next steps.

Project Handoff

A teammate is taking over your work. Give them the full context:

mx memories digest --query "checkout flow redesign" --digest-format narrative

Narrative format tells the story chronologically. They see what you tried first, what didn't work, why you pivoted, and where you landed.

Debugging Retrospective

You spent three days chasing a bug. What actually happened?

mx memories digest --query "memory leak investigation" --recent 7d --digest-format timeline

Timeline format reconstructs the debugging journey. Initial hypothesis, dead ends, breakthrough moment, final fix, lessons learned.

Architecture Decision Records

You made significant technical decisions across multiple conversations:

mx memories digest --query "database choice for analytics" --topics decision

Digest pulls every decision memory related to the analytics database: alternatives considered, trade-offs evaluated, final choice, rationale. One complete ADR assembled from your distributed decision-making process.

What This Replaces

Before digest, getting a complete project briefing required:

  1. Search for high-level memories: mx memories search --query "project name"
  2. Read the results, note missing areas
  3. Search for specifics: mx memories search --query "project name authentication"
  4. Search for related systems: mx memories search --query "project name database"
  5. Fetch individual memories: mx memories get <id1> <id2> <id3>...
  6. Manually synthesize everything you read into a coherent picture
  7. Realize you missed something, search again

This works fine for targeted questions ("what was the rate limit config?"). It breaks down for broad questions ("what's the status of everything?").

Digest does steps 1-6 automatically, and step 7 doesn't happen because graph expansion finds the related work you would have missed.

The Difference

| Manual Search + Fetch | Memory Digest |
|-----------------------|---------------|
| 6-9 separate commands | 1 command |
| Might miss related context | Graph expansion finds connections |
| Manual synthesis required | LLM generates coherent briefing |
| Depends on keyword overlap | Follows entity and topic relationships |
| Results in flat list | Organized by conversation and topic |

Try It Now

Memory Digest is available now:

  • Core API: Deployed (all 4 phases complete)
  • CLI: v1.7.33
  • SDK: v1.32.1 (TypeScript)

Update your CLI:

mx update
mx --version  # Should show 1.7.33

Get your first digest:

mx memories digest --query "what did I work on today?" --recent 24h

Ask a broad question. See what you get. Check if it missed anything. (It won't.)


Memory Digest with graph-based context expansion and intelligent caching is available now in MemNexus CLI v1.7.33 and Core API.

Ready to give your AI a memory?

Join the waitlist for early access to MemNexus
