Your AI agent forgets everything between sessions. Here's how we fixed that for $6/month.

The Memory Problem

AI assistants are goldfish. Ask Claude or GPT something today, and tomorrow it's gone. OpenClaw and similar agent frameworks inherit this β€” sessions are isolated, context doesn't persist.

This means:

  • Repeating preferences every conversation
  • Re-explaining project context
  • Agents making the same mistakes twice
  • No institutional knowledge building over time

The Solution: Semantic Vector Memory

We built a memory layer using:

  • Cloudflare Vectorize β€” Vector database for embeddings
  • Workers AI β€” bge-base-en-v1.5 for embedding generation
  • R2 β€” Persistent storage for raw memory files
  • Workers β€” REST API for query/ingest

Total cost: ~$6/month for multiple agents with thousands of memories.

How Auto-Capture Works

After every agent response, a hook analyzes the conversation for:

const patterns = {
  decision: /decided|chose|going with|settled on/i,
  correction: /actually|no,|wrong|correction/i,
  preference: /prefer|always|never|like|hate/i,
  learning: /learned|TIL|realized|discovered/i
};

Matches get embedded and stored automatically. No manual "remember this" commands.

How Auto-Recall Works

Before every response:

  1. User message gets embedded
  2. Vectorize returns top-k similar memories
  3. Memories inject into system context
  4. Agent responds with historical awareness
Query: "What did we decide about storage?"
β†’ Vector search (cosine similarity)
β†’ Returns: "Decision: Triple-layer storage - Git + D1 + R2"
β†’ Agent responds with context it "remembers"

The Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                 OpenClaw                     β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚ Session │───▢│ Memory   │───▢│ Vector β”‚ β”‚
β”‚  β”‚         │◀───│ Plugin   │◀───│ Worker β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β”‚
         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
         β–Ό             β–Ό             β–Ό
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚   R2    β”‚  β”‚Vectorize β”‚  β”‚Workers  β”‚
    β”‚ (files) β”‚  β”‚ (index)  β”‚  β”‚   AI    β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Results

  • 39 indexed memories across 2 agents
  • 200-400ms query latency
  • 0.95 threshold prevents duplicate storage
  • Cross-agent search β€” agents can access each other's memories
  • Type filtering β€” query only decisions, or only corrections

Why This Matters

AI agents that remember become genuinely useful assistants. They:

  • Build institutional knowledge
  • Learn from corrections
  • Maintain consistent preferences
  • Avoid repeating mistakes

This is table stakes for production AI systems. The fact that most frameworks don't include it out of the box is wild.


Built with OpenClaw + Cloudflare. The memory layer that makes AI actually useful.