Giving AI Agents Real Memory with Cloudflare Vectorize

Your AI agent forgets everything between sessions. Here's how we fixed that for $6/month.

The Memory Problem

AI assistants are goldfish. Ask Claude or GPT something today, and tomorrow it's gone. OpenClaw and similar agent frameworks inherit this — sessions are isolated, context doesn't persist.

This means:

Repeating preferences every conversation
Re-explaining project context
Agents making the same mistakes twice
No institutional knowledge building over time

The Solution: Semantic Vector Memory

We built a memory layer using:

Cloudflare Vectorize — Vector database for embeddings
Workers AI — bge-base-en-v1.5 for embedding generation
R2 — Persistent storage for raw memory files
Workers — REST API for query/ingest

Total cost: ~$6/month for multiple agents with thousands of memories.

How Auto-Capture Works

After every agent response, a hook scans the conversation for four kinds of statement worth remembering, each matched by a regular expression:

const patterns = {
  decision: /decided|chose|going with|settled on/i,
  correction: /actually|no,|wrong|correction/i,
  preference: /prefer|always|never|like|hate/i,
  learning: /learned|TIL|realized|discovered/i
};

Matches get embedded and stored automatically. No manual "remember this" commands.
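The post shows the patterns but not the hook that applies them. Here is a minimal sketch of how that matching step might look, assuming the hook works sentence by sentence. The function names `classifyMemory` and `extractCandidates` are hypothetical, not part of OpenClaw.

```javascript
// Patterns copied from the post; everything below them is an illustrative assumption.
const patterns = {
  decision: /decided|chose|going with|settled on/i,
  correction: /actually|no,|wrong|correction/i,
  preference: /prefer|always|never|like|hate/i,
  learning: /learned|TIL|realized|discovered/i
};

// Return every memory type whose pattern matches the text.
function classifyMemory(text) {
  return Object.entries(patterns)
    .filter(([, regex]) => regex.test(text))
    .map(([type]) => type);
}

// Split a response into sentences and keep only those that match a pattern.
function extractCandidates(responseText) {
  return responseText
    .split(/(?<=[.!?])\s+/)
    .map((sentence) => sentence.trim())
    .filter((sentence) => sentence.length > 0)
    .map((sentence) => ({ text: sentence, types: classifyMemory(sentence) }))
    .filter((candidate) => candidate.types.length > 0);
}

console.log(extractCandidates("Hello there. We decided to use R2. Thanks!"));
```

The patterns have no word boundaries, so they also fire on substrings: "unlikely" matches `like` and is tagged as a preference. Wrapping the alternatives in `\b...\b` would tighten this at the cost of missing some inflected forms.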

How Auto-Recall Works

Before every response:

User message gets embedded
Vectorize returns top-k similar memories
Matching memories are injected into the system context
Agent responds with historical awareness

Query: "What did we decide about storage?"
→ Vector search (cosine similarity)
→ Returns: "Decision: Triple-layer storage - Git + D1 + R2"
→ Agent responds with context it "remembers"
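To make the recall steps concrete, here is a small sketch that ranks stored memories by cosine similarity and prepends the best ones to a system prompt. It uses an in-memory array as a stand-in for Vectorize and toy 3-dimensional vectors instead of real embeddings. The names `topK` and `injectMemories` and the prompt layout are assumptions.

```javascript
// Cosine similarity between two equal-length, non-zero numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// In-memory stand-in for the vector index: score every memory, keep the k best.
function topK(queryVector, memories, k) {
  return memories
    .map((m) => ({ ...m, score: cosineSimilarity(queryVector, m.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Append recalled memories to the system prompt; leave it untouched if none.
function injectMemories(systemPrompt, recalled) {
  if (recalled.length === 0) return systemPrompt;
  const lines = recalled.map((m) => `- ${m.text}`);
  return `${systemPrompt}\n\nRelevant memories:\n${lines.join("\n")}`;
}

// Toy data: real vectors would come from the embedding model.
const memories = [
  { text: "Decision: Triple-layer storage - Git + D1 + R2", vector: [0.9, 0.1, 0] },
  { text: "Preference: tabs over spaces", vector: [0, 1, 0] }
];
const recalled = topK([1, 0, 0], memories, 1);
console.log(injectMemories("You are a helpful agent.", recalled));
```

In the deployed system the ranking happens inside Vectorize, so only the embedding call and the prompt assembly would run in the plugin.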

The Architecture

┌─────────────────────────────────────────────┐
│                 OpenClaw                     │
├─────────────────────────────────────────────┤
│  ┌─────────┐    ┌──────────┐    ┌────────┐ │
│  │ Session │───▶│ Memory   │───▶│ Vector │ │
│  │         │◀───│ Plugin   │◀───│ Worker │ │
│  └─────────┘    └──────────┘    └────────┘ │
└─────────────────────────────────────────────┘
                       │
         ┌─────────────┼─────────────┐
         ▼             ▼             ▼
    ┌─────────┐  ┌──────────┐  ┌─────────┐
    │   R2    │  │Vectorize │  │Workers  │
    │ (files) │  │ (index)  │  │   AI    │
    └─────────┘  └──────────┘  └─────────┘
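As a rough illustration of the bottom half of the diagram, the Vector Worker's ingest and query paths might look like the sketch below. The binding names `AI`, `VECTORIZE`, and `MEMORY_BUCKET`, the R2 key layout, and the metadata fields are all assumptions. The post does not show the actual Worker code.

```javascript
// Embedding model named in the post; it produces 768-dimensional vectors.
const EMBEDDING_MODEL = "@cf/baai/bge-base-en-v1.5";

// Embed one string with the Workers AI binding and return its vector.
async function embed(env, text) {
  const result = await env.AI.run(EMBEDDING_MODEL, { text: [text] });
  return result.data[0];
}

// Ingest path: keep the raw memory in R2 and index its embedding in Vectorize.
// Binding names and the key layout are hypothetical.
async function ingestMemory(env, memory) {
  const values = await embed(env, memory.text);
  await env.MEMORY_BUCKET.put(`memories/${memory.id}.json`, JSON.stringify(memory));
  await env.VECTORIZE.upsert([
    {
      id: memory.id,
      values,
      metadata: { type: memory.type, agent: memory.agent, text: memory.text }
    }
  ]);
}

// Query path: embed the question and return the closest stored memories.
async function queryMemories(env, question, topK = 5) {
  const vector = await embed(env, question);
  const result = await env.VECTORIZE.query(vector, { topK, returnMetadata: "all" });
  return result.matches.map((m) => ({ id: m.id, score: m.score, ...m.metadata }));
}
```

In a Worker, these functions would be called from the `fetch` handler's ingest and query routes, with `env` supplied by the runtime. The Vectorize index would need to be created with 768 dimensions to match the model.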

Results

39 indexed memories across 2 agents
200-400ms query latency
0.95 similarity threshold prevents duplicate storage
Cross-agent search — agents can access each other's memories
Type filtering — query only decisions, or only corrections
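The duplicate check and type filter can be sketched as two small helpers. The 0.95 value comes from the results above. Applying it to the score of the closest existing match, and storing the type as a `type` field on each memory, are assumptions about how the original works.

```javascript
// Threshold from the post; comparing it against match scores is an assumption.
const DUPLICATE_THRESHOLD = 0.95;

// Given the matches returned by searching for a candidate's own embedding,
// report whether any existing memory is close enough to count as a duplicate.
function isDuplicate(matches, threshold = DUPLICATE_THRESHOLD) {
  return matches.some((m) => m.score >= threshold);
}

// Keep only memories of the requested types, e.g. ["decision"].
function filterByType(memories, types) {
  const wanted = new Set(types);
  return memories.filter((m) => wanted.has(m.type));
}

const existing = [
  { type: "decision", text: "Use R2 for raw files", score: 0.97 },
  { type: "correction", text: "The index is cosine, not euclidean", score: 0.41 }
];
console.log(isDuplicate(existing));               // a 0.97 match exceeds the threshold
console.log(filterByType(existing, ["decision"])); // only the decision remains
```

Filtering after the search, as here, is the simplest form. The same restriction could also be applied at query time if the type is stored as indexed metadata.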

Why This Matters

AI agents that remember become genuinely useful assistants. They:

Build institutional knowledge
Learn from corrections
Maintain consistent preferences
Avoid repeating mistakes

This is table stakes for production AI systems. The fact that most frameworks don't include it out of the box is wild.

Built with OpenClaw + Cloudflare. The memory layer that makes AI actually useful.
