Your AI agent forgets everything between sessions. Here's how we fixed that for $6/month.
The Memory Problem
AI assistants are goldfish. Ask Claude or GPT something today, and tomorrow it's gone. OpenClaw and similar agent frameworks inherit this β sessions are isolated, context doesn't persist.
This means:
- Repeating preferences every conversation
- Re-explaining project context
- Agents making the same mistakes twice
- No institutional knowledge building over time
The Solution: Semantic Vector Memory
We built a memory layer using:
- Cloudflare Vectorize β Vector database for embeddings
- Workers AI β bge-base-en-v1.5 for embedding generation
- R2 β Persistent storage for raw memory files
- Workers β REST API for query/ingest
Total cost: ~$6/month for multiple agents with thousands of memories.
How Auto-Capture Works
After every agent response, a hook analyzes the conversation for:
const patterns = {
decision: /decided|chose|going with|settled on/i,
correction: /actually|no,|wrong|correction/i,
preference: /prefer|always|never|like|hate/i,
learning: /learned|TIL|realized|discovered/i
};
Matches get embedded and stored automatically. No manual "remember this" commands.
How Auto-Recall Works
Before every response:
- User message gets embedded
- Vectorize returns top-k similar memories
- Memories inject into system context
- Agent responds with historical awareness
Query: "What did we decide about storage?"
β Vector search (cosine similarity)
β Returns: "Decision: Triple-layer storage - Git + D1 + R2"
β Agent responds with context it "remembers"
The Architecture
βββββββββββββββββββββββββββββββββββββββββββββββ
β OpenClaw β
βββββββββββββββββββββββββββββββββββββββββββββββ€
β βββββββββββ ββββββββββββ ββββββββββ β
β β Session βββββΆβ Memory βββββΆβ Vector β β
β β ββββββ Plugin ββββββ Worker β β
β βββββββββββ ββββββββββββ ββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββ
β
βββββββββββββββΌββββββββββββββ
βΌ βΌ βΌ
βββββββββββ ββββββββββββ βββββββββββ
β R2 β βVectorize β βWorkers β
β (files) β β (index) β β AI β
βββββββββββ ββββββββββββ βββββββββββ
Results
- 39 indexed memories across 2 agents
- 200-400ms query latency
- 0.95 threshold prevents duplicate storage
- Cross-agent search β agents can access each other's memories
- Type filtering β query only decisions, or only corrections
Why This Matters
AI agents that remember become genuinely useful assistants. They:
- Build institutional knowledge
- Learn from corrections
- Maintain consistent preferences
- Avoid repeating mistakes
This is table stakes for production AI systems. The fact that most frameworks don't include it out of the box is wild.
Built with OpenClaw + Cloudflare. The memory layer that makes AI actually useful.