Problem: AI Assistants That Remember Everything (On Someone Else's Server)
You've used ChatGPT. It forgets everything between sessions. You've tried AI assistants that remember—but your conversations live in their cloud database, accessible to the company, subject to their retention policies, and vulnerable to data breaches.
You'll learn:
- How OpenClaw stores memory as local Markdown files
- Why file-first storage beats cloud databases for privacy
- How semantic search works without sending data externally
- Trade-offs between local and remote embeddings
Time: 12 min | Level: Intermediate
Why This Matters
Most AI assistants store your data in ways you can't inspect, edit, or fully delete. OpenClaw takes a radically different approach: your data never leaves your machine unless you explicitly choose a remote embedding provider.
Common pain points OpenClaw solves:
- "I don't know what my AI remembers about me"
- "I can't delete specific memories without wiping everything"
- "My conversations are analyzed by the AI company for training"
- "I need persistent memory across sessions without vendor lock-in"
How OpenClaw's Memory System Works
The File-First Architecture
OpenClaw uses plain Markdown files as the source of truth. No proprietary database, no cloud sync required.
Storage locations (default):
```
~/.openclaw/workspace/
├── MEMORY.md             # Long-term curated knowledge
├── memory/
│   ├── 2026-02-07.md     # Today's daily log
│   ├── 2026-02-06.md     # Yesterday's log
│   └── project-notes.md  # Custom memory files
└── USER.md               # Your preferences and identity
```
Why Markdown?
- Human-readable: open `MEMORY.md` in any text editor
- Version-controllable: track changes with git
- AI-friendly: easily parsed by language models
- No vendor lock-in: migrate to any tool that reads text files
Two Types of Memory
Daily Logs (Ephemeral)
Created automatically at `memory/YYYY-MM-DD.md` inside the workspace. These capture:
- Running context from conversations
- Decisions made during the day
- Activities and task progress
Auto-loading behavior:
- Session starts → loads today + yesterday
- Provides recent temporal context
- Prevents context window overflow
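The session-start rule above can be sketched as a small helper (hypothetical, for illustration only; not OpenClaw's actual code):

```python
from datetime import date, timedelta
from pathlib import Path

def daily_log_paths(workspace: Path, today: date) -> list[Path]:
    """Return the daily-log files a new session loads: today's and yesterday's."""
    days = [today, today - timedelta(days=1)]
    return [workspace / "memory" / f"{d.isoformat()}.md" for d in days]

# A session starting on 2026-02-07 would load 2026-02-07.md and 2026-02-06.md
paths = daily_log_paths(Path.home() / ".openclaw" / "workspace", date(2026, 2, 7))
```

Loading only two days of logs keeps recent context available without flooding the context window.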
Long-term Memory (Curated)
Stored in MEMORY.md. This contains:
- Personal preferences ("I prefer TypeScript over JavaScript")
- Project conventions ("Use kebab-case for filenames")
- Critical decisions ("Decided to use SQLite instead of PostgreSQL")
- Contact information and relationships
Manual editing encouraged:
```markdown
# MEMORY.md example

## Preferences
- Email response time: within 4 hours during work hours
- Code style: Prettier defaults, 2-space indent
- Meeting preference: Zoom over Google Meet

## Projects

### HomeAutomation
- Stack: Node.js, MQTT, Home Assistant
- Repo: github.com/username/home-automation
- Started: 2025-11-12
```
Semantic Search: How It Works Locally
OpenClaw uses hybrid search combining two methods:
1. Vector Similarity (Embeddings)
Local mode (fully private):
```json
{
  "memorySearch": {
    "provider": "local",
    "local": {
      "modelPath": "~/.openclaw/models/all-MiniLM-L6-v2.gguf"
    }
  }
}
```
- Uses `node-llama-cpp` to run embedding models locally
- Auto-downloads GGUF models from HuggingFace on first use
- Requires ~1GB disk space
- No data sent to external APIs
How it works:
- Text chunks → local embedding model → vector representations
- Vectors stored in SQLite using the `sqlite-vec` extension
- Search query → vector → cosine similarity match
- Returns relevant snippets without full file payload
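The similarity step above is plain cosine similarity between embedding vectors. A minimal sketch (vectors as Python lists; real systems use optimized vector math):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same way score 1.0; orthogonal vectors score 0.0
cosine_similarity([1.0, 0.0], [2.0, 0.0])  # 1.0
cosine_similarity([1.0, 0.0], [0.0, 3.0])  # 0.0
```

Because only the angle matters, a short chunk and a long chunk about the same topic can still score as near-matches.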
Remote mode (optional, faster):
```json
{
  "memorySearch": {
    "provider": "openai",
    "remote": {
      "apiKey": "your-key-here"
    }
  }
}
```
Supported providers: OpenAI, Voyage, and Gemini. These call remote embedding APIs, so your data leaves your machine.
2. Keyword Search (BM25)
Uses SQLite's FTS5 (Full-Text Search) for keyword matching:
- No embeddings needed
- Extremely fast
- Always works offline
- Complements semantic search
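To see what FTS5 keyword matching looks like in practice, here is a sketch using Python's built-in `sqlite3` module (the table and column names are hypothetical, not OpenClaw's actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 virtual table: one indexed text column
conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(content)")
conn.executemany(
    "INSERT INTO chunks (content) VALUES (?)",
    [
        ("Decided to use SQLite instead of PostgreSQL",),
        ("Meeting preference: Zoom over Google Meet",),
    ],
)
# bm25() is FTS5's ranking function; lower values mean better matches
rows = conn.execute(
    "SELECT content FROM chunks WHERE chunks MATCH ? ORDER BY bm25(chunks)",
    ("sqlite",),
).fetchall()
print(rows[0][0])  # the chunk mentioning SQLite
```

The default FTS5 tokenizer is case-insensitive, so `sqlite` matches `SQLite`, and no embedding model is involved at any point.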
Weighted score fusion:
```
final_score = (0.7 × vector_score) + (0.3 × bm25_score)
```
This catches both:
- Conceptually similar text (vectors)
- Exact keyword matches (BM25)
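The fusion rule is just a weighted sum. A minimal sketch, assuming both scores have already been normalized to the same [0, 1] range:

```python
def fuse_scores(vector_score: float, bm25_score: float,
                vector_weight: float = 0.7, bm25_weight: float = 0.3) -> float:
    """Weighted fusion of semantic and keyword scores (both normalized to [0, 1])."""
    return vector_weight * vector_score + bm25_weight * bm25_score

# Strong semantic match, weak keyword match: 0.7*0.9 + 0.3*0.2 = 0.69
fuse_scores(0.9, 0.2)  # 0.69
```

The 0.7/0.3 split favors semantic similarity but lets an exact keyword hit rescue a result the embedding model scored poorly.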
Chunking Strategy
OpenClaw uses a sliding window with overlap:
Target: ~400 tokens per chunk
Overlap: 80 tokens between chunks
Why overlap?
- Prevents context split mid-sentence
- Improves retrieval quality at chunk boundaries
- Example: "The API key for..." gets preserved across chunks
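A minimal sketch of the sliding window (plain string tokens stand in for the real tokenizer; the function is illustrative, not OpenClaw's implementation):

```python
def chunk_tokens(tokens: list[str], size: int = 400, overlap: int = 80) -> list[list[str]]:
    """Sliding window: each chunk starts (size - overlap) tokens after the previous."""
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # last window already reached the end of the text
    return chunks

tokens = [f"t{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
# Consecutive chunks share 80 tokens, so a sentence cut at one boundary
# still appears whole in the neighboring chunk
```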
Privacy Comparison
| Feature | OpenClaw (Local) | Cloud AI Assistants |
|---|---|---|
| Data location | Your machine | Vendor's servers |
| Encryption at rest | Your responsibility | Vendor-managed |
| Data retention | Forever (until you delete) | Vendor policy |
| Third-party access | None (you control) | Training data, subpoenas |
| Offline operation | Yes (with local embeddings) | No |
| Manual editing | Direct file access | API/UI only |
| Portability | Plain text files | Export tools (limited) |
Advanced: QMD Backend (Experimental)
For users wanting more powerful search, OpenClaw supports QMD (Query-Memory-Documents):
```json
{
  "memory": {
    "backend": "qmd"
  }
}
```
QMD adds:
- BM25 + vectors + reranking
- Query expansion for better recall
- Runs fully locally via Bun
Setup:
```shell
# Install QMD CLI
bun install -g github.com/tobi/qmd

# Verify binary is on PATH
which qmd
```
First search triggers:
- Auto-downloads GGUF models (reranker, query expansion)
- Models cache to `~/.openclaw/agents/<agentId>/qmd/`
- No separate Ollama daemon needed
Security Considerations
What OpenClaw Does Right
✅ No telemetry by default: your data stays on your machine
✅ Transparent storage: inspect files anytime
✅ Granular control: delete specific memories manually
✅ No API key leakage in memory: credentials stored separately in ~/.openclaw/credentials
Risks You Should Know
⚠️ Plain text on disk: if your machine is compromised, memory files are readable
⚠️ Predictable locations: ~/.openclaw/ is a known target for infostealers
⚠️ No encryption at rest (by default): consider full-disk encryption
⚠️ Group chat leakage: personal memory loads in group sessions unless configured otherwise
Mitigation strategies:
```jsonc
{
  "agents": {
    "defaults": {
      "sandbox": {
        "mode": "non-main" // Isolate group sessions
      }
    }
  },
  "channels": {
    "telegram": {
      "groups": {
        "*": {
          "requireMention": true // Prevent passive listening
        }
      }
    }
  }
}
```
1Password's security advice:
- Run OpenClaw on a dedicated machine (Mac mini, VPS)
- Give it its own email and password vault
- Treat it like a new employee with limited access
Performance: Local vs Remote Embeddings
Local embeddings (all-MiniLM-L6-v2):
- First search: ~2-5s (model loading)
- Subsequent: ~200-500ms
- Disk: ~1GB for model + embeddings
- CPU: moderate (fine on M1/M2 Macs)
Remote embeddings (OpenAI text-embedding-3-small):
- Latency: ~150-300ms per request
- Cost: $0.02 per 1M tokens
- Batch API: 50% discount for bulk indexing
- No local model storage needed
Hybrid approach:
```jsonc
{
  "memorySearch": {
    "provider": "local",
    "fallback": "openai" // Use remote if local fails
  }
}
```
Multi-Device Sync (Manual)
OpenClaw's default setup does not sync across devices. Each machine has its own workspace.
Workarounds:
1. Git-based sync:
```shell
cd ~/.openclaw/workspace
git init
git add MEMORY.md memory/
git commit -m "Sync memory"
git push origin main

# On other device
git pull origin main
```
2. Third-party plugins:
- MemoryPlugin: cloud vault for cross-device memory (defeats local-first goal)
- openclaw-graphiti-memory: temporal knowledge graphs (experimental)
3. Shared VPS: Run OpenClaw on a VPS, connect from multiple clients via SSH tunnel or Tailscale:
```shell
# On VPS
openclaw gateway --port 18789

# On laptop
ssh -L 18789:localhost:18789 user@vps-ip
```
Automatic Memory Flush (Context Compaction)
When sessions approach context limits, OpenClaw triggers auto-flush:
Trigger point:
```
context_window - reserve_tokens - soft_threshold
```
Example:
- Context window: 200K tokens
- Reserve: 20K
- Soft threshold: 4K
- Flush at: ~176K tokens
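The example arithmetic as a one-liner:

```python
def flush_trigger(context_window: int, reserve: int, soft_threshold: int) -> int:
    """Token count at which auto-flush fires before compaction."""
    return context_window - reserve - soft_threshold

# 200K window, 20K reserve, 4K soft threshold -> flush at 176K tokens
flush_trigger(200_000, 20_000, 4_000)  # 176000
```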
What happens:
- Silent agentic turn runs
- Model reviews current context
- Writes important info to `MEMORY.md` or the daily log
- Returns `NO_REPLY` if there is nothing to save
- Compaction proceeds without data loss
Why this matters:
- Prevents losing context during compaction
- Durable memories written before eviction
- Fully automatic, no user action needed
What You Learned
- OpenClaw uses plain Markdown files as the source of truth for memory
- Local embeddings keep your data private; remote embeddings trade privacy for speed
- Hybrid search (vectors + BM25) provides better recall than either alone
- Daily logs capture ephemeral context; MEMORY.md stores curated knowledge
- Security requires dedicated hardware or careful sandboxing for group chats
Limitations:
- No built-in multi-device sync (use git or VPS)
- Plain text on disk (encrypt your drive)
- Daily logs accumulate (~100-500MB per year)
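To counter that last limitation, a small housekeeping script (not an OpenClaw feature; purely illustrative) can find daily logs past a retention window so you can archive or delete them:

```python
from datetime import date, timedelta
from pathlib import Path

def stale_daily_logs(memory_dir: Path, today: date, keep_days: int = 90) -> list[Path]:
    """Find YYYY-MM-DD.md daily logs older than keep_days; caller decides what to do."""
    cutoff = today - timedelta(days=keep_days)
    stale = []
    for path in memory_dir.glob("*.md"):
        try:
            logged = date.fromisoformat(path.stem)
        except ValueError:
            continue  # skip non-dated files like project-notes.md
        if logged < cutoff:
            stale.append(path)
    return sorted(stale)
```

Because memory is plain Markdown, pruning is an ordinary file operation; no export tool or vendor API stands between you and your data.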