Problem: Cloud AI Services Know Everything About You
You want an AI assistant but don't want your conversations, files, or commands sent to OpenAI, Anthropic, or Google servers.
You'll learn how to:
- Run OpenClaw with 100% local models (no cloud required)
- Keep all data on your machine with Ollama integration
- Connect messaging apps while maintaining privacy
Time: 30 min | Level: Intermediate
Why This Matters
Cloud AI services process your data on their servers. Every prompt, every file upload, every conversation goes through their infrastructure. OpenClaw with Ollama changes this - your AI runs entirely on your hardware.
Common privacy concerns:
- API providers log your prompts for training
- Conversations stored on third-party servers
- Subscription costs add up ($20-200/month)
- Network dependency - no internet = no AI
What You Need
Hardware requirements:
- CPU: 4+ cores (8+ recommended)
- RAM: 16GB minimum (32GB for larger models)
- Storage: 50GB+ free space
- GPU: Optional but 5-10x faster (NVIDIA preferred)
Software requirements:
- Ubuntu 22.04/24.04, macOS 14+, or Windows 10+
- Node.js 22+ (installer handles this)
- Terminal access
- No cloud accounts needed
Solution
Step 1: Install Ollama
Ollama runs AI models locally on your machine.
On Linux/macOS:
curl -fsSL https://ollama.com/install.sh | sh
On Windows: Download from https://ollama.com/download/windows
Verify installation:
ollama --version
Expected: ollama version 0.1.22 or newer
Step 2: Download a Local Model
Pull a model optimized for OpenClaw tasks:
# Recommended: Best balance of speed and capability
ollama pull qwen2.5-coder:32b-instruct
# Alternative: Faster on lower-end hardware
ollama pull glm-4.7-flash
# Alternative: Complex reasoning tasks
ollama pull deepseek-r1:32b
Why qwen2.5-coder: strong tool-calling support and a 128k context window. Note that the 32B weights alone occupy roughly 16-20GB when quantized, so a 32GB machine is a more realistic fit than the 16GB minimum.
Download time: 10-45 minutes depending on model size and connection.
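That estimate is just file size divided by link bandwidth. A quick sanity check (the ~19GB size for a quantized 32B model and the link speeds below are illustrative assumptions, not measured values):

```python
# Rough download-time estimate: file size over link bandwidth.
def download_minutes(size_gb: float, mbps: float) -> float:
    bits = size_gb * 8e9           # gigabytes -> bits
    seconds = bits / (mbps * 1e6)  # megabits/s -> bits/s
    return seconds / 60

print(round(download_minutes(19, 300)))  # ~19GB over 300 Mbps -> 8 minutes
print(round(download_minutes(19, 50)))   # same file over 50 Mbps -> 51 minutes
```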
Verify the model:
ollama list
You should see your model listed with its size.
Step 3: Install OpenClaw
The easiest method uses Ollama's integrated command:
ollama launch openclaw
This automatically:
- Installs OpenClaw
- Configures it to use Ollama
- Starts the gateway service
If you prefer manual installation:
npm install -g openclaw@latest
openclaw onboard
During onboarding:
- Quick Start: Select this option
- Skip Cloud: Choose "Skip cloud providers"
- Select Ollama: Pick from local providers
- Messaging Platform: Choose later or skip for now
Expected: Gateway starts on port 18789 in the background.
Step 4: Configure for Complete Offline Operation
Edit the configuration file:
nano ~/.openclaw/openclaw.json
Replace with this minimal offline config:
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/qwen2.5-coder:32b-instruct",
        "fallbacks": []
      },
      "sandbox": {
        "mode": "non-main",
        "docker": {
          "network": "none"
        }
      }
    }
  },
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://127.0.0.1:11434/v1",
        "apiKey": "ollama-local",
        "api": "openai-completions",
        "models": [
          {
            "id": "qwen2.5-coder:32b-instruct",
            "name": "qwen2.5-coder:32b-instruct",
            "reasoning": false,
            "input": ["text"],
            "cost": {
              "input": 0,
              "output": 0
            },
            "contextWindow": 131072,
            "maxTokens": 16384
          }
        ]
      }
    }
  },
  "gateway": {
    "mode": "local",
    "port": 18789,
    "bind": "loopback"
  }
}
Key privacy settings:
- "network": "none" - blocks sandbox network access
- "fallbacks": [] - no cloud model fallback
- "bind": "loopback" - only local connections allowed
- baseUrl points to localhost, never an external host
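These invariants are easy to check mechanically. A minimal sketch of such an audit, with the relevant slice of the config inlined for illustration (in practice you would json.load the file at ~/.openclaw/openclaw.json):

```python
import json

# Inlined stand-in for the real file at ~/.openclaw/openclaw.json
config = json.loads("""
{
  "agents": {"defaults": {
    "model": {"fallbacks": []},
    "sandbox": {"docker": {"network": "none"}}
  }},
  "models": {"providers": {"ollama": {"baseUrl": "http://127.0.0.1:11434/v1"}}},
  "gateway": {"bind": "loopback"}
}
""")

defaults = config["agents"]["defaults"]
assert defaults["sandbox"]["docker"]["network"] == "none"   # sandbox is offline
assert defaults["model"]["fallbacks"] == []                 # no cloud fallback
assert config["gateway"]["bind"] == "loopback"              # local-only gateway
assert "127.0.0.1" in config["models"]["providers"]["ollama"]["baseUrl"]
print("privacy settings OK")
```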
Save and restart:
# If using systemd (Linux)
sudo systemctl restart openclaw-gateway
# If manually running
openclaw gateway restart
Step 5: Set Ollama Context Window
OpenClaw needs large context for multi-step tasks:
# Set 128k context (recommended)
export OLLAMA_NUM_CTX=131072
# Make it permanent (Linux/macOS)
echo 'export OLLAMA_NUM_CTX=131072' >> ~/.bashrc
source ~/.bashrc
Why 128k context: OpenClaw system prompts are large. Smaller context windows cause task failures.
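If you'd rather not depend on the environment variable, Ollama can bake the context size into a named model variant via a Modelfile (num_ctx is a standard Modelfile parameter; the variant name below is arbitrary):

```
FROM qwen2.5-coder:32b-instruct
PARAMETER num_ctx 131072
```

Build it with ollama create qwen2.5-coder-128k -f Modelfile, then point the "primary" model in your config at ollama/qwen2.5-coder-128k.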
Step 6: Verify Offline Operation
Test the model:
ollama run qwen2.5-coder:32b-instruct
Type: Hello, introduce yourself
Test OpenClaw gateway:
openclaw status
Expected output:
Gateway: ✓ Running (http://127.0.0.1:18789)
Model: ollama/qwen2.5-coder:32b-instruct
Status: Ready
Verify it's truly offline:
# Disconnect network
sudo ifconfig en0 down # macOS
# OR
sudo ip link set eth0 down # Linux
# Test OpenClaw still responds
openclaw dashboard --no-open
If you get a token URL, the gateway is running offline.
Optional: Connect Messaging Apps (Still Private)
You can connect Telegram/WhatsApp/Discord while keeping AI processing local.
For Telegram:
- Open Telegram and search for @BotFather
- Send /newbot and follow the prompts
- Copy the bot token
- Add it to the config:
{
  "channels": {
    "telegram": {
      "enabled": true,
      "botToken": "YOUR_BOT_TOKEN_HERE",
      "dmPolicy": "pairing",
      "allowFrom": ["*"]
    }
  }
}
Privacy note: the Telegram connection itself goes over the internet, so the messages you exchange with the bot transit Telegram's servers. The model, its processing, and any local files it touches stay on your machine - nothing is sent to a cloud AI provider.
Verification
Check model is running locally:
ollama ps
Expected: Shows your model loaded in memory.
Monitor resource usage:
# Linux
htop
# macOS
top
You should see ollama using CPU/GPU and significant RAM (10-30GB depending on model).
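That 10-30GB range follows from parameter count times bytes per parameter (the quantization widths below are typical ballpark assumptions; Ollama's KV cache adds a few GB on top):

```python
# Ballpark resident memory for model weights: billions of params x bytes per param.
def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    return params_billion * bytes_per_param  # (1e9 params * bytes) / 1e9 bytes-per-GB

print(weights_gb(32, 0.5))  # 32B at 4-bit quantization -> 16.0 GB of weights
print(weights_gb(32, 2.0))  # 32B at fp16 -> 64.0 GB (why quantization matters)
```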
Test an AI task locally:
curl -X POST http://localhost:11434/api/generate -d '{
"model": "qwen2.5-coder:32b-instruct",
"prompt": "Write a hello world in Python",
"stream": false
}'
Expected: JSON response with Python code, no network calls to external APIs.
What You Learned
- Ollama runs models locally with zero cloud dependency
- OpenClaw coordinates tasks without sending data externally
- Network sandboxing prevents accidental data leaks
- 128k+ context is essential for complex agent tasks
Limitations to know:
- Model performance depends on your hardware (slower than cloud on CPU)
- Initial setup requires internet to download models
- Some skills may not work without network access (web search, etc.)
- Smaller models (<13B parameters) struggle with complex reasoning
When NOT to use this:
- You need cutting-edge model quality (GPT-4, Claude 4.5 Opus)
- Your hardware has <16GB RAM
- Tasks require real-time web access
- You want zero maintenance (cloud is easier)
Troubleshooting
"Connection refused" to Ollama:
# Check if Ollama is running
ps aux | grep ollama
# Start it manually
ollama serve
Gateway won't start:
# Check logs
openclaw logs
# Run diagnostic
openclaw doctor
Model too slow:
- Use a smaller model: ollama pull glm-4.7-flash
- Enable GPU acceleration if available
- Reduce the context window to 64k: export OLLAMA_NUM_CTX=65536
Out of memory errors:
- Close other applications
- Use quantized models (tags ending in -q4 or -q5)
- Reduce maxTokens in the config to 4096
Next Steps
Enhance your setup:
- OpenClaw Skills Documentation - Add capabilities
- Ollama Model Library - Explore other models
- OpenClaw Configuration Guide - Advanced options
Alternative models to try:
- llama3.3:70b - better reasoning if you have 48GB+ RAM
- deepseek-r1:32b - strong chain-of-thought capabilities
- mistral:latest - lightweight general-purpose option
Security Checklist
- "network": "none" in sandbox config
- No cloud provider API keys in config
- "bind": "loopback" for gateway
- Firewall blocks port 18789 from external access
- Models downloaded over HTTPS (Ollama default)
- Config file at ~/.openclaw/openclaw.json has restricted permissions
Set proper permissions:
chmod 600 ~/.openclaw/openclaw.json
Cost Comparison
This setup:
- Initial: $0 (uses existing hardware)
- Ongoing: $0/month
- One-time hardware: ~$800-2000 for capable machine
Cloud alternatives:
- OpenAI API: ~$20-100/month depending on usage
- Claude Pro: $20/month (rate limited)
- GPT-4 API heavy usage: $200+/month
Break-even: If you use AI daily, hardware pays for itself in 12-18 months.
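That break-even figure is simple division; the hardware price and monthly cloud spend below are mid-range assumptions picked from the ranges above:

```python
# Break-even: months until hardware cost equals cumulative cloud spend.
def break_even_months(hardware_cost: float, monthly_cloud: float) -> float:
    return hardware_cost / monthly_cloud

print(round(break_even_months(1200, 100)))  # heavy cloud use    -> 12 months
print(round(break_even_months(1200, 67)))   # moderate cloud use -> 18 months
```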
Tested on Ubuntu 24.04 with Ollama 0.1.22, OpenClaw 0.8.x, and qwen2.5-coder:32b-instruct. Hardware: AMD Ryzen 9 5950X, 64GB RAM, NVIDIA RTX 3090.