Configure OpenClaw for 100% Offline Use in 30 Minutes

Run a completely private AI assistant with OpenClaw and Ollama. No cloud APIs, no data leaving your machine, zero recurring costs.

Problem: Cloud AI Services Know Everything About You

You want an AI assistant but don't want your conversations, files, or commands sent to OpenAI, Anthropic, or Google servers.

You'll learn:

  • Run OpenClaw with 100% local models (no cloud required)
  • Keep all data on your machine with Ollama integration
  • Connect messaging apps while maintaining privacy

Time: 30 min | Level: Intermediate


Why This Matters

Cloud AI services process your data on their servers. Every prompt, every file upload, every conversation goes through their infrastructure. OpenClaw with Ollama changes this - your AI runs entirely on your hardware.

Common privacy concerns:

  • API providers may log your prompts and, depending on their terms, use them for training
  • Conversations stored on third-party servers
  • Subscription costs add up ($20-200/month)
  • Network dependency - no internet = no AI

What You Need

Hardware requirements:

  • CPU: 4+ cores (8+ recommended)
  • RAM: 16GB minimum (32GB for larger models)
  • Storage: 50GB+ free space
  • GPU: Optional but 5-10x faster (NVIDIA preferred)
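To sanity-check whether a given model fits your RAM before downloading it, a rough rule of thumb is about half a gigabyte of memory per billion parameters at 4-bit quantization, plus a few gigabytes of runtime and KV-cache overhead. The helper below is a ballpark estimator, not a measurement; the 0.55 bytes-per-weight and 2 GB overhead figures are assumptions:

```python
def estimate_ram_gb(params_billions: float, bytes_per_weight: float = 0.55,
                    overhead_gb: float = 2.0) -> float:
    """Rough RAM estimate for a quantized model.

    bytes_per_weight ~0.55 approximates 4-bit quantization with
    per-block scales; overhead_gb covers the KV cache and runtime.
    These are ballpark figures, not guarantees.
    """
    return params_billions * bytes_per_weight + overhead_gb

# A 32B model at 4-bit lands near 20 GB, which is why 16 GB RAM
# is the floor and 32 GB is comfortable; an 8B model fits easily.
print(round(estimate_ram_gb(32), 1))   # ≈ 19.6
print(round(estimate_ram_gb(8), 1))    # ≈ 6.4
```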

Software requirements:

  • Ubuntu 22.04/24.04, macOS 14+, or Windows 10+
  • Node.js 22+ (installer handles this)
  • Terminal access
  • No cloud accounts needed

Solution

Step 1: Install Ollama

Ollama runs AI models locally on your machine.

On Linux/macOS:

curl -fsSL https://ollama.com/install.sh | sh

On Windows: Download from https://ollama.com/download/windows

Verify installation:

ollama --version

Expected: ollama version 0.1.22 or newer


Step 2: Download a Local Model

Pull a model optimized for OpenClaw tasks:

# Recommended: Best balance of speed and capability
ollama pull qwen2.5-coder:32b-instruct

# Alternative: Faster on lower-end hardware
ollama pull glm-4.7-flash

# Alternative: Complex reasoning tasks
ollama pull deepseek-r1:32b

Why qwen2.5-coder: Strong tool-calling support and a 128k context window. Note that the 32B model's default 4-bit build needs roughly 20GB of memory, so plan on 32GB RAM or GPU offload; on 16GB machines, pick a smaller tag.

Download time: 10-45 minutes depending on model size and connection.

Verify the model:

ollama list

You should see your model listed with its size.


Step 3: Install OpenClaw

The easiest method uses Ollama's integrated command:

ollama launch openclaw

This automatically:

  • Installs OpenClaw
  • Configures it to use Ollama
  • Starts the gateway service

If you prefer manual installation:

npm install -g openclaw@latest
openclaw onboard

During onboarding:

  1. Quick Start: Select this option
  2. Skip Cloud: Choose "Skip cloud providers"
  3. Select Ollama: Pick from local providers
  4. Messaging Platform: Choose later or skip for now

Expected: Gateway starts on port 18789 in the background.


Step 4: Configure for Complete Offline Operation

Edit the configuration file:

nano ~/.openclaw/openclaw.json

Replace with this minimal offline config:

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/qwen2.5-coder:32b-instruct",
        "fallbacks": []
      },
      "sandbox": {
        "mode": "non-main",
        "docker": {
          "network": "none"
        }
      }
    }
  },
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://127.0.0.1:11434/v1",
        "apiKey": "ollama-local",
        "api": "openai-completions",
        "models": [
          {
            "id": "qwen2.5-coder:32b-instruct",
            "name": "qwen2.5-coder:32b-instruct",
            "reasoning": false,
            "input": ["text"],
            "cost": {
              "input": 0,
              "output": 0
            },
            "contextWindow": 131072,
            "maxTokens": 16384
          }
        ]
      }
    }
  },
  "gateway": {
    "mode": "local",
    "port": 18789,
    "bind": "loopback"
  }
}
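Before restarting, it is worth double-checking the privacy-critical fields. The sketch below walks the structure of the config shown above (field paths may differ in other OpenClaw versions) and flags anything that could route traffic off-box:

```python
def check_offline_config(cfg: dict) -> list:
    """Return a list of privacy problems found in an OpenClaw config.

    Field paths follow the config shown above; other versions may differ.
    """
    problems = []
    defaults = cfg.get("agents", {}).get("defaults", {})
    if defaults.get("model", {}).get("fallbacks"):
        problems.append("model fallbacks set (could route to a cloud model)")
    if defaults.get("sandbox", {}).get("docker", {}).get("network") != "none":
        problems.append("sandbox network is not 'none'")
    if cfg.get("gateway", {}).get("bind") != "loopback":
        problems.append("gateway is not bound to loopback")
    for name, prov in cfg.get("models", {}).get("providers", {}).items():
        url = prov.get("baseUrl", "")
        if "127.0.0.1" not in url and "localhost" not in url:
            problems.append(f"provider '{name}' baseUrl points off-box: {url}")
    return problems

# Example: a config that would silently fall back to a cloud model
bad = {
    "agents": {"defaults": {"model": {"fallbacks": ["openai/gpt-4o"]},
                            "sandbox": {"docker": {"network": "none"}}}},
    "gateway": {"mode": "local", "port": 18789, "bind": "loopback"},
    "models": {"providers": {}},
}
print(check_offline_config(bad))
# ['model fallbacks set (could route to a cloud model)']
```

To audit your live file, load ~/.openclaw/openclaw.json with json.load and pass the result to check_offline_config.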

Key privacy settings:

  • "network": "none" - Blocks sandbox network access
  • "fallbacks": [] - No cloud model fallback
  • "bind": "loopback" - Only local connections allowed
  • baseUrl points to localhost - never external

Save and restart:

# If using systemd (Linux)
sudo systemctl restart openclaw-gateway

# If manually running
openclaw gateway restart

Step 5: Set Ollama Context Window

OpenClaw needs large context for multi-step tasks:

# Set 128k context (recommended)
export OLLAMA_NUM_CTX=131072

# Make it permanent (Linux/macOS)
echo 'export OLLAMA_NUM_CTX=131072' >> ~/.bashrc
source ~/.bashrc

Why 128k context: OpenClaw system prompts are large. Smaller context windows cause task failures.
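If you would rather not depend on the environment variable, Ollama's REST API also accepts num_ctx per request through the options field of /api/generate. A minimal sketch of the request body (the endpoint and option name are standard Ollama API; the model tag matches the one pulled above):

```python
import json

def generate_payload(model: str, prompt: str, num_ctx: int = 131072) -> str:
    """Build a JSON body for Ollama's /api/generate endpoint.

    num_ctx is passed per-request via 'options', so the large context
    applies even when the environment variable is not set.
    """
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    })

body = generate_payload("qwen2.5-coder:32b-instruct", "hello")
print(body)
# POST this to http://127.0.0.1:11434/api/generate
```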


Step 6: Verify Offline Operation

Test the model:

ollama run qwen2.5-coder:32b-instruct

Type: Hello, introduce yourself

Test OpenClaw gateway:

openclaw status

Expected output:

Gateway: ✓ Running (http://127.0.0.1:18789)
Model: ollama/qwen2.5-coder:32b-instruct
Status: Ready

Verify it's truly offline:

# Disconnect network
sudo ifconfig en0 down  # macOS
# OR
sudo ip link set eth0 down  # Linux

# Test OpenClaw still responds
openclaw dashboard --no-open

If you get a token URL, the gateway is running offline.
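As a belt-and-braces check, you can also verify that every endpoint your config references resolves to the local machine. A small sketch (the loopback host list is an assumption; extend it if you deliberately bind elsewhere):

```python
from urllib.parse import urlparse

# Hosts considered "local machine only" for this check
LOCAL_HOSTS = {"127.0.0.1", "localhost", "::1"}

def is_local_url(url: str) -> bool:
    """True if the URL's host points at the local machine."""
    host = urlparse(url).hostname or ""
    return host in LOCAL_HOSTS

print(is_local_url("http://127.0.0.1:11434/v1"))   # True
print(is_local_url("https://api.openai.com/v1"))   # False
```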


Optional: Connect Messaging Apps (Still Private)

You can connect Telegram/WhatsApp/Discord while keeping AI processing local.

For Telegram:

  1. Open Telegram, search for @BotFather
  2. Send /newbot and follow prompts
  3. Copy the bot token
  4. Add to config:
{
  "channels": {
    "telegram": {
      "enabled": true,
      "botToken": "YOUR_BOT_TOKEN_HERE",
      "dmPolicy": "pairing",
      "allowFrom": ["*"]
    }
  }
}

Privacy note: The Telegram connection itself uses the internet, so the messages you exchange with your bot do pass through Telegram's servers. What stays private is the AI side: model inference runs entirely on your machine, and nothing is ever sent to a cloud AI provider.
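Since "allowFrom": ["*"] accepts messages from anyone, a quick sanity check on the channel block is worthwhile. The sketch below uses the field names from the snippet above and assumes "dmPolicy": "pairing" means unknown senders must be explicitly approved; other versions may differ:

```python
def check_telegram_channel(cfg: dict) -> list:
    """Flag risky settings in the telegram channel block shown above."""
    warnings = []
    tg = cfg.get("channels", {}).get("telegram", {})
    if not tg.get("enabled"):
        return warnings
    token = tg.get("botToken", "")
    if not token or token == "YOUR_BOT_TOKEN_HERE":
        warnings.append("botToken is missing or still the placeholder")
    if tg.get("allowFrom") == ["*"] and tg.get("dmPolicy") != "pairing":
        warnings.append("open allowFrom without pairing: anyone can use the bot")
    return warnings

# The snippet above, unmodified: the placeholder token gets flagged
cfg = {"channels": {"telegram": {"enabled": True,
                                 "botToken": "YOUR_BOT_TOKEN_HERE",
                                 "dmPolicy": "pairing",
                                 "allowFrom": ["*"]}}}
print(check_telegram_channel(cfg))
# ['botToken is missing or still the placeholder']
```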


Verification

Check model is running locally:

ollama ps

Expected: Shows your model loaded in memory.

Monitor resource usage:

# Linux
htop

# macOS
top

You should see ollama using CPU/GPU and significant RAM (10-30GB depending on model).

Test an AI task locally:

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:32b-instruct",
  "prompt": "Write a hello world in Python",
  "stream": false
}'

Expected: JSON response with Python code, no network calls to external APIs.
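The generated text arrives in the "response" field, alongside timing and token counts you can use to gauge local throughput. A parsing sketch against a reply shaped like Ollama's (the sample values here are illustrative, not real output):

```python
import json

# Shaped like an /api/generate reply; eval_duration is in nanoseconds.
raw = r'''{
  "model": "qwen2.5-coder:32b-instruct",
  "response": "print(\"Hello, world!\")",
  "done": true,
  "eval_count": 12,
  "eval_duration": 800000000
}'''

reply = json.loads(raw)
print(reply["response"])

# Tokens per second is a handy speed gauge for local hardware
tps = reply["eval_count"] / (reply["eval_duration"] / 1e9)
print(f"{tps:.1f} tokens/sec")   # 15.0 tokens/sec
```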


What You Learned

  • Ollama runs models locally with zero cloud dependency
  • OpenClaw coordinates tasks without sending data externally
  • Network sandboxing prevents accidental data leaks
  • 128k+ context is essential for complex agent tasks

Limitations to know:

  • Model performance depends on your hardware (slower than cloud on CPU)
  • Initial setup requires internet to download models
  • Some skills may not work without network access (web search, etc.)
  • Smaller models (<13B parameters) struggle with complex reasoning

When NOT to use this:

  • You need cutting-edge model quality (GPT-4, Claude 4.5 Opus)
  • Your hardware has <16GB RAM
  • Tasks require real-time web access
  • You want zero maintenance (cloud is easier)

Troubleshooting

"Connection refused" to Ollama:

# Check if Ollama is running
ps aux | grep ollama

# Start it manually
ollama serve

Gateway won't start:

# Check logs
openclaw logs

# Run diagnostic
openclaw doctor

Model too slow:

  • Use smaller model: ollama pull glm-4.7-flash
  • Enable GPU if available
  • Reduce context window to 64k: export OLLAMA_NUM_CTX=65536

Out of memory errors:

  • Close other applications
  • Use quantized models (tags containing q4 or q5, e.g. -q4_K_M)
  • Reduce maxTokens in config to 4096

Next Steps

Alternative models to try:

  • llama3.3:70b - Better reasoning if you have 48GB+ RAM
  • deepseek-r1:32b - Strong chain-of-thought capabilities
  • mistral:latest - Lightweight general-purpose option

Security Checklist

  • "network": "none" in sandbox config
  • No cloud provider API keys in config
  • "bind": "loopback" for gateway
  • Firewall blocks port 18789 from external access
  • Models downloaded over HTTPS (Ollama default)
  • Config file at ~/.openclaw/openclaw.json has restricted permissions

Set proper permissions:

chmod 600 ~/.openclaw/openclaw.json
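To confirm the permissions actually took, you can check that no group or world bits are set. The sketch below demonstrates the check on a throwaway temp file; in practice, point is_private at ~/.openclaw/openclaw.json:

```python
import os
import stat
import tempfile

def is_private(path: str) -> bool:
    """True if only the owner can access the file (mode 600 or stricter)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0   # no group/other permission bits set

# Demo on a temp file
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
os.chmod(path, 0o600)
print(is_private(path))   # True
os.chmod(path, 0o644)
print(is_private(path))   # False
os.unlink(path)
```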

Cost Comparison

This setup:

  • Initial: $0 (uses existing hardware)
  • Ongoing: $0/month
  • One-time hardware: ~$800-2000 for capable machine

Cloud alternatives:

  • OpenAI API: ~$20-100/month depending on usage
  • Claude Pro: $20/month (rate limited)
  • GPT-4 API heavy usage: $200+/month

Break-even: If you use AI daily, hardware pays for itself in 12-18 months.
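The break-even figure is simple division over the ranges above; a tiny sketch (the $65/month midpoint for moderate API spend is an assumption):

```python
def breakeven_months(hardware_cost: float, cloud_monthly: float) -> float:
    """Months until local hardware costs less than an ongoing cloud bill."""
    return hardware_cost / cloud_monthly

# Using the ranges above:
print(round(breakeven_months(800, 65), 1))    # modest machine vs mid API spend, ≈ 12.3
print(round(breakeven_months(2000, 150), 1))  # high-end machine vs heavy API spend, ≈ 13.3
```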


Tested on Ubuntu 24.04 with Ollama 0.1.22, OpenClaw 0.8.x, and qwen2.5-coder:32b-instruct.
Hardware: AMD Ryzen 9 5950X, 64GB RAM, NVIDIA RTX 3090.