Your security team just banned ChatGPT. Your data cannot touch the internet. Yet you need AI capabilities for sensitive operations. Welcome to the world of air-gapped AI deployment, where paranoia meets productivity.
Air-gapped environments completely isolate systems from external networks. Government agencies, financial institutions, and defense contractors use these setups to protect classified data. This guide shows you how to deploy Ollama AI models in these fortress-like environments.
Why Air-Gapped AI Matters for Enterprise Security
Traditional cloud AI services send your data to external servers. This creates security risks for sensitive information. Air-gapped AI keeps everything local while maintaining powerful language model capabilities.
Security Benefits of Offline AI Deployment
Complete Data Isolation: No data leaves your network perimeter. Your sensitive documents stay within your controlled environment.
Zero External Dependencies: Models run without internet connectivity. Software updates happen through controlled channels only.
Regulatory Compliance: Meets strict requirements for HIPAA, SOX, and classified government work.
Reduced Attack Surface: Eliminates cloud-based vulnerabilities and data exfiltration risks.
Prerequisites for Air-Gapped Ollama Installation
Hardware Requirements
- RAM: Minimum 16GB, recommended 32GB+ for larger models
- Storage: 100GB+ SSD space for model files
- CPU: Modern multi-core processor (Intel i7/AMD Ryzen 7+)
- GPU: Optional but recommended (NVIDIA RTX 4060+ or AMD equivalent)
Software Dependencies
Download these components on an internet-connected system:
# Required packages (download offline)
- ollama-linux-amd64 (latest release)
- docker-ce (if using containerized deployment)
- python3.8+ with pip packages
- curl and wget utilities
Network Architecture Planning
Design your isolated network topology:
DMZ Transfer Zone: Secure area for moving files between networks
Internal AI Network: Completely isolated segment for AI operations
Management Network: Separate administrative access layer
Step-by-Step Air-Gapped Ollama Installation
Phase 1: Offline Package Preparation
Download Ollama and required models on an internet-connected system:
# Download Ollama binary
wget https://github.com/ollama/ollama/releases/latest/download/ollama-linux-amd64
chmod +x ollama-linux-amd64
# Start a temporary local server, then pull models (this requires internet;
# the pull command talks to the running server)
./ollama-linux-amd64 serve &
./ollama-linux-amd64 pull llama2:7b
./ollama-linux-amd64 pull codellama:7b
./ollama-linux-amd64 pull mistral:7b
Critical Security Note: Verify checksums for all downloaded files. Use official sources only.
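The manifest workflow can be rehearsed end to end before you touch the real download. The placeholder file below stands in for the actual binary; on the connected system, run the same two `sha256sum` commands against `ollama-linux-amd64` itself and compare against the hash published on the official release page.

```shell
# Rehearsal of the checksum-manifest workflow with a placeholder file;
# substitute the real ollama-linux-amd64 download on the connected system.
workdir=$(mktemp -d)
cd "$workdir"
echo "placeholder" > ollama-linux-amd64      # stand-in for the real binary
sha256sum ollama-linux-amd64 > SHA256SUMS    # record the manifest
sha256sum -c SHA256SUMS                      # verify before trusting the file
```

Carry the `SHA256SUMS` file across the air gap alongside the binary so the receiving side can repeat the `-c` check independently.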
Phase 2: Secure File Transfer
Transfer files to your air-gapped environment using approved methods:
# Create transfer package
tar -czf ollama-airgapped.tar.gz ollama-linux-amd64 ~/.ollama/
# Verify package integrity
sha256sum ollama-airgapped.tar.gz > ollama-checksum.txt
Security Checkpoint: Scan all files with approved antivirus tools before transfer.
Phase 3: Air-Gapped Installation
Install Ollama in your isolated environment:
# Extract and install
tar -xzf ollama-airgapped.tar.gz
sudo mv ollama-linux-amd64 /usr/local/bin/ollama
# Set up service user
sudo useradd -r -s /bin/false ollama
sudo mkdir -p /var/lib/ollama
sudo chown ollama:ollama /var/lib/ollama
# Install systemd service
sudo tee /etc/systemd/system/ollama.service > /dev/null <<EOF
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
# Bind to all interfaces on the isolated segment; use 127.0.0.1 for host-only access
Environment="OLLAMA_HOST=0.0.0.0"
# Point Ollama at the service user's model directory created above
Environment="OLLAMA_MODELS=/var/lib/ollama/models"

[Install]
WantedBy=default.target
EOF
Phase 4: Service Configuration and Security Hardening
Configure Ollama for maximum security:
# Enable and start service
sudo systemctl daemon-reload
sudo systemctl enable ollama
sudo systemctl start ollama
# Configure firewall rules
sudo ufw allow from 192.168.1.0/24 to any port 11434
sudo ufw deny 11434
# Set up logging
sudo mkdir -p /var/log/ollama
sudo chown ollama:ollama /var/log/ollama
Note that Ollama has no standalone configuration file; it reads its settings from environment variables. Tighten the service with a systemd drop-in:
# /etc/systemd/system/ollama.service.d/hardening.conf
[Service]
Environment="OLLAMA_ORIGINS=http://localhost:*"
Environment="OLLAMA_NUM_PARALLEL=5"
Environment="OLLAMA_MAX_LOADED_MODELS=2"
Environment="OLLAMA_KEEP_ALIVE=5m"
Apply the drop-in with sudo systemctl daemon-reload && sudo systemctl restart ollama.
Model Management in Isolated Environments
Loading Pre-Downloaded Models
Import models into your air-gapped Ollama instance:
# Copy model files to Ollama directory
sudo cp -r ~/.ollama/models/* /var/lib/ollama/models/
sudo chown -R ollama:ollama /var/lib/ollama/models/
# Verify model availability
ollama list
Expected output:
NAME            ID              SIZE    MODIFIED
llama2:7b       365c0bd3c000    3.8GB   2 hours ago
codellama:7b    b52adb11bd3c    3.8GB   2 hours ago
mistral:7b      f974a74358d6    4.1GB   2 hours ago
Custom Model Integration
Add organization-specific models:
# Create custom model directory
sudo mkdir -p /var/lib/ollama/models/custom
sudo chown ollama:ollama /var/lib/ollama/models/custom
# Import custom model (from approved source)
ollama create custom-model -f /path/to/custom-modelfile
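A Modelfile for such a custom model might look like the sketch below. The base model, name, and system prompt are illustrative assumptions; `FROM` must reference a model already imported into the air-gapped instance. The file is written to a scratch directory here so the example is self-contained.

```shell
# Write an illustrative Modelfile to a scratch directory; base model
# and system prompt are assumptions for this example.
mf="$(mktemp -d)/Modelfile"
cat > "$mf" <<'EOF'
FROM llama2:7b
PARAMETER temperature 0.2
SYSTEM You are an internal documentation assistant. Do not reference external sources.
EOF
# On the air-gapped host, register it with:
# ollama create internal-assistant -f "$mf"
```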
Security Monitoring and Maintenance
Logging Configuration
Set up comprehensive logging for security audits:
# Configure rsyslog for Ollama
sudo tee /etc/rsyslog.d/50-ollama.conf > /dev/null <<EOF
# Ollama logging
:programname, isequal, "ollama" /var/log/ollama/ollama.log
:programname, isequal, "ollama" stop
EOF
sudo systemctl restart rsyslog
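With logs flowing into /var/log/ollama, a rotation policy keeps audit storage bounded. The sketch below drafts a logrotate rule in a scratch directory; on the real host, install it as /etc/logrotate.d/ollama. The retention values are assumptions to adjust against your audit requirements.

```shell
# Draft a logrotate policy in a scratch dir; copy to /etc/logrotate.d/ollama
# on the air-gapped host. Retention values are assumptions.
conf="$(mktemp -d)/ollama"
cat > "$conf" <<'EOF'
/var/log/ollama/*.log {
    weekly
    rotate 8
    compress
    missingok
    notifempty
    create 0640 ollama ollama
}
EOF
```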
Performance Monitoring
Monitor resource usage for capacity planning:
# Create monitoring script
sudo tee /usr/local/bin/ollama-monitor.sh > /dev/null <<'EOF'
#!/bin/bash
# Append a CPU/RAM/GPU utilization sample to the performance log every minute
while true; do
    cpu=$(top -bn1 | awk '/Cpu\(s\)/ {print $2}')
    ram=$(free -m | awk 'NR==2 {printf "%.1f%%", $3*100/$2}')
    gpu=$(nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits 2>/dev/null || echo "N/A")
    echo "$(date): CPU=${cpu}, RAM=${ram}, GPU=${gpu}" >> /var/log/ollama/performance.log
    sleep 60
done
EOF
sudo chmod +x /usr/local/bin/ollama-monitor.sh
Update Management
Handle updates in air-gapped environments:
# Create update staging area
sudo mkdir -p /opt/ollama-updates
sudo chown ollama:ollama /opt/ollama-updates
# Update process (manual)
# 1. Download updates on connected system
# 2. Transfer via approved method
# 3. Verify checksums
# 4. Apply updates during maintenance window
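Steps 3 and 4 of the manual process can be captured in a small routine that refuses to install anything whose checksum does not match the transferred manifest. The function below is a sketch, rehearsed here on scratch files; on the real host the target would be /usr/local/bin/ollama, run with appropriate privileges during the maintenance window.

```shell
# Sketch of steps 3-4: verify a staged binary against its manifest,
# then swap it into place with install(1).
apply_ollama_update() {
    local staged=$1 manifest=$2 target=$3
    # refuse to install anything whose checksum does not match the manifest
    ( cd "$(dirname "$staged")" && sha256sum -c "$(basename "$manifest")" ) || return 1
    install -m 0755 "$staged" "$target"
}

# Rehearsal with placeholder files in a scratch directory:
stage=$(mktemp -d)
echo "new-binary" > "$stage/ollama-linux-amd64"
( cd "$stage" && sha256sum ollama-linux-amd64 > SHA256SUMS )
apply_ollama_update "$stage/ollama-linux-amd64" "$stage/SHA256SUMS" "$stage/ollama"
```

Using `install` rather than `mv` sets the mode in the same step and copies through a temporary file, which keeps a half-written binary from ever being the live one.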
Integration with Enterprise Applications
API Access Configuration
Configure secure API endpoints for internal applications:
# Python client example
import requests


class SecureOllamaClient:
    def __init__(self, host="127.0.0.1", port=11434):
        self.base_url = f"http://{host}:{port}"

    def generate_response(self, model, prompt, max_tokens=1000):
        payload = {
            "model": model,
            "prompt": prompt,
            "stream": False,  # return a single JSON object instead of a token stream
            "options": {
                "num_predict": max_tokens,
                "temperature": 0.7,
            },
        }
        response = requests.post(
            f"{self.base_url}/api/generate",
            json=payload,
            headers={"Content-Type": "application/json"},
            timeout=300,
        )
        response.raise_for_status()
        return response.json()
# Usage example
client = SecureOllamaClient()
result = client.generate_response("llama2:7b", "Explain quantum computing")
Load Balancing Multiple Instances
Deploy multiple Ollama instances for high availability:
# HAProxy configuration snippet
backend ollama_backend
    balance roundrobin
    server ollama1 192.168.1.10:11434 check
    server ollama2 192.168.1.11:11434 check
    server ollama3 192.168.1.12:11434 check
Troubleshooting Common Air-Gapped Issues
Model Loading Failures
Problem: Models fail to load after transfer
Solution: Verify file permissions and checksums
# Check model integrity (Ollama stores model layers as sha256-named blobs,
# not .bin files)
find /var/lib/ollama/models/blobs -type f -exec sha256sum {} \;
# Fix permissions
sudo chown -R ollama:ollama /var/lib/ollama/models
sudo chmod -R 755 /var/lib/ollama/models
Performance Optimization
Problem: Slow response times
Solution: Optimize system resources
# Increase file descriptor limits
echo "ollama soft nofile 65536" | sudo tee -a /etc/security/limits.conf
echo "ollama hard nofile 65536" | sudo tee -a /etc/security/limits.conf
# Optimize memory settings
sudo sysctl -w vm.swappiness=10
sudo sysctl -w vm.overcommit_memory=1
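Settings applied with `sysctl -w` do not survive a reboot. To persist them, write a drop-in under /etc/sysctl.d/ and reload; the sketch below writes the file to a scratch directory so it can be rehearsed without privileges.

```shell
# Persist the tuning across reboots; written to a scratch dir here.
# On the real host, install as /etc/sysctl.d/99-ollama.conf and reload
# with: sudo sysctl --system
f="$(mktemp -d)/99-ollama.conf"
printf 'vm.swappiness = 10\nvm.overcommit_memory = 1\n' > "$f"
```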
Network Connectivity Issues
Problem: Cannot access Ollama from other systems
Solution: Configure network settings properly
# Check listening ports
sudo netstat -tlnp | grep 11434
# Verify firewall rules
sudo ufw status numbered
# Test connectivity
curl -X POST http://localhost:11434/api/generate \
-H "Content-Type: application/json" \
-d '{"model": "llama2:7b", "prompt": "Hello world"}'
Best Practices for Air-Gapped AI Operations
Security Hardening Checklist
System Level:
- Disable unnecessary services
- Apply security patches during maintenance windows
- Use strong authentication mechanisms
- Implement role-based access controls
Application Level:
- Configure resource limits
- Enable comprehensive logging
- Regular security audits
- Backup and recovery procedures
Model Management Standards
Version Control: Track all model versions and changes
Access Control: Limit model access based on clearance levels
Audit Trail: Log all model usage and modifications
Backup Strategy: Regular backups of model files and configurations
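The backup strategy can be sketched as a dated, checksummed archive of the models directory. The example below rehearses on scratch directories so it is self-contained; on the real host, point MODELS_DIR at /var/lib/ollama/models and BACKUP_DIR at your approved backup location.

```shell
# Sketch of a dated, checksummed model backup, rehearsed on scratch dirs.
MODELS_DIR=$(mktemp -d)                 # stand-in for /var/lib/ollama/models
echo "demo-blob" > "$MODELS_DIR/blob"
BACKUP_DIR=$(mktemp -d)                 # stand-in for the approved backup target
stamp=$(date +%Y%m%d)
tar -czf "$BACKUP_DIR/models-$stamp.tar.gz" \
    -C "$(dirname "$MODELS_DIR")" "$(basename "$MODELS_DIR")"
# Record a checksum next to the archive so restores can be verified
sha256sum "$BACKUP_DIR/models-$stamp.tar.gz" > "$BACKUP_DIR/models-$stamp.sha256"
```

Verifying the `.sha256` file at restore time closes the loop with the same manifest discipline used for the original transfer.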
Conclusion
Air-gapped Ollama deployment provides enterprise-grade AI capabilities while maintaining strict security controls. This setup eliminates cloud dependencies and keeps sensitive data within your controlled environment.
The key to success lies in careful planning, proper security implementation, and ongoing maintenance. Your organization gains powerful AI tools without compromising data security or regulatory compliance.
Ready to deploy secure AI in your environment? Start with the hardware requirements and work through each phase methodically. Your future self will thank you for the paranoid attention to security detail.