Remember when everyone shared the same computer terminal? Those days are gone, but many AI deployments still behave as if it were 1985. Your enterprise needs strict user isolation, not a free-for-all where anyone can access any model.
The Multi-Tenant Security Challenge
Organizations deploy Ollama Enterprise to serve multiple departments. Finance runs credit models while HR processes resumes. Without proper user isolation, your CFO might accidentally access employee data. That's a compliance nightmare waiting to happen.
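Ollama Enterprise User Isolation addresses this problem with layered security boundaries. This guide covers implementation strategies, access control patterns, and deployment best practices for secure multi-tenant environments. The examples assume Ollama runs on Kubernetes, so the isolation layers are built from Kubernetes primitives: namespaces, network policies, RBAC, and resource quotas.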
What You'll Learn
- Configure namespace-based user isolation
- Implement role-based access control (RBAC)
- Set up secure model sharing boundaries
- Deploy monitoring and audit systems
- Troubleshoot common isolation failures
Understanding Multi-Tenant Architecture
Core Isolation Principles
Multi-tenant security operates on three fundamental levels:
Physical Isolation: Separate infrastructure resources prevent resource conflicts and data bleeding between tenants.
Logical Isolation: Software boundaries enforce access control without requiring separate hardware deployments.
Data Isolation: Information segregation ensures tenant data never crosses unauthorized boundaries.
Ollama Enterprise Isolation Models
Ollama Enterprise supports multiple isolation patterns:
- Namespace Isolation: Logical separation within shared infrastructure
- Container Isolation: Process-level boundaries using containerization
- Network Isolation: Traffic segregation through virtual networks
- Resource Isolation: CPU, memory, and storage quotas per tenant
Implementing Namespace-Based User Isolation
Creating Secure Namespaces
Start with namespace configuration for tenant separation:
# ollama-namespaces.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-finance
  labels:
    tenant: finance
    isolation-level: strict
---
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-hr
  labels:
    tenant: hr
    isolation-level: strict
Deploy namespaces with isolation labels:
# Apply namespace configuration
kubectl apply -f ollama-namespaces.yaml
# Verify namespace creation
kubectl get namespaces --show-labels
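With more than a handful of tenants, writing each Namespace manifest by hand gets error-prone. The following is a minimal sketch, not part of any Ollama tooling, that generates the same manifests from a list of tenant names. The function names and the `tenant-<name>` convention are assumptions taken from the examples above.

```python
import json
from typing import Dict, List


def build_namespace(tenant: str, isolation_level: str = "strict") -> Dict:
    """Return a Namespace manifest that follows the tenant-<name> convention."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": f"tenant-{tenant}",
            "labels": {"tenant": tenant, "isolation-level": isolation_level},
        },
    }


def build_namespace_list(tenants: List[str]) -> Dict:
    """Wrap the manifests in a v1 List so a single document covers every tenant."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [build_namespace(tenant) for tenant in tenants],
    }


if __name__ == "__main__":
    # Hypothetical tenant names; kubectl accepts JSON as well as YAML.
    print(json.dumps(build_namespace_list(["finance", "hr"]), indent=2))
```

If the script is saved as, say, `generate_namespaces.py`, its output can be piped into `kubectl apply -f -`.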
Configuring Network Policies
Network policies enforce traffic isolation between tenants:
# network-isolation.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tenant-isolation
  namespace: tenant-finance
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
  ingress:
  # Allow traffic only from same namespace
  - from:
    - namespaceSelector:
        matchLabels:
          tenant: finance
  egress:
  # Restrict outbound to same tenant. Note: this also blocks lookups
  # against cluster DNS unless a separate rule allows port 53.
  - to:
    - namespaceSelector:
        matchLabels:
          tenant: finance
Apply network policies for each tenant:
# Deploy network isolation
kubectl apply -f network-isolation.yaml
# Test connectivity between namespaces
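kubectl exec -n tenant-finance [pod-name] -- curl -s --max-time 5 http://[tenant-hr-service].tenant-hr:8080/health
# Should fail (timeout or DNS error) with network isolation active.
# Use a TCP request rather than ping: ICMP handling under NetworkPolicy depends
# on the CNI plugin, and Service ClusterIPs typically do not answer ping.
# Assumes curl is available in the container image.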
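The policy above is written for `tenant-finance` only, and its egress rule blocks DNS along with everything else outside the tenant. The sketch below generates one policy per tenant and adds a DNS exception. It assumes cluster DNS runs in `kube-system` and that the namespace carries the automatic `kubernetes.io/metadata.name` label; adjust both to match your cluster.

```python
import json
from typing import Dict


def build_tenant_policy(tenant: str) -> Dict:
    """Build a NetworkPolicy that confines traffic to one tenant, plus DNS."""
    same_tenant = {"namespaceSelector": {"matchLabels": {"tenant": tenant}}}
    # Assumption: cluster DNS pods live in the kube-system namespace.
    dns_namespace = {
        "namespaceSelector": {
            "matchLabels": {"kubernetes.io/metadata.name": "kube-system"}
        }
    }
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "tenant-isolation", "namespace": f"tenant-{tenant}"},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Ingress", "Egress"],
            "ingress": [{"from": [same_tenant]}],
            "egress": [
                {"to": [same_tenant]},
                {
                    "to": [dns_namespace],
                    "ports": [
                        {"protocol": "UDP", "port": 53},
                        {"protocol": "TCP", "port": 53},
                    ],
                },
            ],
        },
    }


if __name__ == "__main__":
    for name in ["finance", "hr"]:  # hypothetical tenant list
        print(json.dumps(build_tenant_policy(name), indent=2))
```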
Role-Based Access Control Implementation
Creating Service Accounts
Service accounts provide identity for tenant workloads:
# service-accounts.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: finance-ollama-sa
  namespace: tenant-finance
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: hr-ollama-sa
  namespace: tenant-hr
Defining RBAC Policies
Role-based access control limits tenant permissions:
# rbac-policies.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: tenant-finance
  name: ollama-model-access
rules:
# Allow model operations within namespace
- apiGroups: [""]
  resources: ["configmaps", "secrets"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: finance-model-binding
  namespace: tenant-finance
subjects:
- kind: ServiceAccount
  name: finance-ollama-sa
  namespace: tenant-finance
roleRef:
  kind: Role
  name: ollama-model-access
  apiGroup: rbac.authorization.k8s.io
Deploy RBAC configuration:
# Apply service accounts and roles
kubectl apply -f service-accounts.yaml
kubectl apply -f rbac-policies.yaml
# Verify role assignments
kubectl auth can-i create deployments --as=system:serviceaccount:tenant-finance:finance-ollama-sa -n tenant-finance
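To reason about what a Role permits before deploying it, it can help to model the matching logic. The sketch below is a deliberately simplified approximation of how RBAC rules are evaluated: rules are purely additive, and a request is allowed if any single rule covers its API group, resource, and verb. It ignores `resourceNames`, subresources, and cluster-scoped roles, so treat `kubectl auth can-i` as the authority.

```python
from typing import Dict, List

# The rules from the ollama-model-access Role above.
ROLE_RULES: List[Dict[str, List[str]]] = [
    {"apiGroups": [""], "resources": ["configmaps", "secrets"],
     "verbs": ["get", "list", "watch"]},
    {"apiGroups": ["apps"], "resources": ["deployments"],
     "verbs": ["get", "list", "watch", "create", "update"]},
]


def _matches(allowed: List[str], value: str) -> bool:
    """A rule field matches on an exact entry or the '*' wildcard."""
    return "*" in allowed or value in allowed


def role_allows(rules: List[Dict[str, List[str]]],
                api_group: str, resource: str, verb: str) -> bool:
    """Return True if any rule grants the verb on the resource (no deny rules exist)."""
    return any(
        _matches(rule["apiGroups"], api_group)
        and _matches(rule["resources"], resource)
        and _matches(rule["verbs"], verb)
        for rule in rules
    )


if __name__ == "__main__":
    print(role_allows(ROLE_RULES, "apps", "deployments", "create"))
    print(role_allows(ROLE_RULES, "", "secrets", "delete"))
```

Because a Role is namespaced, none of these rules apply outside `tenant-finance`, which is what keeps the finance service account out of `tenant-hr`.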
Secure Model Sharing Configuration
Model Access Control Lists
Configure which models each tenant can access:
# model-acl.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: model-access-control
  namespace: tenant-finance
data:
  allowed-models: |
    - llama2:7b-finance
    - mistral:7b-instruct
    - codellama:13b
  denied-models: |
    - personal-assistant
    - hr-screening-model
Dynamic Model Loading
Implement runtime model access validation:
# model_access_validator.py
import logging
from typing import Dict, List

import yaml


class ModelAccessValidator:
    def __init__(self, tenant_id: str, acl_config: str):
        """Initialize validator with tenant-specific ACL configuration.

        acl_config is the YAML text of the ConfigMap's data section.
        """
        self.tenant_id = tenant_id
        raw = yaml.safe_load(acl_config) or {}
        self.acl_config: Dict[str, List[str]] = {
            key: self._as_list(value) for key, value in raw.items()
        }
        self.logger = logging.getLogger(__name__)

    @staticmethod
    def _as_list(value) -> List[str]:
        """ConfigMap values are strings, so parse block scalars into lists"""
        if isinstance(value, str):
            value = yaml.safe_load(value)
        return list(value or [])

    def validate_model_access(self, model_name: str) -> bool:
        """Validate if tenant can access requested model"""
        allowed_models = self.acl_config.get('allowed-models', [])
        denied_models = self.acl_config.get('denied-models', [])
        # Check explicit denial first
        if model_name in denied_models:
            self.logger.warning(f"Model {model_name} explicitly denied for tenant {self.tenant_id}")
            return False
        # Check if model is in allowed list
        if model_name in allowed_models:
            self.logger.info(f"Model {model_name} access granted for tenant {self.tenant_id}")
            return True
        # Default deny for unlisted models
        self.logger.warning(f"Model {model_name} not in allowed list for tenant {self.tenant_id}")
        return False

    def get_available_models(self) -> List[str]:
        """Return list of models available to tenant"""
        return self.acl_config.get('allowed-models', [])


# Usage example: in a pod, this text would come from the mounted ConfigMap
acl_config_data = """
allowed-models: |
  - llama2:7b-finance
  - mistral:7b-instruct
denied-models: |
  - hr-screening-model
"""
validator = ModelAccessValidator('tenant-finance', acl_config_data)
can_access = validator.validate_model_access('llama2:7b-finance')
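The validator only helps if something calls it before a request reaches Ollama. One option is a small gateway in front of the Ollama API that reads the `model` field from the JSON request body (the field used by endpoints such as `/api/generate` and `/api/chat`) and rejects anything outside the tenant's allow list. The sketch below shows just that decision step, using only the standard library. The function name and status-code mapping are assumptions, not part of Ollama.

```python
import json
from typing import Iterable, Tuple

# Hypothetical allow list, mirroring the ConfigMap above.
ALLOWED_MODELS = ["llama2:7b-finance", "mistral:7b-instruct", "codellama:13b"]


def authorize_request(raw_body: bytes, allowed_models: Iterable[str]) -> Tuple[int, str]:
    """Return an (HTTP status, message) pair for a proxied Ollama request body."""
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers JSON syntax errors and invalid UTF-8
        return 400, "request body is not valid JSON"
    model = payload.get("model") if isinstance(payload, dict) else None
    if not isinstance(model, str) or not model:
        return 400, "missing 'model' field"
    # Default deny: anything not explicitly allowed is rejected.
    if model not in set(allowed_models):
        return 403, f"model '{model}' is not permitted for this tenant"
    return 200, "ok"


if __name__ == "__main__":
    body = b'{"model": "hr-screening-model", "prompt": "hello"}'
    print(authorize_request(body, ALLOWED_MODELS))
```

Returning a status code rather than raising keeps the function easy to plug into whichever HTTP framework hosts the gateway.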
Resource Isolation and Quotas
CPU and Memory Limits
Prevent resource starvation through quotas:
# resource-quotas.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-finance-quota
  namespace: tenant-finance
spec:
  hard:
    # Compute resources
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
    # Storage limits
    requests.storage: "50Gi"
    # Object limits
    pods: "10"
    persistentvolumeclaims: "5"
Pod Security Standards
Enforce security policies for tenant workloads:
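# pod-security.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-finance
  labels:
    # Repeat the tenant labels from ollama-namespaces.yaml: kubectl apply
    # removes labels that a later manifest for the same object omits
    tenant: finance
    isolation-level: strict
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
Apply resource constraints and pod security labels:
# Deploy resource quotas and pod security labels
kubectl apply -f resource-quotas.yaml
kubectl apply -f pod-security.yaml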
# Verify quota enforcement
kubectl describe quota tenant-finance-quota -n tenant-finance
# Check current resource usage
kubectl top pods -n tenant-finance
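`kubectl describe quota` reports used and hard values as Kubernetes quantity strings such as `3500m` or `6Gi`, which are awkward to compare by eye. The sketch below converts the CPU and memory values into numbers and reports the fraction of each quota in use. It handles only the suffixes used in this guide (`m` for CPU; `Ki`, `Mi`, `Gi`, `Ti` for memory), which is a simplifying assumption, not the full quantity grammar.

```python
from typing import Dict

_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity to cores: '500m' -> 0.5, '4' -> 4.0."""
    if quantity.endswith("m"):
        return float(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity with a binary suffix to bytes."""
    for suffix, multiplier in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * multiplier)
    return int(quantity)  # no suffix: plain bytes


def usage_ratio(used: Dict[str, str], hard: Dict[str, str]) -> Dict[str, float]:
    """Return used/hard for every CPU or memory entry in the quota."""
    ratios = {}
    for key, limit in hard.items():
        if key.endswith("cpu"):
            parse = parse_cpu
        elif key.endswith("memory"):
            parse = parse_memory
        else:
            continue  # storage and object counts are out of scope here
        ratios[key] = parse(used.get(key, "0")) / parse(limit)
    return ratios


if __name__ == "__main__":
    hard = {"requests.cpu": "4", "requests.memory": "8Gi", "pods": "10"}
    used = {"requests.cpu": "3500m", "requests.memory": "6Gi", "pods": "7"}
    print(usage_ratio(used, hard))
```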
Monitoring and Audit Implementation
Audit Log Configuration
Enable comprehensive audit logging:
# audit-policy.yaml
# Loaded by the API server via --audit-policy-file, not applied with kubectl
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log model access attempts
- level: Request
  namespaces: ["tenant-finance", "tenant-hr"]
  resources:
  - group: ""
    resources: ["configmaps"]
  - group: "apps"
    resources: ["deployments"]
# Log service account operations (metadata only)
- level: Metadata
  omitStages:
  - RequestReceived
  resources:
  - group: ""
    resources: ["serviceaccounts"]
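Audit logs are only useful if something reads them. Assuming the API server writes audit events as JSON lines (the default log format), the sketch below scans them for service accounts that touched a namespace other than their own, which is the cross-tenant pattern this guide is trying to prevent. It relies on the `user.username`, `objectRef`, `verb`, and `responseStatus.code` fields of the audit event schema; the function name and output shape are illustrative.

```python
import json
from typing import Dict, Iterable, Iterator

SA_PREFIX = "system:serviceaccount:"


def cross_namespace_events(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield a summary for each event where a service account reached outside its namespace."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        username = (event.get("user") or {}).get("username", "")
        object_ref = event.get("objectRef") or {}
        target_namespace = object_ref.get("namespace")
        if not username.startswith(SA_PREFIX) or not target_namespace:
            continue  # not a service account, or a cluster-scoped request
        # Usernames look like system:serviceaccount:<namespace>:<name>
        source_namespace = username[len(SA_PREFIX):].split(":", 1)[0]
        if source_namespace != target_namespace:
            yield {
                "user": username,
                "source_namespace": source_namespace,
                "target_namespace": target_namespace,
                "verb": event.get("verb"),
                "resource": object_ref.get("resource"),
                "status": (event.get("responseStatus") or {}).get("code"),
            }


if __name__ == "__main__":
    sample = (
        '{"verb": "get", "user": {"username": '
        '"system:serviceaccount:tenant-finance:finance-ollama-sa"}, '
        '"objectRef": {"resource": "secrets", "namespace": "tenant-hr"}, '
        '"responseStatus": {"code": 403}}'
    )
    for hit in cross_namespace_events([sample]):
        print(hit)
```

A 403 status in the output means RBAC did its job; a 200 means a binding somewhere grants more than intended.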
Security Monitoring Dashboard
Create monitoring for isolation violations:
# security_monitor.py
import json
import time
from datetime import datetime, timezone
from typing import Dict, List

from kubernetes import client, config


class SecurityMonitor:
    def __init__(self):
        """Initialize Kubernetes client for monitoring"""
        # Runs inside the cluster; the pod's service account needs
        # cluster-wide permission to list namespaces and service accounts
        config.load_incluster_config()
        self.v1 = client.CoreV1Api()
        self.apps_v1 = client.AppsV1Api()

    def check_namespace_isolation(self) -> List[Dict[str, str]]:
        """Verify namespace isolation boundaries"""
        violations = []
        # Get all namespaces with tenant labels
        namespaces = self.v1.list_namespace(label_selector="tenant")
        for namespace in namespaces.items:
            tenant_label = namespace.metadata.labels.get('tenant')
            # Check for cross-tenant service accounts
            service_accounts = self.v1.list_namespaced_service_account(
                namespace=namespace.metadata.name
            )
            for sa in service_accounts.items:
                # Every namespace has a built-in "default" service account
                if sa.metadata.name == 'default':
                    continue
                if not sa.metadata.name.startswith(tenant_label):
                    violations.append({
                        'type': 'cross_tenant_service_account',
                        'namespace': namespace.metadata.name,
                        'resource': sa.metadata.name,
                        'timestamp': datetime.now(timezone.utc).isoformat()
                    })
        return violations

    def monitor_resource_usage(self, namespace: str) -> Dict[str, str]:
        """Monitor resource usage against quotas"""
        try:
            quota = self.v1.read_namespaced_resource_quota(
                name=f"{namespace}-quota",
                namespace=namespace
            )
            used = quota.status.used or {}
            hard = quota.status.hard or {}
            return {
                'namespace': namespace,
                'cpu_usage': used.get('requests.cpu', '0'),
                'memory_usage': used.get('requests.memory', '0'),
                'cpu_limit': hard.get('requests.cpu', '0'),
                'memory_limit': hard.get('requests.memory', '0')
            }
        except Exception as e:
            return {'error': str(e)}


# Continuous monitoring loop
if __name__ == "__main__":
    monitor = SecurityMonitor()
    while True:
        violations = monitor.check_namespace_isolation()
        if violations:
            print(f"Security violations detected: {json.dumps(violations, indent=2)}")
        time.sleep(60)  # Check every minute
Deployment Best Practices
Secure Configuration Management
Store sensitive configuration in Kubernetes Secrets. Secret values are only base64-encoded by default, so enable encryption at rest on the API server and restrict read access with RBAC:
# Create secret for tenant configuration
kubectl create secret generic tenant-finance-config \
  --from-literal=api-key=your-encrypted-key \
  --from-literal=model-endpoint=https://finance.models.internal \
  -n tenant-finance
# Apply security labels
kubectl label secret tenant-finance-config \
  security-level=restricted \
  tenant=finance \
  -n tenant-finance
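It is worth being concrete about what a Secret does and does not protect. The `data` values returned by `kubectl get secret -o json` are base64-encoded, which is an encoding and not encryption: anyone whose role grants `get` on secrets can recover the plaintext. The short sketch below demonstrates this with the standard library; the sample key matches the placeholder used above.

```python
import base64
from typing import Dict


def decode_secret_data(data: Dict[str, str]) -> Dict[str, str]:
    """Decode the base64 values of a Secret's data field back to plaintext."""
    return {key: base64.b64decode(value).decode("utf-8") for key, value in data.items()}


if __name__ == "__main__":
    # Simulates the data field of tenant-finance-config as the API returns it.
    stored = {"api-key": base64.b64encode(b"your-encrypted-key").decode("ascii")}
    print(stored)                      # looks opaque
    print(decode_secret_data(stored))  # trivially reversible
```

This is why the Role in the RBAC section, which grants `get` on secrets, should be bound only to accounts that genuinely need the values.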
Health Check Implementation
Monitor isolation system health:
# health_checker.py
import subprocess


def check_namespace_isolation() -> bool:
    """Verify namespace isolation is working"""
    try:
        # Test cross-namespace communication should fail
        result = subprocess.run([
            'kubectl', 'exec', '-n', 'tenant-finance',
            'deployment/ollama-service', '--',
            'curl', '-s', '--max-time', '5',
            'http://ollama-service.tenant-hr:8080/health'
        ], capture_output=True, timeout=10)
        # Should fail due to network policy. A non-zero exit can also mean
        # kubectl exec itself failed, so confirm the deployment exists first.
        return result.returncode != 0
    except subprocess.TimeoutExpired:
        return True  # Timeout indicates proper isolation


def check_rbac_enforcement() -> bool:
    """Verify RBAC policies are active"""
    try:
        result = subprocess.run([
            'kubectl', 'auth', 'can-i', 'get', 'secrets',
            '--as=system:serviceaccount:tenant-finance:finance-ollama-sa',
            '-n', 'tenant-hr'
        ], capture_output=True, timeout=10)
        # Should be denied: kubectl prints exactly "no"
        return result.stdout.strip().lower() == b'no'
    except Exception:
        return False


# Run health checks
if __name__ == "__main__":
    isolation_ok = check_namespace_isolation()
    rbac_ok = check_rbac_enforcement()
    print(f"Namespace isolation: {'✓' if isolation_ok else '✗'}")
    print(f"RBAC enforcement: {'✓' if rbac_ok else '✗'}")
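Printed check marks are fine for a person but not for automation. If these checks run on a schedule, for example in a CronJob or CI pipeline, a process exit code is easier to alert on. The sketch below is one possible wrapper: it runs any set of named check functions, treats an exception as a failure, and derives an exit code. The names `run_checks` and `exit_code` are illustrative.

```python
from typing import Callable, Dict


def run_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run each named check; a check that raises counts as failed."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


def exit_code(results: Dict[str, bool]) -> int:
    """Return 0 only when at least one check ran and all of them passed."""
    return 0 if results and all(results.values()) else 1


if __name__ == "__main__":
    # Stand-in checks; in practice pass check_namespace_isolation and
    # check_rbac_enforcement from health_checker.py.
    demo = run_checks({"always_ok": lambda: True, "broken": lambda: 1 / 0 > 0})
    print(demo, exit_code(demo))
```

Treating an empty result set as a failure avoids a silent pass when the check list is misconfigured.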
Troubleshooting Common Issues
Permission Denied Errors
Problem: Service accounts cannot access required resources.
Solution: Verify role bindings and permissions:
# Check current permissions
kubectl auth can-i --list --as=system:serviceaccount:tenant-finance:finance-ollama-sa -n tenant-finance
# Debug role binding
kubectl describe rolebinding finance-model-binding -n tenant-finance
# Fix missing permissions
kubectl patch role ollama-model-access -n tenant-finance --type='json' -p='[
  {
    "op": "add",
    "path": "/rules/-",
    "value": {
      "apiGroups": [""],
      "resources": ["persistentvolumeclaims"],
      "verbs": ["get", "list", "create"]
    }
  }
]'
Network Connectivity Issues
Problem: Legitimate traffic blocked by network policies.
Solution: Update network policies for required communication:
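# Rules to add under spec.egress in the tenant's NetworkPolicy
# Allow specific external access. An empty "to" matches every destination,
# including other tenants' pods on this port; narrow it with ipBlock where possible.
- to: []
  ports:
  - protocol: TCP
    port: 443  # HTTPS for external model APIs
# Allow access to monitoring namespace
- to:
  - namespaceSelector:
      matchLabels:
        # Label set automatically on every namespace
        kubernetes.io/metadata.name: monitoring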
Resource Quota Exceeded
Problem: Tenant hits resource limits during peak usage.
Solution: Implement auto-scaling within quota bounds:
# horizontal-pod-autoscaler.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ollama-autoscaler
  namespace: tenant-finance
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ollama-service
  minReplicas: 1
  maxReplicas: 5  # Within quota limits
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
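To predict how this autoscaler will behave under load, it helps to work through the scaling rule from the Kubernetes documentation: desired replicas = ceil(current replicas × current utilization / target utilization), skipped when the ratio is within a tolerance of 1.0 (0.1 by default) and clamped to the configured bounds. The sketch below reproduces that arithmetic with the values from the manifest above. It is a simplified model that ignores stabilization windows and pod readiness.

```python
import math


def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float = 70,
                     min_replicas: int = 1,
                     max_replicas: int = 5,
                     tolerance: float = 0.1) -> int:
    """Approximate the HPA replica calculation for a utilization target."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no change
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    print(desired_replicas(2, 90))   # load above target
    print(desired_replicas(4, 140))  # would exceed maxReplicas
    print(desired_replicas(4, 20))   # load well below target
```

The `maxReplicas` bound only keeps the tenant inside its quota if per-pod requests are sized to match: with a `requests.cpu` quota of 4, five replicas fit only when each pod requests 800m or less.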
Security Validation Checklist
Verify your multi-tenant deployment meets security requirements:
- Namespace Isolation: Each tenant runs in separate namespace
- Network Policies: Traffic restricted between tenants
- RBAC Active: Service accounts have minimal required permissions
- Resource Quotas: CPU, memory, and storage limits enforced
- Model Access Control: ACLs prevent unauthorized model access
- Audit Logging: All access attempts logged and monitored
- Secret Management: Sensitive data encrypted and isolated
- Health Monitoring: Automated checks verify isolation integrity
Performance Optimization
Efficient Resource Allocation
Balance security with performance through smart resource planning:
# resource_optimizer.py
import re
from typing import Dict


def calculate_optimal_quotas(tenant_workload: Dict) -> Dict[str, str]:
    """Calculate resource quotas based on workload patterns"""
    base_cpu = 2  # Minimum CPU cores
    base_memory = 4  # Minimum GB RAM
    # Scale based on model complexity
    model_factor = {
        '7b': 1.0,
        '13b': 1.5,
        '30b': 2.5,
        '70b': 4.0
    }

    def factor_for(model: str) -> float:
        # Extract the size token from names such as "llama2:7b-finance"
        match = re.search(r'(\d+b)\b', model.lower())
        return model_factor.get(match.group(1), 1.0) if match else 1.0

    # Take the largest factor; max() on the names would compare alphabetically
    models = tenant_workload.get('models') or ['7b']
    factor = max(factor_for(model) for model in models)
    concurrent_users = tenant_workload.get('concurrent_users', 10)
    user_factor = max(1.0, concurrent_users / 10)
    optimal_cpu = base_cpu * factor * user_factor
    optimal_memory = base_memory * factor * user_factor
    # The "g" format drops trailing zeros, e.g. 3.0 becomes "3"
    return {
        'requests.cpu': f"{optimal_cpu:g}",
        'limits.cpu': f"{optimal_cpu * 1.5:g}",
        'requests.memory': f"{optimal_memory:g}Gi",
        'limits.memory': f"{optimal_memory * 1.5:g}Gi"
    }
Model Caching Strategy
Implement shared model caching while maintaining isolation:
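# shared-model-cache.yaml
# A PersistentVolume binds to exactly one claim, so create one PV per tenant
# (each pointing at the same path) rather than sharing this single object.
# hostPath suits single-node clusters only; use a networked volume otherwise.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-model-cache
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadOnlyMany  # Read-only sharing
  persistentVolumeReclaimPolicy: Retain
  storageClassName: fast-ssd
  hostPath:
    path: /shared/models
---
# Per-tenant access to shared cache
# This request counts against the namespace's requests.storage quota
# (50Gi in resource-quotas.yaml), so raise the quota or lower the request.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-cache-access
  namespace: tenant-finance
spec:
  accessModes:
  - ReadOnlyMany
  resources:
    requests:
      storage: 100Gi
  storageClassName: fast-ssd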
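Because a PersistentVolume can be bound by only one claim, giving several tenants read-only access to the same model directory means creating one PV and PVC pair per tenant. The sketch below generates such pairs and pre-binds them with `claimRef` and `volumeName`, so one tenant's claim cannot grab another tenant's volume. The `model-cache-<tenant>` naming and the default size are assumptions for illustration.

```python
import json
from typing import Dict, List


def build_cache_volume(tenant: str, size: str = "100Gi",
                       path: str = "/shared/models") -> List[Dict]:
    """Return a pre-bound read-only PV and PVC pair for one tenant."""
    pv_name = f"model-cache-{tenant}"  # hypothetical naming convention
    namespace = f"tenant-{tenant}"
    claim_name = "model-cache-access"
    volume = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": pv_name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadOnlyMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": "fast-ssd",
            "hostPath": {"path": path},
            # Reserve this volume for the tenant's claim.
            "claimRef": {"namespace": namespace, "name": claim_name},
        },
    }
    claim = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim_name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadOnlyMany"],
            "resources": {"requests": {"storage": size}},
            "storageClassName": "fast-ssd",
            "volumeName": pv_name,  # bind to the matching volume only
        },
    }
    return [volume, claim]


if __name__ == "__main__":
    items = [obj for t in ["finance", "hr"] for obj in build_cache_volume(t)]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```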
Conclusion
Secure multi-tenancy transforms Ollama Enterprise from a single-user tool into an enterprise-grade platform. Proper user isolation protects sensitive data while enabling efficient resource sharing across departments.
The implementation strategies covered here - namespace isolation, RBAC policies, resource quotas, and monitoring systems - create robust security boundaries. Your organization gains the benefits of shared infrastructure without compromising data security or compliance requirements.
Ollama Enterprise User Isolation isn't just about security - it's about enabling confident AI adoption across your entire organization. Start with namespace-based isolation, add RBAC controls, and implement comprehensive monitoring for a bulletproof multi-tenant deployment.
Ready to secure your AI infrastructure? Begin with the namespace configuration and build your isolation layers step by step. Your future self (and your security team) will thank you.