Ever watched a team of developers try to share Ollama models like they're passing around a thumb drive from 2005? Three people download the same 7GB model, someone accidentally deletes the shared configuration, and suddenly nobody knows which version actually works in production.
Team-based Ollama development doesn't have to feel like herding cats. With proper collaboration tools and workflows, your team can share models, synchronize configurations, and deploy AI applications without the chaos.
This guide covers practical collaboration strategies for Ollama development teams, from shared model registries to automated deployment pipelines. You'll learn how to set up team workflows that scale beyond "it works on my machine."
Why Team Collaboration Matters for Ollama Development
Individual Ollama development works great until you need to share results. Teams face specific challenges that solo developers never encounter.
Storage and bandwidth problems hit teams hard. When five developers each download Llama 2 13B (roughly 7GB per copy), that's about 35GB of duplicate model storage. Remote teams suffer from repeated large downloads that slow development cycles.
Configuration drift creates mysterious bugs. Developer A uses temperature 0.7, Developer B uses 0.2, and production uses 0.5. The same prompt produces different results across environments, making testing unreliable.
Version control gaps plague model management. Git handles code well but struggles with multi-gigabyte model files. Teams need strategies for tracking model versions, prompt templates, and configuration changes.
Deployment coordination becomes complex with multiple contributors. Who deploys which model? How do you roll back problematic changes? Teams need structured deployment workflows that prevent conflicts.
Essential Collaboration Tools for Ollama Teams
Shared Model Registry with Ollama Hub
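Set up a centralized model directory that your entire team can access. This eliminates duplicate downloads and gives everyone the same set of models. OLLAMA_MODELS is read by the Ollama server, so set it in the environment where ollama serve runs, not only in the shell that runs ollama pull.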
# Create shared model registry
mkdir -p /shared/ollama-models
export OLLAMA_MODELS=/shared/ollama-models
# Team members pull from shared location
ollama pull llama2:13b
ollama pull codellama:7b
ollama pull mistral:7b
# List available models for team
ollama list
Configure team access with proper permissions:
# Set up group permissions
sudo groupadd ollama-team
sudo usermod -a -G ollama-team alice
sudo usermod -a -G ollama-team bob
sudo usermod -a -G ollama-team charlie
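# Configure shared directory
sudo chown -R :ollama-team /shared/ollama-models
# Capital X adds execute only on directories so group members can traverse them
sudo chmod -R g+rwX /shared/ollama-models
# setgid on directories makes new files inherit the ollama-team group
sudo find /shared/ollama-models -type d -exec chmod g+s {} +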
Docker Compose for Consistent Environments
Create reproducible development environments that work identically across team members' machines.
# docker-compose.yml
version: '3.8'
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama_models:/root/.ollama
      - ./models:/app/models
    environment:
      - OLLAMA_HOST=0.0.0.0
    restart: unless-stopped
  app:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      - ollama
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - ./src:/app/src
      - ./config:/app/config
volumes:
  ollama_models:
Team members get identical environments with:
# Clone team repository
git clone https://github.com/yourteam/ollama-project.git
cd ollama-project
# Start development environment
docker-compose up -d
# Verify setup
curl http://localhost:11434/api/tags
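The container can take a few seconds to accept connections after it starts, so a one-shot curl may fail on a cold start. The sketch below polls the same /api/tags endpoint until it answers. The function names and the injectable fetch parameter are illustrative assumptions, not part of Ollama.

```python
import json
import time
import urllib.request
from typing import Callable, List


def fetch_tags(base_url: str) -> dict:
    """GET /api/tags, the endpoint used by the curl check."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


def wait_for_ollama(base_url: str = "http://localhost:11434",
                    attempts: int = 10,
                    delay: float = 2.0,
                    fetch: Callable[[str], dict] = fetch_tags) -> List[str]:
    """Poll until the server answers, then return the installed model names."""
    last_error = None
    for _ in range(attempts):
        try:
            payload = fetch(base_url)
            return [m["name"] for m in payload.get("models", [])]
        except OSError as exc:  # connection refused, timeout, DNS failure
            last_error = exc
            time.sleep(delay)
    raise RuntimeError(f"Ollama not reachable at {base_url}: {last_error}")
```

Calling wait_for_ollama() in a setup script gives new team members a clear error instead of a confusing failure later in the application.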
Configuration Management with Environment Files
Standardize model configurations across development, staging, and production environments.
# .env.development
OLLAMA_MODEL=llama2:7b
OLLAMA_TEMPERATURE=0.7
OLLAMA_MAX_TOKENS=1000
OLLAMA_TIMEOUT=30
# .env.staging
OLLAMA_MODEL=llama2:13b
OLLAMA_TEMPERATURE=0.5
OLLAMA_MAX_TOKENS=2000
OLLAMA_TIMEOUT=60
# .env.production
OLLAMA_MODEL=llama2:13b
OLLAMA_TEMPERATURE=0.3
OLLAMA_MAX_TOKENS=1500
OLLAMA_TIMEOUT=90
Load configurations programmatically:
import os
from dotenv import load_dotenv

def load_ollama_config(env="development"):
    """Load environment-specific Ollama configuration"""
    load_dotenv(f".env.{env}")
    return {
        "model": os.getenv("OLLAMA_MODEL", "llama2:7b"),
        "temperature": float(os.getenv("OLLAMA_TEMPERATURE", "0.7")),
        "max_tokens": int(os.getenv("OLLAMA_MAX_TOKENS", "1000")),
        "timeout": int(os.getenv("OLLAMA_TIMEOUT", "30"))
    }

# Usage in application
config = load_ollama_config("production")
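The dictionary returned by load_ollama_config uses application-level names, while Ollama expects generation settings under an options key and calls the output cap num_predict. Here is a minimal sketch of that translation; build_chat_request is a hypothetical helper name, and the sample values mirror the development file.

```python
from typing import Any, Dict


def build_chat_request(config: Dict[str, Any], prompt: str) -> Dict[str, Any]:
    """Translate the team config dict into keyword arguments for ollama.chat."""
    return {
        "model": config["model"],
        "messages": [{"role": "user", "content": prompt}],
        "options": {
            "temperature": config["temperature"],
            # Ollama names the generation cap num_predict, not max_tokens
            "num_predict": config["max_tokens"],
        },
    }


request = build_chat_request(
    {"model": "llama2:7b", "temperature": 0.7, "max_tokens": 1000, "timeout": 30},
    "Summarize this changelog",
)
# With the ollama package installed: ollama.chat(**request)
```

The timeout value is left out on purpose: it is a property of the HTTP client rather than of a single chat request.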
Shared Model Management Strategies
Model Versioning with Git LFS
Track large model files efficiently using Git Large File Storage:
# Initialize Git LFS in repository
git lfs install
# Track Ollama model files
git lfs track "*.safetensors"
git lfs track "*.bin"
git lfs track "*.gguf"
# Add tracking configuration
git add .gitattributes
git commit -m "Add LFS tracking for model files"
Create a model registry structure:
models/
├── llama2/
│ ├── 7b/
│ │ ├── model.gguf
│ │ └── metadata.json
│ └── 13b/
│ ├── model.gguf
│ └── metadata.json
└── codellama/
└── 7b/
├── model.gguf
└── metadata.json
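The layout above pairs each model.gguf with a metadata.json, but the guide does not define its contents. One possible shape, sketched here as an assumption, records the file size and a SHA-256 checksum so teammates can verify that they hold the same weights. The field names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so multi-gigabyte models never sit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_metadata(model_dir: Path, name: str, tag: str) -> dict:
    """Write metadata.json next to model.gguf and return its contents."""
    model_file = model_dir / "model.gguf"
    metadata = {
        "name": name,
        "tag": tag,
        "file": model_file.name,
        "size_bytes": model_file.stat().st_size,
        "sha256": sha256_of(model_file),
    }
    (model_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return metadata
```

For example, write_metadata(Path("models/llama2/7b"), "llama2", "7b") would refresh the metadata after a model file changes.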
Automated Model Synchronization
Create scripts to keep team models synchronized:
#!/usr/bin/env python3
"""
Team model synchronization script
Keeps local Ollama models in sync with team registry
"""
import subprocess
import requests

OLLAMA_URL = "http://localhost:11434"

def get_team_models():
    """Fetch list of approved team models"""
    # Replace with your team's model registry API
    response = requests.get("https://api.yourteam.com/models", timeout=30)
    response.raise_for_status()
    return response.json()

def get_local_models():
    """Get currently installed Ollama models from the local server"""
    # "ollama list" prints a human-readable table, so query the REST API instead
    response = requests.get(f"{OLLAMA_URL}/api/tags", timeout=30)
    response.raise_for_status()
    return response.json()["models"]

def sync_models():
    """Synchronize local models with team registry"""
    team_models = get_team_models()
    team_names = {m["name"] for m in team_models}
    local_models = {m["name"]: m for m in get_local_models()}
    for model in team_models:
        model_name = model["name"]
        if model_name not in local_models:
            print(f"Pulling new model: {model_name}")
            subprocess.run(["ollama", "pull", model_name], check=True)
        elif model.get("digest") != local_models[model_name]["digest"]:
            # Assumes the team registry publishes the digest that Ollama reports
            print(f"Updating model: {model_name}")
            subprocess.run(["ollama", "pull", model_name], check=True)
    # Remove deprecated models
    for local_model in local_models:
        if local_model not in team_names:
            print(f"Removing deprecated model: {local_model}")
            subprocess.run(["ollama", "rm", local_model], check=True)

if __name__ == "__main__":
    sync_models()
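Because the sync script deletes models, it can help to preview what would change before running it. The sketch below computes the plan as plain set arithmetic without touching Ollama; plan_sync is a hypothetical helper and the model names are sample data.

```python
from typing import Dict, Iterable, List, Set


def plan_sync(team_models: Iterable[str],
              local_models: Iterable[str]) -> Dict[str, List[str]]:
    """Return which models a sync would pull, remove, or leave alone."""
    team: Set[str] = set(team_models)
    local: Set[str] = set(local_models)
    return {
        "pull": sorted(team - local),
        "remove": sorted(local - team),
        "keep": sorted(team & local),
    }


plan = plan_sync(
    team_models=["llama2:13b", "codellama:7b"],
    local_models=["llama2:13b", "mistral:7b"],
)
for action, names in plan.items():
    print(f"{action}: {', '.join(names) or '-'}")
```

Printing the plan first and asking for confirmation keeps a stale registry response from wiping a teammate's local experiments.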
Model Performance Benchmarking
Track model performance across team environments:
import time
import json
import os
from datetime import datetime
import ollama

def benchmark_model(model_name, test_prompts):
    """Benchmark model performance with standard prompts"""
    results = {
        "model": model_name,
        "timestamp": datetime.now().isoformat(),
        "benchmarks": []
    }
    for prompt in test_prompts:
        start_time = time.time()
        response = ollama.chat(
            model=model_name,
            messages=[{"role": "user", "content": prompt}]
        )
        elapsed = time.time() - start_time
        # eval_count is the number of tokens Ollama generated; counting
        # whitespace-separated words would undercount real tokens
        tokens = response["eval_count"]
        results["benchmarks"].append({
            "prompt": prompt[:50] + "..." if len(prompt) > 50 else prompt,
            "response_time": elapsed,
            "tokens_generated": tokens,
            "tokens_per_second": tokens / elapsed if elapsed > 0 else 0.0
        })
    return results

# Standard team benchmarks
test_prompts = [
    "Write a Python function to calculate fibonacci numbers",
    "Explain quantum computing in simple terms",
    "Create a REST API endpoint for user authentication"
]

# Run benchmarks
results = benchmark_model("llama2:13b", test_prompts)

# Save results for team comparison
os.makedirs("benchmarks", exist_ok=True)
with open(f"benchmarks/{datetime.now().strftime('%Y%m%d_%H%M%S')}.json", "w") as f:
    json.dump(results, f, indent=2)
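Once several teammates have saved result files, someone has to compare them. A minimal sketch, assuming every file in benchmarks/ follows the structure written above (model, benchmarks, tokens_per_second), averages throughput per model. summarize_benchmarks is a hypothetical helper name.

```python
import json
from pathlib import Path
from statistics import mean
from typing import Dict, List


def summarize_benchmarks(directory: str = "benchmarks") -> Dict[str, float]:
    """Average tokens_per_second per model across all saved result files."""
    rates: Dict[str, List[float]] = {}
    for path in sorted(Path(directory).glob("*.json")):
        data = json.loads(path.read_text())
        for bench in data.get("benchmarks", []):
            rates.setdefault(data["model"], []).append(bench["tokens_per_second"])
    return {model: mean(values) for model, values in rates.items()}
```

Averages hide hardware differences, so it may be worth adding a hostname field to each result file before comparing across machines.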
Team Development Workflows
Feature Branch Strategy for AI Projects
Adapt Git Flow for AI development workflows:
# Create feature branch for new model integration
git checkout -b feature/llama2-integration
# Develop and test model integration
# ... development work ...
# Create pull request with model benchmarks
git add .
git commit -m "Add Llama 2 13B integration with benchmarks"
git push origin feature/llama2-integration
Include model testing in pull requests:
# .github/workflows/model-test.yml
name: Model Integration Tests
on:
  pull_request:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Ollama
        run: |
          curl -fsSL https://ollama.com/install.sh | sh
          ollama serve &
          sleep 5
      - name: Pull test models
        run: |
          ollama pull llama2:7b
          ollama pull codellama:7b
      - name: Run model tests
        run: |
          python -m pytest tests/model_tests.py -v
      - name: Upload benchmark results
        # v3 of upload-artifact is deprecated
        uses: actions/upload-artifact@v4
        with:
          name: model-benchmarks
          path: benchmarks/
Code Review Guidelines for AI Projects
Establish team standards for reviewing AI-related code:
Model Configuration Reviews:
- Verify temperature and parameter settings
- Check for hardcoded model names
- Ensure proper error handling for model failures
- Validate prompt templates and examples
Performance Reviews:
- Review response time benchmarks
- Check memory usage patterns
- Validate token limits and costs
- Ensure proper resource cleanup
Security Reviews:
- Validate input sanitization
- Check for prompt injection vulnerabilities
- Review API key management
- Ensure proper access controls
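Some of these checks can be automated before a human looks at the diff. As one example, the "hardcoded model names" item can be approximated with a regular expression that flags quoted name:tag strings. This is a rough sketch: the pattern and the function name are assumptions and will miss tags that do not look like a size or latest.

```python
import re
from typing import List, Tuple

# Hypothetical pattern: a quoted "name:tag" whose tag looks like a size or "latest"
MODEL_TAG = re.compile(r"""["']([a-z0-9._-]+:(?:\d+(?:\.\d+)?b|latest))["']""")


def find_hardcoded_models(source: str) -> List[Tuple[int, str]]:
    """Return (line number, model tag) pairs found outside comment lines."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # ignore commented-out code
        for match in MODEL_TAG.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


sample = 'config = load_ollama_config()\nresponse = chat(model="llama2:13b")\n'
print(find_hardcoded_models(sample))
```

Running a check like this in CI turns a review guideline into a failing build, which tends to be enforced more consistently.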
Deployment Strategies for Team Projects
Staging Environment Setup
Create isolated staging environments for team testing:
# docker-compose.staging.yml
version: '3.8'
services:
  ollama-staging:
    image: ollama/ollama:latest
    ports:
      - "11435:11434"
    volumes:
      - ollama_staging:/root/.ollama
    environment:
      - OLLAMA_HOST=0.0.0.0
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 16G
        reservations:
          memory: 8G
  app-staging:
    build:
      context: .
      dockerfile: Dockerfile.staging
    ports:
      - "8001:8000"
    depends_on:
      - ollama-staging
    environment:
      - OLLAMA_BASE_URL=http://ollama-staging:11434
      - ENVIRONMENT=staging
    volumes:
      - ./config/staging:/app/config
volumes:
  ollama_staging:
Automated Deployment Pipeline
Set up CI/CD for Ollama applications:
# .github/workflows/deploy.yml
name: Deploy to Production
on:
  push:
    branches: [ main ]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to staging
        run: |
          docker-compose -f docker-compose.staging.yml up -d
          sleep 30
      - name: Run integration tests
        run: |
          python -m pytest tests/integration/ -v
      - name: Deploy to production
        if: success()
        run: |
          docker-compose -f docker-compose.prod.yml up -d
      - name: Health check
        run: |
          curl -f http://localhost:8000/health || exit 1
      - name: Notify team
        if: always()
        uses: 8398a7/action-slack@v3
        with:
          status: ${{ job.status }}
          text: "Deployment ${{ job.status }}: ${{ github.event.head_commit.message }}"
        env:
          # The Slack action reads the webhook from this variable
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
Blue-Green Deployment for Zero Downtime
Implement zero-downtime deployments for production systems:
#!/bin/bash
# blue-green-deploy.sh
set -e
BLUE_PORT=8000
GREEN_PORT=8001
HEALTH_ENDPOINT="/health"
# Function to check service health
check_health() {
local port=$1
curl -f "http://localhost:$port$HEALTH_ENDPOINT" > /dev/null 2>&1
}
# Function to switch traffic
switch_traffic() {
local new_port=$1
echo "Switching traffic to port $new_port"
# Update load balancer configuration
# This example uses nginx - adapt for your setup
sed -i "s/server localhost:[0-9]*/server localhost:$new_port/" /etc/nginx/sites-available/ollama-app
nginx -s reload
}
# Deploy new version to inactive environment
if check_health $BLUE_PORT; then
echo "Blue is active, deploying to green"
DEPLOY_PORT=$GREEN_PORT
ACTIVE_PORT=$BLUE_PORT
else
echo "Green is active, deploying to blue"
DEPLOY_PORT=$BLUE_PORT
ACTIVE_PORT=$GREEN_PORT
fi
# Stop inactive environment
docker-compose -f docker-compose.yml stop app-$([ $DEPLOY_PORT -eq $BLUE_PORT ] && echo "blue" || echo "green")
# Deploy new version
docker-compose -f docker-compose.yml up -d app-$([ $DEPLOY_PORT -eq $BLUE_PORT ] && echo "blue" || echo "green")
# Wait for deployment to be ready
echo "Waiting for deployment to be ready..."
for i in {1..30}; do
if check_health $DEPLOY_PORT; then
echo "Deployment ready!"
break
fi
sleep 10
done
# Verify deployment health
if ! check_health $DEPLOY_PORT; then
echo "Deployment failed health check"
exit 1
fi
# Switch traffic to new deployment
switch_traffic $DEPLOY_PORT
# Verify traffic switch
sleep 10
if ! check_health $DEPLOY_PORT; then
echo "Traffic switch failed, rolling back"
switch_traffic $ACTIVE_PORT
exit 1
fi
echo "Deployment successful!"
Monitoring and Observability
Team Performance Dashboards
Create shared dashboards for monitoring team Ollama deployments:
import json
import time
from datetime import datetime
from dataclasses import dataclass
from typing import List, Dict
import ollama
@dataclass
class ModelMetrics:
model_name: str
timestamp: datetime
response_time: float
tokens_per_second: float
memory_usage: float
active_requests: int
class OllamaMonitor:
def __init__(self, models: List[str]):
self.models = models
self.metrics_history = []
def collect_metrics(self) -> List[ModelMetrics]:
"""Collect performance metrics for all models"""
metrics = []
for model in self.models:
try:
# Test model response time
start_time = time.time()
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": "Hello"}]
)
end_time = time.time()
response_time = end_time - start_time
tokens = len(response["message"]["content"].split())
tokens_per_second = tokens / response_time if response_time > 0 else 0
# Get system metrics (placeholder - implement actual monitoring)
memory_usage = self._get_memory_usage()
active_requests = self._get_active_requests()
metrics.append(ModelMetrics(
model_name=model,
timestamp=datetime.now(),
response_time=response_time,
tokens_per_second=tokens_per_second,
memory_usage=memory_usage,
active_requests=active_requests
))
except Exception as e:
print(f"Error collecting metrics for {model}: {e}")
return metrics
def _get_memory_usage(self) -> float:
"""Get current memory usage (implement with psutil)"""
# Placeholder implementation
return 0.0
def _get_active_requests(self) -> int:
"""Get number of active requests (implement with your monitoring)"""
# Placeholder implementation
return 0
def export_metrics(self, metrics: List[ModelMetrics], format: str = "json"):
"""Export metrics in specified format"""
if format == "json":
return json.dumps([
{
"model": m.model_name,
"timestamp": m.timestamp.isoformat(),
"response_time": m.response_time,
"tokens_per_second": m.tokens_per_second,
"memory_usage": m.memory_usage,
"active_requests": m.active_requests
}
for m in metrics
], indent=2)
# Usage
monitor = OllamaMonitor(["llama2:7b", "llama2:13b", "codellama:7b"])
metrics = monitor.collect_metrics()
print(monitor.export_metrics(metrics))
Centralized Logging Strategy
Implement structured logging for team debugging:
import logging
import json
from datetime import datetime
from typing import Dict, Any
class OllamaLogger:
def __init__(self, service_name: str):
self.service_name = service_name
self.logger = logging.getLogger(service_name)
self.logger.setLevel(logging.INFO)
# Plain-text formatter; log_request and log_error put a JSON payload in the message
formatter = logging.Formatter(
'%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
# Console handler
console_handler = logging.StreamHandler()
console_handler.setFormatter(formatter)
self.logger.addHandler(console_handler)
# File handler for team log aggregation
file_handler = logging.FileHandler(f"/var/log/ollama/{service_name}.log")
file_handler.setFormatter(formatter)
self.logger.addHandler(file_handler)
def log_request(self, model: str, prompt: str, response_time: float,
tokens_generated: int, user_id: str = None):
"""Log Ollama request with structured data"""
log_data = {
"event_type": "ollama_request",
"service": self.service_name,
"model": model,
"prompt_length": len(prompt),
"response_time": response_time,
"tokens_generated": tokens_generated,
"tokens_per_second": tokens_generated / response_time if response_time > 0 else 0,
"user_id": user_id,
"timestamp": datetime.now().isoformat()
}
self.logger.info(json.dumps(log_data))
def log_error(self, error: Exception, context: Dict[str, Any] = None):
"""Log error with context"""
log_data = {
"event_type": "error",
"service": self.service_name,
"error_type": type(error).__name__,
"error_message": str(error),
"context": context or {},
"timestamp": datetime.now().isoformat()
}
self.logger.error(json.dumps(log_data))
# Usage in your application
logger = OllamaLogger("ollama-api")
# Log successful requests
logger.log_request(
model="llama2:13b",
prompt="Write a function to sort arrays",
response_time=2.5,
tokens_generated=150,
user_id="user123"
)
# Log errors with context
import ollama  # the call below needs the ollama package

try:
    response = ollama.chat(
        model="nonexistent:model",
        messages=[{"role": "user", "content": "Hello"}]
    )
except Exception as e:
    logger.log_error(e, {"model": "nonexistent:model", "user_id": "user123"})
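The logger above writes a JSON string inside a plain-text line, which log aggregators usually cannot parse without extra rules. A formatter that emits one JSON object per line is a small change. The sketch below uses only the standard library; JsonFormatter is an illustrative name, not an existing class.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "logger": record.name,
            "level": record.levelname,
        }
        message = record.getMessage()
        try:
            parsed = json.loads(message)
        except ValueError:
            parsed = None
        if isinstance(parsed, dict):
            # Message was already structured: merge its fields into the record
            payload.update(parsed)
        else:
            payload["message"] = message
        return json.dumps(payload)
```

Swapping it in is a one-line change per handler, for example console_handler.setFormatter(JsonFormatter()). Fields in the logged payload, such as timestamp, take precedence over the formatter's own.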
Advanced Collaboration Features
Model A/B Testing Framework
Set up systematic model comparison across team projects:
import random
import json
import time
from datetime import datetime
from typing import Dict, List, Tuple
from dataclasses import dataclass
import ollama
@dataclass
class ABTestResult:
model_a: str
model_b: str
prompt: str
response_a: str
response_b: str
user_preference: str
response_time_a: float
response_time_b: float
timestamp: datetime
class ModelABTester:
def __init__(self, model_pairs: List[Tuple[str, str]]):
self.model_pairs = model_pairs
self.results = []
    def run_comparison(self, prompt: str, model_a: str, model_b: str) -> ABTestResult:
        """Run A/B comparison between two models"""
        import time  # used for the timings below

        # Test model A
        start_time = time.time()
        response_a = ollama.chat(
            model=model_a,
            messages=[{"role": "user", "content": prompt}]
        )
        time_a = time.time() - start_time
        # Test model B
        start_time = time.time()
        response_b = ollama.chat(
            model=model_b,
            messages=[{"role": "user", "content": prompt}]
        )
        time_b = time.time() - start_time
        # Show the responses in random order under neutral labels so the
        # reviewer cannot tell which model produced which answer
        candidates = [("A", response_a), ("B", response_b)]
        random.shuffle(candidates)
        for position, (_, response) in enumerate(candidates, start=1):
            print(f"Response {position}: {response['message']['content']}")
        choice = ""
        while choice not in ("1", "2"):
            choice = input("Which response is better? (1/2): ").strip()
        # Map the chosen position back to the model label
        preference = candidates[int(choice) - 1][0]
        result = ABTestResult(
            model_a=model_a,
            model_b=model_b,
            prompt=prompt,
            response_a=response_a['message']['content'],
            response_b=response_b['message']['content'],
            user_preference=preference,
            response_time_a=time_a,
            response_time_b=time_b,
            timestamp=datetime.now()
        )
        self.results.append(result)
        return result
def generate_report(self) -> Dict:
"""Generate A/B test report"""
if not self.results:
return {"error": "No test results available"}
model_wins = {}
total_tests = len(self.results)
for result in self.results:
winner = result.model_a if result.user_preference == "A" else result.model_b
model_wins[winner] = model_wins.get(winner, 0) + 1
return {
"total_tests": total_tests,
"model_performance": {
model: {"wins": wins, "win_rate": wins / total_tests}
for model, wins in model_wins.items()
},
"average_response_times": {
"model_a": sum(r.response_time_a for r in self.results) / total_tests,
"model_b": sum(r.response_time_b for r in self.results) / total_tests
}
}
# Usage
tester = ModelABTester([("llama2:7b", "llama2:13b")])
# Run tests
test_prompts = [
"Explain machine learning in simple terms",
"Write a Python function to find prime numbers",
"What are the benefits of renewable energy?"
]
for prompt in test_prompts:
result = tester.run_comparison(prompt, "llama2:7b", "llama2:13b")
# Generate report
report = tester.generate_report()
print(json.dumps(report, indent=2))
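A win rate from three prompts says very little on its own. One way to judge whether a preference is more than noise is an exact two-sided sign test, which asks how likely a split at least this lopsided would be if both models were equally good. This test is an addition that the report above does not compute, and the function name is hypothetical.

```python
from math import comb


def sign_test_p_value(wins_a: int, wins_b: int) -> float:
    """Two-sided exact binomial test of wins_a vs wins_b under a 50/50 null."""
    n = wins_a + wins_b
    if n == 0:
        return 1.0
    k = max(wins_a, wins_b)
    # Probability of k or more wins out of n for a fair coin
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


print(sign_test_p_value(3, 0))
print(sign_test_p_value(10, 0))
```

A 3-0 sweep gives p = 0.25, so the three-prompt run above cannot separate the models. Teams may want a few dozen prompts before acting on the report.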
Shared Prompt Library
Create a centralized prompt library for team reuse:
import json
import os
from datetime import datetime
from typing import Dict, List, Optional
from dataclasses import dataclass, asdict
@dataclass
class PromptTemplate:
name: str
description: str
template: str
variables: List[str]
category: str
author: str
created_at: datetime
tags: List[str]
examples: List[Dict[str, str]]
class TeamPromptLibrary:
def __init__(self, library_path: str = "prompts/"):
self.library_path = library_path
os.makedirs(library_path, exist_ok=True)
def add_prompt(self, prompt: PromptTemplate) -> bool:
"""Add a new prompt to the library"""
filename = f"{prompt.category}_{prompt.name.replace(' ', '_')}.json"
filepath = os.path.join(self.library_path, filename)
try:
# Convert dataclass to dict for JSON serialization
prompt_dict = asdict(prompt)
prompt_dict['created_at'] = prompt.created_at.isoformat()
with open(filepath, 'w') as f:
json.dump(prompt_dict, f, indent=2)
return True
except Exception as e:
print(f"Error saving prompt: {e}")
return False
def get_prompt(self, name: str, category: str = None) -> Optional[PromptTemplate]:
"""Retrieve a prompt by name and optional category"""
for filename in os.listdir(self.library_path):
if filename.endswith('.json'):
filepath = os.path.join(self.library_path, filename)
with open(filepath, 'r') as f:
data = json.load(f)
if data['name'] == name and (category is None or data['category'] == category):
# Convert back to dataclass
data['created_at'] = datetime.fromisoformat(data['created_at'])
return PromptTemplate(**data)
return None
def search_prompts(self, query: str = "", category: str = "",
tags: List[str] = None) -> List[PromptTemplate]:
"""Search prompts by query, category, or tags"""
results = []
tags = tags or []
for filename in os.listdir(self.library_path):
if filename.endswith('.json'):
filepath = os.path.join(self.library_path, filename)
with open(filepath, 'r') as f:
data = json.load(f)
# Check search criteria
matches = True
if query and query.lower() not in data['name'].lower() and query.lower() not in data['description'].lower():
matches = False
if category and data['category'] != category:
matches = False
if tags and not any(tag in data['tags'] for tag in tags):
matches = False
if matches:
data['created_at'] = datetime.fromisoformat(data['created_at'])
results.append(PromptTemplate(**data))
return results
def render_prompt(self, template: PromptTemplate, variables: Dict[str, str]) -> str:
"""Render a prompt template with provided variables"""
rendered = template.template
for var_name, var_value in variables.items():
placeholder = f"{{{var_name}}}"
rendered = rendered.replace(placeholder, var_value)
return rendered
# Usage
library = TeamPromptLibrary()
# Add prompts to library
code_review_prompt = PromptTemplate(
name="Code Review",
description="Template for reviewing code changes",
template="Review the following {language} code for:\n1. Bugs and errors\n2. Performance issues\n3. Best practices\n\nCode:\n{code}\n\nProvide specific feedback:",
variables=["language", "code"],
category="development",
author="alice@team.com",
created_at=datetime.now(),
tags=["code", "review", "development"],
examples=[{
"language": "Python",
"code": "def fibonacci(n):\n if n <= 1:\n return n\n return fibonacci(n-1) + fibonacci(n-2)"
}]
)
library.add_prompt(code_review_prompt)
# Search and use prompts
prompts = library.search_prompts(category="development")
for prompt in prompts:
print(f"Found prompt: {prompt.name}")
# Render prompt with variables
variables = {
"language": "Python",
"code": "def add(a, b):\n return a + b"
}
rendered = library.render_prompt(code_review_prompt, variables)
print(rendered)
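render_prompt leaves a placeholder untouched when the caller forgets a variable, so the model receives a literal {code} in its prompt. A small pre-check, sketched below with the standard library's string.Formatter, lists the placeholders that lack a value. missing_variables is a hypothetical helper, and it assumes templates contain no stray braces.

```python
from string import Formatter
from typing import Dict, List


def missing_variables(template: str, variables: Dict[str, str]) -> List[str]:
    """Return placeholder names in the template that have no supplied value."""
    placeholders = {
        field for _, field, _, _ in Formatter().parse(template) if field
    }
    return sorted(placeholders - set(variables))


template_text = "Review the following {language} code:\n{code}"
print(missing_variables(template_text, {"language": "Python"}))
```

Calling this before render_prompt and raising on a non-empty result surfaces the mistake at the call site instead of in a confusing model response.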
Team Communication and Documentation
Automated Status Reports
Generate automated reports for team standup meetings:
import json
from datetime import datetime, timedelta
from typing import Dict, List
import subprocess
class TeamStatusReporter:
def __init__(self, project_path: str):
self.project_path = project_path
def get_git_activity(self, days: int = 1) -> List[Dict]:
"""Get recent git activity for the team"""
since_date = (datetime.now() - timedelta(days=days)).strftime("%Y-%m-%d")
try:
result = subprocess.run([
"git", "log", f"--since={since_date}", "--pretty=format:%H|%an|%ad|%s",
"--date=short"
], capture_output=True, text=True, cwd=self.project_path)
commits = []
for line in result.stdout.strip().split('\n'):
if line:
hash_val, author, date, message = line.split('|', 3)
commits.append({
"hash": hash_val,
"author": author,
"date": date,
"message": message
})
return commits
except Exception as e:
print(f"Error getting git activity: {e}")
return []
def get_model_performance(self) -> Dict:
"""Get recent model performance metrics"""
# Read latest benchmark results
try:
with open("benchmarks/latest.json", "r") as f:
return json.load(f)
except FileNotFoundError:
return {"error": "No benchmark data available"}
    def get_deployment_status(self) -> Dict:
        """Check deployment status across environments"""
        # Compose file names as used elsewhere in this guide
        compose_files = {
            "development": "docker-compose.yml",
            "staging": "docker-compose.staging.yml",
            "production": "docker-compose.prod.yml"
        }
        status = {}
        for env, compose_file in compose_files.items():
            try:
                # Check if containers are running
                result = subprocess.run([
                    "docker-compose", "-f", compose_file, "ps", "-q"
                ], capture_output=True, text=True, cwd=self.project_path)
                container_count = len(result.stdout.split())
                status[env] = {
                    "status": "running" if container_count > 0 else "stopped",
                    "containers": container_count
                }
            except Exception as e:
                status[env] = {"status": "error", "message": str(e)}
        return status
def generate_report(self) -> str:
"""Generate comprehensive team status report"""
report = []
report.append("# Team Ollama Development Status Report")
report.append(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
report.append("")
# Git activity
commits = self.get_git_activity()
report.append("## Recent Development Activity")
if commits:
for commit in commits[:5]: # Show last 5 commits
report.append(f"- {commit['author']}: {commit['message']} ({commit['date']})")
else:
report.append("- No recent commits")
report.append("")
# Model performance
performance = self.get_model_performance()
report.append("## Model Performance")
if "error" not in performance:
report.append(f"- Last benchmark: {performance.get('timestamp', 'Unknown')}")
for benchmark in performance.get('benchmarks', [])[:3]:
report.append(f"- {benchmark['prompt']}: {benchmark['tokens_per_second']:.2f} tokens/sec")
else:
report.append("- No performance data available")
report.append("")
# Deployment status
deployments = self.get_deployment_status()
report.append("## Deployment Status")
for env, status in deployments.items():
report.append(f"- {env.title()}: {status['status']}")
report.append("")
return "\n".join(report)
# Usage
reporter = TeamStatusReporter("/path/to/project")
status_report = reporter.generate_report()
print(status_report)
# Send to team chat (Slack example)
def send_to_slack(message: str, webhook_url: str):
"""Send status report to team Slack channel"""
import requests
payload = {
"text": "Daily Ollama Team Status",
"attachments": [{
"color": "good",
"text": message,
"mrkdwn_in": ["text"]
}]
}
response = requests.post(webhook_url, json=payload)
return response.status_code == 200
Documentation Generation
Create automated documentation for team models and configurations:
import os
import json
from datetime import datetime
from typing import Dict, List
import subprocess
class TeamDocumentationGenerator:
def __init__(self, project_path: str):
self.project_path = project_path
def scan_models(self) -> List[Dict]:
"""Scan project for model configurations"""
models = []
# Scan Docker Compose files
for filename in os.listdir(self.project_path):
if filename.startswith("docker-compose") and filename.endswith(".yml"):
models.extend(self._extract_models_from_compose(filename))
# Scan environment files
for filename in os.listdir(self.project_path):
if filename.startswith(".env"):
models.extend(self._extract_models_from_env(filename))
return models
def _extract_models_from_compose(self, filename: str) -> List[Dict]:
"""Extract model information from Docker Compose file"""
models = []
try:
with open(os.path.join(self.project_path, filename), 'r') as f:
content = f.read()
# Simple parsing for OLLAMA_MODEL environment variables
lines = content.split('\n')
for line in lines:
if 'OLLAMA_MODEL' in line and '=' in line:
model_name = line.split('=')[1].strip()
models.append({
"name": model_name,
"source": filename,
"type": "docker-compose"
})
except Exception as e:
print(f"Error parsing {filename}: {e}")
return models
def _extract_models_from_env(self, filename: str) -> List[Dict]:
"""Extract model information from environment file"""
models = []
try:
with open(os.path.join(self.project_path, filename), 'r') as f:
for line in f:
if line.startswith('OLLAMA_MODEL='):
model_name = line.split('=')[1].strip()
models.append({
"name": model_name,
"source": filename,
"type": "environment"
})
except Exception as e:
print(f"Error parsing {filename}: {e}")
return models
def generate_model_docs(self) -> str:
"""Generate documentation for all models used in the project"""
models = self.scan_models()
docs = []
docs.append("# Ollama Models Documentation")
docs.append(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
docs.append("")
# Group models by name
model_groups = {}
for model in models:
name = model["name"]
if name not in model_groups:
model_groups[name] = []
model_groups[name].append(model)
for model_name, usages in model_groups.items():
docs.append(f"## {model_name}")
docs.append("")
# Model information
docs.append("### Usage Locations")
for usage in usages:
docs.append(f"- {usage['source']} ({usage['type']})")
docs.append("")
            # Add model specifications if available
            specs = self._get_model_specs(model_name)
            if specs:
                details = specs.get("details", {})
                docs.append("### Specifications")
                docs.append(f"- Format: {details.get('format', 'Unknown')}")
                docs.append(f"- Parameters: {details.get('parameter_size', 'Unknown')}")
                docs.append(f"- Quantization: {details.get('quantization_level', 'Unknown')}")
                docs.append("")
        return "\n".join(docs)

    def _get_model_specs(self, model_name: str) -> Dict:
        """Get model specifications from the Ollama REST API"""
        import requests
        try:
            # "ollama show" has no JSON output flag, so call /api/show directly.
            # Older servers read "name", newer ones read "model"; send both.
            response = requests.post(
                "http://localhost:11434/api/show",
                json={"model": model_name, "name": model_name},
                timeout=30
            )
            if response.status_code == 200:
                return response.json()
        except Exception as e:
            print(f"Error getting specs for {model_name}: {e}")
        return {}
    def generate_api_docs(self) -> str:
        """Generate API documentation for team endpoints"""
        docs = []
        docs.append("# Team API Documentation")
        docs.append("")
        # Scan for API endpoints
        endpoints = self._scan_api_endpoints()
        for endpoint in endpoints:
            docs.append(f"## {endpoint['method']} {endpoint['path']}")
            docs.append("")
            docs.append(f"**Description:** {endpoint['description']}")
            docs.append("")
            if endpoint['parameters']:
                docs.append("**Parameters:**")
                for param in endpoint['parameters']:
                    docs.append(f"- `{param['name']}` ({param['type']}): {param['description']}")
                docs.append("")
            if endpoint['example']:
                docs.append("**Example Request:**")
                docs.append("```bash")
                docs.append(endpoint['example'])
                docs.append("```")
                docs.append("")
        return "\n".join(docs)

    def _scan_api_endpoints(self) -> List[Dict]:
        """Scan code for API endpoints"""
        endpoints = []
        # This is a simplified example - implement based on your framework
        api_files = []
        for root, dirs, files in os.walk(self.project_path):
            for file in files:
                if file.endswith('.py') and ('api' in file or 'routes' in file):
                    api_files.append(os.path.join(root, file))
        for api_file in api_files:
            try:
                with open(api_file, 'r') as f:
                    content = f.read()
                # Simple parsing for Flask/FastAPI routes
                lines = content.split('\n')
                for i, line in enumerate(lines):
                    if '@app.route' in line or '@router.' in line:
                        # Extract endpoint information
                        endpoint = self._parse_endpoint(lines, i)
                        if endpoint:
                            endpoints.append(endpoint)
            except Exception as e:
                print(f"Error parsing {api_file}: {e}")
        return endpoints

    def _parse_endpoint(self, lines: List[str], start_index: int) -> Dict:
        """Parse endpoint information from code lines"""
        # Simplified parsing - implement based on your code structure
        route_line = lines[start_index]
        # Extract method and path
        if 'methods=' in route_line:
            method = 'POST'  # Default assumption
        else:
            method = 'GET'
        # Extract path (simplified)
        path = "/api/example"  # Placeholder
        return {
            "method": method,
            "path": path,
            "description": "API endpoint description",
            "parameters": [],
            "example": f"curl -X {method} http://localhost:8000{path}"
        }
# Usage
doc_generator = TeamDocumentationGenerator("/path/to/project")
os.makedirs("docs", exist_ok=True)  # Make sure the output directory exists
# Generate model documentation
model_docs = doc_generator.generate_model_docs()
with open("docs/models.md", "w") as f:
    f.write(model_docs)
# Generate API documentation
api_docs = doc_generator.generate_api_docs()
with open("docs/api.md", "w") as f:
    f.write(api_docs)
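The `_parse_endpoint` method above returns a placeholder path and guesses the HTTP method. As a rough sketch of how the real values could be pulled from a decorator line, the helper below uses regular expressions. The function name `parse_route_line` and both patterns are illustrative assumptions, and they only cover single-line Flask-style `@app.route(...)` and FastAPI-style `@router.get(...)` decorators with a literal path string.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns: match @app.route("/x", methods=["POST"]) or @router.get("/x").
ROUTE_RE = re.compile(
    r"""@\w+\.(?P<verb>route|get|post|put|delete|patch)\(\s*['"](?P<path>[^'"]+)['"](?P<rest>.*)"""
)
METHODS_RE = re.compile(r"""methods\s*=\s*\[(?P<methods>[^\]]*)\]""")


def parse_route_line(line: str) -> Optional[Dict[str, str]]:
    """Return the HTTP method and path found in a decorator line, or None."""
    match = ROUTE_RE.search(line)
    if not match:
        return None
    verb = match.group("verb")
    if verb == "route":
        # Flask-style: read the first entry of methods=[...], defaulting to GET.
        listed = METHODS_RE.search(match.group("rest"))
        names = re.findall(r"\w+", listed.group("methods")) if listed else []
        method = names[0].upper() if names else "GET"
    else:
        # FastAPI-style: the decorator name is the HTTP method.
        method = verb.upper()
    return {"method": method, "path": match.group("path")}
```

Decorators that span several lines or build the path from a variable are not handled; for those, inspecting the framework's own route table at runtime is usually more reliable than text scanning.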
Troubleshooting Team Issues
Common Collaboration Problems
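Model Version Conflicts: Teams often run different Ollama and model versions across environments. Pin the Ollama image version so every environment uses the same runtime:
# docker-compose.yml
services:
  ollama:
    image: ollama/ollama:0.1.32 # Pin specific version
    environment:
      # Assumption: OLLAMA_MODEL_REGISTRY is read by your team's own tooling;
      # it is not a documented Ollama setting
      - OLLAMA_MODEL_REGISTRY=https://your-team-registry.com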
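Pinning the image fixes the runtime version, but a model tag can still point at different weights on different machines. One possible complement is a small lock file of model digests that each environment is checked against. The sketch below assumes a hypothetical lock format (model name mapped to digest) and assumes the installed digests are collected separately, for example from the ID column of `ollama list`.

```python
from typing import Dict, List


def find_version_mismatches(pinned: Dict[str, str], installed: Dict[str, str]) -> List[str]:
    """Compare pinned model digests with the digests found on this machine."""
    problems = []
    for model, expected in sorted(pinned.items()):
        actual = installed.get(model)
        if actual is None:
            problems.append(f"{model}: not installed")
        elif actual != expected:
            problems.append(f"{model}: expected {expected}, found {actual}")
    return problems


if __name__ == "__main__":
    # Placeholder digests for illustration only.
    pinned = {"llama2:13b": "aaa111", "mistral:7b": "bbb222"}
    installed = {"llama2:13b": "aaa111", "mistral:7b": "ccc333"}
    for problem in find_version_mismatches(pinned, installed):
        print(problem)
```

An empty result means the machine matches the lock file; a CI job could fail the build when the list is non-empty.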
Resource Contention: Multiple developers pulling large models simultaneously can overwhelm bandwidth. Set up a shared model cache on a host or network volume that every team member can reach:
# Set up shared model cache
sudo mkdir -p /opt/ollama-cache
sudo chown -R ollama-team:ollama-team /opt/ollama-cache
# Point Ollama at the cache (set this in the environment of the process that runs `ollama serve`)
export OLLAMA_MODELS=/opt/ollama-cache
Configuration Drift: Different temperature settings across environments cause inconsistent results. Use configuration validation:
import json
import sys
from typing import List

def validate_team_config(config_file: str) -> List[str]:
    """Validate team configuration against standards"""
    errors = []
    try:
        with open(config_file, 'r') as f:
            config = json.load(f)
    except (OSError, ValueError) as e:
        # Missing file, unreadable file, or invalid JSON
        return [f"Error reading config file: {e}"]
    if not isinstance(config, dict):
        return ["Config file must contain a JSON object"]
    # Check required fields
    required_fields = ['model', 'temperature', 'max_tokens']
    for field in required_fields:
        if field not in config:
            errors.append(f"Missing required field: {field}")
    # Validate temperature range (a team standard, not an Ollama limit)
    if 'temperature' in config:
        temp = config['temperature']
        if isinstance(temp, bool) or not isinstance(temp, (int, float)):
            errors.append(f"temperature must be a number, got {temp!r}")
        elif not 0 <= temp <= 1:
            errors.append(f"Temperature {temp} outside team range [0, 1]")
    # Validate max_tokens
    if 'max_tokens' in config:
        max_tokens = config['max_tokens']
        if isinstance(max_tokens, bool) or not isinstance(max_tokens, int):
            errors.append(f"max_tokens must be an integer, got {max_tokens!r}")
        elif max_tokens < 1 or max_tokens > 4096:
            errors.append(f"max_tokens {max_tokens} outside valid range [1, 4096]")
    return errors

# Usage in CI/CD
config_errors = validate_team_config("config/production.json")
if config_errors:
    print("Configuration validation failed:")
    for error in config_errors:
        print(f"  - {error}")
    sys.exit(1)
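Validation checks each file against a standard, but it does not show where two environments disagree with each other. A minimal sketch of a drift report follows, assuming each environment's configuration has already been loaded into a dictionary; the function name `find_config_drift` and the environment names are illustrative.

```python
from typing import Any, Dict, List


def find_config_drift(
    configs: Dict[str, Dict[str, Any]], keys: List[str]
) -> Dict[str, Dict[str, Any]]:
    """Return, for each key whose value differs between environments, the value per environment."""
    drift = {}
    for key in keys:
        values = {env: cfg.get(key) for env, cfg in configs.items()}
        # repr() lets unhashable values such as lists be compared as well.
        if len({repr(value) for value in values.values()}) > 1:
            drift[key] = values
    return drift


if __name__ == "__main__":
    configs = {
        "dev_a": {"model": "llama2:13b", "temperature": 0.7},
        "dev_b": {"model": "llama2:13b", "temperature": 0.2},
        "production": {"model": "llama2:13b", "temperature": 0.5},
    }
    print(find_config_drift(configs, ["model", "temperature"]))
```

A key that is missing in one environment is reported as `None` for that environment, so absent settings show up as drift too.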
Performance Optimization
Batch Processing for Teams: Group incoming requests and dispatch each group concurrently to improve throughput:
import asyncio
import time
from typing import List, Dict
import ollama

class OllamaBatchProcessor:
    def __init__(self, model: str, batch_size: int = 5):
        self.model = model
        self.batch_size = batch_size
        self.queue: List[Dict] = []
        self.processing = False
        self._worker = None  # Keep a reference so the task is not garbage collected

    async def add_request(self, prompt: str, user_id: str) -> str:
        """Add request to batch queue"""
        request = {
            "prompt": prompt,
            "user_id": user_id,
            "timestamp": time.time(),
            "future": asyncio.get_running_loop().create_future()
        }
        self.queue.append(request)
        # Start processing if not already running. Set the flag before the
        # task is scheduled so concurrent callers do not start extra workers.
        if not self.processing:
            self.processing = True
            self._worker = asyncio.create_task(self.process_batch())
        return await request["future"]

    async def process_batch(self):
        """Process requests in batches"""
        try:
            while self.queue:
                batch = self.queue[:self.batch_size]
                self.queue = self.queue[self.batch_size:]
                # Process batch concurrently
                tasks = []
                for request in batch:
                    task = asyncio.create_task(self._process_single_request(request))
                    tasks.append(task)
                await asyncio.gather(*tasks)
        finally:
            self.processing = False

    async def _process_single_request(self, request: Dict):
        """Process a single request"""
        try:
            # ollama.chat is blocking, so run it in a worker thread
            response = await asyncio.to_thread(
                ollama.chat,
                model=self.model,
                messages=[{"role": "user", "content": request["prompt"]}]
            )
            result = response["message"]["content"]
            request["future"].set_result(result)
        except Exception as e:
            request["future"].set_exception(e)

# Usage
processor = OllamaBatchProcessor("llama2:13b", batch_size=3)

# Multiple concurrent requests
async def example_usage():
    tasks = [
        processor.add_request("Explain Python decorators", "user1"),
        processor.add_request("Write a sorting algorithm", "user2"),
        processor.add_request("What is machine learning?", "user3")
    ]
    results = await asyncio.gather(*tasks)
    for i, result in enumerate(results):
        print(f"Response {i+1}: {result[:100]}...")

# Run example
asyncio.run(example_usage())
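If fixed-size groups are not required, a semaphore is a simpler way to cap how many requests are in flight at once. The sketch below is one possible alternative, not part of the processor above. `run_limited` and `fake_worker` are hypothetical names, and the worker is a stand-in for a real call such as `asyncio.to_thread(ollama.chat, ...)`.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_limited(
    prompts: List[str],
    worker: Callable[[str], Awaitable[str]],
    max_concurrent: int = 3,
) -> List[str]:
    """Run worker(prompt) for every prompt with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(prompt: str) -> str:
        # Each call waits here until one of the max_concurrent slots is free.
        async with semaphore:
            return await worker(prompt)

    # gather preserves the order of the input prompts in its results.
    return list(await asyncio.gather(*(guarded(p) for p in prompts)))


async def fake_worker(prompt: str) -> str:
    """Stand-in for a model call; replace with a real request in practice."""
    await asyncio.sleep(0.01)
    return prompt.upper()


if __name__ == "__main__":
    print(asyncio.run(run_limited(["a", "b", "c", "d"], fake_worker, max_concurrent=2)))
```

Unlike the batch loop, a new request starts as soon as any slot frees up, so one slow response does not hold back the rest of its group.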
Security and Access Control
Team Access Management
Implement role-based access control for Ollama resources:
import hmac
import json
import secrets
from datetime import datetime
from typing import Dict, Optional
from enum import Enum

class UserRole(Enum):
    ADMIN = "admin"
    DEVELOPER = "developer"
    VIEWER = "viewer"

class Permission(Enum):
    MODEL_PULL = "model_pull"
    MODEL_PUSH = "model_push"
    MODEL_DELETE = "model_delete"
    CONFIG_READ = "config_read"
    CONFIG_WRITE = "config_write"
    DEPLOY = "deploy"

class TeamAccessManager:
    def __init__(self, config_file: str = "team_access.json"):
        self.config_file = config_file
        self.users = self._load_users()
        self.role_permissions = {
            UserRole.ADMIN: [
                Permission.MODEL_PULL, Permission.MODEL_PUSH, Permission.MODEL_DELETE,
                Permission.CONFIG_READ, Permission.CONFIG_WRITE, Permission.DEPLOY
            ],
            UserRole.DEVELOPER: [
                Permission.MODEL_PULL, Permission.CONFIG_READ, Permission.CONFIG_WRITE
            ],
            UserRole.VIEWER: [
                Permission.MODEL_PULL, Permission.CONFIG_READ
            ]
        }

    def _load_users(self) -> Dict:
        """Load users from configuration file"""
        try:
            with open(self.config_file, 'r') as f:
                return json.load(f)
        except FileNotFoundError:
            return {}

    def _save_users(self):
        """Save users to configuration file"""
        with open(self.config_file, 'w') as f:
            json.dump(self.users, f, indent=2)

    def add_user(self, username: str, email: str, role: UserRole, api_key: Optional[str] = None):
        """Add a new user to the team"""
        if api_key is None:
            api_key = self._generate_api_key(username)
        self.users[username] = {
            "email": email,
            "role": role.value,
            "api_key": api_key,
            "created_at": datetime.now().isoformat(),
            "last_access": None
        }
        self._save_users()

    def _generate_api_key(self, username: str) -> str:
        """Generate a random API key for user"""
        # Use a cryptographically secure random token. A key derived from the
        # username and a timestamp is guessable. `username` is kept only so
        # the method signature stays the same.
        return secrets.token_hex(32)

    def authenticate_user(self, api_key: str) -> Optional[Dict]:
        """Authenticate user by API key"""
        for username, user_data in self.users.items():
            # Constant-time comparison avoids leaking key contents through timing
            if hmac.compare_digest(user_data["api_key"].encode(), api_key.encode()):
                # Update last access
                user_data["last_access"] = datetime.now().isoformat()
                self._save_users()
                return {
                    "username": username,
                    "role": UserRole(user_data["role"]),
                    "permissions": self.role_permissions[UserRole(user_data["role"])]
                }
        return None

    def check_permission(self, api_key: str, permission: Permission) -> bool:
        """Check if user has specific permission"""
        user = self.authenticate_user(api_key)
        if not user:
            return False
        return permission in user["permissions"]

    def get_user_stats(self) -> Dict:
        """Get team user statistics"""
        stats = {
            "total_users": len(self.users),
            "roles": {},
            "recent_activity": []
        }
        for username, user_data in self.users.items():
            role = user_data["role"]
            stats["roles"][role] = stats["roles"].get(role, 0) + 1
            if user_data["last_access"]:
                stats["recent_activity"].append({
                    "username": username,
                    "last_access": user_data["last_access"]
                })
        # Sort by recent activity
        stats["recent_activity"].sort(
            key=lambda x: x["last_access"],
            reverse=True
        )
        return stats

# Usage
access_manager = TeamAccessManager()
# Add team members
access_manager.add_user("alice", "alice@team.com", UserRole.ADMIN)
access_manager.add_user("bob", "bob@team.com", UserRole.DEVELOPER)
access_manager.add_user("charlie", "charlie@team.com", UserRole.VIEWER)
# Check permissions
api_key = access_manager.users["alice"]["api_key"]
can_deploy = access_manager.check_permission(api_key, Permission.DEPLOY)
print(f"Alice can deploy: {can_deploy}")
# Get team statistics
stats = access_manager.get_user_stats()
print(f"Team has {stats['total_users']} users")
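The access manager above writes API keys to `team_access.json` in plain text, so anyone who can read that file can act as any user. One common mitigation is to store only a hash of each key and hand the plaintext to the user once. The sketch below shows the idea in isolation; the function names and the in-memory `store` dictionary are illustrative assumptions, not part of the class above.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Optional


def issue_api_key(store: Dict[str, str], username: str) -> str:
    """Create a random key, keep only its SHA-256 digest, and return the plaintext once."""
    api_key = secrets.token_urlsafe(32)
    store[username] = hashlib.sha256(api_key.encode()).hexdigest()
    return api_key


def find_user(store: Dict[str, str], api_key: str) -> Optional[str]:
    """Return the username whose stored digest matches the presented key, if any."""
    digest = hashlib.sha256(api_key.encode()).hexdigest()
    for username, stored_digest in store.items():
        # Compare digests in constant time.
        if hmac.compare_digest(stored_digest, digest):
            return username
    return None


if __name__ == "__main__":
    store: Dict[str, str] = {}
    key = issue_api_key(store, "alice")
    print(find_user(store, key))          # the matching username
    print(find_user(store, "wrong-key"))  # no match
```

A plain SHA-256 digest is reasonable here because the keys are long random tokens; human-chosen passwords would need a slow, salted password hash instead.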
Conclusion
Team-based Ollama development transforms from chaotic model sharing to structured collaboration with the right tools and workflows. Shared model registries eliminate duplicate downloads, standardized configurations prevent environment drift, and automated deployment pipelines ensure consistent releases.
The key to successful Ollama team collaboration lies in treating AI models like code, with version control, testing, and deployment automation. Teams that adopt these practices can expect shorter development cycles, fewer production surprises, and easier knowledge sharing across team members.
Start with basic shared configurations and Docker environments, then gradually add monitoring, A/B testing, and automated deployment as your team grows. The investment in collaboration infrastructure pays dividends in reduced debugging time and improved model performance.
Your team's Ollama development workflow should scale with your projects. Adopt these collaboration tools incrementally, and keep quality and consistency across all environments as the priority.