Ollama Model Update Without Catastrophic Forgetting: Complete Continual Learning Guide

Learn how to update Ollama models while preserving existing knowledge. Step-by-step continual learning techniques prevent catastrophic forgetting effectively.

Your AI model just learned quantum physics perfectly. Then you teach it cooking recipes. Suddenly, it forgets how atoms work but remembers soufflé techniques flawlessly. Welcome to catastrophic forgetting—the AI equivalent of cramming for finals and forgetting everything from last semester.

Ollama model updates face this exact challenge. Traditional fine-tuning overwrites previous knowledge, creating models that excel at new tasks but fail at old ones. This guide shows you how to implement continual learning techniques that preserve existing knowledge while adding new capabilities.

You'll discover practical methods to update your Ollama models incrementally, maintain performance across all learned tasks, and avoid the dreaded knowledge wipeout that plagues most AI training approaches.

Understanding Catastrophic Forgetting in Ollama Models

What Causes Knowledge Loss During Updates

Catastrophic forgetting occurs when neural networks overwrite previously learned weights during new training sessions. Your Ollama model's parameters shift dramatically to accommodate new data, effectively erasing old knowledge patterns.

Key factors that trigger catastrophic forgetting:

  • Weight interference: New task gradients conflict with existing weight configurations
  • Distribution shift: Training data differs significantly from original dataset
  • Learning rate: High rates accelerate forgetting, low rates slow new learning
  • Network capacity: Limited parameters force competition between old and new knowledge
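
These interference effects are easy to reproduce at toy scale. The sketch below is not an Ollama model, just a single-weight regression trained on two conflicting tasks in sequence, but it shows the mechanism: fitting the second task drives the first task's loss back up.

```python
# Toy illustration of catastrophic forgetting: one weight, two tasks.
w = 0.0
xa, ya = 1.0, 2.0    # task A wants w = 2
xb, yb = 1.0, -2.0   # task B wants w = -2
lr = 0.1

for _ in range(100):                     # learn task A
    w -= lr * 2 * (w * xa - ya) * xa
loss_a_before = (w * xa - ya) ** 2       # near zero: task A mastered

for _ in range(100):                     # learn task B, no protection
    w -= lr * 2 * (w * xb - yb) * xb
loss_a_after = (w * xa - ya) ** 2        # large again: task A forgotten

print(loss_a_before, loss_a_after)
```

Every technique in the rest of this guide is a different way of stopping that second training loop from erasing what the first one learned.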

Measuring Forgetting Impact

Before implementing solutions, establish baseline metrics to track knowledge retention:

# Evaluation script for measuring catastrophic forgetting
import ollama
import json
from typing import Dict, List

def evaluate_task_performance(model_name: str, task_datasets: Dict[str, List]) -> Dict[str, float]:
    """
    Evaluate model performance across multiple tasks
    Returns accuracy scores for each task
    """
    results = {}
    
    for task_name, test_data in task_datasets.items():
        correct = 0
        total = len(test_data)
        
        for sample in test_data:
            response = ollama.generate(
                model=model_name,
                prompt=sample['prompt']
            )
            
            # Simple check: is the expected answer contained in the response?
            # Swap in task-specific scoring for production use.
            if sample['expected'].strip().lower() in response['response'].strip().lower():
                correct += 1
        
        accuracy = correct / total
        results[task_name] = accuracy
        print(f"{task_name} accuracy: {accuracy:.2%}")
    
    return results

def calculate_forgetting_metric(before_scores: Dict, after_scores: Dict) -> float:
    """
    Calculate average forgetting across all previous tasks
    """
    forgetting_scores = []
    
    for task in before_scores:
        if task in after_scores:
            forgetting = before_scores[task] - after_scores[task]
            forgetting_scores.append(max(0, forgetting))  # Only positive forgetting
    
    return sum(forgetting_scores) / len(forgetting_scores) if forgetting_scores else 0
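
As a quick sanity check on `calculate_forgetting_metric`, here is the same computation on hypothetical before/after scores (values invented for illustration):

```python
before = {"math": 0.90, "science": 0.80}
after = {"math": 0.60, "science": 0.85}   # math regressed, science improved

# Only positive drops count as forgetting; improvements are clamped to zero
drops = [max(0.0, before[t] - after[t]) for t in before if t in after]
avg_forgetting = sum(drops) / len(drops)
print(f"Average forgetting: {avg_forgetting:.2f}")  # Average forgetting: 0.15
```

A score of 0.15 means the model lost fifteen accuracy points, on average, across previously learned tasks.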

Continual Learning Strategies for Ollama

Elastic Weight Consolidation (EWC) Implementation

EWC prevents catastrophic forgetting by identifying important weights for previous tasks and penalizing changes to those parameters during new training.

# EWC implementation for Ollama model updates
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

class EWC:
    def __init__(self, model, dataset, sample_size=200):
        self.model = model
        self.dataset = dataset
        self.sample_size = sample_size
        self.params = {n: p.clone().detach() for n, p in model.named_parameters() if p.requires_grad}
        self.precision_matrices = self._calculate_importance()
    
    def _calculate_importance(self):
        """
        Calculate Fisher Information Matrix to identify important weights
        """
        precision_matrices = {}
        
        # Sample subset of data for efficiency
        data_loader = DataLoader(self.dataset, batch_size=1, shuffle=True)
        
        for n, p in self.model.named_parameters():
            if p.requires_grad:
                precision_matrices[n] = torch.zeros_like(p)
        
        self.model.eval()
        sample_count = 0
        
        for input_data, target in data_loader:
            if sample_count >= self.sample_size:
                break
                
            self.model.zero_grad()
            output = self.model(input_data)
            loss = F.cross_entropy(output, target)
            loss.backward()
            
            for n, p in self.model.named_parameters():
                if p.requires_grad and p.grad is not None:
                    precision_matrices[n] += p.grad.data ** 2
            
            sample_count += 1
        
        # Normalize by sample count
        for n in precision_matrices:
            precision_matrices[n] /= sample_count
            
        return precision_matrices
    
    def penalty(self):
        """
        Calculate EWC penalty for current model state
        """
        loss = 0
        for n, p in self.model.named_parameters():
            if p.requires_grad and n in self.precision_matrices:
                loss += (self.precision_matrices[n] * 
                        (p - self.params[n]) ** 2).sum()
        return loss

# Training loop with EWC
def train_with_ewc(model, new_dataset, previous_dataset, ewc_lambda=400, epochs=10):
    """
    Train model on new task while preserving old knowledge
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    data_loader = DataLoader(new_dataset, batch_size=32, shuffle=True)
    
    # Estimate weight importance from the previous task's data
    ewc = EWC(model, previous_dataset)
    
    model.train()
    for epoch in range(epochs):
        total_loss = 0
        
        for batch_data, batch_targets in data_loader:
            optimizer.zero_grad()
            
            # Forward pass
            outputs = model(batch_data)
            task_loss = F.cross_entropy(outputs, batch_targets)
            
            # Add EWC penalty
            ewc_loss = ewc.penalty()
            total_loss_batch = task_loss + ewc_lambda * ewc_loss
            
            # Backward pass
            total_loss_batch.backward()
            optimizer.step()
            
            total_loss += total_loss_batch.item()
        
        print(f"Epoch {epoch+1}: Loss = {total_loss/len(data_loader):.4f}")
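
To make the EWC penalty concrete, here it is computed by hand for a two-weight model (all numbers invented for illustration). Weights the Fisher estimate marks as important are expensive to move; unimportant ones drift almost for free.

```python
theta_star = [1.0, -2.0]   # weights frozen after the old task
fisher = [4.0, 0.1]        # estimated importance of each weight
theta = [1.5, 0.0]         # current weights, partway through the new task

# EWC penalty: importance-weighted squared drift, summed over parameters
penalty = sum(f * (t - s) ** 2
              for f, t, s in zip(fisher, theta, theta_star))
print(penalty)  # 4.0*0.25 + 0.1*4.0, roughly 1.4
```

Scaled by `ewc_lambda`, this is the term added to the task loss in the training loop above.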

Progressive Neural Networks for Ollama

Progressive networks add new neural pathways for each task while preserving existing ones through lateral connections.

# Progressive network architecture for continual learning
import torch.nn as nn

class ProgressiveColumn(nn.Module):
    def __init__(self, input_size, hidden_sizes, num_previous_columns=0):
        super().__init__()
        self.num_previous_columns = num_previous_columns
        
        # Main pathway for current task
        layers = []
        prev_size = input_size
        
        for hidden_size in hidden_sizes:
            layers.append(nn.Linear(prev_size, hidden_size))
            layers.append(nn.ReLU())
            prev_size = hidden_size
            
        self.main_pathway = nn.Sequential(*layers)
        
        # Lateral connections from previous columns
        self.lateral_connections = nn.ModuleList()
        for i in range(num_previous_columns):
            lateral_layers = []
            for hidden_size in hidden_sizes:
                lateral_layers.append(nn.Linear(hidden_size, hidden_size))
            self.lateral_connections.append(nn.ModuleList(lateral_layers))
    
    def forward(self, x, previous_activations=None):
        """
        Forward pass with lateral connections from previous columns
        """
        activations = []
        current_input = x
        
        # Process through main pathway
        for i, layer in enumerate(self.main_pathway):
            if isinstance(layer, nn.Linear):
                # Apply main transformation
                current_input = layer(current_input)
                
                # Add lateral connections from previous columns
                if previous_activations and i//2 < len(previous_activations[0]):
                    for col_idx, prev_acts in enumerate(previous_activations):
                        if col_idx < len(self.lateral_connections):
                            lateral_contrib = self.lateral_connections[col_idx][i//2](
                                prev_acts[i//2]
                            )
                            current_input += lateral_contrib
                
                activations.append(current_input)
            else:
                current_input = layer(current_input)
                
        return current_input, activations

class ProgressiveNetwork(nn.Module):
    def __init__(self, input_size, hidden_sizes, num_classes_per_task):
        super().__init__()
        self.input_size = input_size
        self.hidden_sizes = hidden_sizes
        self.columns = nn.ModuleList()
        self.classifiers = nn.ModuleList()
        self.num_classes_per_task = num_classes_per_task
        
    def add_task(self, num_classes):
        """
        Add new column for new task
        """
        num_previous = len(self.columns)
        
        # Add new column with lateral connections
        new_column = ProgressiveColumn(
            input_size=self.input_size,
            hidden_sizes=self.hidden_sizes,
            num_previous_columns=num_previous
        )
        self.columns.append(new_column)
        
        # Add classifier for new task, sized to the last hidden layer
        classifier = nn.Linear(self.hidden_sizes[-1], num_classes)
        self.classifiers.append(classifier)
        
        # Freeze previous columns
        for i in range(num_previous):
            for param in self.columns[i].parameters():
                param.requires_grad = False
            for param in self.classifiers[i].parameters():
                param.requires_grad = False
    
    def forward(self, x, task_id):
        """
        Forward pass for specific task
        """
        if task_id >= len(self.columns):
            raise ValueError(f"Task {task_id} not trained yet")
            
        all_activations = []
        
        # Get activations from all previous columns
        for col_idx in range(task_id + 1):
            if col_idx == 0:
                output, activations = self.columns[col_idx](x)
            else:
                output, activations = self.columns[col_idx](x, all_activations)
            all_activations.append(activations)
        
        # Use appropriate classifier
        logits = self.classifiers[task_id](output)
        return logits

Memory-Based Approaches for Knowledge Retention

Experience Replay Implementation

Store representative samples from previous tasks and replay them during new training to maintain performance.

# Experience replay buffer for continual learning
import random
from collections import defaultdict
import pickle

class ExperienceReplayBuffer:
    def __init__(self, capacity_per_task=1000, selection_strategy="random"):
        self.capacity_per_task = capacity_per_task
        self.selection_strategy = selection_strategy
        self.memory = defaultdict(list)
        self.task_counters = defaultdict(int)
    
    def add_samples(self, task_id, samples):
        """
        Add samples to memory buffer for specific task
        """
        if len(self.memory[task_id]) < self.capacity_per_task:
            # Buffer not full, add samples directly
            self.memory[task_id].extend(samples)
        else:
            # Buffer full, replace samples based on strategy
            if self.selection_strategy == "random":
                self._random_replacement(task_id, samples)
            elif self.selection_strategy == "ring":
                self._ring_buffer_replacement(task_id, samples)
    
    def _random_replacement(self, task_id, new_samples):
        """
        Randomly replace samples in buffer
        """
        for sample in new_samples:
            if len(self.memory[task_id]) >= self.capacity_per_task:
                # Replace random sample
                replace_idx = random.randint(0, len(self.memory[task_id]) - 1)
                self.memory[task_id][replace_idx] = sample
            else:
                self.memory[task_id].append(sample)
    
    def _ring_buffer_replacement(self, task_id, new_samples):
        """
        Ring buffer replacement (FIFO)
        """
        for sample in new_samples:
            if len(self.memory[task_id]) >= self.capacity_per_task:
                # Remove oldest sample
                self.memory[task_id].pop(0)
            self.memory[task_id].append(sample)
    
    def sample_batch(self, task_ids, batch_size_per_task):
        """
        Sample batch from specified tasks
        """
        batch_samples = []
        
        for task_id in task_ids:
            if task_id in self.memory and self.memory[task_id]:
                # Sample from this task
                task_samples = random.sample(
                    self.memory[task_id],
                    min(batch_size_per_task, len(self.memory[task_id]))
                )
                batch_samples.extend(task_samples)
        
        return batch_samples
    
    def get_all_samples(self, task_id):
        """
        Get all samples for specific task
        """
        return self.memory.get(task_id, [])
    
    def save_buffer(self, filepath):
        """
        Save replay buffer to disk
        """
        with open(filepath, 'wb') as f:
            pickle.dump(dict(self.memory), f)
    
    def load_buffer(self, filepath):
        """
        Load replay buffer from disk
        """
        with open(filepath, 'rb') as f:
            self.memory = defaultdict(list, pickle.load(f))

# Training with experience replay
def train_with_replay(model, new_task_data, replay_buffer, task_id, epochs=10):
    """
    Train model with experience replay
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    
    # Add new task data to replay buffer
    replay_buffer.add_samples(task_id, new_task_data)
    
    model.train()
    for epoch in range(epochs):
        total_loss = 0
        
        # Create mixed batches with new and old data
        new_batch = random.sample(new_task_data, min(16, len(new_task_data)))
        
        # Sample from previous tasks
        previous_tasks = list(range(task_id))
        replay_batch = replay_buffer.sample_batch(previous_tasks, batch_size_per_task=4)
        
        # Combine batches
        combined_batch = new_batch + replay_batch
        random.shuffle(combined_batch)
        
        if combined_batch:
            optimizer.zero_grad()
            
            # Process batch
            batch_loss = 0
            for sample in combined_batch:
                input_data, target = sample['input'], sample['target']
                output = model(input_data)
                loss = F.cross_entropy(output.unsqueeze(0), target.unsqueeze(0))
                batch_loss += loss
            
            # Average loss over batch
            batch_loss /= len(combined_batch)
            batch_loss.backward()
            optimizer.step()
            
            total_loss += batch_loss.item()
        
        print(f"Epoch {epoch+1}: Loss = {total_loss:.4f}")
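
The mixed-batch construction above amounts to training at a fixed replay ratio. A standalone sketch of the same idea with placeholder samples (sizes match one new task of 16 samples plus 4 replayed samples from a single previous task):

```python
import random

random.seed(0)                                   # reproducible shuffle
new_task = [("new", i) for i in range(16)]       # new-task samples
replay = [("old", i) for i in range(4)]          # replayed old-task samples

combined = new_task + replay
random.shuffle(combined)                         # interleave old and new

old_fraction = sum(1 for tag, _ in combined if tag == "old") / len(combined)
print(old_fraction)  # 0.2
```

Roughly one gradient signal in five now defends old knowledge; tune the per-task replay size to trade plasticity against stability.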

Implementing Continual Learning in Ollama

Setting Up Your Development Environment

# Install required dependencies
pip install ollama torch transformers datasets
pip install numpy matplotlib seaborn

# Pull base Ollama model
ollama pull llama2:7b

# Create project structure
mkdir ollama-continual-learning
cd ollama-continual-learning
mkdir data models scripts results

Custom Modelfile for Continual Learning

# Modelfile for continual learning setup
FROM llama2:7b

# Set custom parameters for incremental learning
PARAMETER temperature 0.1
PARAMETER top_p 0.9
PARAMETER top_k 40
PARAMETER repeat_penalty 1.1

# Custom system prompt for task identification
SYSTEM """You are an AI assistant capable of learning new tasks while retaining previous knowledge. 
When responding, consider both your original training and any new specialized knowledge you've acquired.
If uncertain about which knowledge domain to use, explicitly state your reasoning process."""

# Template for structured learning. Note: Ollama templates expose .System,
# .Prompt, and .Response; embed any task context in the prompt itself.
TEMPLATE """{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>
"""
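
With the Modelfile saved, build and verify the model using Ollama's standard CLI (`continual-llama` is just an example name):

```shell
# Build a named model from the Modelfile in the current directory
ollama create continual-llama -f Modelfile

# Confirm it shows up alongside the base model
ollama list
```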

Automated Model Update Pipeline

# Automated pipeline for continual learning updates
import ollama
import json
import logging
import random
from datetime import datetime
from pathlib import Path

class OllamaContinualLearner:
    def __init__(self, base_model="llama2:7b", work_dir="./ollama-cl"):
        self.base_model = base_model
        self.work_dir = Path(work_dir)
        self.work_dir.mkdir(exist_ok=True)
        
        # Setup logging
        logging.basicConfig(
            filename=self.work_dir / "training.log",
            level=logging.INFO,
            format='%(asctime)s - %(levelname)s - %(message)s'
        )
        self.logger = logging.getLogger(__name__)
        
        # Track model versions and tasks
        self.model_versions = {}
        self.task_history = []
        
    def create_task_dataset(self, task_name, examples, format_type="qa"):
        """
        Create formatted dataset for specific task
        """
        dataset_path = self.work_dir / f"{task_name}_dataset.jsonl"
        
        with open(dataset_path, 'w') as f:
            for example in examples:
                if format_type == "qa":
                    formatted = {
                        "prompt": f"Task: {task_name}\nQ: {example['question']}\nA:",
                        "completion": example['answer'],
                        "task_id": task_name
                    }
                elif format_type == "instruction":
                    formatted = {
                        "prompt": f"### Instruction:\n{example['instruction']}\n### Response:",
                        "completion": example['response'],
                        "task_id": task_name
                    }
                
                f.write(json.dumps(formatted) + '\n')
        
        self.logger.info(f"Created dataset for {task_name} with {len(examples)} examples")
        return dataset_path
    
    def evaluate_model(self, model_name, test_datasets):
        """
        Evaluate model across multiple tasks
        """
        results = {}
        
        for task_name, test_data in test_datasets.items():
            correct = 0
            total = len(test_data)
            task_results = []
            
            for item in test_data:
                try:
                    response = ollama.generate(
                        model=model_name,
                        prompt=item['prompt'],
                        options={'temperature': 0.1}
                    )
                    
                    # Simple accuracy check (customize based on task)
                    predicted = response['response'].strip().lower()
                    expected = item['expected'].strip().lower()
                    
                    is_correct = self._evaluate_response(predicted, expected, task_name)
                    if is_correct:
                        correct += 1
                    
                    task_results.append({
                        'prompt': item['prompt'],
                        'predicted': predicted,
                        'expected': expected,
                        'correct': is_correct
                    })
                    
                except Exception as e:
                    self.logger.error(f"Error evaluating {task_name}: {e}")
            
            accuracy = correct / total if total > 0 else 0
            results[task_name] = {
                'accuracy': accuracy,
                'correct': correct,
                'total': total,
                'details': task_results
            }
            
            self.logger.info(f"{task_name} accuracy: {accuracy:.2%}")
        
        return results
    
    def _evaluate_response(self, predicted, expected, task_name):
        """
        Task-specific response evaluation
        """
        if task_name == "math":
            # Extract numerical answer
            import re
            pred_nums = re.findall(r'-?\d+\.?\d*', predicted)
            exp_nums = re.findall(r'-?\d+\.?\d*', expected)
            return bool(pred_nums and exp_nums and float(pred_nums[0]) == float(exp_nums[0]))
        
        elif task_name == "classification":
            # Exact match for categories
            return predicted == expected
        
        else:
            # Default: substring match
            return expected in predicted or predicted in expected
    
    def incremental_update(self, task_name, training_data, method="replay", **kwargs):
        """
        Perform incremental model update
        """
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        new_model_name = f"{self.base_model}-{task_name}-{timestamp}"
        
        # Create training dataset
        dataset_path = self.create_task_dataset(task_name, training_data)
        
        # Pre-training evaluation
        if self.task_history:
            previous_model = self.model_versions[self.task_history[-1]]
            pre_results = self.evaluate_model(previous_model, self._get_test_datasets())
        else:
            previous_model = self.base_model
            pre_results = {}
        
        # Apply continual learning method
        if method == "replay":
            self._train_with_replay(previous_model, new_model_name, dataset_path, **kwargs)
        elif method == "ewc":
            self._train_with_ewc(previous_model, new_model_name, dataset_path, **kwargs)
        elif method == "progressive":
            self._train_progressive(previous_model, new_model_name, dataset_path, **kwargs)
        
        # Post-training evaluation
        post_results = self.evaluate_model(new_model_name, self._get_test_datasets())
        
        # Calculate forgetting metrics
        forgetting_score = self._calculate_forgetting(pre_results, post_results)
        
        # Update tracking
        self.model_versions[task_name] = new_model_name
        self.task_history.append(task_name)
        
        # Save results
        results = {
            'task_name': task_name,
            'model_name': new_model_name,
            'timestamp': timestamp,
            'method': method,
            'pre_results': pre_results,
            'post_results': post_results,
            'forgetting_score': forgetting_score,
            'parameters': kwargs
        }
        
        results_path = self.work_dir / f"results_{task_name}_{timestamp}.json"
        with open(results_path, 'w') as f:
            json.dump(results, f, indent=2)
        
        self.logger.info(f"Completed incremental update for {task_name}")
        self.logger.info(f"Forgetting score: {forgetting_score:.3f}")
        
        return results
    
    def _train_with_replay(self, base_model, new_model, dataset_path, replay_ratio=0.3):
        """
        Train with experience replay
        """
        # Ollama has no native fine-tuning API; the Modelfile below is
        # illustrative. In practice, fine-tune with an external framework
        # (e.g. LoRA via PEFT), then load the resulting adapter via ADAPTER.
        modelfile_content = f"""
FROM {base_model}

# Add replay samples from previous tasks
{self._generate_replay_samples(replay_ratio)}

# Fine-tune on new task
ADAPTER {dataset_path}

PARAMETER learning_rate 0.0001
PARAMETER epochs 3
PARAMETER batch_size 4
"""
        
        # Create and train model
        with open(self.work_dir / "Modelfile", 'w') as f:
            f.write(modelfile_content)
        
        # Note: Actual implementation would use Ollama's training API
        self.logger.info(f"Training {new_model} with replay method")
    
    def _generate_replay_samples(self, ratio):
        """
        Generate replay samples from previous tasks
        """
        replay_samples = []
        for task in self.task_history:
            task_file = self.work_dir / f"{task}_dataset.jsonl"
            if task_file.exists():
                with open(task_file) as f:
                    task_data = [json.loads(line) for line in f]
                    sample_size = int(len(task_data) * ratio)
                    replay_samples.extend(random.sample(task_data, sample_size))
        
        return "\n".join([f"# Replay: {s['prompt']} -> {s['completion']}" 
                         for s in replay_samples])
    
    def _get_test_datasets(self):
        """
        Load test datasets for all learned tasks
        """
        test_datasets = {}
        for task in self.task_history:
            test_file = self.work_dir / f"{task}_test.jsonl"
            if test_file.exists():
                with open(test_file) as f:
                    test_datasets[task] = [json.loads(line) for line in f]
        return test_datasets
    
    def _calculate_forgetting(self, pre_results, post_results):
        """
        Calculate average forgetting across previous tasks
        """
        if not pre_results:
            return 0.0
        
        forgetting_scores = []
        for task in pre_results:
            if task in post_results:
                pre_acc = pre_results[task]['accuracy']
                post_acc = post_results[task]['accuracy']
                forgetting = max(0, pre_acc - post_acc)
                forgetting_scores.append(forgetting)
        
        return sum(forgetting_scores) / len(forgetting_scores) if forgetting_scores else 0.0

# Usage example
def main():
    learner = OllamaContinualLearner()
    
    # Task 1: Math problems
    math_examples = [
        {'question': 'What is 15 + 27?', 'answer': '42'},
        {'question': 'Calculate 8 * 9', 'answer': '72'},
        # Add more examples...
    ]
    
    # Task 2: Science facts
    science_examples = [
        {'question': 'What is the chemical symbol for gold?', 'answer': 'Au'},
        {'question': 'How many planets are in our solar system?', 'answer': '8'},
        # Add more examples...
    ]
    
    # Sequential learning
    learner.incremental_update("math", math_examples, method="replay")
    learner.incremental_update("science", science_examples, method="replay")

if __name__ == "__main__":
    main()

Advanced Techniques and Optimization

Meta-Learning for Faster Adaptation

Implement Model-Agnostic Meta-Learning (MAML) to enable rapid adaptation to new tasks with minimal forgetting.

# MAML implementation for Ollama continual learning
import torch
import torch.nn as nn
from torch.optim import Adam
import copy

class MAMLLearner:
    def __init__(self, model, meta_lr=0.001, inner_lr=0.01, inner_steps=5):
        self.model = model
        self.meta_lr = meta_lr
        self.inner_lr = inner_lr
        self.inner_steps = inner_steps
        self.meta_optimizer = Adam(model.parameters(), lr=meta_lr)
    
    def inner_loop(self, support_data, model_copy):
        """
        Perform inner loop adaptation on support set
        """
        inner_optimizer = torch.optim.SGD(model_copy.parameters(), lr=self.inner_lr)
        
        for step in range(self.inner_steps):
            inner_optimizer.zero_grad()
            
            # Calculate loss on support set
            support_loss = 0
            for sample in support_data:
                input_data, target = sample['input'], sample['target']
                output = model_copy(input_data)
                loss = F.cross_entropy(output, target)
                support_loss += loss
            
            support_loss /= len(support_data)
            support_loss.backward()
            inner_optimizer.step()
        
        return model_copy
    
    def meta_update(self, task_batch):
        """
        Perform meta-update across batch of tasks
        """
        self.meta_optimizer.zero_grad()
        meta_loss = 0
        
        for task_data in task_batch:
            support_set = task_data['support']
            query_set = task_data['query']
            
            # Create copy of model for inner loop
            model_copy = copy.deepcopy(self.model)
            
            # Adapt on support set
            adapted_model = self.inner_loop(support_set, model_copy)
            
            # Evaluate on query set
            query_loss = 0
            for sample in query_set:
                input_data, target = sample['input'], sample['target']
                output = adapted_model(input_data)
                loss = F.cross_entropy(output, target)
                query_loss += loss
            
            query_loss /= len(query_set)
            meta_loss += query_loss
        
        meta_loss /= len(task_batch)
        meta_loss.backward()
        self.meta_optimizer.step()
        
        return meta_loss.item()

# Integration with Ollama pipeline
def meta_learning_pipeline(learner, task_distributions):
    """
    Apply meta-learning to improve continual learning
    """
    maml = MAMLLearner(learner.model)
    
    # Meta-training phase
    for episode in range(100):  # Meta-training episodes
        task_batch = []
        
        # Sample tasks from distribution
        for _ in range(4):  # Batch size
            task_name = random.choice(list(task_distributions.keys()))
            task_data = task_distributions[task_name]
            
            # Split into support and query sets
            random.shuffle(task_data)
            split_point = len(task_data) // 2
            
            task_batch.append({
                'support': task_data[:split_point],
                'query': task_data[split_point:],
                'task_name': task_name
            })
        
        # Meta-update
        meta_loss = maml.meta_update(task_batch)
        
        if episode % 10 == 0:
            print(f"Meta-training episode {episode}: Loss = {meta_loss:.4f}")
    
    return maml
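
The support/query split inside the episode loop is just a shuffled half-and-half partition. In isolation, with integers standing in for samples:

```python
import random

random.seed(1)
task_data = list(range(10))     # placeholder samples for one task

random.shuffle(task_data)
split_point = len(task_data) // 2
support, query = task_data[:split_point], task_data[split_point:]

# Disjoint halves that together cover the task data
print(len(support), len(query))  # 5 5
```

The inner loop adapts on `support`; the meta-update is driven by the adapted model's loss on `query`.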

Dynamic Architecture Expansion

Automatically expand model capacity when performance degrades due to task interference.

# Dynamic architecture expansion for continual learning
class DynamicExpansion:
    def __init__(self, base_model, expansion_threshold=0.1, max_expansions=5):
        self.base_model = base_model
        self.expansion_threshold = expansion_threshold
        self.max_expansions = max_expansions
        self.expansion_count = 0
        self.performance_history = []
    
    def should_expand(self, current_performance, baseline_performance):
        """
        Determine if architecture should be expanded
        """
        if not baseline_performance:
            return False
        
        # Calculate performance drop
        avg_drop = 0
        for task in baseline_performance:
            if task in current_performance:
                drop = baseline_performance[task]['accuracy'] - current_performance[task]['accuracy']
                avg_drop += max(0, drop)
        
        avg_drop /= len(baseline_performance)
        
        return (avg_drop > self.expansion_threshold and 
                self.expansion_count < self.max_expansions)
    
    def expand_architecture(self, model_name):
        """
        Add capacity to existing model.
        Note: an Ollama Modelfile cannot change a model's architecture;
        the config below illustrates the parameters you would scale in a
        framework that supports true expansion.
        """
        expansion_config = f"""
FROM {model_name}

# Add expansion layers
PARAMETER num_heads {8 + self.expansion_count * 2}
PARAMETER hidden_size {512 + self.expansion_count * 128}
PARAMETER intermediate_size {2048 + self.expansion_count * 512}

# Regularization for new parameters
PARAMETER dropout_rate 0.1
PARAMETER weight_decay 0.01
"""
        
        expanded_model_name = f"{model_name}-expanded-{self.expansion_count}"
        
        # Create expanded model (placeholder for actual implementation)
        self.expansion_count += 1
        
        return expanded_model_name
    
    def adaptive_learning(self, learner, new_task_data, task_name):
        """
        Adaptively expand architecture if needed
        """
        # Baseline performance before new task
        baseline = learner.evaluate_model(
            learner.model_versions.get(learner.task_history[-1], learner.base_model),
            learner._get_test_datasets()
        ) if learner.task_history else {}
        
        # Train on new task
        results = learner.incremental_update(task_name, new_task_data)
        
        # Check if expansion needed
        if self.should_expand(results['post_results'], baseline):
            print(f"Expanding architecture due to performance drop...")
            
            # Expand, then retrain; a complete implementation would base
            # the retraining on expanded_model rather than the original learner
            expanded_model = self.expand_architecture(results['model_name'])
            
            # Retrain with expanded capacity
            expanded_results = learner.incremental_update(
                f"{task_name}-expanded", 
                new_task_data,
                method="progressive"
            )
            
            return expanded_results
        
        return results
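To make the expansion trigger concrete, here is a standalone sketch of the same average-drop check used by `should_expand`, with toy accuracy numbers and hypothetical task names:

```python
def avg_accuracy_drop(baseline, current):
    """Mean positive accuracy drop across baseline tasks (mirrors should_expand)."""
    drop = sum(max(0.0, baseline[t]['accuracy'] - current[t]['accuracy'])
               for t in baseline if t in current)
    return drop / len(baseline)

baseline = {'physics': {'accuracy': 0.90}, 'cooking': {'accuracy': 0.80}}
current = {'physics': {'accuracy': 0.75}, 'cooking': {'accuracy': 0.82}}

drop = avg_accuracy_drop(baseline, current)
print(round(drop, 3))   # 0.075 -- below the default 0.1 threshold, so no expansion yet
```

Note that only drops count: the small improvement on the cooking task does not offset the physics regression, which is the conservative behavior you want when guarding against forgetting.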

Monitoring and Evaluation Framework

Comprehensive Metrics Dashboard

# Monitoring dashboard for continual learning
import json
from pathlib import Path
from typing import Dict, List

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

class ContinualLearningMonitor:
    def __init__(self, results_dir="./results"):
        self.results_dir = Path(results_dir)
        self.results_dir.mkdir(exist_ok=True)
        
    def load_experiment_results(self, experiment_files: List[str]) -> pd.DataFrame:
        """
        Load and combine results from multiple experiments
        """
        all_results = []
        
        for file_path in experiment_files:
            file_path = Path(file_path)  # accept str or Path; .stem below needs Path
            with open(file_path) as f:
                result = json.load(f)
                
                # Flatten task results
                for task_name, task_result in result['post_results'].items():
                    all_results.append({
                        'experiment': file_path.stem,
                        'task_name': task_name,
                        'accuracy': task_result['accuracy'],
                        'timestamp': result['timestamp'],
                        'method': result['method'],
                        'forgetting_score': result['forgetting_score'],
                        'task_order': len(result['pre_results']) + 1
                    })
        
        return pd.DataFrame(all_results)
    
    def plot_learning_curve(self, df: pd.DataFrame, save_path: str = None):
        """
        Plot learning curves for different methods
        """
        plt.figure(figsize=(12, 8))
        
        # Plot by method
        for method in df['method'].unique():
            method_data = df[df['method'] == method]
            
            # Average accuracy over tasks (std is NaN for a single run)
            avg_acc = method_data.groupby('task_order')['accuracy'].mean()
            std_acc = method_data.groupby('task_order')['accuracy'].std().fillna(0)
            
            plt.plot(avg_acc.index, avg_acc.values, marker='o', label=f'{method.title()}')
            plt.fill_between(avg_acc.index, 
                           avg_acc.values - std_acc.values,
                           avg_acc.values + std_acc.values, 
                           alpha=0.3)
        
        plt.xlabel('Task Order')
        plt.ylabel('Average Accuracy')
        plt.title('Continual Learning Performance')
        plt.legend()
        plt.grid(True, alpha=0.3)
        
        if save_path:
            plt.savefig(save_path, dpi=300, bbox_inches='tight')
        plt.show()
    
    def plot_forgetting_analysis(self, df: pd.DataFrame, save_path: str = None):
        """
        Analyze and visualize catastrophic forgetting
        """
        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
        
        # Forgetting by method
        forgetting_by_method = df.groupby(['method', 'task_order'])['forgetting_score'].mean().reset_index()
        
        for method in forgetting_by_method['method'].unique():
            method_data = forgetting_by_method[forgetting_by_method['method'] == method]
            ax1.plot(method_data['task_order'], method_data['forgetting_score'], 
                    marker='s', label=f'{method.title()}')
        
        ax1.set_xlabel('Task Order')
        ax1.set_ylabel('Forgetting Score')
        ax1.set_title('Catastrophic Forgetting by Method')
        ax1.legend()
        ax1.grid(True, alpha=0.3)
        
        # Forgetting heatmap by task
        pivot_data = df.pivot_table(values='forgetting_score', 
                                   index='task_name', 
                                   columns='method', 
                                   aggfunc='mean')
        
        sns.heatmap(pivot_data, annot=True, cmap='Reds', ax=ax2)
        ax2.set_title('Forgetting Score Heatmap')
        
        if save_path:
            plt.savefig(save_path, dpi=300, bbox_inches='tight')
        plt.show()
    
    def generate_performance_report(self, df: pd.DataFrame) -> Dict:
        """
        Generate comprehensive performance report
        """
        report = {}
        
        # Overall statistics
        report['overall'] = {
            'total_experiments': len(df['experiment'].unique()),
            'total_tasks': len(df['task_name'].unique()),
            'methods_compared': list(df['method'].unique()),
            'avg_accuracy': df['accuracy'].mean(),
            'avg_forgetting': df['forgetting_score'].mean()
        }
        
        # Method comparison
        method_stats = df.groupby('method').agg({
            'accuracy': ['mean', 'std'],
            'forgetting_score': ['mean', 'std']
        }).round(3)
        
        report['method_comparison'] = method_stats.to_dict()
        
        # Task difficulty analysis
        task_difficulty = df.groupby('task_name')['accuracy'].mean().sort_values()
        report['task_difficulty'] = {
            'easiest_tasks': task_difficulty.tail(3).to_dict(),
            'hardest_tasks': task_difficulty.head(3).to_dict()
        }
        
        # Best performing configurations
        best_configs = df.loc[df.groupby(['method', 'task_name'])['accuracy'].idxmax()]
        report['best_configurations'] = best_configs[['method', 'task_name', 'accuracy']].to_dict('records')
        
        return report
    
    def export_results(self, df: pd.DataFrame, report: Dict, export_path: str):
        """
        Export results and visualizations
        """
        export_dir = Path(export_path)
        export_dir.mkdir(exist_ok=True)
        
        # Save data
        df.to_csv(export_dir / 'results_data.csv', index=False)
        
        # Save report
        with open(export_dir / 'performance_report.json', 'w') as f:
            json.dump(report, f, indent=2)
        
        # Generate plots
        self.plot_learning_curve(df, export_dir / 'learning_curves.png')
        self.plot_forgetting_analysis(df, export_dir / 'forgetting_analysis.png')
        
        print(f"Results exported to {export_dir}")

# Usage example
def monitor_experiments():
    monitor = ContinualLearningMonitor()
    
    # Load experimental results
    experiment_files = list(Path("./ollama-cl").glob("results_*.json"))
    df = monitor.load_experiment_results(experiment_files)
    
    # Generate report
    report = monitor.generate_performance_report(df)
    
    # Export everything
    monitor.export_results(df, report, "./experiment_analysis")
    
    return report

Best Practices and Troubleshooting

Common Issues and Solutions

Problem: Rapid forgetting despite using EWC

  • Solution: Increase EWC lambda parameter or use larger Fisher Information sample size
  • Code fix: Set ewc_lambda=1000 and sample_size=500 in EWC initialization

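The lambda term scales a quadratic penalty that anchors each parameter to its previously learned value, weighted by its Fisher information. A minimal standalone sketch of that penalty, with plain Python lists standing in for parameter tensors:

```python
def ewc_penalty(params, old_params, fisher, ewc_lambda=1000.0):
    """EWC regularizer: lambda * sum_i F_i * (theta_i - theta*_i)^2."""
    return ewc_lambda * sum(
        f * (p - op) ** 2 for p, op, f in zip(params, old_params, fisher)
    )

# A small drift on a high-Fisher (important) weight dominates the penalty
print(round(ewc_penalty([1.0, 2.0], [0.9, 2.0], fisher=[5.0, 0.1]), 3))  # 50.0
```

Raising `ewc_lambda` makes every drift more expensive; a larger Fisher sample size sharpens the per-parameter weights so the penalty concentrates on the weights that actually mattered for old tasks.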
Problem: New task learning too slow with replay

  • Solution: Adjust replay ratio and learning rate schedule
  • Implementation: Use curriculum learning with gradually increasing new task proportion
# Adaptive replay ratio based on task similarity
def adaptive_replay_ratio(new_task_data, replay_buffer, base_ratio=0.3):
    """
    Adjust replay ratio based on task similarity
    """
    if not replay_buffer.memory:
        return 0.0  # No previous tasks
    
    # Calculate task similarity (simplified)
    similarity_scores = []
    
    for task_id in replay_buffer.memory:
        old_samples = replay_buffer.get_all_samples(task_id)
        similarity = calculate_task_similarity(new_task_data, old_samples)
        similarity_scores.append(similarity)
    
    avg_similarity = sum(similarity_scores) / len(similarity_scores)
    
    # Higher similarity = more replay needed
    adaptive_ratio = base_ratio * (1 + avg_similarity)
    
    return min(adaptive_ratio, 0.8)  # Cap at 80%

def calculate_task_similarity(task1_data, task2_data):
    """
    Calculate similarity between two tasks using vocabulary overlap
    """
    # Extract vocabulary from both tasks
    vocab1 = set()
    vocab2 = set()
    
    for sample in task1_data:
        vocab1.update(sample['prompt'].lower().split())
    
    for sample in task2_data:
        vocab2.update(sample['prompt'].lower().split())
    
    # Jaccard similarity
    intersection = len(vocab1.intersection(vocab2))
    union = len(vocab1.union(vocab2))
    
    return intersection / union if union > 0 else 0
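A quick worked example of the two helpers together, using toy prompts (the exact numbers depend on the whitespace tokenization above):

```python
old_task = [{'prompt': 'Translate this sentence'}]
new_task = [{'prompt': 'Translate that phrase'}]

# Same vocabulary extraction as calculate_task_similarity
vocab_old = {w for s in old_task for w in s['prompt'].lower().split()}
vocab_new = {w for s in new_task for w in s['prompt'].lower().split()}

similarity = len(vocab_old & vocab_new) / len(vocab_old | vocab_new)
print(similarity)                             # 1 shared word of 5 total -> 0.2
print(round(0.3 * (1 + similarity), 2))       # adaptive replay ratio -> 0.36
```

Higher overlap pushes the replay ratio up, on the reasoning that similar tasks interfere more and therefore need more rehearsal of old samples.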

Performance Optimization Tips

  1. Memory Management: Use gradient checkpointing for large models
  2. Batch Processing: Implement smart batching for mixed task data
  3. Early Stopping: Monitor validation performance to prevent overfitting
# Early stopping implementation
class EarlyStopping:
    def __init__(self, patience=7, min_delta=0.001):
        self.patience = patience
        self.min_delta = min_delta
        self.counter = 0
        self.best_score = None
        
    def __call__(self, val_score):
        # Assumes higher val_score is better (e.g., validation accuracy)
        if self.best_score is None:
            self.best_score = val_score
        elif val_score < self.best_score + self.min_delta:
            self.counter += 1
            if self.counter >= self.patience:
                return True
        else:
            self.best_score = val_score
            self.counter = 0
        return False
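The same stopping rule, rewritten as a pure function for a quick offline sanity check of patience settings (a standalone sketch, separate from the class above):

```python
def stopping_step(scores, patience=7, min_delta=0.001):
    """Return the index where training would stop, or None if it never does."""
    best, counter = None, 0
    for i, score in enumerate(scores):
        if best is None or score >= best + min_delta:
            best, counter = score, 0  # meaningful improvement resets patience
        else:
            counter += 1
            if counter >= patience:
                return i
    return None

# Validation accuracy plateaus after the second epoch
print(stopping_step([0.70, 0.75, 0.7505, 0.7503, 0.7504], patience=2))  # 3
```

With `patience=2`, the two consecutive evaluations that fail to beat the best score by `min_delta` trigger the stop at index 3, before the model can overfit the mixed-task batches.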

Conclusion

Implementing continual learning with Ollama models requires careful balance between acquiring new knowledge and preserving existing capabilities. The techniques covered—Elastic Weight Consolidation, Progressive Networks, and Experience Replay—provide robust solutions for different scenarios.

Key takeaways for successful Ollama model updates:

  • Start with experience replay for general-purpose continual learning scenarios
  • Use EWC when computational resources are limited but task relationships are well-understood
  • Apply progressive networks for highly dissimilar tasks that benefit from dedicated pathways
  • Monitor forgetting metrics continuously to catch performance degradation early
  • Implement adaptive strategies that adjust based on task similarity and model performance

The automated pipeline and monitoring framework provide production-ready tools for managing continual learning workflows. Regular evaluation across all learned tasks ensures your Ollama models maintain broad competency while successfully acquiring new specialized knowledge.

With these continual learning implementations, your AI systems can evolve gracefully—gaining new capabilities without losing their foundational knowledge, just like human learning should be.