Fix CUDA Driver Issues for Local AI in 20 Minutes

Resolve NVIDIA CUDA driver conflicts, version mismatches, and runtime errors for running local LLMs on Linux systems.

Problem: CUDA Drivers Block Your Local AI Setup

You're trying to run LLaMA, Stable Diffusion, or other local AI models on Linux, but you hit "CUDA driver version is insufficient" or "no CUDA-capable device detected" errors even though your NVIDIA GPU is installed.

You'll learn:

  • Diagnose CUDA driver vs toolkit version conflicts
  • Fix the most common driver installation issues on Ubuntu/Debian and Arch-based systems
  • Verify that your setup works with actual AI workloads

Time: 20 min | Level: Intermediate


Why This Happens

CUDA has three separate version numbers that must align: driver version, toolkit version, and runtime version. Most errors occur when:

  1. Your driver is older than what PyTorch/TensorFlow expects
  2. Multiple CUDA toolkits are installed (conda vs system)
  3. Driver survived a kernel update but needs rebuilding

Common symptoms:

  • RuntimeError: CUDA driver version is insufficient for CUDA runtime version
  • torch.cuda.is_available() returns False
  • nvidia-smi works but Python can't see GPU
  • Driver works after reboot but fails after kernel update
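
It helps to see all three version numbers side by side before changing anything. The sketch below is a convenience wrapper of ours (the collect_versions helper is not part of any NVIDIA tool) that degrades gracefully when a component is missing:

```python
# Hypothetical helper: gathers the three version numbers that must align.
# Each probe is skipped cleanly if the tool or library is absent.
import re
import shutil
import subprocess

def collect_versions():
    """Return {'driver': ..., 'toolkit': ..., 'pytorch_cuda': ...} as strings."""
    versions = {"driver": "not found", "toolkit": "not found",
                "pytorch_cuda": "not found"}

    # Driver version via nvidia-smi's machine-readable query mode
    if shutil.which("nvidia-smi"):
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True)
        if out.returncode == 0:
            versions["driver"] = out.stdout.strip()

    # Toolkit version parsed from nvcc's banner ("... release 12.4, ...")
    if shutil.which("nvcc"):
        out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
        m = re.search(r"release (\d+\.\d+)", out.stdout)
        if m:
            versions["toolkit"] = m.group(1)

    # Runtime version PyTorch was built against (None on CPU-only builds)
    try:
        import torch
        versions["pytorch_cuda"] = torch.version.cuda or "CPU-only build"
    except ImportError:
        pass

    return versions

if __name__ == "__main__":
    for name, ver in collect_versions().items():
        print(f"{name}: {ver}")
```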

Solution

Step 1: Check What You Actually Have

# GPU detection
lspci | grep -i nvidia

# Driver version (if installed)
nvidia-smi

# CUDA toolkit version (if installed)
nvcc --version

# What PyTorch expects
python3 -c "import torch; print(f'PyTorch CUDA: {torch.version.cuda}')"

Expected output:

  • nvidia-smi shows driver version (e.g., 550.54.15)
  • nvcc shows toolkit version (e.g., 12.4)
  • PyTorch shows required CUDA version

If nvidia-smi fails: Driver isn't installed or kernel module isn't loaded. Continue to Step 2.

If the versions don't match: Your driver is too old for your AI framework. Note the required version and continue.


Step 2: Remove Conflicting Installations

# Ubuntu/Debian - remove old packages
sudo apt remove --purge '^nvidia-.*' '^libnvidia-.*' '^cuda-.*'
sudo apt autoremove

# Arch/Manjaro
sudo pacman -Rns $(pacman -Qq | grep nvidia)

# Remove conda CUDA (if present)
conda list | grep cuda
conda uninstall cudatoolkit cudnn  # if found

# Remove stale NVIDIA kernel modules (targeted: wiping all of
# drivers/video would also delete unrelated framebuffer drivers)
sudo rm -f /lib/modules/$(uname -r)/kernel/drivers/video/nvidia*.ko*
sudo depmod -a

Why this works: Mixing Ubuntu's nvidia packages with CUDA's official repo, or conda's CUDA with system CUDA, creates version conflicts. Clean slate prevents this.
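
If you want to confirm the conflict before purging anything, a small scan for duplicate toolkits can tell you. This is a sketch of ours, not a standard tool; the default search paths are the usual suspects, but pass your own list if your layout differs:

```python
# Sketch: find directories that look like CUDA toolkit installs (bin/nvcc).
# DEFAULT_ROOTS is an assumption about common install locations.
import os
import sys

DEFAULT_ROOTS = ["/usr/local", "/opt", os.path.expanduser("~/miniconda3/envs")]

def find_cuda_installs(roots=DEFAULT_ROOTS):
    """Return paths under the given roots that contain bin/nvcc."""
    found = []
    for root in roots:
        if not os.path.isdir(root):
            continue
        for entry in sorted(os.listdir(root)):
            candidate = os.path.join(root, entry)
            if os.path.isfile(os.path.join(candidate, "bin", "nvcc")):
                found.append(candidate)
    return found

if __name__ == "__main__":
    installs = find_cuda_installs()
    print(f"Found {len(installs)} CUDA toolkit(s):")
    for path in installs:
        print(" ", path)
    if len(installs) > 1:
        print("More than one toolkit on this machine -- likely conflict source.")
```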

If it fails:

  • "Unable to remove": Check for running processes with lsof | grep nvidia and kill them
  • Secure Boot enabled: You'll need to sign kernel modules or disable Secure Boot in BIOS

Step 3: Install Matching Driver

For Ubuntu 22.04/24.04 (recommended for AI workloads):

# Add official NVIDIA repository
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update

# Install driver + CUDA 12.4 toolkit (matches the cu124 PyTorch wheels used below)
sudo apt install cuda-drivers-550 cuda-toolkit-12-4

# Alternatively, just driver (lighter)
sudo apt install nvidia-driver-550

For Arch/Manjaro:

# Latest driver
sudo pacman -S nvidia nvidia-utils cuda

# Or LTS kernel users
sudo pacman -S nvidia-lts nvidia-utils cuda

For other driver versions, check compatibility:

  • CUDA 12.4 requires driver ≥ 550.54.15
  • CUDA 12.1 requires driver ≥ 530.30.02
  • CUDA 11.8 requires driver ≥ 520.61.05
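
The table above can be turned into a quick programmatic check. Note that version strings must be compared as integer tuples, not as strings ("550.9" would otherwise sort after "550.54"); the driver_ok helper below is our own sketch:

```python
# Minimum driver versions from the compatibility list above.
MIN_DRIVER = {
    "12.4": "550.54.15",
    "12.1": "530.30.02",
    "11.8": "520.61.05",
}

def as_tuple(version):
    """'550.54.15' -> (550, 54, 15), so comparison is numeric per component."""
    return tuple(int(part) for part in version.split("."))

def driver_ok(driver_version, cuda_version):
    """True if the installed driver meets the minimum for this CUDA release."""
    minimum = MIN_DRIVER.get(cuda_version)
    if minimum is None:
        raise ValueError(f"No minimum known for CUDA {cuda_version}")
    return as_tuple(driver_version) >= as_tuple(minimum)

print(driver_ok("550.54.15", "12.4"))   # True: exactly the minimum
print(driver_ok("535.183.01", "12.4"))  # False: too old for CUDA 12.4
print(driver_ok("535.183.01", "12.1"))  # True: fine for CUDA 12.1
```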

Reboot required after installation:

sudo reboot

Step 4: Verify Driver Loads

# Check module is loaded
lsmod | grep nvidia

# Should see nvidia, nvidia_uvm, nvidia_modeset

# Test GPU detection
nvidia-smi

Expected: nvidia-smi displays your GPU name, driver version, and CUDA version.

If it fails:

  • "NVIDIA-SMI has failed": Kernel module didn't load
    sudo modprobe nvidia
    dmesg | grep -i nvidia  # Check for errors
    
  • Secure Boot issue: Error mentions "Required key not available"
    • Option A: Disable Secure Boot in BIOS
    • Option B: Sign modules (advanced, see DKMS documentation)

Step 5: Install Python CUDA Runtime

# Create clean environment
python3 -m venv ~/ai-env
source ~/ai-env/bin/activate

# Install PyTorch with CUDA 12.4 support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

# Verify
python3 -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}'); print(f'GPU: {torch.cuda.get_device_name(0)}')"

Expected output:

CUDA available: True
GPU: NVIDIA GeForce RTX 4070

If False:

  • Check LD_LIBRARY_PATH: echo $LD_LIBRARY_PATH should include /usr/local/cuda/lib64
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
    echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
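
The library-path check can be automated. This sketch of ours flags LD_LIBRARY_PATH entries that don't exist and asks the dynamic linker whether any CUDA runtime library is visible at all:

```python
# Sanity-check LD_LIBRARY_PATH and probe for libcudart via the linker cache.
import ctypes.util
import os

def missing_dirs(ld_path):
    """Return the LD_LIBRARY_PATH entries that are not real directories."""
    entries = [p for p in ld_path.split(":") if p]
    return [p for p in entries if not os.path.isdir(p)]

if __name__ == "__main__":
    ld_path = os.environ.get("LD_LIBRARY_PATH", "")
    for path in missing_dirs(ld_path):
        print(f"warning: {path} is in LD_LIBRARY_PATH but does not exist")
    # find_library consults the dynamic linker; None means no libcudart is
    # visible to newly started processes on this machine.
    print("libcudart:", ctypes.util.find_library("cudart") or "not found")
```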
    

Verification with Real AI Workload

Test with actual model inference:

# Install transformers
pip install transformers accelerate

# Test CUDA with small model
python3 << 'EOF'
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

print(f"CUDA available: {torch.cuda.is_available()}")
print(f"Device: {torch.cuda.get_device_name(0)}")

# Load small model
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    torch_dtype=torch.float16,
    device_map="cuda"
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")

# Quick inference test
inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
print("✅ CUDA working with AI model!")
EOF

You should see: Model downloads, loads to GPU, generates text without errors.

If OOM (out of memory):

  • Your GPU works! Just need smaller model or quantization
  • Try torch_dtype=torch.float16 or use 4-bit quantization with bitsandbytes
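
A back-of-envelope estimate shows why quantization helps. Assuming phi-2's roughly 2.7 billion parameters (an approximation), the weights alone need about 4 bytes per parameter in fp32, 2 in fp16, and 0.5 at 4-bit; activations and KV cache add more on top:

```python
# Rough VRAM footprint of model weights alone (1 GB = 1e9 bytes here).
def weights_gb(n_params, bytes_per_param):
    """Weight memory in gigabytes for a model with n_params parameters."""
    return n_params * bytes_per_param / 1e9

PHI2_PARAMS = 2.7e9  # approximate parameter count

print(f"phi-2 fp32 : {weights_gb(PHI2_PARAMS, 4):.1f} GB")    # 10.8 GB
print(f"phi-2 fp16 : {weights_gb(PHI2_PARAMS, 2):.1f} GB")    # 5.4 GB
print(f"phi-2 4-bit: {weights_gb(PHI2_PARAMS, 0.5):.2f} GB")  # 1.35 GB
```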

Common Edge Cases

Issue: Works After Reboot, Fails After Kernel Update

Cause: DKMS didn't rebuild module for new kernel.

Fix:

# Rebuild all registered DKMS modules for the running kernel
# (modinfo nvidia fails in this state, since the module was never
# built for the new kernel, so query dkms instead of the module)
sudo dkms autoinstall

# Or reinstall driver package
sudo apt install --reinstall nvidia-dkms-550

Issue: Multiple GPUs, PyTorch Uses Wrong One

Set default GPU:

export CUDA_VISIBLE_DEVICES=0  # Use first GPU
# Or in Python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # Use second GPU
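
One subtlety: CUDA_VISIBLE_DEVICES renumbers the GPUs, so frameworks always see the listed devices as cuda:0, cuda:1, ... in the order given. This small sketch of ours shows the resulting mapping (it assumes numeric IDs; the variable can also hold GPU UUIDs, which this doesn't handle):

```python
# Map logical device index (what PyTorch sees) -> physical GPU index,
# assuming CUDA_VISIBLE_DEVICES contains numeric IDs only.
def visible_device_map(cuda_visible_devices):
    ids = [d.strip() for d in cuda_visible_devices.split(",") if d.strip()]
    return {logical: int(physical) for logical, physical in enumerate(ids)}

print(visible_device_map("1"))    # {0: 1} -- cuda:0 is physical GPU 1
print(visible_device_map("2,0"))  # {0: 2, 1: 0} -- order matters
```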

Issue: "CUDA out of memory" with Small Models

Check what's using VRAM:

nvidia-smi
# Look at "Memory-Usage" column

# Kill process hogging GPU
kill -9 <PID>

Clear PyTorch cache:

import torch
torch.cuda.empty_cache()

What You Learned

  • CUDA driver version must meet or exceed toolkit requirements
  • Mixing system CUDA, conda CUDA, and pip packages causes conflicts
  • nvidia-smi working ≠ PyTorch can use GPU (need matching runtimes)
  • Kernel updates can break DKMS drivers if not configured correctly

Limitations:

  • Secure Boot requires extra steps (module signing or disabling)
  • Laptop Optimus setups need additional configuration
  • WSL2 on Windows uses different driver model (not covered here)

Quick Reference

Version Compatibility Matrix (Feb 2026)

AI Framework      Requires CUDA    Min Driver
PyTorch 2.5       12.4 or 12.1     550.54.15
TensorFlow 2.17   12.3             545.23.08
JAX 0.4.35        12.x             550+
llama.cpp         Optional         N/A (CPU by default; CUDA build available)

Essential Commands

# Check everything
nvidia-smi                    # Driver status
nvcc --version                # Toolkit version  
python -c "import torch; print(torch.cuda.is_available())"  # Runtime check

# Fix common issues
sudo modprobe nvidia          # Load module manually
sudo systemctl restart display-manager  # Reset graphics after driver change
sudo dkms autoinstall        # Rebuild all DKMS modules

# Environment
export CUDA_VISIBLE_DEVICES=0           # Select GPU
export CUDA_LAUNCH_BLOCKING=1           # Debug CUDA errors (slower)

Tested on Ubuntu 24.04 LTS, Arch Linux (Feb 2026), with NVIDIA RTX 3060/4070/4090 and CUDA 12.4