Problem: Your Local LLM Stack Is a Supply Chain Target
You installed Ollama or llama.cpp to run models locally, but these tools pull dependencies from PyPI, Homebrew, and GitHub releases—all vectors attackers use to inject malicious code.
You'll learn:
- How to scan Python dependencies for known vulnerabilities
- How to detect tampered binaries in LLM runtimes
- How to set up automated monitoring for new threats
Time: 20 min | Level: Intermediate
Why This Happens
Local LLM tools depend on hundreds of packages. A single compromised dependency can:
- Exfiltrate your model weights or prompts
- Mine cryptocurrency using your GPU
- Establish persistent backdoors
Common attack vectors:
- Typosquatting (torch vs t0rch)
- Dependency confusion (internal package names)
- Compromised maintainer accounts
- Binary tampering in model files
Real incidents:
- PyTorch torchtriton dependency-confusion attack (Dec 2022)
- LangChain arbitrary code execution (2024)
- Compromised Hugging Face tokens (2025)
Solution
Step 1: Inventory Your Dependencies
First, identify what's actually installed:
# Python dependencies (for vLLM, transformers, etc.)
pip list --format=json > llm-dependencies.json
# Homebrew packages (Ollama on macOS)
brew list --versions > brew-packages.txt
# System packages (Ubuntu/Debian)
dpkg -l | grep -E 'ollama|llama|cuda' > system-packages.txt
Expected: JSON file with 200+ packages if you're using vLLM or transformers.
If it fails:
- "pip: command not found": You're in the wrong Python environment. Run which python3 to find it.
- Permission denied on Homebrew: Add sudo only if you used sudo during install (not recommended).
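The inventory file can also be summarized programmatically, which is useful for diffing against a previous snapshot. A minimal sketch, assuming the JSON shape produced by pip list --format=json (SAMPLE stands in for the real llm-dependencies.json contents):

```python
# deps_summary.py - summarize the inventory written by `pip list --format=json`
import json

# Illustrative sample data; read llm-dependencies.json in real use
SAMPLE = '[{"name": "torch", "version": "2.1.2"}, {"name": "transformers", "version": "4.36.0"}]'

def summarize(raw_json):
    """Return (package_count, {name: version}) from pip's JSON inventory."""
    pkgs = json.loads(raw_json)
    return len(pkgs), {p["name"]: p["version"] for p in pkgs}

if __name__ == "__main__":
    count, versions = summarize(SAMPLE)
    print(f"{count} packages installed")
    for name, ver in sorted(versions.items()):
        print(f"  {name}=={ver}")
```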
Step 2: Scan for Known Vulnerabilities
Use pip-audit to check Python dependencies against OSV and PyPI advisories:
# Install auditing tool
pip install pip-audit --break-system-packages
# Scan dependencies
pip-audit --desc --format json -o audit-report.json
Why this works: pip-audit queries the Python advisory database for CVEs matching your exact package versions. The --desc flag explains each vulnerability.
Common findings:
- Pillow <10.2.0: Arbitrary code execution via crafted images
- Requests <2.31.0: Proxy-Authorization header leak on redirect (CVE-2023-32681)
- Protobuf <4.25.0: Denial of service
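Those version floors can also be checked with a quick local script. A minimal sketch (the floors mirror the advisories above; the naive version parser drops pre-release suffixes, so treat it as illustrative, not a replacement for pip-audit):

```python
# floor_check.py - flag installed packages below a known-safe version floor
# Floors mirror the advisories listed above; update them as new CVEs land
FLOORS = {"pillow": "10.2.0", "requests": "2.31.0", "protobuf": "4.25.0"}

def parse(version):
    """Naive dotted-version parse; pre-release suffixes are dropped."""
    return tuple(int(p) for p in version.split(".") if p.isdigit())

def outdated(installed):
    """Map each vulnerable package to its (installed, required) versions."""
    return {
        name: (ver, floor)
        for name, floor in FLOORS.items()
        if (ver := installed.get(name)) and parse(ver) < parse(floor)
    }
```

For example, outdated({"requests": "2.28.1"}) flags requests, while a patched Pillow passes clean.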
# Note: pip-audit has no built-in severity filter; export JSON
# and filter high-severity findings from the report downstream
pip-audit --vulnerability-service osv --format json -o osv-report.json
Step 3: Verify Binary Integrity
Ollama and llama.cpp distribute pre-compiled binaries. Verify they match official signatures:
# Download Ollama's public key (first time only)
curl -fsSL https://ollama.com/public.key | gpg --import
# Verify signature (example for Linux binary)
gpg --verify ollama-linux-amd64.sig ollama-linux-amd64
# For Homebrew-installed Ollama, lint the formula
# (a style/quality check, not a signature verification)
brew audit ollama
Expected: "Good signature from Ollama Team"
If verification fails:
- "No public key": Re-import the key, ensure HTTPS
- "BAD signature": Delete the binary and re-download from official source only
Step 4: Check Model File Hashes
Model files (.gguf, .safetensors) can be tampered with in transit or on the hub, and legacy pickle-based formats (.pt, .bin) can even embed executable code. Verify every download:
# Verify against Hugging Face hash
sha256sum llama-2-7b.gguf
# Compare with official value from model card
# Example: https://huggingface.co/TheBloke/Llama-2-7B-GGUF
Create a verification script:
# verify_model.py
import hashlib
import sys

def verify_hash(filepath, expected_hash):
    """Compare file SHA256 against known-good value"""
    sha256 = hashlib.sha256()
    with open(filepath, 'rb') as f:
        # Read in chunks to handle large files
        for chunk in iter(lambda: f.read(8192), b''):
            sha256.update(chunk)
    computed = sha256.hexdigest()
    if computed == expected_hash:
        print(f"✅ VERIFIED: {filepath}")
        return True
    print(f"❌ MISMATCH: {filepath}")
    print(f"   Expected: {expected_hash}")
    print(f"   Got:      {computed}")
    return False

if __name__ == "__main__":
    # Usage: python verify_model.py model.gguf <sha256_hash>
    sys.exit(0 if verify_hash(sys.argv[1], sys.argv[2]) else 1)
Run it:
python verify_model.py llama-2-7b.gguf <expected-sha256-from-model-card>
Step 5: Set Up Automated Monitoring
Don't audit once—monitor continuously:
# Install Safety (scans against the Safety/PyUp vulnerability database)
pip install safety --break-system-packages
# Create daily cron job
echo "0 9 * * * cd /path/to/llm-project && safety check --json > safety-$(date +\%Y\%m\%d).json" | crontab -
Better approach: Use dependabot or renovate:
# .github/dependabot.yml
version: 2
updates:
- package-ecosystem: "pip"
directory: "/"
schedule:
interval: "daily"
open-pull-requests-limit: 10
labels:
- "security"
- "dependencies"
Why this matters: New vulnerabilities are disclosed weekly. Manual audits go stale in days.
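Continuous scans are only useful if new results get noticed, so it helps to diff today's report against yesterday's and alert only on the delta. A minimal sketch, assuming a simplified report shape of {"package", "id"} records (adapt the keys to the actual JSON your scanner emits):

```python
# report_diff.py - surface only vulnerabilities new since the last scan
import json

def new_findings(prev_json, curr_json):
    """Return findings present in the current report but not the previous one."""
    seen = {(f["package"], f["id"]) for f in json.loads(prev_json)}
    return [f for f in json.loads(curr_json)
            if (f["package"], f["id"]) not in seen]
```

Feed it two dated report files from the cron job above and page someone only when the returned list is non-empty.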
Step 6: Isolate the Runtime
Even with audits, assume compromise and limit blast radius:
# Start the server with networking disabled (after models are downloaded)
docker run -d --rm --gpus all \
--network none \
-v ./ollama:/root/.ollama:ro \
--name ollama ollama/ollama
# Run the client inside the same isolated container
docker exec -it ollama ollama run llama2
# Or use firejail on Linux
firejail --net=none --private ollama run llama2
This prevents:
- Data exfiltration (no network)
- File system tampering (read-only mounts)
- GPU mining callbacks (no outbound connections)
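To confirm the sandbox actually blocks egress, run a small probe inside the container; with --network none it should always report no connectivity. A minimal sketch (the host and port are arbitrary examples):

```python
# net_probe.py - inside a --network none container this should print "no egress"
import socket

def has_outbound(host="1.1.1.1", port=443, timeout=2.0):
    """True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    print("egress possible" if has_outbound() else "no egress")
```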
Verification
Run a full audit and check exit codes:
# Comprehensive check
pip-audit && echo "✅ No Python vulnerabilities"
gpg --verify ollama-*.sig && echo "✅ Binary verified"
sha256sum -c model-hashes.txt && echo "✅ Models verified"
You should see three success messages. Any failure means that component needs investigation.
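The three checks can be wrapped in one script that distinguishes "check failed" from "tool not installed". A sketch (substitute the pip-audit, gpg, and sha256sum command lines from above):

```python
# run_audit.py - run verification commands and aggregate pass/fail/missing
import shutil
import subprocess

def run_checks(checks):
    """checks: {name: argv list} -> {name: True (pass), False (fail),
    or None (tool not installed)}."""
    results = {}
    for name, argv in checks.items():
        if shutil.which(argv[0]) is None:
            results[name] = None  # an absent tool is not a passing check
        else:
            proc = subprocess.run(argv, capture_output=True)
            results[name] = proc.returncode == 0
    return results
```

Treat None as a failure in CI so a missing scanner can't silently green-light a release.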
What You Learned
- Python dependencies are the highest-risk vector (200+ packages)
- Binary verification catches tampered executables
- Model files can contain exploits—always verify hashes
- Continuous monitoring beats one-time audits
Limitations:
- Zero-day exploits won't appear in audit databases
- Social engineering (fake Ollama sites) bypasses technical controls
- Compromised upstream maintainers are hard to detect
When NOT to rely only on audits:
- Production systems: Use signed container images
- Regulated industries: Add SBOM generation and attestation
- High-value targets: Air-gap the LLM runtime entirely
Advanced: Generate an SBOM
For compliance or deeper analysis, create a Software Bill of Materials:
# Install syft
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh
# Generate SBOM for your Python environment
syft packages dir:/path/to/venv -o spdx-json > llm-sbom.json
# Scan SBOM for vulnerabilities
grype sbom:llm-sbom.json
What you get: Machine-readable inventory of every component, version, and license—required for compliance frameworks like NIST SSDF.
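Since the SBOM is plain JSON, the component inventory is easy to pull out for reporting. A minimal sketch, assuming the "packages"/"versionInfo" shape of SPDX JSON output (SAMPLE is a stand-in for the real syft output):

```python
# sbom_components.py - list name/version pairs from an SPDX JSON SBOM
import json

# Illustrative fragment mimicking syft's spdx-json output shape
SAMPLE = '{"packages": [{"name": "torch", "versionInfo": "2.1.2"}]}'

def components(raw_spdx_json):
    """Return {name: version} for every package entry in the SBOM."""
    doc = json.loads(raw_spdx_json)
    return {p["name"]: p.get("versionInfo", "unknown")
            for p in doc.get("packages", [])}
```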
Common Vulnerabilities in LLM Stacks
High-risk packages to audit immediately:
| Package | Risk | Minimum safe version |
|---|---|---|
| transformers | RCE via pickle | ≥4.36.0 |
| torch | Arbitrary code execution | ≥2.1.2 |
| safetensors | Memory corruption | ≥0.4.2 |
| langchain | SQL injection | ≥0.1.0 |
| chromadb | Path traversal | ≥0.4.22 |
Update command:
pip install --upgrade transformers torch safetensors langchain chromadb
Emergency Response: If You Find a Compromised Package
Immediate actions:
1. Isolate the system: Disconnect from network
2. Check logs for exfiltration:
# Check outbound connections
sudo netstat -tunapl | grep python
# Check for crypto miners
ps aux | grep -E 'xmrig|ethminer'
3. Rotate all credentials: API keys, SSH keys, cloud credentials
4. Report to package maintainers: Open security advisory on GitHub
5. Document timeline: When installed, what ran, what was exposed
Don't:
- Just upgrade and move on (attacker may have persistence)
- Trust the same package repository immediately
- Skip forensic analysis (you need to know what was compromised)
Tested on Python 3.11+, Ollama 0.1.26, llama.cpp b1940, Ubuntu 24.04 & macOS 14
Tools used: pip-audit, safety, gpg, sha256sum, syft, grype, Docker, firejail