Problem: AI Generates Code with Fake Libraries
Your AI coding assistant suggests import awesome_ml_utils, but pip can't find it. The package never existed. You waste 20 minutes debugging phantom imports.
You'll learn:
- Why LLMs hallucinate package names
- How to validate imports before execution
- Prompt patterns that reduce fake libraries
Time: 5 min | Level: Intermediate
Why This Happens
LLMs learn patterns from training data - they know common package structures like sklearn.preprocessing or torch.nn. When generating similar code, they create plausible-sounding names that match Python conventions but reference packages that don't exist.
Common symptoms:
- ModuleNotFoundError for reasonable-looking package names
- AI suggests libraries that "should" exist but don't
- Code works in AI's explanation but fails when run
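The first symptom is easy to reproduce: a hallucinated import only fails at runtime, when Python actually searches for the module. A minimal demonstration, using the invented package name from the intro:

```python
# A hallucinated package raises ModuleNotFoundError only when imported
try:
    import awesome_ml_utils  # invented name; does not exist on PyPI
except ModuleNotFoundError as e:
    print(f"Caught: {e}")
```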
Typical hallucinations:
# These look real but aren't
from ml_utils.preprocessing import normalize # ❌
import fastdata.loaders as fdl # ❌
from advanced_stats import regression_tools # ❌
Solution
Step 1: Add Import Validation Hook
Create a simple validator that checks imports against PyPI before executing AI-generated code:
# validate_imports.py
import ast
import subprocess
import sys

def get_imports(code):
    """Extract all import statements from Python code."""
    tree = ast.parse(code)
    imports = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # Get top-level package name
                imports.add(alias.name.split('.')[0])
        elif isinstance(node, ast.ImportFrom):
            if node.module:
                imports.add(node.module.split('.')[0])
    return imports

def package_exists(package_name):
    """Check if package exists on PyPI."""
    # Stdlib modules pass automatically (sys.stdlib_module_names needs 3.10+)
    if package_name in sys.stdlib_module_names:
        return True
    # Check PyPI; treat a timeout as "not found" instead of crashing
    try:
        result = subprocess.run(
            ['pip', 'index', 'versions', package_name],
            capture_output=True,
            text=True,
            timeout=5
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0

def validate_code(code):
    """Validate all imports exist before execution."""
    imports = get_imports(code)
    missing = []
    for pkg in imports:
        if not package_exists(pkg):
            missing.append(pkg)
    if missing:
        return False, f"Hallucinated packages: {', '.join(missing)}"
    return True, "All imports valid"

# Usage
ai_generated_code = '''
import numpy as np
from ml_utils import preprocess  # This is fake
'''

valid, message = validate_code(ai_generated_code)
print(message)
# Output: Hallucinated packages: ml_utils
Why this works: Validates against PyPI before execution. Catches fake packages without running potentially broken code.
If it fails:
- Error "pip: command not found": install pip or use python -m pip instead
- Timeout errors: increase the timeout value or skip the network check for stdlib-only validation
Step 2: Update Your AI Prompt
Add explicit constraints to your AI coding prompts:
VALIDATION_PROMPT = """
Generate Python code following these rules:
1. ONLY use packages from this list:
   - Standard library (os, sys, json, etc.)
   - numpy, pandas, scipy
   - requests, httpx
   - Your current environment: {installed_packages}
2. If you need functionality not in these packages:
   - Say "Install X with: pip install X"
   - Don't write code using it until confirmed installed
3. NEVER invent package names that "should exist"
Task: {user_task}
"""

import subprocess

# Get installed packages
installed = subprocess.check_output(
    ['pip', 'list', '--format=freeze'],
    text=True
).splitlines()

prompt = VALIDATION_PROMPT.format(
    # First 10 packages only, to keep the prompt short
    installed_packages=', '.join(p.split('==')[0] for p in installed[:10]),
    user_task="Load CSV and calculate mean"
)
Expected: AI generates code using only listed packages or explicitly states installation requirements.
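Prompt constraints and validation combine naturally into a generate-validate-retry loop: ask for code, check it, and feed any failure back to the model. A sketch, where ask_llm is a placeholder for whatever client you use (OpenAI, Anthropic, a local model) and validate is the Step 1 validator:

```python
def generate_validated_code(task, ask_llm, validate, max_retries=2):
    """Ask the LLM for code, re-prompting when validation finds fake imports."""
    prompt = (
        "Write Python for this task. Use only real, installed packages.\n"
        f"Task: {task}"
    )
    message = "no attempts made"
    for attempt in range(max_retries + 1):
        code = ask_llm(prompt)
        valid, message = validate(code)
        if valid:
            return code
        # Feed the failure back so the model can correct itself
        prompt += f"\nPrevious attempt failed validation: {message}. Try again."
    raise RuntimeError(f"No valid code after {max_retries + 1} attempts: {message}")
```

Passing ask_llm and validate as parameters keeps the loop independent of any particular API; wire in your own client and the validator from Step 1.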
Step 3: Implement Pre-Execution Gate
Combine validation with your AI code execution workflow:
def safe_execute_ai_code(code, allow_install=False):
    """Validate AI-generated code's imports, then execute it."""
    # Step 1: Validate imports
    valid, message = validate_code(code)
    if not valid:
        if allow_install:
            # Extract package names and offer installation
            packages = message.split(': ')[1].split(', ')
            print(f"Missing packages detected: {', '.join(packages)}")
            print("Install with: pip install " + ' '.join(packages))
            return None
        else:
            raise ImportError(f"Validation failed: {message}")
    # Step 2: Execute if valid
    # Note: exec still runs arbitrary code; import validation is not a sandbox
    exec(code)
    return "Execution successful"

# Example usage
ai_code = """
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
print(df.mean())
"""
safe_execute_ai_code(ai_code)  # Works
If it fails:
- Still getting fake imports: Add package to blocklist in validation
- Slow validation: Cache PyPI lookups in a local set for session
Verification
Test the validator with known-good and known-bad code:
# Test with real packages
real_code = "import numpy as np\nimport pandas as pd"
print(validate_code(real_code))
# Output: (True, 'All imports valid')
# Test with hallucinated package
fake_code = "from awesome_ml_utils import scaler"
print(validate_code(fake_code))
# Output: (False, 'Hallucinated packages: awesome_ml_utils')
You should see: Real packages pass, hallucinated ones get caught.
What You Learned
- LLMs hallucinate packages that follow naming conventions but don't exist
- AST parsing extracts imports without executing code
- PyPI validation catches fake packages before runtime errors
- Explicit prompt constraints reduce hallucination frequency
Limitations:
- Network dependency for PyPI checks (can cache or fallback to local validation)
- Doesn't catch wrong versions or deprecated functions
- Requires pip access
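The wrong-version limitation can be partly addressed locally: importlib.metadata reports the installed version of a distribution, which you can compare against a minimum your code assumes. A sketch using a naive numeric tuple compare (packaging.version is more robust if it's available):

```python
from importlib import metadata

def meets_minimum(dist_name, minimum):
    """Return True if dist_name is installed at version >= minimum (naive compare)."""
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return False  # not installed at all
    # Naive compare: keep only purely numeric parts, e.g. "24.0" -> (24, 0)
    to_tuple = lambda v: tuple(int(p) for p in v.split('.') if p.isdigit())
    return to_tuple(installed) >= to_tuple(minimum)

print(meets_minimum("definitely_not_a_real_dist", "1.0"))  # False
```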
Advanced: Whitelist Pattern
For production environments, use a strict whitelist instead of PyPI validation:
APPROVED_PACKAGES = {
    # Standard library
    'os', 'sys', 'json', 'datetime', 'pathlib', 're',
    # Data science
    'numpy', 'pandas', 'scipy', 'matplotlib',
    # ML frameworks
    'torch', 'tensorflow', 'sklearn', 'transformers',
    # Utilities
    'requests', 'httpx', 'pydantic', 'fastapi'
}

def strict_validate(code):
    """Only allow pre-approved packages."""
    imports = get_imports(code)
    stdlib = sys.stdlib_module_names
    for pkg in imports:
        if pkg not in APPROVED_PACKAGES and pkg not in stdlib:
            return False, f"Unauthorized package: {pkg}"
    return True, "Valid"
This eliminates network dependency and gives you complete control over allowed packages.
Real-World Example: GPT-4 Hallucination
Here's a real hallucination from GPT-4 (November 2023):
# Prompt: "Write code to clean messy CSV data"
# GPT-4 generated:
from data_cleaner import CSVProcessor # ❌ Doesn't exist
from ml_preprocessing import auto_clean # ❌ Doesn't exist
processor = CSVProcessor()
cleaned_data = processor.clean('data.csv')
Both packages sounded plausible but were completely invented. The validation hook would have caught this immediately:
validate_code(gpt4_code)  # gpt4_code holds the snippet above
# Output: (False, 'Hallucinated packages: data_cleaner, ml_preprocessing')
Quick Reference
| Task | Command |
|---|---|
| Extract imports | ast.parse(code) with ast.Import walker |
| Check PyPI | pip index versions <package> |
| Validate before exec | validate_code(code) returns (bool, str) |
| Get stdlib names | sys.stdlib_module_names |
Prevention checklist:
- Add validation hook to AI code pipeline
- Include package list in prompts
- Test with known hallucinations
- Set up whitelist for production
- Cache PyPI results for performance
Tested with Python 3.11+, OpenAI GPT-4, Anthropic Claude 3.5, pip 24.0