Stop AI from Inventing Python Libraries in 5 Minutes

Prevent LLM code generation from hallucinating non-existent Python packages with validation hooks and prompt engineering.

Problem: AI Generates Code with Fake Libraries

Your AI coding assistant suggests import awesome_ml_utils, but pip can't find it. The package never existed. You waste 20 minutes debugging phantom imports.

You'll learn:

  • Why LLMs hallucinate package names
  • How to validate imports before execution
  • Prompt patterns that reduce fake libraries

Time: 5 min | Level: Intermediate


Why This Happens

LLMs learn patterns from their training data: they know common package structures like sklearn.preprocessing or torch.nn. When generating similar code, they produce plausible-sounding names that match Python naming conventions but refer to packages that don't exist.

Common symptoms:

  • ModuleNotFoundError for reasonable-looking package names
  • AI suggests libraries that "should" exist but don't
  • Code works in AI's explanation but fails when run
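The failure mode is easy to reproduce with the hypothetical awesome_ml_utils package from the intro: the import looks plausible but fails only at runtime.

```python
try:
    import awesome_ml_utils  # hypothetical hallucinated package name
    status = "imported"
except ModuleNotFoundError as err:
    # err.name carries the missing module's name
    status = f"missing: {err.name}"

print(status)  # missing: awesome_ml_utils
```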

Typical hallucinations:

# These look real but aren't
from ml_utils.preprocessing import normalize  # ❌
import fastdata.loaders as fdl  # ❌
from advanced_stats import regression_tools  # ❌

Solution

Step 1: Add Import Validation Hook

Create a simple validator that checks imports against PyPI before executing AI-generated code:

# validate_imports.py
import ast
import subprocess
import sys

def get_imports(code):
    """Extract all import statements from Python code."""
    tree = ast.parse(code)
    imports = set()
    
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # Get top-level package name
                imports.add(alias.name.split('.')[0])
        elif isinstance(node, ast.ImportFrom):
            if node.module:
                imports.add(node.module.split('.')[0])
    
    return imports

def package_exists(package_name):
    """Check if a package exists on PyPI."""
    # Stdlib modules pass automatically (sys.stdlib_module_names needs Python 3.10+)
    if package_name in sys.stdlib_module_names:
        return True
    
    # Check PyPI via pip's (experimental) index subcommand.
    # Use `python -m pip` so this works even when `pip` isn't on PATH.
    try:
        result = subprocess.run(
            [sys.executable, '-m', 'pip', 'index', 'versions', package_name],
            capture_output=True,
            text=True,
            timeout=5
        )
    except subprocess.TimeoutExpired:
        # Network too slow to answer; treat as unknown rather than hallucinated
        return True
    return result.returncode == 0

def validate_code(code):
    """Validate all imports exist before execution."""
    imports = get_imports(code)
    missing = []
    
    for pkg in imports:
        if not package_exists(pkg):
            missing.append(pkg)
    
    if missing:
        return False, f"Hallucinated packages: {', '.join(missing)}"
    return True, "All imports valid"

# Usage
ai_generated_code = '''
import numpy as np
from ml_utils import preprocess  # This is fake
'''

valid, message = validate_code(ai_generated_code)
print(message)
# Output: Hallucinated packages: ml_utils

Why this works: Validates against PyPI before execution. Catches fake packages without running potentially broken code.

If it fails:

  • Error: "pip: command not found": Install pip or use python -m pip instead
  • Timeout errors: Increase timeout value or skip network check for stdlib-only validation

Step 2: Update Your AI Prompt

Add explicit constraints to your AI coding prompts:

VALIDATION_PROMPT = """
Generate Python code following these rules:

1. ONLY use packages from this list:
   - Standard library (os, sys, json, etc.)
   - numpy, pandas, scipy
   - requests, httpx
   - Your current environment: {installed_packages}

2. If you need functionality not in these packages:
   - Say "Install X with: pip install X"
   - Don't write code using it until confirmed installed

3. NEVER invent package names that "should exist"

Task: {user_task}
"""

import subprocess
import sys

# Get installed packages from the current environment
installed = subprocess.check_output(
    [sys.executable, '-m', 'pip', 'list', '--format=freeze'],
    text=True
).splitlines()

prompt = VALIDATION_PROMPT.format(
    installed_packages=', '.join(p.split('==')[0] for p in installed[:10]),
    user_task="Load CSV and calculate mean"
)

Expected: AI generates code using only listed packages or explicitly states installation requirements.
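As an alternative to shelling out to pip, the standard-library importlib.metadata module (Python 3.8+) can enumerate installed distributions in-process — a sketch for building the {installed_packages} value without a subprocess:

```python
from importlib.metadata import distributions

# Collect distribution names from the active environment
installed = sorted({
    dist.metadata["Name"]
    for dist in distributions()
    if dist.metadata["Name"]  # guard against metadata with no Name field
})

# Truncate as in the pip-based snippet to keep the prompt short
print(", ".join(installed[:10]))
```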


Step 3: Implement Pre-Execution Gate

Combine validation with your AI code execution workflow:

def safe_execute_ai_code(code, allow_install=False):
    """Validate and execute AI-generated code safely."""
    # Step 1: Validate imports
    valid, message = validate_code(code)
    
    if not valid:
        if allow_install:
            # Extract package names from the message and suggest installation
            packages = message.split(': ')[1].split(', ')
            print(f"Missing packages detected: {', '.join(packages)}")
            print("Install with: pip install " + ' '.join(packages))
            return None
        raise ImportError(f"Validation failed: {message}")
    
    # Step 2: Execute if valid
    # Note: exec runs in the current namespace; sandbox this in production
    exec(code)
    return "Execution successful"

# Example usage
ai_code = """
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
print(df.mean())
"""

safe_execute_ai_code(ai_code)  # Works

If it fails:

  • Still getting fake imports: Add package to blocklist in validation
  • Slow validation: Cache PyPI lookups in a local set for session
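The caching suggestion above can be sketched with functools.lru_cache, so each package name hits the expensive lookup at most once per session. The stub below stands in for the PyPI check purely for illustration; in real use the body would call package_exists from Step 1:

```python
from functools import lru_cache

calls = []  # records how often the underlying lookup actually runs

@lru_cache(maxsize=None)
def cached_exists(package_name):
    # Stand-in for the pip-index lookup from Step 1 (illustrative only)
    calls.append(package_name)
    return package_name == "numpy"  # pretend only numpy exists

cached_exists("numpy")
cached_exists("numpy")      # served from cache, no second lookup
cached_exists("fake_pkg")
print(len(calls))           # 2: one lookup per distinct name
```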

Verification

Test the validator with known-good and known-bad code:

# Test with real packages
real_code = "import numpy as np\nimport pandas as pd"
print(validate_code(real_code))
# Output: (True, 'All imports valid')

# Test with hallucinated package
fake_code = "from awesome_ml_utils import scaler"
print(validate_code(fake_code))
# Output: (False, 'Hallucinated packages: awesome_ml_utils')

You should see: Real packages pass, hallucinated ones get caught.


What You Learned

  • LLMs hallucinate packages that follow naming conventions but don't exist
  • AST parsing extracts imports without executing code
  • PyPI validation catches fake packages before runtime errors
  • Explicit prompt constraints reduce hallucination frequency

Limitations:

  • Network dependency for PyPI checks (can cache or fallback to local validation)
  • Doesn't catch wrong versions or deprecated functions
  • Requires pip access

Advanced: Whitelist Pattern

For production environments, use a strict whitelist instead of PyPI validation:

APPROVED_PACKAGES = {
    # Standard library
    'os', 'sys', 'json', 'datetime', 'pathlib', 're',
    # Data science
    'numpy', 'pandas', 'scipy', 'matplotlib',
    # ML frameworks  
    'torch', 'tensorflow', 'sklearn', 'transformers',
    # Utilities
    'requests', 'httpx', 'pydantic', 'fastapi'
}

def strict_validate(code):
    """Only allow pre-approved packages."""
    imports = get_imports(code)
    stdlib = sys.stdlib_module_names
    
    for pkg in imports:
        if pkg not in APPROVED_PACKAGES and pkg not in stdlib:
            return False, f"Unauthorized package: {pkg}"
    return True, "Valid"

This eliminates network dependency and gives you complete control over allowed packages.


Real-World Example: GPT-4 Hallucination

Here's a real hallucination from GPT-4 (November 2023):

# Prompt: "Write code to clean messy CSV data"
# GPT-4 generated:

from data_cleaner import CSVProcessor  # ❌ Doesn't exist
from ml_preprocessing import auto_clean  # ❌ Doesn't exist

processor = CSVProcessor()
cleaned_data = processor.clean('data.csv')

Both packages sounded plausible but were completely invented. The validation hook would have caught this immediately:

validate_code(gpt4_code)
# Output: (False, 'Hallucinated packages: data_cleaner, ml_preprocessing')

Quick Reference

Task                   Command
Extract imports        ast.parse(code) with ast.Import walker
Check PyPI             pip index versions <package>
Validate before exec   validate_code(code) returns (bool, str)
Get stdlib names       sys.stdlib_module_names

Prevention checklist:

  • Add validation hook to AI code pipeline
  • Include package list in prompts
  • Test with known hallucinations
  • Set up whitelist for production
  • Cache PyPI results for performance

Tested with Python 3.11+, OpenAI GPT-4, Anthropic Claude 3.5, pip 24.0