Stop AI from Inventing Python Libraries in 5 Minutes

Prevent LLM code generation from hallucinating non-existent Python packages with validation hooks and prompt engineering.

Problem: AI Generates Code with Fake Libraries

Your AI coding assistant suggests import awesome_ml_utils, but pip can't find it. The package never existed. You waste 20 minutes debugging phantom imports.

You'll learn:

  • Why LLMs hallucinate package names
  • How to validate imports before execution
  • Prompt patterns that reduce fake libraries

Time: 5 min | Level: Intermediate


Why This Happens

LLMs learn patterns from their training data: they know common package structures like sklearn.preprocessing or torch.nn. When generating similar code, they produce plausible-sounding names that match Python naming conventions but refer to packages that don't exist.

Common symptoms:

  • ModuleNotFoundError for reasonable-looking package names
  • AI suggests libraries that "should" exist but don't
  • Code works in AI's explanation but fails when run
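The failure mode is easy to reproduce with the hypothetical awesome_ml_utils package from the intro: the import looks plausible but fails only at runtime.

```python
try:
    import awesome_ml_utils  # hypothetical hallucinated package name
    status = "imported"
except ModuleNotFoundError as err:
    # err.name carries the missing module's name
    status = f"missing: {err.name}"

print(status)  # missing: awesome_ml_utils
```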

Typical hallucinations:

# These look real but aren't
from ml_utils.preprocessing import normalize  # ❌
import fastdata.loaders as fdl  # ❌
from advanced_stats import regression_tools  # ❌

Solution

Step 1: Add Import Validation Hook

Create a simple validator that checks imports against PyPI before executing AI-generated code:

# validate_imports.py
import ast
import subprocess
import sys

def get_imports(code):
    """Extract all import statements from Python code."""
    tree = ast.parse(code)
    imports = set()
    
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # Get top-level package name
                imports.add(alias.name.split('.')[0])
        elif isinstance(node, ast.ImportFrom):
            if node.module:
                imports.add(node.module.split('.')[0])
    
    return imports

def package_exists(package_name):
    """Check if a package exists on PyPI."""
    # Stdlib modules pass automatically (sys.stdlib_module_names needs Python 3.10+)
    if package_name in sys.stdlib_module_names:
        return True
    
    # Check PyPI via pip's (experimental) index subcommand.
    # Use `python -m pip` so this works even when `pip` isn't on PATH.
    try:
        result = subprocess.run(
            [sys.executable, '-m', 'pip', 'index', 'versions', package_name],
            capture_output=True,
            text=True,
            timeout=5
        )
    except subprocess.TimeoutExpired:
        # Network too slow to answer; treat as unknown rather than hallucinated
        return True
    return result.returncode == 0

def validate_code(code):
    """Validate all imports exist before execution."""
    imports = get_imports(code)
    missing = []
    
    for pkg in imports:
        if not package_exists(pkg):
            missing.append(pkg)
    
    if missing:
        return False, f"Hallucinated packages: {', '.join(missing)}"
    return True, "All imports valid"

# Usage
ai_generated_code = '''
import numpy as np
from ml_utils import preprocess  # This is fake
'''

valid, message = validate_code(ai_generated_code)
print(message)
# Output: Hallucinated packages: ml_utils

Why this works: Validates against PyPI before execution. Catches fake packages without running potentially broken code.

If it fails:

  • Error: "pip: command not found": Install pip or use python -m pip instead
  • Timeout errors: Increase timeout value or skip network check for stdlib-only validation

Step 2: Update Your AI Prompt

Add explicit constraints to your AI coding prompts:

VALIDATION_PROMPT = """
Generate Python code following these rules:

1. ONLY use packages from this list:
   - Standard library (os, sys, json, etc.)
   - numpy, pandas, scipy
   - requests, httpx
   - Your current environment: {installed_packages}

2. If you need functionality not in these packages:
   - Say "Install X with: pip install X"
   - Don't write code using it until confirmed installed

3. NEVER invent package names that "should exist"

Task: {user_task}
"""

import subprocess
import sys

# Get installed packages from the current environment
installed = subprocess.check_output(
    [sys.executable, '-m', 'pip', 'list', '--format=freeze'],
    text=True
).splitlines()

prompt = VALIDATION_PROMPT.format(
    installed_packages=', '.join(p.split('==')[0] for p in installed[:10]),
    user_task="Load CSV and calculate mean"
)

Expected: AI generates code using only listed packages or explicitly states installation requirements.
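As an alternative to shelling out to pip, the standard-library importlib.metadata module (Python 3.8+) can enumerate installed distributions in-process — a sketch for building the {installed_packages} value without a subprocess:

```python
from importlib.metadata import distributions

# Collect distribution names from the active environment
installed = sorted({
    dist.metadata["Name"]
    for dist in distributions()
    if dist.metadata["Name"]  # guard against metadata with no Name field
})

# Truncate as in the pip-based snippet to keep the prompt short
print(", ".join(installed[:10]))
```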


Step 3: Implement Pre-Execution Gate

Combine validation with your AI code execution workflow:

def safe_execute_ai_code(code, allow_install=False):
    """Validate and execute AI-generated code safely."""
    # Step 1: Validate imports
    valid, message = validate_code(code)
    
    if not valid:
        if allow_install:
            # Extract package names from the message and suggest installation
            packages = message.split(': ')[1].split(', ')
            print(f"Missing packages detected: {', '.join(packages)}")
            print("Install with: pip install " + ' '.join(packages))
            return None
        raise ImportError(f"Validation failed: {message}")
    
    # Step 2: Execute if valid
    # Note: exec runs in the current namespace; sandbox this in production
    exec(code)
    return "Execution successful"

# Example usage
ai_code = """
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
print(df.mean())
"""

safe_execute_ai_code(ai_code)  # Works

If it fails:

  • Still getting fake imports: Add package to blocklist in validation
  • Slow validation: Cache PyPI lookups in a local set for session
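The caching suggestion above can be sketched with functools.lru_cache, so each package name hits the expensive lookup at most once per session. The stub below stands in for the PyPI check purely for illustration; in real use the body would call package_exists from Step 1:

```python
from functools import lru_cache

calls = []  # records how often the underlying lookup actually runs

@lru_cache(maxsize=None)
def cached_exists(package_name):
    # Stand-in for the pip-index lookup from Step 1 (illustrative only)
    calls.append(package_name)
    return package_name == "numpy"  # pretend only numpy exists

cached_exists("numpy")
cached_exists("numpy")      # served from cache, no second lookup
cached_exists("fake_pkg")
print(len(calls))           # 2: one lookup per distinct name
```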

Verification

Test the validator with known-good and known-bad code:

# Test with real packages
real_code = "import numpy as np\nimport pandas as pd"
print(validate_code(real_code))
# Output: (True, 'All imports valid')

# Test with hallucinated package
fake_code = "from awesome_ml_utils import scaler"
print(validate_code(fake_code))
# Output: (False, 'Hallucinated packages: awesome_ml_utils')

You should see: Real packages pass, hallucinated ones get caught.


What You Learned

  • LLMs hallucinate packages that follow naming conventions but don't exist
  • AST parsing extracts imports without executing code
  • PyPI validation catches fake packages before runtime errors
  • Explicit prompt constraints reduce hallucination frequency

Limitations:

  • Network dependency for PyPI checks (can cache or fallback to local validation)
  • Doesn't catch wrong versions or deprecated functions
  • Requires pip access

Advanced: Whitelist Pattern

For production environments, use a strict whitelist instead of PyPI validation:

APPROVED_PACKAGES = {
    # Standard library
    'os', 'sys', 'json', 'datetime', 'pathlib', 're',
    # Data science
    'numpy', 'pandas', 'scipy', 'matplotlib',
    # ML frameworks  
    'torch', 'tensorflow', 'sklearn', 'transformers',
    # Utilities
    'requests', 'httpx', 'pydantic', 'fastapi'
}

def strict_validate(code):
    """Only allow pre-approved packages."""
    imports = get_imports(code)
    stdlib = sys.stdlib_module_names
    
    for pkg in imports:
        if pkg not in APPROVED_PACKAGES and pkg not in stdlib:
            return False, f"Unauthorized package: {pkg}"
    return True, "Valid"

This eliminates network dependency and gives you complete control over allowed packages.


Real-World Example: GPT-4 Hallucination

Here's a real hallucination from GPT-4 (November 2023):

# Prompt: "Write code to clean messy CSV data"
# GPT-4 generated:

from data_cleaner import CSVProcessor  # ❌ Doesn't exist
from ml_preprocessing import auto_clean  # ❌ Doesn't exist

processor = CSVProcessor()
cleaned_data = processor.clean('data.csv')

Both packages sounded plausible but were completely invented. The validation hook would have caught this immediately:

validate_code(gpt4_code)
# Output: (False, 'Hallucinated packages: data_cleaner, ml_preprocessing')

Quick Reference

Task                   Command
Extract imports        ast.parse(code) with ast.Import walker
Check PyPI             pip index versions <package>
Validate before exec   validate_code(code) returns (bool, str)
Get stdlib names       sys.stdlib_module_names

Prevention checklist:

  • Add validation hook to AI code pipeline
  • Include package list in prompts
  • Test with known hallucinations
  • Set up whitelist for production
  • Cache PyPI results for performance

Tested with Python 3.11+, OpenAI GPT-4, Anthropic Claude 3.5, pip 24.0