Prevent AI Agents from Making Unauthorized API Purchases in 20 Minutes

Stop AI agents from spending your budget without approval. Implement spending limits, approval gates, and audit trails for agent-driven API calls.

Problem: Your AI Agent Is Spending Money Without Permission

You deployed an AI agent that can call paid APIs — Stripe, Twilio, OpenAI, cloud services. Now it's racking up charges you didn't authorize because nothing stops it from calling those endpoints freely.

You'll learn:

  • How to add a spending gate that requires human approval above a threshold
  • How to implement per-session budget caps with hard cutoffs
  • How to build an immutable audit trail for every purchase attempt

Time: 20 min | Level: Intermediate


Why This Happens

Most agent frameworks (LangChain, AutoGen, CrewAI) treat tool calls as equal — a read_file and a charge_customer have the same trust level by default. There's no built-in concept of "this action costs money." Without explicit guardrails, an agent optimizing for task completion will call paid APIs as freely as free ones.

Common symptoms:

  • Unexpected charges on cloud provider dashboards
  • API rate limits hit from excessive agent polling
  • Duplicate purchases from retry loops on failed tasks

Solution

Step 1: Classify Your Tools by Cost Risk

Before adding guards, split your agent's tools into tiers.

from enum import Enum
from dataclasses import dataclass
from typing import Callable, Any

class CostTier(Enum):
    FREE = "free"          # No cost: read ops, internal queries
    LOW = "low"            # < $0.10 per call: small LLM calls, SMS
    HIGH = "high"          # >= $0.10 per call: image gen, bulk email
    CRITICAL = "critical"  # Irreversible: charges, subscriptions, deletions

@dataclass
class GuardedTool:
    name: str
    fn: Callable
    tier: CostTier
    estimated_cost_usd: float = 0.0

Expected: Every tool in your agent has an explicit tier. If you're unsure, default to CRITICAL.
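As a concrete sketch, a tiered toolset might look like the following (the tool functions and cost figures are hypothetical placeholders, and the Step 1 definitions are repeated so the snippet runs standalone):

```python
from enum import Enum
from dataclasses import dataclass
from typing import Callable

class CostTier(Enum):  # Definitions from Step 1
    FREE = "free"
    LOW = "low"
    HIGH = "high"
    CRITICAL = "critical"

@dataclass
class GuardedTool:  # Definition from Step 1
    name: str
    fn: Callable
    tier: CostTier
    estimated_cost_usd: float = 0.0

# Hypothetical tools: names and cost figures are illustrative, not real API prices
def search_docs(query: str) -> str:
    return "..."

def send_sms(to: str, body: str) -> None:
    pass

def charge_customer(customer_id: str, amount: float) -> None:
    pass

TOOLS = [
    GuardedTool("search_docs", search_docs, CostTier.FREE),
    GuardedTool("send_sms", send_sms, CostTier.LOW, estimated_cost_usd=0.01),
    GuardedTool("charge_customer", charge_customer, CostTier.CRITICAL, estimated_cost_usd=25.00),
]

def tier_for(name: str) -> CostTier:
    """Look up a tool's tier; unknown tools default to CRITICAL (fail safe)."""
    for tool in TOOLS:
        if tool.name == name:
            return tool.tier
    return CostTier.CRITICAL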


Step 2: Implement a Budget Gate

This wrapper intercepts any HIGH or CRITICAL call and checks remaining budget before executing.

import threading
from decimal import Decimal

class BudgetExceededError(Exception):
    """Raised when a spend request would push the session past its hard limit."""

class AgentBudgetGate:
    def __init__(self, session_limit_usd: float, auto_approve_below_usd: float = 0.10):
        self.session_limit = Decimal(str(session_limit_usd))
        self.auto_approve_threshold = Decimal(str(auto_approve_below_usd))
        self.spent = Decimal("0")
        self._lock = threading.Lock()  # Safe for concurrent agent threads

    def request_spend(self, tool: GuardedTool, context: str) -> bool:
        cost = Decimal(str(tool.estimated_cost_usd))

        with self._lock:
            # Hard stop: session budget exhausted
            if self.spent + cost > self.session_limit:
                raise BudgetExceededError(
                    f"Session limit ${self.session_limit} reached. "
                    f"Spent: ${self.spent}, Requested: ${cost}"
                )

            # Auto-approve small, non-critical charges
            if tool.tier != CostTier.CRITICAL and cost <= self.auto_approve_threshold:
                self.spent += cost
                return True

        # Require human approval for everything else
        return self._human_approval_required(tool, cost, context)

    def _human_approval_required(self, tool: GuardedTool, cost: Decimal, context: str) -> bool:
        # Replace with your notification system (Slack, email, webhook)
        print(f"\n[APPROVAL REQUIRED]\nTool: {tool.name}\nCost: ${cost}\nContext: {context}")
        response = input("Approve? (yes/no): ").strip().lower()

        if response == "yes":
            with self._lock:
                # Re-check: another thread may have spent while we waited for a human
                if self.spent + cost > self.session_limit:
                    raise BudgetExceededError(
                        f"Budget exhausted while awaiting approval for {tool.name}"
                    )
                self.spent += cost
            return True

        return False  # Agent gets False — it should handle gracefully, not retry

If it fails:

  • BudgetExceededError on first call: Your session_limit is lower than estimated_cost_usd for that tool — raise the limit or fix the cost estimate.
  • _human_approval_required never fires: the call was auto-approved. Confirm the tool's tier is CRITICAL, or that its estimated_cost_usd is above auto_approve_below_usd.
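The input() prompt in _human_approval_required only works in an interactive terminal. For a headless service, one fail-closed pattern is to block on a queue that your notification callback (Slack action handler, approval webhook) feeds, and deny if nobody answers in time. A sketch using only the standard library; the function and parameter names are illustrative:

```python
import queue

def wait_for_approval(prompt: str, approvals: queue.Queue, timeout_s: float = 300.0) -> bool:
    """Fail closed: if no human responds within timeout_s, the spend is denied."""
    print(prompt)  # In production, send this to Slack/email/webhook instead
    try:
        return bool(approvals.get(timeout=timeout_s))
    except queue.Empty:
        return False  # No answer is a "no"
```

Your webhook handler pushes True or False into the queue when the human clicks approve or deny; _human_approval_required then calls this instead of input().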

Step 3: Build an Immutable Audit Log

Every purchase attempt — approved or denied — must be logged before execution, not after. Logging after means a crash loses the record.

import json
import hashlib
import time
from pathlib import Path

class PurchaseAuditLog:
    def __init__(self, log_path: str = "agent_purchases.jsonl"):
        self.log_path = Path(log_path)
        self._prev_hash = self._last_hash()  # Resume the chain across restarts

    def record(self, tool_name: str, cost_usd: float, approved: bool,
               context: str, session_id: str) -> str:
        entry = {
            "timestamp": time.time(),
            "session_id": session_id,
            "tool": tool_name,
            "cost_usd": cost_usd,
            "approved": approved,
            "context": context[:500],  # Truncate to avoid log bloat
            "prev_hash": self._prev_hash,  # Link each entry to its predecessor
        }

        # Chain hash for tamper detection: covers this entry's content,
        # including the previous entry's hash
        entry["hash"] = self._hash_entry(entry)

        # Append-only write — never overwrite existing entries
        with open(self.log_path, "a") as f:
            f.write(json.dumps(entry) + "\n")

        self._prev_hash = entry["hash"]
        return entry["hash"]

    def _last_hash(self) -> str:
        # Pick up an existing log's chain, or start a fresh one
        if not self.log_path.exists():
            return "genesis"
        lines = self.log_path.read_text().strip().splitlines()
        return json.loads(lines[-1])["hash"] if lines else "genesis"

    def _hash_entry(self, entry: dict) -> str:
        # Hash the content so alterations are detectable
        content = json.dumps({k: v for k, v in entry.items() if k != "hash"}, sort_keys=True)
        return hashlib.sha256(content.encode()).hexdigest()[:16]

Why this works: JSONL (one JSON object per line) is append-only by convention. The chain hash means if someone edits a past entry, the hash no longer matches — you'll know.
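To check the log later, a small verifier can recompute each line's hash (mirroring _hash_entry above) and report any entry whose stored hash no longer matches its content. A sketch assuming the JSONL layout from Step 3:

```python
import json
import hashlib
from pathlib import Path

def verify_audit_log(log_path: str = "agent_purchases.jsonl") -> list:
    """Return 1-based line numbers of entries whose content no longer matches their hash."""
    tampered = []
    for lineno, line in enumerate(Path(log_path).read_text().splitlines(), start=1):
        entry = json.loads(line)
        # Recompute the hash over every field except the stored hash itself
        content = json.dumps(
            {k: v for k, v in entry.items() if k != "hash"}, sort_keys=True
        )
        if entry.get("hash") != hashlib.sha256(content.encode()).hexdigest()[:16]:
            tampered.append(lineno)
    return tampered
```

An empty list means no entry has been altered since it was written; run this on a schedule or as a CI check.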


Step 4: Wire It Into Your Agent

Hook the gate and audit log into your tool executor.

def execute_tool(tool: GuardedTool, args: dict, context: str,
                 gate: AgentBudgetGate, audit: PurchaseAuditLog, session_id: str) -> Any:

    if tool.tier in (CostTier.HIGH, CostTier.CRITICAL):
        approved = False
        try:
            approved = gate.request_spend(tool, context)
        finally:
            # Log the attempt regardless of outcome — before execution
            audit.record(tool.name, tool.estimated_cost_usd, approved, context, session_id)

        if not approved:
            # Return a structured refusal — don't raise, let the agent decide next step
            return {"error": "purchase_not_approved", "tool": tool.name}

    return tool.fn(**args)

Expected: Every paid tool call now passes through the gate. The agent receives a clear purchase_not_approved error it can reason about instead of silently failing.
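On the agent side, the refusal is easiest to handle by converting it into an observation the model can reason about. A minimal sketch; handle_result is a hypothetical helper sitting between execute_tool and the agent's next prompt:

```python
def handle_result(result) -> str:
    """Turn a structured refusal into guidance instead of letting the agent retry blindly."""
    if isinstance(result, dict) and result.get("error") == "purchase_not_approved":
        return (
            f"Tool '{result['tool']}' was denied by the budget gate. "
            "Do not retry it. Propose a cheaper alternative or ask the user how to proceed."
        )
    return str(result)
```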


Verification

Run a quick integration test against your toolset:

# test_budget_gate.py
import pytest
from your_module import AgentBudgetGate, GuardedTool, CostTier, BudgetExceededError

def test_hard_stop_at_budget():
    gate = AgentBudgetGate(session_limit_usd=0.50, auto_approve_below_usd=0.10)
    cheap_tool = GuardedTool("sms", lambda: None, CostTier.LOW, estimated_cost_usd=0.05)

    # Auto-approve 5 cheap calls ($0.05 x 5 = $0.25 spent, under limit)
    for _ in range(5):
        gate.request_spend(cheap_tool, "test")

    expensive_tool = GuardedTool("bulk_email", lambda: None, CostTier.HIGH, estimated_cost_usd=0.30)

    # This should hit the budget ceiling ($0.25 + $0.30 > $0.50)
    with pytest.raises(BudgetExceededError):
        gate.request_spend(expensive_tool, "test")

Run it:

pytest test_budget_gate.py -v

You should see: test_hard_stop_at_budget PASSED

All assertions pass — the hard stop fires exactly at the session limit.


What You Learned

  • Classifying tools by cost tier is the foundation — without it, no guardrail works reliably
  • Log purchase attempts before execution, not after — crashes between the two lose records
  • Return structured errors to the agent on denial; raising exceptions causes unpredictable retry behavior

Limitation: This guards against accidental overspending, not a compromised agent prompt. If an attacker controls the agent's context, they can craft approval-looking responses. Pair this with prompt injection defenses for production systems.

When NOT to use this: Fully automated pipelines where human approval would block the workflow. There, set auto_approve_below_usd equal to session_limit_usd so every non-CRITICAL call within budget proceeds automatically, and rely entirely on the hard cap (drop CRITICAL-tier tools from the toolset, since they always request approval).
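With the Step 2 gate, a call auto-approves only when its cost is at or below the threshold, so unattended operation means setting the threshold to the session limit. A config sketch; the dollar amounts are illustrative:

```python
# Unattended mode: no human in the loop.
# Any non-CRITICAL call that fits the remaining budget auto-approves;
# the hard session cap is the only control. CRITICAL-tier tools should
# be removed from the agent's toolset entirely in this mode.
gate = AgentBudgetGate(session_limit_usd=5.00, auto_approve_below_usd=5.00)
```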


Tested on Python 3.12, LangChain 0.3.x, AutoGen 0.4.x — gate logic is framework-agnostic