LangGraph Human-in-the-Loop: Approval Gates in AI Workflows

Add human approval gates to LangGraph agents using interrupt() and checkpointers. Stop runaway AI actions before they hit production systems.

Problem: Your AI Agent Acts Before You Can Stop It

You build a LangGraph agent that deletes records, sends emails, or calls paid APIs. It works in testing. Then in production it misreads context and fires off 200 emails to customers at 2am.

Human-in-the-loop (HITL) patterns solve this by pausing the graph at a checkpoint and waiting for explicit human approval before continuing — or abandoning the run entirely.

You'll learn:

  • How to pause a LangGraph graph mid-run using interrupt()
  • How to wire a MemorySaver checkpointer so state survives the pause
  • How to resume or reject a run from a separate process or API endpoint

Time: 25 min | Difficulty: Intermediate


How LangGraph Pause-and-Resume Works

LangGraph graphs run as a series of nodes. Normally they execute start-to-finish. With a checkpointer attached, the graph serializes full state to storage at every step — this is what makes pause-and-resume possible.

User input
    │
    ▼
[plan_node] ──────────────────────────────────▶ state saved
    │
    ▼
[interrupt()] ◀── graph pauses here, returns to caller
    │
    │   (human reviews, approves or rejects via separate call)
    │
    ▼
[execute_node] ──── resumes with original state + human decision
    │
    ▼
Output

The key objects:

  • interrupt(value) — raises a special signal that pauses the graph and surfaces value to the caller. This guide targets the stable API in LangGraph 0.3+.
  • MemorySaver — in-process checkpointer for development. Swap for PostgresSaver (or AsyncPostgresSaver in async apps) in production.
  • thread_id — every run needs one. It's the key that links a paused run to its resumed continuation.
  • Command(resume=...) — the object you pass to .invoke() to resume a paused graph.

Solution

Step 1: Install Dependencies

# LangGraph 0.3+ required for stable interrupt() API
pip install "langgraph>=0.3.0" "langchain-openai>=0.3.0"   # quote the specifiers so the shell doesn't treat > as a redirect
# Verify
python -c "import langgraph; print(langgraph.__version__)"

Expected: 0.3.x


Step 2: Define Your State and Tools

# graph.py
from typing import Annotated, TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
from langgraph.types import interrupt, Command
from langchain_openai import ChatOpenAI

# State schema — add whatever your workflow needs
class WorkflowState(TypedDict):
    user_request: str
    plan: str
    human_decision: str   # "approved" | "rejected"
    result: str

llm = ChatOpenAI(model="gpt-4o-mini")

Step 3: Build the Nodes

Each node is a plain function that takes state and returns a partial state update.

def plan_node(state: WorkflowState) -> dict:
    """Generate an action plan from the user request."""
    response = llm.invoke(
        f"Create a concise action plan for: {state['user_request']}\n"
        "List the exact steps you will take. Be specific."
    )
    return {"plan": response.content}


def approval_gate(state: WorkflowState) -> dict:
    """Pause here and surface the plan to a human reviewer."""
    # interrupt() halts graph execution and returns this value to the caller.
    # The graph will not proceed until resumed with Command(resume=...).
    # NOTE: on resume, this node function re-runs from the top, so keep side
    # effects (API calls, writes) out of any code placed before interrupt().
    decision = interrupt({
        "message": "Review and approve this action plan before execution.",
        "plan": state["plan"],
        "request": state["user_request"],
    })
    # decision is whatever the human passed into Command(resume=decision)
    return {"human_decision": decision}


def execute_node(state: WorkflowState) -> dict:
    """Only runs if human approved. Simulates the actual action."""
    if state["human_decision"] != "approved":
        return {"result": f"Run rejected by reviewer. Reason: {state['human_decision']}"}

    # Replace this with your real action: API call, DB write, email send, etc.
    response = llm.invoke(
        f"Execute this plan and summarize what you did:\n{state['plan']}"
    )
    return {"result": response.content}

Step 4: Wire the Graph with a Checkpointer

# The checkpointer is what makes pause/resume possible.
# MemorySaver is fine for dev; swap for a Postgres-backed checkpointer in production.
checkpointer = MemorySaver()

builder = StateGraph(WorkflowState)

builder.add_node("plan", plan_node)
builder.add_node("approval_gate", approval_gate)
builder.add_node("execute", execute_node)

builder.add_edge(START, "plan")
builder.add_edge("plan", "approval_gate")
builder.add_edge("approval_gate", "execute")
builder.add_edge("execute", END)

# compile() with checkpointer enables state persistence
graph = builder.compile(checkpointer=checkpointer)
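A design alternative: instead of letting execute_node inspect the decision, you can route rejected runs straight to the end with a conditional edge. A minimal sketch — the router is a pure function of state; in LangGraph the END sentinel is the string "__end__", used literally here to keep the snippet self-contained:

```python
def route_after_approval(state: dict) -> str:
    """Return the next node name based on the reviewer's decision."""
    return "execute" if state["human_decision"] == "approved" else "__end__"

# Wiring (this replaces the fixed approval_gate -> execute edge above):
# builder.add_conditional_edges("approval_gate", route_after_approval)
```

With this wiring, execute_node can drop its rejection branch entirely — rejected runs simply never reach it.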

Step 5: Trigger the Run

# run_workflow.py
import uuid

# thread_id ties this run together across pause/resume.
# In production, persist this ID so your approval UI can reference it.
thread_id = str(uuid.uuid4())
config = {"configurable": {"thread_id": thread_id}}

print(f"Starting run: {thread_id}")

result = graph.invoke(
    {"user_request": "Delete all inactive users from the database who haven't logged in for 180 days"},
    config=config,
)

# When interrupt() fires, invoke() returns early.
# result holds the state accumulated so far plus an __interrupt__ entry — not the final output.
print("Graph paused. Interrupt payload:")
print(result)

Expected output:

Starting run: f3a1c2d4-...
Graph paused. Interrupt payload:
{'user_request': '...', 'plan': '1. Query users WHERE last_login < NOW() - INTERVAL 180 days\n2. ...', '__interrupt__': (...)}

The __interrupt__ key signals that the graph is paused and waiting.
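One way to branch on this in the calling code — a sketch assuming the 0.3 behavior where __interrupt__ holds a sequence of Interrupt objects whose .value attribute carries the dict you passed to interrupt():

```python
def extract_interrupt_payload(result: dict):
    """Return the first pending interrupt's payload, or None if the run finished.

    Assumes LangGraph's convention of surfacing pending interrupts under the
    "__interrupt__" key as a sequence of objects with a .value attribute.
    """
    interrupts = result.get("__interrupt__")
    if not interrupts:
        return None  # no pause: result holds the final state
    return interrupts[0].value

# Usage after graph.invoke(...):
# payload = extract_interrupt_payload(result)
# if payload is not None:
#     notify_reviewer(payload)   # hypothetical: push to your approval UI
```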


Step 6: Resume with a Human Decision

This runs in a separate process, a webhook handler, or an admin UI — anything that has the thread_id.

# approve_or_reject.py
from graph import graph   # import the compiled graph with its checkpointer
from langgraph.types import Command

# Same thread_id from the initial run
thread_id = "f3a1c2d4-..."  # replace with real ID from your DB or queue
config = {"configurable": {"thread_id": thread_id}}

# --- APPROVE ---
final_result = graph.invoke(
    Command(resume="approved"),  # this string lands in state["human_decision"]
    config=config,
)
print(final_result["result"])

# --- REJECT with reason ---
# final_result = graph.invoke(
#     Command(resume="rejected: too broad, add a dry-run step first"),
#     config=config,
# )

Expected (approved):

Executed plan: Queried users table for last_login < 180 days ago, found 42 records,
deleted them, and logged the operation to audit_log table.

Expected (rejected):

Run rejected by reviewer. Reason: rejected: too broad, add a dry-run step first

Step 7: Inspect State Without Resuming

You can read the paused state at any time — useful for building an approval UI.

from graph import graph

config = {"configurable": {"thread_id": "f3a1c2d4-..."}}

# get_state() does not resume the graph
snapshot = graph.get_state(config)

print("Next node:", snapshot.next)          # ('approval_gate',)
print("Plan to review:", snapshot.values["plan"])
print("Pending interrupts:", snapshot.tasks)
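For an approval UI backend, this is easy to wrap into a status helper. A sketch that relies only on the snapshot's .next and .values attributes shown above:

```python
def approval_status(graph, thread_id: str) -> dict:
    """Summarize a paused run for an approval UI without resuming it."""
    snapshot = graph.get_state({"configurable": {"thread_id": thread_id}})
    paused = bool(snapshot.next)  # empty tuple means the run has finished
    return {
        "paused": paused,
        "waiting_on": snapshot.next[0] if paused else None,
        "plan": snapshot.values.get("plan"),
    }
```

Serve this from a GET endpoint and your reviewers can poll run status without any risk of accidentally resuming the graph.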

Production Swap: Postgres Checkpointer

MemorySaver lives in process memory. Restart your server and all paused runs are gone.

# requirements: pip install langgraph-checkpoint-postgres psycopg[binary]
from langgraph.checkpoint.postgres import PostgresSaver

DB_URL = "postgresql://user:pass@localhost:5432/langgraph"

with PostgresSaver.from_conn_string(DB_URL) as checkpointer:
    checkpointer.setup()  # creates tables on first run
    graph = builder.compile(checkpointer=checkpointer)
    # Keep all graph usage inside this block: the connection closes when the
    # context manager exits. Long-running servers should open the saver once
    # at startup and hold it for the process lifetime.

With Postgres, paused runs survive server restarts and can be approved hours or days later. The thread_id is your durable reference.


Verification

Run the full flow end-to-end:

python run_workflow.py
# Copy the thread_id from output, paste into approve_or_reject.py
python approve_or_reject.py

You should see the execute node print its result summary only after the Command(resume="approved") call. Resuming with any other string skips execution and prints the rejection message.

Check that state is correctly checkpointed between steps:

# After run_workflow.py but before approve_or_reject.py
snapshot = graph.get_state({"configurable": {"thread_id": thread_id}})
assert snapshot.next == ("approval_gate",), "Graph should be paused at approval_gate"
print("✅ State correctly persisted at approval gate")

Production Considerations

Make thread_id durable. Write it to your database the moment you generate it. If the calling process crashes before you store it, you can't resume that run.

Set a timeout policy. Paused runs left indefinitely consume checkpointer storage. Use a background job to reject runs older than your SLA (24h, 72h, etc.) by calling graph.invoke(Command(resume="rejected: approval timeout"), config=config).
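The selection half of that background job can stay independent of LangGraph. A sketch — runs and its (thread_id, started_at) schema are hypothetical stand-ins for however you persist run metadata, since LangGraph does not track start times for you; each returned id is then rejected via graph.invoke(Command(resume="rejected: approval timeout"), config=...) as in Step 6:

```python
from datetime import datetime, timedelta, timezone

def stale_thread_ids(runs, max_age_hours=24, now=None):
    """Pick paused runs older than the SLA.

    runs: iterable of (thread_id, started_at) tuples from your own store
    (hypothetical schema). started_at must be timezone-aware.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=max_age_hours)
    return [tid for tid, started_at in runs if started_at < cutoff]
```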

Surface the interrupt payload cleanly. The dict you pass to interrupt() is what your approval UI will render. Include everything a reviewer needs to make a decision: the plan, the scope, estimated impact, and a link to relevant context.
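A sketch of such a payload builder — the estimated_impact and context_url fields are illustrative additions, not anything LangGraph requires:

```python
def build_review_payload(state: dict, estimated_impact: str, context_url: str) -> dict:
    """Assemble the dict handed to interrupt() so the reviewer sees everything."""
    return {
        "message": "Review and approve this action plan before execution.",
        "request": state["user_request"],
        "plan": state["plan"],
        "estimated_impact": estimated_impact,  # e.g. "42 rows deleted"
        "context_url": context_url,            # deep link into your admin tool
    }
```

Inside approval_gate, you would then call interrupt(build_review_payload(state, ...)) instead of constructing the dict inline.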

Don't use MemorySaver in multi-worker deployments. Each worker has its own memory. A run started on worker A can only be resumed on worker A. Use Postgres or Redis checkpointers for anything with more than one process.

Parallel approval gates are possible. For workflows that need two-person sign-off, use Send() to fan out to two approval nodes and merge with a join node that checks both decisions.


What You Learned

  • interrupt() pauses a LangGraph graph and surfaces a payload to the caller without losing state
  • MemorySaver enables pause-and-resume in development; swap to PostgresSaver for production
  • thread_id is the durable key that links a paused run to its resumption
  • Command(resume=value) is how you inject a human decision back into the graph
  • graph.get_state() lets you inspect paused runs non-destructively — useful for approval UIs

When NOT to use this pattern: Low-stakes, high-volume automations (formatting, classification, summarization) don't need approval gates. HITL adds latency and operational overhead. Reserve it for irreversible actions: deletes, sends, payments, infrastructure changes.

Tested on LangGraph 0.3.4, Python 3.12, macOS Sequoia and Ubuntu 24.04