LangSmith with LangGraph: Trace Multi-Agent Workflows

Set up LangSmith tracing for LangGraph multi-agent workflows. Debug node execution, inspect tool calls, and monitor agent state in production.

Problem: Multi-Agent Workflows Fail Silently

LangGraph agents fail mid-graph and you don't know which node broke, what state was passed, or which tool call returned bad data. print() debugging doesn't scale across parallel branches.

LangSmith gives you full trace visibility — every node, every LLM call, every tool result — with zero changes to your graph logic.

You'll learn:

  • Connect LangSmith tracing to any LangGraph workflow in under 5 minutes
  • Read traces to debug node failures and unexpected state mutations
  • Tag runs by agent, thread, and environment for production filtering

Time: 20 min | Difficulty: Intermediate


Why LangGraph Tracing Is Hard Without LangSmith

LangGraph executes nodes as a state machine. A single user request can trigger 10–30 LLM calls across multiple agents, tool calls, and conditional branches. Standard logging gives you:

INFO:root:Node supervisor called
INFO:root:Node researcher called
INFO:root:Node writer called

That tells you nothing about what state entered each node, what the LLM was prompted with, or why the router sent execution down one branch instead of another.

LangSmith captures the full execution tree — inputs, outputs, latency, token count, and errors — for every node in the graph.


Solution

Step 1: Install Dependencies

# LangSmith SDK + LangGraph — pin versions to avoid breaking changes
pip install "langsmith~=0.2.0" "langgraph~=0.2.0" "langchain-openai~=0.2.0"

Expected output:

Successfully installed langsmith-0.2.x langgraph-0.2.x

If it fails:

  • pip: command not found → Use pip3 or activate your virtualenv first
  • Version conflict on langchain-core → Run pip install --upgrade langchain-core first
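Before moving on, it can help to confirm what actually got installed. This is a small sketch (the helper name `installed_versions` is illustrative, not part of any SDK) that reads installed distribution versions from package metadata:

```python
# Sanity-check installed versions -- mismatched langchain-core is the
# most common source of import errors with this stack.
import importlib.metadata as md

def installed_versions(packages):
    """Return a {package: version-or-None} map for the given distributions."""
    out = {}
    for pkg in packages:
        try:
            out[pkg] = md.version(pkg)
        except md.PackageNotFoundError:
            out[pkg] = None  # not installed in this environment
    return out

if __name__ == "__main__":
    for pkg, ver in installed_versions(
        ["langsmith", "langgraph", "langchain-core"]
    ).items():
        print(f"{pkg}: {ver or 'NOT INSTALLED'}")
```

Run it inside the same virtualenv you'll run the agent from; a `NOT INSTALLED` line means `pip` installed into a different environment.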

Step 2: Configure LangSmith Environment Variables

# Get your API key from https://smith.langchain.com → Settings → API Keys
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY="ls__your_key_here"
export LANGCHAIN_PROJECT="multi-agent-dev"   # Traces group under this project name

Set these in your .env file for persistent config:

# .env
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=ls__your_key_here
LANGCHAIN_PROJECT=multi-agent-dev
OPENAI_API_KEY=sk-your_openai_key

Load them in Python:

from dotenv import load_dotenv
load_dotenv()  # Reads .env — call before instantiating models or invoking the graph

That's the entire setup. LangSmith auto-instruments LangGraph once LANGCHAIN_TRACING_V2=true is set.
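Because a missing or empty variable silently disables tracing (rather than raising), a preflight check at startup saves debugging time later. A minimal sketch, with the helper name `missing_tracing_vars` being illustrative:

```python
import os

REQUIRED = ("LANGCHAIN_TRACING_V2", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT")

def missing_tracing_vars(env=None):
    """Return the tracing variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]

if __name__ == "__main__":
    missing = missing_tracing_vars()
    if missing:
        raise SystemExit(f"Tracing disabled -- set: {', '.join(missing)}")
    print("LangSmith tracing is configured.")
```

Call this right after `load_dotenv()` so a misconfigured environment fails loudly instead of producing an empty project in the LangSmith UI.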


Step 3: Build a Traceable Multi-Agent Graph

Here's a two-agent graph — a supervisor routing between a researcher and a writer — with tracing enabled by default:

import os
from typing import Annotated, TypedDict, Literal
from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage

# State shared across all nodes
class AgentState(TypedDict):
    messages: list
    next_agent: str
    research_output: str

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# ── Supervisor node ──────────────────────────────────────────────
def supervisor(state: AgentState) -> AgentState:
    """Routes to researcher or writer based on current state."""
    # Simple router — LangSmith will show this decision in the trace
    if not state.get("research_output"):
        return {"next_agent": "researcher"}
    return {"next_agent": "writer"}

# ── Researcher node ──────────────────────────────────────────────
def researcher(state: AgentState) -> AgentState:
    """Calls LLM to produce research summary. LangSmith traces this LLM call."""
    response = llm.invoke([
        HumanMessage(content=f"Research this topic: {state['messages'][0].content}")
    ])
    return {
        "research_output": response.content,
        "messages": state["messages"] + [response]
    }

# ── Writer node ──────────────────────────────────────────────────
def writer(state: AgentState) -> AgentState:
    """Drafts output from research. LangSmith traces this as a child of the run."""
    response = llm.invoke([
        HumanMessage(content=f"Write a short report based on: {state['research_output']}")
    ])
    return {
        "messages": state["messages"] + [response]
    }

# ── Router function ──────────────────────────────────────────────
def route(state: AgentState) -> Literal["researcher", "writer", "__end__"]:
    """Conditional edge — visible as a branch decision in LangSmith."""
    next_agent = state.get("next_agent", "researcher")
    if next_agent == "researcher":
        return "researcher"
    if next_agent == "writer":
        return "writer"
    return "__end__"

# ── Build graph ──────────────────────────────────────────────────
builder = StateGraph(AgentState)
builder.add_node("supervisor", supervisor)
builder.add_node("researcher", researcher)
builder.add_node("writer", writer)

builder.set_entry_point("supervisor")
builder.add_conditional_edges("supervisor", route)
builder.add_edge("researcher", "supervisor")  # Loop back after research
builder.add_edge("writer", END)

graph = builder.compile()
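One advantage of keeping the router as plain Python is that its branch decisions can be unit-tested with no API key and no LLM calls, before you ever look at a trace. A standalone sketch of the same logic:

```python
# Standalone copy of the route logic -- verifies the branch decisions
# LangSmith will later display, without touching the LLM.
def route(state: dict) -> str:
    next_agent = state.get("next_agent", "researcher")
    if next_agent in ("researcher", "writer"):
        return next_agent
    return "__end__"

# One assertion per branch the supervisor can take:
assert route({"next_agent": "researcher"}) == "researcher"
assert route({"next_agent": "writer"}) == "writer"
assert route({"next_agent": "done"}) == "__end__"
assert route({}) == "researcher"  # default when next_agent is unset
```

If a trace later shows execution taking an unexpected branch, you can reproduce the decision here with the exact state snapshot LangSmith captured.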

Step 4: Run the Graph and Inspect Traces

# Run — LangSmith captures the full execution automatically
result = graph.invoke({
    "messages": [HumanMessage(content="Explain vector embeddings for RAG")],
    "next_agent": "",
    "research_output": ""
})

print(result["messages"][-1].content)

Expected output:

Vector embeddings are numerical representations of text...

Open smith.langchain.com → your project → you'll see a tree like:

▼ RunnableSequence                     2.3s   1,240 tokens
  ▼ supervisor                           12ms
  ▼ researcher                          890ms
      ▼ ChatOpenAI (gpt-4o-mini)        860ms   620 tokens
  ▼ supervisor                            8ms
  ▼ writer                             1.1s
      ▼ ChatOpenAI (gpt-4o-mini)        980ms   590 tokens

Every node is clickable. You see exact inputs, outputs, and latency.
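You can also pull run data programmatically (e.g. via `langsmith.Client().list_runs(project_name="multi-agent-dev")`) and aggregate it offline. This sketch assumes run records reduced to simple dicts with `name` and `latency_ms` keys; the field names and the `latency_by_node` helper are illustrative, not SDK API:

```python
from collections import defaultdict

def latency_by_node(runs):
    """Sum latency per node name, slowest first, to find the bottleneck."""
    totals = defaultdict(float)
    for run in runs:
        totals[run["name"]] += run["latency_ms"]
    return dict(sorted(totals.items(), key=lambda kv: -kv[1]))

# Example records matching the trace tree above
sample = [
    {"name": "supervisor", "latency_ms": 12},
    {"name": "researcher", "latency_ms": 890},
    {"name": "supervisor", "latency_ms": 8},
    {"name": "writer", "latency_ms": 1100},
]
print(latency_by_node(sample))
# {'writer': 1100.0, 'researcher': 890.0, 'supervisor': 20.0}
```

This kind of offline rollup is handy for comparing latency across many runs, which the per-trace UI view doesn't do directly.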


Step 5: Add Run Metadata for Production Filtering

Tag runs so you can filter by user, session, or feature flag in LangSmith:

from langchain_core.runnables import RunnableConfig

config = RunnableConfig(
    run_name="multi-agent-report",          # Shows as the top-level trace name
    tags=["production", "report-pipeline"], # Filter by tag in LangSmith UI
    metadata={
        "user_id": "usr_123",               # Filter traces by user
        "session_id": "sess_abc",           # Group traces by session
        "agent_version": "v2.1"             # Track which agent version ran
    }
)

result = graph.invoke(
    {
        "messages": [HumanMessage(content="Explain vector embeddings for RAG")],
        "next_agent": "",
        "research_output": ""
    },
    config=config
)

In LangSmith, filter by metadata.user_id = "usr_123" to see every trace for that user across sessions.
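To keep tags and metadata consistent across every call site, it's worth centralizing config construction in one factory. Since `RunnableConfig` is a TypedDict, a plain dict with the same keys is accepted anywhere a config is; the `build_trace_config` helper below is a hypothetical name, a minimal sketch:

```python
def build_trace_config(user_id: str, session_id: str,
                       env: str = "production",
                       agent_version: str = "v2.1") -> dict:
    """Build a RunnableConfig-shaped dict with consistent trace metadata."""
    return {
        "run_name": "multi-agent-report",   # top-level trace name
        "tags": [env, "report-pipeline"],   # filterable in the LangSmith UI
        "metadata": {
            "user_id": user_id,
            "session_id": session_id,
            "agent_version": agent_version,
        },
    }

# Usage: graph.invoke(inputs, config=build_trace_config("usr_123", "sess_abc"))
```

A single factory means a renamed metadata key only has to change in one place, so your saved LangSmith filters keep working.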


Step 6: Trace Threads for Persistent Conversations

LangGraph supports stateful multi-turn conversations via thread_id. LangSmith groups these as a single thread trace:

from langgraph.checkpoint.memory import MemorySaver

# Add checkpointer to enable thread-level state persistence
memory = MemorySaver()
graph_with_memory = builder.compile(checkpointer=memory)

# Thread ID ties multiple invocations together in LangSmith
thread_config = RunnableConfig(
    configurable={"thread_id": "thread_456"},
    metadata={"user_id": "usr_123"}
)

# Turn 1
graph_with_memory.invoke(
    {"messages": [HumanMessage(content="What are vector embeddings?")], "next_agent": "", "research_output": ""},
    config=thread_config
)

# Turn 2 — LangSmith shows this as a continuation of the same thread
graph_with_memory.invoke(
    {"messages": [HumanMessage(content="How do they work with FAISS?")], "next_agent": "", "research_output": ""},
    config=thread_config
)

In LangSmith, open the thread view to see the full conversation history across turns with state snapshots between each.
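If multiple processes serve the same user, they must agree on the `thread_id` or LangSmith will show one conversation as several threads. One option is deriving it deterministically from stable identifiers; the `make_thread_id` helper here is illustrative:

```python
import hashlib

def make_thread_id(user_id: str, conversation_id: str) -> str:
    """Derive a stable thread_id so every process maps the same
    user + conversation pair to the same LangSmith thread."""
    digest = hashlib.sha256(
        f"{user_id}:{conversation_id}".encode()
    ).hexdigest()[:16]
    return f"thread_{digest}"

# Usage: configurable={"thread_id": make_thread_id("usr_123", "conv_1")}
```

Hashing also keeps raw user identifiers out of the thread ID itself, which helps if thread IDs end up in logs.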


Verification

# Run the script — should print the final agent response
python agent.py

Then verify in LangSmith:

  1. Go to smith.langchain.com → Projects → multi-agent-dev
  2. You should see a new run with child spans for each node
  3. Click researcher span → confirm input prompt and LLM response are visible
  4. Check Latency column — identifies which node is the bottleneck

You should see: A trace tree with supervisor → researcher → supervisor → writer and total token count on the root span.

If traces don't appear:

  • Empty project → Confirm LANGCHAIN_TRACING_V2=true is exported before running Python
  • AuthError → Regenerate API key in LangSmith Settings; key must start with ls__
  • Wrong project → Check LANGCHAIN_PROJECT env var matches the project name in the UI

What You Learned

  • Setting LANGCHAIN_TRACING_V2=true is the only required config — no code changes to the graph
  • Each LangGraph node becomes a child span; LLM calls inside nodes are nested one level deeper
  • RunnableConfig metadata lets you filter production traces by user, session, or version
  • Thread IDs connect multi-turn conversations into a single traceable thread

Limitation: LangSmith free tier stores traces for 14 days. For longer retention or high-volume production, use the LangSmith self-hosted Docker image or upgrade to a paid plan.

When NOT to use this approach: If you're running fully local models with no external API calls and have strict data residency requirements, self-host LangSmith rather than sending traces to smith.langchain.com.

Tested on LangSmith 0.2.x, LangGraph 0.2.x, Python 3.12, macOS & Ubuntu 24.04