LangGraph Memory: Short-Term and Long-Term Storage Patterns

Implement short-term and long-term memory in LangGraph agents using MemorySaver, checkpointers, and persistent stores. Production patterns for 2026.

What Is LangGraph Memory and Why It Matters

Most LangGraph agents are stateless by default — each invocation starts fresh with no knowledge of past interactions. That's fine for one-shot tasks. It breaks immediately for chatbots, task assistants, and any agent that needs context across turns or sessions.

LangGraph provides two distinct memory layers:

  • Short-term memory — state persisted within a single conversation thread (in-memory or checkpointed)
  • Long-term memory — facts, preferences, and summaries that survive across threads and restarts

Getting this right is what separates a demo agent from a production one. This article shows you exactly how both layers work, when to use each, and how to wire them together.

You'll learn:

  • How LangGraph's state and checkpointer system handles short-term memory
  • How to add persistent long-term storage with InMemoryStore and PostgreSQL
  • A production pattern combining both layers in a single agent

Time: 20 min | Difficulty: Intermediate


How LangGraph Memory Works

LangGraph models agent execution as a graph of nodes. Each invocation updates a typed state object. The framework persists snapshots of that state — called checkpoints — after every node executes.

User message
    │
    ▼
┌─────────────────────┐
│   Graph Invocation  │
│                     │
│  Node A → Node B    │  ◀── State flows through nodes
│       │             │
│  Checkpoint saved ──┼──▶ Checkpointer (short-term)
└─────────────────────┘
         │
         ▼
   Long-term Store  ◀── Agent explicitly reads/writes facts

Short-term memory is automatic once you attach a checkpointer. Long-term memory requires explicit read/write steps in your graph nodes.


Short-Term Memory: Checkpointers

Step 1: Define Your State

Start with a typed state that accumulates messages across turns.

from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langchain_core.messages import HumanMessage, AIMessage
from typing import Annotated
from typing_extensions import TypedDict

class AgentState(TypedDict):
    # add_messages reducer appends instead of replacing — critical for chat history
    messages: Annotated[list, add_messages]

The add_messages annotation is the key detail here. Without it, each node invocation overwrites the message list instead of appending to it.
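To see why the reducer matters, here's a minimal pure-Python sketch of the append behavior (a simplification: the real add_messages also deduplicates and updates messages by id):

```python
def append_reducer(current: list, update: list) -> list:
    # What an append-style reducer does: merge the update into the
    # existing list instead of replacing it
    return current + update

def replace_reducer(current: list, update: list) -> list:
    # Default behavior without a reducer: the update wins outright
    return update

history = ["Hi there"]
assert append_reducer(history, ["Hello!"]) == ["Hi there", "Hello!"]
assert replace_reducer(history, ["Hello!"]) == ["Hello!"]  # history lost
```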

Step 2: Attach the In-Memory Checkpointer

from langgraph.checkpoint.memory import MemorySaver
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-sonnet-4-20250514")

def call_model(state: AgentState):
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

builder = StateGraph(AgentState)
builder.add_node("model", call_model)
builder.add_edge(START, "model")
builder.add_edge("model", END)

# MemorySaver keeps checkpoints in RAM — fine for dev, lost on restart
checkpointer = MemorySaver()
graph = builder.compile(checkpointer=checkpointer)

Step 3: Use Thread IDs to Separate Conversations

The thread_id in the config is what scopes memory to a conversation. Different values = different memory contexts.

config = {"configurable": {"thread_id": "user-123-session-1"}}

# Turn 1
result = graph.invoke(
    {"messages": [HumanMessage(content="My name is Alex.")]},
    config=config
)
print(result["messages"][-1].content)

# Turn 2 — same thread_id, so graph remembers the previous turn
result = graph.invoke(
    {"messages": [HumanMessage(content="What's my name?")]},
    config=config
)
print(result["messages"][-1].content)
# Output: "Your name is Alex."

Expected output from turn 2:

Your name is Alex.

If it doesn't remember:

  • Verify you're passing the same thread_id in both calls
  • Confirm checkpointer is attached at compile(), not passed at invoke()

Persisting Short-Term Memory Across Restarts

MemorySaver is RAM-only. For production you need a durable checkpointer. LangGraph ships PostgreSQL and SQLite backends.

Step 4: Switch to SQLite Checkpointer

from langgraph.checkpoint.sqlite import SqliteSaver

# File-backed — survives restarts, good for single-server deployments
with SqliteSaver.from_conn_string("checkpoints.db") as checkpointer:
    graph = builder.compile(checkpointer=checkpointer)

    config = {"configurable": {"thread_id": "user-123"}}
    result = graph.invoke(
        {"messages": [HumanMessage(content="Remember: I prefer Python over TypeScript.")]},
        config=config
    )

Step 5: PostgreSQL for Multi-Instance Deployments

pip install langgraph-checkpoint-postgres psycopg

from langgraph.checkpoint.postgres import PostgresSaver

DB_URI = "postgresql://user:password@localhost:5432/agent_db"

with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    # Creates checkpoint tables on first run
    checkpointer.setup()
    graph = builder.compile(checkpointer=checkpointer)

Use PostgreSQL when you run multiple agent instances behind a load balancer — SQLite doesn't handle concurrent writes safely.


Long-Term Memory: Persistent Stores

Short-term memory gives you conversation history within a thread. Long-term memory gives you facts that survive across threads — user preferences, past decisions, extracted knowledge.

LangGraph's BaseStore interface handles this. The pattern: agent reads from store before generating, writes to store after extracting relevant facts.
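The namespace/key/value model behind BaseStore can be sketched with a toy stand-in (TinyStore below is illustrative only, not the real API, which also supports semantic search and metadata):

```python
from collections import defaultdict

class TinyStore:
    """Toy stand-in for the BaseStore idea: values live under a
    (namespace tuple, key) pair, and search lists a namespace."""
    def __init__(self):
        self._data = defaultdict(dict)  # namespace -> {key: value}

    def put(self, namespace: tuple, key: str, value: dict):
        self._data[namespace][key] = value

    def search(self, namespace: tuple) -> list[dict]:
        return list(self._data[namespace].values())

store = TinyStore()
store.put(("user_facts", "alex"), "pref-lang", {"fact": "prefers Python"})
store.put(("user_facts", "alex"), "role", {"fact": "backend developer"})
store.put(("user_facts", "sam"), "role", {"fact": "data scientist"})

# Namespacing keeps each user's facts isolated
assert len(store.search(("user_facts", "alex"))) == 2
assert len(store.search(("user_facts", "sam"))) == 1
```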

Step 6: Set Up InMemoryStore for Development

from langgraph.store.memory import InMemoryStore
from langchain_core.messages import SystemMessage

store = InMemoryStore()

def call_model_with_memory(state: AgentState, config, *, store: InMemoryStore):
    user_id = config["configurable"].get("user_id", "default")

    # Retrieve long-term facts for this user
    namespace = ("user_facts", user_id)
    memories = store.search(namespace)

    # Build system message from stored facts
    facts = "\n".join(f"- {m.value['fact']}" for m in memories)
    system = SystemMessage(content=f"User facts:\n{facts}" if facts else "No prior facts.")

    response = llm.invoke([system] + state["messages"])
    return {"messages": [response]}

Step 7: Write Facts Back to the Store

import json
from langchain_core.messages import HumanMessage

def extract_and_store_facts(state: AgentState, config, *, store: InMemoryStore):
    """After responding, extract facts worth remembering."""
    user_id = config["configurable"].get("user_id", "default")
    namespace = ("user_facts", user_id)

    # Ask the LLM to extract memorable facts from the last user message
    last_message = state["messages"][-2]  # -1 is the AI response
    extraction_prompt = f"""Extract any personal facts, preferences, or important details from this message.
Return a JSON list of strings, or an empty list if none.
Message: {last_message.content}"""

    result = llm.invoke([HumanMessage(content=extraction_prompt)])

    try:
        facts = json.loads(result.content)
        for fact in facts:
            # Key by content hash so identical facts overwrite each other.
            # Note: Python's hash() is randomized per process; for keys that
            # must stay stable across restarts, use hashlib instead.
            key = f"fact_{hash(fact) % 100000}"
            store.put(namespace, key, {"fact": fact})
    except (json.JSONDecodeError, TypeError):
        pass  # Extraction failed — skip silently

    return {}  # No state change needed
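One failure mode worth handling here: models often wrap JSON in markdown fences, which json.loads rejects outright. A small helper can make the extraction step more forgiving (parse_fact_list is an illustrative sketch, not part of LangGraph):

```python
import json
import re

def parse_fact_list(raw: str) -> list[str]:
    # Strip optional ```json ... ``` fences before parsing
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
    try:
        parsed = json.loads(cleaned)
    except json.JSONDecodeError:
        return []
    # Only accept a list of strings; anything else counts as "no facts"
    if isinstance(parsed, list):
        return [f for f in parsed if isinstance(f, str)]
    return []

assert parse_fact_list('["prefers Rust", "backend dev"]') == ["prefers Rust", "backend dev"]
assert parse_fact_list('```json\n["prefers Rust"]\n```') == ["prefers Rust"]
assert parse_fact_list("Sorry, no JSON here") == []
```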

Step 8: Wire Both Nodes Into the Graph

from functools import partial

# Bind store to nodes that need it
builder = StateGraph(AgentState)
builder.add_node("recall_and_respond", partial(call_model_with_memory, store=store))
builder.add_node("extract_facts", partial(extract_and_store_facts, store=store))

builder.add_edge(START, "recall_and_respond")
builder.add_edge("recall_and_respond", "extract_facts")
builder.add_edge("extract_facts", END)

checkpointer = MemorySaver()
graph = builder.compile(checkpointer=checkpointer)

Full Production Pattern: Both Layers Together

Here's a complete agent combining SQLite checkpoints (short-term) with a store (long-term), with proper config threading.

from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.store.memory import InMemoryStore
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage
from typing import Annotated
from typing_extensions import TypedDict
from functools import partial
import json

llm = ChatAnthropic(model="claude-sonnet-4-20250514")

class AgentState(TypedDict):
    messages: Annotated[list, add_messages]

def respond(state: AgentState, config: dict, *, store: InMemoryStore):
    user_id = config["configurable"].get("user_id", "anonymous")
    namespace = ("facts", user_id)

    memories = store.search(namespace)
    facts_text = "\n".join(f"- {m.value['fact']}" for m in memories)
    system_content = f"Known facts about the user:\n{facts_text}" if facts_text else "No prior context."

    response = llm.invoke([SystemMessage(content=system_content)] + state["messages"])
    return {"messages": [response]}

def memorize(state: AgentState, config: dict, *, store: InMemoryStore):
    user_id = config["configurable"].get("user_id", "anonymous")
    namespace = ("facts", user_id)

    last_human = next(
        (m for m in reversed(state["messages"]) if isinstance(m, HumanMessage)),
        None
    )
    if not last_human:
        return {}

    extraction = llm.invoke([HumanMessage(
        content=f"Extract memorable facts as a JSON list of strings. Return [] if none.\nMessage: {last_human.content}"
    )])

    try:
        for fact in json.loads(extraction.content):
            # hash() keys are process-local; use hashlib for restart-stable keys
            store.put(namespace, f"fact_{hash(fact) % 100000}", {"fact": fact})
    except (json.JSONDecodeError, TypeError):
        pass

    return {}

store = InMemoryStore()

builder = StateGraph(AgentState)
builder.add_node("respond", partial(respond, store=store))
builder.add_node("memorize", partial(memorize, store=store))
builder.add_edge(START, "respond")
builder.add_edge("respond", "memorize")
builder.add_edge("memorize", END)

with SqliteSaver.from_conn_string("agent.db") as checkpointer:
    graph = builder.compile(checkpointer=checkpointer)

    # Thread-scoped config — thread_id for short-term, user_id for long-term
    config = {
        "configurable": {
            "thread_id": "session-001",
            "user_id": "alex-42"
        }
    }

    for user_input in ["I'm a backend developer who loves Rust.", "What stack should I use for a new API?"]:
        result = graph.invoke({"messages": [HumanMessage(content=user_input)]}, config=config)
        print(f"User: {user_input}")
        print(f"Agent: {result['messages'][-1].content}\n")

Verification

# After a few turns, inspect what's been stored
user_id = "alex-42"
namespace = ("facts", user_id)
stored = store.search(namespace)

for item in stored:
    print(item.value["fact"])

You should see the facts extracted from the conversation, e.g.:

User is a backend developer
User prefers Rust

To inspect a checkpoint:

# Get the latest state for a thread
snapshot = graph.get_state({"configurable": {"thread_id": "session-001"}})
print(f"Messages in thread: {len(snapshot.values['messages'])}")

Production Considerations

Memory growth: Long-term stores grow unbounded without a cleanup strategy. Add a TTL or a summarization step that compresses old facts before the store exceeds a manageable size.
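A pruning pass can be as simple as capping facts per user. The sketch below is illustrative (prune_oldest and the "ts" timestamp field are assumptions, not LangGraph APIs):

```python
def prune_oldest(facts: list[dict], max_facts: int = 50) -> list[dict]:
    # Keep only the newest facts; assumes each stored value carries
    # a "ts" field set when the fact was written
    return sorted(facts, key=lambda f: f["ts"], reverse=True)[:max_facts]

facts = [{"fact": f"fact {i}", "ts": i} for i in range(120)]
kept = prune_oldest(facts)
assert len(kept) == 50
assert kept[0]["ts"] == 119  # newest fact survives
```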

Fact conflicts: If the user changes a preference ("actually I prefer Go now"), the old Rust fact stays. Build a deduplication pass or use semantic similarity to detect and replace outdated facts.

Store backends for production: InMemoryStore is dev-only — data is lost on restart. For real deployments, use a persistent BaseStore implementation, such as the Postgres-backed store that ships with langgraph-checkpoint-postgres, or a custom store backed by Redis.

Thread ID vs user ID: Keep these distinct. thread_id scopes a conversation session; user_id scopes the long-term identity. One user may have many threads.
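The distinction shows up directly in the config shape: two sessions for the same user share user_id but not thread_id.

```python
# Same user, two separate sessions: long-term facts are shared,
# conversation history is not
session_a = {"configurable": {"thread_id": "sess-a", "user_id": "alex-42"}}
session_b = {"configurable": {"thread_id": "sess-b", "user_id": "alex-42"}}

assert session_a["configurable"]["user_id"] == session_b["configurable"]["user_id"]
assert session_a["configurable"]["thread_id"] != session_b["configurable"]["thread_id"]
```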


What You Learned

  • add_messages on your state type is what enables multi-turn memory — without it, history is overwritten
  • MemorySaver is for development; SQLite or PostgreSQL checkpointers are for production
  • Long-term memory requires explicit read/write nodes — LangGraph doesn't extract facts automatically
  • thread_id and user_id serve different purposes and both belong in config["configurable"]
  • Fact extraction via LLM is useful but needs a deduplication strategy before production use

Tested on LangGraph 0.2.x, LangChain Core 0.3.x, Python 3.12, Ubuntu 24.04