What Is LangGraph Memory and Why It Matters
Most LangGraph agents are stateless by default — each invocation starts fresh with no knowledge of past interactions. That's fine for one-shot tasks. It breaks immediately for chatbots, task assistants, and any agent that needs context across turns or sessions.
LangGraph provides two distinct memory layers:
- Short-term memory — state persisted within a single conversation thread (in-memory or checkpointed)
- Long-term memory — facts, preferences, and summaries that survive across threads and restarts
Getting this right is what separates a demo agent from a production one. This article shows you exactly how both layers work, when to use each, and how to wire them together.
You'll learn:
- How LangGraph's state and checkpointer system handles short-term memory
- How to add persistent long-term storage with `InMemoryStore` and PostgreSQL
- A production pattern combining both layers in a single agent
Time: 20 min | Difficulty: Intermediate
How LangGraph Memory Works
LangGraph models agent execution as a graph of nodes. Each invocation updates a typed state object. The framework persists snapshots of that state — called checkpoints — after every node executes.
```
User message
      │
      ▼
┌─────────────────────┐
│  Graph Invocation   │
│                     │
│  Node A → Node B    │ ◀── State flows through nodes
│        │            │
│  Checkpoint saved ──┼──▶ Checkpointer (short-term)
└─────────────────────┘
      │
      ▼
Long-term Store ◀── Agent explicitly reads/writes facts
```
Short-term memory is automatic once you attach a checkpointer. Long-term memory requires explicit read/write steps in your graph nodes.
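Before wiring up real checkpointers, the per-node snapshot behavior can be modeled in a few lines of plain Python. This is a toy sketch of the idea, not LangGraph internals; `run_graph` and the `checkpoints` dict are invented for illustration.

```python
# Toy model of per-node checkpointing (illustrative only, not LangGraph internals)
checkpoints: dict[str, list[dict]] = {}  # thread_id -> saved state snapshots

def run_graph(thread_id: str, state: dict, nodes: list) -> dict:
    for node in nodes:
        state = {**state, **node(state)}                           # node updates state
        checkpoints.setdefault(thread_id, []).append(dict(state))  # snapshot after each node
    return state

def node_a(state: dict) -> dict:
    return {"a_done": True}

def node_b(state: dict) -> dict:
    return {"b_done": True}

final = run_graph("thread-1", {"input": "hi"}, [node_a, node_b])
print(len(checkpoints["thread-1"]))  # 2 — one checkpoint per node
print(final)  # {'input': 'hi', 'a_done': True, 'b_done': True}
```

Resuming a thread is then just a matter of loading the latest snapshot for its `thread_id`, which is exactly what a checkpointer does for you.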
Short-Term Memory: Checkpointers
Step 1: Define Your State
Start with a typed state that accumulates messages across turns.
```python
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langchain_core.messages import HumanMessage, AIMessage
from typing import Annotated
from typing_extensions import TypedDict

class AgentState(TypedDict):
    # add_messages reducer appends instead of replacing — critical for chat history
    messages: Annotated[list, add_messages]
```
The `add_messages` annotation is the key detail here. Without it, each node invocation overwrites the message list instead of appending to it.
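To see what the reducer buys you, here is a stripped-down model of the update semantics. Everything in it (`apply_update`, `append_reducer`) is hypothetical and only mimics what the framework does internally:

```python
# Hypothetical mini-model of LangGraph's channel-update semantics (not the real framework)

def apply_update(state: dict, update: dict, reducers: dict) -> dict:
    # Apply one node's returned update to the state, channel by channel
    new_state = dict(state)
    for key, value in update.items():
        if key in reducers:
            new_state[key] = reducers[key](state.get(key, []), value)  # merge via reducer
        else:
            new_state[key] = value  # default: overwrite the channel
    return new_state

def append_reducer(old: list, new: list) -> list:
    # Roughly what add_messages does, minus message-ID deduplication
    return old + new

state = {"messages": ["Hi, I'm Alex."]}
update = {"messages": ["Hello Alex!"]}

# Without a reducer the history is overwritten...
replaced = apply_update(state, update, reducers={})
# ...with one, each turn is appended.
appended = apply_update(state, update, reducers={"messages": append_reducer})

print(replaced["messages"])  # ['Hello Alex!']
print(appended["messages"])  # ["Hi, I'm Alex.", 'Hello Alex!']
```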
Step 2: Attach the In-Memory Checkpointer
```python
from langgraph.checkpoint.memory import MemorySaver
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-sonnet-4-20250514")

def call_model(state: AgentState):
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

builder = StateGraph(AgentState)
builder.add_node("model", call_model)
builder.add_edge(START, "model")
builder.add_edge("model", END)

# MemorySaver keeps checkpoints in RAM — fine for dev, lost on restart
checkpointer = MemorySaver()
graph = builder.compile(checkpointer=checkpointer)
```
Step 3: Use Thread IDs to Separate Conversations
The `thread_id` in the config is what scopes memory to a conversation. Different values = different memory contexts.
```python
config = {"configurable": {"thread_id": "user-123-session-1"}}

# Turn 1
result = graph.invoke(
    {"messages": [HumanMessage(content="My name is Alex.")]},
    config=config
)
print(result["messages"][-1].content)

# Turn 2 — same thread_id, so graph remembers the previous turn
result = graph.invoke(
    {"messages": [HumanMessage(content="What's my name?")]},
    config=config
)
print(result["messages"][-1].content)
# Output: "Your name is Alex."
```
Expected output from turn 2:
Your name is Alex.
If it doesn't remember:
- Verify you're passing the same `thread_id` in both calls
- Confirm the `checkpointer` is attached at `compile()`, not passed at `invoke()`
Persisting Short-Term Memory Across Restarts
`MemorySaver` is RAM-only. For production you need a durable checkpointer. LangGraph ships PostgreSQL and SQLite backends.
Step 4: Switch to SQLite Checkpointer
```python
from langgraph.checkpoint.sqlite import SqliteSaver

# File-backed — survives restarts, good for single-server deployments
with SqliteSaver.from_conn_string("checkpoints.db") as checkpointer:
    graph = builder.compile(checkpointer=checkpointer)

    config = {"configurable": {"thread_id": "user-123"}}
    result = graph.invoke(
        {"messages": [HumanMessage(content="Remember: I prefer Python over TypeScript.")]},
        config=config
    )
```
Step 5: PostgreSQL for Multi-Instance Deployments
```shell
pip install langgraph-checkpoint-postgres psycopg
```

```python
from langgraph.checkpoint.postgres import PostgresSaver

DB_URI = "postgresql://user:password@localhost:5432/agent_db"

with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    # Creates checkpoint tables on first run
    checkpointer.setup()
    graph = builder.compile(checkpointer=checkpointer)
```
Use PostgreSQL when you run multiple agent instances behind a load balancer — SQLite doesn't handle concurrent writes safely.
Long-Term Memory: Persistent Stores
Short-term memory gives you conversation history within a thread. Long-term memory gives you facts that survive across threads — user preferences, past decisions, extracted knowledge.
LangGraph's `BaseStore` interface handles this. The pattern: the agent reads from the store before generating, and writes to the store after extracting relevant facts.
Step 6: Set Up InMemoryStore for Development
```python
from langgraph.store.memory import InMemoryStore
from langchain_core.messages import SystemMessage

store = InMemoryStore()

def call_model_with_memory(state: AgentState, config, *, store: InMemoryStore):
    user_id = config["configurable"].get("user_id", "default")

    # Retrieve long-term facts for this user
    namespace = ("user_facts", user_id)
    memories = store.search(namespace)

    # Build system message from stored facts
    facts = "\n".join(f"- {m.value['fact']}" for m in memories)
    system = SystemMessage(content=f"User facts:\n{facts}" if facts else "No prior facts.")

    response = llm.invoke([system] + state["messages"])
    return {"messages": [response]}
```
Step 7: Write Facts Back to the Store
```python
import hashlib
import json
from langchain_core.messages import HumanMessage

def extract_and_store_facts(state: AgentState, config, *, store: InMemoryStore):
    """After responding, extract facts worth remembering."""
    user_id = config["configurable"].get("user_id", "default")
    namespace = ("user_facts", user_id)

    # Ask the LLM to extract memorable facts from the last user message
    last_message = state["messages"][-2]  # -1 is the AI response
    extraction_prompt = f"""Extract any personal facts, preferences, or important details from this message.
Return a JSON list of strings, or an empty list if none.
Message: {last_message.content}"""

    result = llm.invoke([HumanMessage(content=extraction_prompt)])
    try:
        facts = json.loads(result.content)
        for fact in facts:
            # Stable content hash as key so identical facts don't duplicate.
            # (Built-in hash() is salted per process, so it isn't stable across restarts.)
            key = "fact_" + hashlib.sha256(fact.encode()).hexdigest()[:12]
            store.put(namespace, key, {"fact": fact})
    except (json.JSONDecodeError, TypeError):
        pass  # Extraction failed — skip silently

    return {}  # No state change needed
```
Step 8: Wire Both Nodes Into the Graph
```python
from functools import partial

# Bind store to nodes that need it
builder = StateGraph(AgentState)
builder.add_node("recall_and_respond", partial(call_model_with_memory, store=store))
builder.add_node("extract_facts", partial(extract_and_store_facts, store=store))

builder.add_edge(START, "recall_and_respond")
builder.add_edge("recall_and_respond", "extract_facts")
builder.add_edge("extract_facts", END)

checkpointer = MemorySaver()
graph = builder.compile(checkpointer=checkpointer)
```
Full Production Pattern: Both Layers Together
Here's a complete agent combining SQLite checkpoints (short-term) with a store (long-term), with proper config threading.
```python
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.store.memory import InMemoryStore
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage
from typing import Annotated
from typing_extensions import TypedDict
from functools import partial
import hashlib
import json

llm = ChatAnthropic(model="claude-sonnet-4-20250514")

class AgentState(TypedDict):
    messages: Annotated[list, add_messages]

def respond(state: AgentState, config: dict, *, store: InMemoryStore):
    user_id = config["configurable"].get("user_id", "anonymous")
    namespace = ("facts", user_id)
    memories = store.search(namespace)
    facts_text = "\n".join(f"- {m.value['fact']}" for m in memories)
    system_content = f"Known facts about the user:\n{facts_text}" if facts_text else "No prior context."
    response = llm.invoke([SystemMessage(content=system_content)] + state["messages"])
    return {"messages": [response]}

def memorize(state: AgentState, config: dict, *, store: InMemoryStore):
    user_id = config["configurable"].get("user_id", "anonymous")
    namespace = ("facts", user_id)
    last_human = next(
        (m for m in reversed(state["messages"]) if isinstance(m, HumanMessage)),
        None
    )
    if not last_human:
        return {}
    extraction = llm.invoke([HumanMessage(
        content=f"Extract memorable facts as a JSON list of strings. Return [] if none.\nMessage: {last_human.content}"
    )])
    try:
        for fact in json.loads(extraction.content):
            # Stable content hash so identical facts don't duplicate across runs
            key = "fact_" + hashlib.sha256(fact.encode()).hexdigest()[:12]
            store.put(namespace, key, {"fact": fact})
    except (json.JSONDecodeError, TypeError):
        pass
    return {}

store = InMemoryStore()
builder = StateGraph(AgentState)
builder.add_node("respond", partial(respond, store=store))
builder.add_node("memorize", partial(memorize, store=store))
builder.add_edge(START, "respond")
builder.add_edge("respond", "memorize")
builder.add_edge("memorize", END)

with SqliteSaver.from_conn_string("agent.db") as checkpointer:
    graph = builder.compile(checkpointer=checkpointer)

    # Thread-scoped config — thread_id for short-term, user_id for long-term
    config = {
        "configurable": {
            "thread_id": "session-001",
            "user_id": "alex-42"
        }
    }

    for user_input in ["I'm a backend developer who loves Rust.", "What stack should I use for a new API?"]:
        result = graph.invoke({"messages": [HumanMessage(content=user_input)]}, config=config)
        print(f"User: {user_input}")
        print(f"Agent: {result['messages'][-1].content}\n")
```
Verification
```python
# After a few turns, inspect what's been stored
user_id = "alex-42"
namespace = ("facts", user_id)
stored = store.search(namespace)
for item in stored:
    print(item.value["fact"])
```
You should see the facts extracted from the conversation, e.g.:
- User is a backend developer
- User prefers Rust
To inspect a checkpoint:
```python
# Get the latest state for a thread
snapshot = graph.get_state({"configurable": {"thread_id": "session-001"}})
print(f"Messages in thread: {len(snapshot.values['messages'])}")
```
Production Considerations
Memory growth: Long-term stores grow unbounded without a cleanup strategy. Add a TTL or a summarization step that compresses old facts before the store exceeds a manageable size.
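A minimal TTL pass could look like the sketch below. It assumes each stored item carries a `created_at` Unix timestamp, which the `store.put()` calls in this article don't record; you would add that field when writing facts.

```python
import time

def prune_expired(items: list[dict], ttl_seconds: float = 30 * 24 * 3600) -> list[dict]:
    # Keep only facts newer than the TTL (assumes a 'created_at' timestamp per item)
    cutoff = time.time() - ttl_seconds
    return [item for item in items if item["created_at"] >= cutoff]

facts = [
    {"fact": "prefers Rust", "created_at": time.time()},                        # fresh
    {"fact": "used Perl in 2009", "created_at": time.time() - 90 * 24 * 3600},  # stale
]
print([f["fact"] for f in prune_expired(facts)])  # ['prefers Rust']
```

Run a pass like this on a schedule, or before each `store.search()`, so stale facts never reach the prompt.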
Fact conflicts: If the user changes a preference ("actually I prefer Go now"), the old Rust fact stays. Build a deduplication pass or use semantic similarity to detect and replace outdated facts.
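As a rough illustration, plain string similarity catches the easy cases. The sketch below uses `difflib` from the standard library; a production system would more likely compare embeddings, and `upsert_fact` is an invented helper, not a store API.

```python
import difflib

def upsert_fact(existing: dict[str, str], key: str, new_fact: str,
                threshold: float = 0.6) -> dict[str, str]:
    # Replace any stored fact that closely resembles the new one (sketch only)
    for old_key, old_fact in list(existing.items()):
        ratio = difflib.SequenceMatcher(None, old_fact.lower(), new_fact.lower()).ratio()
        if ratio >= threshold:
            del existing[old_key]  # drop the outdated version
            break
    existing[key] = new_fact
    return existing

facts = {"fact_a1": "User prefers Rust"}
facts = upsert_fact(facts, "fact_b2", "User prefers Go now")
print(list(facts.values()))  # ['User prefers Go now']
```

The threshold needs tuning per domain; too low and unrelated facts overwrite each other, too high and conflicts slip through.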
Store backends for production: `InMemoryStore` is dev-only — data is lost on restart. For real deployments, use the PostgreSQL-backed store that ships with `langgraph-checkpoint-postgres`, or a custom `BaseStore` implementation backed by Redis.
Thread ID vs user ID: Keep these distinct. `thread_id` scopes a conversation session; `user_id` scopes the long-term identity. One user may have many threads.
What You Learned
- `add_messages` on your state type is what enables multi-turn memory — without it, history is overwritten
- `MemorySaver` is for development; SQLite or PostgreSQL checkpointers are for production
- Long-term memory requires explicit read/write nodes — LangGraph doesn't extract facts automatically
- `thread_id` and `user_id` serve different purposes and both belong in `config["configurable"]`
- Fact extraction via LLM is useful but needs a deduplication strategy before production use
Tested on LangGraph 0.2.x, LangChain Core 0.3.x, Python 3.12, Ubuntu 24.04