State and Graphs: The Core Abstraction

This chapter covers the StateGraph primitive, typed state, nodes, and the reducer pattern that makes state updates predictable.

StateGraph: The Builder

StateGraph is the class you instantiate to build a graph. It takes a state schema: a type that describes what state looks like.

from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    count: int
    messages: list[str]

builder = StateGraph(State)

You then add nodes and edges, and call compile() to get an executable graph.

State Schemas

Three options, in order of common use.

TypedDict

from typing import TypedDict

class State(TypedDict):
    count: int
    messages: list[str]

Lightweight. Fields are plain keys. No runtime validation (but mypy catches mismatches).

Pydantic BaseModel

from pydantic import BaseModel

class State(BaseModel):
    count: int = 0
    messages: list[str] = []

Runtime validation. Default values are supported. Slightly heavier, worth it when inputs come from external sources.
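A quick sketch of what runtime validation buys you (assumes pydantic v2 is installed; the field values are made up):

```python
from pydantic import BaseModel, ValidationError

class State(BaseModel):
    count: int = 0
    messages: list[str] = []

# Well-formed input passes and fills in defaults:
ok = State(count=3)
print(ok.messages)  # []

# Malformed input raises at construction time instead of failing later:
try:
    State(count="not a number")
    err = None
except ValidationError as e:
    err = e
print(type(err).__name__)  # ValidationError
```

With a TypedDict schema, the bad value would flow silently into the graph and only surface when a node tried to do arithmetic on it.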

dataclass

from dataclasses import dataclass, field

@dataclass
class State:
    count: int = 0
    messages: list[str] = field(default_factory=list)

Works fine. Less common in LangGraph examples. Pick TypedDict for simple cases, Pydantic when you want validation.

Nodes

A node is a function. It takes the current state and returns a partial update.

def increment(state: State) -> State:
    return {"count": state["count"] + 1}

Two things to notice:

  1. You return only the keys you changed. LangGraph merges the returned dict into the existing state. (Annotating the return as State is a common convention; a strict type checker may flag the partial return.)
  2. You don't mutate state in place. Treat it as read-only; return new values.

Nodes can also be async:

async def fetch(state: State) -> State:
    data = await some_async_call()
    return {"data": data}

LangGraph runs async nodes on an event loop. You can mix sync and async nodes in one graph.
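A graph containing async nodes is run with ainvoke (or astream) rather than invoke. As a minimal sketch of the node itself, with asyncio.sleep standing in for real I/O (the names here are illustrative):

```python
import asyncio

async def fetch(state: dict) -> dict:
    # Stand-in for an HTTP call, database query, etc.
    await asyncio.sleep(0)
    return {"data": "payload"}

# Inside a graph: result = await graph.ainvoke({"data": ""})
# Outside a graph, you can exercise the node directly:
update = asyncio.run(fetch({"data": ""}))
print(update)  # {'data': 'payload'}
```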

Nodes as Runnables

If you've used LangChain Runnables, they work as nodes:

from langchain_core.runnables import RunnableLambda

double_node = RunnableLambda(lambda state: {"count": state["count"] * 2})
builder.add_node("double", double_node)

Useful for small operations or when you already have a Runnable pipeline.

Edges

Edges connect nodes.

from langgraph.graph import START, END

builder.add_edge(START, "increment")
builder.add_edge("increment", END)

START and END are sentinel nodes. START is where execution begins; END is where it finishes. Every graph has at least one edge from START and at least one path to END.

Reducers: How State Updates Merge

Here's the key insight. When a node returns {"count": 5}, how does that merge with the existing state?

By default: replace. The new value overwrites the old. That works for scalars like count.

For lists, replace is usually wrong. You want "append", not "overwrite". LangGraph lets you specify a custom reducer via Annotated.

from typing import TypedDict, Annotated
from operator import add

class State(TypedDict):
    count: int                          # default: replace
    messages: Annotated[list, add]       # reducer: add (concatenate)

Now if a node returns {"messages": ["new"]}, LangGraph concatenates ["new"] onto the existing list instead of replacing it.

This is subtle but fundamental. Two nodes returning messages won't clobber each other; their returns are appended.
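Conceptually, the merge step looks something like this pure-Python sketch (a simplified model, not LangGraph's actual implementation):

```python
from operator import add

def apply_update(state: dict, update: dict, reducers: dict) -> dict:
    merged = dict(state)
    for key, value in update.items():
        if key in reducers:
            merged[key] = reducers[key](merged[key], value)  # e.g. list concat
        else:
            merged[key] = value                              # default: replace
    return merged

state = {"count": 1, "messages": ["hi"]}
state = apply_update(state, {"count": 5, "messages": ["new"]},
                     reducers={"messages": add})
print(state)  # {'count': 5, 'messages': ['hi', 'new']}
```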

Custom Reducers

A reducer is any callable that takes (old, update) and returns the new value.

def deduplicate(old: list, new: list) -> list:
    # Note: set() does not preserve order; use an ordered dedupe if order matters.
    return list(set(old + new))

class State(TypedDict):
    tags: Annotated[list[str], deduplicate]

Useful for sets, merged dicts, "latest by timestamp", and so on.
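A merged-dicts reducer, for example, is just as short (the metadata field name is made up for illustration):

```python
from typing import Annotated, TypedDict

def merge_dicts(old: dict, new: dict) -> dict:
    # Later writes win on key collisions.
    return {**old, **new}

class State(TypedDict):
    metadata: Annotated[dict, merge_dicts]

print(merge_dicts({"a": 1, "b": 2}, {"b": 3}))  # {'a': 1, 'b': 3}
```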

The add_messages Reducer

For message history (the most common list in LangGraph), use add_messages:

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages
from typing import TypedDict, Annotated

class State(TypedDict):
    messages: Annotated[list[BaseMessage], add_messages]

add_messages is smarter than plain add:

  • Appends new messages.
  • Handles message IDs: if a message with the same ID is passed again, it replaces the old one (useful for streaming updates).
  • Preserves message order.

Use this anywhere you track chat messages. It handles edge cases add doesn't.
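A rough pure-Python model of the replace-by-ID behavior (this is not the real implementation; add_messages operates on message objects, not dicts):

```python
def add_messages_model(old: list[dict], new: list[dict]) -> list[dict]:
    index = {m["id"]: i for i, m in enumerate(old)}
    merged = list(old)
    for m in new:
        if m["id"] in index:
            merged[index[m["id"]]] = m   # same ID: replace in place
        else:
            merged.append(m)             # new ID: append, preserving order
    return merged

history = [{"id": "1", "content": "hi"}]
history = add_messages_model(history, [{"id": "2", "content": "hel"}])
history = add_messages_model(history, [{"id": "2", "content": "hello!"}])
print(history)
# [{'id': '1', 'content': 'hi'}, {'id': '2', 'content': 'hello!'}]
```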

Putting It Together

A slightly more realistic graph:

from typing import TypedDict, Annotated
from operator import add
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    count: int
    log: Annotated[list[str], add]

def increment(state: State) -> State:
    new_count = state["count"] + 1
    return {
        "count": new_count,
        "log": [f"incremented to {new_count}"],
    }

def double(state: State) -> State:
    new_count = state["count"] * 2
    return {
        "count": new_count,
        "log": [f"doubled to {new_count}"],
    }

graph = (
    StateGraph(State)
    .add_node("increment", increment)
    .add_node("double", double)
    .add_edge(START, "increment")
    .add_edge("increment", "double")
    .add_edge("double", END)
    .compile()
)

result = graph.invoke({"count": 5, "log": []})
print(result)
# {'count': 12, 'log': ['incremented to 6', 'doubled to 12']}

Each node returns a partial update. count is replaced (default). log is appended (reducer = add).

Compile and Invoke

compile() turns the builder into a compiled, executable graph. After that, the compiled graph is fixed; to add more nodes, go back to the builder and call compile() again.

graph = builder.compile()

Compile options you'll meet later:

graph = builder.compile(
    checkpointer=MemorySaver(),       # Chapter 5
    interrupt_before=["human_input"], # Chapter 6
)

Invoke runs the graph:

result = graph.invoke({"count": 0, "log": []})

The argument is the initial state. LangGraph walks from START, running each node and merging its returns, until it reaches END.

Visualizing the Graph

Helpful when debugging:

print(graph.get_graph().draw_ascii())
# or for a nicer rendering:
graph.get_graph().draw_mermaid_png(output_file_path="graph.png")

The ASCII version works in any terminal (it requires the grandalf package); draw_mermaid_png renders through the mermaid.ink web service by default, so it needs network access.

Common Pitfalls

Replacing a list you meant to append. Default reducer is replace. Always use Annotated[list, add] or add_messages for accumulating lists.

Returning unknown keys. Returning {"random_key": 5} when State has no random_key doesn't warn; it's just ignored. Keep types tight.

Mutating state in-place. state["messages"].append(...) may or may not persist depending on reducer and Python references. Always return new values.
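The safe pattern, sketched with plain dicts (assuming log has the add reducer, as in the example above):

```python
def bad_node(state: dict) -> dict:
    state["log"].append("step done")   # mutates shared state: avoid
    return {}

def good_node(state: dict) -> dict:
    return {"log": ["step done"]}      # the reducer appends this for you

state = {"log": []}
update = good_node(state)
print(state["log"], update)  # [] {'log': ['step done']}
```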

Big state objects. Every checkpoint serializes the full state. Don't stuff embedding matrices or large blobs into state; keep references (IDs, paths) instead.

Async everywhere when not needed. If your nodes don't do I/O, sync is fine and less confusing.

Next Steps

Continue to 03-edges-and-flow.md to control how execution moves between nodes.