Multi-Agent: Supervisors, Hierarchies, Handoffs
This chapter covers the patterns for coordinating multiple agents: supervisor, hierarchical team, and swarm-style handoffs.
When to Use Multiple Agents
One agent with many tools is often enough. Reach for multi-agent when:
- Different agents need different prompts, tools, or models.
- Tasks cleanly separate into specialist roles (research, write, review).
- You want to isolate failures: a broken research agent shouldn't break writing.
- Tokens and cost: specialist agents with tight contexts are cheaper than one big agent with everything.
When not to:
- Single agent with a few tools handles your case. Don't over-engineer.
- You need low latency. Multi-agent means multiple LLM calls per turn.
The Three Patterns
Supervisor
One "supervisor" agent routes to specialist agents.
```
                    ┌─→ research_agent ─┐
user → supervisor ──┤                   ├──→ supervisor → response
                    └─→ writing_agent ──┘
```
Supervisor decides which specialist runs, passes state, collects the result, decides what next.
Hierarchical Teams
Multiple supervisor groups. Each group manages its own specialists.
```
top_supervisor
├── research_team_supervisor
│   ├── web_searcher
│   └── doc_searcher
└── writing_team_supervisor
    ├── drafter
    └── editor
```
Each team is a subgraph. Top-level supervisor picks the team; team supervisor picks the specialist.
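Stripped of the framework, two-level routing is just nested dispatch. A dependency-free sketch, where the team names, specialists, and the keyword heuristic in `top_supervisor` are all illustrative stand-ins for LLM-driven decisions:

```python
# Hierarchical routing sketch: the top supervisor picks a team,
# the team supervisor picks a specialist within it.
def web_searcher(task: str) -> str: return f"web results for: {task}"
def doc_searcher(task: str) -> str: return f"doc results for: {task}"
def drafter(task: str) -> str: return f"draft of: {task}"
def editor(task: str) -> str: return f"edited: {task}"

TEAMS = {
    "research_team": {"web_searcher": web_searcher, "doc_searcher": doc_searcher},
    "writing_team": {"drafter": drafter, "editor": editor},
}

def top_supervisor(task: str) -> str:
    # Stand-in for the top-level LLM choice.
    return "research_team" if "find" in task else "writing_team"

def team_supervisor(team: str, task: str) -> str:
    # Stand-in for the team-level LLM choice: first specialist.
    return next(iter(TEAMS[team]))

def run(task: str) -> str:
    team = top_supervisor(task)
    specialist = team_supervisor(team, task)
    return TEAMS[team][specialist](task)
```

The shape maps directly onto subgraphs: each `TEAMS` entry would become a compiled subgraph node in the parent graph.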
Swarm (Handoffs)
No supervisor. Agents hand off to each other directly.
research_agent → (when done) → writing_agent → (when done) → review_agent
Each agent knows which other agents exist and can route to them.
Pick based on your problem. Supervisor is the safe default. Swarm scales better when agents have clear, non-overlapping responsibilities.
Supervisor Pattern
Build one agent per role, plus a supervisor that routes.
```python
from typing import TypedDict, Annotated, Literal
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage
from langchain_anthropic import ChatAnthropic

class State(TypedDict):
    messages: Annotated[list[BaseMessage], add_messages]
    next: str

llm = ChatAnthropic(model="claude-sonnet-4-5")

def supervisor(state: State) -> State:
    # In production, use structured output (a tool call) to pick the next agent.
    prompt = f"""You manage two agents: 'researcher' and 'writer'.
Given the conversation, decide who should act next. Respond with one word:
'researcher', 'writer', or 'FINISH' if the user's request is complete.

Conversation:
{state['messages']}"""
    decision = llm.invoke(prompt).content.strip().lower()
    return {"next": decision}

def researcher(state: State) -> State:
    # Simulated research
    return {
        "messages": [AIMessage(content="Researched topic and found: [facts]")],
        "next": "",
    }

def writer(state: State) -> State:
    return {
        "messages": [AIMessage(content="Drafted: [content]")],
        "next": "",
    }

def route(state: State) -> Literal["researcher", "writer", "__end__"]:
    return state["next"] if state["next"] in {"researcher", "writer"} else END

graph = (
    StateGraph(State)
    .add_node("supervisor", supervisor)
    .add_node("researcher", researcher)
    .add_node("writer", writer)
    .add_edge(START, "supervisor")
    .add_conditional_edges("supervisor", route, {
        "researcher": "researcher",
        "writer": "writer",
        END: END,
    })
    .add_edge("researcher", "supervisor")
    .add_edge("writer", "supervisor")
    .compile()
)
```
Supervisor runs, decides, the chosen agent runs, returns to supervisor. Loop until supervisor says FINISH.
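The loop itself can be exercised without any LLM calls. A minimal sketch where a deterministic stub replaces the supervisor's decision (the stub and its string-matching heuristic are illustrative only):

```python
def stub_supervisor(state: dict) -> str:
    # Deterministic stand-in for the LLM supervisor:
    # research first, then write, then finish.
    msgs = state["messages"]
    if not any("Researched" in m for m in msgs):
        return "researcher"
    if not any("Drafted" in m for m in msgs):
        return "writer"
    return "FINISH"

def researcher(state: dict) -> None:
    state["messages"].append("Researched topic and found: [facts]")

def writer(state: dict) -> None:
    state["messages"].append("Drafted: [content]")

def run(state: dict) -> dict:
    # supervisor -> chosen agent -> supervisor, until FINISH.
    agents = {"researcher": researcher, "writer": writer}
    while (decision := stub_supervisor(state)) != "FINISH":
        agents[decision](state)
    return state

final = run({"messages": ["Write a summary of topic X"]})
# final["messages"]: the request, the research note, the draft.
```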
Structured Output for Routing
Asking Claude "respond with one word" is fragile. Better: give the supervisor a tool whose name is the routing target.
```python
from langchain_core.tools import tool

@tool
def route_to_researcher() -> str:
    """Route to the researcher agent for fact-finding tasks."""
    return "researcher"

@tool
def route_to_writer() -> str:
    """Route to the writer agent for drafting content."""
    return "writer"

@tool
def finish() -> str:
    """Signal that the task is complete and we can respond to the user."""
    return "FINISH"

supervisor_llm = llm.bind_tools([route_to_researcher, route_to_writer, finish])

def supervisor(state: State) -> State:
    response = supervisor_llm.invoke(state["messages"])
    if response.tool_calls:
        target = response.tool_calls[0]["name"]
        return {"next": target.replace("route_to_", "")}
    return {"next": "FINISH"}
```
Tool calls are reliable; free-text routing is not.
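The routing step itself is plain dict handling and is easy to unit-test in isolation. A sketch with a hand-built `tool_calls` list standing in for what `supervisor_llm.invoke(...)` would return (LangChain represents each tool call as a dict with `name` and `args` keys):

```python
def routing_target(tool_calls: list[dict]) -> str:
    # No tool call means we're done; otherwise strip the route_to_ prefix.
    if not tool_calls:
        return "FINISH"
    name = tool_calls[0]["name"]
    return "FINISH" if name == "finish" else name.removeprefix("route_to_")

assert routing_target([{"name": "route_to_researcher", "args": {}}]) == "researcher"
assert routing_target([{"name": "finish", "args": {}}]) == "FINISH"
assert routing_target([]) == "FINISH"
```

Pulling the mapping into a small pure function like this also gives you one place to add guards later (loop limits, unknown-tool handling).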
Handoff Pattern with Command
Command(goto=...) lets an agent route directly to another agent. No supervisor.
```python
from langgraph.types import Command

def researcher(state: State) -> Command:
    # ... do research ...
    findings = "web findings"
    # Decide: hand off to writer
    return Command(
        update={"messages": [AIMessage(content=f"Findings: {findings}")]},
        goto="writer",
    )

def writer(state: State) -> Command:
    # ... write draft ...
    return Command(
        update={"messages": [AIMessage(content="Final draft")]},
        goto=END,
    )

graph = (
    StateGraph(State)
    .add_node("researcher", researcher)
    .add_node("writer", writer)
    .add_edge(START, "researcher")
    .compile()
)
```
Each agent knows what to do next and routes itself. No central dispatcher. Fits when the flow is mostly linear with agent-driven branches.
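Stripped of the framework, the handoff loop is just "run a node, follow its goto". A dependency-free sketch where each agent returns an `(update, goto)` pair mirroring `Command(update=..., goto=...)` (names here are illustrative, not the LangGraph API):

```python
END = "__end__"

def researcher(state: dict) -> tuple[dict, str]:
    return {"messages": state["messages"] + ["Findings: web findings"]}, "writer"

def writer(state: dict) -> tuple[dict, str]:
    return {"messages": state["messages"] + ["Final draft"]}, END

def run(state: dict, entry: str = "researcher") -> dict:
    # Follow each node's goto until an agent routes to END.
    nodes = {"researcher": researcher, "writer": writer}
    current = entry
    while current != END:
        state, current = nodes[current](state)
    return state

final = run({"messages": []})
# final["messages"] == ["Findings: web findings", "Final draft"]
```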
Sharing State vs Passing Messages
Two philosophies for multi-agent coordination.
Shared State
All agents read and write the same state. Message history, findings, drafts: everyone sees everything.
Pros: simple. Agents see the whole conversation.
Cons: context bloat. Every agent gets every message, whether relevant or not. Tokens add up.
Message-Passing
Each agent gets a narrow, scoped input. Outputs pass to the next agent explicitly.
```python
def researcher(state: State) -> Command:
    findings = do_research(state["messages"])
    return Command(
        update={"research_output": findings},
        goto="writer",
    )

def writer(state: State) -> Command:
    draft = write_from(state["research_output"])
    return Command(
        update={"messages": [AIMessage(content=draft)]},
        goto=END,
    )
```
Writer reads research_output, not the full message history. Smaller context, cleaner boundaries.
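Note that `research_output` must exist in the state schema for this to work. A sketch of the extended schema plus the scoping effect (`PipelineState` and `write_from` are hypothetical helpers; the point is that the writer receives only the research, never the history):

```python
from typing import TypedDict

class PipelineState(TypedDict, total=False):
    messages: list[str]
    research_output: str

def write_from(research: str) -> str:
    # Hypothetical writer: sees only the scoped research output.
    return f"Draft based on: {research}"

state: PipelineState = {
    "messages": ["...long history..."] * 100,  # irrelevant to the writer
    "research_output": "key facts",
}
draft = write_from(state["research_output"])
# The writer's context is one string, not 100 messages.
```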
Pick based on how coupled your agents are. Shared state for conversational multi-agent; message-passing for pipelines.
Prebuilt: create_supervisor
LangGraph ships a prebuilt supervisor builder:
```python
from langgraph.prebuilt import create_react_agent
from langgraph_supervisor import create_supervisor

researcher = create_react_agent(model="claude-sonnet-4-5", tools=[web_search])
writer = create_react_agent(model="claude-sonnet-4-5", tools=[write_draft])

supervisor = create_supervisor(
    agents=[researcher, writer],
    model="claude-sonnet-4-5",
    prompt="You manage two agents. Route each task to the appropriate agent.",
).compile()
```
Fast path for standard supervisor setups. Drop to custom code when you need control.
Observability
Multi-agent is harder to debug. Chapter 10 covers LangSmith; for multi-agent, it's essentially required. You want to see:
- Which agent ran.
- What it received.
- What it returned.
- How long it took.
Each run is a trace; each agent invocation and routing decision shows up as a span within it.
Common Pitfalls
No exit condition in supervisor. Supervisor keeps routing; never says FINISH. Recursion limit hits. Always have a clear end.
Supervisor that re-routes to the same agent repeatedly. Infinite loop. Add a guard (e.g. after N calls to the same agent, force FINISH).
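One way to implement that guard (the field names and threshold are illustrative; in a real graph the counters would live in the shared state):

```python
MAX_REPEATS = 3

def guarded_route(decision: str, state: dict) -> str:
    # Track consecutive routes to the same agent; force FINISH past the cap.
    if decision == state.get("last_agent"):
        state["repeat_count"] = state.get("repeat_count", 0) + 1
    else:
        state["repeat_count"] = 1
    state["last_agent"] = decision
    return "FINISH" if state["repeat_count"] > MAX_REPEATS else decision

state = {}
for _ in range(3):
    assert guarded_route("researcher", state) == "researcher"
assert guarded_route("researcher", state) == "FINISH"  # fourth repeat forced out
```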
Shared state where agents overwrite each other. Two agents both update draft; last write wins. Give each agent its own key, or use a reducer like add_messages for accumulating lists.
Free-text routing. "Respond with 'researcher' or 'writer'" fails when the LLM responds with "I think researcher would be best." Use tool calls.
Context explosion. Every message in state goes to every agent. Trim or scope per agent.
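A crude but effective scoping helper: cap what each agent sees before invoking it (`keep_last` is arbitrary; real systems might trim by token count or relevance instead):

```python
def scoped_messages(messages: list, keep_last: int = 5) -> list:
    # Keep only the most recent messages for this agent's context.
    return messages[-keep_last:]

history = [f"msg {i}" for i in range(20)]
scoped = scoped_messages(history)
# scoped == ["msg 15", "msg 16", "msg 17", "msg 18", "msg 19"]
```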
Next Steps
Continue to 10-observability.md to see what your graphs are actually doing.