Multi-Agent: Supervisors, Hierarchies, Handoffs

This chapter covers the patterns for coordinating multiple agents: supervisor, hierarchical team, and swarm-style handoffs.

When to Use Multiple Agents

One agent with many tools is often enough. Reach for multi-agent when:

  • Different agents need different prompts, tools, or models.
  • Tasks cleanly separate into specialist roles (research, write, review).
  • You want to isolate failures: a broken research agent shouldn't break writing.
  • Tokens and cost: specialist agents with tight contexts are cheaper than one big agent with everything.

When not to:

  • Single agent with a few tools handles your case. Don't over-engineer.
  • You need low latency. Multi-agent means multiple LLM calls per turn.

The Three Patterns

Supervisor

One "supervisor" agent routes to specialist agents.

                  ┌─→ research_agent ─┐
 user → supervisor                    │ → supervisor → response
                  └─→ writing_agent  ─┘

Supervisor decides which specialist runs, passes state, collects the result, decides what next.

Hierarchical Teams

Multiple supervisor groups. Each group manages its own specialists.

top_supervisor
├── research_team_supervisor
│   ├── web_searcher
│   └── doc_searcher
└── writing_team_supervisor
    ├── drafter
    └── editor

Each team is a subgraph. Top-level supervisor picks the team; team supervisor picks the specialist.

Swarm (Handoffs)

No supervisor. Agents hand off to each other directly.

research_agent → (when done) → writing_agent → (when done) → review_agent

Each agent knows which other agents exist and can route to them.

Pick based on your problem. Supervisor is the safe default. Swarm scales better when agents have clear, non-overlapping responsibilities.

Supervisor Pattern

Build one agent per role, plus a supervisor that routes.

from typing import TypedDict, Annotated, Literal
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage
from langchain_anthropic import ChatAnthropic

class State(TypedDict):
    messages: Annotated[list[BaseMessage], add_messages]
    next: str

llm = ChatAnthropic(model="claude-sonnet-4-5")

def supervisor(state: State) -> State:
    # In production, use structured output (tool call) to pick the next agent.
    prompt = f"""You manage two agents: 'researcher' and 'writer'.
Given the conversation, decide who should act next. Respond with one word:
'researcher', 'writer', or 'FINISH' if the user's request is complete.

Conversation:
{state['messages']}"""

    decision = llm.invoke(prompt).content.strip().lower()
    # Note: .lower() turns 'FINISH' into 'finish'. route() treats anything
    # outside {'researcher', 'writer'} as END, so the loop still terminates.
    return {"next": decision}

def researcher(state: State) -> State:
    # Simulated research
    return {
        "messages": [AIMessage(content="Researched topic and found: [facts]")],
        "next": "",
    }

def writer(state: State) -> State:
    return {
        "messages": [AIMessage(content="Drafted: [content]")],
        "next": "",
    }

def route(state: State) -> Literal["researcher", "writer", "__end__"]:
    return state["next"] if state["next"] in {"researcher", "writer"} else END

graph = (
    StateGraph(State)
    .add_node("supervisor", supervisor)
    .add_node("researcher", researcher)
    .add_node("writer", writer)
    .add_edge(START, "supervisor")
    .add_conditional_edges("supervisor", route, {
        "researcher": "researcher",
        "writer": "writer",
        END: END,
    })
    .add_edge("researcher", "supervisor")
    .add_edge("writer", "supervisor")
    .compile()
)

Supervisor runs, decides, the chosen agent runs, returns to supervisor. Loop until supervisor says FINISH.

Structured Output for Routing

Asking Claude "respond with one word" is fragile. Better: give the supervisor a tool whose name is the routing target.

from langchain_core.tools import tool

@tool
def route_to_researcher() -> str:
    """Route to the researcher agent for fact-finding tasks."""
    return "researcher"

@tool
def route_to_writer() -> str:
    """Route to the writer agent for drafting content."""
    return "writer"

@tool
def finish() -> str:
    """Signal that the task is complete and we can respond to the user."""
    return "FINISH"

supervisor_llm = llm.bind_tools([route_to_researcher, route_to_writer, finish])

def supervisor(state: State) -> State:
    response = supervisor_llm.invoke(state["messages"])
    if response.tool_calls:
        target = response.tool_calls[0]["name"]
        return {"next": target.replace("route_to_", "")}
    return {"next": "FINISH"}

Tool calls are reliable; free-text routing is not.
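The translation from tool call to routing target is plain string handling. A hypothetical helper (`resolve_target` and `VALID_TARGETS` are not part of any API) that also fails closed, so a confused supervisor ends the loop instead of looping forever:

```python
VALID_TARGETS = {"researcher", "writer"}

def resolve_target(tool_calls: list[dict]) -> str:
    """Translate the supervisor's first tool call into a routing target.

    No tool call, the finish tool, or an unknown tool all map to FINISH.
    """
    if not tool_calls:
        return "FINISH"
    # 'route_to_researcher' -> 'researcher', 'finish' stays 'finish'.
    name = tool_calls[0]["name"].removeprefix("route_to_")
    return name if name in VALID_TARGETS else "FINISH"
```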

Handoff Pattern with Command

Command(goto=...) lets an agent route directly to another agent. No supervisor.

from langgraph.types import Command

def researcher(state: State) -> Command:
    # ... do research ...
    findings = "web findings"

    # Decide: hand off to writer
    return Command(
        update={"messages": [AIMessage(content=f"Findings: {findings}")]},
        goto="writer",
    )

def writer(state: State) -> Command:
    # ... write draft ...
    return Command(
        update={"messages": [AIMessage(content="Final draft")]},
        goto=END,
    )

graph = (
    StateGraph(State)
    .add_node("researcher", researcher)
    .add_node("writer", writer)
    .add_edge(START, "researcher")
    .compile()
)

Each agent knows what to do next and routes itself. No central dispatcher. Fits when the flow is mostly linear with agent-driven branches.

Sharing State vs Passing Messages

Two philosophies for multi-agent coordination.

Shared State

All agents read and write the same state. Message history, findings, drafts: everyone sees everything.

Pros: simple. Agents see the whole conversation.

Cons: context bloat. Every agent gets every message, whether relevant or not. Tokens add up.

Message-Passing

Each agent gets a narrow, scoped input. Outputs pass to the next agent explicitly.

def researcher(state: State) -> Command:
    findings = do_research(state["messages"])
    return Command(
        update={"research_output": findings},
        goto="writer",
    )

def writer(state: State) -> Command:
    draft = write_from(state["research_output"])
    return Command(
        update={"messages": [AIMessage(content=draft)]},
        goto=END,
    )

Writer reads research_output, not the full message history. Smaller context, cleaner boundaries.
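For this to work, research_output has to be a declared channel in the state schema. A dependency-free sketch, using operator.add as the list reducer where the earlier State used add_messages (they play the same role: accumulate instead of overwrite):

```python
import operator
from typing import Annotated, TypedDict

class PipelineState(TypedDict, total=False):
    # Accumulating channel: each agent's updates append, not overwrite.
    messages: Annotated[list, operator.add]
    # Scoped channel: the researcher writes it, the writer reads it.
    research_output: str
```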

Pick based on how coupled your agents are. Shared state for conversational multi-agent; message-passing for pipelines.

Prebuilt: create_supervisor

LangGraph ships a prebuilt supervisor builder:

from langgraph.prebuilt import create_react_agent
from langgraph_supervisor import create_supervisor

# web_search and write_draft are @tool functions assumed defined elsewhere
researcher = create_react_agent(model="claude-sonnet-4-5", tools=[web_search])
writer = create_react_agent(model="claude-sonnet-4-5", tools=[write_draft])

supervisor = create_supervisor(
    agents=[researcher, writer],
    model="claude-sonnet-4-5",
    prompt="You manage two agents. Route each task to the appropriate agent.",
).compile()

Fast path for standard supervisor setups. Drop to custom code when you need control.

Observability

Multi-agent is harder to debug. Chapter 10 covers LangSmith; for multi-agent, it's essentially required. You want to see:

  • Which agent ran.
  • What it received.
  • What it returned.
  • How long it took.

Each graph run is a trace; each agent invocation and each routing decision is a span within it.

Common Pitfalls

No exit condition in supervisor. Supervisor keeps routing; never says FINISH. Recursion limit hits. Always have a clear end.

Supervisor that re-routes to the same agent repeatedly. Infinite loop. Add a guard (e.g. after N calls to the same agent, force FINISH).
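The guard is plain bookkeeping in state. A sketch where the per-agent call counts and the cap of 3 are assumptions, not part of the earlier examples:

```python
MAX_CALLS_PER_AGENT = 3

def guarded_next(decision: str, agent_calls: dict[str, int]) -> str:
    """Force FINISH once the supervisor has picked the same agent too often.

    agent_calls is a per-run counter kept in graph state and mutated here.
    """
    if agent_calls.get(decision, 0) >= MAX_CALLS_PER_AGENT:
        return "FINISH"
    agent_calls[decision] = agent_calls.get(decision, 0) + 1
    return decision
```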

Shared state when agents overwrite each other. Two agents both update draft. Last write wins. Use different keys per agent or use add_messages for accumulating lists.

Free-text routing. "Respond with 'researcher' or 'writer'" fails when the LLM responds with "I think researcher would be best." Use tool calls.

Context explosion. Every message in state goes to every agent. Trim or scope per agent.
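A minimal scoping helper: keep the original task message plus a recent window. The window size is arbitrary; tune it per agent.

```python
def scope_messages(messages: list, keep_last: int = 6) -> list:
    """Trim an agent's context to the first (task) message plus a recent window."""
    if len(messages) <= keep_last + 1:
        return messages
    return [messages[0]] + messages[-keep_last:]
```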

Next Steps

Continue to 10-observability.md to see what your graphs are actually doing.