Part II - Core execution patterns
Routing - conditional control flow
Lesson 03 turned one overloaded request into a fixed line of transformations. But a real assistant does not get one kind of request — it gets many, and the right sequence of steps depends on which one arrived. Routing is the first pattern that lets the agent decide the path instead of hard-coding it: read the input and the state, classify, then send control down a specialized branch. It is the smallest amount of decision-making that turns a pipeline into an adaptive system.
RunnableBranch "coordinator" plus a Google ADK Auto-Flow version, both delegating to booker / info / unclear handlers.
New capability: A branch point that chooses the next path based on input, accumulated state, or a prior step's observation. This is conditional control flow, with a visible decision you can audit.
booker/info/unclear delegation) and re-skin it as our running coding/research assistant. (4) Make the abstain threshold quantitative: a confidence cutoff is a knob that trades misroutes against unnecessary clarifications, and there is a worked number for where to set it. (5) Failure modes, a checklist, and the hand-off into parallelization, where the router stops choosing one branch and starts firing several.
1 · From a fixed path to a decided path
A prompt chain (lesson 03) is deterministic: step 1 always feeds step 2 always feeds step 3. That is exactly right when every request needs the same treatment — summarize, then translate, then format. But most agent inputs are not uniform. A customer-support agent receives order-status questions, product questions, technical-support questions, and gibberish, all through the same text box. Forcing every one of them down a single chain means one of two bad outcomes: the chain is general enough to handle all of them and therefore good at none, or it is tuned for one and silently mangles the rest.
Routing introduces conditional logic into the operating loop. The system evaluates the current situation against a set of criteria and selects which of several specialized functions, tools, or sub-flows should run next. The book's canonical illustration: a support agent first classifies the user's intent, then dispatches the request to the matching branch.
The mental model: routing is the switch statement of agent design. A plain program reads a value and jumps to the matching case; an agent reads a fuzzy natural-language situation, infers the case, and jumps. Everything hard about routing lives in that word "infers" — the decision is now probabilistic, so the router must be honest about how sure it is.
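To make the switch-statement analogy concrete, here is a minimal Python sketch. Everything in it is hypothetical: `Decision`, `classify`, `HANDLERS`, and the keyword matching that stands in for the inference step (in practice that step would be a model call). The point is only the shape: infer a case together with a confidence, then jump.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Decision:
    case: str          # which branch the router inferred
    confidence: float  # how sure it is, in [0, 1]


def classify(text: str) -> Decision:
    """Hypothetical stand-in for the inference step (a model call in practice)."""
    lowered = text.lower()
    if "order" in lowered:
        return Decision("order_status", 0.9)
    if "error" in lowered or "crash" in lowered:
        return Decision("technical_support", 0.8)
    return Decision("unclear", 0.3)  # low confidence: the router admits doubt


# The "cases" of the switch: one handler per label (all illustrative).
HANDLERS: Dict[str, Callable[[str], str]] = {
    "order_status": lambda text: "looking up your order",
    "technical_support": lambda text: "opening a troubleshooting flow",
    "unclear": lambda text: "could you say more about what you need?",
}


def route(text: str) -> str:
    decision = classify(text)             # infer the case
    return HANDLERS[decision.case](text)  # jump to it
```

Unlike a plain switch, the value being switched on comes with a confidence, which is what later sections use to decide when not to jump at all.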
A bare router returns only a label, such as "product_info". That is not enough to operate or debug. A production router returns four things: the destination, a confidence in [0,1], a short rationale, and the minimal context the chosen branch needs. Without confidence you cannot abstain; without rationale you cannot tell why a legal question landed in the general-chat branch when you read the logs three weeks later.
2 · Four ways to decide, and what each costs
The book lists four mechanisms for the decision component: rule-based, LLM-based, embedding-based, and trained-classifier routing. They are not ranked; they trade accuracy, latency, cost, and flexibility differently, and mature systems layer them.
The crucial distinction the book draws is between the LLM-based and the trained-classifier approaches: both can be accurate, but only the LLM runs a generative model at decision time. The classifier has already baked its logic into weights, so it is cheap and deterministic per call — at the price of needing a labeled dataset and re-training when the route set changes.
Routing is not only a front-door classifier. The book stresses it can fire at any stage of the loop: as initial task classification, as a mid-chain decision about what to do next given accumulated state, or as tool selection inside a sub-flow. A research system might use one router to assign work among retrieval, summarization, and analysis agents; a coding assistant first identifies the language and the user's intent (debug, explain, translate) before handing the snippet to the matching tool.
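Layering the mechanisms can be sketched as a cheap rule pass in front of a more expensive classifier. This is a sketch under assumptions, not the book's code: the patterns in `RULES`, the labels (taken from the coding-assistant intents above), and the `model_route` callable that stands in for an LLM or trained classifier are all hypothetical.

```python
import re
from typing import Callable, List, Optional, Pattern, Tuple

# Hypothetical hard rules: the first matching pattern decides the route.
RULES: List[Tuple[Pattern[str], str]] = [
    (re.compile(r"\btraceback\b|\bstack trace\b", re.IGNORECASE), "debug"),
    (re.compile(r"\btranslate\b", re.IGNORECASE), "translate"),
]


def rule_route(text: str) -> Optional[str]:
    """Return a label if an unambiguous rule fires, else None."""
    for pattern, label in RULES:
        if pattern.search(text):
            return label
    return None


def layered_route(text: str, model_route: Callable[[str], str]) -> Tuple[str, str]:
    """Return (label, decided_by); the model runs only when no rule fires."""
    label = rule_route(text)
    if label is not None:
        return label, "rule"
    return model_route(text), "model"
```

Recording which layer decided makes it possible to measure how much traffic the rules absorb before any model call is paid for.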
3 · The book's coordinator, made concrete
The chapter's running code builds a "coordinator" that routes a user request to one of three handlers. In LangChain it is an LLM classifier piped into a RunnableBranch; in Google ADK it is a parent Agent with sub_agents that the framework's Auto-Flow delegates to automatically. Two framework styles, one idea: classify, then dispatch to a specialist, and keep the decision visible.
LangChain style (explicit branch). A prompt forces a single-word label, then a branch maps the label to a handler:
# Imports assume the langchain-core package. `llm` and the three *_branch
# runnables are assumed to be defined elsewhere in the chapter's code.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableBranch, RunnablePassthrough

# 1) the decision component: LLM emits exactly one label
coordinator_router_prompt = ChatPromptTemplate.from_messages([
    ("system",
     "Analyze the request and decide which handler should take it.\n"
     "- flights or hotels -> output 'booker'\n"
     "- general questions -> output 'info'\n"
     "- otherwise/unclear -> output 'unclear'\n"
     "Output ONLY one word: 'booker', 'info', or 'unclear'."),
    ("user", "{request}"),
])
router_chain = coordinator_router_prompt | llm | StrOutputParser()

# 2) the dispatch: map the label to a specialist branch
delegation = RunnableBranch(
    (lambda x: x["decision"].strip() == "booker", booking_branch),
    (lambda x: x["decision"].strip() == "info", info_branch),
    unclear_branch,  # default / fallback branch
)

# 3) compose: route first, then run the chosen branch
coordinator = ({"decision": router_chain, "request": RunnablePassthrough()}
               | delegation | (lambda x: x["output"]))
Note three things the book's code embodies. The router emits a closed vocabulary of labels, not free text. The branch has an explicit default (unclear) — there is always a fallback. And the destination handlers (booking_handler, info_handler, unclear_handler) are isolated functions, so each branch can be tested and changed independently.
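The closed vocabulary is only as good as the check that enforces it. Below is a minimal sketch of that check, assuming the three labels from the prompt above; `ALLOWED_LABELS` and `normalize_label` are hypothetical names, not part of the book's code or of LangChain.

```python
ALLOWED_LABELS = {"booker", "info", "unclear"}


def normalize_label(raw: str) -> str:
    """Map raw model output onto the closed vocabulary, defaulting to 'unclear'."""
    # Tolerate surrounding whitespace, quotes, a trailing period, and casing.
    cleaned = raw.strip().strip("'\".").lower()
    return cleaned if cleaned in ALLOWED_LABELS else "unclear"
```

Anything that is not exactly one known label, including a hedged sentence from the model, lands on the fallback rather than on an arbitrary branch.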
Google ADK style (capability/tool routing). Instead of an explicit graph, ADK gives the coordinator sub_agents and lets the framework's Auto-Flow match the request to the right one based on each sub-agent's description:
# Import assumes the google-adk package. `booking_tool` and `info_tool`
# are assumed to be defined elsewhere in the chapter's code.
from google.adk.agents import Agent

booking_agent = Agent(name="Booker", model="gemini-2.0-flash",
                      description="Handles flight and hotel bookings via the booking tool.",
                      tools=[booking_tool])
info_agent = Agent(name="Info", model="gemini-2.0-flash",
                   description="Answers general information questions via the info tool.",
                   tools=[info_tool])
coordinator = Agent(name="Coordinator", model="gemini-2.0-flash",
                    instruction=("You ONLY analyze the request and delegate; never answer directly.\n"
                                 "- booking of flights/hotels -> delegate to 'Booker'\n"
                                 "- general information -> delegate to 'Info'"),
                    sub_agents=[booking_agent, info_agent])
# runner.run(...): Auto-Flow routes to a sub_agent based on its description
The book's contrast is exactly the one in the index's "framework choices": LangGraph's state-graph architecture suits complex routing where the decision depends on accumulated system state (you draw nodes and the functions/model-evaluations that govern transitions between them), while Google ADK routes implicitly via tool/capability descriptions, which suits agents whose actions are clearly named. Both are listed by the book alongside LangChain as the frameworks that give routing explicit structure.
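The state-graph idea can be illustrated without any framework. The sketch below is plain Python, not LangGraph's API: `NODES`, `next_node`, and the two node functions are hypothetical, and the node bodies are placeholders. What it shows is that the transition function reads the accumulated state, not just the original input, to pick the next node.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]


def retrieve(state: State) -> State:
    # Placeholder node: pretend we fetched documents.
    return {**state, "docs": ["doc-1"], "steps": state["steps"] + ["retrieve"]}


def summarize(state: State) -> State:
    # Placeholder node: pretend we summarized the fetched documents.
    return {**state, "summary": "placeholder", "steps": state["steps"] + ["summarize"]}


NODES: Dict[str, Callable[[State], State]] = {
    "retrieve": retrieve,
    "summarize": summarize,
}


def next_node(state: State) -> str:
    """Transition function: route on what the state already contains."""
    if "docs" not in state:
        return "retrieve"
    if "summary" not in state:
        return "summarize"
    return "END"


def run(state: State) -> State:
    while (name := next_node(state)) != "END":
        state = NODES[name](state)
    return state
```

A run that starts with documents already in the state skips retrieval, which is the kind of state-dependent path a one-shot front-door classifier cannot express.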
Re-skinned as the running coding/research assistant
Our track's running example is a coding and research assistant. The same coordinator pattern, with the book's lesson about naming routes by their action rather than a vague category:
ROUTES = {
    "explain_code":     read_only_explanation_flow,  # "what does this module do?"
    "fix_failing_test": code_edit_flow,              # "make test_auth pass"
    "research_lookup":  rag_retrieval_flow,          # "how does asyncio.gather schedule?"
    "destructive_op":   approval_gated_flow,         # "delete the generated files"
    "clarify":          ask_one_question_flow,       # fallback when intent is unclear
}

decision = router(context)              # -> {route, confidence, rationale, ctx}
if decision.confidence < TAU:           # abstain threshold (next section)
    branch = ROUTES["clarify"]
elif decision.route == "destructive_op":
    branch = ROUTES["destructive_op"]   # rule overrides model: always gate deletes
else:
    branch = ROUTES[decision.route]
log_route(decision)                     # route, confidence, rationale, outcome
The destructive_op line shows the most important guardrail in routing: a deterministic rule must override model judgment for anything dangerous. You never let an LLM's confidence alone decide whether to run a delete. The model can suggest the destructive route, but a rule, not the model, gates it (this becomes human-in-the-loop, lesson 15).
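One way to make the rule independent of the model is to run a deterministic check on the request text itself, before the model's label or confidence is consulted. The sketch below assumes a hypothetical keyword policy (`DESTRUCTIVE`), a hypothetical `TAU` value, and a simplified `Decision` record; a real policy would be more careful than a verb list.

```python
import re
from dataclasses import dataclass

TAU = 0.7  # hypothetical global abstain threshold

# Hypothetical deterministic policy: verbs that always require approval.
DESTRUCTIVE = re.compile(r"\b(delete|remove|drop|wipe)\b", re.IGNORECASE)


@dataclass(frozen=True)
class Decision:
    route: str
    confidence: float
    rationale: str


def select_route(request: str, decision: Decision) -> str:
    # Rule first: the gate fires on the request text, whatever the model said.
    if DESTRUCTIVE.search(request):
        return "destructive_op"  # the approval-gated flow
    # Only then does model confidence matter.
    if decision.confidence < TAU:
        return "clarify"
    return decision.route
```

A keyword rule like this will over-trigger (for example on "remove the unused import"), but its errors fall toward the approval gate, which is the safe direction for an irreversible action.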
4 · The abstain threshold is a knob you must set
Because the routing decision is probabilistic, the single most consequential design choice is when to refuse to route and ask for clarification instead. That is a confidence threshold τ. Set it too low and the agent confidently sends ambiguous requests to the wrong specialist (a misroute — a legal question answered by a generic chatbot). Set it too high and the agent pesters users for clarification on requests it could have handled fine, hurting the experience and adding a round-trip.
This is a precision/recall tradeoff in disguise, and it can be quantified. Suppose your router produces a confidence score on a labeled validation set. Two measurements matter. When it routes (does not abstain), how often is the route correct? And of the requests it would have routed correctly, how many did it needlessly punt to clarification? Sweep τ and you trace a curve.
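The sweep described above can be sketched in a few lines. The validation set below is a toy: the confidence values and correctness flags are made up for illustration, not measured from any router. `evaluate` and `sweep` are hypothetical helper names.

```python
from typing import List, Tuple

# Toy validation set (made-up values): (router confidence, was the top route correct?)
SAMPLES: List[Tuple[float, bool]] = [
    (0.95, True), (0.90, True), (0.85, True), (0.80, False), (0.70, True),
    (0.60, True), (0.55, False), (0.40, False), (0.35, True), (0.20, False),
]


def evaluate(samples: List[Tuple[float, bool]], tau: float) -> Tuple[float, float]:
    """Return (accuracy when routed, share of correct routes needlessly punted)."""
    routed = [ok for conf, ok in samples if conf >= tau]
    total_correct = sum(1 for _, ok in samples if ok)
    punted_correct = sum(1 for conf, ok in samples if conf < tau and ok)
    # With nothing routed, accuracy is undefined; report NaN rather than guess.
    routed_accuracy = sum(routed) / len(routed) if routed else float("nan")
    needless = punted_correct / total_correct if total_correct else 0.0
    return routed_accuracy, needless


def sweep(samples: List[Tuple[float, bool]], taus: List[float]):
    """One (tau, routed_accuracy, needless_clarification_rate) row per threshold."""
    return [(tau, *evaluate(samples, tau)) for tau in taus]
```

On this toy set, raising τ from 0.5 to 0.75 lifts routed accuracy from 5/7 to 3/4 while the needless-clarification share rises from 1/6 to 1/2, which is the tradeoff the threshold controls.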
Where to set τ depends on the cost asymmetry between the two errors: if a misroute into the destructive_op branch is catastrophic, you raise τ high for that route specifically; for a harmless FAQ misroute, a low τ is fine.
Two refinements the book's spirit implies. First, τ can be per-route, not global: a high bar for dangerous or expensive branches, a low bar for cheap reversible ones. Second, you do not have to fully commit at the threshold: a near-tie between two routes can fan out to both (lesson 05) or escalate to a human (lesson 15) rather than guess.
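Both refinements fit in one small decision function. This is a sketch under assumptions: the threshold values, the tie margin, the score floor, and the rule that a tie involving a gated route escalates instead of fanning out are all hypothetical choices, and `scores` is assumed to hold at least two per-route confidences.

```python
from typing import Dict, List, Tuple

TAU_BY_ROUTE = {"destructive_op": 0.95, "fix_failing_test": 0.8}  # hypothetical
DEFAULT_TAU = 0.6      # bar for cheap, reversible routes
TIE_MARGIN = 0.05      # top two scores closer than this count as a near-tie
FLOOR = 0.3            # below this, nothing is worth routing or fanning out
ESCALATE_ON_TIE = {"destructive_op"}  # never fan out into a gated branch


def decide(scores: Dict[str, float]) -> Tuple[str, List[str]]:
    """Return (action, routes) where action is route, fan_out, escalate, or clarify."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (top, top_score), (runner_up, runner_score) = ranked[0], ranked[1]
    if top_score < FLOOR:
        return "clarify", []
    if top_score - runner_score < TIE_MARGIN:
        if top in ESCALATE_ON_TIE or runner_up in ESCALATE_ON_TIE:
            return "escalate", [top, runner_up]
        return "fan_out", [top, runner_up]
    if top_score < TAU_BY_ROUTE.get(top, DEFAULT_TAU):
        return "clarify", []
    return "route", [top]
```

The per-route lookup means a 0.9 score is enough to explain code but not enough to run a delete.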
5 · Where this routes (no pun) wrong
Failure modes
- Overlapping or vague labels. If "product_info" and "technical_support" can both describe the same query, the router cannot make a stable decision and confidence collapses. Routes must be mutually distinguishable and named by action.
- No low-confidence fallback. Without an unclear/clarify default, an uncertain router still has to pick, so it guesses, confidently, on exactly the inputs it understands least.
- Model routing bypassing policy. Letting LLM confidence alone trigger a destructive or privileged path. A rule must gate anything irreversible, regardless of how sure the model claims to be.
- Closed-vocabulary leak. An LLM router emits free text ("this looks like a product question, maybe?") instead of one label, breaking the downstream switch. Constrain the output and validate it; fall back to unclear on any unparseable label.
- Silent route drift. Input distribution shifts (a new product line) but the route set and classifier do not, so a growing slice quietly lands in the wrong branch. Without logged route + outcome you never see it.
Implementation checklist
- Are the route labels mutually exclusive and named by the action they trigger, not a category?
- Does every route — including the default — have explicit input and output contracts (lesson 02)?
- What deterministic rule overrides model routing for dangerous/expensive branches?
- Is there a fallback route, and a confidence threshold τ (global or per-route) that triggers it?
- Does the router return destination + confidence + rationale + minimal context — not just a label?
- Are route, confidence, evidence, and final outcome logged so misroutes can be sampled and fixed?
- Could a cheap rule/embedding pre-filter handle the obvious cases before the LLM call?
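The logging item on this checklist can be as small as one JSON line per decision. The sketch below uses only the standard library; the field names and the `route_record` / `route_shares` helpers are hypothetical. Comparing route shares between time windows is one simple way to notice drift.

```python
import json
import time
from collections import Counter
from typing import Dict, Iterable, Optional


def route_record(route: str, confidence: float, rationale: str,
                 outcome: str, ts: Optional[float] = None) -> str:
    """Serialize one routing decision as a JSON line."""
    return json.dumps({
        "ts": time.time() if ts is None else ts,
        "route": route,
        "confidence": confidence,
        "rationale": rationale,
        "outcome": outcome,
    })


def route_shares(log_lines: Iterable[str]) -> Dict[str, float]:
    """Fraction of logged decisions that went to each route."""
    counts = Counter(json.loads(line)["route"] for line in log_lines)
    total = sum(counts.values())
    return {route: n / total for route, n in counts.items()}
```

If the share going to a fallback route climbs from one week's log to the next, that is a cue to sample those records and read their rationales.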
Checkpoint exercise
Where this points next
Routing answers "given this input, which one branch runs?" But sometimes the honest answer is "several." A research request might need retrieval and summarization and analysis at once; a near-tie between two routes might be best resolved by trying both and comparing. The next pattern — parallelization — keeps the router's classify-then-dispatch shape but fans control out to multiple branches that run concurrently, then fans the results back into one state. Lesson 05 builds the fan-out / fan-in machinery, the latency math that makes it worth doing, and the reducer that merges the branches the router lit up.
RunnableBranch, Google ADK Auto-Flow, LangGraph state graphs) is this one idea in three framework dialects.
Interview prompts
- What does routing add over a prompt chain? (§1 — conditional control flow: the agent classifies the situation and dispatches to a specialized branch instead of forcing every input through one fixed pipeline.)
- Name the four routing mechanisms and when each wins. (§2 — rule-based: fast/deterministic/brittle, for hard constraints; LLM-based: flexible, costs an inference; embedding: semantic, cheap, many routes; trained classifier: cheap+fast at inference but needs labeled data. Layer cheap rules in front of the LLM.)
- Why must a router return more than a label? (§1 — without confidence you cannot abstain; without rationale and logged outcome you cannot debug a misroute or detect route drift.)
- How do you set the abstain threshold τ? (§4 — sweep it on a labeled set; raising τ catches more misroutes but causes more needless clarifications. Choose by the cost asymmetry — high τ for dangerous/expensive routes, low for harmless reversible ones; τ can be per-route.)
- Should the LLM's confidence be allowed to trigger a file deletion? (§3, §5 — no; a deterministic rule must hard-gate irreversible/privileged actions regardless of model confidence, then escalate to human approval.)
- Contrast LangGraph and Google ADK routing styles. (§3 — LangGraph: explicit state graph with nodes and transition functions, suits state-dependent multi-step routing; ADK: implicit Auto-Flow that matches the request to a sub-agent/tool by its description, suits clearly-named actions.)
- How would you cut the cost of an expensive LLM router by half without losing quality? (§2 — put a rule/embedding pre-filter in front that handles the unambiguous majority for ~$0 and ~0 ms, and only send the nuanced remainder to the model.)