
Part 0 - Map

Orientation - why agentic systems need patterns

A single model call can reason and write. It cannot remember, act, recover, or know when to stop. This track is the curriculum for closing that gap: turning a stateless completion into a reliable autonomous system. Before we build any one pattern, we need the map — what an agent actually is, the ladder of autonomy the book climbs, and why the 21 patterns ahead must be learned in dependency order rather than collected as tricks.

Book source
Introduction ("what is an agentic system", the framework trio), Agent characteristics (Level 0–3, the five future hypotheses), and the Conclusion. PDF outline pages 14–23 and 237–240.
The plan
Five moves. (1) Pin down the difference between a model call and an agent, using the book's five-step loop. (2) Climb the book's autonomy ladder — Level 0 reasoning engine through Level 3 multi-agent systems — and watch what new capability each rung adds. (3) Make the loop concrete as engineering state: a control loop with goal, state, policy, actions, observations, and a stop rule. (4) Run the numbers on why the loop must be governed — a worked retry/cost budget showing how autonomy compounds risk. (5) Lay out the dependency order that the rest of the track follows, and read the book's five future hypotheses as the forces pulling that order forward. We close on the governing rule: every new degree of autonomy must arrive with a matching control surface.
Linear position
Prerequisite: None. This is the map drawn before any mechanism. You should know roughly what a large language model (LLM) is — a system that, given text, predicts more text.
New capability: A durable mental model — an agent is a stateful control loop wrapped around a model — plus the vocabulary (boundary, state, policy, observation, stop rule, control surface) that every later lesson reuses. You will leave able to draw any AI workflow as a loop and say which edge each design pattern strengthens.

1 · A model is not an agent — the five-step loop

Start from the gap the book opens with. A modern LLM is extraordinary at one thing: given a context window of text, it produces a plausible continuation. But that is a single, stateless transaction. Ask it to "book my team's offsite," and it will write you a lovely description of how it would do so — then forget the entire conversation the instant the call returns. It has no calendar, no inbox, no memory of last week, and no way to notice that the venue it suggested is already full. It can advise; it cannot carry out.

The book's definition is deliberately simple: an agent is a system that perceives its environment and takes actions to achieve a goal. It is an LLM that has grown three things a bare model lacks — the ability to plan, to use tools, and to interact with an environment. The book's mental image is an assistant that learns on the job by repeating a five-step loop:

1. Get the goal. A human states a target: "schedule my week," "fix this bug."
2. Scan the environment. Gather the facts that matter: read the inbox, check the calendar, open the repo, look up contacts.
3. Plan. Decide the best sequence of steps toward the goal.
4. Act. Execute against the real world: send the invite, write the patch, run the tests.
5. Learn & adapt. Observe the result and adjust. If a meeting gets moved, the system learns from it to do better next time.

The key word is loop. Step 5 feeds back into step 2: the agent acts, observes what changed, and decides again. A chatbot is steps 3–4 run exactly once with no memory; an agent runs the whole cycle until the goal is met or it decides to stop. That single architectural difference — a feedback loop carrying state — is the seed of everything in this track. Every one of the 21 patterns ahead is a named way to make one edge of that loop stronger or safer.

Why this matters now
The book grounds the urgency in numbers: surveys report the large majority of enterprises increasing agent use, with roughly one in five only starting in the last year; agent startups had raised over $2B by end-2024 against a ~$5.2B market, projected toward ~$200B by 2034. The takeaway is not the forecast — it is that systems are being shipped faster than the discipline of building them reliably is settling. This track is that discipline.

2 · The autonomy ladder — Level 0 to Level 3

The book frames the last two years as a paradigm shift from "simple automation" to "complex autonomous systems," and it gives that shift a ladder. Each rung adds exactly one new capability — which is precisely the structure this course imitates. Read the ladder as a sequence of capabilities the agent is allowed to gain, because every capability is also a new way to fail.

| Level | Name | What it can do | What it cannot do |
|---|---|---|---|
| Level 0 | Core reasoning engine | Answer from pretrained knowledge. Explains known concepts well. | No tools, no memory, no environment. Cannot know the 2025 Oscar winner if it post-dates training. |
| Level 1 | Connected problem-solver | Calls external tools: web search, a database (RAG), a finance API for a live AAPL quote. Crosses steps to interact with the world. | Limited strategy; mostly one tool per need, little context discipline. |
| Level 2 | Strategic problem-solver | Multi-step strategy, proactive help, and self-improvement. Its core skill is context engineering: curating exactly the right information for each step. Asks for feedback to improve its own packaging of inputs. | Still a single agent; no true division of labor. |
| Level 3 | Collaborative multi-agent system | A team of specialists coordinated like a human org: a "project manager" agent delegating to "market research," "design," "marketing" agents that communicate and share state. | Bounded by the underlying LLM's reasoning; genuine inter-agent learning is still early. |

The book's worked Level-2 example is the one we will carry through the entire course. A software-engineering agent receives a bug report. It reads the report and the codebase, then distills a large pile of information into a focused context — only the files, symbols, and error lines that matter — so it can write, test, and submit a correct patch. That distillation is context engineering, and the book promotes it to a first-class skill (Appendix A and the reasoning chapter): the way to push accuracy up is to give the model short, focused, relevant context rather than dumping everything in. A separate Level-2 example makes the same point: a travel assistant connected to your inbox extracts just the flight number, date, and location from a long confirmation email before handing that to a calendar and weather API — never the whole email.
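The travel-assistant case can be sketched as a small extraction step that hands downstream tools three fields instead of the whole message. This is a minimal illustration and not the book's code: the sample email, the regular expressions, and the names `FlightContext` and `extract_flight_context` are all hypothetical, and a real system would more likely use a model or a proper parser than hand-written patterns.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlightContext:
    """The only three facts passed downstream (hypothetical structure)."""
    flight_number: str
    date: str
    location: str


# Hypothetical patterns for one made-up email layout; real emails vary widely.
FLIGHT_RE = re.compile(r"\bFlight:\s*([A-Z]{2}\d{2,4})\b")
DATE_RE = re.compile(r"\bDate:\s*(\d{4}-\d{2}-\d{2})\b")
DEST_RE = re.compile(r"\bArriving at:\s*(.+)")


def extract_flight_context(email_body: str) -> Optional[FlightContext]:
    """Return just the curated fields, or None if any field is missing."""
    flight = FLIGHT_RE.search(email_body)
    date = DATE_RE.search(email_body)
    dest = DEST_RE.search(email_body)
    if not (flight and date and dest):
        return None  # refuse to guess; let the caller decide what to do
    return FlightContext(flight.group(1), date.group(1), dest.group(1).strip())


EMAIL = """Dear traveller, thank you for booking with us!
(many paragraphs of terms, baggage rules, and adverts)
Flight: XY123
Date: 2025-03-14
Arriving at: Lisbon
Please arrive two hours before departure."""

context = extract_flight_context(EMAIL)
print(context)
```

The calendar and weather calls would then receive `context`, a few dozen characters, rather than the full email body.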

LLM (Lvl 0): reasoning only
  ─▶ RAG: grounded in facts, less hallucination
  ─▶ single-tool agent (Lvl 1): acts on the world via one tool at a time
  ─▶ agentic RAG (Lvl 2): decides WHEN/WHAT to retrieve, curates context per step
  ─▶ agentic AI (Lvl 3): teams of specialists coordinate

That arrow — from a bare LLM, to retrieval-grounded answers, to a tool-using agent, to an agent that decides when to retrieve, to collaborating teams — is the spine of the syllabus. Notice it is monotonic in one quantity: autonomy. Each rung lets the system make more decisions on its own. And that is exactly why each rung also demands more of the machinery in the back half of this track — monitoring, recovery, guardrails, evaluation. More freedom, more governance.

3 · The loop as engineering state

The book's five steps are a mental model; to build anything we have to name the state the loop carries. Strip the agent down and it is a small, explicit control loop. Everything else in the course bolts onto this skeleton.

agent = {
  goal,              # what success means (lesson 13 makes this measurable)
  state,             # everything known so far: scratchpad, memory, results
  policy,            # the model + prompt that chooses the next action
  allowed_actions,   # the tools/transitions permitted right now (lesson 07, 20)
  observation_log,   # an append-only record of what each action returned
  stop_condition,    # goal met, budget spent, or escalate to a human
}

# Pseudocode: build_context, validate, execute, update, log, and final_report
# are placeholders that later lessons fill in.
def run_agent(goal, state, policy, allowed_actions, observation_log, stop_condition):
    while not stop_condition(state):
        context     = build_context(goal, state)         # context engineering (lesson 02)
        action      = policy.choose(context)             # the model decides
        checked     = validate(action, allowed_actions)  # refuse out-of-bounds actions
        observation = execute(checked)                   # touch the real world
        state       = update(state, action, observation) # fold result back in
        log(observation_log, action, observation)        # so the run is replayable
    return final_report(state)

Six pieces of named state, one loop. Read each later pattern as an answer to "which line does this strengthen?" Routing (lesson 04) enriches policy.choose with a branch decision. Tool use (07) populates allowed_actions and execute. Memory (10) gives state structure that survives across runs. Reflection (06) inserts a critique step before update. Guardrails (20) harden validate. Evaluation (21) scores the whole observation_log, not just the final report. If you cannot point at the line a feature touches, you probably do not yet have an agent — you have a prompt with extra steps.
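To see the skeleton execute end to end, here is a toy version under heavy assumptions: the "environment" is a single counter, the policy is a deterministic function standing in for a model call, and every name (`Agent`, `AgentState`, `toy_policy`, `run`) is hypothetical. It is a sketch of the loop's shape, not an implementation from the book.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentState:
    counter: int = 0   # toy "world knowledge": how far we have counted
    steps: int = 0     # how many loop iterations have run


@dataclass
class Agent:
    goal: int                                   # target value for the counter
    policy: Callable[[dict], str]               # stands in for model + prompt
    allowed_actions: frozenset = frozenset({"increment"})
    max_steps: int = 10                         # budget half of the stop rule
    state: AgentState = field(default_factory=AgentState)
    observation_log: list = field(default_factory=list)


def stop_condition(agent: Agent) -> bool:
    """Stop when the goal is met or the step budget is spent."""
    return (agent.state.counter >= agent.goal
            or agent.state.steps >= agent.max_steps)


def build_context(agent: Agent) -> dict:
    """Pass the policy only what it needs, not the whole state."""
    return {"goal": agent.goal, "counter": agent.state.counter}


def validate(action: str, allowed: frozenset) -> str:
    """Refuse any action outside the allowed set."""
    if action not in allowed:
        raise ValueError(f"action {action!r} is not allowed")
    return action


def execute(action: str, state: AgentState) -> int:
    """Toy environment: the only action returns the next counter value."""
    return state.counter + 1


def run(agent: Agent) -> dict:
    while not stop_condition(agent):
        context = build_context(agent)
        action = validate(agent.policy(context), agent.allowed_actions)
        observation = execute(action, agent.state)
        agent.state.counter = observation        # fold the result back in
        agent.state.steps += 1
        agent.observation_log.append((action, observation))
    return {"counter": agent.state.counter, "steps": agent.state.steps}


def toy_policy(context: dict) -> str:
    """Deterministic stand-in for a model call."""
    return "increment"


report = run(Agent(goal=3, policy=toy_policy))
print(report)
```

Swapping `toy_policy` for a real model call changes only one field; the budget, the validation step, and the log stay where they are, which is the point of naming them.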

A useful test
Pick any AI workflow you use. Try to name its state, its allowed_actions, and its stop_condition. If there is no state to name — if every call starts from a blank slate — it is a chatbot, not an agent, however clever its answers are.

4 · Why autonomy must be governed — a worked budget

It is tempting to think the hard part of an agent is the intelligence. In production, the hard part is the loop not stopping, or stopping wrong, or spending without bound. A loop that decides its own next action can also decide to keep going. Numbers make the danger concrete.

Worked example — the runaway research assistant. Our coding/research agent answers a question by repeatedly: building context, calling the model, and maybe calling a search tool. Suppose each iteration costs roughly:

  • Input tokens / step: ~8,000
  • Output tokens / step: ~1,000
  • Model latency / step: ~4 s
  • Tool latency / step: ~2 s

At illustrative prices of $3 per million input tokens and $15 per million output tokens, one step costs about 8000 × 3/10⁶ + 1000 × 15/10⁶ = $0.024 + $0.015 = $0.039, and takes about 6 seconds. A well-behaved run of 8 steps costs roughly 8 × $0.039 ≈ $0.31 and finishes in 8 × 6 = 48 s. Tolerable.
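The same arithmetic as a few lines of Python, so the prices and token counts can be swapped for your own. The prices are the illustrative ones from the example, not any vendor's actual rates, and `step_cost` is a hypothetical helper name.

```python
# Illustrative prices from the worked example, in dollars per million tokens.
INPUT_PRICE_PER_M = 3.0
OUTPUT_PRICE_PER_M = 15.0


def step_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one loop iteration at the illustrative prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


per_step = step_cost(8_000, 1_000)   # 0.024 + 0.015
run_cost = 8 * per_step              # a well-behaved 8-step run
run_seconds = 8 * (4 + 2)            # model latency + tool latency per step

print(f"per step: ${per_step:.3f}")
print(f"8 steps:  ${run_cost:.2f} in {run_seconds} s")
```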

Now remove the stop rule. The agent hits a flaky search API, gets a confusing result, decides to "try once more," and loops. Worse, context grows: each step it appends the last observation, so input tokens climb — 8k, 12k, 16k, 20k. Left unbounded for 50 steps with growing context, the input bill alone can pass $5, latency exceeds 5 minutes, and the user is still staring at a spinner. One confused agent, one missing line of code. This is not hypothetical; it is the default failure of every naive loop.
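A quick simulation shows how far past $5 the bill goes. It assumes the growth stays linear at +4,000 input tokens per step (matching the 8k, 12k, 16k, 20k sequence), that output size and per-step latency stay constant, and the same illustrative prices as before; `runaway_cost` is a hypothetical helper. Constant latency is optimistic, since larger contexts usually take longer to process.

```python
def runaway_cost(steps: int, start_tokens: int = 8_000, growth: int = 4_000,
                 output_tokens: int = 1_000, seconds_per_step: int = 6):
    """Total the bill when input context grows by `growth` tokens every step."""
    input_total = sum(start_tokens + growth * i for i in range(steps))
    input_dollars = input_total * 3 / 1_000_000            # $3 per million
    output_dollars = steps * output_tokens * 15 / 1_000_000  # $15 per million
    return input_total, input_dollars, output_dollars, steps * seconds_per_step


tokens, in_cost, out_cost, seconds = runaway_cost(50)
print(f"input tokens: {tokens:,}")
print(f"input cost:   ${in_cost:.2f}")
print(f"output cost:  ${out_cost:.2f}")
print(f"wall time:    {seconds} s")
```

Under these assumptions the 50-step run consumes 5.3 million input tokens, about $15.90 of input alone, roughly fifty times the cost of the well-behaved 8-step run.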

The compounding trap
Autonomy multiplies the cost of every other bug. A stateless prompt that misfires wastes one call. An agentic loop that misfires can waste a hundred calls, corrupt its own state, and take an irreversible action (send the email, push the commit) before anyone notices. This is the single reason the back half of the book exists.

The fix is a control surface for every degree of freedom we grant. A budget (max steps, max tokens, max dollars — lesson 18). A goal-and-progress check that can detect "not getting closer" and halt (lesson 13). A recovery path that treats the flaky API as an expected observation rather than a reason to spin (lesson 14). A human-in-the-loop gate before irreversible actions (lesson 15). The design rule that governs the whole track: never add a degree of autonomy without naming the boundary, the evidence, the budget, and the stop rule that govern it.
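Three of those control surfaces fit in a short sketch: a budget, a "not getting closer" check, and an approval gate. Everything here is an assumption made for illustration: the `Budget` class, the `stalled` heuristic with its three-step window, the made-up progress scores, and the action names in `IRREVERSIBLE`. The recovery path is left out because it depends on the tool being called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Budget:
    max_steps: int
    max_tokens: int
    max_dollars: float
    steps: int = 0
    tokens: int = 0
    dollars: float = 0.0

    def charge(self, tokens: int, dollars: float) -> None:
        """Record the cost of one completed step."""
        self.steps += 1
        self.tokens += tokens
        self.dollars += dollars

    def exhausted(self) -> Optional[str]:
        """Return the name of the first limit reached, or None."""
        if self.steps >= self.max_steps:
            return "max_steps"
        if self.tokens >= self.max_tokens:
            return "max_tokens"
        if self.dollars >= self.max_dollars:
            return "max_dollars"
        return None


def stalled(progress_history: list, window: int = 3) -> bool:
    """True if the score has not improved over the last `window` steps."""
    if len(progress_history) <= window:
        return False
    return max(progress_history[-window:]) <= progress_history[-window - 1]


IRREVERSIBLE = {"send_email", "push_commit"}   # hypothetical action names


def needs_human_approval(action: str) -> bool:
    """Gate irreversible actions behind a human decision."""
    return action in IRREVERSIBLE


budget = Budget(max_steps=8, max_tokens=100_000, max_dollars=0.50)
progress = []
reason = None
for score in [0.1, 0.3, 0.3, 0.3, 0.3, 0.9]:   # made-up progress scores
    budget.charge(tokens=9_000, dollars=0.039)
    progress.append(score)
    reason = budget.exhausted() or ("stalled" if stalled(progress) else None)
    if reason:
        break

print(reason, budget.steps)
```

In this run the loop halts on the progress check before any budget limit is reached, so the final 0.9 score is never seen. That is the trade-off a stall window encodes: a shorter window saves money but may stop a run that was about to succeed.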

5 · The dependency order — and the frameworks

Because each capability is also a liability, the patterns cannot be learned in any order. You cannot meaningfully evaluate a trajectory before the agent records one; you cannot debug a multi-agent team before a single agent is debuggable; you cannot add tools before the loop has a state to fold their results into. The book's conclusion groups the patterns, and this track linearizes them into a ladder:

Foundations (01–02)
Draw the agent boundary; make the prompt/context an enforced contract.
Core execution (03–08)
Chain, route, parallelize, reflect, use tools, plan — the verbs of the loop.
State & protocols (09–12)
Multi-agent roles, memory, learning from traces, MCP integration.
Control & grounding (13–18)
Goals/monitoring, recovery, human review, RAG, A2A, resource budgets.
Trust & autonomy (19–23)
Reasoning, guardrails, evaluation, prioritization, exploration.
Composition (24)
Assemble the full coding/research agent from the parts.
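"Dependency order" has a precise meaning: a sequence in which every prerequisite appears before the thing that needs it. The sketch below encodes only the three constraints stated above, with hypothetical short labels, and uses the standard-library `graphlib` module (Python 3.9 or later) to produce one valid order. It does not reproduce the full lesson sequence.

```python
from graphlib import TopologicalSorter

# Each key must come after every item in its set. Only the three constraints
# stated in the text are encoded; the short labels are hypothetical.
depends_on = {
    "evaluate_trajectory": {"record_trajectory"},
    "debug_multi_agent_team": {"debug_single_agent"},
    "add_tools": {"loop_state"},
}


def respects(order, graph) -> bool:
    """True if every prerequisite appears before the item that needs it."""
    position = {name: i for i, name in enumerate(order)}
    return all(position[pre] < position[item]
               for item, pres in graph.items() for pre in pres)


order = list(TopologicalSorter(depends_on).static_order())
print(order)
print(respects(order, depends_on))                   # a valid order passes
print(respects(list(reversed(order)), depends_on))   # the reverse does not
```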

Every code example in the book lands on one of three concrete "canvases," and we will name them the same way it does, choosing whichever fits the pattern: LangChain/LangGraph, CrewAI, and Google ADK.

The book's stance, which is also this track's: frameworks are interchangeable canvases for the same underlying patterns. We learn the pattern — the state, the contract, the failure modes — so that the framework becomes an implementation detail you can swap.

Finally, the introduction's five future hypotheses explain why the order tilts ever further toward governance. Each one cranks autonomy up another notch: (1) generalist agents managing fuzzy long-horizon goals — possibly assembled "Lego-style" from many small specialist models (SLMs); (2) deep personalization with proactive goal discovery — agents that act before being asked; (3) embodiment, agents that perceive and act in the physical world; (4) an agent-driven economy, agents as independent economic actors; (5) goal-driven, morphing multi-agent systems that rewrite their own structure and prompts to hit a declared outcome. Every hypothesis adds freedom, and so every hypothesis adds the need for more boundaries, more monitoring, more recovery, more evaluation — the exact machinery the second half of this course builds.

6 · The running example

One example threads through all 25 lessons: a coding and research assistant. It receives a task, inspects a repository, searches references, plans edits, applies patches, runs tests, asks for human approval when an action is risky, and reports its evidence. We will meet it at every level of the ladder — as a Level-1 tool caller in lesson 07, a Level-2 context engineer throughout, and a Level-3 collaborator in lesson 09 — and watch it accumulate exactly one new capability and one new control surface per lesson. By the capstone (lesson 24) it is a complete system that can act, remember, recover, evaluate, and improve, and you will be able to point at the lesson that earned each piece.

Where this points next

We now have the map: an agent is a stateful control loop, autonomy climbs a four-rung ladder, and every rung must arrive with a control surface. But "a loop around a model" is still vague — where exactly does the model end and the system begin? Which decisions belong to the model's judgment and which to deterministic code we can test? Lesson 01, "Agent boundary — model, policy, state, environment," draws that line precisely, separating the stochastic model from the control system around it. Only once the boundary is sharp can lesson 02 turn prompting into an enforced contract, and the execution patterns begin.

Failure modes

  • Tools before state. Giving an agent powerful tools before it has a state or stop condition — the runaway loop of §4.
  • Teams before debuggability. Adding multi-agent roles (Level 3) before a single-agent loop can be replayed and debugged.
  • Context dumping. Stuffing the whole repo/email into the prompt instead of curating it — the opposite of the book's context engineering.
  • Checklist thinking. Treating the 21 patterns as independent tricks rather than a composition with a dependency order.

Implementation checklist

  • Can you draw the loop — goal, state, policy, actions, observations, stop — for your system?
  • Can you name which pattern (and which code line) controls each edge of the loop?
  • Is there a budget: max steps, max tokens, max dollars, and a stop rule that fires?
  • Can you replay a run from the stored observation log?
  • For every degree of autonomy, have you named its boundary, evidence, budget, and stop rule?
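For the replay item in the checklist, one minimal shape is a JSON-lines log written one record per step and read back without re-executing anything. This is a sketch under assumptions: the record fields, the helper names `log_step` and `replay`, and the sample actions are hypothetical, and `io.StringIO` stands in for a file opened in append mode.

```python
import io
import json


def log_step(stream, step: int, action: str, observation: str) -> None:
    """Append one JSON line per step; earlier lines are never rewritten."""
    record = {"step": step, "action": action, "observation": observation}
    stream.write(json.dumps(record) + "\n")


def replay(stream):
    """Yield the recorded steps in order without re-executing any action."""
    for line in stream:
        if line.strip():
            yield json.loads(line)


# io.StringIO stands in for an append-only file opened with mode "a".
log = io.StringIO()
log_step(log, 1, "search('flaky api')", "timeout")
log_step(log, 2, "search('flaky api')", "3 results")

log.seek(0)
steps = list(replay(log))
for record in steps:
    print(record["step"], record["action"], "->", record["observation"])
```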
Takeaway
An LLM is a stateless text transaction; an agent is a stateful control loop wrapped around it — perceive, plan, act, observe, adapt — running until a goal is met or a stop rule fires. The book climbs an autonomy ladder (Level 0 reasoning engine → Level 1 tool user → Level 2 strategic context engineer → Level 3 multi-agent team), and each rung adds one capability that is also one new way to fail. Because autonomy compounds the cost of every bug, this track is ordered so that each pattern arrives with its control surface: never grant a degree of autonomy without naming its boundary, evidence, budget, and stop rule. Patterns are learned in dependency order on interchangeable canvases (LangChain/LangGraph, CrewAI, Google ADK); the framework is a detail, the loop is the invariant.

Interview prompts