From mock tools to a real backend
Every tool so far has been faking it. look_up_orders didn't look anything up —
it returned a hardcoded dict. That was the right call while we learned the loop,
tools, errors, validation, and permissions: a fake tool kept the focus on the
harness. But ShopBot can't help a real customer with invented data. In this
lesson we give it a real backend and point the tools at it.
The plan: a separate store service
In the real world, an agent rarely owns the data it works with. The orders live in some existing system — a database behind an internal API — and the agent is just another client that calls it. We'll mirror that:
- a small ecommerce store service: a SQLite database with a FastAPI HTTP API,
- the harness's tools make HTTP calls to that service.
Keeping the store separate from the harness isn't an accident — it's the point. The agent and the data it uses are different systems. The harness doesn't reach into a database; it calls an API, exactly as it would against a real company backend.
We could have the tools open the SQLite file directly. Going through an HTTP API instead matches reality: the orders system is owned by someone else, exposed as endpoints, and our agent is one of many callers. It also means the store can enforce its own rules, independent of the agent.
The store, briefly
The database has the tables you'd expect — customers, orders, tracking, and refunds — seeded with the very order ShopBot has been talking about all series:
# orders seed row: order_number, customer_id, item, amount, status, tracking
("4521", 1, "running shoes", 59.99, "shipped", "1Z999")
The API exposes one endpoint for each thing a tool needs to do, and the endpoints line up one-to-one with the tools from "What is a tool?":
| Tool | Endpoint |
|---|---|
| look_up_orders | GET /orders?customer=... |
| get_tracking_status | GET /tracking/{tracking_number} |
| issue_refund | POST /refunds |
Here's one endpoint, so you can see there's nothing magic — it's a normal query:
@app.get("/orders")
def list_orders(customer: str) -> list[dict]:
    """Look up a customer's orders by their email or numeric id."""
    conn = connect()
    rows = conn.execute(
        "SELECT o.* FROM orders o JOIN customers c ON c.id = o.customer_id "
        "WHERE c.email = ? OR c.id = ?",
        (customer, customer),
    ).fetchall()
    return [dict(row) for row in rows]
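To see why one parameter can carry either an email or a numeric id, the same query can be run against an in-memory database. This is only a sketch, not the store's actual schema: the orders columns follow the seed-row comment above, while the customers table layout and the address sam@example.com are assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: the orders columns mirror the seed-row comment;
# the customers table and the email address are assumed for this sketch.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets dict(row) work, as in the endpoint
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE orders (
        order_number TEXT PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customers(id),
        item TEXT, amount REAL, status TEXT, tracking TEXT
    );
    INSERT INTO customers VALUES (1, 'sam@example.com');
    INSERT INTO orders VALUES
        ('4521', 1, 'running shoes', 59.99, 'shipped', '1Z999');
    """
)


def list_orders(customer: str) -> list[dict]:
    """Same query as the endpoint, minus the FastAPI decorator."""
    rows = conn.execute(
        "SELECT o.* FROM orders o JOIN customers c ON c.id = o.customer_id "
        "WHERE c.email = ? OR c.id = ?",
        (customer, customer),  # the one value is tried as email and as id
    ).fetchall()
    return [dict(row) for row in rows]


by_email = list_orders("sam@example.com")
by_id = list_orders("1")  # text "1" is compared numerically with the INTEGER id
unknown = list_orders("nobody@example.com")
```

Both lookups return the same single order, and an unknown customer yields an empty list rather than an error, so the tool can report "no orders found" without any special handling.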
The only change the tools needed
This is the satisfying part. Remember a tool is two halves: the schema the model sees, and the function the harness runs. To go from mock to real, we changed only the function bodies. The schemas didn't move a comma.
Before — fake data:
def look_up_orders(customer: str) -> dict:
    """Pretend to look up a customer's most recent order."""
    return {
        "order_number": "4521",
        "item": "running shoes",
        "status": "shipped",
        "tracking_number": "1Z999",
    }
After — a real HTTP call:
def look_up_orders(customer: str) -> list:
    """Look up a customer's orders from the store API."""
    response = httpx.get(
        f"{STORE_URL}/orders", params={"customer": customer}, timeout=TIMEOUT
    )
    response.raise_for_status()
    return response.json()
Because the schema is unchanged, the model sees the exact same tool it always did. It has no idea the data went from fake to real — and it shouldn't. How a tool gets its answer is the harness's business, not the model's. That clean line is what let us swap the implementation without touching anything else.
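One way to picture that separation is a schema list the model sees and a function table the harness dispatches through. The sketch below is an assumption about how such a registry might look, not the series' actual harness code: the names SCHEMAS and FUNCTIONS, the description text, and the stand-in "real" function (which returns canned data instead of calling the store) are all hypothetical. The schema field names follow the Anthropic tool format.

```python
import json
from typing import Any, Callable

# The half the model sees. The description wording is illustrative.
LOOK_UP_ORDERS_SCHEMA: dict[str, Any] = {
    "name": "look_up_orders",
    "description": "Look up a customer's orders by email or customer id.",
    "input_schema": {
        "type": "object",
        "properties": {"customer": {"type": "string"}},
        "required": ["customer"],
    },
}


def look_up_orders_mock(customer: str) -> dict:
    """The old body: hardcoded data."""
    return {"order_number": "4521", "status": "shipped"}


def look_up_orders_real(customer: str) -> list:
    """Stand-in for the HTTP version; a real harness would call the store here."""
    return [{"order_number": "4521", "status": "shipped"}]


# Hypothetical registry: schemas go to the model, functions stay in the harness.
SCHEMAS = [LOOK_UP_ORDERS_SCHEMA]
FUNCTIONS: dict[str, Callable[..., Any]] = {"look_up_orders": look_up_orders_mock}

schema_before = json.dumps(SCHEMAS, sort_keys=True)
FUNCTIONS["look_up_orders"] = look_up_orders_real  # the only change
schema_after = json.dumps(SCHEMAS, sort_keys=True)
```

Comparing schema_before with schema_after shows that what gets sent to the model is byte-for-byte the same, even though the function behind the tool name has been replaced.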
Everything we built still applies — and now it's real
The earlier production-ready lessons were written against mock tools, but they were really preparing for this moment:
- Handling tool errors: raise_for_status() throws if the store returns an error status such as a 404, and the request itself raises if the service is unreachable. That used to be hypothetical; now it genuinely happens, and the loop's try/except turns it into something the model can explain to the customer.
- Validating arguments: still guards every call before it leaves for the network.
- Permissions: issue_refund still asks a human first; only after approval does it POST /refunds and actually move money.
Nothing about the harness loop changed. We just made the tools honest.
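As a reminder of how those guards wrap a tool call, here is a minimal sketch of a dispatcher that checks permission first and converts any exception into a result the model can read. It is an illustration under assumptions rather than the harness from the earlier lessons: run_tool, its parameters, the result shape, and the issue_refund argument names are hypothetical, argument validation is left out for brevity, and a plain ConnectionError stands in for the exception an HTTP client would raise when the store is down.

```python
import json
from typing import Any, Callable


def run_tool(
    name: str,
    args: dict[str, Any],
    tools: dict[str, Callable[..., Any]],
    needs_approval: set[str],
    approve: Callable[[str, dict[str, Any]], bool],
) -> dict[str, Any]:
    """Run one tool call and always return a result, never raise."""
    if name not in tools:
        return {"is_error": True, "content": f"Unknown tool: {name}"}
    # Permission check happens before anything touches the network.
    if name in needs_approval and not approve(name, args):
        return {"is_error": True, "content": "A human declined this action."}
    try:
        result = tools[name](**args)
    except Exception as exc:  # e.g. an HTTP error status or a refused connection
        return {"is_error": True, "content": f"{type(exc).__name__}: {exc}"}
    return {"is_error": False, "content": json.dumps(result)}


def store_is_down(customer: str) -> list:
    """Simulates the store being stopped."""
    raise ConnectionError("store service is not reachable")


def issue_refund(order_number: str, amount: float) -> dict:
    """Hypothetical signature; stands in for the POST /refunds call."""
    return {"order_number": order_number, "refunded": amount}


tools = {"look_up_orders": store_is_down, "issue_refund": issue_refund}
gated = {"issue_refund"}
refund_args = {"order_number": "4521", "amount": 59.99}

failed = run_tool("look_up_orders", {"customer": "1"}, tools, gated, lambda n, a: True)
declined = run_tool("issue_refund", refund_args, tools, gated, lambda n, a: False)
approved = run_tool("issue_refund", refund_args, tools, gated, lambda n, a: True)
```

In all three cases the loop gets back a plain dictionary, so a failed lookup or a declined refund becomes text the model can relay instead of a crash in the harness.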
Running the whole thing
It's now two processes — the store, and the agent that calls it:
# Terminal 1 — the store
cd ecommerce-store
uv run db.py # once, to create + seed the database
uv run uvicorn main:app # serves http://127.0.0.1:8000
# Terminal 2 — the agent
cd agent-harness
export ANTHROPIC_API_KEY=sk-ant-...
uv run shopbot.py
Now when ShopBot answers "Where's my order?", the order it reports really came out of a database. Stop the store and run it again, and you'll watch the error-handling path light up for real.
Recap
- Mock tools were perfect for learning the harness; a real assistant needs real data.
- We added a separate store service (SQLite + a FastAPI HTTP API) and pointed the tools at it — mirroring how agents call existing backends in the real world.
- Going from mock to real meant changing only the tool function bodies; the schemas — and therefore the model's view — stayed identical.
- Error handling, validation, and permissions all carry over unchanged, and now operate on genuine successes and failures.
That's a complete, real ShopBot: a harness loop driving validated, permissioned tools against a live backend. From here you can add more tools, more endpoints, and richer approval rules — the shape stays the same.