What is an agent harness? (and why we need it)

Let's start with a story you can picture.

You're building a chatbot for an online store — call it ShopBot. A customer types:

"Where's my order? I bought running shoes last week."

First attempt: just the model

You take a large language model (an LLM, like Claude), hand it the question, and read the reply. A good model answers honestly:

"I can help, but I don't have access to your order system from here. To check the running-shoes order, I'd need the order number, the email or phone used for the purchase, or a tracking number."

Notice what happened. The model didn't fail because it's dumb — it's being honest. It simply has no way to reach your order system. It can't open your database, it can't call the shipping carrier, it can't see anything beyond the words in the chat. So the best it can do is apologise and ask the customer to do the work themselves.

That's a dead end for ShopBot. The whole point was for the bot to look the order up for the customer.

And it can be worse

A weaker or badly-prompted model might not stop at "I can't access that." It might guess — confidently inventing something like "Your order #4521 shipped yesterday and arrives Thursday!" There is no order #4521; it just produced text that sounds right. A confident wrong answer about someone's order is worse than an honest "I can't check."

Either way, the lesson is the same:

An LLM on its own is a brilliant text generator with no hands and no eyes. It can reason and write beautifully, but it can't do anything — it can't look up an order, check inventory, or issue a refund.

What ShopBot actually needs to do

To genuinely help that customer, ShopBot has to:

Understand the question — "they want their order status"
Look up the customer's orders in your database
Check the shipping carrier for tracking info
Read those results
Write a helpful, accurate reply

Steps 1 and 5 are thinking and talking — the LLM is great at those. Steps 2–4 are doing — the LLM cannot do them by itself. Something has to connect the thinking to the doing.

That something is the agent harness.

So what is an agent harness?

An agent harness is the code that wraps around the model and gives it hands. It's the program that:

sends the user's message to the model,
notices when the model says "I need to look something up,"
actually performs that action (runs the database query, calls the API),
feeds the result back to the model,
and repeats until the model can give a final answer.

A simple way to picture it:

The LLM is a brilliant new employee on their first day — sharp and well-spoken, but locked in a room with no computer and no phone. The harness is the assistant standing outside the door. The employee calls out "can you pull up this customer's recent orders?", the assistant runs off, does it, and slides the answer back under the door. Back and forth, until the customer is helped.

A one-line definition to keep in your pocket:

An agent harness is the control layer around an LLM that manages the conversation, gives the model access to tools and data, runs those tool calls safely, feeds the results back, and repeats until the task is done.

You'll hear other names for it

People call this same idea an agent runtime, agent loop, orchestrator, tool-calling framework, or agent framework. They're not always identical, but they overlap heavily — all of them are "the code around the model."

The loop, step by step

Here's the same ShopBot question, but now with a harness in the middle:

User      → "Where's my order? I bought running shoes last week."

Harness   → Model:  here's the question, and here are the tools you may use:
                       look_up_orders(), get_tracking_status()

Model     → Harness: "I'd like to call look_up_orders(customer)."
                        (it doesn't run it — it just *asks*)

Harness   → runs the real query on your database
                       → order 4521, running shoes, status: shipped, tracking: 1Z999

Harness   → Model:  here's what look_up_orders returned

Model     → Harness: "Now call get_tracking_status(1Z999)."

Harness   → calls the carrier's API → "Out for delivery, arrives today"

Harness   → Model:  here's the tracking result

Model     → Harness: "I have everything I need. Final answer:
                        Your running shoes are out for delivery — they arrive today!"

Harness  → User:   shows that reply

The crucial detail:

The model never touches your database or the internet directly. It only ever asks. The harness decides whether to fulfil the request, does the real work, and keeps the loop running.

More than hands — it's also the supervisor

The "assistant outside the room" picture is a great start, but it undersells the job. Notice the word decides above. The harness doesn't blindly do whatever the model asks — it's also the manager and safety layer. It chooses which tools the model is even allowed to use, checks that a request is permitted before running it, and decides when the loop should stop.

So the harness both gives the model hands and supervises what those hands are allowed to touch. (We'll dig into everything a real harness is responsible for — permissions, errors, limits, logging — in a later lesson. For now, just hold onto: it connects, and it controls.)

Why do we need it?

Without a harness, the model can only:

use knowledge frozen at training time — it doesn't know today's orders, prices, or stock,
talk, never act — no refunds, no cancellations, no emails,
and worst of all, guess — when it doesn't know, it invents a confident-sounding answer, and a confident wrong answer about someone's order is worse than no answer at all.

With a harness, ShopBot gets:

live data — every reply is grounded in your real database,
real actions — it can cancel an order, apply a coupon, open a ticket,
control and safety — you decide which tools exist, what they're allowed to do, and you can require human approval before, say, issuing a big refund.

The harness is what turns "a chatbot that sounds smart" into "an agent that actually gets things done."

Chatbot vs. agent, in one line

A plain chatbot: one message in, one message out. It talks.
An agent: it can loop — think, act, see the result, think again — until the task is finished. It acts.

Not every bot needs the full loop, though. A simple support bot might just classify the question → call one fixed API → let the model write the reply. That's still a harness — but a straight-line one, not a fully autonomous agent. A true agent harness is built for repeated tool use, branching, and multi-step work.

The harness is exactly the machinery that makes that loop possible. No harness, no agent.

Recap

An LLM on its own can think and write, but can't do anything or see live data.
An agent harness is the surrounding code that gives the model hands and supervises them: it runs the tools the model asks for, controls what's allowed, and feeds the results back.
It works as a loop: the model requests an action → the harness performs it → returns the result → repeat → final answer.
We need it for accurate, live, action-capable, and controllable assistants — a real ShopBot instead of one that invents order numbers.

Next up: we'll open the hood and build the simplest possible harness loop, step by step.

First attempt: just the model​

What ShopBot actually needs to do​

So what is an agent harness?​

The loop, step by step​

More than hands — it's also the supervisor​

Why do we need it?​

Chatbot vs. agent, in one line​

Recap​