
So far every tool ShopBot can call is harmless — looking up an order or checking tracking just reads data. But a real agent does things that change the world: issue refunds, cancel orders, send emails. Those you don't want a model deciding on its own. This lesson adds the harness's final safety job: controlling which tools can run, and requiring a human's OK for the risky ones.

This is the controls half of "connects and controls" from What is an agent harness? — the supervisor, not just the hands.

Two layers of permission

Think of it as two gates a tool call passes through:

Is this tool allowed at all? The model can only ask for tools the harness chose to expose. Anything else is refused outright.
Is this specific call allowed right now? Even for an exposed tool, a sensitive action can require a human to approve it before it runs.

We already have the first gate for free, and we'll add the second.

The model can only request tools that are in our schema list, which is the menu from "What is a tool?" If the model ever names something that isn't in TOOL_FUNCTIONS, the harness already refuses it (the unknown-tool check from "Handling tool errors").

So which tools exist at all is your decision, not the model's. Don't want ShopBot able to delete customers? Don't give it a delete_customer tool. The safest tool is the one you never expose.

Exposing a tool is a permission

There's no separate "permissions config" needed for gate 1 — the act of putting a tool in the list is granting permission. Leaving it out is denying it.
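As a minimal sketch of gate 1, the registry below exposes two read-only tools and nothing else. The tool names look_up_order and check_tracking, their placeholder bodies, the run_exposed_tool helper, and the shape of the "unknown tool" result are all assumptions for illustration; the real harness from the earlier lessons may differ.

```python
# Hypothetical stand-ins for ShopBot's read-only tools (placeholder bodies).
def look_up_order(order_number: str) -> dict:
    return {"order_number": order_number, "status": "shipped"}


def check_tracking(order_number: str) -> dict:
    return {"order_number": order_number, "location": "in transit"}


# Gate 1: the only tools that exist are the ones listed here.
# There is deliberately no delete_customer entry.
TOOL_FUNCTIONS = {
    "look_up_order": look_up_order,
    "check_tracking": check_tracking,
}


def is_exposed(name: str) -> bool:
    """Return True only for tools the harness chose to expose."""
    return name in TOOL_FUNCTIONS


def run_exposed_tool(name: str, arguments: dict) -> dict:
    """Refuse anything off the menu; otherwise call the tool."""
    if not is_exposed(name):
        # Assumed result shape; the lesson's own unknown-tool check may differ.
        return {"error": f"Unknown tool: {name}"}
    return TOOL_FUNCTIONS[name](**arguments)
```

Asking for delete_customer here returns the "unknown tool" result without running anything, because the permission was never granted in the first place.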

A tool that actually changes something

To make gate 2 worth having, ShopBot needs a tool with consequences. Here's a new one — issuing a refund:

def issue_refund(order_number: str, amount: float) -> dict:
    """Pretend to refund money to the customer for an order."""
    return {
        "order_number": order_number,
        "refunded": amount,
        "status": "refunded",
    }

Its schema flags the risk too, right in the description the model reads:

{
    "name": "issue_refund",
    "description": (
        "Refund a given amount to the customer for an order. "
        "This moves money, so it is a sensitive action."
    ),
    "input_schema": { ...order_number and amount... },
}

Reading an order is fine to do automatically. Moving money is not. That's the line gate 2 protects.
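The schema above elides its input_schema. One plausible way to spell it out is shown below as an illustrative guess, not the lesson's actual definition: the field descriptions and the "required" list are assumptions, and missing_fields is a hypothetical helper that only shows how the schema could be used.

```python
# Assumed shape: the lesson elides input_schema, so these details are illustrative.
ISSUE_REFUND_SCHEMA = {
    "name": "issue_refund",
    "description": (
        "Refund a given amount to the customer for an order. "
        "This moves money, so it is a sensitive action."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "order_number": {
                "type": "string",
                "description": "The order to refund (illustrative wording).",
            },
            "amount": {
                "type": "number",
                "description": "Amount to refund (illustrative wording).",
            },
        },
        "required": ["order_number", "amount"],
    },
}


def missing_fields(schema: dict, arguments: dict) -> list[str]:
    """List required fields that the arguments do not supply."""
    required = schema["input_schema"]["required"]
    return [field for field in required if field not in arguments]
```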

Gate 2: a human in the loop

We mark which tools are sensitive, and add a tiny function that asks a person:

# Sensitive tools that move money or take irreversible action.
REQUIRES_APPROVAL = {"issue_refund"}


def human_approves(name: str, arguments: dict) -> bool:
    """Ask a human to approve a sensitive tool call. The model never decides this."""
    answer = input(f"   [approval needed] Allow {name}({arguments})? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}

Then one check in run_tool, right after validation and just before the tool runs:

validate_arguments(name, arguments)  # stop bad calls before they run

# Sensitive actions need a human's OK first. A refusal is just a result the
# model reads — it isn't an error, and it doesn't stop the loop.
if name in REQUIRES_APPROVAL and not human_approves(name, arguments):
    print("   [harness] action declined by human")
    return {"status": "declined", "reason": "A human declined this action."}

function = TOOL_FUNCTIONS[name]
return function(**arguments)

The key design choice: a declined action is not an error — it's information. We hand the model a normal result saying "a human declined this," and the loop keeps going. The model reads it and responds gracefully, just like it does with a tool error.
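To see that both outcomes are ordinary results, here is a stripped-down sketch of the check. It leaves out the validation and error handling from earlier lessons, and it passes the approver in as a parameter, which is an assumption made so the example can run without a terminal; the lesson's run_tool calls human_approves directly.

```python
from typing import Callable

REQUIRES_APPROVAL = {"issue_refund"}


def issue_refund(order_number: str, amount: float) -> dict:
    """Pretend to refund money to the customer for an order."""
    return {"order_number": order_number, "refunded": amount, "status": "refunded"}


TOOL_FUNCTIONS = {"issue_refund": issue_refund}


def run_tool(name: str, arguments: dict, approver: Callable[[str, dict], bool]) -> dict:
    """Simplified run_tool: only the approval gate and the tool call remain."""
    if name in REQUIRES_APPROVAL and not approver(name, arguments):
        # A refusal is returned, not raised, so the agent loop keeps going.
        return {"status": "declined", "reason": "A human declined this action."}
    return TOOL_FUNCTIONS[name](**arguments)


call = {"order_number": "4521", "amount": 59.99}
approved = run_tool("issue_refund", call, approver=lambda name, args: True)
declined = run_tool("issue_refund", call, approver=lambda name, args: False)
```

Both approved and declined are plain dictionaries. Nothing is raised in the declined case, so the harness can hand either one back to the model the same way.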

The model is never the authority on risky actions

Notice human_approves is called by the harness, not the model. The model can ask to issue a refund, but it can't grant itself permission. That gap — model proposes, human disposes — is the whole point of human-in-the-loop.

What it looks like

Ask ShopBot for a refund and the run pauses for you:

--- Round 2: asking the model ---
   [approval needed] Allow issue_refund({'order_number': '4521', 'amount': 59.99})? [y/N]

Type y → the refund runs, and the model confirms it to the customer.
Type n (or just press Enter, since No is the default) → the harness returns "a human declined this action," and the model says something like "I've flagged your refund for review, and a team member will confirm it shortly."

Either way, no money moves without a person saying yes.
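The [y/N] prompt denies by default: only an explicit y or yes counts as approval. The sketch below replays a few scripted answers through the same check. The ask parameter and the scripted helper are additions for illustration so the example runs without a keyboard; they are not part of the lesson's code.

```python
from typing import Callable


def human_approves(name: str, arguments: dict, ask: Callable[[str], str] = input) -> bool:
    """Same check as the lesson's version; `ask` is an added hook for scripting answers."""
    answer = ask(f"   [approval needed] Allow {name}({arguments})? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}


def scripted(answer: str) -> Callable[[str], str]:
    """Build a fake `ask` that ignores the prompt and returns a fixed answer."""
    return lambda prompt: answer


call = {"order_number": "4521", "amount": 59.99}

# Map each typed answer to the decision the harness would make.
results = {
    answer: human_approves("issue_refund", call, ask=scripted(answer))
    for answer in ["y", " YES ", "n", "", "sure"]
}
```

Anything ambiguous, including an empty line or a word like "sure", is treated as a refusal, which is the safer failure mode for an action that moves money.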

Try it

The refund tool is wired up in agent-harness. Change the question in shopbot.py to something like "I'd like a refund for my running shoes," then execute uv run shopbot.py, and you'll hit the approval prompt.

Beyond a yes/no prompt

Our input() prompt is the simplest possible human-in-the-loop. Real systems grow it in obvious directions:

Thresholds — auto-approve refunds under $20, require approval above.
Roles — only a supervisor can approve over $500.
Async approval — send the request to a dashboard or Slack and pause until someone clicks approve, instead of blocking on a terminal.
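The async approval idea can be sketched with asyncio: the harness parks the request on a future and resumes when someone resolves it. Everything here is hypothetical, including the pending dictionary, the request ID, and the resolve function that stands in for a dashboard or Slack button; a real integration would need its own transport and persistence.

```python
import asyncio


async def request_approval(pending: dict, request_id: str) -> bool:
    """Register a pending approval and wait until someone resolves it."""
    future = asyncio.get_running_loop().create_future()
    pending[request_id] = future
    return await future


def resolve(pending: dict, request_id: str, approved: bool) -> None:
    """Stand-in for the click handler on a dashboard or chat button."""
    pending.pop(request_id).set_result(approved)


async def demo() -> bool:
    pending: dict = {}
    task = asyncio.create_task(request_approval(pending, "req-1"))
    await asyncio.sleep(0)  # yield once so the request registers itself
    resolve(pending, "req-1", approved=True)  # a human clicks approve
    return await task


result = asyncio.run(demo())
```

The harness code that awaits request_approval does not block a terminal; it simply does not continue with the sensitive call until the future is resolved.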

The mechanism is the same every time: the harness pauses a sensitive call and asks an authority before proceeding.
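The threshold and role rules from the list above could be expressed as a small policy function. This is a sketch under stated assumptions: the role names support_rep and supervisor are invented, and the boundary cases (exactly $20 needs approval, exactly $500 does not need a supervisor) are one reading of "under $20" and "over $500".

```python
from typing import Optional

AUTO_APPROVE_BELOW = 20.00  # refunds under this amount need no human
SUPERVISOR_ABOVE = 500.00   # refunds over this amount need a supervisor


def required_approver(amount: float) -> Optional[str]:
    """Return the minimum role that must approve, or None to auto-approve."""
    if amount < AUTO_APPROVE_BELOW:
        return None
    if amount > SUPERVISOR_ABOVE:
        return "supervisor"
    return "support_rep"  # hypothetical role name for an ordinary approver


def may_approve(role: str, amount: float) -> bool:
    """Check whether a person with this role is allowed to approve the refund."""
    needed = required_approver(amount)
    if needed is None:
        return True
    if needed == "supervisor":
        return role == "supervisor"
    return role in {"support_rep", "supervisor"}
```

The harness would consult required_approver before prompting anyone, then use may_approve to check that the person answering has enough authority.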

Recap

Permissions come in two gates: which tools exist (you choose the menu) and which calls may run (approval for sensitive ones).
Exposing a tool is granting permission — the safest tool is one you never add.
Sensitive tools (like issue_refund) are gated by human approval, checked by the harness, not the model.
A declined action is information, not an error: it flows back as a result, the loop continues, and the model responds gracefully.

Next up: logging and observability — seeing what the harness and model did, so you can debug, audit, and trust it in production.

Permissions and human approval