
Real-World Tools & Safe Execution

Move past toy tools — wire up web search, code execution, and API calls, and run the dangerous ones safely with sandboxing and permissioning so the agent can act without doing harm.


Web search — facts past the training cutoff

Why: a server-side tool like web search runs on the provider's side, so you declare it and the model uses it without you writing any execution code. When: use it for current events and anything newer than the model's training cutoff. Where: results come back in the same response, already cited.

response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=2048,
    tools=[{"type": "web_search_20260209", "name": "web_search"}],
    messages=[{"role": "user", "content": "What shipped in the latest Python release?"}],
)
print(text_of(response))   # the model searched and answered, no loop code needed
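The snippet above calls text_of without defining it in this lesson; it presumably comes from an earlier part of the course. As a minimal sketch, assuming the response exposes a content list whose text blocks carry type == "text" and a text attribute, it could look like the following. The fake_response object is a stand-in built only for illustration, not real API output.

```python
from types import SimpleNamespace

def text_of(response):
    """Join the text of every text block, skipping tool-use and result blocks."""
    return "".join(
        block.text
        for block in response.content
        if getattr(block, "type", None) == "text"
    )

# Stand-in response (assumption): a real one comes from client.messages.create.
fake_response = SimpleNamespace(content=[
    SimpleNamespace(type="text", text="Searching now. "),
    SimpleNamespace(type="server_tool_use", name="web_search"),
    SimpleNamespace(type="text", text="Placeholder answer."),
])

print(text_of(fake_response))  # → Searching now. Placeholder answer.
```

Skipping non-text blocks matters with server-side tools, because the search call and its results arrive as extra blocks in the same content list.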

Code execution — let the agent compute

Why: some answers need real computation (math, data crunching, file parsing) that a language model should not do in its head. When: use it whenever the answer has to be exact; given a sandboxed code tool, the model writes and runs code instead of estimating. Where: this runs in the provider's sandbox, so there is no execution environment for you to manage.

response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=2048,
    tools=[{"type": "code_execution_20260120", "name": "code_execution"}],
    messages=[{"role": "user",
               "content": "What's the standard deviation of [4, 8, 15, 16, 23, 42]?"}],
)
print(text_of(response))   # the model ran Python and reported the number
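To check the reported number yourself, the same computation takes a few lines with Python's standard library. This is only a local cross-check, not what the sandbox necessarily runs. The prompt does not say whether it wants the sample or the population standard deviation, so the sketch prints both.

```python
import statistics

data = [4, 8, 15, 16, 23, 42]

sample_sd = statistics.stdev(data)       # sample form: divides by n - 1
population_sd = statistics.pstdev(data)  # population form: divides by n

print(f"sample: {sample_sd:.4f}  population: {population_sd:.4f}")
# → sample: 13.4907  population: 12.3153
```

If the model's answer matches neither figure, that is a cue to ask it to show the code it ran.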

Custom tools — call your own APIs

Why: most useful tools are your own — an internal API, a database, a CRM. When: wrap the call in a function and describe it as a tool, exactly like the weather example. Where: validate the arguments and handle failures, because the inputs come from the model.

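import requests

def create_ticket(subject, priority="normal"):
    # The arguments come from the model, so validate them before calling the API.
    if priority not in {"low", "normal", "high"}:
        return "Error: priority must be low, normal, or high."
    try:
        resp = requests.post("https://api.example.com/tickets",
                             json={"subject": subject, "priority": priority},
                             timeout=10)
    except requests.RequestException as exc:
        # Timeouts and connection errors become text the model can react to.
        return f"Error creating ticket: {exc}"
    if resp.status_code != 201:
        return f"Error creating ticket: {resp.status_code}"
    return f"Created ticket #{resp.json()['id']}."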
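The function is only half of a custom tool; the model also needs a description of it. Here is a sketch of what that declaration might look like, assuming the usual name / description / input_schema layout with a JSON Schema body. The description wording and the missing_required helper are illustrative assumptions, not part of the lesson.

```python
# Hypothetical tool declaration for create_ticket (wording is illustrative).
create_ticket_tool = {
    "name": "create_ticket",
    "description": "Create a support ticket. Use when the user reports a problem "
                   "that needs follow-up.",
    "input_schema": {
        "type": "object",
        "properties": {
            "subject": {"type": "string", "description": "One-line summary."},
            "priority": {
                "type": "string",
                "enum": ["low", "normal", "high"],
                "description": "Defaults to normal when omitted.",
            },
        },
        "required": ["subject"],
    },
}

def missing_required(tool, tool_input):
    """Return the required argument names the model left out (illustrative helper)."""
    return [key for key in tool["input_schema"]["required"] if key not in tool_input]

print(missing_required(create_ticket_tool, {"priority": "high"}))  # → ['subject']
```

Declaring the enum in the schema steers the model toward valid values, but the check inside the function should stay, because the schema alone does not guarantee what arrives.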

Sandbox and permission dangerous tools

Why: a tool that sends email, spends money, or deletes data is hard to undo but easy to gate, so promote those actions to dedicated tools and require approval before they run. When: gate any hard-to-reverse action behind a human confirmation; keep read-only tools automatic. Where: run code and shell tools in an isolated sandbox, never against your real machine.

DESTRUCTIVE = {"send_email", "delete_record", "make_payment"}

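def execute_tool(name, tool_input):
    # TOOLS maps tool names to Python functions (defined elsewhere).
    if name not in TOOLS:
        return f"Error: unknown tool {name!r}."   # a misspelled name should not crash the loop
    if name in DESTRUCTIVE:
        print(f"Agent wants to call {name}({tool_input})")
        if input("Approve? [y/N] ").strip().lower() != "y":
            return "Denied by user."            # the model adapts and moves on
    return TOOLS[name](**tool_input)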

# Read-only tools (search, lookups) run automatically; only
# irreversible actions stop for a human.
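The same gate can take its approval step as a parameter, which lets you test it without a keyboard or swap the prompt for a chat button or ticket queue later. This is a sketch under assumptions: make_executor, the approve callback, and the two sample tools are hypothetical names introduced for illustration.

```python
DESTRUCTIVE = {"send_email", "delete_record", "make_payment"}

def make_executor(tools, approve):
    """Build an execute_tool that asks approve(name, tool_input) before destructive calls."""
    def execute_tool(name, tool_input):
        if name not in tools:
            return f"Error: unknown tool {name!r}."
        if name in DESTRUCTIVE and not approve(name, tool_input):
            return "Denied by user."
        try:
            return tools[name](**tool_input)
        except Exception as exc:  # report failures to the model instead of crashing
            return f"Error running {name}: {exc}"
    return execute_tool

# Hypothetical tools, for illustration only.
tools = {
    "lookup_order": lambda order_id: f"Order {order_id}: shipped",
    "delete_record": lambda record_id: f"Deleted record {record_id}.",
}

run = make_executor(tools, approve=lambda name, tool_input: False)  # deny everything
print(run("lookup_order", {"order_id": 7}))    # → Order 7: shipped
print(run("delete_record", {"record_id": 7}))  # → Denied by user.
```

The read-only lookup runs even though the approver denies everything, because only names in DESTRUCTIVE reach the approval step.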
