Move past toy tools — wire up web search, code execution, and API calls, and run the dangerous ones safely with sandboxing and permissioning so the agent can act without doing harm.
Why: a server-side tool like web search runs on the provider's side — you declare it and the model uses it without you writing any execution code. When: use it for current events and anything newer than the model. Where: results come back in the same response, already cited.
response = client.messages.create(
model="claude-opus-4-8",
max_tokens=2048,
tools=[{"type": "web_search_20260209", "name": "web_search"}],
messages=[{"role": "user", "content": "What shipped in the latest Python release?"}],
)
print(text_of(response)) # the model searched and answered, no loop code neededWhy: some answers need real computation (math, data crunching, file parsing) that a language model should not do in its head. When: give it a sandboxed code tool and it writes and runs code to get an exact result. Where: this runs in the provider's sandbox — no execution environment for you to manage.
response = client.messages.create(
model="claude-opus-4-8",
max_tokens=2048,
tools=[{"type": "code_execution_20260120", "name": "code_execution"}],
messages=[{"role": "user",
"content": "What's the standard deviation of [4, 8, 15, 16, 23, 42]?"}],
)
print(text_of(response)) # the model ran Python and reported the numberWhy: most useful tools are your own — an internal API, a database, a CRM. When: wrap the call in a function and describe it as a tool, exactly like the weather example. Where: validate the arguments and handle failures, because the inputs come from the model.
def create_ticket(subject, priority="normal"):
if priority not in {"low", "normal", "high"}:
return "Error: priority must be low, normal, or high."
resp = requests.post("https://api.example.com/tickets",
json={"subject": subject, "priority": priority},
timeout=10)
if resp.status_code != 201:
return f"Error creating ticket: {resp.status_code}"
return f"Created ticket #{resp.json()['id']}."Why: a tool that sends email, spends money, or deletes data is easy to gate and hard to undo — so promote those to dedicated tools and require approval. When: gate any hard-to-reverse action behind a human confirmation; keep read-only tools automatic. Where: run code and shell tools in an isolated sandbox, never against your real machine.
DESTRUCTIVE = {"send_email", "delete_record", "make_payment"}
def execute_tool(name, tool_input):
if name in DESTRUCTIVE:
print(f"Agent wants to call {name}({tool_input})")
if input("Approve? [y/N] ").lower() != "y":
return "Denied by user." # the model adapts and moves on
return TOOLS[name](**tool_input)
# Read-only tools (search, lookups) run automatically; only
# irreversible actions stop for a human.