Build an Agent From Scratch

Combine the loop and tool use into a real agent — one that keeps calling tools until the task is done, with the error and rate-limit handling that production needs.

The full manual loop

Why: this is the whole agent: keep calling the model, run any tools it asks for, feed the results back, and stop when it answers. Where: every iteration, append both the model's tool_use turn and your tool_result turn; each tool_use block needs a matching tool_result in the next user message, or the API rejects the request. When: the loop ends the moment stop_reason is no longer "tool_use".

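import anthropic

# Assumed setup, in case the definitions from earlier lessons are not in scope.
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def text_of(response):
    # Join the text blocks of a response into a single string.
    return "".join(b.text for b in response.content if b.type == "text")


def execute_tool(name, tool_input):
    # Dispatch a tool call by name; get_weather comes from the tool-use lesson.
    if name == "get_weather":
        return get_weather(**tool_input)
    return f"Error: unknown tool {name}"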


def run_agent(user_input, tools):
    messages = [{"role": "user", "content": user_input}]

    while True:
        response = client.messages.create(
            model="claude-opus-4-8",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )

        if response.stop_reason != "tool_use":
            return text_of(response)          # done — return the answer

        messages.append({"role": "assistant", "content": response.content})

        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = execute_tool(block.name, block.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": output,
                })
        messages.append({"role": "user", "content": results})
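
To watch the message bookkeeping without spending API calls, the same loop can be driven by a scripted stand-in for the client. This is only a sketch under stated assumptions: `Block`, `Response`, `FakeClient`, and `run_agent_offline` are hypothetical test doubles, not part of the Anthropic SDK, and they mimic just the attributes the loop reads (`stop_reason`, `content`, `type`, `name`, `input`, `id`, `text`).

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    # Hypothetical stand-in for an SDK content block.
    type: str
    text: str = ""
    name: str = ""
    input: dict = field(default_factory=dict)
    id: str = ""


@dataclass
class Response:
    # Hypothetical stand-in for an SDK message response.
    stop_reason: str
    content: list


class FakeClient:
    """Replays a fixed script of responses, one per call."""

    def __init__(self, script):
        self._script = list(script)
        self.calls = []  # snapshot of `messages` at each call

    def create(self, **kwargs):
        self.calls.append(list(kwargs["messages"]))
        return self._script.pop(0)


def run_agent_offline(user_input, client, execute_tool):
    messages = [{"role": "user", "content": user_input}]
    while True:
        response = client.create(messages=messages)
        if response.stop_reason != "tool_use":
            text = "".join(b.text for b in response.content if b.type == "text")
            return text, messages
        # Record the model's tool_use turn, then answer it with tool_result blocks.
        messages.append({"role": "assistant", "content": response.content})
        results = [
            {"type": "tool_result", "tool_use_id": b.id,
             "content": execute_tool(b.name, b.input)}
            for b in response.content if b.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})


script = [
    Response("tool_use", [Block("tool_use", name="get_weather",
                                input={"city": "Paris"}, id="toolu_1")]),
    Response("end_turn", [Block("text", text="It is sunny in Paris.")]),
]
fake = FakeClient(script)
answer, history = run_agent_offline(
    "Weather in Paris?", fake, lambda name, args: f"sunny in {args['city']}"
)
print(answer)
print([m["role"] for m in history])
```

With this script the model is called twice, and the history ends up as user, assistant, user: the original request, the tool_use turn, and the tool_result turn.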

Run it on a multi-step task

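Why: with more than one tool, the model chains calls on its own, and this is where it starts to feel like an agent. When: give it a task that needs two tools and watch the loop run two tool rounds before the final model call returns the answer. Where: every tool in the list also needs a branch in execute_tool, or the model only gets an "unknown tool" error back.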

tools = [weather_tool, book_meeting_tool]

answer = run_agent(
    "If it's sunny in Paris tomorrow, book a 30-minute walk at 5pm.",
    tools,
)
print(answer)
# The agent calls get_weather, sees it's sunny, then calls book_meeting.
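
The `execute_tool` shown earlier only knows `get_weather`, so a second tool such as `book_meeting` needs a dispatch entry as well. One way to scale this is a name-to-function registry. The sketch below is illustrative only: the bodies and parameters of `get_weather` and `book_meeting`, and the `TOOL_REGISTRY` name, are assumptions, since the course's real implementations are not shown here.

```python
import json


def get_weather(city, date):
    # Hypothetical stub; a real version would call a weather API.
    return {"city": city, "date": date, "forecast": "sunny"}


def book_meeting(title, start, minutes):
    # Hypothetical stub; a real version would call a calendar API.
    return {"booked": True, "title": title, "start": start, "minutes": minutes}


# Map each tool name the model may request to the function that implements it.
TOOL_REGISTRY = {
    "get_weather": get_weather,
    "book_meeting": book_meeting,
}


def execute_tool(name, tool_input):
    func = TOOL_REGISTRY.get(name)
    if func is None:
        return f"Error: unknown tool {name}"
    # Serialize to a string so the result can be used as tool_result content.
    return json.dumps(func(**tool_input))


print(execute_tool("get_weather", {"city": "Paris", "date": "tomorrow"}))
print(execute_tool("send_email", {}))
```

Adding a tool then means writing the function and adding one registry entry; the loop itself does not change.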

Handle errors and rate limits

Why: real runs hit rate limits and transient failures, and an unguarded loop dies on the first one. When: the SDK already retries 429 and 5xx responses with backoff (twice by default); catch what is left and degrade gracefully. Where: cap the iterations so a confused agent cannot loop forever.

import anthropic

def run_agent(user_input, tools, max_steps=10):
    messages = [{"role": "user", "content": user_input}]

    for _ in range(max_steps):          # guard against infinite loops
        try:
            response = client.messages.create(
                model="claude-opus-4-8",
                max_tokens=1024,
                tools=tools,
                messages=messages,
            )
        except anthropic.RateLimitError:
            return "The service is busy — please try again shortly."
        except anthropic.APIError as e:
            return f"API error: {e}"

        if response.stop_reason != "tool_use":
            return text_of(response)
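        # Same bookkeeping as the manual loop above: record the tool_use turn,
        # run each requested tool, and send the results back.
        messages.append({"role": "assistant", "content": response.content})
        results = []
        for block in response.content:
            if block.type == "tool_use":
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": execute_tool(block.name, block.input),
                })
        messages.append({"role": "user", "content": results})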

    return "Stopped: too many steps."
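
API failures are one class of error; a tool that raises an exception is another, and it would crash the loop just the same. A common pattern is to catch the exception and report it back to the model as a `tool_result` with `is_error` set, so the model can retry or explain the failure. The sketch below assumes that approach; `safe_tool_result` and `flaky_tool` are hypothetical names, not part of the course code.

```python
def safe_tool_result(tool_use_id, name, tool_input, execute_tool):
    """Run one tool call and always return a tool_result block."""
    try:
        output = execute_tool(name, tool_input)
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": str(output),
        }
    except Exception as exc:  # broad on purpose: any tool failure is reported
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"{type(exc).__name__}: {exc}",
            "is_error": True,
        }


def flaky_tool(name, tool_input):
    # Hypothetical dispatcher that fails for one input.
    if tool_input.get("city") == "Atlantis":
        raise ValueError("unknown city")
    return "sunny"


ok = safe_tool_result("toolu_1", "get_weather", {"city": "Paris"}, flaky_tool)
bad = safe_tool_result("toolu_2", "get_weather", {"city": "Atlantis"}, flaky_tool)
print(ok)
print(bad)
```

Inside the agent loop, this helper would replace the direct `execute_tool` call when building `results`, so one failing tool costs a step instead of the whole run.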

When to reach for a framework

Why: you have now built by hand what LangChain, LangGraph, CrewAI, and similar frameworks automate: the loop, tool dispatch, and message bookkeeping. When: reach for a framework once you need its extras (graphs, retries, integrations), not before. You will understand what it is doing because you wrote the same loop yourself.

You built this yourself:
  • the perceive→reason→act→observe loop
  • tool dispatch and result plumbing
  • stop_reason handling and step limits

Frameworks (LangChain, LangGraph, CrewAI, LlamaIndex, ...) wrap
this same loop and add graphs, memory, and integrations. Use one
when its extras pay for the indirection — not by default.
