A custom Claude agent with tool use lets you wire real capabilities — bash execution, file I/O, web search — directly into an autonomous loop that calls tools, inspects results, and decides what to do next.
This isn't just prompting. You're building a proper agentic system where the model drives execution.
You'll learn:
- How to define tools with the Anthropic Messages API `tools` parameter
- How to implement the agentic loop that handles `tool_use` and `tool_result` turns
- How to give your agent bash, file read/write, and search capabilities safely
- How to stop runaway loops with turn limits and error handling
Time: 25 min | Difficulty: Intermediate
Why Tool Use Changes What Claude Code Can Do
Out of the box, Claude Code reasons and writes. Add tool use and it acts.
The pattern is simple: you describe tools in JSON schema, Claude returns a tool_use block when it wants to call one, you execute it, feed back a tool_result, and repeat. The model keeps control of what to call and when to stop.
This is exactly how Claude Code's internal agent loop works — the same mechanic you're about to build yourself.
Without tool use:
User prompt → Claude response → done
With tool use (agentic loop):
User prompt → Claude thinks → tool_use → you execute → tool_result → Claude thinks → ... → final text response
The agentic loop: Claude requests a tool, your executor runs it, the result feeds back — until Claude stops.
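Concretely, each leg of that loop is just structured JSON in the message history. Here is a sketch of one round trip written as plain Python dicts — the `toolu_01A` id is made up for illustration; real ids come back from the API:

```python
# Assistant turn: Claude asks for a tool call via a tool_use content block.
assistant_turn = {
    "role": "assistant",
    "content": [
        {"type": "text", "text": "I'll list the directory first."},
        {"type": "tool_use", "id": "toolu_01A", "name": "bash",
         "input": {"command": "ls -la"}},
    ],
}

# User turn: your executor answers with a tool_result keyed by the SAME id.
user_turn = {
    "role": "user",
    "content": [
        {"type": "tool_result", "tool_use_id": "toolu_01A",
         "content": "total 8\ndrwxr-xr-x  ..."},
    ],
}
```

The `tool_use_id` linkage is what lets Claude match each result to the call it made, including when several tools run in one turn.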
Prerequisites
- Python 3.12+
- `anthropic` SDK 0.25+
- Anthropic API key (starts at $3/million input tokens for claude-sonnet-4-20250514; billed in USD)
pip install anthropic --upgrade
export ANTHROPIC_API_KEY="sk-ant-..."
Verify the SDK version:
python -c "import anthropic; print(anthropic.__version__)"
# 0.25.0 or higher
Step 1: Define Your Tools
Each tool is a JSON schema object. The input_schema tells Claude exactly what parameters to pass.
# tools.py
TOOLS = [
    {
        "name": "bash",
        "description": (
            "Run a shell command and return stdout + stderr. "
            "Use for file listing, git operations, running tests, or any CLI task. "
            "Do NOT use for file reads — use read_file instead."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "command": {
                    "type": "string",
                    "description": "The shell command to execute.",
                }
            },
            "required": ["command"],
        },
    },
    {
        "name": "read_file",
        "description": "Read the full contents of a file by path. Returns the text as a string.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {
                    "type": "string",
                    "description": "Absolute or relative path to the file.",
                }
            },
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Write or overwrite a file with the given content.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {
                    "type": "string",
                    "description": "Path where the file will be written.",
                },
                "content": {
                    "type": "string",
                    "description": "Full file content to write.",
                },
            },
            "required": ["path", "content"],
        },
    },
]
Why separate read_file from bash? Bash is stateful and can fail silently. A dedicated read tool returns clean errors and never accidentally executes anything.
Step 2: Implement the Tool Executor
This is the function that maps tool names to real Python code. Claude never runs code itself — you run it and pass back the result.
# executor.py
import subprocess
from pathlib import Path
def execute_tool(name: str, tool_input: dict) -> str:
    """Run the tool Claude requested and return a string result."""
    if name == "bash":
        return _run_bash(tool_input["command"])
    if name == "read_file":
        return _read_file(tool_input["path"])
    if name == "write_file":
        return _write_file(tool_input["path"], tool_input["content"])
    return f"ERROR: Unknown tool '{name}'"

def _run_bash(command: str) -> str:
    # timeout=30 prevents runaway commands from locking up your agent.
    # Note: subprocess.run raises TimeoutExpired when the limit is hit,
    # so catch it and return a string — raising would break the agent loop.
    try:
        result = subprocess.run(
            command,
            shell=True,
            capture_output=True,
            text=True,
            timeout=30,
        )
    except subprocess.TimeoutExpired:
        return "ERROR: command timed out after 30 seconds"
    output = result.stdout
    if result.stderr:
        output += f"\n[stderr]\n{result.stderr}"
    if result.returncode != 0:
        output += f"\n[exit code: {result.returncode}]"
    return output or "(no output)"

def _read_file(path: str) -> str:
    try:
        return Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        return f"ERROR: File not found: {path}"
    except Exception as e:
        return f"ERROR: {e}"

def _write_file(path: str, content: str) -> str:
    try:
        p = Path(path)
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(content, encoding="utf-8")
        return f"OK: wrote {len(content)} chars to {path}"
    except Exception as e:
        return f"ERROR: {e}"
Step 3: Build the Agentic Loop
This is the core. The loop keeps calling Claude until it returns a plain text response with no tool_use blocks — meaning it's done.
# agent.py
import anthropic
from tools import TOOLS
from executor import execute_tool
client = anthropic.Anthropic()
MAX_TURNS = 20 # Hard stop — prevents runaway billing on stuck loops
def run_agent(user_prompt: str, system: str = "") -> str:
    """
    Run the full agentic loop for a single task.
    Returns Claude's final text response.
    """
    messages = [{"role": "user", "content": user_prompt}]

    for turn in range(MAX_TURNS):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            system=system,
            tools=TOOLS,
            messages=messages,
        )
        print(f"[turn {turn + 1}] stop_reason={response.stop_reason}")

        # Append Claude's full response to history
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason == "end_turn":
            # No tool calls — Claude is done. Extract and return the text.
            return _extract_text(response.content)

        if response.stop_reason == "tool_use":
            # Process every tool call in this response
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    print(f"  → calling tool: {block.name}({block.input})")
                    result = execute_tool(block.name, block.input)
                    print(f"  ← result: {result[:120]}...")
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })
            # Feed all results back in a single user turn
            messages.append({"role": "user", "content": tool_results})
        else:
            # max_tokens or other stop — return whatever text we have
            return _extract_text(response.content)

    return "ERROR: MAX_TURNS reached without a final response."

def _extract_text(content: list) -> str:
    return "\n".join(
        block.text for block in content if hasattr(block, "text")
    )
Why MAX_TURNS = 20? A misbehaving tool (empty output, wrong format) can send Claude into a loop. At $3/million input tokens this adds up fast. Twenty turns covers even deep multi-file refactors.
Step 4: Run Your Agent
Wire it up with a real task:
# main.py
from agent import run_agent
SYSTEM = """
You are a precise coding agent. When asked to make changes:
1. Read the relevant files first.
2. Write the updated files.
3. Run tests to verify.
4. Report what you changed and the test result.
Never skip reading before writing.
"""
task = """
Read src/utils.py. Add a function `clamp(value, min_val, max_val)`
that clamps a number between min and max. Write the updated file.
Then run: python -m pytest tests/test_utils.py -v
"""
result = run_agent(task, system=SYSTEM)
print("\n=== Agent Result ===")
print(result)
Expected output:
[turn 1] stop_reason=tool_use
→ calling tool: read_file({'path': 'src/utils.py'})
← result: # utils.py...
[turn 2] stop_reason=tool_use
→ calling tool: write_file({'path': 'src/utils.py', ...})
← result: OK: wrote 312 chars to src/utils.py...
[turn 3] stop_reason=tool_use
→ calling tool: bash({'command': 'python -m pytest tests/test_utils.py -v'})
← result: PASSED tests/test_utils.py::test_clamp ...
[turn 4] stop_reason=end_turn
=== Agent Result ===
Added `clamp(value, min_val, max_val)` to src/utils.py. All 3 pytest tests pass.
Step 5: Add a Sandbox Guard (Production)
Before shipping, add an allowlist so bash can't run destructive commands.
import re
BLOCKED_PATTERNS = [
    r"\brm\s+-rf\b",       # Recursive delete
    r"\bsudo\b",           # Privilege escalation
    r"\bcurl\b.*\|\s*sh",  # Pipe-to-shell downloads
    r"\bdd\b.*of=/dev/",   # Disk overwrite
]

def _run_bash(command: str) -> str:
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, command):
            return f"BLOCKED: command matches unsafe pattern '{pattern}'"
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=30
    )
    # ... rest of implementation
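A quick check that the patterns catch what they should — the list is re-declared inline here so the snippet runs on its own. Keep in mind a regex blocklist is defense in depth, not a sandbox: variants like `rm -r -f` or encoded payloads slip through, which is why the Docker isolation below still matters.

```python
import re

BLOCKED_PATTERNS = [
    r"\brm\s+-rf\b",       # Recursive delete
    r"\bsudo\b",           # Privilege escalation
    r"\bcurl\b.*\|\s*sh",  # Pipe-to-shell downloads
    r"\bdd\b.*of=/dev/",   # Disk overwrite
]

def is_blocked(command: str) -> bool:
    # True if any unsafe pattern matches anywhere in the command
    return any(re.search(p, command) for p in BLOCKED_PATTERNS)

print(is_blocked("sudo apt-get install jq"))   # True
print(is_blocked("rm -rf /tmp/scratch"))       # True
print(is_blocked("git status"))                # False
```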
For production deployments on AWS (us-east-1 is recommended for lowest Anthropic API latency from the US), run the agent inside a Docker container with --read-only and a mounted /workspace volume. This gives you filesystem isolation without a full VM.
Verification
After completing the steps above, run this smoke test:
# smoke_test.py
from agent import run_agent
result = run_agent("Create a file /tmp/hello.txt with content 'agent works'. Then read it back and confirm.")
assert "agent works" in result, f"Unexpected result: {result}"
print("✅ Smoke test passed:", result)
You should see: ✅ Smoke test passed: The file /tmp/hello.txt contains 'agent works'.
Tool Use vs. Raw Prompting vs. OpenAI Function Calling
| | Claude Tool Use | Raw Prompting | OpenAI Function Calling |
|---|---|---|---|
| Structured tool calls | ✅ Typed JSON schema | ❌ Fragile parsing | ✅ Typed JSON schema |
| Multi-tool per turn | ✅ Parallel blocks | ❌ | ✅ Parallel |
| Token cost (input) | $3/M (Sonnet) | Same | $2.50/M (GPT-4o) |
| Streaming tool calls | ✅ | N/A | ✅ |
| Computer use built-in | ✅ (computer tool) | ❌ | ❌ |
Claude's tool_use blocks are typed Python objects on the SDK side (block.type == "tool_use", block.id, block.input). You don't parse strings — the SDK does the deserialization.
What You Learned
- The Anthropic Messages API `tools` parameter accepts JSON schema tool definitions — Claude decides when and how to call them.
- The agentic loop alternates between `tool_use` (you execute) and `end_turn` (Claude is done); always append both Claude's response and your `tool_result` to message history.
- `MAX_TURNS` is not optional in production — it prevents infinite loops from unexpected tool output.
- Sandbox your `bash` tool with a blocklist before deploying to any shared or cloud environment.
Tested on anthropic SDK 0.25, Python 3.12, macOS Sequoia 15.3 & Ubuntu 24.04
FAQ
Q: Can I run multiple tools in parallel in a single turn?
A: Yes. Claude may return multiple tool_use blocks in one response. The loop above already handles this — it iterates all blocks and collects all tool_result entries before sending a single user turn back.
Q: What happens if a tool returns an error string?
A: Pass the error string as the tool_result content. Claude reads it and typically retries with a corrected call or explains why it can't proceed. Never raise a Python exception — that breaks the loop.
Q: Does this work with claude-haiku-4-5 for lower cost?
A: Yes. Swap claude-sonnet-4-20250514 for claude-haiku-4-5-20251001. Haiku costs $0.80/million input tokens but handles fewer parallel tool calls reliably. Use it for simpler single-tool tasks.
Q: What is the maximum number of tools I can define?
A: The API supports up to 64 tool definitions per request. In practice, keep it under 10 — large tool lists inflate the system prompt and push you toward the context window limit faster.
Q: Can I give the agent memory across tasks?
A: Not natively. Serialize messages to JSON after each run and reload them as conversation history on the next call. For persistent memory at scale, write key facts to a file and have the agent read it at the start of every session via read_file.
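A minimal sketch of that serialize/reload step. It assumes the SDK's content block objects expose pydantic's `model_dump()` (the `anthropic` SDK is pydantic-based); plain strings and the dict-shaped `tool_result` turns pass through unchanged:

```python
import json
from pathlib import Path

def _to_plain(content):
    # Strings pass through; SDK block objects become JSON-safe dicts.
    if isinstance(content, str):
        return content
    return [b if isinstance(b, dict) else b.model_dump() for b in content]

def save_history(messages: list, path: str = "history.json") -> None:
    plain = [{"role": m["role"], "content": _to_plain(m["content"])}
             for m in messages]
    Path(path).write_text(json.dumps(plain, indent=2), encoding="utf-8")

def load_history(path: str = "history.json") -> list:
    return json.loads(Path(path).read_text(encoding="utf-8"))

# Usage sketch: messages = load_history(); ...run the loop...; save_history(messages)
```

Note that reloaded history is plain dicts rather than SDK objects — the Messages API accepts both, but any code that touches `block.type` attribute-style (like `_extract_text` above) would need a dict-aware variant.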