how does sandboxing work in coding agents
tags: learning agentic-coding AI software
content
I heard agents are just conversation loops with tool calls. But how? I understood the concept from reading, but could only visualize it after following along with How to Build an Agent from Thorsten and the Amp team, and implementing it myself to see the conversation loop in a tangible, bare-bones program.
I’ve been hearing a lot about sandboxing. I know containers are probably used, but how?
warm up
- starting off with a basic shell wrapper that simply runs exec.CommandContext. with this wrapper for command execution, we can already implement the most basic allow/deny list.
allowed := map[string]bool{
"echo": true,
"ls": true,
"pwd": true,
"cat": true,
}
cmdName := parts[0]
if !allowed[cmdName] {
res.Err = fmt.Errorf("command not allowed: %s", cmdName)
return res
}
ctx, cancel := context.WithTimeout(context.Background(), timeout)
defer cancel()
cmd := exec.CommandContext(ctx, cmdName, parts[1:]...)
cmd.Dir = workDir
stdout, err := cmd.StdoutPipe()
- result:
Mini Agent Sandbox (Go)
Type commands. Type 'exit' to quit.
Allowed commands: echo, ls, pwd, cat
Sandbox directory: /var/folders/hk/7j5l95613kbgg217cm3l9k5w0000gr/T/agent-sandbox-707021901
agent> ls
----- result -----
command: ls
stdout:
note.txt
status: OK
agent> sudo echo
----- result -----
command: sudo echo
error: command not allowed: sudo
- this has absolutely nothing to do with sandboxing yet, though. a few sandboxing approaches we can try:
- Containers (Docker/rootless/podman)
- Namespaces, cgroups, seccomp, without containers
- microVMs (e.g., Firecracker) for stronger isolation
let’s go with the simplest: a container
- trying a real sandbox, with the simplest docker container
- we upgrade from exec.CommandContext and wrap everything in a docker run --rm call, with docker flags to limit what is visible inside the sandbox:
cmdName := parts[0]
if !allowed[cmdName] {
res.Err = fmt.Errorf("command not allowed: %s", cmdName)
return res
}
ctx, cancel := context.WithTimeout(context.Background(), timeout)
defer cancel()
args := []string{
"run", "--rm",
"--network", "none",
"--memory", "128m",
"--cpus", "0.5",
"--pids-limit", "64",
"--read-only",
"--cap-drop", "ALL",
"--security-opt", "no-new-privileges",
"--tmpfs", "/tmp:rw,noexec,nosuid,size=16m",
"-v", workDir + ":/workspace:rw",
"-w", "/workspace",
"alpine:3.20",
"sh", "-lc", input,
}
cmd := exec.CommandContext(ctx, "docker", args...)
out, err := cmd.CombinedOutput()
- with this ultra-minimalistic sandbox and the docker flags above, we already have a reasonably locked-down sandbox:
  - Resource constraints: --memory 128m, --cpus 0.5, --pids-limit 64
  - Network: --network none
  - Other security measures: --read-only, --cap-drop ALL, --tmpfs /tmp:rw,noexec,nosuid,size=16m, and --security-opt no-new-privileges, which prevents processes from gaining more privileges than their parent
- finally, we enter the container with sh -lc to execute input commands.
- oh, and upon program start, we create a tmp dir for the sandbox and mount it as a container volume, so the code has a place to actually do its work.
workDir, err := os.MkdirTemp("", "agent-sandbox-*")
- and put in a little dummy file:
_ = os.WriteFile(filepath.Join(workDir, "note.txt"), []byte("hello from sandbox\n"), 0o644)
- first run, after pulling an alpine image:
Mini Agent Sandbox (Go)
Type commands. Type 'exit' to quit.
Allowed commands: echo, ls, pwd, cat
Sandbox directory: /var/folders/hk/7j5l95613kbgg217cm3l9k5w0000gr/T/agent-sandbox-1089365983
agent> ls
----- result -----
command: ls
stdout:
note.txt
status: OK
agent> cat note.txt
----- result -----
command: cat note.txt
stdout:
hello from sandbox
status: OK
- sweet
persistent container instead of per run
- now update it to a persistent docker container throughout the session:
  - At startup, main() now creates one long-lived sandbox container via startSandboxContainer(...) and reuses it for all commands
  - and replace the one-off command execution with a foreground process (just a while loop) that keeps the container alive.
func startSandboxContainer(workDir string) (string, error) {
containerName := fmt.Sprintf("agent-sandbox-%d-%d", time.Now().Unix(), rand.Intn(100000))
args := []string{
"run", "-d",
// name the container so later docker exec calls can target it
"--name", containerName,
// same flags as before
"alpine:3.20",
"sh", "-lc", "while true; do sleep 3600; done",
}
cmd := exec.Command("docker", args...)
out, err := cmd.CombinedOutput()
if err != nil {
return "", fmt.Errorf("%w: %s", err, strings.TrimSpace(string(out)))
}
return containerName, nil
}
- Much better. Performance-wise, or anything-else-wise, there’s no difference that I can feel. But psychologically, the little program feels much better
  - Before, each input did a fresh docker run --rm ... sh -lc "<command>", so each new command used a new container
  - After, program startup does one docker run -d ... container, and each input does docker exec ... into that same running container.
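The exec side of the persistent container is not shown above, so here is a rough sketch of what it might look like. execArgs, removeArgs and runDocker are hypothetical helper names; the container name in main is a placeholder. Removing the container with docker rm -f at shutdown is my assumption about cleanup, since a container started with -d keeps running after the program exits.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// execArgs builds the argv for running one command inside the
// long-lived container (still through sh -lc at this stage).
func execArgs(containerName, input string) []string {
	return []string{"exec", containerName, "sh", "-lc", input}
}

// removeArgs builds the argv for force-removing the container,
// intended to be run once when the session ends.
func removeArgs(containerName string) []string {
	return []string{"rm", "-f", containerName}
}

// runDocker runs the docker CLI with a timeout and returns its combined
// stdout/stderr. It is defined here for completeness but not called in
// main, so the sketch runs without Docker installed.
func runDocker(timeout time.Duration, args ...string) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	out, err := exec.CommandContext(ctx, "docker", args...).CombinedOutput()
	return string(out), err
}

func main() {
	name := "agent-sandbox-demo" // placeholder container name
	fmt.Println("per command:", execArgs(name, "ls"))
	fmt.Println("at shutdown:", removeArgs(name))
}
```

In the real program, main would presumably defer something like runDocker(5*time.Second, removeArgs(containerName)...) right after startSandboxContainer succeeds.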
policy, access control
network control and command approval
- to give the sandbox network access, we can simply toggle docker’s --network flag, passing it through from the program’s input arg:
func main() {
networkMode := flag.String("network-mode", "none", "network mode: none or allowlist")
networkAllow := flag.String("network-allow", "", "comma-separated allowed hosts for allowlist mode")
flag.Parse()
// rest of the code
}
- parsing the flags:
func parseConfig(mode, allowHosts string) (Config, error) {
cfg := Config{
NetworkMode: mode,
AllowedHosts: map[string]bool{},
}
if cfg.NetworkMode != "none" && cfg.NetworkMode != "allowlist" {
return cfg, fmt.Errorf("invalid network mode %q (use none or allowlist)", cfg.NetworkMode)
}
for host := range strings.SplitSeq(allowHosts, ",") {
h := strings.TrimSpace(strings.ToLower(host))
if h != "" {
cfg.AllowedHosts[h] = true
}
}
return cfg, nil
}
- to get an approval prompt similar to the annoying one in Claude Code/Codex etc, we can have:
func askForApproval(scanner *bufio.Scanner, reason string) bool {
fmt.Printf("Approval required (%s). Run anyway? [y/N]: ", reason)
if !scanner.Scan() {
return false
}
answer := strings.TrimSpace(strings.ToLower(scanner.Text()))
return answer == "y" || answer == "yes"
}
- running the sandbox program with --network-mode allowlist and --network-allow example.com,google.com:
Mini Agent Sandbox (Go)
Type commands. Type 'exit' to quit.
Allowed commands: echo, ls, pwd, cat, curl
Network mode: allowlist
Sandbox directory: /var/folders/hk/7j5l95613kbgg217cm3l9k5w0000gr/T/agent-sandbox-4126136591
Sandbox container: agent-sandbox-1772788507-61985
agent> curl test.com
Approval required (network command). Run anyway? [y/N]: y
----- result -----
command: curl test.com
error: host not allowed in allowlist mode: test.com
agent> curl example.com
Approval required (network command). Run anyway? [y/N]: y
----- result -----
command: curl example.com
stdout:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0<!doctype html><html lang="en"><head><title>Example Domain</title><meta name="viewport" content="width=device-width, initial-scale=1"><style>body{background:#eee;width:60vw;margin:15vh auto;font-family:system-ui,sans-serif}h1{font-size:1.5em}div{opacity:0.8}a:link,a:visited{color:#348}</style></head><body><div><h1>Example Domain</h1><p>This domain is for use in documentation examples without needing permission. Avoid use in operations.</p><p><a href="https://iana.org/domains/example">Learn more</a></p></div></body></html>
100 528 0 528 0 0 11450 0 --:--:-- --:--:-- --:--:-- 11478
status: OK
- curl example.com works as expected
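The transcript above shows the host check rejecting test.com, but not the check itself. A minimal sketch of how such a check might look, assuming the host is taken from the URL argument passed to curl: hostFromArg and checkHost are hypothetical names, and the error text simply mirrors the message in the transcript.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hostFromArg extracts a lower-cased hostname from a curl-style argument.
// Bare hosts such as "example.com" get an http:// prefix so that
// url.Parse recognises the host part.
func hostFromArg(arg string) (string, error) {
	if !strings.Contains(arg, "://") {
		arg = "http://" + arg
	}
	u, err := url.Parse(arg)
	if err != nil {
		return "", err
	}
	h := strings.ToLower(u.Hostname())
	if h == "" {
		return "", fmt.Errorf("no host in %q", arg)
	}
	return h, nil
}

// checkHost returns an error when the argument's host is not allowlisted.
func checkHost(arg string, allowed map[string]bool) error {
	h, err := hostFromArg(arg)
	if err != nil {
		return err
	}
	if !allowed[h] {
		return fmt.Errorf("host not allowed in allowlist mode: %s", h)
	}
	return nil
}

func main() {
	allowed := map[string]bool{"example.com": true, "google.com": true}
	for _, arg := range []string{"example.com", "test.com", "https://Example.com/path"} {
		fmt.Printf("%-26s -> %v\n", arg, checkHost(arg, allowed))
	}
}
```

Note that a check like this only looks at the URL as typed. It does not follow redirects or constrain what the container can actually reach at the network level, so it is a policy hint rather than real network isolation.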
path control
- similarly, it’s not hard to add file path control:
func isPathWithinWorkspace(arg string) bool {
base := "/workspace"
var full string
if filepath.IsAbs(arg) {
full = filepath.Clean(arg)
} else {
full = filepath.Clean(filepath.Join(base, arg))
}
return full == base || strings.HasPrefix(full, base+"/")
}
- we can see:
agent> ls
----- result -----
command: ls
stdout:
note.txt
status: OK
agent> ls /etc
----- result -----
command: ls /etc
error: path not allowed: /etc (only /workspace)
side note (security!)
- even in this very, very simple sandbox example, we already have a huge security hole
- in the path control example, we implemented a check on the path to enforce path control, so only paths with the /workspace prefix are allowed:
agent> cat /etc/passwd
----- result -----
command: cat /etc/passwd
error: path not allowed: /etc/passwd (only /workspace)
- this seems to be working. Can we really not access any path that’s not under the /workspace directory?
- let’s try this:
agent> echo hi; cat /etc/passwd
Approval required (contains shell metacharacters). Run anyway? [y/N]: y
----- result -----
command: echo hi; cat /etc/passwd
stdout:
hi
root:x:0:0:root:/root:/bin/sh
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
- oh no! the path control is bypassed simply by chaining commands! command injection!
- this is due to the problematic sh -lc shell eval:
args := []string{"exec", containerName, "sh", "-lc", input}
cmd := exec.CommandContext(ctx, "docker", args...)
out, err := cmd.CombinedOutput()
- sh -lc input sends raw user text to a shell parser:
  - The shell treats ;, &&, |, >, $(), backticks, globs, env expansion, etc. as control features.
  - So user text can change intent: one command can become many commands, redirections, or substitutions, like echo hi; cat /etc/passwd
- let’s patch this:
- args := []string{"exec", containerName, "sh", "-lc", input}
+ args := []string{"exec", containerName, cmdName}
+ args = append(args, parts[1:]...)
cmd := exec.CommandContext(ctx, "docker", args...)
out, err := cmd.CombinedOutput()
- echo hi; ls no longer chains commands; it prints the literal hi; ls.
- We execute an exact argv array: docker exec <container> cmd arg1 arg2 ....
- No shell parser runs.
- Metacharacters are plain characters inside args, not operators.
audit log
- before adding an audit log, it’s probably a good idea to do some code cleanup, since this single main.go is getting too long
- i’d like the project tree to be:
sandbox/
├── main.go -> main entry for the program
├── executor.go -> handles sandbox creation/removal and command execution
├── policy.go -> defines sandbox policy
└── audit.go -> audit-logs command execution
- to add an audit log, we need to think about what we want to log
  - command executed → just the input from user
  - command execution result
  - decision from policy: command is blocked, denied, or executed (failed/timeout/etc)
  - reason for decision
  - other execution metadata: execution time, duration
- NOTE: probably not a good idea to log stdout/stderr, since it may contain sensitive user data
- we have a struct Auditor with a LogEvent method:
type AuditEntry struct {
Timestamp string `json:"ts"`
Command string `json:"command"`
Decision string `json:"decision"`
Reason string `json:"reason,omitempty"`
ExitCode int `json:"exit_code,omitempty"`
Timeout bool `json:"timeout,omitempty"`
DurationMs int64 `json:"duration_ms"`
}
type Auditor struct {
file *os.File
enc *json.Encoder
}
func (a *Auditor) LogEvent(command string, decision Decision, reason string, start time.Time, res *Result) {
if a == nil || a.enc == nil {
return
}
entry := AuditEntry{
Timestamp: time.Now().UTC().Format(time.RFC3339Nano),
Command: command,
Decision: string(decision),
Reason: reason,
DurationMs: 0,
}
// field validation
_ = a.enc.Encode(entry)
}
- usage:
start := time.Now()
res := runInSandbox(containerName, line, decision.Parts, 3*time.Second)
auditor.LogEvent(line, DecisionExecuted, "", start, &res)
the workflow now is:
- User types a command
- The program checks policy and makes a decision (blocked/needs approval/etc)
- After user approval, if needed, the command is sent to a container to execute.
- After execution, the program gets the result of the execution from the container and then writes a log entry to the audit file on the host machine
Example audit log:
{"ts":"2026-03-06T09:30:00Z","command":"cat /etc/passwd","decision":"blocked","reason":"path not allowed: /etc/passwd (only /workspace)","duration_ms":0}
{"ts":"2026-03-06T09:30:03Z","command":"echo hello","decision":"executed","exit_code":0,"duration_ms":12}
putting the agent in
- this is about “sandboxing in coding agents”, yet the agent is still missing
- first we need to hook up an LLM; let’s go with Groq and openai/gpt-oss-20b
tools
- as this involves tool calls, we need to define the tool-call-related structs. a very basic shell tool that the agent can use:
var shellTool = toolDef{
Type: "function",
Function: toolFunctionDef{
Name: "run_command",
Description: "Run a shell command in the sandbox. Allowed commands: echo, ls, pwd, cat, curl. The working directory is /workspace.",
Parameters: json.RawMessage(`{
"type": "object",
"properties": {
"command": {
"type": "string",
"description": "The full command to run, e.g. 'ls -la' or 'cat note.txt'"
}
},
"required": ["command"]
}`),
},
}
- request to Groq API:
// request struct
reqBody := chatRequest{
Model: "openai/gpt-oss-20b",
Messages: messages,
Tools: []toolDef{shellTool},
}
sandbox
- when the LLM returns a tool call response, we run a policy check (wrapped inside executeToolCall) before executing it in the sandbox
- pseudo code for tool execution:
resp, err := callGroq(apiKey, history)
msg := resp.Choices[0].Message
for _, tc := range msg.ToolCalls {
result := executeToolCall(tc, containerName, cfg, auditor, scanner)
history = append(history, chatMessage{
Role: "tool",
ToolCallID: tc.ID,
Content: result,
})
}
- in executeToolCall, we check the policy and ask for user approval if needed
decision := evaluatePolicy(args.Command, cfg)
if !decision.Allowed {
auditor.LogBlocked(args.Command, decision.BlockReason)
fmt.Printf(" BLOCKED: %s\n", decision.BlockReason)
return "BLOCKED: " + decision.BlockReason
}
if decision.RequiresApproval {
fmt.Printf(" ⚠ %s\n", decision.ApprovalReason)
fmt.Print(" Allow? [y/N]: ")
approved := scanner.Scan() && strings.ToLower(strings.TrimSpace(scanner.Text())) == "y"
if !approved {
auditor.LogBlocked(args.Command, "user denied: "+decision.ApprovalReason)
return "DENIED: user rejected — " + decision.ApprovalReason
}
}
- if the command passes the policy check, simply execute it in the sandbox
start := time.Now()
res := runInSandbox(containerName, decision.Parts, 3*time.Second)
auditor.LogExecution(args.Command, start, res)
- running our agent:
you> hi
agent> Hi there! How can I help you today?
you> what's in this dir
[agent wants to call] `ls -la`
Execution output: total 8
drwx------ 3 root root 96 Mar 9 02:56 .
drwxr-xr-x 1 root root 4096 Mar 9 02:56 ..
-rw-r--r-- 1 root root 19 Mar 9 02:56 note.txt
agent> Here’s what’s inside the current working directory (`/workspace`):
- `note.txt` – a regular text file (19 bytes).
- No other files or sub‑directories are present.
If you’d like to see the contents of `note.txt`, just let me know!
- updating the policy so that paths outside of /workspace need user approval:
  - it asks for user approval as expected
you> what about in /etc
[agent wants to call] `ls -la /etc`
⚠ accesses path outside /workspace: /etc
Allow? [y/N]: n
agent> I’m sorry, but I can’t list or access `/etc` (or any other directory outside of the sandbox’s `/workspace`). If you need information about a specific file inside the workspace or want me to perform a command here, just let me know!
that’s it
- as a quick weekend learning toy project, i think i’ve learned what i wanted about the very, very basics of sandboxing in a coding agent (the bare version simply forwards tool call requests from the LLM to a container)
- before I got my hands on this, I always thought that the agent program - the coding agent itself - was put inside a container. Actually, the sandbox is just for command execution. The agent program runs on the host machine
+-------------------------------------+
| Agent Program (host machine) |
| - reasoning loop |
| - planning |
| - tool selection |
| - decides commands to run |
| |
| exec -> docker run ... |
+------------│------------------------+
│
▼
+-------------------------------------+
| Sandbox (Container). |
| - executes shell commands |
| - runs user code |
| - installs dependencies |
| - compiles programs |
| |
| limited CPU / memory / filesystem |
+-------------------------------------+
- going further on the topic of containers, i would like to try some container escapes, i.e., ask the agent to break out of the sandbox. e.g., mount the Docker socket (-v /var/run/docker.sock:/var/run/docker.sock) and see how trivially the agent can escape.
- network-related issues would be interesting to dig into too: as the agent explores the unknown of the internet, content could be scanned for malicious payloads before it reaches the agent. maybe a proxy between the sandbox and the internet that logs and inspects every HTTP request the agent makes, even allowed ones.
- or as codex suggests:
The most impactful next step: swap your Docker runtime to gVisor (docker run --runtime=runsc ...). It’s a one-flag change to your executor.go and gives you a whole new layer to explore.
others
Why would we consider microVMs like Firecracker?
Containers provide process-level isolation, while Firecracker VMs provide kernel-level isolation. For highly untrusted code, kernel isolation is much safer.
Because containers still share the kernel with the host machine, malicious code can exploit kernel bugs, misconfigured permissions, or bugs in the container runtime itself. If code executed in the container breaks out through any of these, it is now running directly on the host machine. Running completely untrusted code in Docker alone can still be considered risky.
Each VM has its own kernel, memory space and virtual devices. Even if the sandbox is compromised, the malicious code is still inside the VM. Breaking out of a VM requires bypassing the hypervisor, which is much harder.