how does sandboxing work in coding agents

tags: learning agentic-coding AI software

I heard agents are just conversation loops with tool calls. But how? I understood the concept from reading, but could only visualize it after following along with How to Build an Agent from Thorsten and the Amp team, and implementing it myself to see the conversation loop in a tangible, bare-bones program.

I’ve been hearing a lot about sandboxing. I know containers are probably used, but how?

warm up

starting off with a basic shell wrapper that runs commands through exec.CommandContext. with this wrapper in front of command execution, we can already implement the most basic allow/deny list:

allowed := map[string]bool{
	"echo": true,
	"ls":   true,
	"pwd":  true,
	"cat":  true,
}
 
cmdName := parts[0]
if !allowed[cmdName] {
	res.Err = fmt.Errorf("command not allowed: %s", cmdName)
	return res
}
 
ctx, cancel := context.WithTimeout(context.Background(), timeout)
defer cancel()
 
cmd := exec.CommandContext(ctx, cmdName, parts[1:]...)
cmd.Dir = workDir
stdout, err := cmd.StdoutPipe()

result:

Mini Agent Sandbox (Go)
Type commands. Type 'exit' to quit.
Allowed commands: echo, ls, pwd, cat
Sandbox directory: /var/folders/hk/7j5l95613kbgg217cm3l9k5w0000gr/T/agent-sandbox-707021901
 
agent> ls
----- result -----
command: ls
stdout:
note.txt
status: OK
 
agent> sudo echo
----- result -----
command: sudo echo
error: command not allowed: sudo
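
The snippet above never shows how parts is produced, so here is a minimal, self-contained sketch of the split-then-check step. It assumes the input is split on whitespace with strings.Fields (an assumption; the original may tokenize differently), and parseCommand is a hypothetical helper name, not something from the original program.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// allowed mirrors the allowlist from the snippet above.
var allowed = map[string]bool{"echo": true, "ls": true, "pwd": true, "cat": true}

// parseCommand (hypothetical helper) splits raw input on whitespace and checks
// the command name against the allowlist. It does not understand quoting.
func parseCommand(input string) ([]string, error) {
	parts := strings.Fields(input)
	if len(parts) == 0 {
		return nil, errors.New("empty command")
	}
	if !allowed[parts[0]] {
		return nil, fmt.Errorf("command not allowed: %s", parts[0])
	}
	return parts, nil
}

func main() {
	for _, in := range []string{"ls -la", "sudo echo", "   "} {
		parts, err := parseCommand(in)
		fmt.Printf("%q -> parts=%q err=%v\n", in, parts, err)
	}
}
```

Only the first token is checked here, which is exactly why this is an allowlist and not a sandbox.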

this has nothing to do with sandboxing yet, though: the command still runs directly on the host. a few sandboxing approaches we can try:
- Containers (Docker/rootless/podman)
- Namespaces, cgroups, seccomp, without containers
- microVMs (e.g., Firecracker) for stronger isolation

let’s go with the simplest one: a container.

trying a real sandbox, with the simplest possible docker container
- we still use exec.CommandContext, but now it invokes docker run --rm, with docker flags that limit what the command can see and do inside the sandbox:

cmdName := parts[0]
if !allowed[cmdName] {
	res.Err = fmt.Errorf("command not allowed: %s", cmdName)
	return res
}
 
ctx, cancel := context.WithTimeout(context.Background(), timeout)
defer cancel()
 
args := []string{
	"run", "--rm",
	"--network", "none",
	"--memory", "128m",
	"--cpus", "0.5",
	"--pids-limit", "64",
	"--read-only",
	"--cap-drop", "ALL",
	"--security-opt", "no-new-privileges",
	"--tmpfs", "/tmp:rw,noexec,nosuid,size=16m",
	"-v", workDir + ":/workspace:rw",
	"-w", "/workspace",
	"alpine:3.20",
	"sh", "-lc", input,
}
cmd := exec.CommandContext(ctx, "docker", args...)
out, err := cmd.CombinedOutput()

with this ultra-minimal setup and the docker flags above, we already have a reasonably locked-down sandbox:
- Resource constraints:
  - --memory 128m
  - --cpus 0.5
  - --pids-limit 64
- Network: --network none
- Other security measures:
  - --read-only plus --tmpfs /tmp:rw,noexec,nosuid,size=16m: the root filesystem is read-only, with a small writable /tmp that cannot execute binaries
  - --cap-drop ALL: drops all Linux capabilities
  - --security-opt no-new-privileges: prevents processes from gaining more privileges than their parent (e.g. via setuid binaries)
finally, we run the input inside the container with sh -lc.
oh, and at program start we create a temp dir for the sandbox and mount it into the container as /workspace, so the code has a place to actually do its work:

workDir, err := os.MkdirTemp("", "agent-sandbox-*")

and put in a little dummy file:

_ = os.WriteFile(filepath.Join(workDir, "note.txt"), []byte("hello from sandbox\n"), 0o644)

first run, after pulling an alpine image:

Mini Agent Sandbox (Go)
Type commands. Type 'exit' to quit.
Allowed commands: echo, ls, pwd, cat
Sandbox directory: /var/folders/hk/7j5l95613kbgg217cm3l9k5w0000gr/T/agent-sandbox-1089365983
 
agent> ls
----- result -----
command: ls
stdout:
note.txt
status: OK
 
agent> cat note.txt
----- result -----
command: cat note.txt
stdout:
hello from sandbox
status: OK

sweet

persistent container instead of per run

now let’s switch to one persistent docker container for the whole session:
- at startup, main() creates one long-lived sandbox container via startSandboxContainer(...) and reuses it for all commands
- the container’s main process is no longer the one-off command but a foreground loop (just a while loop around sleep) that keeps the container alive.

func startSandboxContainer(workDir string) (string, error) {
	containerName := fmt.Sprintf("agent-sandbox-%d-%d", time.Now().Unix(), rand.Intn(100000))
 
	args := []string{
		"run", "-d",
		"--name", containerName, // so later docker exec calls can address the container
		// same flags as before
		"alpine:3.20",
		"sh", "-lc", "while true; do sleep 3600; done",
	}
 
	cmd := exec.Command("docker", args...)
	out, err := cmd.CombinedOutput()
	if err != nil {
		return "", fmt.Errorf("%w: %s", err, strings.TrimSpace(string(out)))
	}
	return containerName, nil
}

Much better. Performance-wise, or anything-else-wise, I can’t feel any difference. But psychologically, the little program feels much better:
- Before, each input did a fresh docker run --rm ... sh -lc "<command>", so every command got a new container.
- After, program startup does one docker run -d ..., and each input does docker exec ... into that same running container.
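
A long-lived container also needs to be removed when the session ends, since it is no longer started with --rm semantics per command. The sketch below is one way to do that with docker rm -f. stopSandboxContainer and removeArgs are hypothetical helper names and the 10-second timeout is an arbitrary choice; none of this is from the original program.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// removeArgs builds the argv for force-removing the sandbox container.
func removeArgs(containerName string) []string {
	return []string{"rm", "-f", containerName}
}

// stopSandboxContainer (hypothetical helper) force-removes the container.
// It is meant to be deferred in main right after the container is started.
func stopSandboxContainer(containerName string) error {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	out, err := exec.CommandContext(ctx, "docker", removeArgs(containerName)...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("docker rm: %w: %s", err, strings.TrimSpace(string(out)))
	}
	return nil
}

func main() {
	// Only print the command here; actually running it needs a Docker daemon.
	fmt.Println("docker", strings.Join(removeArgs("agent-sandbox-example"), " "))
}
```

In the real program this would look like `defer stopSandboxContainer(containerName)` right after startSandboxContainer succeeds.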

policy, access control

network control and command approval

to give the sandbox network access, we can simply toggle docker’s --network flag, passing it through from the program’s input arg:

func main() {
	networkMode := flag.String("network-mode", "none", "network mode: none or allowlist")
	networkAllow := flag.String("network-allow", "", "comma-separated allowed hosts for allowlist mode")
	flag.Parse()
	// rest of the code
}

parsing the flags:

func parseConfig(mode, allowHosts string) (Config, error) {
	cfg := Config{
		NetworkMode:  mode,
		AllowedHosts: map[string]bool{},
	}
 
	if cfg.NetworkMode != "none" && cfg.NetworkMode != "allowlist" {
		return cfg, fmt.Errorf("invalid network mode %q (use none or allowlist)", cfg.NetworkMode)
	}
 
	for host := range strings.SplitSeq(allowHosts, ",") {
		h := strings.TrimSpace(strings.ToLower(host))
		if h != "" {
			cfg.AllowedHosts[h] = true
		}
	}
	return cfg, nil
}

to get the same (annoying) “approval” prompt as in Claude Code/Codex etc., we can have:

func askForApproval(scanner *bufio.Scanner, reason string) bool {
	fmt.Printf("Approval required (%s). Run anyway? [y/N]: ", reason)
	if !scanner.Scan() {
		return false
	}
	answer := strings.TrimSpace(strings.ToLower(scanner.Text()))
	return answer == "y" || answer == "yes"
}

running the sandbox program with --network-mode allowlist and --network-allow example.com,google.com:

Mini Agent Sandbox (Go)
Type commands. Type 'exit' to quit.
Allowed commands: echo, ls, pwd, cat, curl
Network mode: allowlist
Sandbox directory: /var/folders/hk/7j5l95613kbgg217cm3l9k5w0000gr/T/agent-sandbox-4126136591
Sandbox container: agent-sandbox-1772788507-61985
 
agent> curl test.com
Approval required (network command). Run anyway? [y/N]: y
----- result -----
command: curl test.com
error: host not allowed in allowlist mode: test.com
 
agent> curl example.com
Approval required (network command). Run anyway? [y/N]: y
----- result -----
command: curl example.com
stdout:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0<!doctype html><html lang="en"><head><title>Example Domain</title><meta name="viewport" content="width=device-width, initial-scale=1"><style>body{background:#eee;width:60vw;margin:15vh auto;font-family:system-ui,sans-serif}h1{font-size:1.5em}div{opacity:0.8}a:link,a:visited{color:#348}</style></head><body><div><h1>Example Domain</h1><p>This domain is for use in documentation examples without needing permission. Avoid use in operations.</p><p><a href="https://iana.org/domains/example">Learn more</a></p></div></body></html>
100   528    0   528    0     0  11450      0 --:--:-- --:--:-- --:--:-- 11478
status: OK

curl example.com works as expected
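
The transcript shows the "host not allowed in allowlist mode" error, but not the check that produces it. A rough sketch of such a check is below, assuming every non-flag argument of the command is a URL or bare hostname. hostFromArg and checkHosts are hypothetical names. Note this is a check in the host program only: it does not constrain what the container can actually reach (redirects, for example), which would need network-level enforcement.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hostFromArg extracts a lowercase hostname from arguments such as
// "example.com", "example.com/path" or "https://example.com:8443/x".
func hostFromArg(arg string) (string, error) {
	if !strings.Contains(arg, "://") {
		arg = "http://" + arg // no scheme: prepend one so url.Parse sees a host
	}
	u, err := url.Parse(arg)
	if err != nil {
		return "", err
	}
	if u.Hostname() == "" {
		return "", fmt.Errorf("no host in %q", arg)
	}
	return strings.ToLower(u.Hostname()), nil
}

// checkHosts fails closed: any non-flag argument that is not an allowlisted
// host (or cannot be parsed) rejects the whole command.
func checkHosts(args []string, allowedHosts map[string]bool) error {
	for _, a := range args {
		if strings.HasPrefix(a, "-") {
			continue // skip flags like -s
		}
		h, err := hostFromArg(a)
		if err != nil {
			return fmt.Errorf("cannot determine host for %q: %w", a, err)
		}
		if !allowedHosts[h] {
			return fmt.Errorf("host not allowed in allowlist mode: %s", h)
		}
	}
	return nil
}

func main() {
	allow := map[string]bool{"example.com": true, "google.com": true}
	fmt.Println(checkHosts([]string{"example.com"}, allow))
	fmt.Println(checkHosts([]string{"test.com"}, allow))
}
```

The trade-off of failing closed is false positives: a flag value such as the file name after -o would be treated as a host and rejected.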

path control

similarly, it’s not hard to add file path control:

func isPathWithinWorkspace(arg string) bool {
	base := "/workspace"
	var full string
	if filepath.IsAbs(arg) {
		full = filepath.Clean(arg)
	} else {
		full = filepath.Clean(filepath.Join(base, arg))
	}
	return full == base || strings.HasPrefix(full, base+"/")
}

we can see:

agent> ls
----- result -----
command: ls
stdout:
note.txt
status: OK
 
agent> ls /etc
----- result -----
command: ls /etc
error: path not allowed: /etc (only /workspace)
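
To see how the check behaves on trickier inputs, here is a small variation with a few probes. It uses the path package instead of filepath, on the assumption that the paths being checked are container (POSIX) paths regardless of the host OS; withinWorkspace is a hypothetical name. The check is purely lexical, so a symlink inside /workspace that points elsewhere would still pass.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

const workspace = "/workspace"

// withinWorkspace resolves arg against /workspace lexically ("." and ".."
// collapsed, forward slashes only) and checks that the result stays inside.
func withinWorkspace(arg string) bool {
	full := arg
	if !path.IsAbs(arg) {
		full = path.Join(workspace, arg)
	}
	full = path.Clean(full)
	return full == workspace || strings.HasPrefix(full, workspace+"/")
}

func main() {
	probes := []string{
		"note.txt",
		"/workspace/sub/a.txt",
		"../etc/passwd",
		"/workspace/../etc",
		"/workspace2/x",
		"/etc",
	}
	for _, p := range probes {
		fmt.Printf("%-22s %v\n", p, withinWorkspace(p))
	}
}
```

Only the first two probes are accepted; the ".." traversals and the /workspace2 look-alike prefix are rejected.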

side note (security!)

even in this very, very simple sandbox example, we already have a serious security hole.
in the path control example above, we implemented a check to enforce path control, so that only paths under /workspace are allowed:

agent> cat /etc/passwd
----- result -----
command: cat /etc/passwd
error: path not allowed: /etc/passwd (only /workspace)

this seems to be working. But can we really not access any path outside the /workspace directory?
let’s try this:

agent> echo hi; cat /etc/passwd
Approval required (contains shell metacharacters). Run anyway? [y/N]: y
----- result -----
command: echo hi; cat /etc/passwd
stdout:
hi
root:x:0:0:root:/root:/bin/sh
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin

oh no! the path control is bypassed simply by chaining commands. command injection!
this is due to the problematic sh -lc shell eval:

args := []string{"exec", containerName, "sh", "-lc", input}
cmd := exec.CommandContext(ctx, "docker", args...)
out, err := cmd.CombinedOutput()

sh -lc input sends the raw user text to a shell parser:
- The shell treats ;, &&, |, >, $(), backticks, globs, env expansion, etc. as control features.
- So user text can change intent: one command can become many commands, redirections, or substitutions, like echo hi; cat /etc/passwd
let’s patch this:

-	args := []string{"exec", containerName, "sh", "-lc", input}
+	args := []string{"exec", containerName, cmdName}
+	args = append(args, parts[1:]...)
	cmd := exec.CommandContext(ctx, "docker", args...)
	out, err := cmd.CombinedOutput()

after the patch, echo hi; ls no longer chains commands; it prints the literal hi; ls.
We now execute an exact argv array: docker exec <container> cmd arg1 arg2 ....
- No shell parser runs.
- Metacharacters are plain characters inside args, not operators.

audit log

before adding an audit log, it’s probably a good idea to do some code cleanup, since the single main.go is getting too long.
i’d like the project tree to be:

sandbox/
├── main.go -> main entry for the program
├── executor.go -> handles sandbox creation/removal and command execution
├── policy.go -> defines sandbox policy
└── audit.go -> audit-logs command execution

to add an audit log, we need to think about what we want to log:
- command executed → just the input from the user
- command execution result
  - decision from policy: command is blocked, denied, or executed (failed/timeout/etc)
  - reason for the decision
- other execution metadata: timestamp, duration
NOTE: it’s probably not a good idea to log stdout/stderr, since they may contain sensitive user data
we have an Auditor struct with a LogEvent method:

type AuditEntry struct {
	Timestamp  string `json:"ts"`
	Command    string `json:"command"`
	Decision   string `json:"decision"`
	Reason     string `json:"reason,omitempty"`
	ExitCode   int    `json:"exit_code,omitempty"`
	Timeout    bool   `json:"timeout,omitempty"`
	DurationMs int64  `json:"duration_ms"`
}
 
type Auditor struct {
	file *os.File
	enc  *json.Encoder
}
 
func (a *Auditor) LogEvent(command string, decision Decision, reason string, start time.Time, res *Result) {
	if a == nil || a.enc == nil {
		return
	}
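	entry := AuditEntry{
		Timestamp: time.Now().UTC().Format(time.RFC3339Nano),
		Command:   command,
		Decision:  string(decision),
		Reason:    reason,
	}
	// record the duration only when a start time was passed in
	if !start.IsZero() {
		entry.DurationMs = time.Since(start).Milliseconds()
	}
	// field validation
	_ = a.enc.Encode(entry)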
}

usage:

start := time.Now()
res := runInSandbox(containerName, line, decision.Parts, 3*time.Second)
auditor.LogEvent(line, DecisionExecuted, "", start, &res)

the workflow now is:

- the user types a command
- the program checks policy and makes a decision (blocked/needs approval/etc)
- after user approval, if needed, the command is sent to the container to execute
- after execution, the program gets the result from the container and writes a log entry to the audit file on the host machine

Example audit log:

{"ts":"2026-03-06T09:30:00Z","command":"cat /etc/passwd","decision":"blocked","reason":"path not allowed: /etc/passwd (only /workspace)","duration_ms":0}
{"ts":"2026-03-06T09:30:03Z","command":"echo hello","decision":"executed","exit_code":0,"duration_ms":12}

putting the agent in

the post is about “sandboxing in coding agents”, yet the agent is still missing.
first we need to hook up an LLM; let’s go with Groq and openai/gpt-oss-20b.

tools

as this involves tool calls, we need to define the tool-call related structs. here is a very basic shell tool that the agent can use:

var shellTool = toolDef{
	Type: "function",
	Function: toolFunctionDef{
		Name:        "run_command",
		Description: "Run a shell command in the sandbox. Allowed commands: echo, ls, pwd, cat, curl. The working directory is /workspace.",
		Parameters: json.RawMessage(`{
			"type": "object",
			"properties": {
				"command": {
					"type": "string",
					"description": "The full command to run, e.g. 'ls -la' or 'cat note.txt'"
				}
			},
			"required": ["command"]
		}`),
	},
}

request to Groq API:

// request struct
reqBody := chatRequest{
	Model:    "openai/gpt-oss-20b",
	Messages: messages,
	Tools:    []toolDef{shellTool},
}
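
The struct definitions behind chatRequest, toolDef and friends are not shown, so the sketch below is a guess at what they could look like, using the JSON field names of the OpenAI-compatible chat completions format. The exact shape of the original types is an assumption, and commandFromToolCall is a hypothetical helper. One detail worth seeing in code: the tool call's arguments arrive as a JSON-encoded string, so they need a second decode.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type toolFunctionDef struct {
	Name        string          `json:"name"`
	Description string          `json:"description"`
	Parameters  json.RawMessage `json:"parameters"`
}

type toolDef struct {
	Type     string          `json:"type"`
	Function toolFunctionDef `json:"function"`
}

type toolCall struct {
	ID       string `json:"id"`
	Type     string `json:"type"`
	Function struct {
		Name      string `json:"name"`
		Arguments string `json:"arguments"` // JSON-encoded string, e.g. {"command":"ls"}
	} `json:"function"`
}

type chatMessage struct {
	Role       string     `json:"role"`
	Content    string     `json:"content"`
	ToolCalls  []toolCall `json:"tool_calls,omitempty"`
	ToolCallID string     `json:"tool_call_id,omitempty"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
	Tools    []toolDef     `json:"tools,omitempty"`
}

// commandFromToolCall decodes the "command" argument of a run_command call.
func commandFromToolCall(tc toolCall) (string, error) {
	var args struct {
		Command string `json:"command"`
	}
	if err := json.Unmarshal([]byte(tc.Function.Arguments), &args); err != nil {
		return "", fmt.Errorf("bad tool arguments: %w", err)
	}
	return args.Command, nil
}

func main() {
	req := chatRequest{
		Model:    "openai/gpt-oss-20b",
		Messages: []chatMessage{{Role: "user", Content: "what's in this dir"}},
		Tools: []toolDef{{Type: "function", Function: toolFunctionDef{
			Name:       "run_command",
			Parameters: json.RawMessage(`{"type":"object"}`),
		}}},
	}
	body, err := json.Marshal(req)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	var tc toolCall
	tc.Function.Arguments = `{"command":"ls -la"}`
	cmd, err := commandFromToolCall(tc)
	fmt.Println(cmd, err)
}
```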

sandbox

when the LLM returns a tool call, we run a policy check before executing it in the sandbox (wrapped inside executeToolCall).
pseudo code for tool execution:

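resp, err := callGroq(apiKey, history)
if err != nil {
	return err // assumes the enclosing function returns an error
}
msg := resp.Choices[0].Message
// keep the assistant message (with its tool calls) in history; the tool
// results below refer back to it via ToolCallID (assumes msg is a chatMessage)
history = append(history, msg)
 
for _, tc := range msg.ToolCalls {
	result := executeToolCall(tc, containerName, cfg, auditor, scanner)
	history = append(history, chatMessage{
		Role:       "tool",
		ToolCallID: tc.ID,
		Content:    result,
	})
}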

in executeToolCall, we check the policy and ask for user approval if needed:

decision := evaluatePolicy(args.Command, cfg)
if !decision.Allowed {
	auditor.LogBlocked(args.Command, decision.BlockReason)
	fmt.Printf("    BLOCKED: %s\n", decision.BlockReason)
	return "BLOCKED: " + decision.BlockReason
}
 
if decision.RequiresApproval {
	fmt.Printf("    ⚠ %s\n", decision.ApprovalReason)
	fmt.Print("    Allow? [y/N]: ")
	approved := scanner.Scan() && strings.ToLower(strings.TrimSpace(scanner.Text())) == "y"
	if !approved {
		auditor.LogBlocked(args.Command, "user denied: "+decision.ApprovalReason)
		return "DENIED: user rejected — " + decision.ApprovalReason
	}
}

if the command passes the policy check, simply execute it in the sandbox:

start := time.Now()
res := runInSandbox(containerName, decision.Parts, 3*time.Second)
auditor.LogExecution(args.Command, start, res)
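
evaluatePolicy itself is not shown, so here is a rough guess at its shape, built from the fields the snippets above use (Allowed, BlockReason, RequiresApproval, ApprovalReason, Parts). It is a sketch under assumptions: it drops the cfg parameter, splits on whitespace, and treats every non-flag argument as a path, which is conservative for commands like echo.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Decision carries the fields used by the executeToolCall snippets above.
type Decision struct {
	Allowed          bool
	BlockReason      string
	RequiresApproval bool
	ApprovalReason   string
	Parts            []string
}

var allowedCommands = map[string]bool{"echo": true, "ls": true, "pwd": true, "cat": true}

// outsideWorkspace reports whether arg, resolved against /workspace, leaves it.
func outsideWorkspace(arg string) bool {
	full := arg
	if !path.IsAbs(arg) {
		full = path.Join("/workspace", arg)
	}
	full = path.Clean(full)
	return full != "/workspace" && !strings.HasPrefix(full, "/workspace/")
}

// evaluatePolicy blocks unknown commands and asks for approval when any
// non-flag argument points outside /workspace.
func evaluatePolicy(command string) Decision {
	parts := strings.Fields(command)
	if len(parts) == 0 {
		return Decision{BlockReason: "empty command"}
	}
	if !allowedCommands[parts[0]] {
		return Decision{BlockReason: "command not allowed: " + parts[0]}
	}
	d := Decision{Allowed: true, Parts: parts}
	for _, arg := range parts[1:] {
		if strings.HasPrefix(arg, "-") {
			continue // flags are not paths
		}
		if outsideWorkspace(arg) {
			d.RequiresApproval = true
			d.ApprovalReason = "accesses path outside /workspace: " + arg
			break
		}
	}
	return d
}

func main() {
	for _, c := range []string{"ls -la", "ls -la /etc", "sudo ls"} {
		fmt.Printf("%-14s %+v\n", c, evaluatePolicy(c))
	}
}
```

Returning Parts from the policy means the executor runs exactly the argv that was checked, with no second parse in between.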

running our agent:

you> hi
 
agent> Hi there! How can I help you today?
 
you> what's in this dir
  [agent wants to call] `ls -la`
    Execution output: total 8
drwx------    3 root     root            96 Mar  9 02:56 .
drwxr-xr-x    1 root     root          4096 Mar  9 02:56 ..
-rw-r--r--    1 root     root            19 Mar  9 02:56 note.txt
 
agent> Here’s what’s inside the current working directory (`/workspace`):
 
- `note.txt` – a regular text file (19 bytes).
- No other files or sub‑directories are present.
 
If you’d like to see the contents of `note.txt`, just let me know!

after updating the policy so that paths outside of /workspace need user approval, the agent asks for approval as expected:

you> what about in /etc
  [agent wants to call] `ls -la /etc`
    ⚠ accesses path outside /workspace: /etc
    Allow? [y/N]: n
 
agent> I’m sorry, but I can’t list or access `/etc` (or any other directory outside of the sandbox’s `/workspace`). If you need information about a specific file inside the workspace or want me to perform a command here, just let me know!

that’s it

as a quick weekend learning toy project, i think i’ve learned what i wanted about the very, very basics of sandboxing in coding agents (the bare version is simply forwarding the LLM’s tool-call requests to a container).
before I got my hands on this, I always thought that the agent program, the coding agent itself, was put inside a container. Actually, at least in this setup, the sandbox is just for command execution, and the agent program runs on the host machine:

+-------------------------------------+
| Agent Program (host machine)        |
| - reasoning loop                    |
| - planning                          |
| - tool selection                    |
| - decides commands to run           |
|                                     |
|  exec -> docker run ...             |
+------------│------------------------+
             │
             ▼
+-------------------------------------+
| Sandbox (Container)                 |
| - executes shell commands           |
| - runs user code                    |
| - installs dependencies             |
| - compiles programs                 |
|                                     |
| limited CPU / memory / filesystem   |
+-------------------------------------+

next, on the topic of containers, i would like to try some container escapes, i.e., ask the agent to break out of the sandbox. e.g., mount the Docker socket (-v /var/run/docker.sock:/var/run/docker.sock) and see how trivially the agent can escape.
network-related issues would be interesting to dig into too: as the agent explores the unknowns of the internet, malicious content could be scanned and checked before it reaches the agent. maybe with a proxy between the sandbox and the internet that logs and inspects every HTTP request the agent makes, even allowed ones.
or as codex suggests:

The most impactful next step: swap your Docker runtime to gVisor (docker run --runtime=runsc ...). It’s a one-flag change to your executor.go and gives you a whole new layer to explore.

others

Why would we consider microVMs like Firecracker?

Containers isolate at the process level, while Firecracker VMs isolate at the kernel level. For highly untrusted code, kernel-level isolation is much safer.

because containers still share the kernel with the host machine, malicious code can exploit kernel bugs, misconfigured permissions, or bugs in the container runtime itself. if code running in the container breaks out through any of these, it is now running directly on the host machine. so running completely untrusted code in Docker alone can still be considered risky.

each VM has its own kernel, memory space and virtual devices. even if the sandbox gets compromised, the malicious code is still inside the VM. breaking out of a VM requires bypassing the hypervisor, which is much harder.

baggiiiie
