Sandbox Architecture

Design doc for platform-agnostic code execution sandboxing in Atlas.

Problem

Atlas needs isolated code execution for two purposes:

Explore tool -- run shell commands (ls, cat, grep) against the semantic layer YAML files. Read-only, no network, no secrets needed.
Python execution tool -- run agent-generated Python to analyze data retrieved via SQL. Needs a runtime (Python + pandas/numpy/matplotlib), must not have direct access to secrets.

The explore tool works on Vercel (Firecracker VM) and on Linux with nsjail. It fails on Railway because the platform runs shared-kernel containers that block clone() with namespace flags -- no CAP_SYS_ADMIN, no unprivileged user namespaces. This is a fundamental platform limitation, not a configuration issue.

Threat Model: Who Needs What

Not every deployment needs the same level of sandbox isolation. The right backend depends on your trust model:

Self-hosted / single-tenant

The agent and all its users are employees operating within the same trust boundary. In this model:

Prompt injection is the main risk -- a crafted value in the database could influence the agent's behavior. But the agent's tools are already scoped: executeSQL is SELECT-only, and explore only reads YAML files.
nsjail or the sidecar is plenty -- you're defending against accidental damage, not hostile tenants.
just-bash is acceptable -- if you run Atlas behind a VPN with API key auth.

Multi-tenant SaaS / public-facing

Now you have real trust boundaries. User A should not be able to influence User B's queries or data. In this model:

Sandbox isolation is critical -- generated code must run in its own security context.
Firecracker (Vercel Sandbox, E2B) is the right answer -- hardware-level VM isolation, ephemeral per execution.

Security Model

Four Actors

Actor	Atlas equivalent	Trust level
Agent harness	Hono API + `streamText` loop	Trusted (deployed via SDLC)
Agent secrets	`ATLAS_DATASOURCE_URL`, API keys, `DATABASE_URL`	Must never enter sandbox
Generated code	`explore` commands, `executePython` code	Untrusted (prompt-injectable)
Filesystem / environment	Host OS, `semantic/` directory	Protected from generated code

Architecture

+---------------------------------------------------+
|  Agent Harness (Hono API server)                  |
|  +----------+  +-----------+  +----------------+  |
|  | explore  |  |executeSQL |  |executePython   |  |
|  | (sandbox)|  | (in-proc) |  | (sandbox)      |  |
|  +----+-----+  +-----------+  +-------+--------+  |
|       |                               |           |
|       | no network                    | no network|
|       | no secrets                    | no secrets|
|       | read-only fs                  | data via  |
|       |                               | stdin only|
|                                                   |
|  Secrets: ATLAS_DATASOURCE_URL,                   |
|  API keys, DATABASE_URL                           |
|  (never enter any sandbox)                        |
+---------------------------------------------------+
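
One way to honor the "secrets never enter any sandbox" rule when spawning a sandboxed process is to build the child environment from an allowlist instead of deleting known secrets. The sketch below is illustrative only: the allowlist contents and the `buildSandboxEnv` name are assumptions, not Atlas's actual implementation.

```typescript
// Hypothetical allowlist: only variables a sandboxed shell or Python process needs.
const SANDBOX_ENV_ALLOWLIST: readonly string[] = ["PATH", "LANG", "HOME"];

type HostEnv = Record<string, string | undefined>;

/**
 * Build the environment passed to sandboxed code.
 * Starting from an empty object means a newly added secret can never leak by omission.
 */
function buildSandboxEnv(hostEnv: HostEnv): Record<string, string> {
  const childEnv: Record<string, string> = {};
  for (const name of SANDBOX_ENV_ALLOWLIST) {
    const value = hostEnv[name];
    if (value !== undefined) {
      childEnv[name] = value;
    }
  }
  return childEnv;
}
```

An allowlist fails closed: if a new secret such as another API key is added to the harness later, it stays out of the sandbox without any code change.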

Sandbox Backends

Built-in Backends

The five built-in backends, in priority order (see Backend Selection Priority for the full table including plugin backends):

Priority 1: Vercel Sandbox     -- Firecracker VM, deny-all network
Priority 2: nsjail (explicit)  -- ATLAS_SANDBOX=nsjail, hard-fail if unavailable
Priority 3: Sidecar service    -- HTTP-isolated container with no secrets (Railway)
Priority 4: nsjail (auto)      -- nsjail found on PATH, graceful fallback on failure
Priority 5: just-bash          -- JS-level OverlayFs (in-memory writes), path-traversal protection

Sandbox Plugins

Two additional sandbox backends are available as plugins:

Plugin	Priority	Isolation	Install
E2B	90	Firecracker microVM (managed)	`bun add e2b`
Daytona	85	Cloud-hosted ephemeral sandbox	`bun add @daytonaio/sdk`

For the explore tool, plugin backends are tried before any built-in backend unless ATLAS_SANDBOX=nsjail is set, in which case they are skipped. The priority field only determines ordering among multiple plugins -- higher values are tried first. Plugins default to priority 60 (SANDBOX_DEFAULT_PRIORITY) if not specified. If all plugins fail to create a backend, the built-in chain (Vercel Sandbox > explicit nsjail > sidecar > auto-detected nsjail > just-bash) takes over.

// atlas.config.ts
import { defineConfig } from "@atlas/api/lib/config";
import { e2bSandboxPlugin } from "@atlas/plugin-e2b-sandbox";

export default defineConfig({
  plugins: [
    e2bSandboxPlugin({ apiKey: process.env.E2B_API_KEY! }),
  ],
});
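
The plugin ordering rule described above (higher priority first, default of 60 when unspecified) can be sketched as a small sort. The `SandboxPluginInfo` shape and the `orderSandboxPlugins` name are hypothetical; only the `SANDBOX_DEFAULT_PRIORITY` value comes from the text.

```typescript
// Default taken from the text above; the plugin shape below is an assumption.
const SANDBOX_DEFAULT_PRIORITY = 60;

interface SandboxPluginInfo {
  name: string;
  priority?: number; // optional: falls back to SANDBOX_DEFAULT_PRIORITY
}

/** Return a new array with plugins ordered from highest to lowest priority. */
function orderSandboxPlugins<T extends SandboxPluginInfo>(plugins: readonly T[]): T[] {
  const effective = (p: SandboxPluginInfo): number => p.priority ?? SANDBOX_DEFAULT_PRIORITY;
  // Copy first so the caller's array is not mutated; sort is stable for equal priorities.
  return [...plugins].sort((a, b) => effective(b) - effective(a));
}
```

With E2B at 90, Daytona at 85, and a custom plugin that omits `priority`, the resulting order is E2B, Daytona, then the custom plugin.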

What Each Backend Supports

Capability	Vercel Sandbox	E2B / Daytona	nsjail	Sidecar	just-bash
`explore` (shell)	Yes	Yes	Yes	Yes	Yes
`executePython`	Yes	Yes	Yes	Yes	No
VM-level isolation	Yes	Yes	No	No	No
Kernel namespace isolation	N/A	N/A	Yes	No	No
No secrets in sandbox	Yes	Yes	Yes	Yes	No
No network	Yes	Yes	Yes	Yes	No

Note on just-bash: The just-bash backend uses the just-bash npm package and shares the same process as the API server. It uses just-bash's OverlayFs class -- a JavaScript-level virtual filesystem overlay that intercepts writes in memory -- not Linux kernel OverlayFS. Because it runs in-process, host secrets (environment variables like ATLAS_DATASOURCE_URL and API keys) are accessible in the same memory space. The table above marks secret isolation as "No" for this reason. For multi-tenant deployments, use a higher-priority backend. Python execution is not supported under just-bash; executePython requires a sandbox backend (sidecar, Vercel sandbox, or nsjail) and returns an error without one.

Platform Capabilities

Platform	nsjail	Sidecar	Best backend
Vercel	N/A	N/A	Vercel Sandbox (priority 1)
Railway	No	Required*	Sidecar (priority 3)
Self-hosted Docker	With capabilities	Optional	nsjail (priority 2)
Self-hosted VM	Yes	Optional	nsjail (priority 2)

*On Railway, nsjail is unavailable (no CAP_SYS_ADMIN / user namespaces). Without the sidecar, the explore tool falls back to just-bash (no secret isolation, no Python support). The sidecar is effectively required for production isolation on Railway.

Sidecar Service Design

Since no kernel-level sandbox works on Railway, isolation comes from process/network separation -- a separate service with its own filesystem and no access to the main service's secrets.

+----------------------------------+     +-----------------------------+
|  Main Service (Hono API)         |     |  Sandbox Sidecar            |
|                                  |     |                             |
|  ENV:                            |     |  ENV:                       |
|    ATLAS_DATASOURCE_URL=...      |     |    SIDECAR_AUTH_TOKEN=...   |
|    ANTHROPIC_API_KEY=...         |     |    (no DB creds, no API     |
|    DATABASE_URL=...              |     |     keys, no secrets)       |
|                                  |     |                             |
|  Agent loop calls:               |     |  FILES:                     |
|    POST sidecar:8080/exec        |---->|    /semantic/**/*.yml       |
|    POST sidecar:8080/exec-python |---->|                             |
|                                  |     |  ENDPOINTS:                 |
|  Receives:                       |     |    GET  /health             |
|    { stdout, stderr, exitCode }  |<----|    POST /exec               |
|    or PythonResult               |<----|    POST /exec-python        |
+----------------------------------+     +-----------------------------+
        Railway private network

The sidecar enforces a concurrency limit of 10 (MAX_CONCURRENT = 10) across both /exec and /exec-python endpoints. Requests beyond this limit receive HTTP 429.
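
A shared limit across both endpoints can be modeled as a single counter that every request must pass through. This is a minimal sketch of that idea, not the sidecar's actual code; `ConcurrencyGate` and `handleWithLimit` are hypothetical names, and only `MAX_CONCURRENT = 10` and the 429 status come from the text.

```typescript
const MAX_CONCURRENT = 10; // value stated in the text above

class ConcurrencyGate {
  private active = 0;
  private readonly limit: number;

  constructor(limit: number = MAX_CONCURRENT) {
    this.limit = limit;
  }

  /** Returns a release function, or null when the limit is already reached. */
  tryAcquire(): (() => void) | null {
    if (this.active >= this.limit) return null;
    this.active += 1;
    let released = false;
    return () => {
      if (released) return; // ignore a second release of the same slot
      released = true;
      this.active -= 1;
    };
  }
}

/** Wrap a handler so requests over the limit get HTTP 429 instead of queueing. */
async function handleWithLimit<T>(
  gate: ConcurrencyGate,
  run: () => Promise<T>,
): Promise<{ status: number; body: T | { error: string } }> {
  const release = gate.tryAcquire();
  if (release === null) {
    return { status: 429, body: { error: "Too many concurrent executions" } };
  }
  try {
    return { status: 200, body: await run() };
  } finally {
    release(); // free the slot even if the command throws
  }
}
```

Using one gate instance for both /exec and /exec-python is what makes the limit apply across endpoints instead of per endpoint.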

Backend Selection Priority

At startup, the explore tool selects the highest-priority backend that is available. The table is evaluated top to bottom, and the first match wins:

Priority	Backend	Condition
0	Sandbox plugin	A sandbox plugin is registered via `atlas.config.ts` (skipped when `ATLAS_SANDBOX=nsjail`). Explore tool only — `executePython` does not check plugin backends
1	Vercel Sandbox	`ATLAS_RUNTIME=vercel` or `VERCEL` env var is present
2	nsjail (explicit)	`ATLAS_SANDBOX=nsjail` is set. Hard-fails if the nsjail binary is not found — no fallback
3	Sidecar	`ATLAS_SANDBOX_URL` is set. Skips nsjail auto-detection entirely
4	nsjail (auto-detect)	nsjail binary found on `PATH` (no explicit config needed). Falls back gracefully on failure
5	just-bash	Fallback. JS-level `OverlayFs` + path-traversal protection only

Key behaviors:

Explicit nsjail is strict. Setting ATLAS_SANDBOX=nsjail means "nsjail or nothing": if the binary is missing or initialization fails, the explore tool returns an error rather than falling back. Plugin-provided sandbox backends are skipped entirely when this flag is set.
Sidecar skips nsjail auto-detection. When ATLAS_SANDBOX_URL is set, nsjail auto-detection is completely skipped. This avoids noisy namespace warnings on platforms like Railway where clone() with namespace flags is blocked.
Auto-detected nsjail is graceful. If nsjail is found on PATH but fails to initialize (e.g. missing kernel capabilities), the backend falls back to just-bash with a warning.
Plugin backends take top priority. Sandbox plugins (E2B, Daytona, custom) are tried first and sorted by their priority field (highest wins). If all plugins fail, the built-in chain continues.
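
The selection rules above can be read as a single first-match function. The following is a sketch under stated assumptions: the `ExploreProbe` inputs (whether a plugin produced a backend, whether nsjail is on PATH and initializes) are hypothetical stand-ins for Atlas's real detection logic, and the branch order simply follows the table.

```typescript
type ExploreBackend = "plugin" | "vercel-sandbox" | "nsjail" | "sidecar" | "just-bash";

// Hypothetical probe results; the real checks live inside Atlas.
interface ExploreProbe {
  env: Record<string, string | undefined>;
  pluginAvailable: boolean;   // a registered sandbox plugin created a backend
  nsjailOnPath: boolean;      // nsjail binary was found
  nsjailInitializes: boolean; // nsjail started successfully
}

function selectExploreBackend(p: ExploreProbe): ExploreBackend {
  const explicitNsjail = p.env["ATLAS_SANDBOX"] === "nsjail";
  const nsjailUsable = p.nsjailOnPath && p.nsjailInitializes;

  // Priority 0: plugins, skipped entirely when nsjail is forced.
  if (!explicitNsjail && p.pluginAvailable) return "plugin";
  // Priority 1: Vercel runtime.
  if (p.env["ATLAS_RUNTIME"] === "vercel" || p.env["VERCEL"] !== undefined) return "vercel-sandbox";
  // Priority 2: explicit nsjail hard-fails with no fallback.
  if (explicitNsjail) {
    if (nsjailUsable) return "nsjail";
    throw new Error("ATLAS_SANDBOX=nsjail is set but nsjail is unavailable");
  }
  // Priority 3: sidecar; nsjail auto-detection is not attempted after this point.
  if (p.env["ATLAS_SANDBOX_URL"]) return "sidecar";
  // Priority 4: auto-detected nsjail, falling through on failure.
  if (nsjailUsable) return "nsjail";
  // Priority 5: fallback.
  return "just-bash";
}
```

For example, a Railway deployment with `ATLAS_SANDBOX_URL` set resolves to the sidecar without ever probing nsjail, which matches the "skips nsjail auto-detection" behavior.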

The health endpoint (GET /api/health) reports which backend is active in the explore.backend field.

executePython uses a different priority order. See Python Execution for the full priority chain. Key differences: the sidecar is priority 1, plugin backends are skipped, and just-bash is not a fallback.

Configuration

Variable	Default	Description
`ATLAS_SANDBOX`	auto-detect	Force a specific sandbox backend. Documented value: `nsjail`
`ATLAS_SANDBOX_URL`	--	Sidecar service URL (enables sidecar backend)
`SIDECAR_AUTH_TOKEN`	--	Shared secret for sidecar auth
`ATLAS_NSJAIL_PATH`	--	Explicit path to nsjail binary
`ATLAS_NSJAIL_TIME_LIMIT`	`10`	nsjail per-command time limit in seconds
`ATLAS_NSJAIL_MEMORY_LIMIT`	`256`	nsjail per-command memory limit in MB (`rlimit_as`)

nsjail Resource Limits

nsjail enforces the following resource limits per command. Only rlimit_as is configurable (through ATLAS_NSJAIL_MEMORY_LIMIT, listed above); the remaining limits are hard-coded:

Limit	Value	Description
`rlimit_as`	`ATLAS_NSJAIL_MEMORY_LIMIT` (default 256 MB)	Virtual memory limit
`rlimit_fsize`	10 MB	Max file size a process can create
`rlimit_nproc`	5	Max number of processes
`rlimit_nofile`	64	Max open file descriptors
Stdout/stderr cap	1 MB (`MAX_OUTPUT`)	Output read from stdout and stderr is truncated at 1 MB
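
As an illustration of how these limits map onto nsjail's command-line flags, here is a sketch that builds the limit-related arguments from the environment. The flag names (`--time_limit`, `--rlimit_as`, `--rlimit_fsize`, `--rlimit_nproc`, `--rlimit_nofile`) are nsjail's own, with `--rlimit_as` and `--rlimit_fsize` expressed in MB. The `nsjailLimitArgs` helper and its fallback handling are assumptions; Atlas may pass these limits differently (for example through a config file).

```typescript
type LimitEnv = Record<string, string | undefined>;

/** Parse a positive integer, falling back to the default on missing or invalid input. */
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  const n = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(n) && n > 0 ? n : fallback;
}

/** Build the resource-limit portion of an nsjail argument list. */
function nsjailLimitArgs(env: LimitEnv): string[] {
  const timeLimitSec = parsePositiveInt(env["ATLAS_NSJAIL_TIME_LIMIT"], 10);
  const memoryMb = parsePositiveInt(env["ATLAS_NSJAIL_MEMORY_LIMIT"], 256);
  return [
    "--time_limit", String(timeLimitSec), // seconds
    "--rlimit_as", String(memoryMb),      // MB of virtual memory
    "--rlimit_fsize", "10",               // MB, hard-coded
    "--rlimit_nproc", "5",                // processes, hard-coded
    "--rlimit_nofile", "64",              // file descriptors, hard-coded
  ];
}
```

Invalid or non-positive values fall back to the documented defaults in this sketch, so a typo in the environment cannot silently remove a limit.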

Sidecar Timeouts

Shell commands and Python execution have separate timeout configurations:

Shell commands (/exec): The sidecar enforces a 10-second command timeout (DEFAULT_TIMEOUT_MS = 10_000). The HTTP fetch uses a total abort signal of 15 seconds (10s execution + 5s HTTP overhead) to account for network latency and response serialization. Shell commands that exceed the timeout return exit code 124 (matching GNU timeout(1) convention).

Python execution (/exec-python): The default timeout is 30 seconds (PYTHON_DEFAULT_TIMEOUT_MS = 30_000), configurable via the ATLAS_PYTHON_TIMEOUT environment variable. The sidecar clamps the value to a maximum of 120 seconds (PYTHON_MAX_TIMEOUT_MS). The HTTP fetch adds 10 seconds of overhead on top of the execution timeout, giving a total abort signal of 40 seconds at the default (or up to 130s at the maximum).
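
The timeout arithmetic above can be written out as follows. The `*_OVERHEAD_MS` constant names and both helper functions are hypothetical, and this sketch assumes `ATLAS_PYTHON_TIMEOUT` is given in milliseconds, which the text does not state.

```typescript
const DEFAULT_TIMEOUT_MS = 10_000;        // shell execution timeout (from the text)
const PYTHON_DEFAULT_TIMEOUT_MS = 30_000; // Python default (from the text)
const PYTHON_MAX_TIMEOUT_MS = 120_000;    // Python clamp (from the text)
const SHELL_HTTP_OVERHEAD_MS = 5_000;     // hypothetical name for the 5s margin
const PYTHON_HTTP_OVERHEAD_MS = 10_000;   // hypothetical name for the 10s margin

interface Timeouts {
  execMs: number;  // enforced by the sidecar on the command itself
  abortMs: number; // abort signal on the caller's HTTP fetch
}

function shellTimeouts(): Timeouts {
  return { execMs: DEFAULT_TIMEOUT_MS, abortMs: DEFAULT_TIMEOUT_MS + SHELL_HTTP_OVERHEAD_MS };
}

/** rawEnvValue is the ATLAS_PYTHON_TIMEOUT string, assumed here to be in milliseconds. */
function pythonTimeouts(rawEnvValue?: string): Timeouts {
  const parsed = Number(rawEnvValue);
  const requested =
    rawEnvValue !== undefined && Number.isFinite(parsed) && parsed > 0
      ? parsed
      : PYTHON_DEFAULT_TIMEOUT_MS;
  const execMs = Math.min(requested, PYTHON_MAX_TIMEOUT_MS); // clamp to 120s
  return { execMs, abortMs: execMs + PYTHON_HTTP_OVERHEAD_MS };
}
```

Keeping the fetch abort strictly longer than the execution timeout means a slow command is reported by the sidecar (for example as exit code 124 for shell commands) instead of surfacing as a generic network abort.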

Python Execution (`executePython`)

The executePython tool runs agent-generated Python code for data analysis and visualization. It uses a different backend selection priority than the explore tool:

Priority 1: Sidecar (ATLAS_SANDBOX_URL)      -- POST /exec-python
Priority 2: Vercel Sandbox                    -- Python 3.13 Firecracker microVM
Priority 3: nsjail (explicit, ATLAS_SANDBOX=nsjail) -- hard-fail if unavailable
Priority 4: nsjail (auto-detect, on PATH)     -- graceful fallback
Priority 5: No backend — error                -- just-bash is NOT a fallback

Key differences from the explore tool's priority chain:

Sidecar is priority 1 (not priority 3). When ATLAS_SANDBOX_URL is set, Python uses the sidecar immediately without checking for Vercel sandbox first.
Plugin sandbox backends are not checked. Python only uses built-in backends.
just-bash is not a fallback. If no sandbox backend is available, executePython returns an error rather than running Python in the host process.
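
For comparison with the explore tool, the Python chain can be sketched the same way. The `PythonProbe` inputs are hypothetical stand-ins for Atlas's real detection logic; the branch order follows the priority list above, and the final branch throws because just-bash is never used for Python.

```typescript
type PythonBackend = "sidecar" | "vercel-sandbox" | "nsjail";

// Hypothetical probe results; plugin backends are deliberately absent.
interface PythonProbe {
  env: Record<string, string | undefined>;
  nsjailUsable: boolean; // nsjail found and initialized successfully
}

function selectPythonBackend(p: PythonProbe): PythonBackend {
  // Priority 1: sidecar wins immediately, even on Vercel.
  if (p.env["ATLAS_SANDBOX_URL"]) return "sidecar";
  // Priority 2: Vercel Sandbox.
  if (p.env["ATLAS_RUNTIME"] === "vercel" || p.env["VERCEL"] !== undefined) return "vercel-sandbox";
  // Priority 3: explicit nsjail hard-fails.
  if (p.env["ATLAS_SANDBOX"] === "nsjail") {
    if (p.nsjailUsable) return "nsjail";
    throw new Error("ATLAS_SANDBOX=nsjail is set but nsjail is unavailable");
  }
  // Priority 4: auto-detected nsjail.
  if (p.nsjailUsable) return "nsjail";
  // Priority 5: no backend. Python never runs in the host process.
  throw new Error("No sandbox backend available for executePython");
}
```

The absence of a `"just-bash"` member in `PythonBackend` encodes the rule in the type: there is no value a caller could return that would run Python in-process.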

Python execution has two layers of defense: an AST-based import guard (defense-in-depth, runs before execution) and the sandbox backend itself (the actual security boundary). The import guard blocks dangerous modules (subprocess, os, socket, etc.) and builtins (exec, eval, open, __import__, etc.). If python3 is not available locally for AST validation, the guard is skipped -- the sandbox enforces isolation regardless.

Data from a previous SQL query is injected into the sandbox as a pandas DataFrame (df) or raw dict (data). Results are returned as structured output: tables (_atlas_table), interactive Recharts charts (_atlas_chart), or PNG files (matplotlib via chart_path()).
