Sandbox Architecture
Design doc for platform-agnostic code execution sandboxing in Atlas.
Problem
Atlas needs isolated code execution for two purposes:
- Explore tool -- run shell commands (ls, cat, grep) against the semantic layer YAML files. Read-only, no network, no secrets needed.
- Python execution tool -- run agent-generated Python to analyze data retrieved via SQL. Needs a runtime (Python + pandas/numpy/matplotlib), must not have direct access to secrets.
The explore tool works on Vercel (Firecracker VM) and on Linux with nsjail. It fails on Railway because the platform runs shared-kernel containers that block clone() with namespace flags -- no CAP_SYS_ADMIN, no unprivileged user namespaces. This is a fundamental platform limitation, not a configuration issue.
Threat Model: Who Needs What
Not every deployment needs the same level of sandbox isolation. The right backend depends on your trust model:
Self-hosted / single-tenant
The agent and all its users are employees operating within the same trust boundary. In this model:
- Prompt injection is the main risk -- a crafted value in the database could influence the agent's behavior. But the agent's tools are already scoped: executeSQL is SELECT-only, and explore only reads YAML files.
- nsjail or the sidecar is plenty -- you're defending against accidental damage, not hostile tenants.
- just-bash is acceptable -- if you run Atlas behind a VPN with API key auth.
Multi-tenant SaaS / public-facing
Now you have real trust boundaries. User A should not be able to influence User B's queries or data. In this model:
- Sandbox isolation is critical -- generated code must run in its own security context.
- Firecracker (Vercel Sandbox, E2B) is the right answer -- hardware-level VM isolation, ephemeral per execution.
Security Model
Four Actors
| Actor | Atlas equivalent | Trust level |
|---|---|---|
| Agent harness | Hono API + streamText loop | Trusted (deployed via SDLC) |
| Agent secrets | ATLAS_DATASOURCE_URL, API keys, DATABASE_URL | Must never enter sandbox |
| Generated code | explore commands, executePython code | Untrusted (prompt-injectable) |
| Filesystem / environment | Host OS, semantic/ directory | Protected from generated code |
Architecture
+---------------------------------------------------+
|  Agent Harness (Hono API server)                  |
|  +----------+  +-----------+  +----------------+  |
|  | explore  |  |executeSQL |  |executePython   |  |
|  | (sandbox)|  | (in-proc) |  | (sandbox)      |  |
|  +----+-----+  +-----------+  +-------+--------+  |
|       |                               |           |
|       | no network                    | no network|
|       | no secrets                    | no secrets|
|       | read-only fs                  | data via  |
|       |                               | stdin only|
|                                                   |
|  Secrets: ATLAS_DATASOURCE_URL,                   |
|           API keys, DATABASE_URL                  |
|           (never enter any sandbox)               |
+---------------------------------------------------+
Sandbox Backends
Built-in Backends
The five built-in backends, in priority order (see Backend Selection Priority for the full table including plugin backends):
Priority 1: Vercel Sandbox -- Firecracker VM, deny-all network
Priority 2: nsjail (explicit) -- ATLAS_SANDBOX=nsjail, hard-fail if unavailable
Priority 3: Sidecar service -- HTTP-isolated container with no secrets (Railway)
Priority 4: nsjail (auto) -- nsjail found on PATH, graceful fallback on failure
Priority 5: just-bash -- JS-level OverlayFs (in-memory writes), path-traversal protection
Sandbox Plugins
Two additional sandbox backends are available as plugins:
| Plugin | Priority | Isolation | Install |
|---|---|---|---|
| E2B | 90 | Firecracker microVM (managed) | bun add e2b |
| Daytona | 85 | Cloud-hosted ephemeral sandbox | bun add @daytonaio/sdk |
Plugin backends are tried before any built-in backend (they sit at the top of the priority chain), except when ATLAS_SANDBOX=nsjail is set, in which case they are skipped. The priority field only determines ordering among multiple plugins -- higher values are tried first. Plugins default to priority 60 (SANDBOX_DEFAULT_PRIORITY) if not specified. If all plugins fail to create a backend, the built-in chain (Vercel > nsjail > sidecar > just-bash) takes over.
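The ordering rule among plugins can be sketched as follows. This is a minimal illustration, not Atlas's implementation: the SandboxPlugin shape and the orderPlugins helper are hypothetical, and only the default of 60 and the "higher values first" rule come from the text above.

```typescript
// Hypothetical plugin shape for illustration; the real plugin interface
// is not shown in this document.
interface SandboxPlugin {
  name: string;
  priority?: number; // higher values are tried first
}

// Default named in the text above.
const SANDBOX_DEFAULT_PRIORITY = 60;

// Return a copy of the plugins in the order they would be tried:
// highest priority first, with missing priorities treated as 60.
function orderPlugins(plugins: SandboxPlugin[]): SandboxPlugin[] {
  return [...plugins].sort(
    (a, b) =>
      (b.priority ?? SANDBOX_DEFAULT_PRIORITY) -
      (a.priority ?? SANDBOX_DEFAULT_PRIORITY),
  );
}

const orderedPlugins = orderPlugins([
  { name: "custom" }, // no priority given, so it is treated as 60
  { name: "daytona", priority: 85 },
  { name: "e2b", priority: 90 },
]);
console.log(orderedPlugins.map((p) => p.name).join(" > "));
```

With the priorities from the table, E2B (90) would be tried before Daytona (85), and a custom plugin without an explicit priority would come last.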
// atlas.config.ts
import { defineConfig } from "@atlas/api/lib/config";
import { e2bSandboxPlugin } from "@atlas/plugin-e2b-sandbox";
export default defineConfig({
plugins: [
e2bSandboxPlugin({ apiKey: process.env.E2B_API_KEY! }),
],
});
What Each Backend Supports
| Capability | Vercel Sandbox | E2B / Daytona | nsjail | Sidecar | just-bash |
|---|---|---|---|---|---|
| explore (shell) | Yes | Yes | Yes | Yes | Yes |
| executePython | Yes | Yes | Yes | Yes | No |
| VM-level isolation | Yes | Yes | No | No | No |
| Kernel namespace isolation | N/A | N/A | Yes | No | No |
| No secrets in sandbox | Yes | Yes | Yes | Yes | No |
| No network | Yes | Yes | Yes | Yes | No |
Note on just-bash: The just-bash backend uses the just-bash npm package and shares the same process as the API server. It uses just-bash's OverlayFs class -- a JavaScript-level virtual filesystem overlay that intercepts writes in memory -- not Linux kernel OverlayFS. Because it runs in-process, host secrets (environment variables like ATLAS_DATASOURCE_URL and API keys) are accessible in the same memory space. The table above marks secret isolation as "No" for this reason. For multi-tenant deployments, use a higher-priority backend. Python execution is not supported under just-bash; executePython requires a sandbox backend (sidecar, Vercel sandbox, or nsjail) and returns an error without one.
Platform Capabilities
| Platform | nsjail | Sidecar | Best backend |
|---|---|---|---|
| Vercel | N/A | N/A | Vercel Sandbox (priority 1) |
| Railway | No | Required* | Sidecar (priority 3) |
| Self-hosted Docker | With capabilities | Optional | nsjail (priority 2) |
| Self-hosted VM | Yes | Optional | nsjail (priority 2) |
*On Railway, nsjail is unavailable (no CAP_SYS_ADMIN / user namespaces). Without the sidecar, the explore tool falls back to just-bash (no secret isolation, no Python support). The sidecar is effectively required for production isolation on Railway.
Sidecar Service Design
Since no kernel-level sandbox works on Railway, isolation comes from process/network separation -- a separate service with its own filesystem and no access to the main service's secrets.
+----------------------------------+     +-----------------------------+
|  Main Service (Hono API)         |     |  Sandbox Sidecar            |
|                                  |     |                             |
|  ENV:                            |     |  ENV:                       |
|    ATLAS_DATASOURCE_URL=...      |     |    SIDECAR_AUTH_TOKEN=...   |
|    ANTHROPIC_API_KEY=...         |     |    (no DB creds, no API     |
|    DATABASE_URL=...              |     |     keys, no secrets)       |
|                                  |     |                             |
|  Agent loop calls:               |     |  FILES:                     |
|    POST sidecar:8080/exec        |---->|    /semantic/**/*.yml       |
|    POST sidecar:8080/exec-python |---->|                             |
|                                  |     |  ENDPOINTS:                 |
|  Receives:                       |     |    GET /health              |
|    { stdout, stderr, exitCode }  |<----|    POST /exec               |
|    or PythonResult               |<----|    POST /exec-python        |
+----------------------------------+     +-----------------------------+
                        Railway private network
The sidecar enforces a concurrency limit of 10 (MAX_CONCURRENT = 10) across both /exec and /exec-python endpoints. Requests beyond this limit receive HTTP 429.
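One way such a shared limit could work is a simple in-process counter, sketched below. Only MAX_CONCURRENT and the 429 status come from the text above. The createGate and withSlot helpers are hypothetical names, and the sidecar's actual mechanism is not described here.

```typescript
// Value taken from the text above.
const MAX_CONCURRENT = 10;

interface Gate {
  tryAcquire(): boolean; // true if a slot was taken
  release(): void;       // give the slot back
}

// Hypothetical helper: a counter-based gate shared by all handlers.
function createGate(limit: number): Gate {
  let active = 0;
  return {
    tryAcquire(): boolean {
      if (active >= limit) return false;
      active += 1;
      return true;
    },
    release(): void {
      if (active > 0) active -= 1;
    },
  };
}

type GateResult<T> = { status: 200; body: T } | { status: 429 };

// Run `work` if a slot is free; otherwise report 429 without running it.
// The slot is released even if `work` throws.
async function withSlot<T>(gate: Gate, work: () => Promise<T>): Promise<GateResult<T>> {
  if (!gate.tryAcquire()) return { status: 429 };
  try {
    return { status: 200, body: await work() };
  } finally {
    gate.release();
  }
}
```

Sharing one gate between /exec and /exec-python handlers would give the "across both endpoints" behavior: ten in-flight shell commands would leave no room for a Python request until one of them finishes.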
Backend Selection Priority
At startup, the explore tool selects the highest-priority backend available. The selection is evaluated top-to-bottom — the first match wins:
| Priority | Backend | Condition |
|---|---|---|
| 0 | Sandbox plugin | A sandbox plugin is registered via atlas.config.ts (skipped when ATLAS_SANDBOX=nsjail). Explore tool only — executePython does not check plugin backends |
| 1 | Vercel Sandbox | ATLAS_RUNTIME=vercel or VERCEL env var is present |
| 2 | nsjail (explicit) | ATLAS_SANDBOX=nsjail is set. Hard-fails if the nsjail binary is not found — no fallback |
| 3 | Sidecar | ATLAS_SANDBOX_URL is set. Skips nsjail auto-detection entirely |
| 4 | nsjail (auto-detect) | nsjail binary found on PATH (no explicit config needed). Falls back gracefully on failure |
| 5 | just-bash | Fallback. JS-level OverlayFs + path-traversal protection only |
Key behaviors:
- Explicit nsjail is strict. Setting ATLAS_SANDBOX=nsjail means "nsjail or nothing" -- if the binary is missing or initialization fails, the explore tool returns an error rather than falling back. Plugin-provided sandbox backends are skipped entirely when this flag is set.
- Sidecar skips nsjail auto-detection. When ATLAS_SANDBOX_URL is set, nsjail auto-detection is completely skipped. This avoids noisy namespace warnings on platforms like Railway where clone() with namespace flags is blocked.
- Auto-detected nsjail is graceful. If nsjail is found on PATH but fails to initialize (e.g. missing kernel capabilities), the backend falls back to just-bash with a warning.
- Plugin backends take top priority. Sandbox plugins (E2B, Daytona, custom) are tried first and sorted by their priority field (highest wins). If all plugins fail, the built-in chain continues.
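The table and behaviors above can be read as a single decision function. The sketch below is an illustration of that reading, not Atlas's code: selectExploreBackend and its input shape are hypothetical, and it models only the static conditions (it does not model a plugin or auto-detected nsjail failing at initialization and falling through).

```typescript
type ExploreBackend = "plugin" | "vercel" | "nsjail" | "sidecar" | "just-bash";

// Hypothetical input shape for illustration.
interface SelectionInput {
  env: Record<string, string | undefined>;
  hasSandboxPlugin: boolean; // a sandbox plugin is registered in atlas.config.ts
  nsjailOnPath: boolean;     // the nsjail binary was found
}

function selectExploreBackend(input: SelectionInput): ExploreBackend {
  const { env, hasSandboxPlugin, nsjailOnPath } = input;
  const explicitNsjail = env.ATLAS_SANDBOX === "nsjail";

  // Priority 0: plugins, skipped when nsjail is forced.
  if (hasSandboxPlugin && !explicitNsjail) return "plugin";
  // Priority 1: Vercel Sandbox.
  if (env.ATLAS_RUNTIME === "vercel" || env.VERCEL !== undefined) return "vercel";
  // Priority 2: explicit nsjail is "nsjail or nothing".
  if (explicitNsjail) {
    if (!nsjailOnPath) throw new Error("ATLAS_SANDBOX=nsjail is set but nsjail was not found");
    return "nsjail";
  }
  // Priority 3: the sidecar, which also skips nsjail auto-detection.
  if (env.ATLAS_SANDBOX_URL) return "sidecar";
  // Priority 4: auto-detected nsjail.
  if (nsjailOnPath) return "nsjail";
  // Priority 5: fallback.
  return "just-bash";
}

// A Railway-like setup: the sidecar URL is set, so nsjail is never probed.
console.log(
  selectExploreBackend({
    env: { ATLAS_SANDBOX_URL: "http://sidecar:8080" },
    hasSandboxPlugin: false,
    nsjailOnPath: true,
  }),
);
```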
The health endpoint (GET /api/health) reports which backend is active in the explore.backend field.
executePython uses a different priority order. See Python Execution for the full priority chain. Key difference: sidecar is priority 1, plugin backends are skipped, and just-bash is not a fallback.
Configuration
| Variable | Default | Description |
|---|---|---|
| ATLAS_SANDBOX | auto-detect | Force sandbox backend: nsjail |
| ATLAS_SANDBOX_URL | -- | Sidecar service URL (enables sidecar backend) |
| SIDECAR_AUTH_TOKEN | -- | Shared secret for sidecar auth |
| ATLAS_NSJAIL_PATH | -- | Explicit path to nsjail binary |
| ATLAS_NSJAIL_TIME_LIMIT | 10 | nsjail per-command time limit in seconds |
| ATLAS_NSJAIL_MEMORY_LIMIT | 256 | nsjail per-command memory limit in MB (rlimit_as) |
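As an illustration of how the two numeric variables and their defaults might be read, here is a small sketch. The helper names are hypothetical, and falling back to the default on a missing or invalid value is an assumption; the document does not say how Atlas treats malformed input.

```typescript
interface NsjailLimits {
  timeLimitSec: number;  // from ATLAS_NSJAIL_TIME_LIMIT
  memoryLimitMb: number; // from ATLAS_NSJAIL_MEMORY_LIMIT
}

// Hypothetical helper: parse a positive integer, or use the fallback.
// Assumption: invalid or non-positive values fall back to the default.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  const n = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(n) && n > 0 ? n : fallback;
}

// Defaults (10 seconds, 256 MB) are taken from the table above.
function readNsjailLimits(env: Record<string, string | undefined>): NsjailLimits {
  return {
    timeLimitSec: parsePositiveInt(env.ATLAS_NSJAIL_TIME_LIMIT, 10),
    memoryLimitMb: parsePositiveInt(env.ATLAS_NSJAIL_MEMORY_LIMIT, 256),
  };
}

console.log(readNsjailLimits({ ATLAS_NSJAIL_TIME_LIMIT: "20" }));
```

Taking the environment as a plain record rather than reading a global keeps the function easy to test.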
nsjail Resource Limits
In addition to the configurable time limit above, the following resource limits apply to each command. All are hard-coded except rlimit_as, which follows ATLAS_NSJAIL_MEMORY_LIMIT:
| Limit | Value | Description |
|---|---|---|
| rlimit_as | ATLAS_NSJAIL_MEMORY_LIMIT (default 256 MB) | Virtual memory limit |
| rlimit_fsize | 10 MB | Max file size a process can create |
| rlimit_nproc | 5 | Max number of processes |
| rlimit_nofile | 64 | Max open file descriptors |
| Stdout/stderr cap | 1 MB (MAX_OUTPUT) | Output read from stdout and stderr is truncated at 1 MB |
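For orientation, the limits in the table map onto nsjail command-line options of the same names (--time_limit, --rlimit_as, --rlimit_fsize, --rlimit_nproc, --rlimit_nofile). The sketch below builds such an argument list. Whether Atlas passes these as flags or through an nsjail config file is not stated here, so treat buildNsjailLimitArgs as a hypothetical helper.

```typescript
// Hypothetical helper: translate the documented limits into nsjail CLI flags.
// nsjail takes --rlimit_as and --rlimit_fsize in MB and --time_limit in seconds.
function buildNsjailLimitArgs(timeLimitSec = 10, memoryLimitMb = 256): string[] {
  return [
    "--time_limit", String(timeLimitSec), // ATLAS_NSJAIL_TIME_LIMIT
    "--rlimit_as", String(memoryLimitMb), // ATLAS_NSJAIL_MEMORY_LIMIT
    "--rlimit_fsize", "10",               // max file size, MB
    "--rlimit_nproc", "5",                // max processes
    "--rlimit_nofile", "64",              // max open file descriptors
  ];
}

console.log(buildNsjailLimitArgs().join(" "));
```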
Sidecar Timeouts
Shell commands and Python execution have separate timeout configurations:
Shell commands (/exec): The sidecar enforces a 10-second command timeout (DEFAULT_TIMEOUT_MS = 10_000). The HTTP fetch uses a total abort signal of 15 seconds (10s execution + 5s HTTP overhead) to account for network latency and response serialization. Shell commands that exceed the timeout return exit code 124 (matching GNU timeout(1) convention).
Python execution (/exec-python): The default timeout is 30 seconds (PYTHON_DEFAULT_TIMEOUT_MS = 30_000), configurable via the ATLAS_PYTHON_TIMEOUT environment variable. The sidecar clamps the value to a maximum of 120 seconds (PYTHON_MAX_TIMEOUT_MS). The HTTP fetch adds 10 seconds of overhead on top of the execution timeout, giving a total abort signal of 40 seconds at the default (or up to 130s at the maximum).
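The timeout arithmetic above can be checked with a short sketch. The three constants DEFAULT_TIMEOUT_MS, PYTHON_DEFAULT_TIMEOUT_MS, and PYTHON_MAX_TIMEOUT_MS are named in the text; the overhead constant names and both functions are hypothetical. The sketch takes the requested timeout in milliseconds and does not parse ATLAS_PYTHON_TIMEOUT, since its unit is not stated here.

```typescript
// Named in the text above.
const DEFAULT_TIMEOUT_MS = 10_000;
const PYTHON_DEFAULT_TIMEOUT_MS = 30_000;
const PYTHON_MAX_TIMEOUT_MS = 120_000;
// Hypothetical names for the HTTP overhead allowances described above.
const SHELL_HTTP_OVERHEAD_MS = 5_000;
const PYTHON_HTTP_OVERHEAD_MS = 10_000;

// Total abort budget for a POST /exec call.
function shellAbortMs(): number {
  return DEFAULT_TIMEOUT_MS + SHELL_HTTP_OVERHEAD_MS;
}

// Execution timeout (clamped to the maximum) and total abort budget
// for a POST /exec-python call.
function pythonTimeouts(requestedMs?: number): { executionMs: number; abortMs: number } {
  const executionMs = Math.min(requestedMs ?? PYTHON_DEFAULT_TIMEOUT_MS, PYTHON_MAX_TIMEOUT_MS);
  return { executionMs, abortMs: executionMs + PYTHON_HTTP_OVERHEAD_MS };
}

console.log(shellAbortMs());          // 15000
console.log(pythonTimeouts());        // { executionMs: 30000, abortMs: 40000 }
console.log(pythonTimeouts(300_000)); // { executionMs: 120000, abortMs: 130000 }
```

A budget like abortMs could be passed to AbortSignal.timeout() on the fetch call, so the HTTP request is abandoned only after the sidecar has had its full execution window.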
Python Execution (executePython)
The executePython tool runs agent-generated Python code for data analysis and visualization. It uses a different backend selection priority than the explore tool:
Priority 1: Sidecar (ATLAS_SANDBOX_URL) -- POST /exec-python
Priority 2: Vercel Sandbox -- Python 3.13 Firecracker microVM
Priority 3: nsjail (explicit, ATLAS_SANDBOX=nsjail) -- hard-fail if unavailable
Priority 4: nsjail (auto-detect, on PATH) -- graceful fallback
Priority 5: No backend (error) -- just-bash is NOT a fallback
Key differences from the explore tool's priority chain:
- Sidecar is priority 1 (not priority 3). When ATLAS_SANDBOX_URL is set, Python uses the sidecar immediately without checking for Vercel sandbox first.
- Plugin sandbox backends are not checked. Python only uses built-in backends.
- just-bash is not a fallback. If no sandbox backend is available, executePython returns an error rather than running Python in the host process.
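The Python chain can be sketched the same way as a decision function. This is an illustration only: selectPythonBackend is a hypothetical name, the Vercel detection condition is assumed to match the explore tool's (ATLAS_RUNTIME=vercel or VERCEL present), and runtime initialization failures are not modeled.

```typescript
type PythonBackend = "sidecar" | "vercel" | "nsjail";

function selectPythonBackend(
  env: Record<string, string | undefined>,
  nsjailOnPath: boolean,
): PythonBackend {
  // Priority 1: the sidecar wins as soon as its URL is configured.
  if (env.ATLAS_SANDBOX_URL) return "sidecar";
  // Priority 2: Vercel Sandbox (detection condition is an assumption).
  if (env.ATLAS_RUNTIME === "vercel" || env.VERCEL !== undefined) return "vercel";
  // Priority 3: explicit nsjail hard-fails if the binary is unavailable.
  if (env.ATLAS_SANDBOX === "nsjail") {
    if (!nsjailOnPath) throw new Error("ATLAS_SANDBOX=nsjail is set but nsjail was not found");
    return "nsjail";
  }
  // Priority 4: auto-detected nsjail.
  if (nsjailOnPath) return "nsjail";
  // Priority 5: no backend. just-bash is never used for Python.
  throw new Error("executePython: no sandbox backend available");
}

// The sidecar is chosen even when the Vercel env var is also present.
console.log(selectPythonBackend({ ATLAS_SANDBOX_URL: "http://sidecar:8080", VERCEL: "1" }, false));
```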
Python execution has two layers of defense: an AST-based import guard (defense-in-depth, runs before execution) and the sandbox backend itself (the actual security boundary). The import guard blocks dangerous modules (subprocess, os, socket, etc.) and builtins (exec, eval, open, __import__, etc.). If python3 is not available locally for AST validation, the guard is skipped -- the sandbox enforces isolation regardless.
Data from a previous SQL query is injected into the sandbox as a pandas DataFrame (df) or raw dict (data). Results are returned as structured output: tables (_atlas_table), interactive Recharts charts (_atlas_chart), or PNG files (matplotlib via chart_path()).