Atlas

Sandbox Architecture

Platform-agnostic code execution sandboxing design for Atlas.

Problem

Atlas needs isolated code execution for two purposes:

  1. Explore tool -- run shell commands (ls, cat, grep) against the semantic layer YAML files. Read-only, no network, no secrets needed.

  2. Python execution tool -- run agent-generated Python to analyze data retrieved via SQL. Needs a runtime (Python + pandas/numpy/matplotlib), must not have direct access to secrets.

The explore tool works on Vercel (Firecracker VM) and on Linux with nsjail. It fails on Railway because the platform runs shared-kernel containers that block clone() with namespace flags -- no CAP_SYS_ADMIN, no unprivileged user namespaces. This is a fundamental platform limitation, not a configuration issue.

Threat Model: Who Needs What

Not every deployment needs the same level of sandbox isolation. The right backend depends on your trust model:

Self-hosted / single-tenant

The agent and all its users are employees operating within the same trust boundary. In this model:

  • Prompt injection is the main risk -- a crafted value in the database could influence the agent's behavior. But the agent's tools are already scoped: executeSQL is SELECT-only, and explore only reads YAML files.
  • nsjail or the sidecar is plenty -- you're defending against accidental damage, not hostile tenants.
  • just-bash is acceptable -- provided you run Atlas behind a VPN with API key auth.

Multi-tenant SaaS / public-facing

Now you have real trust boundaries. User A should not be able to influence User B's queries or data. In this model:

  • Sandbox isolation is critical -- generated code must run in its own security context.
  • Firecracker (Vercel Sandbox, E2B) is the right answer -- hardware-level VM isolation, ephemeral per execution.

Security Model

Four Actors

| Actor | Atlas equivalent | Trust level |
|---|---|---|
| Agent harness | Hono API + streamText loop | Trusted (deployed via SDLC) |
| Agent secrets | ATLAS_DATASOURCE_URL, API keys, DATABASE_URL | Must never enter sandbox |
| Generated code | explore commands, executePython code | Untrusted (prompt-injectable) |
| Filesystem / environment | Host OS, semantic/ directory | Protected from generated code |

Architecture

+---------------------------------------------------+
|  Agent Harness (Hono API server)                  |
|  +----------+  +-----------+  +----------------+  |
|  | explore  |  |executeSQL |  |executePython   |  |
|  | (sandbox)|  | (in-proc) |  | (sandbox)      |  |
|  +----+-----+  +-----------+  +-------+--------+  |
|       |                               |           |
|       | no network                    | no network|
|       | no secrets                    | no secrets|
|       | read-only fs                  | data via  |
|       |                               | stdin only|
|                                                   |
|  Secrets: ATLAS_DATASOURCE_URL,                   |
|  API keys, DATABASE_URL                           |
|  (never enter any sandbox)                        |
+---------------------------------------------------+

Sandbox Backends

Built-in Backends

The five built-in backends, in priority order (see Backend Selection Priority for the full table including plugin backends):

Priority 1: Vercel Sandbox     -- Firecracker VM, deny-all network
Priority 2: nsjail (explicit)  -- ATLAS_SANDBOX=nsjail, hard-fail if unavailable
Priority 3: Sidecar service    -- HTTP-isolated container with no secrets (Railway)
Priority 4: nsjail (auto)      -- nsjail found on PATH, graceful fallback on failure
Priority 5: just-bash          -- JS-level OverlayFs (in-memory writes), path-traversal protection

Sandbox Plugins

Two additional sandbox backends are available as plugins:

| Plugin | Priority | Isolation | Install |
|---|---|---|---|
| E2B | 90 | Firecracker microVM (managed) | bun add e2b |
| Daytona | 85 | Cloud-hosted ephemeral sandbox | bun add @daytonaio/sdk |

For the explore tool, plugin backends are tried before any built-in backend (they sit at the top of the priority chain), unless ATLAS_SANDBOX=nsjail is set, in which case plugins are skipped entirely. The priority field only determines ordering among multiple plugins -- higher values are tried first. Plugins default to priority 60 (SANDBOX_DEFAULT_PRIORITY) if not specified. If all plugins fail to create a backend, the built-in chain (Vercel > explicit nsjail > sidecar > auto-detected nsjail > just-bash) takes over.

// atlas.config.ts
import { defineConfig } from "@atlas/api/lib/config";
import { e2bSandboxPlugin } from "@atlas/plugin-e2b-sandbox";

export default defineConfig({
  plugins: [
    e2bSandboxPlugin({ apiKey: process.env.E2B_API_KEY! }),
  ],
});
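
The plugin ordering rule described above can be sketched as follows. This is an illustration only, not the Atlas implementation: the SandboxPluginInfo shape and the orderSandboxPlugins helper are hypothetical names. Only the default of 60 (SANDBOX_DEFAULT_PRIORITY) and the higher-values-first ordering come from this document.

```typescript
// Hypothetical shape: the real plugin interface is not shown in this document.
interface SandboxPluginInfo {
  name: string;
  priority?: number;
}

// Default priority for plugins that do not specify one (from the text above).
const SANDBOX_DEFAULT_PRIORITY = 60;

// Returns a new array sorted so that higher-priority plugins are tried first.
function orderSandboxPlugins(plugins: SandboxPluginInfo[]): SandboxPluginInfo[] {
  return [...plugins].sort(
    (a, b) =>
      (b.priority ?? SANDBOX_DEFAULT_PRIORITY) -
      (a.priority ?? SANDBOX_DEFAULT_PRIORITY),
  );
}

const ordered = orderSandboxPlugins([
  { name: "custom" }, // no priority given, so it is treated as 60
  { name: "daytona", priority: 85 },
  { name: "e2b", priority: 90 },
]);

console.log(ordered.map((p) => p.name).join(" > ")); // e2b > daytona > custom
```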

What Each Backend Supports

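| Capability | Vercel Sandbox | E2B / Daytona | nsjail | Sidecar | just-bash |
|---|---|---|---|---|---|
| explore (shell) | Yes | Yes | Yes | Yes | Yes |
| executePython | Yes | Yes | Yes | Yes | No |
| VM-level isolation | Yes | Yes | No | No | No |
| Kernel namespace isolation | N/A | N/A | Yes | No | No |
| No secrets in sandbox | Yes | Yes | Yes | Yes | No |
| No network | Yes | Yes | Yes | Yes | No |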

Note on just-bash: The just-bash backend uses the just-bash npm package and shares the same process as the API server. It uses just-bash's OverlayFs class -- a JavaScript-level virtual filesystem overlay that intercepts writes in memory -- not Linux kernel OverlayFS. Because it runs in-process, host secrets (environment variables like ATLAS_DATASOURCE_URL and API keys) are accessible in the same memory space. The table above marks secret isolation as "No" for this reason. For multi-tenant deployments, use a higher-priority backend. Python execution is not supported under just-bash; executePython requires a sandbox backend (sidecar, Vercel sandbox, or nsjail) and returns an error without one.

Platform Capabilities

| Platform | nsjail | Sidecar | Best backend |
|---|---|---|---|
| Vercel | N/A | N/A | Vercel Sandbox (priority 1) |
| Railway | No | Required* | Sidecar (priority 3) |
| Self-hosted Docker | With capabilities | Optional | nsjail (priority 2) |
| Self-hosted VM | Yes | Optional | nsjail (priority 2) |

*On Railway, nsjail is unavailable (no CAP_SYS_ADMIN / user namespaces). Without the sidecar, the explore tool falls back to just-bash (no secret isolation, no Python support). The sidecar is effectively required for production isolation on Railway.

Sidecar Service Design

Since no kernel-level sandbox works on Railway, isolation comes from process/network separation -- a separate service with its own filesystem and no access to the main service's secrets.

+----------------------------------+     +-----------------------------+
|  Main Service (Hono API)         |     |  Sandbox Sidecar            |
|                                  |     |                             |
|  ENV:                            |     |  ENV:                       |
|    ATLAS_DATASOURCE_URL=...      |     |    SIDECAR_AUTH_TOKEN=...   |
|    ANTHROPIC_API_KEY=...         |     |    (no DB creds, no API     |
|    DATABASE_URL=...              |     |     keys, no secrets)       |
|                                  |     |                             |
|  Agent loop calls:               |     |  FILES:                     |
|    POST sidecar:8080/exec        |---->|    /semantic/**/*.yml       |
|    POST sidecar:8080/exec-python |---->|                             |
|                                  |     |  ENDPOINTS:                 |
|  Receives:                       |     |    GET  /health             |
|    { stdout, stderr, exitCode }  |<----|    POST /exec               |
|    or PythonResult               |<----|    POST /exec-python        |
+----------------------------------+     +-----------------------------+
        Railway private network

The sidecar enforces a concurrency limit of 10 (MAX_CONCURRENT = 10) across both /exec and /exec-python endpoints. Requests beyond this limit receive HTTP 429.
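
A shared limit like this is commonly implemented as a counter that both endpoints consult before starting work. The sketch below shows one way to do that. ConcurrencyGate and handleRequest are hypothetical names, and the sidecar's actual code may be structured differently; only MAX_CONCURRENT = 10 and the HTTP 429 response come from this document.

```typescript
const MAX_CONCURRENT = 10;

// Hypothetical helper: counts in-flight executions across all endpoints.
class ConcurrencyGate {
  private active = 0;
  private readonly limit: number;

  constructor(limit: number = MAX_CONCURRENT) {
    this.limit = limit;
  }

  // Reserves a slot and returns true, or returns false when the limit is reached.
  tryAcquire(): boolean {
    if (this.active >= this.limit) return false;
    this.active += 1;
    return true;
  }

  // Frees a previously reserved slot.
  release(): void {
    if (this.active > 0) this.active -= 1;
  }
}

// One gate shared by /exec and /exec-python.
const gate = new ConcurrencyGate();

// Wraps an execution: rejects with 429 when no slot is free,
// and always releases the slot when the execution finishes or throws.
async function handleRequest<T>(
  run: () => Promise<T>,
): Promise<{ status: number; body?: T }> {
  if (!gate.tryAcquire()) return { status: 429 };
  try {
    return { status: 200, body: await run() };
  } finally {
    gate.release();
  }
}
```

Releasing in a finally block matters here: a command that throws or times out would otherwise leak its slot and eventually starve the sidecar.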

Backend Selection Priority

At startup, the explore tool selects the highest-priority backend available. The selection is evaluated top-to-bottom — the first match wins:

| Priority | Backend | Condition |
|---|---|---|
| 0 | Sandbox plugin | A sandbox plugin is registered via atlas.config.ts (skipped when ATLAS_SANDBOX=nsjail). Explore tool only; executePython does not check plugin backends |
| 1 | Vercel Sandbox | ATLAS_RUNTIME=vercel or VERCEL env var is present |
| 2 | nsjail (explicit) | ATLAS_SANDBOX=nsjail is set. Hard-fails if the nsjail binary is not found; no fallback |
| 3 | Sidecar | ATLAS_SANDBOX_URL is set. Skips nsjail auto-detection entirely |
| 4 | nsjail (auto-detect) | nsjail binary found on PATH (no explicit config needed). Falls back gracefully on failure |
| 5 | just-bash | Fallback. JS-level OverlayFs + path-traversal protection only |

Key behaviors:

  • Explicit nsjail is strict. Setting ATLAS_SANDBOX=nsjail means "nsjail or nothing" — if the binary is missing or initialization fails, the explore tool returns an error rather than falling back. Plugin-provided sandbox backends are skipped entirely when this flag is set.
  • Sidecar skips nsjail auto-detection. When ATLAS_SANDBOX_URL is set, nsjail auto-detection is completely skipped. This avoids noisy namespace warnings on platforms like Railway where clone() with namespace flags is blocked.
  • Auto-detected nsjail is graceful. If nsjail is found on PATH but fails to initialize (e.g. missing kernel capabilities), the backend falls back to just-bash with a warning.
  • Plugin backends take top priority. Sandbox plugins (E2B, Daytona, custom) are tried first and sorted by their priority field (highest wins). If all plugins fail, the built-in chain continues.
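
The selection rules above can be condensed into a single decision function. The sketch below is an illustration under stated assumptions, not the Atlas source: selectExploreBackend, the SelectionInput shape, and the backend name strings are hypothetical, and the two boolean inputs stand in for the real plugin and nsjail probing.

```typescript
// Hypothetical backend labels; the actual values reported by Atlas may differ.
type ExploreBackend = "plugin" | "vercel-sandbox" | "nsjail" | "sidecar" | "just-bash";

interface SandboxEnv {
  ATLAS_SANDBOX?: string;
  ATLAS_SANDBOX_URL?: string;
  ATLAS_RUNTIME?: string;
  VERCEL?: string;
}

interface SelectionInput {
  env: SandboxEnv;
  pluginAvailable: boolean; // a registered plugin successfully created a backend
  nsjailUsable: boolean; // nsjail binary found AND initialization succeeded
}

function selectExploreBackend(input: SelectionInput): ExploreBackend {
  const { env, pluginAvailable, nsjailUsable } = input;
  const explicitNsjail = env.ATLAS_SANDBOX === "nsjail";

  // Priority 0: plugins, skipped entirely when nsjail is forced.
  if (pluginAvailable && !explicitNsjail) return "plugin";
  // Priority 1: Vercel Sandbox.
  if (env.ATLAS_RUNTIME === "vercel" || env.VERCEL !== undefined) return "vercel-sandbox";
  // Priority 2: explicit nsjail is strict, so failure is an error, not a fallback.
  if (explicitNsjail) {
    if (!nsjailUsable) throw new Error("ATLAS_SANDBOX=nsjail is set but nsjail is unavailable");
    return "nsjail";
  }
  // Priority 3: the sidecar short-circuits nsjail auto-detection.
  if (env.ATLAS_SANDBOX_URL) return "sidecar";
  // Priority 4: auto-detected nsjail, falling through on failure.
  if (nsjailUsable) return "nsjail";
  // Priority 5: last resort.
  return "just-bash";
}

// Railway-like environment: sidecar URL set, nsjail cannot initialize.
console.log(
  selectExploreBackend({
    env: { ATLAS_SANDBOX_URL: "http://sidecar:8080" },
    pluginAvailable: false,
    nsjailUsable: false,
  }),
); // sidecar
```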

The health endpoint (GET /api/health) reports which backend is active in the explore.backend field.

executePython uses a different priority order. See Python Execution for the full priority chain. Key difference: sidecar is priority 1, plugin backends are skipped, and just-bash is not a fallback.

Configuration

| Variable | Default | Description |
|---|---|---|
| ATLAS_SANDBOX | auto-detect | Force sandbox backend: nsjail |
| ATLAS_SANDBOX_URL | -- | Sidecar service URL (enables sidecar backend) |
| SIDECAR_AUTH_TOKEN | -- | Shared secret for sidecar auth |
| ATLAS_NSJAIL_PATH | -- | Explicit path to nsjail binary |
| ATLAS_NSJAIL_TIME_LIMIT | 10 | nsjail per-command time limit in seconds |
| ATLAS_NSJAIL_MEMORY_LIMIT | 256 | nsjail per-command memory limit in MB (rlimit_as) |
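
As an illustration of how the two numeric nsjail settings and their defaults might be resolved, here is a small sketch. readNsjailLimits and readPositiveInt are hypothetical helpers, and falling back to the default on a missing or invalid value is an assumption; this document does not say how Atlas treats malformed input.

```typescript
interface NsjailEnv {
  ATLAS_NSJAIL_TIME_LIMIT?: string;
  ATLAS_NSJAIL_MEMORY_LIMIT?: string;
}

interface NsjailLimits {
  timeLimitSeconds: number;
  memoryLimitMb: number;
}

// Parses a positive integer, returning the fallback for missing or invalid input.
// (Assumption: invalid values are ignored rather than treated as errors.)
function readPositiveInt(raw: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Defaults (10 seconds, 256 MB) are taken from the configuration table.
function readNsjailLimits(env: NsjailEnv): NsjailLimits {
  return {
    timeLimitSeconds: readPositiveInt(env.ATLAS_NSJAIL_TIME_LIMIT, 10),
    memoryLimitMb: readPositiveInt(env.ATLAS_NSJAIL_MEMORY_LIMIT, 256),
  };
}

console.log(readNsjailLimits({ ATLAS_NSJAIL_MEMORY_LIMIT: "512" }));
// { timeLimitSeconds: 10, memoryLimitMb: 512 }
```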

nsjail Resource Limits

nsjail enforces the following resource limits per command. Apart from rlimit_as, which follows the configurable memory limit above, the values are hard-coded:

| Limit | Value | Description |
|---|---|---|
| rlimit_as | ATLAS_NSJAIL_MEMORY_LIMIT (default 256 MB) | Virtual memory limit |
| rlimit_fsize | 10 MB | Max file size a process can create |
| rlimit_nproc | 5 | Max number of processes |
| rlimit_nofile | 64 | Max open file descriptors |
| Stdout/stderr cap | 1 MB (MAX_OUTPUT) | Output read from stdout and stderr is truncated at 1 MB |
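
The stdout/stderr cap can be pictured as stopping the read once a byte budget is used up. The sketch below shows that idea on already-collected chunks. capOutput and CappedOutput are hypothetical names, and treating 1 MB as 1024 * 1024 bytes is an assumption; the document only names the MAX_OUTPUT constant.

```typescript
// Assumption: 1 MB means 1024 * 1024 bytes.
const MAX_OUTPUT = 1024 * 1024;

interface CappedOutput {
  bytes: Uint8Array;
  truncated: boolean;
}

// Concatenates chunks until maxBytes is reached and drops everything after that.
function capOutput(chunks: Uint8Array[], maxBytes: number = MAX_OUTPUT): CappedOutput {
  const kept: Uint8Array[] = [];
  let size = 0;
  let truncated = false;

  for (const chunk of chunks) {
    const room = maxBytes - size;
    if (chunk.length > room) {
      // Keep only the part of this chunk that still fits, then stop reading.
      if (room > 0) kept.push(chunk.subarray(0, room));
      size = maxBytes;
      truncated = true;
      break;
    }
    kept.push(chunk);
    size += chunk.length;
  }

  // Copy the kept pieces into one contiguous buffer.
  const bytes = new Uint8Array(size);
  let offset = 0;
  for (const part of kept) {
    bytes.set(part, offset);
    offset += part.length;
  }
  return { bytes, truncated };
}
```

Returning a truncated flag alongside the bytes lets the caller tell the agent that output was cut off, rather than silently presenting partial output as complete.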

Sidecar Timeouts

Shell commands and Python execution have separate timeout configurations:

Shell commands (/exec): The sidecar enforces a 10-second command timeout (DEFAULT_TIMEOUT_MS = 10_000). The HTTP fetch uses a total abort signal of 15 seconds (10s execution + 5s HTTP overhead) to account for network latency and response serialization. Shell commands that exceed the timeout return exit code 124 (matching GNU timeout(1) convention).

Python execution (/exec-python): The default timeout is 30 seconds (PYTHON_DEFAULT_TIMEOUT_MS = 30_000), configurable via the ATLAS_PYTHON_TIMEOUT environment variable. The sidecar clamps the value to a maximum of 120 seconds (PYTHON_MAX_TIMEOUT_MS). The HTTP fetch adds 10 seconds of overhead on top of the execution timeout, giving a total abort signal of 40 seconds at the default (or up to 130s at the maximum).
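
The timeout arithmetic above can be written out as a few small functions. The constants DEFAULT_TIMEOUT_MS, PYTHON_DEFAULT_TIMEOUT_MS, and PYTHON_MAX_TIMEOUT_MS are named in this document; the two overhead constants and the function names are hypothetical. The unit of ATLAS_PYTHON_TIMEOUT is not stated here, so the sketch takes the requested value in milliseconds.

```typescript
const DEFAULT_TIMEOUT_MS = 10_000; // shell command timeout
const PYTHON_DEFAULT_TIMEOUT_MS = 30_000;
const PYTHON_MAX_TIMEOUT_MS = 120_000;

// Hypothetical names for the HTTP overhead allowances described above.
const SHELL_HTTP_OVERHEAD_MS = 5_000;
const PYTHON_HTTP_OVERHEAD_MS = 10_000;

// Total abort budget for a POST /exec request.
function shellAbortMs(): number {
  return DEFAULT_TIMEOUT_MS + SHELL_HTTP_OVERHEAD_MS;
}

// Execution timeout for Python: the requested value, clamped to the maximum.
function pythonExecutionMs(requestedMs?: number): number {
  return Math.min(requestedMs ?? PYTHON_DEFAULT_TIMEOUT_MS, PYTHON_MAX_TIMEOUT_MS);
}

// Total abort budget for a POST /exec-python request.
function pythonAbortMs(requestedMs?: number): number {
  return pythonExecutionMs(requestedMs) + PYTHON_HTTP_OVERHEAD_MS;
}

console.log(shellAbortMs()); // 15000
console.log(pythonAbortMs()); // 40000
console.log(pythonAbortMs(300_000)); // 130000 (clamped to 120s, plus 10s overhead)
```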

Python Execution (executePython)

The executePython tool runs agent-generated Python code for data analysis and visualization. It uses a different backend selection priority than the explore tool:

Priority 1: Sidecar (ATLAS_SANDBOX_URL)      -- POST /exec-python
Priority 2: Vercel Sandbox                    -- Python 3.13 Firecracker microVM
Priority 3: nsjail (explicit, ATLAS_SANDBOX=nsjail) -- hard-fail if unavailable
Priority 4: nsjail (auto-detect, on PATH)     -- graceful fallback
Priority 5: No backend — error                -- just-bash is NOT a fallback

Key differences from the explore tool's priority chain:

  • Sidecar is priority 1 (not priority 3). When ATLAS_SANDBOX_URL is set, Python uses the sidecar immediately without checking for Vercel sandbox first.
  • Plugin sandbox backends are not checked. Python only uses built-in backends.
  • just-bash is not a fallback. If no sandbox backend is available, executePython returns an error rather than running Python in the host process.
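
For comparison with the explore tool, the Python chain can be sketched the same way. This is an illustration, not the Atlas source: selectPythonBackend and the backend name strings are hypothetical, and the Vercel detection condition is assumed to match the one the explore tool uses.

```typescript
// Hypothetical backend labels; there is deliberately no "just-bash" or "plugin" member.
type PythonBackend = "sidecar" | "vercel-sandbox" | "nsjail";

interface PythonSandboxEnv {
  ATLAS_SANDBOX?: string;
  ATLAS_SANDBOX_URL?: string;
  ATLAS_RUNTIME?: string;
  VERCEL?: string;
}

// nsjailUsable stands in for "binary found and initialization succeeded".
function selectPythonBackend(env: PythonSandboxEnv, nsjailUsable: boolean): PythonBackend {
  // Priority 1: the sidecar wins even when running on Vercel.
  if (env.ATLAS_SANDBOX_URL) return "sidecar";
  // Priority 2: Vercel Sandbox (detection condition assumed, see lead-in).
  if (env.ATLAS_RUNTIME === "vercel" || env.VERCEL !== undefined) return "vercel-sandbox";
  // Priority 3: explicit nsjail hard-fails when unavailable.
  if (env.ATLAS_SANDBOX === "nsjail") {
    if (!nsjailUsable) throw new Error("ATLAS_SANDBOX=nsjail is set but nsjail is unavailable");
    return "nsjail";
  }
  // Priority 4: auto-detected nsjail.
  if (nsjailUsable) return "nsjail";
  // Priority 5: no backend. Python never runs in the host process.
  throw new Error("executePython requires a sandbox backend (sidecar, Vercel Sandbox, or nsjail)");
}

console.log(selectPythonBackend({ ATLAS_SANDBOX_URL: "http://sidecar:8080", VERCEL: "1" }, false));
// sidecar
```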

Python execution has two layers of defense: an AST-based import guard (defense-in-depth, runs before execution) and the sandbox backend itself (the actual security boundary). The import guard blocks dangerous modules (subprocess, os, socket, etc.) and builtins (exec, eval, open, __import__, etc.). If python3 is not available locally for AST validation, the guard is skipped -- the sandbox enforces isolation regardless.

Data from a previous SQL query is injected into the sandbox as a pandas DataFrame (df) or raw dict (data). Results are returned as structured output: tables (_atlas_table), interactive Recharts charts (_atlas_chart), or PNG files (matplotlib via chart_path()).
