Atlas

CLI Reference

Complete reference for the Atlas CLI — init, diff, query, doctor, validate, mcp, and more.

The Atlas CLI (atlas) profiles databases, generates semantic layers, validates configuration, and queries data from the terminal.

# Run via bun workspace
bun run atlas -- <command> [options]

# Or directly (if installed globally)
atlas <command> [options]

init

Profile a database and generate semantic layer YAML files.

bun run atlas -- init [options]
Flags:

  • --tables <t1,t2> — Profile only specific tables/views (comma-separated)
  • --schema <name> — PostgreSQL schema name (default: public)
  • --source <name> — Write to semantic/{name}/ subdirectory (per-source layout). Mutually exclusive with --connection
  • --connection <name> — Profile a named datasource from atlas.config.ts. Mutually exclusive with --source
  • --csv <file1.csv,...> — Load CSV files via DuckDB (no DB server needed). Requires @duckdb/node-api
  • --parquet <f1.parquet,...> — Load Parquet files via DuckDB. Requires @duckdb/node-api
  • --enrich — Add LLM-enriched descriptions and query patterns (requires API key)
  • --no-enrich — Explicitly skip LLM enrichment
  • --demo [simple|cybersec|ecommerce] — Load a demo dataset then profile (default: simple)

Examples:

# Profile all tables in the default schema
bun run atlas -- init

# Profile specific tables only
bun run atlas -- init --tables users,orders,products

# Profile a non-public schema
bun run atlas -- init --schema analytics

# Profile with LLM enrichment
bun run atlas -- init --enrich

# Load the cybersec demo dataset (62 tables, ~500K rows)
bun run atlas -- init --demo cybersec

# Profile a named connection from atlas.config.ts
bun run atlas -- init --connection warehouse

# Profile CSV files directly (no database needed)
bun run atlas -- init --csv sales.csv,products.csv

# Per-source layout (writes to semantic/warehouse/)
bun run atlas -- init --source warehouse

--demo without an argument loads the simple dataset (3 tables, ~330 rows). --demo cybersec loads the cybersec dataset (62 tables, ~500K rows). --demo ecommerce loads the e-commerce dataset (52 tables, ~480K rows).

--connection and --source cannot be used together. Here's how they differ:

  • No flag — Profiles the default datasource (ATLAS_DATASOURCE_URL) and writes output to semantic/entities/.
  • --source <name> — Writes output to semantic/<name>/entities/ (for multi-source layouts where you organize by source). Does not change which datasource is profiled.
  • --connection <name> — Profiles a named datasource defined in atlas.config.ts (e.g. datasources.warehouse) and automatically writes output to the matching semantic/<name>/ subdirectory.

In TTY mode (interactive terminal), init presents a table picker. Pass --tables to skip the picker for scripted/CI usage.

What it generates:

  • semantic/entities/*.yml — One file per table/view with columns, types, sample values, joins, measures, virtual dimensions, and query patterns
  • semantic/metrics/*.yml — Atomic and breakdown metrics per table
  • semantic/glossary.yml — Ambiguous terms, FK relationships, enum definitions
  • semantic/catalog.yml — Table catalog with use_for and common_questions

diff

Compare the database schema against the existing semantic layer. Exits with code 1 if drift is detected.

bun run atlas -- diff [options]
Flags:

  • --tables <t1,t2> — Diff only specific tables/views
  • --schema <name> — PostgreSQL schema. Falls back to the ATLAS_SCHEMA env var, then public
  • --source <name> — Read from the semantic/{name}/ subdirectory

Examples:

# Check all tables for schema drift
bun run atlas -- diff

# Check specific tables only
bun run atlas -- diff --tables users,orders

# CI usage: fail the build if schema drifted
bun run atlas -- diff || echo "Schema drift detected!"

query

Ask a natural language question and get an answer. Calls POST /api/v1/query on a running Atlas API server — only bun run dev:api is needed (the full Next.js stack is not required).

bun run atlas -- query "your question" [options]
Flags:

  • --json — Raw JSON output (pipe-friendly)
  • --csv — CSV output (headers + rows, no narrative)
  • --quiet — Data only: no narrative, SQL, or stats
  • --auto-approve — Auto-approve any pending actions
  • --connection <id> — Query a specific datasource

Environment:

  • ATLAS_API_URL — API server URL (default: http://localhost:3001)
  • ATLAS_API_KEY — API key for authentication

Examples:

# Table output (default)
bun run atlas -- query "How many users signed up last month?"

# JSON output for scripting
bun run atlas -- query "top 10 customers by revenue" --json

# CSV output (pipe to other tools)
bun run atlas -- query "monthly revenue by product" --csv > report.csv

# Data only, no explanation
bun run atlas -- query "active users today" --quiet

# Query a specific datasource
bun run atlas -- query "warehouse inventory" --connection warehouse

doctor

Validate the environment, connectivity, and configuration. Checks that all required services are reachable and properly configured.

bun run atlas -- doctor

No flags. Checks:

  • Database connectivity (ATLAS_DATASOURCE_URL)
  • Internal DB connectivity (DATABASE_URL)
  • LLM provider configuration
  • Semantic layer presence and validity
  • Sandbox availability

Example output:

✓ Environment: ATLAS_PROVIDER=anthropic, ANTHROPIC_API_KEY set
✓ Database: Connected to PostgreSQL 16.1
✓ Semantic layer: 5 entities, 2 metrics, 1 glossary
✓ Sandbox: nsjail available
✓ Auth: managed (Better Auth)

validate

Check the semantic layer YAML files for errors. Runs offline — no database or API key required.

bun run atlas -- validate

No flags. Validates:

  • YAML syntax and required fields
  • Column type consistency
  • Join references
  • Metric SQL syntax
  • Glossary entries

Useful in CI to catch semantic layer errors before deployment.
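As a sketch of that CI usage, the two checks can be chained so the pipeline fails on either a validation error or schema drift (the script shape is illustrative; the commands are the ones documented here — note that validate runs offline, while diff needs database connectivity):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Offline check: YAML syntax, joins, metric SQL
bun run atlas -- validate

# Online check: compare the live schema to the semantic layer.
# diff exits 1 on drift, which fails this script under `set -e`.
bun run atlas -- diff
```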

mcp

Start an MCP (Model Context Protocol) server for use with Claude Desktop, Cursor, and other MCP-compatible clients.

bun run atlas -- mcp [options]
Flags:

  • --transport <stdio|sse> — Transport type (default: stdio)
  • --port <n> — Port for SSE transport (default: 8080; only used with --transport sse)

When to use each transport:

  • stdio (default) — For local MCP clients that launch the server as a subprocess (Claude Desktop, Cursor, Windsurf). The client manages the process lifecycle. This is the most common setup.
  • sse — For remote or containerized MCP servers where the client connects over HTTP. Use this when the MCP server runs in a Docker container, on a remote host, or when multiple clients need to share one server instance. Clients connect via http://host:port/mcp.

Examples:

# Start MCP server on stdio (default, for Claude Desktop)
bun run atlas -- mcp

# Start with SSE transport on a custom port
bun run atlas -- mcp --transport sse --port 9090

Claude Desktop configuration (claude_desktop_config.json):

{
  "mcpServers": {
    "atlas": {
      "command": "bun",
      "args": ["run", "atlas", "--", "mcp"],
      "env": {
        "ATLAS_DATASOURCE_URL": "postgresql://user:pass@host:5432/db",
        "ATLAS_PROVIDER": "anthropic",
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

eval

Run the evaluation pipeline against demo schemas. Used to measure text-to-SQL accuracy.

Test cases are YAML files in eval/cases/, organized by dataset (simple/, cybersec/, ecommerce/). Each case specifies id, question, schema, difficulty, category, gold_sql, and optionally expected_rows and tags. Results are written to JSONL files and can be compared against baselines for regression detection.
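Because results are plain JSONL, they can be summarized with standard tools. A minimal sketch, assuming a per-line "pass" field — that field name is illustrative, not Atlas's documented results schema:

```shell
# Write a few illustrative result lines (the "pass" field is an
# assumption, not Atlas's actual JSONL schema), then count passes.
printf '%s\n' \
  '{"id":"simple-001","pass":true}' \
  '{"id":"simple-002","pass":false}' \
  '{"id":"simple-003","pass":true}' > results.jsonl

grep -c '"pass":true' results.jsonl   # → 2
```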

bun run atlas -- eval [options]
Flags:

  • --schema <name> — Filter by demo dataset name (e.g. simple, cybersec, ecommerce). This is the eval dataset name, not a PostgreSQL schema
  • --category <name> — Filter by category
  • --difficulty <simple|medium|complex> — Filter by difficulty
  • --id <case-id> — Run a single case
  • --limit <n> — Max cases to evaluate
  • --resume <file> — Resume from an existing JSONL results file
  • --baseline — Save results as the new baseline
  • --compare <file.jsonl> — Diff against a baseline (exit 1 on regression)
  • --csv — CSV output
  • --json — JSON summary output
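Examples (the flags are as documented above; the baseline.jsonl file name is a placeholder):

```shell
# Run all cases for the simple dataset
bun run atlas -- eval --schema simple

# Run only complex cybersec cases, capped at 20
bun run atlas -- eval --schema cybersec --difficulty complex --limit 20

# Save a baseline, then fail CI on regressions against it
bun run atlas -- eval --baseline
bun run atlas -- eval --compare baseline.jsonl
```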

smoke

Run end-to-end smoke tests against a running Atlas deployment.

bun run atlas -- smoke [options]
Flags:

  • --target <url> — API base URL (default: http://localhost:3001)
  • --api-key <key> — Bearer auth token
  • --timeout <ms> — Per-check timeout (default: 30000)
  • --verbose — Show full response bodies on failure
  • --json — Machine-readable JSON output

Environment: Flags can also be set via environment variables. --target falls back to ATLAS_API_URL, and --api-key falls back to ATLAS_API_KEY. Explicit flags take precedence.

  • ATLAS_API_URL — API base URL (default: http://localhost:3001; overridden by --target)
  • ATLAS_API_KEY — Bearer auth token (overridden by --api-key)
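Examples (the remote URL below is a placeholder):

```shell
# Smoke-test the local dev API with defaults
bun run atlas -- smoke

# Smoke-test a remote deployment, with machine-readable output
bun run atlas -- smoke --target https://atlas.example.com \
  --api-key "$ATLAS_API_KEY" --json
```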

plugin

Manage Atlas plugins.

plugin list

List installed plugins from atlas.config.ts.

bun run atlas -- plugin list

plugin create

Scaffold a new plugin.

bun run atlas -- plugin create <name> --type <type>
Flags:

  • --type <type> — Plugin type: datasource, context, interaction, action, or sandbox (required)

plugin add

Install a plugin package.

bun run atlas -- plugin add <package-name>
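A typical flow, as a sketch (the plugin name my-source is a placeholder):

```shell
# Scaffold a new datasource plugin
bun run atlas -- plugin create my-source --type datasource

# List plugins currently registered in atlas.config.ts
bun run atlas -- plugin list
```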

migrate

Generate or apply plugin schema migrations.

bun run atlas -- migrate [options]
Flags:

  • --apply — Execute migrations against the internal database (default: dry-run)
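Since the default is a dry-run, a safe workflow is to preview first and apply second:

```shell
# Preview pending plugin migrations (dry-run is the default)
bun run atlas -- migrate

# Execute them against the internal database
bun run atlas -- migrate --apply
```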

benchmark

Run the BIRD benchmark for text-to-SQL accuracy evaluation. This is a developer tool for measuring Atlas's query generation quality.

BIRD is an external academic benchmark dataset (~1500 questions across 11 SQLite databases) and is not included in the Atlas repository. You must download the BIRD dev set separately from the BIRD website and point --bird-path to the extracted directory.

bun run atlas -- benchmark [options]
Flags:

  • --bird-path <path> — Path to the downloaded BIRD dev directory (required)
  • --limit <n> — Max questions to evaluate
  • --db <name> — Filter to a single database
  • --csv — CSV output
  • --resume <file> — Resume from an existing JSONL results file
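Examples (the ./bird/dev path and database name are placeholders for wherever you extracted the BIRD dev set):

```shell
# Quick run: first 50 questions
bun run atlas -- benchmark --bird-path ./bird/dev --limit 50

# Restrict to a single BIRD SQLite database
bun run atlas -- benchmark --bird-path ./bird/dev --db california_schools
```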
