Atlas

CLI Reference

Complete reference for the Atlas CLI — init, diff, query, doctor, validate, mcp, and more.

The Atlas CLI (atlas) profiles databases, generates semantic layers, validates configuration, and queries data from the terminal.

# Run via bun workspace
bun run atlas -- <command> [options]

# Or directly (if installed globally)
atlas <command> [options]

init

Profile a database and generate semantic layer YAML files.

bun run atlas -- init [options]
Flags:

  • --tables <t1,t2> — Profile only specific tables/views (comma-separated)
  • --schema <name> — PostgreSQL schema name (default: public)
  • --source <name> — Write to semantic/{name}/ subdirectory (per-source layout). Mutually exclusive with --connection
  • --connection <name> — Profile a named datasource from atlas.config.ts. Mutually exclusive with --source
  • --csv <file1.csv,...> — Load CSV files via DuckDB (no DB server needed). Requires @duckdb/node-api
  • --parquet <f1.parquet,...> — Load Parquet files via DuckDB. Requires @duckdb/node-api
  • --enrich — Add LLM-enriched descriptions and query patterns (requires API key)
  • --no-enrich — Explicitly skip LLM enrichment
  • --demo [simple|cybersec|ecommerce] — Load a demo dataset then profile (default: simple)

Examples:

# Profile all tables in the default schema
bun run atlas -- init

# Profile specific tables only
bun run atlas -- init --tables users,orders,products

# Profile a non-public schema
bun run atlas -- init --schema analytics

# Profile with LLM enrichment
bun run atlas -- init --enrich

# Load the cybersec demo dataset (62 tables, ~500K rows)
bun run atlas -- init --demo cybersec

# Profile a named connection from atlas.config.ts
bun run atlas -- init --connection warehouse

# Profile CSV files directly (no database needed)
bun run atlas -- init --csv sales.csv,products.csv

# Per-source layout (writes to semantic/warehouse/)
bun run atlas -- init --source warehouse

--demo without an argument loads the simple dataset (3 tables, ~330 rows). --demo cybersec loads the cybersec dataset (62 tables, ~500K rows). --demo ecommerce loads the e-commerce dataset (52 tables, ~480K rows).

--connection and --source cannot be used together. Here's how they differ:

  • No flag — Profiles the default datasource (ATLAS_DATASOURCE_URL) and writes output to semantic/entities/.
  • --source <name> — Writes output to semantic/<name>/entities/ (for multi-source layouts where you organize by source). Does not change which datasource is profiled.
  • --connection <name> — Profiles a named datasource defined in atlas.config.ts (e.g. datasources.warehouse) and automatically writes output to the matching semantic/<name>/ subdirectory.

In TTY mode (interactive terminal), init presents a table picker. Pass --tables to skip the picker for scripted/CI usage.

What it generates:

  • semantic/entities/*.yml — One file per table/view with columns, types, sample values, joins, measures, virtual dimensions, and query patterns
  • semantic/metrics/*.yml — Atomic and breakdown metrics per table
  • semantic/glossary.yml — Ambiguous terms, FK relationships, enum definitions
  • semantic/catalog.yml — Table catalog with use_for and common_questions

diff

Compare the database schema against the existing semantic layer. Exits with code 1 if drift is detected.

bun run atlas -- diff [options]
Flags:

  • --tables <t1,t2> — Diff only specific tables/views
  • --schema <name> — PostgreSQL schema. Falls back to the ATLAS_SCHEMA env var, then public
  • --source <name> — Read from the semantic/{name}/ subdirectory

Examples:

# Check all tables for schema drift
bun run atlas -- diff

# Check specific tables only
bun run atlas -- diff --tables users,orders

# CI usage: fail the build if schema drifted
bun run atlas -- diff || echo "Schema drift detected!"

query

Ask a natural language question and get an answer. Calls POST /api/v1/query on a running Atlas API server — only bun run dev:api is needed (the full Next.js stack is not required).

bun run atlas -- query "your question" [options]
Flags:

  • --json — Raw JSON output (pipe-friendly)
  • --csv — CSV output (headers + rows, no narrative)
  • --quiet — Data only: no narrative, SQL, or stats
  • --auto-approve — Auto-approve any pending actions
  • --connection <id> — Query a specific datasource

Environment:

  • ATLAS_API_URL — API server URL (default: http://localhost:3001)
  • ATLAS_API_KEY — API key for authentication

Examples:

# Table output (default)
bun run atlas -- query "How many users signed up last month?"

# JSON output for scripting
bun run atlas -- query "top 10 customers by revenue" --json

# CSV output (pipe to other tools)
bun run atlas -- query "monthly revenue by product" --csv > report.csv

# Data only, no explanation
bun run atlas -- query "active users today" --quiet

# Query a specific datasource
bun run atlas -- query "warehouse inventory" --connection warehouse

doctor

Validate the environment, connectivity, and configuration. Checks that all required services are reachable and properly configured.

bun run atlas -- doctor

No flags. Checks:

  • Database connectivity (ATLAS_DATASOURCE_URL)
  • Internal DB connectivity (DATABASE_URL)
  • LLM provider configuration
  • Semantic layer presence and validity
  • Sandbox availability

Example output:

✓ Environment: ATLAS_PROVIDER=anthropic, ANTHROPIC_API_KEY set
✓ Database: Connected to PostgreSQL 16.1
✓ Semantic layer: 5 entities, 2 metrics, 1 glossary
✓ Sandbox: nsjail available
✓ Auth: managed (Better Auth)

validate

Check the semantic layer YAML files for errors. Runs offline — no database or API key required.

bun run atlas -- validate

No flags. Validates:

  • YAML syntax and required fields
  • Column type consistency
  • Join references
  • Metric SQL syntax
  • Glossary entries

Useful in CI to catch semantic layer errors before deployment.
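As a sketch of that CI usage, the two checks can be chained so the pipeline fails on either a validation error or schema drift (the script shape is illustrative; the commands are the ones documented here — note that validate runs offline, while diff needs database connectivity):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Offline check: YAML syntax, joins, metric SQL
bun run atlas -- validate

# Online check: compare the live schema to the semantic layer.
# diff exits 1 on drift, which fails this script under `set -e`.
bun run atlas -- diff
```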

mcp

Start an MCP (Model Context Protocol) server for use with Claude Desktop, Cursor, and other MCP-compatible clients.

bun run atlas -- mcp [options]
Flags:

  • --transport <stdio|sse> — Transport type (default: stdio)
  • --port <n> — Port for SSE transport (default: 8080; only used with --transport sse)

When to use each transport:

  • stdio (default) — For local MCP clients that launch the server as a subprocess (Claude Desktop, Cursor, Windsurf). The client manages the process lifecycle. This is the most common setup.
  • sse — For remote or containerized MCP servers where the client connects over HTTP. Use this when the MCP server runs in a Docker container, on a remote host, or when multiple clients need to share one server instance. Clients connect via http://host:port/mcp.

Examples:

# Start MCP server on stdio (default, for Claude Desktop)
bun run atlas -- mcp

# Start with SSE transport on a custom port
bun run atlas -- mcp --transport sse --port 9090

Claude Desktop configuration (claude_desktop_config.json):

{
  "mcpServers": {
    "atlas": {
      "command": "bun",
      "args": ["run", "atlas", "--", "mcp"],
      "env": {
        "ATLAS_DATASOURCE_URL": "postgresql://user:pass@host:5432/db",
        "ATLAS_PROVIDER": "anthropic",
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

eval

Run the evaluation pipeline against demo schemas. Used to measure text-to-SQL accuracy.

Test cases are YAML files in eval/cases/, organized by dataset (simple/, cybersec/, ecommerce/). Each case specifies id, question, schema, difficulty, category, gold_sql, and optionally expected_rows and tags. Results are written to JSONL files and can be compared against baselines for regression detection.
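Because results are plain JSONL, they can be summarized with standard tools. A minimal sketch, assuming a per-line "pass" field — that field name is illustrative, not Atlas's documented results schema:

```shell
# Write a few illustrative result lines (the "pass" field is an
# assumption, not Atlas's actual JSONL schema), then count passes.
printf '%s\n' \
  '{"id":"simple-001","pass":true}' \
  '{"id":"simple-002","pass":false}' \
  '{"id":"simple-003","pass":true}' > results.jsonl

grep -c '"pass":true' results.jsonl   # → 2
```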

bun run atlas -- eval [options]
Flags:

  • --schema <name> — Filter by demo dataset name (e.g. simple, cybersec, ecommerce). This is the eval dataset name, not a PostgreSQL schema
  • --category <name> — Filter by category
  • --difficulty <simple|medium|complex> — Filter by difficulty
  • --id <case-id> — Run a single case
  • --limit <n> — Max cases to evaluate
  • --resume <file> — Resume from an existing JSONL results file
  • --baseline — Save results as the new baseline
  • --compare <file.jsonl> — Diff against a baseline (exit 1 on regression)
  • --csv — CSV output
  • --json — JSON summary output
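Examples (the flags are as documented above; the baseline.jsonl file name is a placeholder):

```shell
# Run all cases for the simple dataset
bun run atlas -- eval --schema simple

# Run only complex cybersec cases, capped at 20
bun run atlas -- eval --schema cybersec --difficulty complex --limit 20

# Save a baseline, then fail CI on regressions against it
bun run atlas -- eval --baseline
bun run atlas -- eval --compare baseline.jsonl
```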

smoke

Run end-to-end smoke tests against a running Atlas deployment.

bun run atlas -- smoke [options]
Flags:

  • --target <url> — API base URL (default: http://localhost:3001)
  • --api-key <key> — Bearer auth token
  • --timeout <ms> — Per-check timeout (default: 30000)
  • --verbose — Show full response bodies on failure
  • --json — Machine-readable JSON output

Environment: Flags can also be set via environment variables. --target falls back to ATLAS_API_URL, and --api-key falls back to ATLAS_API_KEY. Explicit flags take precedence.

  • ATLAS_API_URL — API base URL (default: http://localhost:3001; overridden by --target)
  • ATLAS_API_KEY — Bearer auth token (overridden by --api-key)
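Examples (the remote URL below is a placeholder):

```shell
# Smoke-test the local dev API with defaults
bun run atlas -- smoke

# Smoke-test a remote deployment, with machine-readable output
bun run atlas -- smoke --target https://atlas.example.com \
  --api-key "$ATLAS_API_KEY" --json
```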

plugin

Manage Atlas plugins.

plugin list

List installed plugins from atlas.config.ts.

bun run atlas -- plugin list

plugin create

Scaffold a new plugin.

bun run atlas -- plugin create <name> --type <type>
Flags:

  • --type <type> — Plugin type: datasource, context, interaction, action, or sandbox (required)

plugin add

Install a plugin package.

bun run atlas -- plugin add <package-name>
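A typical flow, as a sketch (the plugin name my-source is a placeholder):

```shell
# Scaffold a new datasource plugin
bun run atlas -- plugin create my-source --type datasource

# List plugins currently registered in atlas.config.ts
bun run atlas -- plugin list
```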

migrate

Generate or apply plugin schema migrations.

bun run atlas -- migrate [options]
Flags:

  • --apply — Execute migrations against the internal database (default: dry-run)
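Since the default is a dry-run, a safe workflow is to preview first and apply second:

```shell
# Preview pending plugin migrations (dry-run is the default)
bun run atlas -- migrate

# Execute them against the internal database
bun run atlas -- migrate --apply
```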

benchmark

Run the BIRD benchmark for text-to-SQL accuracy evaluation. This is a developer tool for measuring Atlas's query generation quality.

BIRD is an external academic benchmark dataset (~1500 questions across 11 SQLite databases) and is not included in the Atlas repository. You must download the BIRD dev set separately from the BIRD website and point --bird-path to the extracted directory.

bun run atlas -- benchmark [options]
Flags:

  • --bird-path <path> — Path to the downloaded BIRD dev directory (required)
  • --limit <n> — Max questions to evaluate
  • --db <name> — Filter to a single database
  • --csv — CSV output
  • --resume <file> — Resume from an existing JSONL results file
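Examples (the ./bird/dev path and database name are placeholders for wherever you extracted the BIRD dev set):

```shell
# Quick run: first 50 questions
bun run atlas -- benchmark --bird-path ./bird/dev --limit 50

# Restrict to a single BIRD SQLite database
bun run atlas -- benchmark --bird-path ./bird/dev --db california_schools
```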
