Troubleshooting

Diagnose and fix common Atlas issues — startup errors, connection problems, SQL validation, and sandbox failures.

This page covers troubleshooting for both users (common issues when using Atlas) and operators (diagnosing deployment and configuration problems). Jump to the section that applies to you: User Troubleshooting or Operator Debugging.


User Troubleshooting

These issues can occur on any Atlas deployment, including app.useatlas.dev.

SQL Validation Errors

"Query rejected: DML/DDL detected"

The SQL contains mutation keywords (INSERT, UPDATE, DELETE, DROP, etc.). Atlas only allows SELECT queries.
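To see why a query can trip this check, the sketch below scans a statement for the mutation keywords named above. It is a naive illustration, not Atlas's actual validator: `MUTATION_KEYWORDS` contains only the four keywords this page lists, and `findMutationKeywords` is a hypothetical helper. A plain keyword scan like this one also flags keywords that appear inside string literals or comments, which a parser-based check may handle differently.

```typescript
// Only the keywords named on this page; the real list is longer ("etc.").
const MUTATION_KEYWORDS = ["INSERT", "UPDATE", "DELETE", "DROP"];

// Hypothetical helper: returns the mutation keywords found as whole words.
function findMutationKeywords(sql: string): string[] {
  return MUTATION_KEYWORDS.filter((keyword) =>
    // \b requires a word boundary, so "updated_at" does not match UPDATE.
    new RegExp(`\\b${keyword}\\b`, "i").test(sql)
  );
}
```

Because the match requires whole words, a column such as `updated_at` is not flagged, while a statement that starts with `delete from` is.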

"Table not in whitelist"

The query references a table that is not in the semantic layer. Ask your administrator to add the table, or rephrase your question to use available tables.

"Failed to parse SQL"

The SQL could not be parsed. This typically means:

  • Database-specific syntax that the parser does not support
  • A syntax error in the generated SQL
  • Complex CTEs or window functions that the parser mishandles

If this persists, try rephrasing your question or start with a simpler query.

Auth Issues (Users)

401 Responses

| Auth Mode | Common Causes |
|---|---|
| Simple Key | Wrong key, missing Authorization: Bearer header |
| Managed | Session expired (7-day lifetime), cookie not sent (cross-origin without credentials: 'include') |
| BYOT | JWT expired, wrong issuer, JWKS endpoint unreachable |
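Two of the causes above are client-side request mistakes: a missing `Authorization: Bearer` header in Simple Key mode and a missing `credentials: 'include'` in cross-origin Managed mode. The following is a minimal sketch of request options that avoid both. The names `RequestOptions` and `authRequestOptions` are hypothetical and not part of any Atlas SDK, and the endpoint you call is up to your deployment.

```typescript
// Hypothetical shape: a subset of the options accepted by fetch().
interface RequestOptions {
  headers: Record<string, string>;
  credentials?: "include" | "same-origin" | "omit";
}

// Hypothetical helper: builds fetch options for the two modes discussed above.
function authRequestOptions(mode: "simple-key" | "managed", apiKey?: string): RequestOptions {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (mode === "managed") {
    // Session cookies are only sent cross-origin when credentials is "include".
    return { headers, credentials: "include" };
  }
  if (!apiKey) {
    throw new Error("Simple Key mode needs an API key");
  }
  // The key travels in the Authorization header with the Bearer scheme.
  headers["Authorization"] = `Bearer ${apiKey}`;
  return { headers };
}
```

The result can be spread into a `fetch` call, for example `fetch(url, { method: "POST", ...authRequestOptions("managed") })`.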

Row-Level Security

Queries Return Empty Results

RLS is working, but the claim value in your JWT or session does not match any value in the filtered column, so every row is filtered out. Contact your administrator to verify your tenant assignment.

Scheduled Tasks (Users)

Delivery Failures

  • Email: Verify the recipient address is correct and check spam folders
  • Slack: Ensure the Atlas bot has been added to the target channel

Runtime Error Codes

For chat and API runtime errors (rate limiting, provider failures, authentication errors, etc.), see the Error Codes Reference. That page covers every ChatErrorCode and ClientErrorCode with HTTP status, retryable classification, and fix guidance.


Operator Debugging

Self-Hosted Operators

The sections below are for operators managing their own Atlas deployment. On app.useatlas.dev, infrastructure diagnostics are handled by the Atlas platform team. If you encounter persistent issues on the hosted platform, contact support.

atlas doctor

Run atlas doctor to check your Atlas configuration and diagnose issues:

# Run a comprehensive health check — validates all services and configuration
bun run atlas -- doctor

This validates:

  • Datasource connectivity (can Atlas reach your database?)
  • Provider API key (is the LLM provider configured?)
  • Semantic layer (do entity YAML files exist?)
  • Internal database (can Atlas reach DATABASE_URL?)
  • Auth configuration (are required variables set?)
  • Sandbox backend (is isolation working?)
  • Config file (can atlas.config.ts be loaded?)

Each check reports pass, warn, or fail with actionable guidance.


Startup Errors

Atlas validates its environment on first request and reports structured diagnostic codes.

MISSING_DATASOURCE_URL

No analytics datasource configured.

# Fix: set the datasource URL
ATLAS_DATASOURCE_URL=postgresql://user:pass@host:5432/db

DB_UNREACHABLE

Cannot connect to the analytics datasource. Check:

  • Connection string format and credentials
  • Network access from Atlas to the database
  • SSL configuration (try ?sslmode=require for PostgreSQL, ?ssl=true for MySQL)
  • Firewall rules and security groups

MISSING_API_KEY

LLM provider API key not set. Atlas needs one to run the agent.

# Example for Anthropic
ATLAS_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...

MISSING_SEMANTIC_LAYER

No semantic/entities/*.yml files found. Generate them:

bun run atlas -- init                    # Profile your database
bun run atlas -- init --demo             # Use demo data instead

INVALID_SCHEMA

The ATLAS_SCHEMA value is not a valid SQL identifier or doesn't exist in the database.

# Check the schema exists
psql "$ATLAS_DATASOURCE_URL" -c "SELECT schema_name FROM information_schema.schemata"

WEAK_AUTH_SECRET

BETTER_AUTH_SECRET is shorter than 32 characters. Generate a strong secret:

openssl rand -base64 48

MISSING_AUTH_ISSUER

BYOT mode requires ATLAS_AUTH_ISSUER when ATLAS_AUTH_JWKS_URL is set.

MISSING_AUTH_PREREQ

Auth mode is set explicitly but required variables are missing. For example, ATLAS_AUTH_MODE=api-key without ATLAS_API_KEY.

ACTIONS_REQUIRE_AUTH

The action framework requires authentication. Set up any auth mode other than none.

INVALID_CONFIG

atlas.config.ts failed to load. Check for syntax errors or missing dependencies.


Connection Issues

PostgreSQL SSL

Most managed Postgres providers require SSL:

# Require SSL (recommended)
ATLAS_DATASOURCE_URL=postgresql://user:pass@host:5432/db?sslmode=require

# Skip certificate verification (development only)
ATLAS_DATASOURCE_URL=postgresql://user:pass@host:5432/db?sslmode=no-verify
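Before starting Atlas, you can sanity-check which SSL mode a connection string requests. This is a small sketch using the standard `URL` parser. `checkPostgresSslMode` and the `SslCheck` labels are hypothetical names for illustration, and the sketch only distinguishes the two modes shown above.

```typescript
type SslCheck = "ok" | "missing" | "unverified" | "other";

// Hypothetical helper: classifies the sslmode query parameter of a Postgres URL.
function checkPostgresSslMode(datasourceUrl: string): SslCheck {
  const mode = new URL(datasourceUrl).searchParams.get("sslmode");
  if (mode === null) return "missing"; // no sslmode parameter at all
  if (mode === "require") return "ok"; // recommended setting from above
  if (mode === "no-verify") return "unverified"; // development only
  return "other"; // a mode this sketch does not classify
}
```

A result of `"missing"` on a managed Postgres provider is a likely cause of DB_UNREACHABLE, and `"unverified"` should not reach production.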

MySQL Timeouts

If MySQL connections drop, check:

  • wait_timeout on the MySQL server (default 8 hours)
  • Connection pool idle timeout (configure via atlas.config.ts)
  • Network latency between Atlas and MySQL

Connection Refused

  • Verify the host and port are correct
  • Check that the database server is running
  • Ensure no firewall is blocking the connection
  • For Docker: use host.docker.internal instead of localhost to reach the host machine

SQL Validation Errors (Operator Details)

For user-facing descriptions, see User Troubleshooting above. The operator-specific details below cover remediation steps that require config access.

"Table not in whitelist" — Operator Fix

The query references a table that does not have an entity YAML file in semantic/entities/. Either:

  • Add the table: bun run atlas -- init --tables <table_name>
  • Or set ATLAS_TABLE_WHITELIST=false to disable (not recommended for production)

"Failed to parse SQL" — Operator Fix

Set ATLAS_LOG_LEVEL=debug, reproduce the error, and check the logs for the full SQL that was rejected.


Sandbox Issues

"nsjail not found"

nsjail is not on PATH and ATLAS_NSJAIL_PATH is not set.

  • Docker: Build with the nsjail stage (see examples/docker/Dockerfile)
  • Local dev: nsjail is not needed -- bun run db:up starts a sidecar container that provides production-like isolation
  • Explicit mode: If ATLAS_SANDBOX=nsjail is set, nsjail must be available (hard fail)

"Sidecar unreachable"

When ATLAS_SANDBOX_URL is set, Atlas expects the sidecar service to be running at that URL.

# Check sidecar health
curl http://localhost:8080/health

Verify the sidecar container is running (docker compose ps sandbox) and the URL is correct. Run bun run db:up to start both Postgres and the sidecar.

"Permission denied" in nsjail

nsjail requires specific Linux kernel capabilities. Check:

  • The container has SYS_ADMIN capability (for mount namespaces)
  • User namespaces are enabled in the kernel
  • The nsjail binary has correct permissions

Auth Issues (Operator)

401 Responses

See User Troubleshooting > Auth Issues for common 401 causes by auth mode.

Operator-specific remediation:

  • Verify auth environment variables are set correctly (ATLAS_API_KEY, BETTER_AUTH_SECRET, or ATLAS_AUTH_JWKS_URL depending on mode). Run atlas doctor to validate
  • Enable debug logging with ATLAS_LOG_LEVEL=debug to see auth decisions, mode detection, and role extraction for each request
  • For BYOT: confirm the JWKS endpoint is reachable from the Atlas server (curl $ATLAS_AUTH_JWKS_URL) and that ATLAS_AUTH_ISSUER matches the iss claim in your JWTs
  • Check ATLAS_AUTH_MODE is not set to an unintended value — when set explicitly, it overrides auto-detection
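For BYOT, it can help to inspect a failing token directly and compare its `iss` and `exp` claims against your configuration. The sketch below decodes the payload of a compact JWT without verifying the signature, so it is a debugging aid only and not how Atlas validates tokens. `decodeJwtPayload` and `diagnoseJwt` are hypothetical helper names, and `expectedIssuer` stands for your ATLAS_AUTH_ISSUER value.

```typescript
interface JwtClaims {
  iss?: string;
  exp?: number;
  [claim: string]: unknown;
}

// Decodes the middle (payload) segment of a compact JWT. No signature check.
function decodeJwtPayload(token: string): JwtClaims {
  const [, payload] = token.split(".");
  if (!payload || token.split(".").length !== 3) {
    throw new Error("Not a compact JWT (expected three dot-separated parts)");
  }
  // Convert base64url to standard base64 and restore padding.
  const base64 = payload.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);
  const bytes = Uint8Array.from(atob(padded), (char) => char.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes)) as JwtClaims;
}

// Lists the mismatches that commonly produce a 401 in BYOT mode.
function diagnoseJwt(
  token: string,
  expectedIssuer: string,
  nowSeconds: number = Math.floor(Date.now() / 1000)
): string[] {
  const claims = decodeJwtPayload(token);
  const problems: string[] = [];
  if (claims.iss !== expectedIssuer) {
    problems.push(`issuer mismatch: token has ${String(claims.iss)}, expected ${expectedIssuer}`);
  }
  if (typeof claims.exp === "number" && claims.exp <= nowSeconds) {
    problems.push("token expired");
  }
  return problems;
}
```

An empty result means the issuer and expiry look right, which points toward the JWKS endpoint or the signature instead.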

CORS with Managed Auth

Cross-origin managed auth requires both CORS and CSRF configuration:

ATLAS_CORS_ORIGIN=https://app.example.com
BETTER_AUTH_TRUSTED_ORIGINS=https://app.example.com
BETTER_AUTH_URL=https://api.example.com

See Authentication - Cross-origin deployment.

JWKS Fetch Failures (BYOT)

Atlas caches JWKS keys but needs to reach the endpoint on startup. If your JWKS URL is unreachable:

  • Check network access from Atlas to the identity provider
  • Verify the URL format (must be HTTPS, end in .json or /jwks)
  • Check for firewall rules blocking outbound HTTPS
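The URL format rule above can be checked before deploying. The following sketch applies the two stated constraints (HTTPS, path ending in `.json` or `/jwks`) to a candidate ATLAS_AUTH_JWKS_URL value. `jwksUrlProblems` is a hypothetical helper, and the check mirrors only what this page states, not necessarily every rule Atlas enforces.

```typescript
// Hypothetical helper: returns a list of format problems, empty if none.
function jwksUrlProblems(rawUrl: string): string[] {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return ["not a valid URL"];
  }
  const problems: string[] = [];
  if (url.protocol !== "https:") {
    problems.push("must use HTTPS");
  }
  if (!url.pathname.endsWith(".json") && !url.pathname.endsWith("/jwks")) {
    problems.push("path should end in .json or /jwks");
  }
  return problems;
}
```

This only validates the format. Reachability still has to be tested from the Atlas server itself, since firewall rules apply there.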

Scheduled Tasks (Operator)

Tasks Never Trigger

  • Verify ATLAS_SCHEDULER_ENABLED=true is set
  • Check the scheduler backend matches your deployment: bun (in-process), webhook (external cron), or vercel (Vercel Cron)
  • For bun backend: the scheduler tick loop runs every ATLAS_SCHEDULER_TICK_INTERVAL seconds (default 60). Due tasks are picked up on a tick rather than at their exact cron time, so a task can start later than scheduled by up to roughly one interval
  • For webhook backend: ensure your external cron is hitting POST /api/v1/scheduled-tasks/:id/run with the correct ATLAS_SCHEDULER_SECRET
  • For vercel backend: ensure CRON_SECRET is set and the /tick endpoint is configured in vercel.json
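The tick-based behavior of the bun backend explains why a task may appear late. As a rough illustration, the sketch below computes the first tick at or after a task's due time. It assumes ticks fire at fixed multiples of the interval counted from when the loop started, which is a simplification: the real loop's timing is not documented here, and `firstTickAtOrAfter` is a hypothetical function.

```typescript
// All times are in milliseconds since an arbitrary epoch.
function firstTickAtOrAfter(
  dueAtMs: number,
  loopStartedAtMs: number,
  tickIntervalSeconds: number = 60 // default from ATLAS_SCHEDULER_TICK_INTERVAL
): number {
  const intervalMs = tickIntervalSeconds * 1000;
  if (dueAtMs <= loopStartedAtMs) {
    return loopStartedAtMs;
  }
  // Round the elapsed time up to a whole number of ticks.
  const ticks = Math.ceil((dueAtMs - loopStartedAtMs) / intervalMs);
  return loopStartedAtMs + ticks * intervalMs;
}
```

With a loop that started 25 seconds past the minute and a 60-second interval, a task due exactly on the minute is picked up 25 seconds later under this model. Lowering the interval narrows that gap.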

Delivery Failures (Operator)

  • Email: Verify RESEND_API_KEY is set and valid. Check ATLAS_EMAIL_FROM format (default: Atlas <noreply@useatlas.dev>)
  • Slack: Ensure SLACK_BOT_TOKEN (or OAuth) is configured and the bot has chat:write scope for the target channel
  • Webhook: Check the endpoint URL is reachable. Webhook delivery includes a timeout -- check debug logs for the response status

"Lock contention" or Duplicate Runs

The scheduler uses an in-process lock to prevent overlapping ticks. Because the lock lives inside a single process, it does not coordinate across instances. If you see duplicate executions, check that only one instance is running the scheduler. For multi-instance deployments, use the webhook backend with an external scheduler service.


Action Framework (Operator)

Actions Not Appearing

  • Verify ATLAS_ACTIONS_ENABLED=true
  • Actions require authentication (any mode except none)
  • Without DATABASE_URL, actions use in-memory storage (action log is lost on restart). Set DATABASE_URL for persistent tracking
  • Check that the action plugin is loaded (built-in actions like email:send and jira:create register automatically)

Approval Stuck in "Pending"

  • The user approving must have the required role: analyst for manual mode, admin for admin-only mode
  • In admin-only mode, the requesting user cannot approve or deny their own action (separation of duties)
  • Check that the action hasn't expired or been superseded

Execution Failed

  • Check credentials: RESEND_API_KEY for email, JIRA_BASE_URL + JIRA_EMAIL + JIRA_API_TOKEN for JIRA
  • For email: verify ATLAS_EMAIL_ALLOWED_DOMAINS if set (restricts recipient domains)
  • Check debug logs for the specific error from the external service

Row-Level Security (Operator)

All Queries Blocked

  • If ATLAS_AUTH_MODE=none with RLS enabled, all queries are blocked (no user context to resolve claims). This is by design -- see Row-Level Security
  • Verify the auth mode provides claims (BYOT or managed auth)

"RLS policy requires claim X but it is missing"

The JWT doesn't contain the expected claim path. Check:

  • Claim path spelling and case (claim resolution is case-sensitive)
  • For nested paths like app_metadata.tenant, verify the JWT contains the full nested structure
  • Use ATLAS_LOG_LEVEL=debug to see the resolved claims
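The checks above come down to how a dotted claim path is walked through the JWT payload. The sketch below shows one plausible way to resolve a path such as `app_metadata.tenant`, with case-sensitive key matching. `resolveClaim` is a hypothetical function written for illustration and may differ from Atlas's own resolution logic.

```typescript
// Walks a dotted path through nested claims; returns undefined if any key is missing.
function resolveClaim(claims: Record<string, unknown>, path: string): unknown {
  let current: unknown = claims;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) {
      return undefined;
    }
    // Own-property lookup is exact, so "Tenant" does not match "tenant".
    if (!Object.prototype.hasOwnProperty.call(current, key)) {
      return undefined;
    }
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}
```

Running this against a decoded token payload shows quickly whether the claim is missing entirely, nested under a different parent, or spelled with different casing.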

Queries Return Empty Results

RLS is working, but the claim value in the JWT does not match any value in the filtered column, so every row is filtered out. Check debug logs for the injected WHERE conditions.


Debug Logging

Enable debug logs to see detailed agent actions, SQL validation steps, and request traces:

ATLAS_LOG_LEVEL=debug

Log levels: trace, debug, info (default), warn, error, fatal.

Key things to look for in debug logs:

  • Agent steps -- Tool calls, SQL generated, validation results
  • SQL validation -- Which layer rejected a query and why
  • Auth decisions -- Mode detection, role extraction, rate limit checks
  • Sandbox execution -- Command sent, stdout/stderr, exit code

Getting Help

  • GitHub Issues -- Bug reports and feature requests
  • Run atlas doctor and include the output in your issue for faster diagnosis
  • Set ATLAS_LOG_LEVEL=debug and include relevant log lines
