Connect Your Data

Connect Atlas to PostgreSQL, MySQL, BigQuery, ClickHouse, Snowflake, DuckDB, or Salesforce.

Connect Atlas to your database. Atlas has built-in support for PostgreSQL and MySQL. Additional datasources — BigQuery, ClickHouse, Snowflake, DuckDB (CSV/Parquet), and Salesforce — are available as plugins.

Using app.useatlas.dev?

On the hosted SaaS platform, go to Admin > Connections to add a datasource through the UI. You do not need to set environment variables or edit config files — the platform manages connection configuration, SSL, and pooling for you. The database-specific setup below (creating read-only users, SSL settings) still applies when preparing your database for Atlas access.

Pick your database below and follow the steps. Each section is self-contained.


PostgreSQL

Supported versions: 12+

Prerequisites:

  • Network access from Atlas to your PostgreSQL host
  • A user with SELECT and USAGE privileges (read-only recommended)

Create a read-only PostgreSQL user

Atlas enforces SELECT-only via a multi-layer SQL validation pipeline. For defense-in-depth, connect with a read-only user:

CREATE USER atlas_reader WITH PASSWORD 'your-strong-password';
GRANT CONNECT ON DATABASE your_db TO atlas_reader;
GRANT USAGE ON SCHEMA public TO atlas_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO atlas_reader;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT SELECT ON TABLES TO atlas_reader;

Set the PostgreSQL connection string

On app.useatlas.dev, paste the connection string into Admin > Connections > Add Connection. For self-hosted deployments, set the environment variable:

ATLAS_DATASOURCE_URL=postgresql://atlas_reader:your-strong-password@your-host:5432/your_db

Both postgresql:// and postgres:// prefixes are accepted.

SSL: Most managed providers (AWS RDS, Supabase, Neon, Railway) require SSL. Append ?sslmode=require:

ATLAS_DATASOURCE_URL=postgresql://atlas_reader:password@host:5432/db?sslmode=require

The pg driver (v8+) treats sslmode=require as verify-full. Atlas normalizes this automatically. For self-signed certificates, use ?sslmode=no-verify — not recommended for production.

Non-public schemas: Set ATLAS_SCHEMA to query a schema other than public:

ATLAS_SCHEMA=analytics

Atlas validates the schema name at startup (must be a valid SQL identifier — letters, digits, and underscores only) and checks that it exists via pg_namespace.
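That startup check can be sketched roughly as follows (isValidSchemaName is a hypothetical helper name, not Atlas's actual internals; the pg_namespace existence check additionally requires a live query against the database):

```typescript
// Rough sketch of the identifier rule described above (hypothetical
// helper; Atlas's real check also verifies the schema exists via
// pg_namespace). Letters, digits, and underscores only.
function isValidSchemaName(name: string): boolean {
  return /^[A-Za-z0-9_]+$/.test(name);
}
```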

Generate the PostgreSQL semantic layer

ATLAS_DATASOURCE_URL="your-connection-string" bun run atlas -- init

You should see:

Atlas Init — profiling postgres database...
  [1/N] Profiling table_name...
Done! Semantic layer for "default" is at ./semantic/

Verify: ls semantic/entities/ should list your tables as .yml files.

Verify PostgreSQL connectivity

Start the dev server and check the health endpoint:

bun run dev
curl http://localhost:3001/api/health

Confirm checks.datasource.status is "ok" and sources.default.dbType is "postgres".


MySQL

Supported versions: 8.0+

Prerequisites:

  • Network access from Atlas to your MySQL host
  • A user with SELECT privileges

Create a read-only MySQL user

CREATE USER 'atlas_reader'@'%' IDENTIFIED BY 'your-strong-password';
GRANT SELECT ON your_db.* TO 'atlas_reader'@'%';
FLUSH PRIVILEGES;

Atlas also enforces SET SESSION TRANSACTION READ ONLY on every connection as a defense-in-depth measure.

Set the MySQL connection string

On app.useatlas.dev, paste the connection string into Admin > Connections > Add Connection. For self-hosted deployments, set the environment variable:

ATLAS_DATASOURCE_URL=mysql://atlas_reader:your-strong-password@your-host:3306/your_db

Both mysql:// and mysql2:// prefixes are accepted.

SSL: Append ?ssl=true for TLS connections. For self-signed certificates: ?ssl={"rejectUnauthorized":false} — not recommended for production.

Generate the MySQL semantic layer

ATLAS_DATASOURCE_URL="your-connection-string" bun run atlas -- init

Verify: ls semantic/entities/ should list your tables.

Verify MySQL connectivity

bun run dev
curl http://localhost:3001/api/health

Confirm checks.datasource.status is "ok" and sources.default.dbType is "mysql".


BigQuery

BigQuery is a plugin-based datasource. Install the plugin and configure it in atlas.config.ts.

Prerequisites:

  • A GCP project with BigQuery enabled
  • A service account with roles/bigquery.dataViewer and roles/bigquery.jobUser
  • Network access from Atlas to Google APIs (no VPC peering needed — BigQuery uses the public API)

Install the BigQuery plugin

bun add @useatlas/bigquery @google-cloud/bigquery

Configure BigQuery in atlas.config.ts

Using a service account key file:

import { defineConfig } from "@atlas/api/lib/config";
import { bigqueryPlugin } from "@useatlas/bigquery";

export default defineConfig({
  plugins: [
    bigqueryPlugin({
      projectId: "my-gcp-project",
      dataset: "analytics",
      location: "US",
      keyFilename: process.env.GOOGLE_APPLICATION_CREDENTIALS,
    }),
  ],
});

Using inline credentials (for example, from a secret manager):

import { defineConfig } from "@atlas/api/lib/config";
import { bigqueryPlugin } from "@useatlas/bigquery";

export default defineConfig({
  plugins: [
    bigqueryPlugin({
      projectId: "my-gcp-project",
      dataset: "analytics",
      credentials: JSON.parse(process.env.GCP_CREDENTIALS!),
    }),
  ],
});

Using Application Default Credentials (ADC):

import { defineConfig } from "@atlas/api/lib/config";
import { bigqueryPlugin } from "@useatlas/bigquery";

export default defineConfig({
  plugins: [
    bigqueryPlugin({
      projectId: "my-gcp-project",
      dataset: "analytics",
    }),
  ],
});

ADC works automatically in GCE, Cloud Run, and GKE. For local development, run gcloud auth application-default login.

Options:

  • projectId (string; default: from credentials/ADC): GCP project ID
  • dataset (string): Default dataset for unqualified table references
  • location (string): Query job location (US, EU, us-east1, etc.)
  • keyFilename (string): Path to service account JSON key file
  • credentials (object): Parsed service account JSON key
  • costApproval ("auto" | "threshold" | "always"; default: "threshold"): When to gate queries on estimated cost
  • costThreshold (number; default: 1.0): USD threshold for "threshold" mode

Create the service account (if needed)

# Create service account
gcloud iam service-accounts create atlas-reader \
  --display-name="Atlas Reader"

# Grant BigQuery read access
gcloud projects add-iam-policy-binding my-gcp-project \
  --member="serviceAccount:atlas-reader@my-gcp-project.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer"

gcloud projects add-iam-policy-binding my-gcp-project \
  --member="serviceAccount:atlas-reader@my-gcp-project.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"

# Download key file
gcloud iam service-accounts keys create atlas-reader-key.json \
  --iam-account=atlas-reader@my-gcp-project.iam.gserviceaccount.com

Set the key file path in your environment:

GOOGLE_APPLICATION_CREDENTIALS=./atlas-reader-key.json

Never commit service account key files to version control. Add *.json key files to .gitignore.

Generate the BigQuery semantic layer

bun run atlas -- init

The plugin registers as the default datasource. Atlas will profile your BigQuery tables and generate semantic/entities/*.yml.

Verify BigQuery connectivity

bun run dev
curl http://localhost:3001/api/health

Confirm the health response includes a source with dbType set to "bigquery" (the plugin registers as "bigquery-datasource" in the sources object).

BigQuery runs a dry-run cost estimate before every query. By default, queries estimated above $1.00 USD are blocked. Configure this with the costThreshold and costApproval plugin options.
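The gating logic implied by costApproval and costThreshold can be sketched like this (requiresApproval is a hypothetical helper name with simplified semantics; the actual plugin may prompt for approval rather than reject outright):

```typescript
// Sketch of the cost gate described above (hypothetical helper).
// "auto" never gates, "always" always gates, and "threshold" gates
// only when the dry-run estimate exceeds the configured USD amount.
type CostApproval = "auto" | "threshold" | "always";

function requiresApproval(
  estimatedUsd: number,
  mode: CostApproval = "threshold",
  thresholdUsd = 1.0,
): boolean {
  if (mode === "auto") return false;
  if (mode === "always") return true;
  return estimatedUsd > thresholdUsd;
}
```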


ClickHouse

ClickHouse is a plugin-based datasource using the HTTP transport.

Prerequisites:

  • Network access from Atlas to your ClickHouse HTTP interface (default port: 8123 for HTTP, 8443 for HTTPS)
  • A user with SELECT privileges

Create a read-only ClickHouse user

CREATE USER atlas_reader IDENTIFIED BY 'your-strong-password';
GRANT SELECT ON your_db.* TO atlas_reader;

Atlas enforces readonly: 1 on every query via ClickHouse settings, preventing mutations even if the user has broader privileges.

Install the ClickHouse plugin

bun add @useatlas/clickhouse @clickhouse/client

Configure ClickHouse in atlas.config.ts

import { defineConfig } from "@atlas/api/lib/config";
import { clickhousePlugin } from "@useatlas/clickhouse";

export default defineConfig({
  plugins: [
    clickhousePlugin({
      url: "clickhouses://atlas_reader:your-strong-password@your-host:8443/your_db",
    }),
  ],
});

URL schemes:

  • clickhouse:// — plain HTTP (typically port 8123)
  • clickhouses:// — HTTPS/TLS (typically port 8443)

Options:

  • url (string; required): ClickHouse connection URL
  • database (string; default: from URL path): Database name override

If you use clickhouse:// with port 8443, Atlas will warn that you likely intended clickhouses:// for TLS.
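That warning amounts to a simple scheme/port check, sketched below (tlsMismatchWarning is a hypothetical helper; the plugin's actual message may differ):

```typescript
// Sketch of the scheme/port sanity check described above (hypothetical
// helper). The WHATWG URL parser handles custom schemes, so the port
// is available directly.
function tlsMismatchWarning(connectionUrl: string): string | null {
  const u = new URL(connectionUrl);
  if (u.protocol === "clickhouse:" && u.port === "8443") {
    return "Port 8443 is usually TLS; did you mean clickhouses://?";
  }
  return null;
}
```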

Verify ClickHouse connectivity

bun run dev
curl http://localhost:3001/api/health

Confirm the health response includes a source with dbType set to "clickhouse" (the plugin registers as "clickhouse-datasource" in the sources object).


Snowflake

Snowflake is a plugin-based datasource.

Prerequisites:

  • A Snowflake account with a running warehouse
  • A user/role with SELECT privileges on the target schema

Create a read-only Snowflake role

Snowflake has no session-level read-only mode. Atlas enforces SELECT-only via SQL validation, but for defense-in-depth, use a role with only SELECT privileges:

CREATE ROLE atlas_readonly;
GRANT USAGE ON WAREHOUSE your_warehouse TO ROLE atlas_readonly;
GRANT USAGE ON DATABASE your_db TO ROLE atlas_readonly;
GRANT USAGE ON SCHEMA your_db.your_schema TO ROLE atlas_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA your_db.your_schema TO ROLE atlas_readonly;
GRANT SELECT ON FUTURE TABLES IN SCHEMA your_db.your_schema TO ROLE atlas_readonly;

CREATE USER atlas_reader
  PASSWORD = 'your-strong-password'
  DEFAULT_ROLE = atlas_readonly;
GRANT ROLE atlas_readonly TO USER atlas_reader;

Install the Snowflake plugin

bun add @useatlas/snowflake snowflake-sdk

Configure Snowflake in atlas.config.ts

import { defineConfig } from "@atlas/api/lib/config";
import { snowflakePlugin } from "@useatlas/snowflake";

export default defineConfig({
  plugins: [
    snowflakePlugin({
      url: "snowflake://atlas_reader:your-strong-password@your-account/your_db/your_schema?warehouse=YOUR_WH&role=atlas_readonly",
    }),
  ],
});

URL format: snowflake://user:pass@account/database/schema?warehouse=WH&role=ROLE

  • account — plain identifier (e.g. xy12345) or fully-qualified locator (e.g. xy12345.us-east-1)
  • /database and /database/schema path segments are optional
  • warehouse and role query params are case-insensitive

Options:

  • url (string; required): Snowflake connection URL
  • maxConnections (number; default: 10): Connection pool size (max 100)

Atlas sets STATEMENT_TIMEOUT_IN_SECONDS per session and tags all queries with QUERY_TAG = 'atlas:readonly' for an audit trail in QUERY_HISTORY.
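As a rough illustration of how the URL format above decomposes (parseSnowflakeUrl is a hypothetical helper; the plugin's actual parsing may differ):

```typescript
// Sketch of the URL format described above (hypothetical helper).
// Query parameter names are matched case-insensitively, and the
// /database/schema path segments are optional.
function parseSnowflakeUrl(raw: string) {
  const u = new URL(raw);
  const params = new Map<string, string>(
    [...u.searchParams].map(([k, v]) => [k.toLowerCase(), v] as [string, string]),
  );
  const [database, schema] = u.pathname.split("/").filter(Boolean);
  return {
    account: u.hostname, // e.g. xy12345 or xy12345.us-east-1
    user: decodeURIComponent(u.username),
    database, // undefined if omitted
    schema, // undefined if omitted
    warehouse: params.get("warehouse"),
    role: params.get("role"),
  };
}
```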

Verify Snowflake connectivity

bun run dev
curl http://localhost:3001/api/health

Confirm the health response includes a source with dbType set to "snowflake" (the plugin registers as "snowflake-datasource" in the sources object).


DuckDB (CSV / Parquet)

DuckDB is an in-process analytical engine — no external database server required. Use it to query CSV or Parquet files directly.

Prerequisites:

  • CSV or Parquet files accessible from the Atlas host, or an existing .duckdb database file

Install the DuckDB plugin

bun add @useatlas/duckdb @duckdb/node-api

Option A: Ingest files with the CLI

The fastest path — Atlas creates an in-memory DuckDB, loads files, profiles them, and generates the semantic layer in one step:

# CSV files
bun run atlas -- init --csv file1.csv,file2.csv

# Parquet files
bun run atlas -- init --parquet file1.parquet,file2.parquet

No atlas.config.ts needed for this path.

Option B: Configure a persistent DuckDB database

For an existing .duckdb file, configure in atlas.config.ts:

import { defineConfig } from "@atlas/api/lib/config";
import { duckdbPlugin } from "@useatlas/duckdb";

export default defineConfig({
  plugins: [
    duckdbPlugin({ url: "duckdb:///absolute/path/to/analytics.duckdb" }),
  ],
});

URL formats:

  • duckdb:// or duckdb://:memory: — in-memory database
  • duckdb:///absolute/path.duckdb — absolute path (note the triple slash)
  • duckdb://relative/path.duckdb — relative path

Options:

  • url (string): DuckDB URL (alternative to path)
  • path (string): Direct file path or :memory:
  • readOnly (boolean; default: true for files): Open in read-only mode

File databases open in read-only mode by default. In-memory databases are writable.
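The URL conventions above can be illustrated with a small path resolver (duckdbPathFromUrl is a hypothetical helper; plain string handling sidesteps WHATWG URL quirks around :memory:):

```typescript
// Sketch of the URL conventions listed above (hypothetical helper).
// An empty authority or ":memory:" means an in-memory database; a
// leading slash after the scheme's "//" means an absolute path.
function duckdbPathFromUrl(raw: string): string {
  const rest = raw.replace(/^duckdb:\/\//, "");
  if (rest === "" || rest === ":memory:") return ":memory:";
  return rest; // "/abs/path.duckdb" (triple slash) or "rel/path.duckdb"
}
```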

Then generate the semantic layer:

bun run atlas -- init

Verify DuckDB connectivity

bun run dev
curl http://localhost:3001/api/health

Confirm the health response includes a source with dbType set to "duckdb" (the plugin registers as "duckdb-datasource" in the sources object).


Salesforce

Salesforce uses SOQL, not SQL. Atlas provides a separate querySalesforce tool. The query syntax and capabilities differ from SQL datasources — no JOIN, no SELECT *, and relationship queries use dot notation.
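For example, where SQL would JOIN contacts to accounts, SOQL reaches parent fields through dot notation (an illustrative query against the standard Contact and Account objects):

```sql
-- SOQL, not SQL: no JOIN, no SELECT *.
-- Parent (Account) fields are reached via dot notation.
SELECT Id, LastName, Account.Name
FROM Contact
WHERE Account.Industry = 'Technology'
```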

Prerequisites:

  • A Salesforce user with API access enabled in their profile
  • The user's security token (reset from Salesforce Setup > My Personal Information > Reset Security Token)

Install the Salesforce plugin

bun add @useatlas/salesforce jsforce

Configure Salesforce in atlas.config.ts

import { defineConfig } from "@atlas/api/lib/config";
import { salesforcePlugin } from "@useatlas/salesforce";

export default defineConfig({
  plugins: [
    salesforcePlugin({
      url: "salesforce://user:pass@login.salesforce.com?token=SECURITY_TOKEN",
    }),
  ],
});

URL format: salesforce://username:password@hostname?token=TOKEN&clientId=ID&clientSecret=SECRET

  • hostname: login.salesforce.com (production) or test.salesforce.com (sandbox)
  • token: Salesforce security token (appended to the password for authentication)
  • clientId / clientSecret: Optional; used for OAuth connected app authentication

Store credentials in environment variables: salesforcePlugin({ url: process.env.SALESFORCE_URL! }). Never hardcode passwords in atlas.config.ts.
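How the URL maps onto a username/password login can be sketched as follows (salesforceLoginArgs is a hypothetical helper; Salesforce expects the security token appended to the password, which is the convention jsforce's login flow uses):

```typescript
// Sketch of the URL-to-login mapping described above (hypothetical
// helper; the plugin's actual parsing may differ). The security token
// is appended to the password, per Salesforce's auth convention.
function salesforceLoginArgs(raw: string) {
  const u = new URL(raw);
  const token = u.searchParams.get("token") ?? "";
  return {
    loginUrl: `https://${u.hostname}`, // login.salesforce.com or test.salesforce.com
    username: decodeURIComponent(u.username),
    password: decodeURIComponent(u.password) + token,
  };
}
```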

Generate the Salesforce semantic layer

bun run atlas -- init

Object discovery uses describe() and listObjects() API calls rather than system catalogs.

Verify Salesforce connectivity

bun run dev
curl http://localhost:3001/api/health

Confirm the health response includes a source with dbType set to "salesforce" (the plugin registers as "salesforce-datasource" in the sources object).


Generate the semantic layer

Using app.useatlas.dev?

The hosted platform generates the semantic layer automatically when you add a connection. You can review and edit entity files in Admin > Semantic Layer.

After configuring any datasource (self-hosted), generate the YAML files the agent reads before writing queries:

bun run atlas -- init

This queries system catalogs and generates:

  • semantic/entities/*.yml — one file per table with columns, types, sample values, joins, measures, and query patterns
  • semantic/catalog.yml — entry point listing all entities
  • semantic/glossary.yml — auto-detected ambiguous terms and enum definitions
  • semantic/metrics/*.yml — per-table metric definitions

Useful flags

  • --tables users,orders: Profile specific tables only
  • --enrich: Use your LLM to add richer descriptions and query patterns
  • --no-enrich: Skip LLM enrichment explicitly
  • --csv file1.csv,file2.csv: Load CSV files into DuckDB
  • --parquet file1.parquet: Load Parquet files into DuckDB

The generated YAMLs are a starting point. Review them and add business context, remove tables that should not be queryable, and adjust sample values for sensitive columns. Better YAMLs produce better queries.

See Semantic Layer for a deep dive, or CLI Reference for all init flags.


Multi-source configuration

For a comprehensive guide on how the agent routes queries across datasources, semantic layer organization per source, cross-source relationships, and troubleshooting, see Multi-Datasource Routing.

To query multiple datasources from a single Atlas deployment, list them in atlas.config.ts:

import { defineConfig } from "@atlas/api/lib/config";
import { clickhousePlugin } from "@useatlas/clickhouse";
import { snowflakePlugin } from "@useatlas/snowflake";
import { duckdbPlugin } from "@useatlas/duckdb";

export default defineConfig({
  datasources: {
    default: { url: process.env.ATLAS_DATASOURCE_URL! },
  },
  plugins: [
    clickhousePlugin({ url: "clickhouses://user:pass@host:8443/analytics" }),
    snowflakePlugin({ url: "snowflake://user:pass@account/db/schema?warehouse=WH" }),
    duckdbPlugin({ url: "duckdb:///data/reports.duckdb" }),
  ],
  tools: ["explore", "executeSQL"],
  auth: "auto",
  semanticLayer: "./semantic",
});

The agent's executeSQL tool accepts an optional connectionId parameter to target a specific datasource. The "default" datasource is used when no ID is specified. Plugins register under their own IDs: "clickhouse-datasource", "snowflake-datasource", "duckdb-datasource", etc.
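The routing rule can be sketched as a registry lookup with a "default" fallback (hypothetical types; the real registry lives inside Atlas):

```typescript
// Sketch of the connectionId routing described above (hypothetical
// types). Unknown IDs fail loudly rather than silently falling back.
interface Datasource {
  id: string;
}

function resolveDatasource(
  sources: Map<string, Datasource>,
  connectionId?: string,
): Datasource {
  const id = connectionId ?? "default";
  const ds = sources.get(id);
  if (!ds) throw new Error(`Unknown connectionId: ${id}`);
  return ds;
}
```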


Safety configuration

Using app.useatlas.dev?

Safety limits are pre-configured on the hosted platform. You can adjust row limits and query timeouts in Admin > Settings.

Atlas enforces several safety limits. Self-hosted operators can tune them via environment variables:

  • ATLAS_TABLE_WHITELIST (default: true): Only allow queries against tables in semantic/entities/*.yml
  • ATLAS_ROW_LIMIT (default: 1000): Maximum rows returned per query (auto-appended as LIMIT)
  • ATLAS_QUERY_TIMEOUT (default: 30000): Per-query timeout in milliseconds. Enforced via SET statement_timeout (PostgreSQL), MAX_EXECUTION_TIME (MySQL), max_execution_time (ClickHouse), or STATEMENT_TIMEOUT_IN_SECONDS (Snowflake)

Non-SELECT SQL (INSERT, UPDATE, DELETE, DROP, etc.) is always rejected. There is no toggle to disable this.
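Together, the row limit and the SELECT-only rule amount to logic along these lines (hypothetical helpers; Atlas's real pipeline is a multi-layer validator, not a prefix check):

```typescript
// Sketch of the two guardrails described above (hypothetical helpers).
// Non-SELECT statements are rejected; a LIMIT is appended when the
// query does not already end with one.
function assertSelectOnly(sql: string): void {
  // Allow CTEs (WITH ... SELECT); everything else is rejected.
  if (!/^\s*(select|with)\b/i.test(sql)) {
    throw new Error("Only SELECT statements are allowed");
  }
}

function withRowLimit(sql: string, rowLimit = 1000): string {
  return /\blimit\s+\d+\s*;?\s*$/i.test(sql)
    ? sql
    : `${sql} LIMIT ${rowLimit}`;
}
```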

Read-only enforcement by datasource

  • PostgreSQL: SQL validation (regex + AST)
  • MySQL: SQL validation + SET SESSION TRANSACTION READ ONLY
  • BigQuery: SQL validation + dry-run cost gate
  • ClickHouse: SQL validation + readonly: 1 per-query setting
  • Snowflake: SQL validation (use a SELECT-only role for defense-in-depth)
  • DuckDB: SQL validation + READ_ONLY open mode
  • Salesforce: SOQL is inherently read-only

Example .env:

# .env
ATLAS_TABLE_WHITELIST=true
ATLAS_ROW_LIMIT=500
ATLAS_QUERY_TIMEOUT=15000

Lower the row limit to reduce load. Reduce the timeout to kill runaway queries faster.

For multi-tenant deployments, see Row-Level Security.

