Connect Your Data
Connect Atlas to PostgreSQL, MySQL, BigQuery, ClickHouse, Snowflake, DuckDB, or Salesforce.
Connect Atlas to your database. Atlas has built-in support for PostgreSQL and MySQL. Additional datasources — BigQuery, ClickHouse, Snowflake, DuckDB (CSV/Parquet), and Salesforce — are available as plugins.
Using app.useatlas.dev?
On the hosted SaaS platform, go to Admin > Connections to add a datasource through the UI. You do not need to set environment variables or edit config files — the platform manages connection configuration, SSL, and pooling for you. The database-specific setup below (creating read-only users, SSL settings) still applies when preparing your database for Atlas access.
Pick your database below and follow the steps. Each section is self-contained.
PostgreSQL
Supported versions: 12+
Prerequisites:
- Network access from Atlas to your PostgreSQL host
- A user with `SELECT` and `USAGE` privileges (read-only recommended)
Create a read-only PostgreSQL user
Atlas enforces SELECT-only via a multi-layer SQL validation pipeline. For defense-in-depth, connect with a read-only user:
```sql
CREATE USER atlas_reader WITH PASSWORD 'your-strong-password';
GRANT CONNECT ON DATABASE your_db TO atlas_reader;
GRANT USAGE ON SCHEMA public TO atlas_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO atlas_reader;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT SELECT ON TABLES TO atlas_reader;
```

Set the PostgreSQL connection string
On app.useatlas.dev, paste the connection string into Admin > Connections > Add Connection. For self-hosted deployments, set the environment variable:
```shell
ATLAS_DATASOURCE_URL=postgresql://atlas_reader:your-strong-password@your-host:5432/your_db
```

Both `postgresql://` and `postgres://` prefixes are accepted.
SSL: Most managed providers (AWS RDS, Supabase, Neon, Railway) require SSL. Append ?sslmode=require:
```shell
ATLAS_DATASOURCE_URL=postgresql://atlas_reader:password@host:5432/db?sslmode=require
```

The pg driver (v8+) treats `sslmode=require` as `verify-full`. Atlas normalizes this automatically. For self-signed certificates, use `?sslmode=no-verify` — not recommended for production.
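If you assemble connection strings programmatically, a small helper can add the parameter without clobbering an existing query string. This is a hypothetical sketch (`withSslRequire` is not part of Atlas):

```typescript
// Hypothetical helper: ensure sslmode=require is present on a PostgreSQL
// connection string, preserving any query parameters already set.
function withSslRequire(connectionString: string): string {
  const url = new URL(connectionString);
  url.searchParams.set("sslmode", "require");
  return url.toString();
}

console.log(withSslRequire("postgresql://u:p@host:5432/db?application_name=atlas"));
// postgresql://u:p@host:5432/db?application_name=atlas&sslmode=require
```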
Non-public schemas: Set ATLAS_SCHEMA to query a schema other than public:
```shell
ATLAS_SCHEMA=analytics
```

Atlas validates the schema name at startup (must be a valid SQL identifier — letters, digits, and underscores only) and checks that it exists via `pg_namespace`.
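The identifier rule can be sketched as a one-line check. This is a simplified illustration, not Atlas's actual validation code; it additionally rejects a leading digit, per unquoted SQL identifier rules:

```typescript
// Sketch: accept only letters, digits, and underscores, not starting
// with a digit (unquoted SQL identifier rules).
function isValidSchemaName(name: string): boolean {
  return /^[A-Za-z_][A-Za-z0-9_]*$/.test(name);
}

console.log(isValidSchemaName("analytics"));   // true
console.log(isValidSchemaName("my-schema"));   // false (hyphen)
console.log(isValidSchemaName("public; --"));  // false (injection attempt)
```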
Generate the PostgreSQL semantic layer
```shell
ATLAS_DATASOURCE_URL="your-connection-string" bun run atlas -- init
```

You should see:

```
Atlas Init — profiling postgres database...
[1/N] Profiling table_name...
Done! Semantic layer for "default" is at ./semantic/
```

Verify: `ls semantic/entities/` should list your tables as `.yml` files.
Verify PostgreSQL connectivity
Start the dev server and check the health endpoint:
```shell
bun run dev
curl http://localhost:3001/api/health
```

Confirm `checks.datasource.status` is `"ok"` and `sources.default.dbType` is `"postgres"`.
MySQL
Supported versions: 8.0+
Prerequisites:
- Network access from Atlas to your MySQL host
- A user with `SELECT` privileges
Create a read-only MySQL user
```sql
CREATE USER 'atlas_reader'@'%' IDENTIFIED BY 'your-strong-password';
GRANT SELECT ON your_db.* TO 'atlas_reader'@'%';
FLUSH PRIVILEGES;
```

Atlas also enforces `SET SESSION TRANSACTION READ ONLY` on every connection as a defense-in-depth measure.
Set the MySQL connection string
On app.useatlas.dev, paste the connection string into Admin > Connections > Add Connection. For self-hosted deployments, set the environment variable:
```shell
ATLAS_DATASOURCE_URL=mysql://atlas_reader:your-strong-password@your-host:3306/your_db
```

Both `mysql://` and `mysql2://` prefixes are accepted.
SSL: Append `?ssl=true` for TLS connections. For self-signed certificates: `?ssl={"rejectUnauthorized":false}` — not recommended for production.
Generate the MySQL semantic layer
```shell
ATLAS_DATASOURCE_URL="your-connection-string" bun run atlas -- init
```

Verify: `ls semantic/entities/` should list your tables.
Verify MySQL connectivity
```shell
bun run dev
curl http://localhost:3001/api/health
```

Confirm `checks.datasource.status` is `"ok"` and `sources.default.dbType` is `"mysql"`.
BigQuery
BigQuery is a plugin-based datasource. Install the plugin and configure it in atlas.config.ts.
Prerequisites:
- A GCP project with BigQuery enabled
- A service account with `roles/bigquery.dataViewer` and `roles/bigquery.jobUser`
- Network access from Atlas to Google APIs (no VPC peering needed — BigQuery uses the public API)
Install the BigQuery plugin
```shell
bun add @useatlas/bigquery @google-cloud/bigquery
```

Configure BigQuery in atlas.config.ts
Option A: service account key file.

```typescript
import { defineConfig } from "@atlas/api/lib/config";
import { bigqueryPlugin } from "@useatlas/bigquery";

export default defineConfig({
  plugins: [
    bigqueryPlugin({
      projectId: "my-gcp-project",
      dataset: "analytics",
      location: "US",
      keyFilename: process.env.GOOGLE_APPLICATION_CREDENTIALS,
    }),
  ],
});
```

Option B: inline credentials.

```typescript
import { defineConfig } from "@atlas/api/lib/config";
import { bigqueryPlugin } from "@useatlas/bigquery";

export default defineConfig({
  plugins: [
    bigqueryPlugin({
      projectId: "my-gcp-project",
      dataset: "analytics",
      credentials: JSON.parse(process.env.GCP_CREDENTIALS!),
    }),
  ],
});
```

Option C: Application Default Credentials (ADC).

```typescript
import { defineConfig } from "@atlas/api/lib/config";
import { bigqueryPlugin } from "@useatlas/bigquery";

export default defineConfig({
  plugins: [
    bigqueryPlugin({
      projectId: "my-gcp-project",
      dataset: "analytics",
    }),
  ],
});
```

ADC works automatically in GCE, Cloud Run, and GKE. For local development, run `gcloud auth application-default login`.
| Option | Type | Default | Description |
|---|---|---|---|
| `projectId` | string | from credentials/ADC | GCP project ID |
| `dataset` | string | — | Default dataset for unqualified table references |
| `location` | string | — | Query job location (`US`, `EU`, `us-east1`, etc.) |
| `keyFilename` | string | — | Path to service account JSON key file |
| `credentials` | object | — | Parsed service account JSON key |
| `costApproval` | `"auto" \| "threshold" \| "always"` | `"threshold"` | When to gate queries on estimated cost |
| `costThreshold` | number | `1.0` | USD threshold for `"threshold"` mode |
Create the service account (if needed)
```shell
# Create service account
gcloud iam service-accounts create atlas-reader \
  --display-name="Atlas Reader"

# Grant BigQuery read access
gcloud projects add-iam-policy-binding my-gcp-project \
  --member="serviceAccount:atlas-reader@my-gcp-project.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer"

gcloud projects add-iam-policy-binding my-gcp-project \
  --member="serviceAccount:atlas-reader@my-gcp-project.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"

# Download key file
gcloud iam service-accounts keys create atlas-reader-key.json \
  --iam-account=atlas-reader@my-gcp-project.iam.gserviceaccount.com
```

Set the key file path in your environment:

```shell
GOOGLE_APPLICATION_CREDENTIALS=./atlas-reader-key.json
```

Never commit service account key files to version control. Add `*.json` key files to `.gitignore`.
Generate the BigQuery semantic layer
```shell
bun run atlas -- init
```

The plugin registers as the default datasource. Atlas will profile your BigQuery tables and generate `semantic/entities/*.yml`.
Verify BigQuery connectivity
```shell
bun run dev
curl http://localhost:3001/api/health
```

Confirm the health response includes a source with `dbType` set to `"bigquery"` (the plugin registers as `"bigquery-datasource"` in the `sources` object).
BigQuery runs a dry-run cost estimate before every query. By default, queries estimated above $1.00 USD are blocked. Configure this with the costThreshold and costApproval plugin options.
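To illustrate the arithmetic behind that gate: a dry-run returns estimated bytes scanned, and cost follows from the on-demand rate. The sketch below assumes roughly $6.25 per TiB (check current BigQuery pricing) and is not the plugin's code:

```typescript
// Sketch of the dry-run cost gate described above.
// USD_PER_TIB is an assumed on-demand rate, not an Atlas constant.
const USD_PER_TIB = 6.25;

function estimatedCostUsd(estimatedBytes: number): number {
  return (estimatedBytes / 2 ** 40) * USD_PER_TIB;
}

function shouldBlock(estimatedBytes: number, thresholdUsd = 1.0): boolean {
  return estimatedCostUsd(estimatedBytes) > thresholdUsd;
}

const bytes500GiB = 500 * 2 ** 30;
console.log(estimatedCostUsd(bytes500GiB).toFixed(2)); // "3.05"
console.log(shouldBlock(bytes500GiB));                 // true: above the $1.00 default
```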
ClickHouse
ClickHouse is a plugin-based datasource using the HTTP transport.
Prerequisites:
- Network access from Atlas to your ClickHouse HTTP interface (default port: 8123 for HTTP, 8443 for HTTPS)
- A user with `SELECT` privileges
Create a read-only ClickHouse user
```sql
CREATE USER atlas_reader IDENTIFIED BY 'your-strong-password';
GRANT SELECT ON your_db.* TO atlas_reader;
```

Atlas enforces `readonly: 1` on every query via ClickHouse settings, preventing mutations even if the user has broader privileges.
Install the ClickHouse plugin
```shell
bun add @useatlas/clickhouse @clickhouse/client
```

Configure ClickHouse in atlas.config.ts
```typescript
import { defineConfig } from "@atlas/api/lib/config";
import { clickhousePlugin } from "@useatlas/clickhouse";

export default defineConfig({
  plugins: [
    clickhousePlugin({
      url: "clickhouses://atlas_reader:your-strong-password@your-host:8443/your_db",
    }),
  ],
});
```

URL schemes:

- `clickhouse://` — plain HTTP (typically port 8123)
- `clickhouses://` — HTTPS/TLS (typically port 8443)
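The scheme-to-endpoint mapping can be illustrated with a short sketch (a hypothetical helper, not the plugin's implementation):

```typescript
// Map a clickhouse:// or clickhouses:// URL to the HTTP endpoint it implies,
// falling back to the typical default ports when none is given.
function toHttpEndpoint(connUrl: string): string {
  const url = new URL(connUrl);
  const secure = url.protocol === "clickhouses:";
  const port = url.port || (secure ? "8443" : "8123");
  return `${secure ? "https" : "http"}://${url.hostname}:${port}`;
}

console.log(toHttpEndpoint("clickhouse://atlas_reader:pw@ch.internal/db"));
// http://ch.internal:8123
console.log(toHttpEndpoint("clickhouses://atlas_reader:pw@ch.example.com:8443/db"));
// https://ch.example.com:8443
```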
| Option | Type | Default | Description |
|---|---|---|---|
| `url` | string | required | ClickHouse connection URL |
| `database` | string | from URL path | Database name override |
If you use `clickhouse://` with port 8443, Atlas will warn that you likely intended `clickhouses://` for TLS.
Generate the ClickHouse semantic layer
```shell
bun run atlas -- init
```

Verify ClickHouse connectivity
```shell
bun run dev
curl http://localhost:3001/api/health
```

Confirm the health response includes a source with `dbType` set to `"clickhouse"` (the plugin registers as `"clickhouse-datasource"` in the `sources` object).
Snowflake
Snowflake is a plugin-based datasource.
Prerequisites:
- A Snowflake account with a running warehouse
- A user/role with `SELECT` privileges on the target schema
Create a read-only Snowflake role
Snowflake has no session-level read-only mode. Atlas enforces SELECT-only via SQL validation, but for defense-in-depth, use a role with only SELECT privileges:
```sql
CREATE ROLE atlas_readonly;
GRANT USAGE ON WAREHOUSE your_warehouse TO ROLE atlas_readonly;
GRANT USAGE ON DATABASE your_db TO ROLE atlas_readonly;
GRANT USAGE ON SCHEMA your_db.your_schema TO ROLE atlas_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA your_db.your_schema TO ROLE atlas_readonly;
GRANT SELECT ON FUTURE TABLES IN SCHEMA your_db.your_schema TO ROLE atlas_readonly;

CREATE USER atlas_reader
  PASSWORD = 'your-strong-password'
  DEFAULT_ROLE = atlas_readonly;
GRANT ROLE atlas_readonly TO USER atlas_reader;
```

Install the Snowflake plugin
```shell
bun add @useatlas/snowflake snowflake-sdk
```

Configure Snowflake in atlas.config.ts
```typescript
import { defineConfig } from "@atlas/api/lib/config";
import { snowflakePlugin } from "@useatlas/snowflake";

export default defineConfig({
  plugins: [
    snowflakePlugin({
      url: "snowflake://atlas_reader:your-strong-password@your-account/your_db/your_schema?warehouse=YOUR_WH&role=atlas_readonly",
    }),
  ],
});
```

URL format: `snowflake://user:pass@account/database/schema?warehouse=WH&role=ROLE`

- `account` — plain identifier (e.g. `xy12345`) or fully-qualified locator (e.g. `xy12345.us-east-1`)
- `/database` and `/database/schema` path segments are optional
- `warehouse` and `role` query params are case-insensitive
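One way to read that format is sketched below. This is a hypothetical parser for illustration; the plugin's own parsing may differ:

```typescript
// Sketch: split a snowflake:// URL into its documented components.
// Query param keys are lowercased first, since they are case-insensitive.
function parseSnowflakeUrl(raw: string) {
  const url = new URL(raw);
  const [database, schema] = url.pathname.split("/").filter(Boolean);
  const params = new Map(
    [...url.searchParams].map(([k, v]) => [k.toLowerCase(), v]),
  );
  return {
    account: url.hostname,       // e.g. "xy12345" or "xy12345.us-east-1"
    database,                    // undefined if the path segment is omitted
    schema,                      // undefined if the path segment is omitted
    warehouse: params.get("warehouse"),
    role: params.get("role"),
  };
}

const parsed = parseSnowflakeUrl(
  "snowflake://atlas_reader:pw@xy12345.us-east-1/your_db/your_schema?WAREHOUSE=WH&role=atlas_readonly",
);
console.log(parsed.account);   // xy12345.us-east-1
console.log(parsed.warehouse); // WH
```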
| Option | Type | Default | Description |
|---|---|---|---|
| `url` | string | required | Snowflake connection URL |
| `maxConnections` | number | 10 | Connection pool size (max 100) |
Atlas sets `STATEMENT_TIMEOUT_IN_SECONDS` per session and tags all queries with `QUERY_TAG = 'atlas:readonly'` for an audit trail in `QUERY_HISTORY`.
Generate the Snowflake semantic layer
```shell
bun run atlas -- init
```

Verify Snowflake connectivity
```shell
bun run dev
curl http://localhost:3001/api/health
```

Confirm the health response includes a source with `dbType` set to `"snowflake"` (the plugin registers as `"snowflake-datasource"` in the `sources` object).
DuckDB (CSV / Parquet)
DuckDB is an in-process analytical engine — no external database server required. Use it to query CSV or Parquet files directly.
Prerequisites:
- CSV or Parquet files accessible from the Atlas host, or an existing `.duckdb` database file
Install the DuckDB plugin
```shell
bun add @useatlas/duckdb @duckdb/node-api
```

Option A: Ingest files with the CLI
The fastest path — Atlas creates an in-memory DuckDB, loads files, profiles them, and generates the semantic layer in one step:
```shell
# CSV files
bun run atlas -- init --csv file1.csv,file2.csv

# Parquet files
bun run atlas -- init --parquet file1.parquet,file2.parquet
```

No atlas.config.ts needed for this path.
Option B: Configure a persistent DuckDB database
For an existing .duckdb file, configure in atlas.config.ts:
```typescript
import { defineConfig } from "@atlas/api/lib/config";
import { duckdbPlugin } from "@useatlas/duckdb";

export default defineConfig({
  plugins: [
    duckdbPlugin({ url: "duckdb:///absolute/path/to/analytics.duckdb" }),
  ],
});
```

URL formats:

- `duckdb://` or `duckdb://:memory:` — in-memory database
- `duckdb:///absolute/path.duckdb` — absolute path (note the triple slash)
- `duckdb://relative/path.duckdb` — relative path
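The three forms can be read as follows. This is a simplified sketch of the mapping, not the plugin's parser:

```typescript
// Sketch: resolve a duckdb:// URL to the path it implies. An empty remainder
// or ":memory:" means an in-memory database; a leading slash (from the
// triple-slash form) means an absolute path, otherwise the path is relative.
function duckdbPathFromUrl(raw: string): string {
  const rest = raw.replace(/^duckdb:\/\//, "");
  if (rest === "" || rest === ":memory:") return ":memory:";
  return rest;
}

console.log(duckdbPathFromUrl("duckdb://"));                       // :memory:
console.log(duckdbPathFromUrl("duckdb:///data/analytics.duckdb")); // /data/analytics.duckdb
console.log(duckdbPathFromUrl("duckdb://reports/q3.duckdb"));      // reports/q3.duckdb
```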
| Option | Type | Default | Description |
|---|---|---|---|
| `url` | string | — | DuckDB URL (alternative to `path`) |
| `path` | string | — | Direct file path or `:memory:` |
| `readOnly` | boolean | `true` for files | Open in read-only mode |
File databases open in read-only mode by default. In-memory databases are writable.
Then generate the semantic layer:
```shell
bun run atlas -- init
```

Verify DuckDB connectivity
```shell
bun run dev
curl http://localhost:3001/api/health
```

Confirm the health response includes a source with `dbType` set to `"duckdb"` (the plugin registers as `"duckdb-datasource"` in the `sources` object).
Salesforce
Salesforce uses SOQL, not SQL. Atlas provides a separate querySalesforce tool. The query syntax and capabilities differ from SQL datasources — no JOIN, no SELECT *, and relationship queries use dot notation.
Prerequisites:
- A Salesforce user with API access enabled in their profile
- The user's security token (reset from Salesforce Setup > My Personal Information > Reset Security Token)
Install the Salesforce plugin
```shell
bun add @useatlas/salesforce jsforce
```

Configure Salesforce in atlas.config.ts
```typescript
import { defineConfig } from "@atlas/api/lib/config";
import { salesforcePlugin } from "@useatlas/salesforce";

export default defineConfig({
  plugins: [
    salesforcePlugin({
      url: "salesforce://user:pass@login.salesforce.com?token=SECURITY_TOKEN",
    }),
  ],
});
```

URL format: `salesforce://username:password@hostname?token=TOKEN&clientId=ID&clientSecret=SECRET`
| Component | Description |
|---|---|
| `hostname` | `login.salesforce.com` (production) or `test.salesforce.com` (sandbox) |
| `token` | Salesforce security token (appended to the password for authentication) |
| `clientId` / `clientSecret` | Optional — for OAuth connected app authentication |
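Putting those components together: username/password authentication appends the security token to the password, as jsforce-style clients expect. A hypothetical sketch of reading the URL (not the plugin's code):

```typescript
// Sketch: derive login credentials from a salesforce:// URL.
// The security token, when present, is appended to the password.
function salesforceCredentials(raw: string) {
  const url = new URL(raw);
  const token = url.searchParams.get("token") ?? "";
  return {
    loginUrl: `https://${url.hostname}`,
    username: decodeURIComponent(url.username),
    password: decodeURIComponent(url.password) + token,
  };
}

const creds = salesforceCredentials(
  "salesforce://user%40example.com:pass@login.salesforce.com?token=TOK123",
);
console.log(creds.username); // user@example.com
console.log(creds.password); // passTOK123
```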
Store credentials in environment variables: `salesforcePlugin({ url: process.env.SALESFORCE_URL! })`. Never hardcode passwords in atlas.config.ts.
Generate the Salesforce semantic layer
```shell
bun run atlas -- init
```

Object discovery uses `describe()` and `listObjects()` API calls rather than system catalogs.
Verify Salesforce connectivity
```shell
bun run dev
curl http://localhost:3001/api/health
```

Confirm the health response includes a source with `dbType` set to `"salesforce"` (the plugin registers as `"salesforce-datasource"` in the `sources` object).
Generate the semantic layer
Using app.useatlas.dev?
The hosted platform generates the semantic layer automatically when you add a connection. You can review and edit entity files in Admin > Semantic Layer.
After configuring any datasource (self-hosted), generate the YAML files the agent reads before writing queries:
```shell
bun run atlas -- init
```

This queries system catalogs and generates:

- `semantic/entities/*.yml` — one file per table with columns, types, sample values, joins, measures, and query patterns
- `semantic/catalog.yml` — entry point listing all entities
- `semantic/glossary.yml` — auto-detected ambiguous terms and enum definitions
- `semantic/metrics/*.yml` — per-table metric definitions
Useful flags
| Flag | Description |
|---|---|
| `--tables users,orders` | Profile specific tables only |
| `--enrich` | Use your LLM to add richer descriptions and query patterns |
| `--no-enrich` | Skip LLM enrichment explicitly |
| `--csv file1.csv,file2.csv` | Load CSV files into DuckDB |
| `--parquet file1.parquet` | Load Parquet files into DuckDB |
The generated YAMLs are a starting point. Review them and add business context, remove tables that should not be queryable, and adjust sample values for sensitive columns. Better YAMLs produce better queries.
See Semantic Layer for a deep dive, or CLI Reference for all init flags.
Multi-source configuration
For a comprehensive guide on how the agent routes queries across datasources, semantic layer organization per source, cross-source relationships, and troubleshooting, see Multi-Datasource Routing.
To query multiple datasources from a single Atlas deployment, list them in atlas.config.ts:
```typescript
import { defineConfig } from "@atlas/api/lib/config";
import { clickhousePlugin } from "@useatlas/clickhouse";
import { snowflakePlugin } from "@useatlas/snowflake";
import { duckdbPlugin } from "@useatlas/duckdb";

export default defineConfig({
  datasources: {
    default: { url: process.env.ATLAS_DATASOURCE_URL! },
  },
  plugins: [
    clickhousePlugin({ url: "clickhouses://user:pass@host:8443/analytics" }),
    snowflakePlugin({ url: "snowflake://user:pass@account/db/schema?warehouse=WH" }),
    duckdbPlugin({ url: "duckdb:///data/reports.duckdb" }),
  ],
  tools: ["explore", "executeSQL"],
  auth: "auto",
  semanticLayer: "./semantic",
});
```

The agent's `executeSQL` tool accepts an optional `connectionId` parameter to target a specific datasource. The `"default"` datasource is used when no ID is specified. Plugins register under their own IDs: `"clickhouse-datasource"`, `"snowflake-datasource"`, `"duckdb-datasource"`, etc.
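That routing amounts to a map lookup with "default" as the fallback key, illustrated here with hypothetical names (this is not Atlas's internal code):

```typescript
// Sketch of connectionId resolution for executeSQL, per the description above.
const sources = new Map<string, { dbType: string }>([
  ["default", { dbType: "postgres" }],
  ["clickhouse-datasource", { dbType: "clickhouse" }],
  ["snowflake-datasource", { dbType: "snowflake" }],
  ["duckdb-datasource", { dbType: "duckdb" }],
]);

function resolveSource(connectionId?: string) {
  const id = connectionId ?? "default";
  const source = sources.get(id);
  if (!source) throw new Error(`Unknown connectionId: ${id}`);
  return source;
}

console.log(resolveSource().dbType);                       // postgres
console.log(resolveSource("snowflake-datasource").dbType); // snowflake
```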
Safety configuration
Using app.useatlas.dev?
Safety limits are pre-configured on the hosted platform. You can adjust row limits and query timeouts in Admin > Settings.
Atlas enforces several safety limits. Self-hosted operators can tune them via environment variables:
| Variable | Default | Description |
|---|---|---|
| `ATLAS_TABLE_WHITELIST` | `true` | Only allow queries against tables in `semantic/entities/*.yml` |
| `ATLAS_ROW_LIMIT` | `1000` | Maximum rows returned per query (auto-appended as `LIMIT`) |
| `ATLAS_QUERY_TIMEOUT` | `30000` | Per-query timeout in milliseconds. Enforced via `SET statement_timeout` (PostgreSQL), `MAX_EXECUTION_TIME` (MySQL), `max_execution_time` (ClickHouse), or `STATEMENT_TIMEOUT_IN_SECONDS` (Snowflake) |
Non-SELECT SQL (INSERT, UPDATE, DELETE, DROP, etc.) is always rejected. There is no toggle to disable this.
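A naive sketch of these two guarantees (reject non-SELECT statements, append a row limit) is shown below. The real pipeline uses proper SQL validation (regex plus AST), not this simplified check, which would wrongly reject a `WITH ... SELECT`, for example:

```typescript
// Naive sketch: allow only statements that start with SELECT, and append
// a LIMIT clause when the query does not already end in one.
function guardQuery(sql: string, rowLimit = 1000): string {
  const trimmed = sql.trim().replace(/;+\s*$/, "");
  if (!/^select\b/i.test(trimmed)) {
    throw new Error("Only SELECT statements are allowed");
  }
  return /\blimit\s+\d+\s*$/i.test(trimmed)
    ? trimmed
    : `${trimmed} LIMIT ${rowLimit}`;
}

console.log(guardQuery("SELECT * FROM users"));
// SELECT * FROM users LIMIT 1000
```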
Read-only enforcement by datasource
| Datasource | Read-only mechanism |
|---|---|
| PostgreSQL | SQL validation (regex + AST) |
| MySQL | SQL validation + SET SESSION TRANSACTION READ ONLY |
| BigQuery | SQL validation + dry-run cost gate |
| ClickHouse | SQL validation + readonly: 1 per-query setting |
| Snowflake | SQL validation (use a SELECT-only role for defense-in-depth) |
| DuckDB | SQL validation + READ_ONLY open mode |
| Salesforce | SOQL is inherently read-only |
Recommended production settings
```shell
# .env
ATLAS_TABLE_WHITELIST=true
ATLAS_ROW_LIMIT=500
ATLAS_QUERY_TIMEOUT=15000
```

Lower the row limit to reduce load. Reduce the timeout to kill runaway queries faster.
For multi-tenant deployments, see Row-Level Security.
See Also
- Multi-Datasource Routing — Connect multiple databases in a single deployment
- Configuration — Declarative datasource configuration in `atlas.config.ts`
- Semantic Layer — Generate entity YAML files from your connected database
- CLI Reference — `atlas init` command for profiling your database
- Row-Level Security — Automatic data isolation per tenant