Datasource Plugins
Connect ClickHouse, DuckDB, MySQL, Snowflake, and Salesforce as Atlas query targets.
Atlas supports PostgreSQL and MySQL natively via ATLAS_DATASOURCE_URL. For other databases and APIs, use datasource plugins. Each plugin provides a connection factory, SQL dialect hints for the agent, query validation rules, and health checks.
All datasource plugins follow the same pattern:
- Install the database driver (optional peer dependency)
- Import the plugin factory in atlas.config.ts
- Pass connection credentials via environment variables
Never commit credentials to version control. Use environment variables (process.env.CLICKHOUSE_URL) in atlas.config.ts and add .env to .gitignore.
ClickHouse
Connects to a ClickHouse instance via the HTTP transport provided by @clickhouse/client. The HTTP transport is stateless -- a new client is created per call, so no connection pooling is needed. Every query is executed with readonly: 1 at the ClickHouse settings level, providing database-enforced read-only access.
Installation
```sh
bun add @clickhouse/client
```

Configuration
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | -- | Connection URL. Must start with clickhouse:// (plain) or clickhouses:// (TLS) |
| database | string | No | from URL path | Database name override |
The plugin rewrites clickhouse:// to http:// and clickhouses:// to https:// for the HTTP transport internally.
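A minimal sketch of that scheme rewrite (the helper name is illustrative, not part of the plugin's public API):

```typescript
// Rewrite the clickhouse:// / clickhouses:// scheme to the HTTP scheme
// expected by @clickhouse/client. Illustrative only.
function toHttpUrl(url: string): string {
  if (url.startsWith("clickhouses://")) {
    return "https://" + url.slice("clickhouses://".length); // TLS
  }
  if (url.startsWith("clickhouse://")) {
    return "http://" + url.slice("clickhouse://".length); // plain
  }
  throw new Error("URL must start with clickhouse:// or clickhouses://");
}
```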
Usage
```ts
// atlas.config.ts
import { defineConfig } from "@atlas/api/lib/config";
import { clickhousePlugin } from "@atlas/plugin-clickhouse-datasource";

export default defineConfig({
  plugins: [
    clickhousePlugin({
      url: process.env.CLICKHOUSE_URL!,
      database: "analytics",
    }),
  ],
});
```

SQL Dialect
The following hints are injected into the agent's system prompt so it writes valid ClickHouse SQL:
- Use `toStartOfMonth()`, `toStartOfWeek()` for date truncation (not `DATE_TRUNC`)
- Use `countIf(condition)` instead of `COUNT(CASE WHEN ... END)`
- Use `sumIf(column, condition)` instead of `SUM(CASE WHEN ... END)`
- Use `arrayJoin()` to unnest arrays
- String functions: `lower()`, `upper()`, `trim()`, `splitByChar()`
- Do not add `FORMAT` clauses -- the adapter handles output format automatically
- ClickHouse is column-oriented -- avoid `SELECT *` on wide tables
Validation
In addition to the standard SQL validation pipeline (regex guard + AST parse + table whitelist), the plugin blocks ClickHouse-specific admin commands: SYSTEM, KILL, ATTACH, DETACH, RENAME, EXCHANGE, SHOW, DESCRIBE, EXPLAIN, and USE. The AST parser uses PostgreSQL mode (the closest available match in node-sql-parser).
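The blocked-command step can be illustrated with a leading-keyword guard. This is a simplified sketch, not the plugin's actual implementation (which also runs the regex guard, AST parse, and table whitelist):

```typescript
// Admin commands the ClickHouse plugin rejects, per the list above.
const BLOCKED_CLICKHOUSE = [
  "SYSTEM", "KILL", "ATTACH", "DETACH", "RENAME",
  "EXCHANGE", "SHOW", "DESCRIBE", "EXPLAIN", "USE",
];

// Return the offending keyword if the statement starts with a blocked
// admin command, otherwise null. Illustrative sketch only.
function blockedClickHouseCommand(sql: string): string | null {
  const firstWord = sql.trim().split(/\s+/)[0]?.toUpperCase() ?? "";
  return BLOCKED_CLICKHOUSE.includes(firstWord) ? firstWord : null;
}
```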
Security
ClickHouse enforces read-only access at the query level via the readonly: 1 setting on every request. Statement timeout is set per query via max_execution_time.
DuckDB
Connects to a DuckDB database -- either a file on disk or an in-memory instance. DuckDB runs in-process via @duckdb/node-api, so there is no network connection or connection pooling. The plugin uses lazy initialization with a cached connection promise and automatic retry on transient failures.
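The lazy-initialization pattern described here -- cache the connection promise, and drop it on failure so the next call retries -- can be sketched as follows (names and types are illustrative, not the plugin's internals):

```typescript
// A connection exposing a query method; stands in for a DuckDB connection.
type Conn = { query: (sql: string) => Promise<unknown> };

// Cache the in-flight connection promise so concurrent callers share one
// open attempt; reset the cache on failure so a later call retries.
function lazyConnection(open: () => Promise<Conn>) {
  let cached: Promise<Conn> | null = null;
  return function get(): Promise<Conn> {
    if (!cached) {
      cached = open().catch((err) => {
        cached = null; // drop the failed promise so the next call retries
        throw err;
      });
    }
    return cached;
  };
}
```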
Installation
```sh
bun add @duckdb/node-api
```

Configuration
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | No | -- | Connection URL starting with duckdb://. Use duckdb://:memory: for in-memory, duckdb:///absolute/path.duckdb for absolute paths, or duckdb://relative/path.duckdb for relative paths |
| path | string | No | -- | Direct path to a .duckdb file, or :memory: for in-memory. Alternative to url |
| readOnly | boolean | No | true for files, false for :memory: | Open the database in read-only mode |
Either url or path is required. When both are provided, url takes precedence.
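A hypothetical parser showing how the three duckdb:// URL forms could map to a path (the plugin's actual parsing may differ):

```typescript
// Map a duckdb:// URL to the path passed to the driver. Illustrative only.
function duckdbUrlToPath(url: string): string {
  const prefix = "duckdb://";
  if (!url.startsWith(prefix)) throw new Error("URL must start with duckdb://");
  const rest = url.slice(prefix.length);
  if (rest === ":memory:") return ":memory:"; // in-memory instance
  // "/abs/path.duckdb" stays absolute; "rel/path.duckdb" stays relative.
  return rest;
}
```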
Usage
```ts
// atlas.config.ts
import { defineConfig } from "@atlas/api/lib/config";
import { duckdbPlugin } from "@atlas/plugin-duckdb-datasource";

export default defineConfig({
  plugins: [
    duckdbPlugin({ url: "duckdb://analytics.duckdb" }),
  ],
});
```

Or with a direct path:
```ts
duckdbPlugin({ path: "/data/warehouse.duckdb", readOnly: true })
```

SQL Dialect
- DuckDB syntax is similar to PostgreSQL with additional features
- Use `UNNEST()` to expand arrays into rows
- `LIST` and `STRUCT` types are natively supported
- File-reading functions (`read_csv_auto`, `read_parquet`, etc.) are blocked for security
- Use `DATE_TRUNC()` and `DATE_PART()` for date operations
- Use `STRING_AGG()` for string aggregation
- Supports window functions, CTEs, and lateral joins
- In-process engine -- no connection pooling needed
Validation
The plugin blocks DuckDB-specific statements beyond the standard SQL pipeline: PRAGMA, ATTACH, DETACH, INSTALL, EXPORT, IMPORT, CHECKPOINT, SET, DESCRIBE, EXPLAIN, and SHOW. File-reading table functions (read_csv_auto, read_csv, read_parquet, read_json, read_json_auto, read_text, parquet_scan, csv_scan, json_scan) are blocked to prevent host filesystem access. The AST parser uses PostgreSQL mode.
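The file-function block can be illustrated with a simple pattern match over the names listed above; this is a sketch, not the plugin's actual check:

```typescript
// File-reading table functions the DuckDB plugin rejects, per the list above.
const FILE_FUNCS = [
  "read_csv_auto", "read_csv", "read_parquet", "read_json",
  "read_json_auto", "read_text", "parquet_scan", "csv_scan", "json_scan",
];

// Longer names first in the alternation so read_csv_auto wins over read_csv.
const FILE_FUNC_PATTERN = new RegExp(`\\b(${FILE_FUNCS.join("|")})\\s*\\(`, "i");

// Return the offending function name, or null if none is used.
function blockedFileFunction(sql: string): string | null {
  const m = FILE_FUNC_PATTERN.exec(sql);
  return m ? m[1].toLowerCase() : null;
}
```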
Security
File-based databases default to READ_ONLY access mode, which provides database-enforced read-only access. In-memory databases cannot use read-only mode (nothing to protect), so they rely on SQL validation. DuckDB has no native query timeout API -- timeout is enforced via Promise.race().
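The timeout-by-race approach can be sketched as a small helper (illustrative; the plugin's real implementation may differ):

```typescript
// Race a query promise against a timer, since DuckDB exposes no native
// query-timeout API. Rejects with an error once the deadline passes.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout>;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`query timed out after ${ms}ms`)), ms);
  });
  return Promise.race([p, timeout]).finally(() => clearTimeout(timer!));
}
```

Note that Promise.race only abandons the pending result; the underlying query may keep running in-process until it finishes.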
DuckDB runs in the same process as Atlas. File-reading functions are blocked at the validation layer to prevent the agent from reading arbitrary files on the host filesystem.
MySQL
Connects to a MySQL instance via a mysql2/promise connection pool. The pool is cached per plugin instance -- unlike stateless HTTP transports, MySQL pools are heavyweight resources that should be reused.
Installation
```sh
bun add mysql2
```

Configuration
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | -- | Connection URL. Must start with mysql:// or mysql2:// |
| poolSize | number | No | 10 | Maximum pool size (1--500) |
| idleTimeoutMs | number | No | 30000 | Idle connection timeout in milliseconds |
Usage
```ts
// atlas.config.ts
import { defineConfig } from "@atlas/api/lib/config";
import { mysqlPlugin } from "@atlas/plugin-mysql-datasource";

export default defineConfig({
  plugins: [
    mysqlPlugin({
      url: process.env.MYSQL_URL!,
      poolSize: 20,
    }),
  ],
});
```

Atlas also supports MySQL natively via ATLAS_DATASOURCE_URL (when the URL starts with mysql://). Use this plugin when you need MySQL as an additional datasource alongside the native connection, or when you want explicit pool configuration.
SQL Dialect
- Use `LIMIT` before `OFFSET` (e.g. `LIMIT 10 OFFSET 20`)
- Use backtick quoting for identifiers (e.g. `` `table_name` ``)
- Use `DATE_FORMAT()` instead of `TO_CHAR()` for date formatting
- Use `IFNULL()` instead of `COALESCE()` for two-argument null handling
- Use `STR_TO_DATE()` for string-to-date conversion
- `GROUP_CONCAT()` for string aggregation (not `STRING_AGG`)
- Use `NOW()` for current timestamp, `CURDATE()` for current date
Validation
The plugin uses the standard SQL validation pipeline with MySQL parser mode (auto-detected from dbType: "mysql"). No additional forbidden patterns are needed beyond the base DML/DDL guard.
Security
Every query runs in a session set to READ ONLY mode (SET SESSION TRANSACTION READ ONLY), providing defense-in-depth beyond SQL validation. Per-query timeout is enforced via SET SESSION MAX_EXECUTION_TIME.
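The two session statements can be sketched as a small builder; the statement text follows the description above, while the plugin's actual internals may differ (MySQL's max_execution_time is given in milliseconds):

```typescript
// Build the per-session setup statements run before each agent query.
// Illustrative sketch, not the plugin's implementation.
function sessionSetup(timeoutMs: number): string[] {
  return [
    "SET SESSION TRANSACTION READ ONLY",
    `SET SESSION MAX_EXECUTION_TIME = ${Math.floor(timeoutMs)}`,
  ];
}
```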
Snowflake
Connects to a Snowflake account via the snowflake-sdk callback-based API. The plugin manages a connection pool internally (snowflake.createPool()) and sets a session-level statement timeout and audit tag before each query.
Installation
```sh
bun add snowflake-sdk
```

Configuration
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | -- | Connection URL in the format snowflake://user:pass@account/database/schema?warehouse=WH&role=ROLE |
| maxConnections | number | No | 10 | Maximum pool connections (1--100) |
The URL is parsed into Snowflake SDK connection options:
| URL component | Maps to |
|---|---|
| hostname | account (e.g. xy12345 or xy12345.us-east-1) |
| username | username |
| password | password |
| path segment 1 | database (optional) |
| path segment 2 | schema (optional) |
| ?warehouse= | warehouse (optional) |
| ?role= | role (optional) |
Usage
```ts
// atlas.config.ts
import { defineConfig } from "@atlas/api/lib/config";
import { snowflakePlugin } from "@atlas/plugin-snowflake-datasource";

export default defineConfig({
  plugins: [
    snowflakePlugin({
      url: process.env.SNOWFLAKE_URL!,
      // e.g. "snowflake://analyst:pass@xy12345.us-east-1/ANALYTICS/PUBLIC?warehouse=COMPUTE_WH&role=ATLAS_READONLY"
    }),
  ],
});
```

SQL Dialect
- Use `FLATTEN()` for semi-structured data (`VARIANT`, `ARRAY`, `OBJECT` columns)
- Use `PARSE_JSON()` to cast strings to semi-structured types
- Use `DATE_TRUNC('month', col)` for date truncation (not `EXTRACT` or dateadd patterns)
- Use `QUALIFY` for window function filtering (e.g. `QUALIFY ROW_NUMBER() OVER (...) = 1`)
- Use `TRY_CAST()` for safe type conversions that return NULL on failure
- Use `$$` for dollar-quoted string literals
- Identifiers are case-insensitive and stored uppercase by default
- `VARIANT` type supports semi-structured data -- use `:key` or `['key']` notation to access fields
Validation
The plugin blocks Snowflake-specific statements beyond the standard SQL pipeline: PUT, GET, LIST, REMOVE, RM, MERGE, SHOW, DESCRIBE, DESC, EXPLAIN, and USE. The AST parser uses Snowflake mode (native support in node-sql-parser).
Security
Snowflake has no session-level read-only mode. Atlas enforces SELECT-only access via SQL validation (regex + AST parsing), but there is no database-level defense-in-depth like ClickHouse's readonly: 1 or MySQL's SET SESSION TRANSACTION READ ONLY.
For defense-in-depth, configure the Snowflake connection with a role that has only SELECT privileges:
```sql
CREATE ROLE atlas_readonly;
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE atlas_readonly;
GRANT USAGE ON DATABASE ANALYTICS TO ROLE atlas_readonly;
GRANT USAGE ON SCHEMA ANALYTICS.PUBLIC TO ROLE atlas_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA ANALYTICS.PUBLIC TO ROLE atlas_readonly;
GRANT SELECT ON FUTURE TABLES IN SCHEMA ANALYTICS.PUBLIC TO ROLE atlas_readonly;
```

Then pass ?role=ATLAS_READONLY in the connection URL.
All Atlas queries are tagged with QUERY_TAG = 'atlas:readonly' for audit trail visibility in Snowflake's QUERY_HISTORY. Statement timeout is enforced via ALTER SESSION SET STATEMENT_TIMEOUT_IN_SECONDS.
Salesforce
Connects to Salesforce via the jsforce library. Unlike the other datasource plugins, Salesforce uses SOQL (Salesforce Object Query Language) instead of SQL. The plugin registers a dedicated querySalesforce agent tool -- the agent uses this tool instead of executeSQL when querying Salesforce data.
Installation
```sh
bun add jsforce
```

Configuration
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | -- | Connection URL in the format salesforce://user:pass@login.salesforce.com?token=TOKEN |
The URL is parsed into Salesforce connection options:
| URL component | Maps to |
|---|---|
| hostname | loginUrl (defaults to login.salesforce.com; use test.salesforce.com for sandboxes) |
| username | username |
| password | password |
| ?token= | securityToken (appended to password for login) |
| ?clientId= | clientId (for OAuth Connected App) |
| ?clientSecret= | clientSecret (for OAuth Connected App) |
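A hypothetical sketch of the username/password mapping, including the security-token concatenation (jsforce's username-password login expects the token appended to the password; the plugin's actual parsing may differ):

```typescript
// Parse a salesforce:// URL into login arguments. Illustrative only;
// usernames containing "@" must be percent-encoded in the URL.
function parseSalesforceUrl(raw: string) {
  const u = new URL(raw.replace(/^salesforce:\/\//, "https://"));
  const token = u.searchParams.get("token") ?? "";
  return {
    loginUrl: `https://${u.hostname || "login.salesforce.com"}`,
    username: decodeURIComponent(u.username),
    // Security token is appended to the password for the login call.
    password: decodeURIComponent(u.password) + token,
  };
}
```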
Usage
```ts
// atlas.config.ts
import { defineConfig } from "@atlas/api/lib/config";
import { salesforcePlugin } from "@atlas/plugin-salesforce-datasource";

export default defineConfig({
  plugins: [
    salesforcePlugin({
      url: process.env.SALESFORCE_URL!,
      // e.g. "salesforce://analyst@company.com:password@login.salesforce.com?token=SECURITY_TOKEN"
    }),
  ],
});
```

The querySalesforce Tool
The plugin automatically registers a querySalesforce tool during initialization. The agent uses this tool instead of executeSQL for Salesforce queries. The tool:
- Validates SOQL structure (SELECT-only, no DML, no semicolons)
- Checks the object whitelist against the semantic layer
- Auto-appends a `LIMIT` clause (respects `ATLAS_ROW_LIMIT`, default 1000)
- Enforces query timeout (respects `ATLAS_QUERY_TIMEOUT`, default 30s)
- Scrubs sensitive error messages (credentials, session IDs) before returning them to the agent
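The auto-LIMIT step can be sketched as follows (a simplified check, not the tool's actual implementation -- a real version would parse the query rather than pattern-match):

```typescript
// Append a LIMIT clause when the SOQL query does not already contain one.
// Illustrative sketch only.
function ensureLimit(soql: string, rowLimit = 1000): string {
  return /\bLIMIT\b/i.test(soql) ? soql : `${soql.trim()} LIMIT ${rowLimit}`;
}
```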
SOQL Dialect
SOQL is not SQL. The agent receives these hints to write valid SOQL:
- SOQL queries Salesforce objects, not database tables
- No JOINs -- use relationship queries instead (e.g. `SELECT Account.Name FROM Contact`)
- Parent-to-child: subquery in SELECT (e.g. `SELECT Id, (SELECT LastName FROM Contacts) FROM Account`)
- Child-to-parent: dot notation (e.g. `SELECT Account.Name FROM Contact`)
- Aggregate functions: `COUNT()`, `SUM()`, `AVG()`, `MIN()`, `MAX()`, `COUNT_DISTINCT()`
- `GROUP BY` and `HAVING` are supported
- Date literals: `YESTERDAY`, `TODAY`, `LAST_WEEK`, `THIS_MONTH`, `LAST_N_DAYS:n`, etc.
- No wildcards in field lists -- always list specific fields (no `SELECT *`)
Validation
Salesforce uses a custom SOQL validation pipeline instead of the standard node-sql-parser-based SQL validation:
- Empty check -- reject empty/whitespace queries
- Semicolons -- reject statement chaining
- Regex mutation guard -- block `INSERT`, `UPDATE`, `DELETE`, `UPSERT`, `MERGE`, `UNDELETE` (string literals are stripped first to prevent false positives)
- SELECT-only -- query must start with `SELECT`
- Object whitelist -- objects in `FROM` clauses must be in the semantic layer (case-insensitive). Parent-to-child relationship subqueries in the SELECT list are skipped since they use relationship names, not object names
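The first four steps can be sketched as a single validator (illustrative; the object-whitelist step against the semantic layer is omitted here):

```typescript
// Mutation keywords the guard rejects, per the pipeline above.
const SOQL_MUTATIONS = /\b(INSERT|UPDATE|DELETE|UPSERT|MERGE|UNDELETE)\b/i;

// Return an error description, or null when the query passes.
// Sketch of steps 1-4; not the plugin's actual implementation.
function validateSoql(soql: string): string | null {
  const q = soql.trim();
  if (!q) return "empty query";
  if (q.includes(";")) return "semicolons are not allowed";
  // Strip single-quoted string literals first so a value like
  // Subject = 'please DELETE this' does not trip the mutation guard.
  const stripped = q.replace(/'(?:[^'\\]|\\.)*'/g, "''");
  if (SOQL_MUTATIONS.test(stripped)) return "mutation keyword detected";
  if (!/^SELECT\b/i.test(q)) return "only SELECT queries are allowed";
  return null;
}
```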
Security
The jsforce session is stateful and cached -- the plugin handles session expiry by automatically re-authenticating and retrying the query. Error messages are scrubbed for sensitive patterns (passwords, credentials, session IDs, login errors) before being returned to the agent or user.
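The expiry-handling pattern can be sketched as a retry wrapper keyed on Salesforce's INVALID_SESSION_ID error code (names are illustrative, not the plugin's API):

```typescript
// Run a query; if the cached session has expired, re-authenticate once
// and retry. Illustrative sketch of the pattern described above.
async function withSessionRetry<T>(
  run: () => Promise<T>,
  reauth: () => Promise<void>,
): Promise<T> {
  try {
    return await run();
  } catch (err) {
    const msg = err instanceof Error ? err.message : String(err);
    if (!msg.includes("INVALID_SESSION_ID")) throw err; // unrelated failure
    await reauth(); // session expired: log in again, then retry once
    return run();
  }
}
```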
Salesforce enforces object-level and field-level security server-side based on the connected user's profile and permission sets. Use a dedicated integration user with read-only access to the objects the agent should query.
Comparison
| Feature | ClickHouse | DuckDB | MySQL | Snowflake | Salesforce |
|---|---|---|---|---|---|
| Driver package | @clickhouse/client | @duckdb/node-api | mysql2 | snowflake-sdk | jsforce |
| Transport | HTTP | In-process | TCP pool | SDK pool | REST API |
| Query language | ClickHouse SQL | DuckDB SQL | MySQL SQL | Snowflake SQL | SOQL |
| Parser dialect | PostgreSQL (closest) | PostgreSQL (closest) | MySQL | Snowflake | Custom (regex) |
| Read-only enforcement | readonly: 1 per query | READ_ONLY access mode | READ ONLY session | SQL validation only | SELECT-only validation |
| Agent tool | executeSQL | executeSQL | executeSQL | executeSQL | querySalesforce |
| Connection caching | No (stateless HTTP) | Yes (in-process) | Yes (pool) | Yes (pool) | Yes (session) |