Atlas vs Raw MCP

Why connecting an AI client directly to your database via MCP is risky — and what a semantic layer adds.

MCP (Model Context Protocol) lets AI clients like Claude Desktop and Cursor call external tools. Database MCP servers like DBHub give the AI direct access to your schema and the ability to run arbitrary SQL.

This works for quick exploration. It does not work for production.

This page explains what you lose when the AI talks directly to your database, and what Atlas adds on top of MCP.

The problem with raw MCP

When you point Claude Desktop at a database MCP server, the AI sees raw schema metadata — column names like dim_user_status_cd, fact_txn_amt, is_del_flg. It has no business context, no guardrails, and no audit trail.

Three things go wrong:

The AI guesses. Without descriptions, sample values, or documented relationships, the AI infers what columns mean from their names. It gets it wrong often enough to be dangerous — silently returning incorrect numbers that look plausible.
Nothing validates the SQL. The generated query runs directly against the database. There's no mutation guard, no table whitelist, no row-level filtering. A prompt injection or hallucinated query could read sensitive tables or, if the connection has write access, modify data.
Nobody knows what happened. No audit log records which user asked which question, what SQL ran, or what data was returned. For compliance-regulated environments, this is a non-starter.

What Atlas adds

Atlas sits between the AI and the database, providing context before query generation and validation before execution.

Capability	Raw MCP	Atlas
Business context	Column names only	Entity YAML with descriptions, sample values, joins, metrics, glossary + dynamic learned patterns
SQL validation	None	7-layer pipeline (empty check, regex guard, AST parse, table whitelist, RLS, auto-LIMIT, timeout)
Table access control	Full schema visible	Only tables defined in the semantic layer
Row-level security	None	Automatic WHERE clause injection based on user identity (multi-column, array, OR-logic)
Mutation protection	Depends on DB user permissions	SELECT-only enforcement at the application layer, independent of DB permissions
Audit logging	None	Every query logged with user attribution, timing, tables/columns accessed, data classification
PII protection	None	PII detection and column masking with role-based strategies
Sandboxed execution	None	Explore tool isolated via nsjail, sidecar, or BYOC backends
Auth & enterprise	None	SSO/SCIM, custom roles, IP allowlists, approval workflows
Admin console	None	Connections, users, plugins, semantic editor, analytics, notebooks, billing
Chat integrations	None	8 platforms via Chat SDK (Slack, Teams, Discord, Telegram, Google Chat, GitHub, Linear, WhatsApp)
Data residency	None	3-region deployment (US, EU, APAC) with misrouting detection

Side-by-side: the same question, two approaches

Question: "What's our monthly revenue by product?"

Raw MCP

The AI sees the schema and guesses:

-- The AI picks the wrong table — fact_order_lines stores
-- gross amounts before discounts and refunds.
-- It also misses the soft-delete flag, including deleted orders.
SELECT
  DATE_TRUNC('month', order_date) AS month,
  product_name,
  SUM(fact_txn_amt) AS revenue
FROM fact_order_lines
GROUP BY 1, 2
ORDER BY 1 DESC

Problems:

fact_txn_amt is gross amount, not net revenue
Deleted rows (is_del_flg = 1) are included
No LIMIT — could return thousands of rows
No timeout — a full table scan on a large table could run for minutes

Atlas

The agent reads the semantic layer first, which defines revenue clearly:

# semantic/entities/orders.yml — the agent reads this before writing SQL
table: orders
dimensions:
  product_name:
    type: string
    description: Display name of the product
  created_at:
    type: timestamp
    description: When the order was placed
  is_deleted:
    type: boolean
    description: Soft-delete flag — exclude from all queries
measures:
  - name: net_revenue
    sql: "total - discount - refund_amount"
    type: sum
    description: Net revenue after discounts and refunds

The agent generates:

-- Uses the documented measure definition for net revenue.
-- Filters out soft-deleted rows as the entity description instructs.
-- Atlas appends LIMIT 1000 to the query and sets a 30-second statement timeout on the connection.
SELECT
  DATE_TRUNC('month', created_at) AS month,
  product_name,
  SUM(total - discount - refund_amount) AS net_revenue
FROM orders
WHERE is_deleted = false
GROUP BY 1, 2
ORDER BY 1 DESC

Before this query reaches the database, Atlas validates it through the SQL validation pipeline:

Empty check — rejects empty or whitespace-only input
Regex guard — confirms no DML/DDL keywords
AST parse — confirms single SELECT statement
Table whitelist — orders is in the semantic layer, allowed
RLS injection — if configured, injects a WHERE clause like AND tenant_id = 'acme-corp' based on the user's JWT claims
Auto LIMIT — appends LIMIT 1000
Statement timeout — sets 30-second deadline on the database connection

Security implications

The security gap between raw MCP and Atlas is not theoretical. Here are three concrete risks:

No validation = SQL injection surface

MCP servers pass AI-generated SQL directly to the database. If a user crafts a prompt that causes the AI to generate SELECT 1; DROP TABLE users, there's nothing between that SQL and your database except the connection's permission level. Atlas blocks this at multiple layers — the regex guard catches DROP, and the AST parser rejects multi-statement queries.

No RLS = data leakage

In a multi-tenant application, raw MCP has no concept of row-level isolation. The AI can query any row in any table the connection can access. Atlas injects WHERE clauses automatically based on the authenticated user's identity claims — fail-closed, so missing claims block the query entirely.

No audit = compliance gap

Regulations like SOC 2, HIPAA, and GDPR require knowing who accessed what data and when. Raw MCP provides no audit trail. Atlas logs every query — successful and failed — with user attribution, timing, the SQL that ran, and whether results were truncated.

When raw MCP is fine

Raw MCP is a reasonable choice when:

Local development — You're exploring your own database in Claude Desktop or Cursor, iterating on queries, and don't need guardrails because the data isn't sensitive.
Read-only on non-sensitive data — The database contains only public or non-regulated data, the connection is strictly read-only at the database level, and you're the only user.
Prototyping — You're testing whether text-to-SQL is useful for your use case before investing in a semantic layer.

Once any of these change — multiple users, sensitive data, production workloads, compliance requirements — you need the semantic layer, validation, and audit trail that Atlas provides.

Atlas has its own MCP server (stdio + SSE transport) that gives Claude Desktop and Cursor the same validated, context-aware SQL execution as the API. You don't have to choose between MCP compatibility and safety.

The semantic layer advantage

The core difference isn't just validation — it's context. Raw MCP gives the AI a schema. Atlas gives it understanding.

A schema tells the AI that a column named total exists and is numeric. The semantic layer tells it that total is the order total in USD before discounts, that net revenue is calculated as SUM(total - discount - refund_amount), and that rows with is_deleted = true should always be excluded.

This matters because the quality of AI-generated SQL is bounded by the quality of the context the AI receives. Without business context, an AI agent can write syntactically correct SQL that returns confidently wrong answers — and nobody catches it until a decision has already been made on bad data.

A semantic layer is that governance. It's the difference between an AI that can query your database and one that understands your business.

MCP Server — use Atlas as an MCP server with full validation and semantic context
SQL Validation Pipeline — the 7-layer defense-in-depth pipeline
Row-Level Security — automatic tenant isolation
Semantic Layer — how entity YAML files provide business context
Atlas vs Alternatives — comparisons with WrenAI, Vanna, and Metabase

Atlas vs Raw MCP

On this page