Keep Your Semantic Layer in Sync with Schema Changes
Detect database schema drift, update entity YAMLs, and preserve manual enrichments using the Atlas CLI.
Your database schema evolves — columns get added, tables get dropped, types change. When the schema drifts from your semantic layer (see key concepts for definitions), the agent may generate incorrect SQL or miss new data entirely. This guide walks through detecting drift and updating your semantic layer to match.
Prerequisites
- Atlas CLI installed (bun install)
- A semantic layer already generated via atlas init (entity YAMLs exist in semantic/entities/)
- ATLAS_DATASOURCE_URL set to your database connection string
- Your database schema has changed since the last atlas init
Detecting drift
Run atlas diff to compare your live database schema against the entity YAMLs in semantic/entities/:
# Compare all tables against their entity YAML files
bun run atlas -- diff
# Compare only specific tables (comma-separated)
bun run atlas -- diff --tables orders,customers
# Target a non-public PostgreSQL schema
bun run atlas -- diff --schema analytics
# Compare against a per-source subdirectory (semantic/warehouse/entities/)
bun run atlas -- diff --source warehouse
The command connects to your database, profiles the current schema, and compares it against the YAML files. The output groups changes into three categories:
New tables
Tables that exist in the database but have no corresponding entity YAML:
New tables (in DB, not in semantic layer):
+ user_preferences (8 columns)
+ audit_events (12 columns)
These tables are invisible to the agent: it cannot query them until you create entity files.
Removed tables
Tables that have entity YAMLs but no longer exist in the database:
Removed tables (in semantic layer, not in DB):
- legacy_sessions
Queries referencing these tables will fail. Remove or archive the stale YAML files.
Changed tables
Tables that exist in both but have column, type, foreign key, or metadata differences:
Changed tables:
orders
+ added column: shipping_method (string)
- removed column: legacy_status (string)
~ type changed: amount — YAML: integer, DB: number
+ added FK: warehouse_id→warehouses.id
- removed FK: old_ref_id→legacy_refs.id
~ type changed: view → materialized_view
~ partition strategy added: RANGE
Each line uses a prefix to indicate the change type:
- + marks something new in the database
- - marks something removed from the database
- ~ marks a value that changed between YAML and database
Metadata changes include object type drift (e.g., table to view), partition strategy, and partition key changes.
At the bottom, a summary line counts all changes present in the diff:
Summary: 2 new tables, 1 removed, 3 changed (2 columns added, 1 removed, 1 type change, 1 FK added, 1 metadata change)
If everything is in sync, you'll see:
Atlas Diff — comparing database against semantic/entities/
No drift detected — semantic layer is in sync with the database.
Update workflow
Run atlas diff to see what changed
bun run atlas -- diff
Review the output carefully. Not every change requires action: some new columns may be internal and shouldn't be exposed to the agent.
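When the diff is long, it can help to keep the full report in a file and look at the summary line first. The sketch below is one way to do that. The function name drift_summary and the file name drift-report.txt are illustrative, and it assumes the summary line begins with "Summary:" as in the sample output earlier in this guide.

```shell
#!/usr/bin/env bash
# Print only the "Summary:" line from atlas diff output read on stdin.
# Assumption: the summary line starts with "Summary:" (optionally indented).
drift_summary() {
  grep -E '^[[:space:]]*Summary:' || echo "No summary line found"
}

# Usage (drift-report.txt is a hypothetical file name):
#   bun run atlas -- diff | tee drift-report.txt | drift_summary
```

The full report stays in drift-report.txt, so you can go through the individual changes after reading the summary.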
Decide your update strategy
You have two approaches:
Full regeneration — re-run atlas init to regenerate all entity files from scratch. This is the simplest approach but overwrites all manual enrichments (descriptions, query patterns, measures, virtual dimensions).
Manual update — edit the YAML files directly to add, remove, or modify dimensions. This preserves all your enrichments but requires more effort.
atlas init deletes all existing .yml files in semantic/entities/ and semantic/metrics/, and overwrites catalog.yml and glossary.yml. Any manual enrichments — custom descriptions, query patterns, measures, virtual dimensions — will be lost. See Preserving customizations before regenerating.
Apply the changes
Option A: Full regeneration
# Regenerate all entity files
bun run atlas -- init
# Regenerate and add LLM-generated descriptions
bun run atlas -- init --enrich
Option B: Manual YAML edits
For a new column, add a dimension entry to the entity file:
# semantic/entities/orders.yml: add the new column
dimensions:
  # ... existing dimensions ...
  - name: shipping_method
    sql: shipping_method
    type: string
    description: Delivery method selected by the customer
    sample_values: [standard, express, overnight]
For a removed column, delete the corresponding dimension entry. For a type change, update the type field.
Validate the updated semantic layer
Run atlas validate to check for syntax errors, missing fields, and broken cross-references — without needing a database connection:
bun run atlas -- validate
Example output:
Atlas Validate
✓ atlas.config.ts Valid (defineConfig)
✓ semantic/entities/ 12 entities parsed
✓ semantic/glossary.yml Valid (8 terms)
✓ semantic/catalog.yml Valid
✓ semantic/metrics/ 5 metrics parsed
Individual warnings (missing descriptions, unused entities, broken join references) appear as additional lines with ⚠ or ✗ icons. Fix any errors before deploying.
Confirm drift is resolved
Run atlas diff again to verify the semantic layer matches the database:
bun run atlas -- diff
You should see:
No drift detected — semantic layer is in sync with the database.
Common scenarios
New table added to database
A new user_preferences table was created. atlas diff reports:
New tables (in DB, not in semantic layer):
+ user_preferences (8 columns)
To fix: Either regenerate with atlas init, or manually create semantic/entities/user_preferences.yml:
name: UserPreferences
type: dimension_table
table: user_preferences
grain: one row per user preference setting
description: User-level configuration and preference settings.
dimensions:
  - name: id
    sql: id
    type: integer
    description: Unique preference record identifier
    primary_key: true
  - name: user_id
    sql: user_id
    type: integer
    description: Reference to the user
  - name: theme
    sql: theme
    type: string
    description: Selected UI theme
    sample_values: [light, dark, system]
joins:
  - target_entity: Users
    relationship: many_to_one
    join_columns:
      from: user_id
      to: id
    description: user_preferences.user_id → users.id
Then add the entity to semantic/catalog.yml so the agent knows when to use it.
Column renamed or type changed
The orders.amount column was changed from integer to numeric. atlas diff reports:
Changed tables:
orders
~ type changed: amount — YAML: integer, DB: number
To fix: Update the dimension type in semantic/entities/orders.yml:
dimensions:
  - name: amount
    sql: amount
    type: number # Changed from integer to number
    description: Order total in dollars
If a column was renamed (e.g., status to order_status), atlas diff shows it as a removed column plus an added column:
Changed tables:
orders
+ added column: order_status (string)
- removed column: status (string)
To fix: Update the dimension's name and sql fields. Update any references in query patterns, measures, or the glossary.
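Stale references to the old column name are easy to miss by eye. The following is a minimal sketch for listing them, assuming your semantic layer lives under semantic/ and uses .yml files as in this guide. find_references is a hypothetical helper, not an Atlas command. It matches whole words only, so searching for status does not report lines that mention only order_status.

```shell
#!/usr/bin/env bash
# List every line in the .yml files under a directory that still mentions a
# name as a whole word. Prints "file:line:text" for each match, or a short
# note when nothing is found. The name is treated as a fixed string (-F).
find_references() {
  local name="$1" dir="${2:-semantic}"
  grep -rnwF --include='*.yml' -- "$name" "$dir" \
    || echo "no references to '$name' in $dir"
}

# Usage: find_references status semantic
```

Each reported line is a place where the old name may need updating. Review the matches by hand, since a word such as status can also appear in descriptions that have nothing to do with the renamed column.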
Table dropped from database
The legacy_sessions table was dropped. atlas diff reports:
Removed tables (in semantic layer, not in DB):
- legacy_sessions
To fix: Delete semantic/entities/legacy_sessions.yml and remove the entry from semantic/catalog.yml. Also remove any joins from other entities that reference it, and any metrics in semantic/metrics/legacy_sessions.yml. Run atlas validate to catch any broken cross-references.
New relationships (joins) added
A new foreign key orders.warehouse_id → warehouses.id was added. atlas diff reports:
Changed tables:
orders
+ added FK: warehouse_id→warehouses.id
To fix: Add a join entry to semantic/entities/orders.yml:
joins:
  # ... existing joins ...
  - target_entity: Warehouses
    relationship: many_to_one
    join_columns:
      from: warehouse_id
      to: id
    description: orders.warehouse_id → warehouses.id
Make sure semantic/entities/warehouses.yml exists. If the warehouses table is also new, create its entity file first.
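A quick existence check can confirm that every join target has an entity file before you run atlas validate. The sketch below assumes one file per table named <table>.yml, which is the naming used by the examples in this guide. missing_entities is a hypothetical helper, not part of the Atlas CLI.

```shell
#!/usr/bin/env bash
# Report which of the given tables lack an entity YAML file in a directory.
# Assumption: entity files are named <table>.yml.
# Returns 0 when every file exists, 1 when at least one is missing.
missing_entities() {
  local dir="$1"; shift
  local table status=0
  for table in "$@"; do
    if [ ! -f "$dir/$table.yml" ]; then
      echo "missing: $dir/$table.yml"
      status=1
    fi
  done
  return "$status"
}

# Usage: missing_entities semantic/entities orders warehouses
```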
Preserving customizations
When you run atlas init, it deletes all existing YAML files in semantic/entities/ and semantic/metrics/ before generating new ones. This means manual enrichments are lost:
| Preserved across atlas init | Lost across atlas init |
|---|---|
| Nothing (all files are regenerated) | Custom descriptions |
| | Query patterns |
| | Virtual dimensions |
| | Measures (regenerated from scratch) |
| | Manually added sample values |
| | Glossary terms (regenerated) |
| | Catalog customizations (regenerated) |
Strategies to preserve enrichments
1. Back up before regenerating
# Back up existing semantic layer (preserves your enrichments)
cp -r semantic/ semantic-backup/
# Regenerate all entity files from the database
bun run atlas -- init
# Compare old and new to manually merge back your customizations
# diff semantic-backup/entities/orders.yml semantic/entities/orders.yml
2. Use manual edits for small changes
If atlas diff shows only a few changes (a new column, a type change), edit the YAML files directly instead of regenerating. This preserves all your enrichments.
3. Use --enrich to restore LLM descriptions
After regenerating, run enrichment to fill in descriptions:
bun run atlas -- init --enrich
This adds LLM-generated descriptions for any fields that lack them. It won't restore your hand-written descriptions, but it provides a starting point.
As a rule of thumb: use atlas init for initial setup or major schema overhauls. Use manual YAML edits for incremental changes. The more you've customized your semantic layer, the more reason to update manually.
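If you do regenerate, the backup strategy above can be wrapped in two small helpers. This is a sketch under stated assumptions: the function names and the timestamped directory name (semantic-backup-<timestamp>) are illustrative choices, not something Atlas creates or expects.

```shell
#!/usr/bin/env bash
# Copy a semantic-layer directory to a timestamped sibling directory and
# print the path of the backup. The naming scheme is an assumption.
backup_semantic() {
  local src="${1:-semantic}"
  local dest
  dest="${src%/}-backup-$(date +%Y%m%d-%H%M%S)"
  cp -r "$src" "$dest" && echo "$dest"
}

# Show what regeneration changed relative to a backup, as a unified diff.
# Like diff itself, this returns 1 when the two trees differ.
compare_with_backup() {
  local backup="$1" current="${2:-semantic}"
  diff -ru "$backup" "$current"
}

# Usage:
#   backup="$(backup_semantic semantic)"
#   bun run atlas -- init
#   compare_with_backup "$backup" semantic
```

Lines prefixed with - in the resulting diff come from the backup, so they show the enrichments you may want to merge back by hand.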
CI integration
Run atlas diff in CI to catch schema drift before it causes agent errors. A non-zero exit when drift is detected lets you fail the pipeline or post a warning.
# Fail the CI pipeline if the semantic layer is out of sync with the database
- name: Check schema drift
  env:
    ATLAS_DATASOURCE_URL: ${{ secrets.ATLAS_DATASOURCE_URL }}
  run: |
    # Exit code 0 = in sync, exit code 1 = drift detected
    bun run atlas -- diff
This is especially useful when:
- Database migrations run in a separate pipeline from application deploys
- Multiple teams modify the schema independently
- You want to enforce that semantic layer updates are part of every migration PR
atlas diff requires a live database connection (ATLAS_DATASOURCE_URL). For offline validation of YAML syntax and cross-references, use atlas validate instead — it works without any network access.
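To post a warning instead of failing the pipeline, the CI step can wrap the diff command and translate its exit code. The sketch below relies only on the exit codes stated above (0 = in sync, 1 = drift detected). check_drift and the DRIFT_MODE variable are hypothetical names introduced for this example, and any other non-zero status is treated as an error on the assumption that it signals something other than drift.

```shell
#!/usr/bin/env bash
# Run a drift-check command and decide whether drift should fail the job.
# DRIFT_MODE is a hypothetical variable for this sketch:
#   fail (default): drift fails the job
#   warn:           drift (exit code 1) is reported but the job continues
check_drift() {
  local status=0
  "$@" || status=$?
  if [ "$status" -eq 0 ]; then
    echo "semantic layer in sync"
  elif [ "$status" -eq 1 ] && [ "${DRIFT_MODE:-fail}" = "warn" ]; then
    echo "warning: schema drift detected" >&2
    status=0
  else
    echo "error: drift check exited with status $status" >&2
  fi
  return "$status"
}

# Usage: DRIFT_MODE=warn check_drift bun run atlas -- diff
```

Passing the command as arguments keeps the wrapper independent of how Atlas is invoked in your pipeline.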
Troubleshooting
atlas diff shows no changes but agent queries fail
Cause: A type field in the entity YAML doesn't match the database column type (e.g., YAML says integer but the column is numeric), and atlas diff did not surface the mismatch because it was run against a different database than the one the agent uses.
Fix: Run atlas diff against the same database the agent queries (ATLAS_DATASOURCE_URL). Check for type mismatches in the diff output and update the YAML accordingly.
atlas init overwrites my custom descriptions
Cause: atlas init deletes all existing YAML files before regenerating. Custom descriptions, query patterns, measures, and virtual dimensions are lost.
Fix: Back up your semantic layer before regenerating: cp -r semantic/ semantic-backup/. After regeneration, manually merge back your customizations. For small changes, prefer manual YAML edits over full regeneration. See Preserving customizations.
atlas validate passes but queries still fail
Cause: atlas validate checks YAML syntax and cross-references offline — it doesn't verify that tables and columns actually exist in the database.
Fix: Run atlas diff (requires a database connection) to verify that your YAML matches the live schema. validate catches structural issues; diff catches drift.
For more, see Troubleshooting.
See Also
- CLI Reference: atlas diff and atlas init command flags
- Semantic Layer: How entity YAML files define the semantic layer
- Semantic Layer Concepts: Definitions of entities, dimensions, measures, and joins
- Troubleshooting: Diagnostic steps when CLI commands fail