Pipeline Overview
Three commands cover the full workflow: analyze the current state, snapshot it for history, and diff against a previous snapshot to see what changed.
Analyze
Reads PostgreSQL SQL dump and Airtable JSON export. Runs all assessments. Outputs analysis.json and a Markdown report.
Snapshot
Saves the current analysis as a dated file in data/snapshots/. Keeps a history of the platform state over time.
Diff
Compares two snapshots and reports every change — tables added or removed, field count shifts, complexity tier movements, overlap changes.
Quick Start
# PostgreSQL: dump schema (no data)
pg_dump --schema-only --no-owner --no-privileges your_database > data/schema.sql
# Airtable: export base schema via API or scripting extension
# Save to data/efi_airtable.json
npm run analyze
# Output:
# data/analysis.json ← structured data (consumed by pipeline)
# data/report.md ← human-readable Markdown report
npm run snapshot
# → Saves data/snapshots/2026-02-18.json
# Terminal report:
npm run diff -- data/snapshots/2026-01-15.json
# Markdown report:
npm run diff:report -- data/snapshots/2026-01-15.json
What Gets Analyzed
The pipeline runs six analyzers, each focused on a different aspect of the platform.
| Analyzer | Input | What It Detects |
|---|---|---|
parse-sql-schema |
SQL dump | Tables, columns, types, constraints, indexes, enums, sequences |
analyze-postgres |
Parsed SQL | Soft-delete pattern, meta columns, _id conventions, implicit relationships, FK coverage, index gaps |
analyze-airtable |
Airtable JSON | Formula complexity tiers, pattern dedup (65% reduction), cross-table reuse, function usage, per-table stats |
analyze-overlap |
Both schemas | Entity matching (alias + fuzzy), PG-only and AT-only tables, domain ownership |
analyze-orphans |
PG analysis | Generates SQL to check referential integrity on _id columns without FKs |
analyze-diff |
Two snapshots | Tables added/removed, column changes, complexity shifts, overlap changes |
Reading the Diff Report
The diff covers both systems side by side and calls out exactly what moved.
PostgreSQL Changes
Tables added or removed, column count shifts, new foreign keys or constraints, index changes, soft-delete coverage changes, architecture pattern shifts.
Airtable Changes
Tables added or removed, field counts per table, computed field growth or reduction, complexity tier movements (did complex formulas increase?), new or resolved complex formulas listed by name, formula function usage changes.
Overlap Changes
New shared entities appearing between systems, entities moving from shared to single-system, changes in PG-only and AT-only counts.
Diff output formats: The terminal command (npm run diff) prints a
formatted box-drawing report to stdout. The report command (npm run diff:report) writes
a Markdown file to data/diff-report.md that you can open in any editor
or share. Both accept the same arguments.
Configuration
Edit data/analysis-config.json to tune the analysis without changing code.
{
"aliases": {
"customers": ["Customers"],
"suppliers": ["Suppliers"],
"my_pg_table": ["My Airtable Table"]
},
"domainOwnership": {
"customers": "shared",
"suppliers": "shared"
},
"excludeTables": {
"postgresql": ["temp_migration_table"],
"airtable": ["Archive"]
}
}
| Setting | Purpose |
|---|---|
aliases |
Map PostgreSQL table names to Airtable table names for overlap detection. Prevents false positives from fuzzy matching. |
domainOwnership |
Override which system owns an overlapping entity. Values: postgresql, airtable, shared. |
excludeTables |
Tables to ignore during analysis (migration artifacts, temp tables, etc.). |
Output Files
| File | Format | Purpose |
|---|---|---|
data/analysis.json |
JSON | Full structured analysis — all metrics, tables, relationships, complexity tiers. Machine-readable. |
data/report.md |
Markdown | Human-readable report with tables, charts, and summaries. Open in any editor. |
data/snapshots/*.json |
JSON | Dated copies of analysis.json for historical comparison. |
data/diff-report.md |
Markdown | Change report comparing two snapshots. Generated by npm run diff:report. |
CLI Reference
npm run analyze # Full analysis + report (verbose)
npm run snapshot # Save dated snapshot
npm run diff -- <before> # Terminal diff report
npm run diff:report -- <before> # Markdown diff report
node analyzers/run-analysis.js \
--pg-sql data/schema.sql \
--airtable data/efi_airtable.json \
--config data/analysis-config.json \
--orphan-results data/orphan-results.json \
--output data/analysis.json \
--verbose --pretty
node analyzers/run-diff.js \
data/snapshots/2026-01-15.json \
data/analysis.json \
--output data/diff.json \
--json
Suggested Workflow
Monthly cadence: Export schemas → run analyze → snapshot → diff against last month → review the change report. This takes about 5 minutes and gives you a clear picture of schema drift, formula sprawl, and overlap evolution.