How to Analyze — EFX Platform

Pipeline Overview

Three commands cover the full workflow: analyze the current state, snapshot it for history, and diff against a previous snapshot to see what changed.

①

Analyze

Reads PostgreSQL SQL dump and Airtable JSON export. Runs all assessments. Outputs analysis.json and a Markdown report.

②

Snapshot

Saves the current analysis as a dated file in data/snapshots/. Keeps a history of the platform state over time.

③

Diff

Compares two snapshots and reports every change — tables added or removed, field count shifts, complexity tier movements, overlap changes.

Quick Start

1 — Export fresh schemas

# PostgreSQL: dump schema (no data)
pg_dump --schema-only --no-owner --no-privileges your_database > data/schema.sql

# Airtable: export base schema via API or scripting extension
# Save to data/efi_airtable.json

2 — Run analysis

npm run analyze

# Output:
#   data/analysis.json   ← structured data (consumed by pipeline)
#   data/report.md       ← human-readable Markdown report

3 — Save a snapshot

npm run snapshot

# → Saves data/snapshots/2026-02-18.json

4 — Compare against a previous snapshot

# Terminal report:
npm run diff -- data/snapshots/2026-01-15.json

# Markdown report:
npm run diff:report -- data/snapshots/2026-01-15.json

What Gets Analyzed

The pipeline runs six analyzers, each focused on a different aspect of the platform.

Analyzer	Input	What It Detects
`parse-sql-schema`	SQL dump	Tables, columns, types, constraints, indexes, enums, sequences
`analyze-postgres`	Parsed SQL	Soft-delete pattern, meta columns, _id conventions, implicit relationships, FK coverage, index gaps
`analyze-airtable`	Airtable JSON	Formula complexity tiers, pattern dedup (65% reduction), cross-table reuse, function usage, per-table stats
`analyze-overlap`	Both schemas	Entity matching (alias + fuzzy), PG-only and AT-only tables, domain ownership
`analyze-orphans`	PG analysis	Generates SQL to check referential integrity on _id columns without FKs
`analyze-diff`	Two snapshots	Tables added/removed, column changes, complexity shifts, overlap changes

Reading the Diff Report

The diff covers both systems side by side and calls out exactly what moved.

PostgreSQL Changes

Tables added or removed, column count shifts, new foreign keys or constraints, index changes, soft-delete coverage changes, architecture pattern shifts.

Airtable Changes

Tables added or removed, field counts per table, computed field growth or reduction, complexity tier movements (did complex formulas increase?), new or resolved complex formulas listed by name, formula function usage changes.

Overlap Changes

New shared entities appearing between systems, entities moving from shared to single-system, changes in PG-only and AT-only counts.

i

Diff output formats: The terminal command (npm run diff) prints a formatted box-drawing report to stdout. The report command (npm run diff:report) writes a Markdown file to data/diff-report.md that you can open in any editor or share. Both accept the same arguments.

Configuration

Edit data/analysis-config.json to tune the analysis without changing code.

data/analysis-config.json

{
  "aliases": {
    "customers": ["Customers"],
    "suppliers": ["Suppliers"],
    "my_pg_table": ["My Airtable Table"]
  },
  "domainOwnership": {
    "customers": "shared",
    "suppliers": "shared"
  },
  "excludeTables": {
    "postgresql": ["temp_migration_table"],
    "airtable": ["Archive"]
  }
}

Setting	Purpose
`aliases`	Map PostgreSQL table names to Airtable table names for overlap detection. Prevents false positives from fuzzy matching.
`domainOwnership`	Override which system owns an overlapping entity. Values: `postgresql`, `airtable`, `shared`.
`excludeTables`	Tables to ignore during analysis (migration artifacts, temp tables, etc.).

Output Files

File	Format	Purpose
`data/analysis.json`	JSON	Full structured analysis — all metrics, tables, relationships, complexity tiers. Machine-readable.
`data/report.md`	Markdown	Human-readable report with tables, charts, and summaries. Open in any editor.
`data/snapshots/*.json`	JSON	Dated copies of analysis.json for historical comparison.
`data/diff-report.md`	Markdown	Change report comparing two snapshots. Generated by `npm run diff:report`.

CLI Reference

npm scripts

npm run analyze          # Full analysis + report (verbose)
npm run snapshot         # Save dated snapshot
npm run diff -- <before> # Terminal diff report
npm run diff:report -- <before>   # Markdown diff report

Direct CLI (advanced)

node analyzers/run-analysis.js \
  --pg-sql data/schema.sql \
  --airtable data/efi_airtable.json \
  --config data/analysis-config.json \
  --orphan-results data/orphan-results.json \
  --output data/analysis.json \
  --verbose --pretty

node analyzers/run-diff.js \
  data/snapshots/2026-01-15.json \
  data/analysis.json \
  --output data/diff.json \
  --json

Suggested Workflow

✓

Monthly cadence: Export schemas → run analyze → snapshot → diff against last month → review the change report. This takes about 5 minutes and gives you a clear picture of schema drift, formula sprawl, and overlap evolution.

MonthlyExport both schemas, run analysis, snapshot, diff against previous month.

After any Airtable restructureRe-export Airtable JSON, re-analyze. Check if complexity tiers shifted.

After PostgreSQL migrationsRe-export SQL dump, re-analyze. Verify new FKs/constraints show up.

Before architecture decisionsRun a fresh analysis so recommendations are based on current state, not stale numbers.