Market Research Agent — EFX Platform

System Architecture

The agent follows a research → write → validate → review → learn cycle. Admin edits become distilled guidelines that shape future generations — a continuous learning loop.

Foundation

📄 Sample Reports PDFs define voice, categories, structure

📚 Knowledge Base Price history + correction guidelines

↓

Data Layer

ERP (Postgres) FOB prices, routes, deadlines

CRM (Airtable) Customers, contracts, products

Web Search Claude + 20 live searches

Price History 12-month commodity accumulator

↓

Prompt Construction

Memory Prior month context

Guidelines Up to 25 learned rules

ERP Context Real prices & routes

↓

AI Pipeline

Pass 1: Research Claude Sonnet 4.6 + web_search

~8K in / ~4K out

→

Pass 2: Writer Claude Sonnet 4.6 × 7 calls

~18K in / ~1.8K out

→

Pass 3: Validate Google Gemini 2.5 Flash

~8K in / ~600 out

↓

Output & Review

Draft Report 7 sections + confidence score

→

Admin Review Edit, re-prompt, hide, reorder, add sections

→

Published Report PDF + Web + Customer view

↓

Learning Loop

Admin Edits Every edit = correction entry

→

Haiku Distillation Corrections → max 25 rules

→

Guidelines + Memory Injected into next generation

↺ Feeds back into Prompt Construction

Data Sources

Every report combines four independent data streams. The AI is instructed not to invent numbers: prices, routes, and deadlines come from verified sources.

🗃

ERP (PostgreSQL)

Database efi_erp_2025_14 on DigitalOcean. Supplies:

FOB prices from supplier_purchase_orders (781 POs)
Transit times from ref_otw_transit_times (84 port pairs)
Ocean freight rates from ref_landed_cost_estimates
Sourcing deadlines from supplier_capacities

📋

CRM (Airtable)

12 fetchers pull customer-facing data:

Customer profile from Customers table
Monthly snapshot, YTD totals from Sales Orders
Sourcing analysis, pricing recap from Contract Line Items
Product logos from Products table

🌐

Web Research (Claude)

Pass 1 uses Claude Sonnet 4.6 with the web_search tool to perform up to 20 live searches across 9 research categories, including:

US tariffs & trade policy (HTS 3823, 1511, 1516)
Palm oil pricing (CPO, stearin, PFAD, palmitic acid)
Ocean freight indices (FBX01, FBX03)
Currency, supply/demand, weather, geopolitical
US dairy market (CME butter, Class III/IV, butterfat)

📈

Price History (Accumulated)

Rolling 12-month commodity price archive, updated each generation run:

CPO (Bursa Malaysia, MYR/MT)
Palm Stearin (USD/MT)
PFAD (USD/MT)
Palmitic Acid (USD/MT)

Stored in knowledge/price-history.json — feeds the commodity pricing chart on the report.
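A rolling 12-month archive of this kind can be maintained with a small upsert-and-trim step on each run. The sketch below is illustrative only: the PricePoint shape, the "YYYY-MM" month key, and the upsertMonth helper are assumptions, not the actual schema of knowledge/price-history.json.

```typescript
// Hypothetical shape for one month of commodity prices (not the real schema).
interface PricePoint {
  month: string;            // "YYYY-MM", so string order equals chronological order
  cpoMyrPerMt: number;      // CPO, Bursa Malaysia, MYR/MT
  stearinUsdPerMt: number;  // Palm Stearin, USD/MT
  pfadUsdPerMt: number;     // PFAD, USD/MT
  palmiticUsdPerMt: number; // Palmitic Acid, USD/MT
}

const MAX_MONTHS = 12;

// Insert or overwrite the entry for `point.month`, then keep only the newest 12 months.
function upsertMonth(history: PricePoint[], point: PricePoint): PricePoint[] {
  const others = history.filter((p) => p.month !== point.month);
  const sorted = others
    .concat([point])
    .sort((a, b) => (a.month < b.month ? -1 : a.month > b.month ? 1 : 0));
  return sorted.slice(Math.max(0, sorted.length - MAX_MONTHS));
}
```

Re-running a generation for the same month overwrites that month's entry rather than appending a duplicate, so the chart always sees at most one point per month.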

📄

Sample reports as training data. Historical PDFs in data/samples/ (Oct 2024, Nov 2024, Aug 2021, Dec 2018) defined the report structure, section categories, bullet format, and narrative voice. These informed the prompt engineering — the AI is instructed to match this specific style.

AI Pipeline — Three Passes

Each monthly report requires 9 API calls across two AI providers: 1 research call, 7 writer calls, and 1 validation call. Using a separate model for validation reduces the risk of self-confirmation bias.

Pass 1 — Research Agent

Model: Claude Sonnet 4.6 with web_search tool

Calls: 1 (multi-turn with up to 20 searches)

Tokens: ~8K input / ~4K output

The research agent executes a structured search plan across 9 priority-ordered categories. It returns a typed MarketResearch JSON with every data point sourced — including the URL and retrieval date.

9 Research Categories (priority order)

1 US Trade Policy & Tariffs

2 Palm Oil Pricing (CPO, stearin, PFAD)

3 Currency (MYR/USD, IDR/USD)

4 Supply & Demand

5 Ocean Freight (FBX01, FBX03)

6 Weather & Seasonal

7 Geopolitical Events

8 Historical Price Data (12 months)

9 US Dairy Market (CME, USDA)
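To make "every data point sourced" concrete, here is a minimal sketch of what a typed research result could look like, plus a check for entries missing their source. The category keys, the SourcedDataPoint fields, and findUnsourced are hypothetical names chosen for illustration; the real MarketResearch type lives in the project and may differ.

```typescript
// The nine categories from the list above; the key names are illustrative.
type ResearchCategory =
  | "tradePolicy" | "palmOilPricing" | "currency" | "supplyDemand" | "oceanFreight"
  | "weather" | "geopolitical" | "historicalPrices" | "usDairy";

// Hypothetical shape of one sourced data point.
interface SourcedDataPoint {
  claim: string;       // the fact as returned by the research pass
  sourceUrl: string;   // where the figure was found
  retrievedAt: string; // ISO date of retrieval, e.g. "2026-03-02"
}

type MarketResearch = Record<ResearchCategory, SourcedDataPoint[]>;

// Return every data point that is missing its URL or retrieval date.
function findUnsourced(research: MarketResearch): SourcedDataPoint[] {
  const missing: SourcedDataPoint[] = [];
  (Object.keys(research) as ResearchCategory[]).forEach((category) => {
    research[category].forEach((point) => {
      if (point.sourceUrl.trim() === "" || point.retrievedAt.trim() === "") {
        missing.push(point);
      }
    });
  });
  return missing;
}
```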

Pass 2 — Narrative Writer

Model: Claude Sonnet 4.6 (no web search)

Calls: 7 sequential (one per section)

Tokens: ~18K input / ~1.8K output

Temperature: 0.3 (low, for consistent output; not fully deterministic)

Each section call receives the same base context (research JSON + ERP data + memory + guidelines) plus section-specific instructions. The writer never searches — it only shapes existing data into narrative.

7 Sequential Calls → 7 Sections

Assessment: 5 bullets + opening

Top News: 2 logistics + 2 pricing

Trade Policy: 3 policy bullets

Price Factors: Bearish/bullish matrix

Ocean Freight: 3–4 freight bullets

Action Items: Exactly 3 CTAs

Asia Insight: Indonesia + Malaysia + India

Pass 3 — Validation Agent

Model: Google Gemini 2.5 Flash (independent provider)

Calls: 1

Tokens: ~8K input / ~600 output

A separate AI provider cross-checks every narrative claim against the research JSON. Using Gemini rather than Claude reduces self-confirmation bias, because the validator has no knowledge of how the narrative was generated.

Validation checks

✓ Supported — claim backed by research data

✗ Unsupported — no research backing

! Contradiction — conflicts with data

⏲ Stale — data older than 14 days

💬 Note — observation or clarification

Overall confidence = 40% rule-based data quality + 60% Gemini AI validation (0–100)
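The staleness rule and the weighted score above can be written out as two small functions. This is a sketch under stated assumptions: both sub-scores are taken to be on a 0–100 scale, the result is rounded to a whole number, and the function names are hypothetical rather than taken from confidence-score.ts or validation-agent.ts.

```typescript
const STALE_AFTER_DAYS = 14;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// A data point counts as stale when it was retrieved more than 14 days before `now`.
function isStale(retrievedAt: Date, now: Date): boolean {
  return now.getTime() - retrievedAt.getTime() > STALE_AFTER_DAYS * MS_PER_DAY;
}

// Weighted blend: 40% rule-based data quality + 60% validator score.
// Assumes both inputs are on a 0-100 scale; out-of-range inputs are clamped.
function overallConfidence(ruleBasedScore: number, validatorScore: number): number {
  const clamp = (x: number): number => Math.min(100, Math.max(0, x));
  return Math.round(0.4 * clamp(ruleBasedScore) + 0.6 * clamp(validatorScore));
}
```

For example, a rule-based score of 80 and a validator score of 90 blend to 0.4 × 80 + 0.6 × 90 = 86.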

Prompt Engineering

Each narrative call injects four layers of context into the prompt. This is how the system learns — prior corrections shape future output without retraining the model.

Layer 1 Cross-Month Memory

The system stores a summary for each prior month: market direction (bearish/bullish/neutral), key prices, top themes, and admin feedback scores. Injected as:

"PREVIOUS REPORT (Feb 2026):
Direction: bullish
CPO: 4,186 MYR/MT (up)
Themes: tariff uncertainty, freight tightening"

Stored in .narrative-cache/agent-memory.json — max 12 months

Layer 2 Distilled Guidelines

Every admin edit generates a correction entry. Claude Haiku periodically distills the full correction log into max 25 actionable rules, grouped by section.

"EDITORIAL GUIDELINES:
• Always include specific dates
• Never hedge with 'seems' or 'appears'
• Max 200 chars per bullet point"

Stored in knowledge/corrections-guidelines.json — version-controlled

Layer 3 ERP Context

Real business data from EFI's systems. The AI uses these as ground truth for prices, routes, and deadlines — never inventing numbers.

"EFI DATA:
FOB: MagnaPalm $845/MT (-2.1%)
Route: Belawan→Houston 32 days
Deadline: April orders by Mar 15"

Extracted live from PostgreSQL + Airtable

Layer 4 Voice & Structure Rules

Hard-coded in the system prompt, derived from the sample PDFs in data/samples/. Defines how the AI writes — tone, length, formatting, audience.

"VOICE:
• Max 200 chars per bullet
• Use 'EFI' not 'we'
• Confident: 'we expect' not 'seems'"

Hard-coded in narrative-writer.ts system prompt
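The four layers above can be pictured as labeled text blocks joined in a fixed order. The following is a minimal sketch, not the code in narrative-writer.ts: PromptLayers, renderBlock, and buildContext are hypothetical names, and whether the real system places all four layers in one string or splits them between system and user prompts is not specified here.

```typescript
// Hypothetical container for the four context layers described above.
interface PromptLayers {
  memory: string[];     // Layer 1: lines summarizing the previous report
  guidelines: string[]; // Layer 2: distilled editorial rules
  erpContext: string[]; // Layer 3: prices, routes, deadlines from ERP/CRM
  voiceRules: string[]; // Layer 4: voice and structure rules
}

// Render one labeled block; `bullet` adds a "• " prefix for rule lists.
function renderBlock(title: string, lines: string[], bullet: boolean): string {
  const body = lines.map((line) => (bullet ? "• " + line : line)).join("\n");
  return title + ":\n" + body;
}

// Join the non-empty layers in a fixed order, separated by blank lines.
function buildContext(previousLabel: string, layers: PromptLayers): string {
  const blocks: string[] = [];
  if (layers.memory.length > 0) {
    blocks.push(renderBlock("PREVIOUS REPORT (" + previousLabel + ")", layers.memory, false));
  }
  if (layers.guidelines.length > 0) {
    blocks.push(renderBlock("EDITORIAL GUIDELINES", layers.guidelines, true));
  }
  if (layers.erpContext.length > 0) {
    blocks.push(renderBlock("EFI DATA", layers.erpContext, false));
  }
  if (layers.voiceRules.length > 0) {
    blocks.push(renderBlock("VOICE", layers.voiceRules, true));
  }
  return blocks.join("\n\n");
}
```

Skipping empty layers keeps the prompt clean on a first run, when there is no prior month in memory and no corrections have been distilled yet.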

💡

Voice rules are embedded in the system prompt. The writer is instructed to be "professional but accessible, written for dairy farmers" — confident ("we expect" not "it seems"), specific with numbers, action-oriented, and reassuring even with bad news. Max 200 chars per bullet. Use "EFI" not "we". These rules were derived from the sample PDFs uploaded to data/samples/.

The Learning Loop

Every admin interaction with the report creates a feedback signal that shapes future reports. This is the core of the agentic pattern — the system gets better with each cycle.

Admin Interaction → Micro-Learning → Better Next Report

✎

Inline Edit

Admin rewrites a paragraph. Original vs. edited text becomes a correction entry.

💬

Re-prompt

"Make this more specific about tariffs." The instruction + result teaches the system what detail level is expected.

👁

Hide / Show

Hidden sections signal content the AI should de-prioritize. Feedback scores track approval.

📈

Shrink / Expand

AI condenses to ~50% or expands to ~150%. Teaches preferred section length.

✚

Add Section

Create custom sections via AI prompt or manual text. ADDED badge, reordering, PDF support, delete with confirm.

Corrections Distillation Process

1

Admin edits a paragraph

triggers → correction entry

The original AI text and the admin's corrected version are stored in corrections-log.json with a category (factual, tone, structure, omission, emphasis) and explanation.

2

Haiku distills corrections

background → fire-and-forget

Claude Haiku reads the full correction log and distills it into max 25 actionable rules, grouped by section (general, assessment, topNews, priceFactors, oceanFreight, actionItems, asiaInsight). Max 5 rules per section to keep prompts focused.

3

Guidelines stored in version control

knowledge/corrections-guidelines.json

Distilled rules are saved as a versioned JSON file. This means the learning is transparent, auditable, and survives code deploys.

4

Injected into next generation

every narrative prompt includes guidelines

The next time a report is generated, all distilled rules (up to 25) are injected as EDITORIAL GUIDELINES in the system prompt. The AI follows these rules alongside the voice guidelines from the sample reports.
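Step 2 mentions two limits at once: at most 5 rules per section and at most 25 rules overall. One way to enforce both after distillation is sketched below. How the real system applies the limits is not documented here; in particular, the choice that earlier rules and earlier sections win when the total is exceeded is an assumption, and enforceCaps is a hypothetical helper.

```typescript
type RulesBySection = { [section: string]: string[] };

const MAX_PER_SECTION = 5;
const MAX_TOTAL = 25;

// Trim each section to 5 rules, and stop adding rules once 25 are kept overall.
// Assumption: earlier rules and earlier sections take priority.
function enforceCaps(input: RulesBySection): RulesBySection {
  const output: RulesBySection = {};
  let total = 0;
  Object.keys(input).forEach((section) => {
    const room = Math.max(0, MAX_TOTAL - total);
    const kept = input[section].slice(0, Math.min(MAX_PER_SECTION, room));
    output[section] = kept;
    total += kept.length;
  });
  return output;
}
```

With seven groups and 5 rules each, the per-section cap alone would allow 35 rules, so the overall cap of 25 is the one that binds when every group is full.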

🛠

Cross-month memory is separate from corrections. Memory tracks what happened (prices, direction, themes) for continuity. Guidelines track how to write (rules, style, emphasis) for quality. Both are injected but serve different purposes.

LLM Stack & Cost

Component	Model	Calls	Input Tokens	Output Tokens	Cost
Research Agent	Claude Sonnet 4.6	1 (+ web search)	~8,000	~4,000	$0.07
Narrative Writer	Claude Sonnet 4.6	7 sequential	~18,000	~1,800	$0.07
Validation	Gemini 2.5 Flash	1	~8,000	~600	<$0.01
Corrections Distillation	Claude Haiku 4.5	1 (on-demand)	varies	~1,500	<$0.01
Total per run (excl. on-demand distillation)		9	~34,000	~6,400	~$0.14

$0.14 Per fresh run (Sonnet)

$0.04 Per run (Haiku dev mode)

$1.70 Annual production cost (12 monthly runs × $0.14 ≈ $1.68)

$28 200-run dev sprint (200 × $0.14)

Caching & Orchestration

Reports are generated once per month and cached on disk. The orchestration layer handles deduplication, rate limiting, and failure recovery.

Disk Cache — .narrative-cache/

{date}.json: Narrative sections (7 sections)

{date}-research.json: Raw research data + sources

{date}-confidence.json: Validation score + findings

{date}-feedback.json: Edits, overrides, hidden sections

agent-memory.json: Cross-month memory (12 months)

report-status.json: Draft / Published states

🔄

Request Deduplication

If multiple users request the same month's report simultaneously, a single AI generation runs and all callers share the same Promise. Prevents redundant API calls and rate limit issues.
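Sharing one Promise among concurrent callers is commonly done with a map of in-flight work keyed by the requested month. The sketch below illustrates that pattern; getReport, inFlight, and the generate callback are hypothetical names, and clearing the entry once the run settles is an assumption (the real orchestration in market-narrative.ts also involves the disk cache and a queue).

```typescript
// In-flight generations keyed by report month (e.g. "2026-03").
const inFlight = new Map<string, Promise<string>>();

// Return the pending promise for `month` if one exists; otherwise start a new run.
function getReport(
  month: string,
  generate: (month: string) => Promise<string>
): Promise<string> {
  const pending = inFlight.get(month);
  if (pending !== undefined) {
    return pending;
  }
  const run = generate(month);
  inFlight.set(month, run);
  // Forget the entry once the run settles (success or failure),
  // so a later request can start a fresh run.
  const clear = (): void => {
    inFlight.delete(month);
  };
  run.then(clear, clear);
  return run;
}
```

Callers that arrive while a run is pending receive the very same Promise object, so a failure is also shared: every waiting caller sees the same rejection.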

⏱

Failure Cooldown

After a generation failure (API error, rate limit), a 5-minute cooldown prevents repeated hammering. Exponential backoff on 429 errors (30s, 60s, 120s).
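The timing rules above reduce to two pure functions, which makes them easy to test without real timers. This is a sketch: the constant and function names are hypothetical, and capping retries at three is inferred from the three delays listed (30s, 60s, 120s).

```typescript
const COOLDOWN_MS = 5 * 60 * 1000; // 5-minute cooldown after a failed generation
const BASE_BACKOFF_MS = 30 * 1000; // first retry delay on HTTP 429
const MAX_RETRIES = 3;             // yields 30s, 60s, 120s

// Delay before retry number `attempt` (0-based), or null when retries are exhausted.
function backoffDelayMs(attempt: number): number | null {
  if (attempt < 0 || attempt >= MAX_RETRIES) {
    return null;
  }
  return BASE_BACKOFF_MS * Math.pow(2, attempt);
}

// True while a new generation should be refused because of a recent failure.
function inCooldown(lastFailureAtMs: number | null, nowMs: number): boolean {
  return lastFailureAtMs !== null && nowMs - lastFailureAtMs < COOLDOWN_MS;
}
```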

Admin Editing & Workflow

Rich editorial controls let admins shape AI-generated reports before publishing to customers. Every admin action participates in the learning loop.

✎

Rich Text Editor

Contenteditable + toolbar with bold, italic, underline, and color formatting. Stored as HTML in editorialOverrides. Figures auto-highlighted in non-rich text.

↺

Undo / Redo

Full undo/redo stack for editorial changes. POST /api/undo and POST /api/redo restore previous states. GET /api/undo-status returns stack counts.
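An undo/redo stack is typically two stacks around a current state. The class below is a generic sketch of that idea and of the counts a status endpoint could return; EditHistory and its method names are hypothetical and are not the implementation behind /api/undo, /api/redo, or /api/undo-status.

```typescript
// Generic two-stack history: `past` holds undoable states, `future` holds redoable ones.
class EditHistory<T> {
  private past: T[] = [];
  private future: T[] = [];
  private current: T;

  constructor(initial: T) {
    this.current = initial;
  }

  // Record a new state; any redo history is discarded, as in most editors.
  apply(next: T): void {
    this.past.push(this.current);
    this.current = next;
    this.future = [];
  }

  // Step back one state if possible; returns the state now in effect.
  undo(): T {
    const previous = this.past.pop();
    if (previous !== undefined) {
      this.future.push(this.current);
      this.current = previous;
    }
    return this.current;
  }

  // Step forward one state if possible; returns the state now in effect.
  redo(): T {
    const next = this.future.pop();
    if (next !== undefined) {
      this.past.push(this.current);
      this.current = next;
    }
    return this.current;
  }

  // Stack counts of the kind a status endpoint could report.
  status(): { undoCount: number; redoCount: number } {
    return { undoCount: this.past.length, redoCount: this.future.length };
  }
}
```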

📝

Version History

Every editorial change is versioned. GET /api/version-history shows all versions for a section. POST /api/activate-version restores a specific version.

🔎

Content Consistency Checker

Gemini-powered verification of admin edits against the full report. Pre-publish gate: blocks on errors, allows warnings. Purple section highlighting.
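The "blocks on errors, allows warnings" rule can be expressed as a small pure function over the checker's findings. The ConsistencyFinding shape and evaluateGate are hypothetical names for illustration; the actual output format of consistency-checker.ts is not shown in this document.

```typescript
type Severity = "error" | "warning";

// Hypothetical shape of one finding from the consistency check.
interface ConsistencyFinding {
  section: string;
  severity: Severity;
  message: string;
}

// Publishing is blocked by any error; warnings are surfaced but do not block.
function evaluateGate(
  findings: ConsistencyFinding[]
): { canPublish: boolean; errors: number; warnings: number } {
  const errors = findings.filter((f) => f.severity === "error").length;
  const warnings = findings.length - errors;
  return { canPublish: errors === 0, errors: errors, warnings: warnings };
}
```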

📄

Draft / Published Workflow

Reports start as Draft (admin-only). Admin publishes to make visible to customers. Unpublish reverts to draft. Status tracked in report-status.json.

👁

Section & Item Controls

Hide/unhide sections and individual items. Reorder sections and items within sections. Move items between sections. Add custom AI-generated or manual sections with ADDED badge.

Customer Insights Server (Port 3001)

A separate Express server for data-only customer dashboards. No AI dependency — starts instantly and runs without ANTHROPIC_API_KEY.

📈

Customer Dashboard

Per-customer insights: monthly snapshot, YTD totals, pricing recap, sourcing analysis, purchase chart with trailing 12-month data, product logos, and loadout schedule. All 12 data fetchers run in parallel via Promise.all().

📄

PDF Export

Full customer dashboard as PDF via Puppeteer. Admin picker lets you select any customer + month combination. Customer list based on trailing 12-month fulfilled tonnage.

ⓘ

Two independent servers, one package. Market Report (port 3000) and Customer Insights (port 3001) share fetchers, views, and auth code but have different route sets. Start with npm run market and npm run insights.

Hosted Modules (Port 3000)

The market report server hosts several additional modules alongside the core market report. Desktop top nav: Market Report | Trade Intel | Calculators | Guide.

Trade Intelligence

Competitive analysis from Datamyne BOL records: dashboard, importer/supplier rankings, classification watch, entity resolution, and brand discovery. Routes at /trade-intel/*.

Calculators

Milk Pricing & ROI (live USDA data, EFI sales overlay, AI narrative), Fat Blending (LP optimizer, Product Database targets), Product Database (60+ products, CRUD, 9 categories).

Mobile PWA

Progressive Web App at /m/* routes. Bottom tab bar: Reports | Intel | Calc. Service worker (network-first), Add to Home Screen from Safari. Intel and Calc sub-pages via pill bars.

Supplier Intel

IMAP email poller for supplier market intelligence (GNNH, RIM, etc.). AI extraction, formatted detail views, reprocess/delete controls. Routes at /supplier-intel/*.

Supplier Intel Email Poller

IMAP-based email ingestion for supplier market intelligence. Automatically polls for emails from key suppliers (GNNH, RIM, etc.), extracts structured data via AI, and presents formatted detail views.

📧

Two-Phase Fetch

Envelopes are downloaded first (fast), then the full source of each message. Timeouts: 90s socket, 30s greeting/connection. Scans the last 50 messages by sequence number.

📄

HTML Extraction

Handles base64 decoding, quoted-printable decoding, HTML tag stripping, and Outlook forwarded emails with nested MIME parts.

🔬

Formatted Detail Views

Source-specific formatting: GNNH (prices table, FOB, production, outlook), RIM (tariffs, AD/CVD, impact), Other (summary, data points). Icon buttons (eye/refresh/trash) with custom confirm dialogs.

🔄

Reprocess & blocklist. The reprocess button re-runs AI extraction with improved text decoding on any previously ingested email. Deleted emails are added to a blocklist to prevent re-ingestion on the next poll cycle.
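The blocklist behaviour can be sketched as a filter applied to freshly polled envelopes before the full message is fetched. Keying on a message ID is an assumption, as are the Envelope shape and the selectForIngestion and deleteEmail helpers; the real poller in email-poller.ts may identify emails differently.

```typescript
// Hypothetical minimal envelope; the real poller may key on a different identifier.
interface Envelope {
  messageId: string;
  subject: string;
}

// Keep only messages that are neither blocklisted (deleted earlier) nor already ingested.
function selectForIngestion(
  polled: Envelope[],
  blocklist: Set<string>,
  ingested: Set<string>
): Envelope[] {
  return polled.filter((e) => !blocklist.has(e.messageId) && !ingested.has(e.messageId));
}

// Deleting an email records its id so the next poll cycle skips it.
function deleteEmail(messageId: string, blocklist: Set<string>, ingested: Set<string>): void {
  ingested.delete(messageId);
  blocklist.add(messageId);
}
```

Without the blocklist, a deleted email would look new on the next poll, because it is still among the last 50 messages on the mail server.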

Key Source Files

File	Purpose	Lines
`fetchers/market-narrative.ts`	Orchestration — caching, dedup, queue, pipeline execution	~1,500
`narrative/research-agent.ts`	Pass 1 — Claude + web_search, 9 research categories	~400
`narrative/narrative-writer.ts`	Pass 2 — 7 sequential section generators + shrink/expand/re-prompt	~600
`narrative/validation-agent.ts`	Pass 3 — Gemini cross-check, confidence scoring	~200
`narrative/memory.ts`	Cross-month memory — prices, direction, themes, feedback	~150
`narrative/corrections.ts`	Correction log + Haiku distillation → 25 editorial rules	~200
`narrative/types.ts`	Section definitions, feedback types, editorial overrides	~250
`narrative/confidence-score.ts`	Rule-based data quality scoring (40% of overall confidence)	~150
`narrative/consistency-checker.ts`	Gemini pre-publish gate — flags contradictions in edited reports	~150
`supplier-intel/email-poller.ts`	IMAP email polling — two-phase fetch, HTML extraction, delete blocklist	~400
`supplier-intel/extractor.ts`	AI extraction — GNNH/RIM/Other format detection, structured data output	~300
`knowledge/corrections-guidelines.json`	Distilled editorial rules (max 25) — version-controlled	—
`data/samples/*.pdf`	Historical reports that defined voice, structure, bullet format	—