Agent 01 / 04

Market Research Agent

The primary EFI application server. Hosts the AI market report pipeline (3-pass Claude + Gemini), trade intelligence dashboards, calculators, a supplier-intel email poller, and a mobile PWA. Each admin edit becomes a micro-learning signal that improves future reports.

Package: @efi/market-research
AI Models: Claude Sonnet 4.6 + Gemini 2.5 Flash
Cost / Run: ~$0.14
Learning: Corrections → 25 distilled rules
EFI Market Report at a glance: 3 AI Passes · 9 API Calls / Run · 7 Report Sections · 25 Max Learned Rules

System Architecture

The agent follows a research → write → validate → review → learn cycle. Admin edits become distilled guidelines that shape future generations — a continuous learning loop.

Foundation
📄 Sample Reports: PDFs define voice, categories, structure
📚 Knowledge Base: Price history + correction guidelines
Data Layer
ERP (Postgres): FOB prices, routes, deadlines
CRM (Airtable): Customers, contracts, products
Web Search: Claude + 20 live searches
Price History: 12-month commodity accumulator
Prompt Construction
Memory: Prior month context
Guidelines: 25 learned rules
ERP Context: Real prices & routes
AI Pipeline
Pass 1 (Research): Claude Sonnet 4.6 + web_search (~8K in / ~4K out)
Pass 2 (Writer): Claude Sonnet 4.6 × 7 calls (~18K in / ~1.8K out)
Pass 3 (Validate): Google Gemini 2.5 Flash (~8K in / ~600 out)
Output & Review
Draft Report: 7 sections + confidence score
Admin Review: Edit, re-prompt, hide, reorder, add sections
Published Report: PDF + Web + Customer view
Learning Loop
Admin Edits: Every edit = correction entry
Haiku Distillation: Corrections → max 25 rules
Guidelines + Memory: Injected into next generation
↺ Feeds back into Prompt Construction

Data Sources

Every report combines four independent data streams. The AI never invents numbers — prices, routes, and deadlines come from verified sources.

🗃 ERP (PostgreSQL)

Database efi_erp_2025_14 on DigitalOcean. Supplies:

  • FOB prices from supplier_purchase_orders (781 POs)
  • Transit times from ref_otw_transit_times (84 port pairs)
  • Ocean freight rates from ref_landed_cost_estimates
  • Sourcing deadlines from supplier_capacities

📋 CRM (Airtable)

12 fetchers pull customer-facing data:

  • Customer profile from Customers table
  • Monthly snapshot, YTD totals from Sales Orders
  • Sourcing analysis, pricing recap from Contract Line Items
  • Product logos from Products table

🌐 Web Research (Claude)

Pass 1 uses Claude Sonnet 4.6 with the web_search tool to perform up to 20 live searches across 9 research categories:

  • US tariffs & trade policy (HTS 3823, 1511, 1516)
  • Palm oil pricing (CPO, stearin, PFAD, palmitic acid)
  • Ocean freight indices (FBX01, FBX03)
  • Currency, supply/demand, weather, geopolitical
  • US dairy market (CME butter, Class III/IV, butterfat)

📈 Price History (Accumulated)

Rolling 12-month commodity price archive, updated each generation run:

  • CPO (Bursa Malaysia, MYR/MT)
  • Palm Stearin (USD/MT)
  • PFAD (USD/MT)
  • Palmitic Acid (USD/MT)

Stored in knowledge/price-history.json — feeds the commodity pricing chart on the report.

📄 Sample reports as training data. Historical PDFs in data/samples/ (Oct 2024, Nov 2024, Aug 2021, Dec 2018) defined the report structure, section categories, bullet format, and narrative voice. These informed the prompt engineering — the AI is instructed to match this specific style.

AI Pipeline — Three Passes

Each monthly report requires exactly 9 API calls across two AI providers. Using a separate model for validation reduces self-confirmation bias.

Pass 1 — Research Agent

Model: Claude Sonnet 4.6 with web_search tool

Calls: 1 (multi-turn with up to 20 searches)

Tokens: ~8K input / ~4K output

The research agent executes a structured search plan across 9 priority-ordered categories. It returns a typed MarketResearch JSON with every data point sourced — including the URL and retrieval date.

9 Research Categories (priority order)
1 US Trade Policy & Tariffs
2 Palm Oil Pricing (CPO, stearin, PFAD)
3 Currency (MYR/USD, IDR/USD)
4 Supply & Demand
5 Ocean Freight (FBX01, FBX03)
6 Weather & Seasonal
7 Geopolitical Events
8 Historical Price Data (12 months)
9 US Dairy Market (CME, USDA)
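The research pass returns a typed JSON object in which every data point carries its source. A minimal sketch of what that shape might look like (field names here are illustrative assumptions, not the actual MarketResearch definition in narrative/types.ts):

```typescript
// Illustrative shape for the Pass 1 output — every claim the writer later
// makes must trace back to one of these sourced entries.
interface SourcedDataPoint {
  value: string;       // e.g. "4,186 MYR/MT"
  url: string;         // where the figure was found
  retrievedAt: string; // ISO date of retrieval
}

interface MarketResearch {
  tradePolicy: SourcedDataPoint[];
  palmOilPricing: SourcedDataPoint[];
  currency: SourcedDataPoint[];
  oceanFreight: SourcedDataPoint[];
  // ...one array per research category
}

const example: MarketResearch = {
  tradePolicy: [{
    value: "Section 301 review extended",
    url: "https://example.gov/notice", // hypothetical URL
    retrievedAt: "2026-03-01",
  }],
  palmOilPricing: [],
  currency: [],
  oceanFreight: [],
};
```

Typing the research output this way lets the validator in Pass 3 mechanically match narrative claims against sourced entries.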

Pass 2 — Narrative Writer

Model: Claude Sonnet 4.6 (no web search)

Calls: 7 sequential (one per section)

Tokens: ~18K input / ~1.8K output

Temperature: 0.3 (low, for consistent output)

Each section call receives the same base context (research JSON + ERP data + memory + guidelines) plus section-specific instructions. The writer never searches — it only shapes existing data into narrative.

7 Sequential Calls → 7 Sections
Assessment: 5 bullets + opening
Top News: 2 logistics + 2 pricing
Trade Policy: 3 policy bullets
Price Factors: Bearish/bullish matrix
Ocean Freight: 3–4 freight bullets
Action Items: Exactly 3 CTAs
Asia Insight: Indonesia + Malaysia + India
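The per-section calls can be sketched as a prompt-assembly step: the same base context every time, plus section-specific instructions. (Function and field names below are illustrative; the real logic lives in narrative/narrative-writer.ts.)

```typescript
// Sketch of per-section prompt assembly — same base context for all 7 calls,
// only the section instructions differ.
interface BaseContext {
  researchJson: string; // Pass 1 output
  erpContext: string;   // "EFI DATA: ..." block
  memory: string;       // "PREVIOUS REPORT ..." block
  guidelines: string;   // "EDITORIAL GUIDELINES: ..." block
}

function buildSectionPrompt(base: BaseContext, sectionInstructions: string): string {
  return [
    base.memory,
    base.guidelines,
    base.erpContext,
    `RESEARCH:\n${base.researchJson}`,
    `SECTION INSTRUCTIONS:\n${sectionInstructions}`,
  ].join("\n\n");
}

const prompt = buildSectionPrompt(
  {
    researchJson: "{}",
    erpContext: "EFI DATA: ...",
    memory: "PREVIOUS REPORT ...",
    guidelines: "EDITORIAL GUIDELINES: ...",
  },
  "Write the Assessment: 5 bullets + opening.",
);
```

Keeping the base context identical across calls is what makes the seven sections mutually consistent without the writer ever searching.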

Pass 3 — Validation Agent

Model: Google Gemini 2.5 Flash (independent provider)

Calls: 1

Tokens: ~8K input / ~600 output

A separate AI provider cross-checks every narrative claim against the research JSON. Using Gemini (not Claude) avoids self-confirmation bias — the validator has no knowledge of how the narrative was generated.

Validation checks
Supported — claim backed by research data
Unsupported — no research backing
! Contradiction — conflicts with data
Stale — data older than 14 days
💬 Note — observation or clarification

Overall confidence = 40% rule-based data quality + 60% Gemini AI validation (0–100)
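The blended score above can be written as a one-liner (weights taken directly from the text: 40% rule-based data quality, 60% Gemini validation, both on a 0–100 scale):

```typescript
// 40% rule-based data quality + 60% Gemini AI validation → overall 0–100.
function overallConfidence(ruleBased: number, geminiScore: number): number {
  return Math.round(0.4 * ruleBased + 0.6 * geminiScore);
}
```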

Prompt Engineering

Each narrative call injects four layers of context into the prompt. This is how the system learns — prior corrections shape future output without retraining the model.

Layer 1 Cross-Month Memory

The system stores a summary for each prior month: market direction (bearish/bullish/neutral), key prices, top themes, and admin feedback scores. Injected as:

"PREVIOUS REPORT (Feb 2026):
Direction: bullish
CPO: 4,186 MYR/MT (up)
Themes: tariff uncertainty, freight tightening"

Stored in .narrative-cache/agent-memory.json — max 12 months
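A sketch of how the memory entry above might be stored and rendered into the prompt, including the 12-month cap (the record shape is an assumption based on the example; real storage is .narrative-cache/agent-memory.json):

```typescript
// Cross-month memory: keep at most 12 entries, render the latest as the
// "PREVIOUS REPORT" prompt block shown above. Shape is illustrative.
interface MonthMemory {
  month: string; // e.g. "Feb 2026"
  direction: "bearish" | "bullish" | "neutral";
  cpo: string;   // e.g. "4,186 MYR/MT (up)"
  themes: string[];
}

const MAX_MONTHS = 12;

function renderMemory(history: MonthMemory[]): string {
  const recent = history.slice(-MAX_MONTHS); // drop anything older than 12 months
  const last = recent[recent.length - 1];
  if (!last) return "";
  return [
    `PREVIOUS REPORT (${last.month}):`,
    `Direction: ${last.direction}`,
    `CPO: ${last.cpo}`,
    `Themes: ${last.themes.join(", ")}`,
  ].join("\n");
}

const block = renderMemory([{
  month: "Feb 2026",
  direction: "bullish",
  cpo: "4,186 MYR/MT (up)",
  themes: ["tariff uncertainty", "freight tightening"],
}]);
```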

Layer 2 Distilled Guidelines

Every admin edit generates a correction entry. Claude Haiku periodically distills the full correction log into max 25 actionable rules, grouped by section.

"EDITORIAL GUIDELINES:
• Always include specific dates
• Never hedge with 'seems' or 'appears'
• Max 200 chars per bullet point"

Stored in knowledge/corrections-guidelines.json — version-controlled

Layer 3 ERP Context

Real business data from EFI's systems. The AI uses these as ground truth for prices, routes, and deadlines — never inventing numbers.

"EFI DATA:
FOB: MagnaPalm $845/MT (-2.1%)
Route: Belawan→Houston 32 days
Deadline: April orders by Mar 15"

Extracted live from PostgreSQL + Airtable

Layer 4 Voice & Structure Rules

Hard-coded in the system prompt, derived from the sample PDFs in data/samples/. Defines how the AI writes — tone, length, formatting, audience.

"VOICE:
• Max 200 chars per bullet
• Use 'EFI' not 'we'
• Confident: 'we expect' not 'seems'"

Hard-coded in narrative-writer.ts system prompt

💡 Voice rules are embedded in the system prompt. The writer is instructed to be "professional but accessible, written for dairy farmers" — confident ("we expect" not "it seems"), specific with numbers, action-oriented, and reassuring even with bad news. Max 200 chars per bullet. Use "EFI" not "we". These rules were derived from the sample PDFs uploaded to data/samples/.

The Learning Loop

Every admin interaction with the report creates a feedback signal that shapes future reports. This is the core of the agentic pattern — the system gets better with each cycle.

Admin Interaction → Micro-Learning → Better Next Report
Inline Edit

Admin rewrites a paragraph. Original vs. edited text becomes a correction entry.

💬 Re-prompt

"Make this more specific about tariffs." The instruction + result teaches the system what detail level is expected.

👁 Hide / Show

Hidden sections signal content the AI should de-prioritize. Feedback scores track approval.

📈 Shrink / Expand

AI condenses to ~50% or expands to ~150%. Teaches preferred section length.

Add Section

Create custom sections via AI prompt or manual text. ADDED badge, reordering, PDF support, delete with confirm.

Corrections Distillation Process

1
Admin edits a paragraph
triggers → correction entry

The original AI text and the admin's corrected version are stored in corrections-log.json with a category (factual, tone, structure, omission, emphasis) and explanation.

2
Haiku distills corrections
background → fire-and-forget

Claude Haiku reads the full correction log and distills it into max 25 actionable rules, grouped by section (general, assessment, topNews, priceFactors, oceanFreight, actionItems, asiaInsight). Max 5 rules per section to keep prompts focused.
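The caps described in step 2 — at most 5 rules per section, 25 overall — can be sketched as a small pure function (names are illustrative; the real distillation lives in narrative/corrections.ts):

```typescript
// Cap distilled rules: max 5 per section, max 25 total, preserving
// section order. Sketch only — the actual distillation is done by Haiku.
const MAX_PER_SECTION = 5;
const MAX_TOTAL = 25;

function capRules(rulesBySection: Record<string, string[]>): string[] {
  const capped: string[] = [];
  for (const section of Object.keys(rulesBySection)) {
    for (const rule of rulesBySection[section].slice(0, MAX_PER_SECTION)) {
      if (capped.length >= MAX_TOTAL) return capped;
      capped.push(`[${section}] ${rule}`);
    }
  }
  return capped;
}
```

Capping per section keeps any one section from crowding out the others' guidelines in the prompt.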

3
Guidelines stored in version control
knowledge/corrections-guidelines.json

Distilled rules are saved as a versioned JSON file. This means the learning is transparent, auditable, and survives code deploys.

4
Injected into next generation
every narrative prompt includes guidelines

The next time a report is generated, all 25 rules are injected as EDITORIAL GUIDELINES in the system prompt. The AI follows these rules alongside the voice guidelines from the sample reports.

🛠 Cross-month memory is separate from corrections. Memory tracks what happened (prices, direction, themes) for continuity. Guidelines track how to write (rules, style, emphasis) for quality. Both are injected but serve different purposes.

LLM Stack & Cost

Component | Model | Calls | Input Tokens | Output Tokens | Cost
Research Agent | Claude Sonnet 4.6 | 1 (+ web search) | ~8,000 | ~4,000 | $0.07
Narrative Writer | Claude Sonnet 4.6 | 7 sequential | ~18,000 | ~1,800 | $0.07
Validation | Gemini 2.5 Flash | 1 | ~8,000 | ~600 | <$0.01
Corrections Distillation | Claude Haiku 4.5 | 1 (on-demand) | varies | ~1,500 | <$0.01
Total per run | | 9 | ~34,000 | ~6,400 | ~$0.14
$0.14 per fresh run (Sonnet) · $0.04 per run (Haiku dev mode) · $1.70 annual production cost · $28 for a 200-run dev sprint

Caching & Orchestration

Reports are generated once per month and cached on disk. The orchestration layer handles deduplication, rate limiting, and failure recovery.

Disk Cache — .narrative-cache/
{date}.json: Narrative sections (7 sections)
{date}-research.json: Raw research data + sources
{date}-confidence.json: Validation score + findings
{date}-feedback.json: Edits, overrides, hidden sections
agent-memory.json: Cross-month memory (12 months)
report-status.json: Draft / Published states
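A sketch of how the per-month cache filenames above might be derived (the helper name is an assumption; the directory is .narrative-cache/):

```typescript
// Map a report month + artifact kind to its cache file. The narrative
// itself uses the bare {date}.json name; other artifacts get a suffix.
type CacheKind = "narrative" | "research" | "confidence" | "feedback";

function cachePath(date: string, kind: CacheKind): string {
  const suffix = kind === "narrative" ? "" : `-${kind}`;
  return `.narrative-cache/${date}${suffix}.json`;
}
```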

🔄 Request Deduplication

If multiple users request the same month's report simultaneously, a single AI generation runs and all callers share the same Promise. Prevents redundant API calls and rate limit issues.
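A minimal sketch of that promise-sharing pattern: concurrent callers for the same month get the same in-flight Promise, so the expensive pipeline runs once (function names are illustrative):

```typescript
// In-flight map: first caller starts generation; later callers for the
// same month share the same Promise until it settles.
const inFlight = new Map<string, Promise<string>>();
let generations = 0; // counts how often the pipeline actually runs

async function generateReport(month: string): Promise<string> {
  generations++; // stand-in for the expensive 9-call AI pipeline
  return `report for ${month}`;
}

function getReport(month: string): Promise<string> {
  let pending = inFlight.get(month);
  if (!pending) {
    // Remove from the map once settled so a later request regenerates fresh.
    pending = generateReport(month).finally(() => inFlight.delete(month));
    inFlight.set(month, pending);
  }
  return pending;
}
```

Deleting the entry in `finally` matters: without it, a failed generation would be cached forever.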

Failure Cooldown

After a generation failure (API error, rate limit), a 5-minute cooldown prevents repeated hammering. Exponential backoff on 429 errors (30s, 60s, 120s).
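The 30s/60s/120s schedule on 429s reduces to a pure function (attempt is 1-based; the 5-minute cooldown after a failed generation is a separate mechanism):

```typescript
// Exponential backoff for 429 rate-limit errors: 30s, 60s, 120s.
const COOLDOWN_MS = 5 * 60 * 1000; // post-failure cooldown, applied separately

function backoffDelay(attempt: number): number {
  return 30_000 * 2 ** (attempt - 1); // attempts 1–3 → 30s, 60s, 120s
}
```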

Admin Editing & Workflow

Rich editorial controls let admins shape AI-generated reports before publishing to customers. Every admin action participates in the learning loop.

Rich Text Editor

Contenteditable + toolbar with bold, italic, underline, and color formatting. Stored as HTML in editorialOverrides. Figures auto-highlighted in non-rich text.

Undo / Redo

Full undo/redo stack for editorial changes. POST /api/undo and POST /api/redo restore previous states. GET /api/undo-status returns stack counts.
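The stack semantics behind those endpoints can be sketched as a small class (in the app, the state lives server-side behind /api/undo and /api/redo; this is an illustrative model, not the actual implementation):

```typescript
// Undo/redo as two stacks: a new edit clears the redo stack, undo moves
// one state across, redo moves it back.
class EditHistory<T> {
  private undoStack: T[] = [];
  private redoStack: T[] = [];

  apply(state: T): void {
    this.undoStack.push(state);
    this.redoStack = []; // a fresh edit invalidates anything redoable
  }
  undo(): T | undefined {
    const state = this.undoStack.pop();
    if (state !== undefined) this.redoStack.push(state);
    return state;
  }
  redo(): T | undefined {
    const state = this.redoStack.pop();
    if (state !== undefined) this.undoStack.push(state);
    return state;
  }
  status(): { undo: number; redo: number } {
    return { undo: this.undoStack.length, redo: this.redoStack.length };
  }
}
```

The `status()` shape mirrors what GET /api/undo-status returns: counts for both stacks.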

📝 Version History

Every editorial change is versioned. GET /api/version-history shows all versions for a section. POST /api/activate-version restores a specific version.

🔎 Content Consistency Checker

Gemini-powered verification of admin edits against the full report. Pre-publish gate: blocks on errors, allows warnings. Purple section highlighting.

📄 Draft / Published Workflow

Reports start as Draft (admin-only). Admin publishes to make visible to customers. Unpublish reverts to draft. Status tracked in report-status.json.

👁 Section & Item Controls

Hide/unhide sections and individual items. Reorder sections and items within sections. Move items between sections. Add custom AI-generated or manual sections with ADDED badge.

Customer Insights Server (Port 3001)

A separate Express server for data-only customer dashboards. No AI dependency — starts instantly and runs without ANTHROPIC_API_KEY.

📈 Customer Dashboard

Per-customer insights: monthly snapshot, YTD totals, pricing recap, sourcing analysis, purchase chart with trailing 12-month data, product logos, and loadout schedule. All 12 data fetchers run in parallel via Promise.all().

📄 PDF Export

Full customer dashboard as PDF via Puppeteer. Admin picker lets you select any customer + month combination. Customer list based on trailing 12-month fulfilled tonnage.

Two independent servers, one package. Market Report (port 3000) and Customer Insights (port 3001) share fetchers, views, and auth code but have different route sets. Start with npm run market and npm run insights.

Hosted Modules (Port 3000)

The market report server hosts several additional modules alongside the core market report. Desktop top nav: Market Report | Trade Intel | Calculators | Guide.

Trade Intelligence

Competitive analysis from Datamyne BOL records: dashboard, importer/supplier rankings, classification watch, entity resolution, and brand discovery. Routes at /trade-intel/*.

Calculators

Milk Pricing & ROI (live USDA data, EFI sales overlay, AI narrative), Fat Blending (LP optimizer, Product Database targets), Product Database (60+ products, CRUD, 9 categories).

Mobile PWA

Progressive Web App at /m/* routes. Bottom tab bar: Reports | Intel | Calc. Service worker (network-first), Add to Home Screen from Safari. Intel and Calc sub-pages via pill bars.

Supplier Intel

IMAP email poller for supplier market intelligence (GNNH, RIM, etc.). AI extraction, formatted detail views, reprocess/delete controls. Routes at /supplier-intel/*.

Supplier Intel Email Poller

IMAP-based email ingestion for supplier market intelligence. Automatically polls for emails from key suppliers (GNNH, RIM, etc.), extracts structured data via AI, and presents formatted detail views.

📧 Two-Phase Fetch

Envelopes downloaded first (fast), then full message source per message. 90s socket timeout, 30s greeting/connection. Scans last 50 messages by sequence number.

📄 HTML Extraction

Handles base64 decoding, quoted-printable encoding, HTML tag stripping, and Outlook forwarded emails with nested MIME parts.

🔬 Formatted Detail Views

Source-specific formatting: GNNH (prices table, FOB, production, outlook), RIM (tariffs, AD/CVD, impact), Other (summary, data points). Icon buttons (eye/refresh/trash) with custom confirm dialogs.

🔄 Reprocess & blocklist. The reprocess button re-runs AI extraction with improved text decoding on any previously ingested email. Deleted emails are added to a blocklist to prevent re-ingestion on the next poll cycle.
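The delete-to-blocklist flow is essentially a persisted set of message ids checked on each poll. A minimal sketch (persistence omitted; names are illustrative):

```typescript
// Once an email id is blocklisted via delete, later poll cycles skip it.
const blocklist = new Set<string>();

function deleteEmail(messageId: string): void {
  blocklist.add(messageId); // remembered across poll cycles
}

function filterNewMessages(messageIds: string[]): string[] {
  return messageIds.filter((id) => !blocklist.has(id));
}
```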

Key Source Files

File | Purpose | Lines
fetchers/market-narrative.ts | Orchestration — caching, dedup, queue, pipeline execution | ~1,500
narrative/research-agent.ts | Pass 1 — Claude + web_search, 9 research categories | ~400
narrative/narrative-writer.ts | Pass 2 — 7 sequential section generators + shrink/expand/re-prompt | ~600
narrative/validation-agent.ts | Pass 3 — Gemini cross-check, confidence scoring | ~200
narrative/memory.ts | Cross-month memory — prices, direction, themes, feedback | ~150
narrative/corrections.ts | Correction log + Haiku distillation → 25 editorial rules | ~200
narrative/types.ts | Section definitions, feedback types, editorial overrides | ~250
narrative/confidence-score.ts | Rule-based data quality scoring (40% of overall confidence) | ~150
narrative/consistency-checker.ts | Gemini pre-publish gate — flags contradictions in edited reports | ~150
supplier-intel/email-poller.ts | IMAP email polling — two-phase fetch, HTML extraction, delete blocklist | ~400
supplier-intel/extractor.ts | AI extraction — GNNH/RIM/Other format detection, structured data output | ~300
knowledge/corrections-guidelines.json | Distilled editorial rules (max 25) — version-controlled |
data/samples/*.pdf | Historical reports that defined voice, structure, bullet format |