Four commitments that shape every scan
Forensic, not editorial
We score what is structurally present — framing, evidence quality, omissions, emotional load — not whether we agree with the conclusion.
Multi-axis, not left/right
A single ideology label flattens complex argumentation. FME decomposes text into independent rhetorical dimensions that can be reasoned about separately.
Span-anchored evidence
Every score traces back to the exact words that produced it. "Why this 72?" is always answerable with a quoted passage and a scholarly technique name.
Reproducible & auditable
Every scan returns a prompt_hash and fme_version. Same inputs produce equivalent outputs. Bench F1 is published per release.
What FME V19 measures
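V19 ships seven numbered capabilities in P1, plus cross-platform corroboration (6b), which is new in this release. Argument mining and cognitive bias detection are schema-reserved for P2, and three further capabilities are planned for P3. The table below traces each capability from its scholarly or data source to its current status.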
| # | Capability | Scholarly source or basis | Status |
|---|---|---|---|
| 1 | Span-level propaganda detection | Da San Martino et al., SemEval-2020 Task 11 | P1 ✓ |
| 2 | Aristotelian appeals | Aristotle's Rhetoric; LEPAn | P1 ✓ |
| 3 | Full fallacy enumeration | SemEval persuasion + classical logic | P1 ✓ |
| 4 | Emotion model + arc | Plutchik-8; VAD model | P1 ✓ |
| 5 | Full-document analysis | Eliminates V18.6 truncation bias | P1 ✓ |
| 6 | External claim grounding | Google Fact Check Tools, Wikidata | P1 ✓ |
| 6b | Cross-platform corroboration (Stage 1.6) | Real-time X, News, Web signals — blended into FGI | P1 ✓ NEW |
| 7 | Ground-truth validation pipeline | SemEval-2020 public set | P1 ✓ |
| 8 | Argument mining | Stab & Gurevych (2017) | P2 reserved |
| 9 | Cognitive bias / nudge detection | Kahneman, Thaler | P2 reserved |
| 10 | Cross-source framing | Media Frames Corpus | P3 — CSCE |
| 11 | Strategic silence | Agenda-setting theory | P3 — corpus |
| 12 | Discourse structure | Rhetorical Structure Theory | P3 — parser |
The four-stage forensic pipeline
V19 replaces the V18.6 monolithic single-pass LLM prompt with a deterministic four-stage architecture, in which Stages 1.5 and 1.6 run in parallel as sub-stages. The LLM observes; code scores.
Preprocessing
Paragraph-aware chunking into 3-paragraph windows with 1-paragraph overlap. Eliminates V18.6 truncation bias.
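The windowing described above can be sketched as follows. This is a minimal illustration, not the V19 implementation: the function name `chunkParagraphs`, the `Chunk` shape, and the rule that paragraphs are separated by blank lines are all assumptions made for the example.

```typescript
// Hypothetical shape for one window; not taken from the FME codebase.
interface Chunk {
  startParagraph: number; // index of the first paragraph in this window
  paragraphs: string[];
}

function chunkParagraphs(text: string, windowSize = 3, overlap = 1): Chunk[] {
  if (overlap >= windowSize) {
    throw new Error("overlap must be smaller than windowSize");
  }
  // Assumption: paragraphs are separated by one or more blank lines.
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  // Each window starts (windowSize - overlap) paragraphs after the previous one.
  const step = windowSize - overlap;
  const chunks: Chunk[] = [];
  for (let start = 0; start < paragraphs.length; start += step) {
    chunks.push({
      startParagraph: start,
      paragraphs: paragraphs.slice(start, start + windowSize),
    });
    // Stop once a window has reached the final paragraph.
    if (start + windowSize >= paragraphs.length) break;
  }
  return chunks;
}
```

With the defaults, a five-paragraph document yields two windows (paragraphs 0 to 2 and 2 to 4), so every paragraph is covered and nothing is truncated.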
Deterministic · No LLM
Span Annotation (batched, parallel)
Each chunk is sent as one LLM call. Output per span: char offsets, technique, appeal, emotion, confidence, and rationale.
LLM · gpt-4o-mini
Claim Grounding (parallel with 1.6)
Factual claim spans are canonicalized and queried against Google Fact Check Tools and Wikidata. Stage 2 is never blocked.
External APIs · 30-day cache
Cross-Platform Corroboration (parallel with 1.5)
The article title triggers real-time signal retrieval from X, News, and Web via the Intelligence Brief edge function. Relevant signals are scored for entity overlap and recency, then blended into the Factual Grounding Index (FGI): FGI = 0.70 × Stage 1.5 score + 0.30 × cross-platform score, applied when at least 2 signals are found. The stage is fail-open and never blocks Stage 2.
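The 70/30 blend can be written out as a small function. This is a sketch under stated assumptions: the name `blendFgi` and its parameters are hypothetical, both scores are assumed to share one scale, and falling back to the Stage 1.5 score alone when fewer than two signals are found is an inference from the fail-open behavior rather than something the text states.

```typescript
// Hypothetical constants mirroring the weights and threshold quoted in the text.
const GROUNDING_WEIGHT = 0.7;
const CROSS_PLATFORM_WEIGHT = 0.3;
const MIN_SIGNALS = 2;

function blendFgi(
  groundingScore: number, // Stage 1.5 result
  crossPlatformScore: number | null, // Stage 1.6 result, null if retrieval failed
  signalCount: number // number of relevant signals found
): number {
  // Assumption: with too few signals (or a failed lookup), use Stage 1.5 alone.
  if (crossPlatformScore === null || signalCount < MIN_SIGNALS) {
    return groundingScore;
  }
  return GROUNDING_WEIGHT * groundingScore + CROSS_PLATFORM_WEIGHT * crossPlatformScore;
}
```

For example, a Stage 1.5 score of 80 and a cross-platform score of 50 with three signals blend to roughly 71, while the same inputs with only one signal stay at 80.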
Real-time · X · News · Web
Aggregation (deterministic)
Document scores rolled up from paragraph scores using a prevalence-weighted, paragraph-length-normalized formula. FGI incorporates Stage 1.6 cross-platform blend when coverage is sufficient.
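The exact roll-up formula is not given here, so the following is only one plausible reading of "prevalence-weighted, paragraph-length-normalized". It assumes each paragraph carries a score, a prevalence (taken here to mean the fraction of the paragraph covered by flagged spans), and a character length; the names `ParagraphScore` and `aggregateDocumentScore` are hypothetical.

```typescript
// Hypothetical per-paragraph record; field meanings are assumptions.
interface ParagraphScore {
  score: number; // paragraph-level score
  prevalence: number; // assumed: fraction of the paragraph covered by flagged spans, 0..1
  length: number; // paragraph length in characters
}

function aggregateDocumentScore(paragraphs: ParagraphScore[]): number {
  let totalLength = 0;
  for (const p of paragraphs) totalLength += p.length;
  if (totalLength === 0) return 0;

  let weightedSum = 0;
  let weightTotal = 0;
  for (const p of paragraphs) {
    // Weight = prevalence times the paragraph's share of the document length.
    const weight = p.prevalence * (p.length / totalLength);
    weightedSum += weight * p.score;
    weightTotal += weight;
  }
  // No flagged content anywhere: report 0 rather than dividing by zero.
  return weightTotal === 0 ? 0 : weightedSum / weightTotal;
}
```

Under this reading, a short paragraph cannot dominate the document score on its own, and paragraphs with no flagged spans contribute nothing. Because the function is pure arithmetic, the same paragraph scores always give the same document score.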
Deterministic · No LLM
Validation
Zod schema check, span offset sanity, and bench-score CI gate (merge blocked on >2pp macro-F1 regression).
Schema · CI gate
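Two of the validation checks, span offset sanity and the bench-score gate, can be illustrated in plain TypeScript. This sketch leaves out the Zod schema check. The `Span` fields are a subset of the per-span output listed for Span Annotation, and the function names, the end-exclusive offset convention, and the use of percentage-point inputs are assumptions.

```typescript
// Hypothetical subset of the per-span output; offsets assumed end-exclusive.
interface Span {
  start: number;
  end: number;
  technique: string;
}

// Returns a list of problems; an empty list means the offsets look sane.
function spanOffsetErrors(span: Span, textLength: number): string[] {
  const errors: string[] = [];
  if (Math.floor(span.start) !== span.start || Math.floor(span.end) !== span.end) {
    errors.push("offsets must be integers");
  }
  if (span.start < 0) errors.push("start is negative");
  if (span.end <= span.start) errors.push("end must be greater than start");
  if (span.end > textLength) errors.push("end exceeds document length");
  return errors;
}

// Macro-F1 values are given in percentage points (e.g. 80 for 80%).
// The gate fails only when the drop is strictly greater than maxDropPp.
function passesBenchGate(
  baselineMacroF1Pp: number,
  candidateMacroF1Pp: number,
  maxDropPp = 2
): boolean {
  return baselineMacroF1Pp - candidateMacroF1Pp <= maxDropPp;
}
```

A candidate at 78 against a baseline of 80 passes, because the drop is exactly 2pp and the rule blocks only regressions greater than 2pp. A candidate at 77.5 would block the merge.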