Four commitments that shape every scan
Forensic, not editorial
We score what is structurally present — framing, evidence quality, omissions, emotional load — not whether we agree with the conclusion.
Multi-axis, not left/right
A single ideology label flattens complex argumentation. FME decomposes text into independent rhetorical dimensions that can be reasoned about separately.
Span-anchored evidence
Every score traces back to the exact words that produced it. "Why this 72?" is always answerable with a quoted passage and a scholarly technique name.
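As an illustration, span-anchored evidence can be represented as character offsets into the source text, so any score can be traced back to a quoted passage. The record below is a hypothetical sketch, not the actual FME schema:

```python
from dataclasses import dataclass

# Hypothetical span record; field names are illustrative, not the FME schema.
@dataclass
class EvidenceSpan:
    start: int          # character offset into the source document (inclusive)
    end: int            # character offset (exclusive)
    technique: str      # scholarly technique name, e.g. "loaded_language"
    confidence: float   # model confidence for this annotation

def quote(doc: str, span: EvidenceSpan) -> str:
    """Recover the exact words that produced a score."""
    return doc[span.start:span.end]

doc = "The so-called experts were dead wrong, as any honest person can see."
span = EvidenceSpan(start=4, end=21, technique="loaded_language", confidence=0.72)
print(quote(doc, span))  # -> so-called experts
```

Because spans carry offsets rather than paraphrases, "Why this 72?" reduces to a dictionary lookup plus a string slice.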
Reproducible & auditable
Every scan returns a prompt_hash and fme_version. Same inputs produce equivalent outputs. Bench F1 is published per release.
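A minimal sketch of how such a reproducibility hash could be derived. The field set and canonicalization below are assumptions for illustration; FME's actual hash inputs are not documented in this section:

```python
import hashlib
import json

# Hypothetical reconstruction: hash a canonical JSON encoding of everything
# that determines a scan, so identical inputs always yield the same digest.
def prompt_hash(prompt_template: str, fme_version: str, params: dict) -> str:
    canonical = json.dumps(
        {"template": prompt_template, "version": fme_version, "params": params},
        sort_keys=True,            # key order must not affect the hash
        separators=(",", ":"),     # no whitespace variation
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

h1 = prompt_hash("annotate spans", "19.0.0", {"temperature": 0})
h2 = prompt_hash("annotate spans", "19.0.0", {"temperature": 0})
assert h1 == h2  # same inputs -> same hash, so scans are auditable
```

The key design choice is canonicalization: sorting keys and fixing separators before hashing is what makes the digest stable across runs and machines.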
What FME V19 measures
V19 ships seven capabilities in P1, with argument mining and cognitive-bias detection schema-reserved for P2. The table below shows the full traceability from scholarly literature to current status.
| # | Capability | Scholarly Source | Status |
|---|---|---|---|
| 1 | Span-level propaganda detection | Da San Martino et al., SemEval-2020 Task 11 | P1 ✓ |
| 2 | Aristotelian appeals | Aristotle's Rhetoric; LEPAn | P1 ✓ |
| 3 | Full fallacy enumeration | SemEval persuasion + classical logic | P1 ✓ |
| 4 | Emotion model + arc | Plutchik-8; VAD model | P1 ✓ |
| 5 | Full-document analysis | Eliminates V18.6 truncation bias | P1 ✓ |
| 6 | External claim grounding | Google Fact Check Tools, Wikidata | P1 ✓ |
| 7 | Ground-truth validation pipeline | SemEval-2020 public set | P1 ✓ |
| 8 | Argument mining | Stab & Gurevych (2017) | P2 reserved |
| 9 | Cognitive bias / nudge detection | Kahneman, Thaler | P2 reserved |
| 10 | Cross-source framing | Media Frames Corpus | P3 — CSCE |
| 11 | Strategic silence | Agenda-setting theory | P3 — corpus |
| 12 | Discourse structure | Rhetorical Structure Theory | P3 — parser |
The four-stage forensic pipeline
V19 replaces the V18.6 monolithic single-pass LLM prompt with a deterministic four-stage architecture. The LLM observes; code scores.
Stage 1: Preprocessing
Paragraph-aware chunking into 3-paragraph windows with 1-paragraph overlap. Eliminates V18.6 truncation bias.
Deterministic · No LLM

Stage 2: Span Annotation (batched, parallel)
Each chunk is sent as a single LLM call. Output per span: character offsets, technique, appeal, emotion, confidence, and rationale.
LLM · mimo-v2-flash

Stage 3: Claim Grounding (parallel)
Factual claim spans are canonicalized and queried against Google Fact Check Tools and Wikidata. Stage 2 is never blocked.
External APIs · 30-day cache

Stage 4: Aggregation (deterministic)
Document scores are rolled up from paragraph scores using a prevalence-weighted, paragraph-length-normalized formula.
Deterministic · No LLM

Validation
Zod schema check, span-offset sanity checks, and a bench-score CI gate (merge is blocked on a >2pp macro-F1 regression).
Schema · CI gate
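The two deterministic stages can be sketched as follows. The window and overlap sizes come from the text above; the roll-up, however, is a simplified assumption (plain paragraph-length weighting), not FME's published prevalence-weighted formula:

```python
# Sketch of the deterministic stages (1 and 4). Window/overlap sizes match the
# stage descriptions; the aggregation weighting is an illustrative assumption.
def chunk(paragraphs: list[str], window: int = 3, overlap: int = 1) -> list[list[str]]:
    """Stage 1: 3-paragraph windows with 1-paragraph overlap."""
    step = window - overlap
    return [
        paragraphs[i:i + window]
        for i in range(0, max(len(paragraphs) - overlap, 1), step)
    ]

def aggregate(paragraphs: list[str], para_scores: list[float]) -> float:
    """Stage 4 (simplified): length-normalized roll-up of paragraph scores."""
    total = sum(len(p) for p in paragraphs)
    return sum(score * len(p) / total for p, score in zip(paragraphs, para_scores))

paras = ["p1", "p2", "p3", "p4", "p5"]
print(chunk(paras))  # -> [['p1', 'p2', 'p3'], ['p3', 'p4', 'p5']]
```

Because both stages are pure functions of their inputs, rerunning a scan reproduces the same chunks and the same document score, which is what makes the pipeline auditable end to end.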