V19 vs V20: What changed
| Dimension | V19.1 | V20 |
|---|---|---|
| LLM calls / article | 2–3 (multi-stage) | 1 unified call |
| Primary model | gpt-4o-mini + gpt-4.1-mini | gpt-4.1-nano |
| Pipeline stages | Stage 0 → 1 → 1.5/1.6 → 2 → 3 | Ingest → LLM → Calibration → Output |
| Ideology output | 5-band bias spectrum (L→R) | 10-framework ideology scores |
| Calibration | None (prompt-only) | 9 deterministic override rules |
| Prod accuracy | 100% band accuracy (14/14) | 93.3% ideology accuracy (14/15 ≥250w) |
| Cost / article | ~$0.003 | ~$0.0056 |
| Fresh scan latency | ~10s | ~23s (OpenRouter variance) |
| Cached latency | < 1s | < 1s |
| Max tokens | — | 5000 (schema-safe ceiling) |
The four-step V20 pipeline
V20 collapses V19's multi-stage architecture into a clean four-step flow. The LLM observes once; deterministic code calibrates and validates.
Article Ingestion & Cache Lookup
Each incoming request carries a URL and the extracted article text. Text under 250 words is rejected as non-article content with HTTP 422. The cache key is a hash of the URL. On a cache hit (80%+ of prod traffic), the full prior analysis is returned in < 1s with zero new LLM calls. On a miss, the request proceeds to Step 02.
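The gate and lookup described above can be sketched roughly as follows. This is a minimal illustration, not the production code: the function names, the in-memory `Map`, and the FNV-1a hash are assumptions (the text only says the key is a "URL hash"), and how words are counted is also assumed.

```typescript
// Sketch of Step 01: word-count gate plus cache lookup.
// All names here are hypothetical; only the 250-word minimum and HTTP 422 come from the text.

type Analysis = Record<string, unknown>;

type IngestResult =
  | { status: 422; error: string }
  | { status: 200; cached: true; analysis: Analysis }
  | { status: 200; cached: false; cacheKey: string };

const MIN_WORDS = 250;

// Whitespace-delimited word count (assumed counting method).
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// FNV-1a 32-bit hash, hex encoded. Assumption: the real hash function is not specified.
function hashUrl(url: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < url.length; i++) {
    h ^= url.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

function ingest(url: string, text: string, cache: Map<string, Analysis>): IngestResult {
  // Reject short content before touching the cache or the LLM.
  if (countWords(text) < MIN_WORDS) {
    return { status: 422, error: "non-article: fewer than 250 words" };
  }
  const cacheKey = hashUrl(url);
  const hit = cache.get(cacheKey);
  if (hit !== undefined) {
    return { status: 200, cached: true, analysis: hit };
  }
  // Cache miss: the caller continues to the LLM step and stores the result under cacheKey.
  return { status: 200, cached: false, cacheKey };
}
```

Keying on the URL alone means an article edited after its first scan would still return the cached analysis. Whether the production cache handles that case is not stated here.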
Deterministic · No LLM · 80%+ cache hit
Unified LLM Call (gpt-4.1-nano)
A single OpenRouter call uses the V20 system prompt (fetched from the Langfuse registry and cached in-process for 5 minutes). Parameters: seed=42, top_p=0.1, max_tokens=5000. The model returns one complete JSON object containing ideology_scores (10 frameworks), winner, classification_confidence, manipulation_risk_score, spans[], paragraphs[], emotional_resonance, credibility_signals, and v20_metadata. If the primary model fails, the call falls back to gpt-4.1-mini.
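A rough sketch of the call-with-fallback logic is shown below. The network client is abstracted behind a hypothetical injected `callModel` function, so nothing here reflects the real OpenRouter SDK or request shape. The model id strings are taken from the text as written and may differ from the ids the provider expects. Treating an unparseable response as a "primary failure" is also an assumption.

```typescript
// Sketch of Step 02: try the primary model once, then the fallback once.
// `callModel`, `LlmRequest`, and `analyzeOnce` are hypothetical names.

interface LlmRequest {
  model: string;
  systemPrompt: string;
  articleText: string;
  seed: number;
  top_p: number;
  max_tokens: number;
}

// Returns the raw JSON text produced by the model.
type CallModel = (req: LlmRequest) => Promise<string>;

const PRIMARY_MODEL = "gpt-4.1-nano";
const FALLBACK_MODEL = "gpt-4.1-mini";

// Sampling parameters as documented: seed=42, top_p=0.1, max_tokens=5000.
function buildRequest(model: string, systemPrompt: string, articleText: string): LlmRequest {
  return { model, systemPrompt, articleText, seed: 42, top_p: 0.1, max_tokens: 5000 };
}

async function analyzeOnce(
  systemPrompt: string,
  articleText: string,
  callModel: CallModel
): Promise<{ modelUsed: string; result: Record<string, unknown> }> {
  for (const model of [PRIMARY_MODEL, FALLBACK_MODEL]) {
    try {
      const raw = await callModel(buildRequest(model, systemPrompt, articleText));
      const parsed: unknown = JSON.parse(raw);
      if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
        return { modelUsed: model, result: parsed as Record<string, unknown> };
      }
      throw new Error("model did not return a JSON object");
    } catch (err) {
      // Swallow the primary failure and try the fallback; rethrow if the fallback also fails.
      if (model === FALLBACK_MODEL) throw err;
    }
  }
  throw new Error("unreachable");
}
```

Returning `modelUsed` alongside the result lets a later step record which model actually answered.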
LLM · gpt-4.1-nano · seed=42 · top_p=0.1
Calibration Layer (9 deterministic rules)
A post-LLM rule engine corrects systematic gpt-4.1-nano misclassifications that prompt instructions alone cannot reliably fix. Rules fire in order (TG = Technocratic Governance): R1 biometric/surveillance privacy → Libertarianism; R2 Decentralized Governance win without advocacy → TG for wire/institutional content, otherwise the runner-up; R3 voting-rights legal protection → Democratic Socialism; R4 academic/scientific journal content → TG; R5 anti-SLAPP/press freedom → Libertarianism; R6 disability/welfare policy → Democratic Socialism; R7 Wikinews + regulatory action → TG; R8 OPEC/multilateral energy → TG; R9 sports/athletic achievement → Decentralized Governance.
Deterministic · 9 rules · No LLM
Schema Validation & Output
Zod schema validation against V20Analysis type (span_count optional, appeal permissive — schema fixes from Phase 5). Metadata injection: model_used, timestamp, pipeline_version=V20. Result written to Supabase (scanId generated), Langfuse trace flushed async. Response returned to client: extension panel, API caller, or PDF builder (11-page Pro report).
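The validate-then-stamp step might look roughly like the sketch below. To keep the example dependency-free it uses hand-written checks as a stand-in for the Zod schema, and it covers only three fields. The `V20Output` shape, the function names, and the choice to store the injected values under `v20_metadata` are assumptions, not the real V20Analysis type.

```typescript
// Sketch of Step 04: structural validation followed by metadata injection.
// Hand-rolled checks stand in for the Zod schema; all names are hypothetical.

interface V20Output {
  ideology_scores: Record<string, number>;
  winner: string;
  span_count?: number; // optional, as noted in the text
  v20_metadata: { model_used: string; timestamp: string; pipeline_version: "V20" };
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function validateAndStamp(raw: unknown, modelUsed: string, now: Date): V20Output {
  if (!isRecord(raw)) throw new Error("analysis must be an object");

  const scores = raw["ideology_scores"];
  if (!isRecord(scores)) throw new Error("ideology_scores must be an object");
  const checked: Record<string, number> = {};
  for (const [name, value] of Object.entries(scores)) {
    if (typeof value !== "number" || Number.isNaN(value)) {
      throw new Error(`score for ${name} must be a number`);
    }
    checked[name] = value;
  }

  const winner = raw["winner"];
  if (typeof winner !== "string") throw new Error("winner must be a string");

  const spanCount = raw["span_count"];
  if (spanCount !== undefined && typeof spanCount !== "number") {
    throw new Error("span_count must be a number when present");
  }

  const out: V20Output = {
    ideology_scores: checked,
    winner,
    v20_metadata: {
      model_used: modelUsed,
      timestamp: now.toISOString(),
      pipeline_version: "V20",
    },
  };
  if (spanCount !== undefined) out.span_count = spanCount;
  return out;
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the step deterministic and easy to test.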
Zod schema · Supabase · Langfuse trace
10-Framework ideology scoring
V20 replaces V19's 5-band left/right spectrum with a granular 10-framework ideology scoring system. Each article receives a confidence-weighted score across all 10 frameworks simultaneously. The highest-scoring framework becomes the winner.
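As a small illustration of the winner rule, the sketch below picks the highest-scoring framework and the runner-up from a score map. The function name and tie-breaking behaviour (first-listed framework wins a tie) are assumptions. Whether production derives the winner in code or trusts the `winner` field returned by the model is not stated here.

```typescript
// Sketch: select the winner and runner-up from framework scores.
// `rankFrameworks` is a hypothetical helper, not part of the documented pipeline.

function rankFrameworks(
  scores: Record<string, number>
): { winner: string; runnerUp: string | null } {
  // Sort descending by score; Array.prototype.sort is stable, so ties keep input order.
  const ranked = Object.entries(scores).sort((a, b) => b[1] - a[1]);
  const [first, second] = ranked;
  if (first === undefined) throw new Error("no framework scores provided");
  return { winner: first[0], runnerUp: second === undefined ? null : second[0] };
}
```

For example, scores of 0.2 for Populism, 0.6 for Technocratic Governance, and 0.1 for Libertarianism give Technocratic Governance as the winner and Populism as the runner-up.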
9 deterministic override rules
gpt-4.1-nano has systematic misclassification patterns that cannot be corrected through prompt instructions alone: because the model is non-deterministic, the same prompt can produce different wrong answers across runs. These 9 rules provide deterministic post-processing corrections grounded in domain knowledge.
| Rule | Condition | Override → | Rationale |
|---|---|---|---|
| R1 | Winner ≠ Libertarianism + biometric/surveillance keywords + privacy-threat keywords | Libertarianism | nano associates surveillance with Authoritarian even when article critiques it |
| R2a | Winner = Decentralized Governance + no decentralisation advocacy + wire/institutional signals | Technocratic Governance | Wire news institutional visits fire DG on "local presence" framing |
| R2b | Winner = Decentralized Governance + no decentralisation advocacy (other) | Runner-up ideology | Demotes spurious DG wins to next highest scorer |
| R3 | Winner = Populism + voting/civil rights keywords + legal mechanism keywords | Democratic Socialism | nano confuses partisan legal defence of group rights with populism |
| R4 | Winner = Authoritarian Statism + academic/scientific journal keywords | Technocratic Governance | nano fires Authoritarian on peer-review governance language |
| R5 | Winner = Authoritarian Statism + anti-SLAPP/press freedom keywords | Libertarianism | SLAPP critique is libertarian; nano matches topic word not critique angle |
| R6 | Winner = Populism + disability/welfare keywords | Democratic Socialism | Welfare policy is DemSoc; extends R3 beyond legal framing |
| R7 | Winner = Neoliberal Capitalism + Wikinews + government regulatory keywords | Technocratic Governance | Wire-service regulatory stories are TG not market-framing |
| R8 | Winner = Nationalist Conservatism + OPEC keywords + multilateral keywords | Technocratic Governance | Geopolitical energy institution analysis is TG; nano treats Gulf context as nationalist |
| R9 | Winner = Populism + sports/athletic keywords + no political framing keywords | Decentralized Governance | Sports stories have no political ideology; nano fires Populism for underdog narratives |
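
An ordered rule engine of the kind the table describes could be sketched as follows. Only R1, R2a, R2b, and R9 are shown. The keyword lists are invented placeholders, the substring matching is a simplification, and stopping at the first matching rule is an assumption (the text says only that rules fire in order).

```typescript
// Sketch of the calibration layer: ordered rules, first match wins (assumed).
// Keyword lists and all identifiers are hypothetical, not the production values.

interface Draft {
  text: string;
  scores: Record<string, number>;
  winner: string;
}

interface Rule {
  id: string;
  applies: (d: Draft) => boolean;
  override: (d: Draft) => string;
}

const hasAny = (text: string, words: string[]): boolean => {
  const lower = text.toLowerCase();
  return words.some((w) => lower.includes(w));
};

// Highest-scoring framework other than the current winner.
const runnerUp = (d: Draft): string => {
  const others = Object.entries(d.scores)
    .filter(([name]) => name !== d.winner)
    .sort((a, b) => b[1] - a[1]);
  const next = others[0];
  return next === undefined ? d.winner : next[0];
};

const DG = "Decentralized Governance";
const SURVEILLANCE = ["biometric", "facial recognition", "surveillance"];
const PRIVACY_THREAT = ["privacy", "civil liberties", "data protection"];
const DG_ADVOCACY = ["decentrali", "devolution", "local control"];
const WIRE_SIGNALS = ["reuters", "associated press", "official visit"];
const SPORTS = ["championship", "tournament", "athlete"];
const POLITICAL = ["election", "government", "policy"];

const RULES: Rule[] = [
  {
    id: "R1",
    applies: (d) =>
      d.winner !== "Libertarianism" &&
      hasAny(d.text, SURVEILLANCE) &&
      hasAny(d.text, PRIVACY_THREAT),
    override: () => "Libertarianism",
  },
  {
    id: "R2a",
    applies: (d) =>
      d.winner === DG && !hasAny(d.text, DG_ADVOCACY) && hasAny(d.text, WIRE_SIGNALS),
    override: () => "Technocratic Governance",
  },
  {
    id: "R2b",
    applies: (d) => d.winner === DG && !hasAny(d.text, DG_ADVOCACY),
    override: runnerUp,
  },
  {
    id: "R9",
    applies: (d) =>
      d.winner === "Populism" && hasAny(d.text, SPORTS) && !hasAny(d.text, POLITICAL),
    override: () => DG,
  },
];

function calibrate(d: Draft): { winner: string; ruleFired: string | null } {
  for (const rule of RULES) {
    if (rule.applies(d)) return { winner: rule.override(d), ruleFired: rule.id };
  }
  return { winner: d.winner, ruleFired: null };
}
```

Placing R2a before R2b matters in this sketch: R2b's condition is a superset of R2a's, so the more specific wire/institutional case has to be checked first.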
V20 accuracy results
V20 was benchmarked on a 21-article corpus spanning 10 ideology categories. Articles under 250 words are rejected by the production junk filter (the word-count gate), so prod-valid accuracy counts only the articles that would pass that gate in production.
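The prod-valid metric can be expressed as a short calculation: filter the benchmark to articles that pass the word-count gate, then divide correct predictions by the eligible count. The sketch below assumes a hypothetical `BenchmarkRow` shape. It does not reproduce the actual benchmark data.

```typescript
// Sketch: prod-valid accuracy = correct / eligible, where eligible = articles with >= 250 words.
// `BenchmarkRow` and `prodValidAccuracy` are hypothetical names.

interface BenchmarkRow {
  wordCount: number;
  expected: string;  // labelled ideology
  predicted: string; // pipeline output after calibration
}

function prodValidAccuracy(
  rows: BenchmarkRow[],
  minWords: number = 250
): { correct: number; eligible: number; accuracy: number } {
  // Drop rows the production junk filter would reject.
  const eligibleRows = rows.filter((r) => r.wordCount >= minWords);
  const correct = eligibleRows.filter((r) => r.predicted === r.expected).length;
  const eligible = eligibleRows.length;
  return { correct, eligible, accuracy: eligible === 0 ? 0 : correct / eligible };
}
```

With the figures reported in the comparison table (14 correct out of 15 eligible articles), the ratio is 14/15 ≈ 0.933, which matches the stated 93.3%.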