Epistemic News — Closer to the Truth

Source selection

We pull from 65 curated RSS feeds spanning wire services (AP, Reuters, AFP), US mainstream outlets across the political spectrum, international sources from 15+ countries, and independent/nonprofit journalism. We supplement these with 24 Google News topic feeds that surface articles from hundreds of additional outlets.

Source selection criteria: editorial standards, factual reporting track record, and spectrum coverage. We intentionally include outlets we disagree with — a truth engine that only reads sources it likes isn't seeking truth.

How bias ratings work

Each source is rated using Media Bias/Fact Check (MBFC) classifications. MBFC is one of the most comprehensive independent media bias databases, rating 9,000+ outlets on both bias (far-left to far-right) and factual reporting (very high to very low).

We maintain a local database of 100+ source ratings for fast lookup. Articles from Google News feeds are automatically tagged by matching the source name against our MBFC database. Sources not found default to "center" — this is a known limitation.
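The lookup-with-default behavior described above can be sketched roughly as follows. This is a minimal illustration, not the production code: the table entries are placeholder outlets, and the names RATINGS, normalize, and lookup_rating are assumptions made for the example. The "rated" flag is also an assumption, added so that defaulted sources can be told apart from sources MBFC actually rates as center.

```python
# Hypothetical excerpt of a local ratings table. The outlets and values
# below are placeholders, not real MBFC ratings.
RATINGS = {
    "example wire": {"bias": "center", "factual": "very high"},
    "example daily": {"bias": "left-center", "factual": "high"},
}


def normalize(name):
    """Lowercase and collapse whitespace so minor name variations still match."""
    return " ".join(name.lower().split())


def lookup_rating(source_name, ratings=RATINGS):
    """Return the stored rating, or the "center" default for unknown sources."""
    hit = ratings.get(normalize(source_name))
    if hit is None:
        # Known limitation: an unrated source is labeled "center".
        return {"bias": "center", "factual": None, "rated": False}
    return {**hit, "rated": True}
```

Exact matching on a normalized name is the simplest possible strategy. Anything fuzzier (aliases, domain matching) would be a separate design decision that this page does not describe.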

Important: Bias ratings describe the source, not the article. A left-center outlet can publish a perfectly neutral article. We label the source so you can judge framing — not to dismiss the content.

How clustering works

We generate vector embeddings for each article title using OpenAI's text-embedding-3-small model, then group articles with cosine similarity above 0.55 into clusters.
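As a rough sketch of the grouping step, the code below takes embeddings that have already been computed (plain lists of floats) and merges any two articles whose cosine similarity exceeds the threshold. The single-link merging via union-find is an assumption. The text above fixes only the 0.55 threshold, not the grouping algorithm, and the function names are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(vectors, threshold=0.55):
    """Group article indices; any pair above the threshold ends up in one cluster."""
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) > threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

With toy two-dimensional vectors, cluster([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]) puts the first two articles together and leaves the third on its own. Single-link merging can chain loosely related stories into one cluster, which is one reason a real pipeline might choose a stricter rule.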

If embeddings fail, we fall back to TF-IDF (term frequency-inverse document frequency) matching on titles with a 0.35 similarity threshold. This is less accurate but keeps the pipeline running when the embedding service is unavailable.
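A title-only TF-IDF fallback could look something like the sketch below, written with the standard library only. The tokenizer, the smoothed IDF formula, and the function names are assumptions made for illustration. Only the 0.35 threshold comes from the description above.

```python
import math
import re
from collections import Counter


def tokenize(title):
    """Split a title into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", title.lower())


def tfidf_vectors(titles):
    """Build one sparse {term: weight} vector per title."""
    docs = [Counter(tokenize(t)) for t in titles]
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in doc)
    # Smoothed IDF (an assumed variant): rarer terms get higher weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]


def sparse_cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similar_pairs(titles, threshold=0.35):
    """Return index pairs of titles whose TF-IDF similarity exceeds the threshold."""
    vecs = tfidf_vectors(titles)
    return [
        (i, j)
        for i in range(len(titles))
        for j in range(i + 1, len(titles))
        if sparse_cosine(vecs[i], vecs[j]) > threshold
    ]
```

Because TF-IDF matches on shared words rather than meaning, two headlines about the same event with different vocabulary will not pair up. That is the accuracy cost of the fallback.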

How synthesis works

For the top multi-source stories, we run a 4-agent adversarial debate:

Progressive analyst

Emphasizes humanitarian impact, institutional accountability, and systemic factors.

Conservative analyst

Emphasizes national security, fiscal responsibility, and traditional frameworks.

Libertarian analyst

Emphasizes individual liberty, government overreach, and market dynamics.

Devil's Advocate

Challenges all three — finds groupthink, missing angles, and logical gaps in every perspective.

A synthesis agent then produces the final article: consensus facts first, disputed claims clearly marked, blindspots noted, every factual assertion cited to its source.
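One way to picture the debate flow is the sketch below. It is illustrative only: the ask callable stands in for whatever LLM client the pipeline uses, the prompt wording is paraphrased from the role descriptions above, and the ordering (three analysts first, then the Devil's Advocate, then synthesis) is an assumption.

```python
from typing import Callable, Dict

# Role briefs paraphrased from the descriptions above.
ANALYSTS = {
    "Progressive analyst": "Emphasize humanitarian impact, institutional "
    "accountability, and systemic factors.",
    "Conservative analyst": "Emphasize national security, fiscal "
    "responsibility, and traditional frameworks.",
    "Libertarian analyst": "Emphasize individual liberty, government "
    "overreach, and market dynamics.",
}
DEVILS_ADVOCATE = (
    "Challenge all three analyses: find groupthink, missing angles, "
    "and logical gaps."
)
SYNTHESIS = (
    "Write the final article: consensus facts first, disputed claims clearly "
    "marked, blindspots noted, every factual assertion cited to its source."
)


def run_debate(articles_text: str, ask: Callable[[str, str], str]) -> Dict:
    """Run the debate. ask(brief, content) is a hypothetical LLM call returning text."""
    # Step 1: each analyst reads the same source articles independently.
    analyses = {name: ask(brief, articles_text) for name, brief in ANALYSTS.items()}
    joined = "\n\n".join(f"[{name}]\n{text}" for name, text in analyses.items())
    # Step 2: the Devil's Advocate sees all three analyses together.
    critique = ask(DEVILS_ADVOCATE, joined)
    # Step 3: the synthesis agent gets the sources, the analyses, and the critique.
    material = f"{articles_text}\n\n{joined}\n\n[Devil's Advocate]\n{critique}"
    return {
        "analyses": analyses,
        "critique": critique,
        "article": ask(SYNTHESIS, material),
    }
```

Passing the model call in as a parameter keeps the flow testable with a stub and independent of any particular provider.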

How confidence scores work

Verified (80+): 3+ independent sources agree, no contradictions.

Supported (50–79): 2 sources agree, minor framing differences.

Disputed (below 50): Sources actively contradict each other on this claim.

Unverified (N/A): Single source only, no corroboration available.
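The score bands above map to labels roughly as in the sketch below. It assumes the score is a number from 0 to 100 and that a single-source claim is never scored. How the score itself is computed is not described here, so the function only covers the labeling step, and its name and signature are illustrative.

```python
from typing import Optional


def confidence_label(score: Optional[float], n_sources: int) -> str:
    """Map a 0-100 confidence score and a source count to a label."""
    # A single source cannot be corroborated, so no score applies.
    if n_sources <= 1 or score is None:
        return "Unverified"
    if score >= 80:
        return "Verified"
    if score >= 50:
        return "Supported"
    return "Disputed"
```

For example, confidence_label(85, 4) gives "Verified", confidence_label(62, 2) gives "Supported", and confidence_label(None, 1) gives "Unverified".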

Limitations and known biases

We believe transparency about limitations is more honest than pretending they don't exist.

△RSS-only ingestion means we miss stories that only appear on social media or behind paywalls.
△Google News topic feeds return disproportionately center/left-center articles, inflating those categories.
△Sources not in our MBFC database default to "center" — unknown outlets get the benefit of the doubt.
△LLM synthesis can introduce subtle framing biases even with adversarial agents. The citations let you check.
△We synthesize 2-3 stories per run, not all 200+ clusters. Most stories pass through unsynthesized.
△English-language sources only. Global events are filtered through anglophone media.
△The pipeline runs on AI models (xAI Grok, OpenAI embeddings) that have their own training biases.