Argus · verification runtime for AI agents · live benchmark

The trust layer between agents and reality.

Every answer comes with four things: the value, a confidence score, the verbatim quotes from each source we consulted, and any disagreements between them. No other API does all four. The benchmark is updated daily against a fixed gauntlet of stable, volatile, and edge-case facts.
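As a rough illustration of the answer shape described above, here is a minimal Python sketch. The class and field names (Answer, SourceQuote, source_url, and so on) are hypothetical and chosen only for this example; they are not the Argus response schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SourceQuote:
    """One consulted source (hypothetical field names)."""
    source_url: str   # where the quote was found
    quote: str        # verbatim text from the source
    value: str        # value extracted from that quote


@dataclass
class Answer:
    """The four parts of an answer: value, confidence, quotes, disagreements."""
    value: Optional[str]          # None models an honest "unknown"
    confidence: float             # score between 0 and 1
    quotes: List[SourceQuote] = field(default_factory=list)

    def disagreements(self) -> List[SourceQuote]:
        # Assumption: a source "disagrees" when its extracted value
        # differs from the resolved value.
        return [q for q in self.quotes if q.value != self.value]


answer = Answer(
    value="2015",
    confidence=0.9,
    quotes=[
        SourceQuote("https://example.com/a", "Founded in 2015.", "2015"),
        SourceQuote("https://example.com/b", "Established 2016.", "2016"),
    ],
)
conflicting = answer.disagreements()
```

In this sketch a client can check `conflicting` before trusting `answer.value`, rather than seeing only a single merged result.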

Try the live demo → · API docs

Why this is different

Capability	ArgusFlow Facts	Firecrawl / Jina	Clearbit / Apollo
Returns the answer (not just HTML)	yes	no — you extract	yes
Confidence score per answer	yes (0–1)	no	no
Verbatim quote evidence per source	yes	no	no
Source disagreements surfaced	yes	n/a	hidden
Honest “unknown” on edge entities	yes	returns junk	guesses
Cost per verified fact	live data pending (live · from benchmark above)	$0.0015+ (then you extract)	$0.05–$0.20+ (static enrichment)

The benchmark hasn't run yet. The cron job will populate this page within 24 hours.

Methodology

Each (entity, attribute) pair is resolved with refresh=true (cache bypassed), so latency reflects cold resolution.
Verified = ≥2 corroborating sources with composite confidence ≥0.80.
Partial = single high-confidence source (≥0.60).
Unknown = no source returned a value; the system says “I don't know” rather than guessing. The gauntlet includes a deliberately fake entity (Glimmer Labs) to test honest-failure handling.
Cost includes all LLM extraction calls (Groq Llama 3.1 8B primary, Anthropic Haiku fallback) plus any source scrapes. No proxy or scraping infrastructure costs are counted, since most resolutions hit free public sources.
Competitor pricing is taken from public 2025–2026 documentation: Apollo from $0.05/credit, Clearbit Reveal $0.20/lookup, ZoomInfo $0.15+/record, Firecrawl $0.0015/scrape (then you extract yourself).
Results are stored in the facts_benchmark_runs and facts_benchmark_summary tables, which are publicly readable; query them directly via the Supabase REST API.
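The Verified / Partial / Unknown thresholds in the methodology can be sketched as a small Python function. This is an illustration under stated assumptions, not the benchmark's actual code: the function name and parameters are hypothetical, the way composite confidence is computed is not specified here and is taken as an input, and the handling of cases the methodology does not define is a guess.

```python
from typing import Optional


def classify(corroborating_sources: int, confidence: Optional[float]) -> str:
    """Map a resolution onto the methodology's three labels (hypothetical helper)."""
    # Unknown: no source returned a value.
    if corroborating_sources == 0 or confidence is None:
        return "unknown"
    # Verified: at least 2 corroborating sources, composite confidence >= 0.80.
    if corroborating_sources >= 2 and confidence >= 0.80:
        return "verified"
    # Partial: a single high-confidence source (>= 0.60).
    # Assumption: multi-source results that miss the 0.80 bar also land here.
    if confidence >= 0.60:
        return "partial"
    # Assumption: anything below 0.60 is treated as unknown rather than guessed.
    return "unknown"


labels = [
    classify(3, 0.91),   # several sources agree with high confidence
    classify(1, 0.72),   # one reasonably confident source
    classify(0, None),   # e.g. a deliberately fake entity
]
```

Taking the composite score as an argument keeps the sketch independent of however the scores are actually combined.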