Live AI API prices, status & the only public price history
Per-model $/M-token pricing, context windows and real-time operational status across 11 LLM providers — plus the time-series nobody else keeps, so you can see who's cutting prices in the model price war.
Find the cheapest model → · Agent-callable API & MCP
Tracking 44 models across 11 providers · 2 snapshots accrued · prices verified per-provider on the dates shown below.
Cheapest models right now (blended $ per 1M tokens, output weighted 3:1 over input)
| Model | Provider | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Blended ($/1M tokens) |
|---|---|---|---|---|---|
| Command R7B (12-2024) | Cohere | $0.04 | $0.15 | 128K | $0.12 |
| GPT-OSS 20B | Groq | $0.08 | $0.30 | 131K | $0.24 |
| DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | 1M | $0.25 |
| Llama 3.3 70B (via OpenRouter) | OpenRouter | $0.10 | $0.32 | 131K | $0.27 |
| Llama 4 Scout (17Bx16E) | Groq | $0.11 | $0.34 | 131K | $0.28 |
| GPT-4.1 nano | OpenAI | $0.10 | $0.40 | 1M | $0.33 |
| Gemini 2.5 Flash-Lite | Google Gemini | $0.10 | $0.40 | 1.05M | $0.33 |
| GPT-4o mini (legacy) | OpenAI | $0.15 | $0.60 | 128K | $0.49 |
→ Full cheapest-model comparison & task-tier picks
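The blended column can be reproduced with a simple weighted average. The sketch below assumes "output-weighted 3:1" means the output price counts three times and the input price once; that reading agrees with the table rows to within rounding, but the function name `blended_price` and its `output_weight` parameter are illustrative and not taken from the site.

```python
def blended_price(input_per_m: float, output_per_m: float,
                  output_weight: float = 3.0) -> float:
    """Weighted average of input and output prices ($ per 1M tokens).

    Assumption: the output price is weighted `output_weight` times
    and the input price once, i.e. (input + 3 * output) / 4 by default.
    """
    return (input_per_m + output_weight * output_per_m) / (1.0 + output_weight)


# Prices copied from the table above ($ per 1M tokens).
print(f"{blended_price(0.04, 0.15):.4f}")  # Command R7B (12-2024)
print(f"{blended_price(0.15, 0.60):.4f}")  # GPT-4o mini (legacy)
```

For Command R7B this gives (0.04 + 3 × 0.15) / 4 = 0.1225, which rounds to the $0.12 shown in the table.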
11 providers tracked · 44 models priced · 6h refresh cadence · x402 agent-payable API
For AI agents: route to the cheapest model programmatically
An agent can call cheapest_model(task, min_context) over MCP or HTTP and route to the lowest-cost operational model automatically. Calls are paid per request via x402 (USDC on Base, no signup) or with a self-serve key.
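To make the routing idea concrete, here is a local sketch of the kind of selection such a call might perform: keep operational models whose context window is large enough, then pick the lowest blended price. It does not call the real service. `ModelPrice`, `cheapest_model_local`, and the `operational` flag are hypothetical names, the `task` parameter is left out because the page does not say how tasks map to models, and the context sizes are approximations of the rounded values in the table.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ModelPrice:
    name: str
    provider: str
    input_per_m: float    # $ per 1M input tokens
    output_per_m: float   # $ per 1M output tokens
    context_tokens: int   # approximate context window
    operational: bool = True  # hypothetical status flag

    @property
    def blended(self) -> float:
        # Assumed 3:1 output-weighted average, matching the table heading.
        return (self.input_per_m + 3 * self.output_per_m) / 4


def cheapest_model_local(models: Iterable[ModelPrice],
                         min_context: int) -> Optional[ModelPrice]:
    """Return the cheapest operational model with enough context, or None."""
    candidates = [
        m for m in models
        if m.operational and m.context_tokens >= min_context
    ]
    return min(candidates, key=lambda m: m.blended, default=None)


# A few rows from the table; context sizes are rounded approximations.
MODELS = [
    ModelPrice("Command R7B (12-2024)", "Cohere", 0.04, 0.15, 128_000),
    ModelPrice("GPT-OSS 20B", "Groq", 0.08, 0.30, 131_000),
    ModelPrice("DeepSeek V4 Flash", "DeepSeek", 0.14, 0.28, 1_000_000),
    ModelPrice("GPT-4.1 nano", "OpenAI", 0.10, 0.40, 1_000_000),
]

choice = cheapest_model_local(MODELS, min_context=500_000)
print(choice.name if choice else "no model satisfies the constraint")
```

With a 500,000-token minimum, only the two 1M-context rows qualify, and DeepSeek V4 Flash has the lower blended price. Returning `None` when nothing qualifies lets the caller decide whether to relax the constraint or fail.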