OpenAI-compatible routing across free-tier AI providers. Health-aware selection, streaming, embeddings, and built-in analytics.
Automatically routes to Workers AI, Groq, Gemini, OpenRouter, and Cerebras. Round-robin with health-aware fallback.
Tracks success rates, latency, and cooldowns per model. Unhealthy providers are automatically skipped.
Full SSE streaming for chat completions and responses API. Compatible with OpenAI SDK stream handling.
Generate embeddings via Workers AI, Gemini, or Voyage AI. Standard OpenAI embeddings format.
Request logging, per-provider stats, and a usage dashboard. Monitor everything from /v1/analytics.
IP-based rate limiting on public endpoints. Per-provider daily limits with configurable thresholds.