
Getting Started

The AI Gateway is a drop-in OpenAI-compatible proxy that routes your requests across multiple free-tier AI providers. Point your existing OpenAI SDK or curl commands at the gateway and get resilient, load-balanced inference without managing individual provider keys.
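Because the gateway speaks the OpenAI wire format, any HTTP client works. Below is a minimal sketch using only the Python standard library; the gateway URL is the placeholder from the curl example on this page, and GATEWAY_API_KEY is assumed to be set in your environment:

```python
import json
import os
import urllib.request

# Placeholders: substitute your deployed gateway URL and key.
GATEWAY_URL = "https://your-gateway.workers.dev"
API_KEY = os.environ.get("GATEWAY_API_KEY", "<GATEWAY_API_KEY>")

def build_chat_request(messages, model="auto"):
    """Build an OpenAI-style chat completion request for the gateway."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{GATEWAY_URL}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request([{"role": "user", "content": "Hello! What can you do?"}])
# To actually send it (requires a deployed gateway):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request shape works from the official OpenAI SDKs by pointing their base URL at the gateway.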

Supported providers:

  • Cloudflare Workers AI — fast inference at the edge
  • Groq — ultra-low latency with LPU hardware
  • Google Gemini — large context windows, multimodal
  • OpenRouter — aggregated access to many models
  • Cerebras — high-throughput open-weight models
  • Voyage AI — purpose-built embedding models

Key features:

  • Health-aware routing — the gateway tracks provider health and automatically skips degraded or rate-limited backends
  • Auto round-robin — when model is set to "auto", requests are distributed across healthy providers
  • Streaming — full server-sent events (SSE) support for all chat endpoints
  • Embeddings — unified /v1/embeddings endpoint backed by Workers AI, Gemini, and Voyage AI
  • Analytics — per-request logging with a dashboard at /usage
  • OpenAI-compatible — works with any client that supports the OpenAI API format
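Streaming responses arrive as OpenAI-style server-sent events: each event is a `data:` line carrying a JSON chunk whose `choices[0].delta` holds the next piece of content, and the stream ends with `data: [DONE]`. A sketch of a parser for that framing (the chunk shape follows the standard OpenAI streaming format; the exact fields your gateway emits may vary):

```python
import json

def iter_stream_text(lines):
    """Yield content deltas from OpenAI-style SSE lines.

    `lines` is any iterable of decoded text lines, e.g. the body of a
    streaming /v1/chat/completions response read line by line.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # end-of-stream sentinel
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content") is not None:
            yield delta["content"]

# Canned example stream (the first chunk carries only the role, no text):
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # → Hello
```

To stream from the gateway, add `"stream": true` to the request body and feed the response lines to a parser like this one.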

Send your first request in under a minute.

curl https://your-gateway.workers.dev/v1/chat/completions \
  -H "Authorization: Bearer <GATEWAY_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [
      { "role": "user", "content": "Hello! What can you do?" }
    ]
  }'
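The /v1/embeddings endpoint follows the same request pattern: an OpenAI-style body where `input` is a string or a list of strings, and a response with one vector per input. A sketch of the request body and of pulling vectors out of a canned response in the OpenAI shape (using `"auto"` as the model here is an assumption carried over from the chat example; substitute a concrete embedding model if your gateway requires one):

```python
import json

# Request body for POST /v1/embeddings.
body = json.dumps({
    "model": "auto",  # assumption: adjust if your gateway needs an explicit model
    "input": ["first document", "second document"],
})

# The response follows the OpenAI embeddings shape: one vector per input.
# Canned example of extracting the vectors in input order:
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.01, -0.02]},
        {"object": "embedding", "index": 1, "embedding": [0.03, 0.04]},
    ],
    "model": "auto",
}
vectors = [
    item["embedding"]
    for item in sorted(sample_response["data"], key=lambda d: d["index"])
]
print(len(vectors))  # → 2
```

Sorting by `index` guards against providers returning embeddings out of order.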