Health
Overview
The health endpoint exposes the live health state tracked by the gateway for each configured model. It is public: no authentication is required.
Endpoint: GET /health
Authentication: None
Request
```
GET https://your-gateway.workers.dev/health
```

Response
```json
{
  "status": "ok",
  "models": {
    "gpt-4o": {
      "provider": "openai",
      "healthy": true,
      "success_rate": 0.994,
      "avg_latency_ms": 820,
      "in_cooldown": false,
      "cooldown_until": null,
      "daily_requests": 3100,
      "daily_limit": 10000
    },
    "claude-3-5-sonnet-20241022": {
      "provider": "anthropic",
      "healthy": true,
      "success_rate": 0.964,
      "avg_latency_ms": 1140,
      "in_cooldown": false,
      "cooldown_until": null,
      "daily_requests": 2800,
      "daily_limit": 5000
    },
    "llama-3.3-70b-versatile": {
      "provider": "groq",
      "healthy": false,
      "success_rate": 0.621,
      "avg_latency_ms": 490,
      "in_cooldown": true,
      "cooldown_until": "2024-11-15T11:00:00.000Z",
      "daily_requests": 1600,
      "daily_limit": 2000
    }
  }
}
```

Response Fields
| Field | Type | Description |
|---|---|---|
| status | string | Top-level gateway status ("ok" or "degraded") |
| models | object | Health state keyed by model identifier |
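
Before reading per-model data, a client may want to confirm that a response has the two top-level fields described above. The following is a minimal sketch, not part of the gateway itself; the function name `isHealthResponse` is hypothetical, and it checks only what the table documents.

```javascript
// Hypothetical helper (not part of the gateway): checks only the two
// documented top-level fields of a /health response body.
function isHealthResponse(body) {
  if (body === null || typeof body !== 'object') return false;

  // "status" is documented as either "ok" or "degraded".
  const statusValid = body.status === 'ok' || body.status === 'degraded';

  // "models" is documented as an object keyed by model identifier.
  const modelsValid =
    body.models !== null &&
    typeof body.models === 'object' &&
    !Array.isArray(body.models);

  return statusValid && modelsValid;
}
```

A caller could run this on the parsed JSON and treat a `false` result as an unexpected payload rather than as a health signal.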
Per-Model Fields
| Field | Type | Description |
|---|---|---|
| provider | string | Backend provider for this model |
| healthy | boolean | Whether the model is currently accepting traffic |
| success_rate | number | Rolling success rate (0–1) |
| avg_latency_ms | number | Rolling average response latency in milliseconds |
| in_cooldown | boolean | Whether the model is temporarily paused due to errors |
| cooldown_until | string \| null | ISO 8601 timestamp when the cooldown expires, or null |
| daily_requests | number | Requests made to this model today |
| daily_limit | number \| null | Configured daily limit for this model, or null if unlimited |
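
Because `daily_limit` can be null, code that computes remaining quota needs to handle the unlimited case explicitly. A small sketch, assuming the per-model object has the shape shown in the table; `remainingDailyRequests` is a hypothetical name, not a gateway API.

```javascript
// Hypothetical helper: requests left today for one per-model state object.
// Returns null when no daily limit is configured (documented as "unlimited").
function remainingDailyRequests(state) {
  if (state.daily_limit === null) return null;
  // Clamp at zero in case the counter has reached or passed the limit.
  return Math.max(0, state.daily_limit - state.daily_requests);
}
```

With the sample response above, this gives 6900 for `gpt-4o` (10000 - 3100) and 400 for `llama-3.3-70b-versatile` (2000 - 1600).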
Cooldown Behaviour
When a model’s error rate exceeds the configured threshold, the gateway places it in cooldown for a short window. During cooldown, the router skips that model and falls back to healthy alternatives. Once cooldown_until passes, the model is re-evaluated on the next request.
A model marked healthy: false with in_cooldown: true will recover automatically — no operator action is needed.
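
A client can read the same fields to estimate when a model will be retried and which models are currently usable. The sketch below is a client-side illustration only: the gateway's router logic is not shown in this page, and the names `cooldownRemainingMs` and `routableModels` are hypothetical.

```javascript
// Hypothetical helper: milliseconds left until cooldown_until.
// Returns 0 when the model is not in cooldown, the timestamp is missing
// or unparseable, or the cooldown has already expired.
function cooldownRemainingMs(state, now = new Date()) {
  if (!state.in_cooldown || state.cooldown_until === null) return 0;
  const expiresAt = Date.parse(state.cooldown_until); // ISO 8601 string
  if (Number.isNaN(expiresAt)) return 0;
  return Math.max(0, expiresAt - now.getTime());
}

// Hypothetical helper: names of models reported as healthy and not in
// cooldown, in the order they appear in the response.
function routableModels(health) {
  return Object.entries(health.models)
    .filter(([, state]) => state.healthy && !state.in_cooldown)
    .map(([name]) => name);
}
```

For the sample response, `routableModels` would return the `gpt-4o` and `claude-3-5-sonnet-20241022` entries and leave out `llama-3.3-70b-versatile`, which is in cooldown.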
Examples
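```sh
curl https://your-gateway.workers.dev/health
```

```js
// Fetch the current health state (no authentication required).
const response = await fetch('https://your-gateway.workers.dev/health');
if (!response.ok) {
  // fetch() does not reject on HTTP error statuses, so check explicitly.
  throw new Error(`Health check failed with HTTP ${response.status}`);
}
const health = await response.json();

console.log(`Gateway status: ${health.status}`);

// Print one summary line per configured model.
for (const [model, state] of Object.entries(health.models)) {
  const status = state.healthy ? 'healthy' : state.in_cooldown ? 'cooldown' : 'unhealthy';
  console.log(`${model} (${state.provider}): ${status} — ${(state.success_rate * 100).toFixed(1)}% success`);
}
```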