Health

Overview

The health endpoint exposes the live health state that the gateway tracks for each configured model. It is public: no authentication is required.

Endpoint: GET /health

Authentication: None

Request

GET https://your-gateway.workers.dev/health

Response

{
  "status": "ok",
  "models": {
    "gpt-4o": {
      "provider": "openai",
      "healthy": true,
      "success_rate": 0.994,
      "avg_latency_ms": 820,
      "in_cooldown": false,
      "cooldown_until": null,
      "daily_requests": 3100,
      "daily_limit": 10000
    },
    "claude-3-5-sonnet-20241022": {
      "provider": "anthropic",
      "healthy": true,
      "success_rate": 0.964,
      "avg_latency_ms": 1140,
      "in_cooldown": false,
      "cooldown_until": null,
      "daily_requests": 2800,
      "daily_limit": 5000
    },
    "llama-3.3-70b-versatile": {
      "provider": "groq",
      "healthy": false,
      "success_rate": 0.621,
      "avg_latency_ms": 490,
      "in_cooldown": true,
      "cooldown_until": "2024-11-15T11:00:00.000Z",
      "daily_requests": 1600,
      "daily_limit": 2000
    }
  }
}

Response Fields

Field	Type	Description
`status`	string	Top-level gateway status (`"ok"` or `"degraded"`)
`models`	object	Health state keyed by model identifier

Per-Model Fields

Field	Type	Description
`provider`	string	Backend provider for this model
`healthy`	boolean	Whether the model is currently accepting traffic
`success_rate`	number	Rolling success rate (0–1)
`avg_latency_ms`	number	Rolling average response latency in milliseconds
`in_cooldown`	boolean	Whether the model is temporarily paused due to errors
`cooldown_until`	string | null	ISO 8601 timestamp when the cooldown expires, or `null` if the model is not in cooldown
`daily_requests`	number	Requests made to this model today
`daily_limit`	number | null	Configured daily limit for this model, or `null` if unlimited
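
Because `daily_limit` may be `null`, client-side quota arithmetic has to treat the unlimited case explicitly. The following is a minimal sketch rather than anything the gateway ships: the helper name `remainingDailyQuota` and the `unlimited-model` entry are hypothetical, and the other sample values are copied from the example response above.

```javascript
// Hypothetical helper (not part of the gateway API): computes how many
// requests a model has left today from its /health entry.
// Returns null when daily_limit is null, i.e. the model is unlimited.
function remainingDailyQuota(state) {
  if (state.daily_limit === null) {
    return null;
  }
  // Defensive clamp so the result is never negative.
  return Math.max(0, state.daily_limit - state.daily_requests);
}

// Sample entries; only the two quota fields are needed here.
const sampleModels = {
  'gpt-4o': { daily_requests: 3100, daily_limit: 10000 },
  'llama-3.3-70b-versatile': { daily_requests: 1600, daily_limit: 2000 },
  // Assumed entry, added only to illustrate the unlimited case.
  'unlimited-model': { daily_requests: 42, daily_limit: null },
};

for (const [model, state] of Object.entries(sampleModels)) {
  const remaining = remainingDailyQuota(state);
  const label = remaining === null ? 'unlimited' : `${remaining} requests left`;
  console.log(`${model}: ${label}`);
}
```

With the sample values, `gpt-4o` has 6900 requests left and `llama-3.3-70b-versatile` has 400.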

Cooldown Behaviour

When a model’s error rate exceeds the configured threshold, the gateway places it in cooldown for a short window. During cooldown, the router skips that model and falls back to healthy alternatives. Once `cooldown_until` passes, the model is re-evaluated on the next request.

A model reporting `healthy: false` together with `in_cooldown: true` will recover automatically; no operator action is needed.
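
To make the recovery timing concrete, a client can estimate how long a model will stay in cooldown from `in_cooldown` and `cooldown_until`. This is a minimal client-side sketch, not the gateway's routing logic: the function names `cooldownRemainingMs` and `routableModels` are hypothetical, and the "current time" is an assumed value chosen for the example.

```javascript
// Hypothetical client-side helpers; the gateway's own router is not shown here.

// Milliseconds until a model's cooldown expires, or 0 if it is not cooling down.
function cooldownRemainingMs(state, now = Date.now()) {
  if (!state.in_cooldown || state.cooldown_until === null) {
    return 0;
  }
  return Math.max(0, Date.parse(state.cooldown_until) - now);
}

// Names of models that are healthy and not in an active cooldown.
function routableModels(health, now = Date.now()) {
  return Object.entries(health.models)
    .filter(([, state]) => state.healthy && cooldownRemainingMs(state, now) === 0)
    .map(([model]) => model);
}

// Trimmed copy of the example response, keeping only the fields used here.
const sampleHealth = {
  status: 'ok',
  models: {
    'gpt-4o': { healthy: true, in_cooldown: false, cooldown_until: null },
    'llama-3.3-70b-versatile': {
      healthy: false,
      in_cooldown: true,
      cooldown_until: '2024-11-15T11:00:00.000Z',
    },
  },
};

// Assumed "current time", 15 minutes before the cooldown expires.
const now = Date.parse('2024-11-15T10:45:00.000Z');
console.log(cooldownRemainingMs(sampleHealth.models['llama-3.3-70b-versatile'], now)); // 900000
console.log(routableModels(sampleHealth, now)); // [ 'gpt-4o' ]
```

A polling client could use the remaining time to schedule its next check instead of polling at a fixed interval.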

Examples

curl https://your-gateway.workers.dev/health

// Fetch the public health endpoint (no authentication required).
const response = await fetch('https://your-gateway.workers.dev/health');
const health = await response.json();

console.log(`Gateway status: ${health.status}`);

// Report each model as healthy, in cooldown, or unhealthy.
for (const [model, state] of Object.entries(health.models)) {
  const status = state.healthy ? 'healthy' : state.in_cooldown ? 'cooldown' : 'unhealthy';
  console.log(`${model} (${state.provider}): ${status}, ${(state.success_rate * 100).toFixed(1)}% success`);
}