Gateway API

A drop-in proxy for OpenAI, Anthropic, and Google Gemini. Same request shape, same response shape — plus cost, latency, and team attribution on every call. No SDK to install.

Proxied via gateway

Swap the base URL and send requests through us. We forward, meter, and log every call.

OpenAI, Anthropic, Google Gemini

Billing-ingest only

We pull cost and usage data from the provider's billing API. Traffic stays on your network.

Azure OpenAI, Vertex / Gemini, AWS Bedrock

What you get with each mode

Requests to proxied providers pass through the gateway in real time. Requests to billing-ingest providers stay on your network, and we pull cost data after the fact. The table below shows what that means in practice.

| Capability | Proxied (OpenAI, Anthropic, Gemini) | Billing-ingest only (Azure, Vertex, Bedrock) |
| --- | --- | --- |
| Latency impact | ~5–15 ms added per request (connection reuse + metering) | None: traffic never touches the gateway |
| Per-request traces | Full prompt/response logging, token-level attribution | Aggregated usage rows only (no prompt bodies) |
| Real-time rate limits | RPM/TPM per virtual key, enforced at the gateway | Not applicable: enforced by the provider directly |
| Model access / routing | Any model the provider exposes; automatic fallback between models | Whatever models are enabled in your cloud account |
| Cost visibility | Instant: every response includes the calculated USD cost | Delayed: batched from daily/hourly billing exports |
| Team attribution | Per-request tags (e.g. x-workflow) | Per-project or per-subscription labels from billing data |

1. Mint a virtual key

Go to /gateway, click New key, give it a name, pick the team, and scope it (allowed providers, allowed models, monthly USD cap, optional IP allowlist). Copy the sk-ts-live-… token — it is shown once.

Before you send traffic, make sure a provider key (OpenAI, Anthropic, or Google Gemini) is already configured, either during onboarding or under Settings > BYOK.
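To make the scoping options concrete, here is a minimal Python sketch of how a scoped key could be evaluated against an incoming request. It is an illustration only: the `KeyScope` class, its field names, and the order of the checks are assumptions, not the gateway's actual implementation. The status codes follow the list in the Error model section.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network


@dataclass
class KeyScope:
    """Hypothetical in-memory view of a virtual key's scope."""
    providers: set
    models: set
    monthly_cap_usd: float
    ip_allowlist: list = field(default_factory=list)  # CIDR strings; empty means any IP
    revoked: bool = False


def check(scope, provider, model, client_ip, spent_usd):
    """Return the HTTP status a request would receive: 200, 401, 402, or 403.

    The order of the checks is an assumption made for this sketch.
    """
    if scope.revoked:
        return 401  # revoked key
    if scope.ip_allowlist and not any(
        ip_address(client_ip) in ip_network(cidr) for cidr in scope.ip_allowlist
    ):
        return 401  # caller IP is outside the allowlist
    if provider not in scope.providers or model not in scope.models:
        return 403  # provider or model not allowed for this key
    if spent_usd >= scope.monthly_cap_usd:
        return 402  # monthly USD cap exhausted
    return 200
```

For example, a key scoped to `{"openai"}` and `{"gpt-4o-mini"}` with a 50 USD cap would return 403 for an Anthropic model and 402 once recorded spend reaches 50 USD.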

2. OpenAI-compatible chat completions

Point your existing OpenAI client at the gateway base URL. No SDK changes.

curl https://YOUR-APP.lovable.app/api/public/gateway/v1/chat/completions \
  -H "Authorization: Bearer sk-ts-live-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Say hi."}]
  }'

Add an x-workflow: codegen header (or any other tag value) to the request to attribute its spend to a workflow in dashboards.
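The same request, including the workflow tag, can be built from Python with only the standard library. This is a sketch rather than an official client: `build_chat_request` is a hypothetical helper, and the host and key are the same placeholders used in the curl example.

```python
import json
import urllib.request

# Placeholders copied from the curl example; substitute your own host and key.
GATEWAY_BASE = "https://YOUR-APP.lovable.app/api/public/gateway/v1"
VIRTUAL_KEY = "sk-ts-live-..."


def build_chat_request(prompt, model="gpt-4o-mini", workflow=None):
    """Build (but do not send) a chat-completions request for the gateway."""
    headers = {
        "Authorization": f"Bearer {VIRTUAL_KEY}",
        "Content-Type": "application/json",
    }
    if workflow:
        # Optional attribution tag, sent as the x-workflow header.
        headers["x-workflow"] = workflow
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{GATEWAY_BASE}/chat/completions",
        data=body,
        headers=headers,
        method="POST",
    )


# To send it once real values are filled in:
#   with urllib.request.urlopen(build_chat_request("Say hi.", workflow="codegen")) as resp:
#       reply = json.load(resp)
```

Keeping the request construction separate from the network call makes it easy to verify the headers in a unit test before any traffic is sent.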

3. Anthropic-compatible messages

curl https://YOUR-APP.lovable.app/api/public/gateway/v1/messages \
  -H "Authorization: Bearer sk-ts-live-..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Say hi."}]
  }'

Streaming ("stream": true) is supported on both endpoints — usage is parsed from the final chunk and recorded.
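A client can read the same usage figures from the stream itself. The sketch below covers the chat-completions shape only and assumes an OpenAI-style server-sent-event stream in which one of the last `data:` chunks carries a `usage` object and the stream ends with `data: [DONE]`. The helper name `last_usage` is hypothetical, and the gateway's own parser may work differently.

```python
import json


def last_usage(sse_lines):
    """Return the last non-empty `usage` object found in OpenAI-style SSE lines."""
    usage = None
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and event: lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage
```

The function accepts any iterable of decoded lines, so it works equally on a live response iterated line by line or on a captured transcript.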

Error model

  • 401 — invalid, revoked, or IP-blocked key.
  • 402 — monthly USD cap exhausted for this key.
  • 403 — model or provider not allowed for this key.
  • 429 — provider rate limit (passed through).
  • 5xx — provider error or gateway fault; safe to retry with backoff.
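One way a caller might act on these codes is sketched below. It retries 5xx responses with exponential backoff and jitter, as the list suggests, and treats 401, 402, and 403 as final. Retrying 429 is an assumption of this sketch, as are the helper names and default delays; adjust them to your provider's rate-limit guidance.

```python
import random
import time

# Key or policy problems: retrying the same request will not help.
NON_RETRYABLE = {401, 402, 403}


def is_retryable(status):
    """True for 429 and 5xx, False for 401/402/403 and everything else."""
    if status in NON_RETRYABLE:
        return False
    return status == 429 or 500 <= status <= 599


def backoff_delays(max_retries=4, base=0.5, cap=30.0, rng=random.random):
    """Yield exponentially growing sleep times (seconds) with full jitter."""
    for attempt in range(max_retries):
        yield rng() * min(cap, base * 2 ** attempt)


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    `send` is any zero-argument callable returning (status_code, body).
    """
    status, body = send()
    for delay in backoff_delays(max_attempts - 1):
        if not is_retryable(status):
            break
        sleep(delay)
        status, body = send()
    return status, body
```

Passing `sleep` and `rng` as parameters keeps the policy deterministic under test while using real delays in production.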

What gets logged

For every request we record: provider, model, prompt and completion tokens (including cached / cache-creation tokens), USD cost using current published pricing, latency, status, virtual key id, and team id. Full request and response bodies are persisted only for sampled traces (default 5%). They go into a separate table, and you can disable that sampling per organization.
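As a rough illustration of the two mechanisms described above, the sketch below computes a USD cost from token counts and decides whether a request body falls into the sample. The prices are placeholders, not real provider prices. The token categories are assumed to be disjoint, and the function names are hypothetical.

```python
import random
from decimal import Decimal

# Placeholder prices in USD per million tokens. These are NOT real provider
# prices; look up current published pricing for the model you use.
EXAMPLE_PRICES = {"input": "1.00", "cached_input": "0.50", "output": "4.00"}


def usd_cost(tokens, price_per_million):
    """Sum tokens * price for each category, with prices quoted per million tokens.

    Assumes the categories do not overlap (cached tokens are not also
    counted under "input").
    """
    total = Decimal("0")
    for category, count in tokens.items():
        total += Decimal(count) * Decimal(price_per_million[category]) / Decimal(1_000_000)
    return total


def keep_body(sample_rate=0.05, rng=random.random):
    """Return True for roughly `sample_rate` of calls (default 5%)."""
    return rng() < sample_rate
```

With the placeholder prices, 1,000 input tokens, 4,000 cached input tokens, and 500 output tokens cost 0.001 + 0.002 + 0.002 = 0.005 USD. `Decimal` is used so that many small per-request amounts add up without floating-point drift.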

See Security for retention and encryption details.

Frequently asked questions

Do I need to route all traffic through the gateway?

No. OpenAI, Anthropic, and Gemini can be proxied for real-time metering and team attribution. For Azure OpenAI, Vertex, and Bedrock, traffic stays on your network and we ingest cost data from the billing API.

Why are some providers billing-ingest only?

Hyperscaler APIs (Azure, AWS, GCP) enforce private networking and custom auth schemes. We pull their detailed usage exports so you still get unified dashboards without re-architecting your VPC.

Can I mix both modes in one report?

Yes. The spend index and team dashboards combine gateway-proxied calls and billing-ingest rows into a single view. Each row is tagged with its source so you can filter or audit.
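Conceptually, the combined view is a group-by over rows that carry a source tag. The sketch below shows the idea with plain dictionaries. The field names (`team`, `usd`, `source`) and the source values are assumptions for illustration, not the actual schema.

```python
from collections import defaultdict


def spend_by_team(rows, source=None):
    """Total USD per team, optionally restricted to one source tag."""
    totals = defaultdict(float)
    for row in rows:
        if source is not None and row["source"] != source:
            continue  # filter out rows from the other ingestion mode
        totals[row["team"]] += row["usd"]
    return dict(totals)


# Hypothetical rows: two gateway-proxied calls and one billing-ingest row.
rows = [
    {"team": "search", "usd": 1.5, "source": "gateway"},
    {"team": "search", "usd": 2.25, "source": "billing"},
    {"team": "ml", "usd": 4.0, "source": "gateway"},
]
```

Calling `spend_by_team(rows)` gives the unified totals, while `spend_by_team(rows, source="gateway")` restricts the audit to proxied calls.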

What if I already use OpenAI-compatible proxies for other providers?

If you route Mistral, Groq, or xAI through an OpenAI-compatible endpoint, send it to our gateway with the model header. We’ll meter it like any other proxied call even though we don’t have a native billing connector for those providers yet.