Gateway API

A drop-in proxy for OpenAI, Anthropic, and Google Gemini. Same request shape, same response shape — plus cost, latency, and team attribution on every call. No SDK to install.

Proxied via gateway

Swap the base URL and send requests through us. We forward, meter, and log every call.

OpenAI, Anthropic, Google Gemini

Billing-ingest only

We pull cost and usage data from the provider's billing API. Traffic stays on your network.

Azure OpenAI, Vertex / Gemini, AWS Bedrock

What you get with each mode

Requests to proxied providers pass through the gateway in real time. For billing-ingest providers, traffic stays on your network and we pull cost data after the fact. The table below shows what that means in practice.

Capability	Proxied (OpenAI, Anthropic, Gemini)	Billing-ingest only (Azure, Vertex, Bedrock)
Latency impact	~5–15 ms added per request (connection reuse + metering)	None — traffic never touches the gateway
Per-request traces	Full prompt/response logging, token-level attribution	Aggregated usage rows only (no prompt bodies)
Real-time rate limits	RPM/TPM per virtual key enforced at the gateway	Not applicable — enforced by the provider directly
Model access / routing	Any model the provider exposes; automatic fallback between models	Whatever models are enabled in your cloud account
Cost visibility	Instant — every response includes calculated USD cost	Delayed — batched from daily/hourly billing exports
Team attribution	Per-request tags (e.g. `x-workflow`)	Per-project or per-subscription labels from billing data

1. Mint a virtual key

Go to /gateway, click New key, give it a name, pick the team, and scope it (allowed providers, allowed models, monthly USD cap, optional IP allowlist). Copy the sk-ts-live-… token — it is shown once.

Make sure a provider key (OpenAI, Anthropic, or Google Gemini) is already configured in onboarding or Settings → BYOK.

2. OpenAI-compatible chat completions

Point your existing OpenAI client at the gateway base URL. No SDK changes.

curl https://YOUR-APP.lovable.app/api/public/gateway/v1/chat/completions \
  -H "Authorization: Bearer sk-ts-live-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Say hi."}]
  }'

Add an x-workflow header (for example, x-workflow: codegen) to any request to attribute its spend to that workflow in dashboards.
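For callers that are not using curl, the same request can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the base URL and key are the placeholders from the curl example, and `build_chat_request` and `send` are hypothetical helper names introduced here.

```python
import json
import urllib.request

# Placeholders copied from the curl example above; replace with your own values.
GATEWAY_BASE_URL = "https://YOUR-APP.lovable.app/api/public/gateway/v1"
VIRTUAL_KEY = "sk-ts-live-..."


def build_chat_request(prompt, model="gpt-4o-mini", workflow=None):
    """Build (but do not send) a chat completions request for the gateway.

    `workflow` is optional; when given it is sent as the x-workflow header.
    """
    headers = {
        "Authorization": f"Bearer {VIRTUAL_KEY}",
        "Content-Type": "application/json",
    }
    if workflow is not None:
        headers["x-workflow"] = workflow
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{GATEWAY_BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_chat_request("Say hi.", workflow="codegen"))` would issue the same call as the curl command, with the spend attributed to the codegen workflow.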

3. Anthropic-compatible messages

curl https://YOUR-APP.lovable.app/api/public/gateway/v1/messages \
  -H "Authorization: Bearer sk-ts-live-..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Say hi."}]
  }'

Streaming ("stream": true) is supported on both endpoints — usage is parsed from the final chunk and recorded.
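To make "usage is parsed from the final chunk" concrete, here is a rough client-side sketch of reading usage out of an OpenAI-style server-sent event stream. It assumes the stream uses `data: {...}` lines terminated by `data: [DONE]` and that a late chunk carries a top-level `usage` object (with OpenAI this normally requires `"stream_options": {"include_usage": true}` in the request). Whether the gateway needs that option, and how it parses streams internally, is not stated here. Anthropic streams use a different event layout and are not covered. `extract_usage` is a hypothetical helper name.

```python
import json


def extract_usage(sse_lines):
    """Return the last top-level `usage` object seen in an SSE stream, or None."""
    usage = None
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, `event:` lines, and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel in OpenAI-style streams
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore anything that is not a JSON chunk
        if isinstance(chunk, dict) and chunk.get("usage"):
            usage = chunk["usage"]
    return usage
```

Keeping the last non-empty `usage` rather than the first means that intermediate chunks with `"usage": null` do not overwrite or hide the final totals.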

Error model

401 — invalid, revoked, or IP-blocked key.
402 — monthly USD cap exhausted for this key.
403 — model or provider not allowed for this key.
429 — provider rate limit (passed through).
5xx — provider error or gateway fault; safe to retry with backoff.
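One way a client might act on these codes is sketched below. The only retry guidance stated above is that 5xx responses are safe to retry with backoff; treating 429 as retryable after a delay is an assumption based on common practice, and the helper names, base delay, and cap are illustrative choices rather than gateway requirements.

```python
import random

# Statuses from the error model above that a retry will not fix on its own.
NON_RETRYABLE = {
    401: "invalid, revoked, or IP-blocked key",
    402: "monthly USD cap exhausted for this key",
    403: "model or provider not allowed for this key",
}


def should_retry(status):
    """5xx is documented as safe to retry; retrying 429 is an assumption."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Exponential backoff with full jitter: a random delay in [0, min(cap, base * 2**attempt)]."""
    return rng() * min(cap, base * (2 ** attempt))


def describe(status):
    """Human-readable hint for logging or surfacing to the caller."""
    if status in NON_RETRYABLE:
        return f"{status}: {NON_RETRYABLE[status]}; fix the key or its scope"
    if should_retry(status):
        return f"{status}: transient; retry with backoff"
    return f"{status}: not covered by the error model above"
```

A caller would typically loop over a small number of attempts, sleep for `backoff_delay(attempt)` seconds whenever `should_retry` is true, and stop immediately on 401, 402, or 403, since those need a change to the key, its cap, or its scope.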

What gets logged

For every request we record: provider, model, prompt and completion tokens (including cached / cache-creation tokens), USD cost using current published pricing, latency, status, virtual key id, and team id. Full request and response bodies are only persisted on sampled traces (default 5%) into a separate table you can disable per organization.
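As a worked illustration of how a per-request USD cost can be derived from token counts, consider the sketch below. The model name and prices are hypothetical placeholders, not real published pricing, and the sketch ignores cached and cache-creation tokens, which the gateway records but whose pricing rules are not described here.

```python
from decimal import Decimal

# HYPOTHETICAL prices in USD per 1,000,000 tokens, for illustration only.
EXAMPLE_PRICES = {
    "example-model": {"input": Decimal("0.50"), "output": Decimal("1.50")},
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def request_cost_usd(model, prompt_tokens, completion_tokens, prices=EXAMPLE_PRICES):
    """Cost = prompt tokens * input price + completion tokens * output price, per 1M tokens."""
    price = prices[model]
    total = (
        Decimal(prompt_tokens) * price["input"]
        + Decimal(completion_tokens) * price["output"]
    )
    return total / TOKENS_PER_PRICE_UNIT
```

With these placeholder prices, a request with 1,000 prompt tokens and 500 completion tokens costs (1,000 × 0.50 + 500 × 1.50) / 1,000,000 = 0.00125 USD. `Decimal` is used instead of floats so that many small per-request costs can be summed without rounding drift.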

See Security for retention and encryption details.

Frequently asked questions

Do I need to route all traffic through the gateway?

No. OpenAI, Anthropic, and Gemini can be proxied for real-time metering and team attribution. For Azure OpenAI, Vertex, and Bedrock, traffic stays on your network and we ingest cost data from the billing API.

Why are some providers billing-ingest only?

Hyperscaler deployments (Azure, AWS, GCP) typically run behind private networking and use provider-specific auth schemes, so proxying them would mean re-architecting your VPC. Instead, we pull their detailed usage exports, which still gives you unified dashboards.

Can I mix both modes in one report?

Yes. The spend index and team dashboards combine gateway-proxied calls and billing-ingest rows into a single view. Each row is tagged with its source so you can filter or audit.

What if I already use OpenAI-compatible proxies for other providers?

If you route Mistral, Groq, or xAI through an OpenAI-compatible endpoint, send it to our gateway with the model header. We’ll meter it like any other proxied call even though we don’t have a native billing connector for those providers yet.