Three on-ramps. Your call.

You don't have to change a line of code to start.

Most AI cost tools force you through their gateway day one. We don't. Paste an admin key, get dashboards tonight. Add a drop-in SDK when you want traces. Swap the base_url when you need to physically block overages. Same platform — escalating commitment.

01
Start here
Observe
Read-only billing ingest. 5-minute setup.
Fit

You want dashboards, forecasts, and anomaly alerts tonight — without any code change.

Setup
  • Paste your provider admin/billing key (OpenAI sk-admin-…, Anthropic admin key, GCP service account, Azure SP, AWS CUR)
  • We pull usage + cost daily and roll it up by model, project, and key
  • Connect your Slack / email / PagerDuty — get alerts on anomalies, forecast overruns, and budget thresholds
You get
  • MTD burn + 30-day forecasts
  • Per-project, per-key, per-model attribution
  • Anomaly + spike alerts
  • Board-ready ROI / chargeback reports
Not at this tier
  • Hard caps (alerts only)
  • Per-request bodies / prompts
  • Guardrails + PII redaction
  • Eval / semantic cache
No code change. Connect in Settings → Providers.
Start here
02
Middle
Instrument
Drop-in SDK wrapper. 1-line code change.
Fit

You want per-request traces, eval, and feature-level attribution — and you're OK with a one-line import swap.

Setup
  • Swap `from openai import OpenAI` → `from tokmeter import OpenAI` (drop-in shim)
  • Same for `@anthropic-ai/sdk` (`tokmeter-anthropic`)
  • Spans ship to us over OTel; the request still goes direct to the provider
You get
  • Everything in Observe, plus:
  • Per-request prompt/response samples (sampled, redactable)
  • Latency + token traces per endpoint
  • Eval rules on real production traffic
  • Feature-flag attribution (which workflow drove which $)
Not at this tier
  • Hard caps that block (still alert-only)
  • Semantic cache
  • Egress allow-list
One import line per service. Reversible in one revert.
Read the SDK docs
03
Most power
Govern
Full gateway. Swap base_url, get blocking control.
Fit

You need to physically prevent overages, redact PII before it leaves your network, kill bad models in one click.

Setup
  • Change `OPENAI_BASE_URL` to the Tokmeter gateway
  • Issue virtual keys per team / agent / CI job
  • Set hard $ caps, semantic cache, guardrails, egress allow-list
You get
  • Everything in Instrument, plus:
  • Hard $ caps that BLOCK at the request boundary
  • PII redaction / egress allow-list / kill-switch
  • Semantic cache (typical 20–40% cost cut on its own)
  • Virtual keys with per-key budgets + scopes
One env var per service. One env var to roll back.
See gateway docs
What we can read with an admin key

Mode 1 coverage by provider.

ProviderWhat you give usGranularityStatus
OpenAIsk-admin-… keyHourly · per project · per key · per modelLive
AnthropicAdmin API keyDaily · per workspace · per key · per modelLive
Google Gemini / VertexGCP service account (BigQuery Billing Export reader)Daily · per project · per SKULive
Azure OpenAIService principal (Cost Management Reader)Daily · per deploymentLive
AWS BedrockCUR S3 + Athena, or Cost ExplorerDaily · per model · per accountLive

Admin keys are encrypted at rest with AES-256-GCM, never returned to the browser, and only decrypted inside ingest workers. You can revoke at any time from your provider's dashboard.

For the dev who's been burned before.

Do I have to route my traffic through you?

No. Tier 1 (Observe) is 100% read-only against your provider's billing API. Your prompts and completions never touch our network.

What if I outgrow Observe?

Add the SDK wrapper for traces. Still no traffic detour — we just collect spans. Move to the gateway only when you need hard caps or semantic cache.

Can I downgrade?

Yes. Remove the env var or admin key and you're back where you started. No data lock-in — every dashboard exports to CSV/Parquet.

What about latency?

Tier 1: zero added latency (we never see the request). Tier 2: <2ms span emit overhead. Tier 3: <15ms p50 gateway overhead with streaming passthrough.

Where do prompts live?

Tier 1: nowhere — we only see billing aggregates. Tier 2: sampled spans with optional PII redaction. Tier 3: hashed by default; full bodies opt-in per-team.

Land with Observe. Expand when ready.

Paste an admin key, see your spend in 5 minutes. We earn the right to govern.

See pricing →