Three on-ramps. Your call.

You don't have to change a line of code to start.

Most AI cost tools force you through their gateway day one. We don't. Paste an admin key, get dashboards tonight. Add a drop-in SDK when you want traces. Swap the base_url when you need to physically block overages. Same platform — escalating commitment.

Start here

Observe

Read-only billing ingest. 5-minute setup.

Fit

You want dashboards, forecasts, and anomaly alerts tonight — without any code change.

Setup

→Paste your provider admin/billing key (OpenAI sk-admin-…, Anthropic admin key, GCP service account, Azure SP, AWS CUR)
→We pull usage + cost daily and roll it up by model, project, and key
→Connect your Slack / email / PagerDuty — get alerts on anomalies, forecast overruns, and budget thresholds

You get

✓MTD burn + 30-day forecasts
✓Per-project, per-key, per-model attribution
✓Anomaly + spike alerts
✓Board-ready ROI / chargeback reports

Not at this tier

—Hard caps (alerts only)
—Per-request bodies / prompts
—Guardrails + PII redaction
—Eval / semantic cache

No code change. Connect in Settings → Providers.

Start here →

Middle

Instrument

Drop-in SDK wrapper. 1-line code change.

Fit

You want per-request traces, eval, and feature-level attribution — and you're OK with a one-line import swap.

Setup

→Swap `from openai import OpenAI` → `from tokmeter import OpenAI` (drop-in shim)
→Same for `@anthropic-ai/sdk` (`tokmeter-anthropic`)
→Spans ship to us over OTel; the request still goes direct to the provider

You get

✓Everything in Observe, plus:
✓Per-request prompt/response samples (sampled, redactable)
✓Latency + token traces per endpoint
✓Eval rules on real production traffic
✓Feature-flag attribution (which workflow drove which $)

Not at this tier

—Hard caps that block (still alert-only)
—Semantic cache
—Egress allow-list

One import line per service. Reversible in one revert.

Read the SDK docs →

Most power

Govern

Full gateway. Swap base_url, get blocking control.

Fit

You need to physically prevent overages, redact PII before it leaves your network, kill bad models in one click.

Setup

→Change `OPENAI_BASE_URL` to the Tokmeter gateway
→Issue virtual keys per team / agent / CI job
→Set hard $ caps, semantic cache, guardrails, egress allow-list

You get

✓Everything in Instrument, plus:
✓Hard $ caps that BLOCK at the request boundary
✓PII redaction / egress allow-list / kill-switch
✓Semantic cache (typical 20–40% cost cut on its own)
✓Virtual keys with per-key budgets + scopes

One env var per service. One env var to roll back.

See gateway docs →

What we can read with an admin key

Mode 1 coverage by provider.

Provider	What you give us	Granularity	Status
OpenAI	sk-admin-… key	Hourly · per project · per key · per model	Live
Anthropic	Admin API key	Daily · per workspace · per key · per model	Live
Google Gemini / Vertex	GCP service account (BigQuery Billing Export reader)	Daily · per project · per SKU	Live
Azure OpenAI	Service principal (Cost Management Reader)	Daily · per deployment	Live
AWS Bedrock	CUR S3 + Athena, or Cost Explorer	Daily · per model · per account	Live

Admin keys are encrypted at rest with AES-256-GCM, never returned to the browser, and only decrypted inside ingest workers. You can revoke at any time from your provider's dashboard.

For the dev who's been burned before.

Do I have to route my traffic through you?

No. Tier 1 (Observe) is 100% read-only against your provider's billing API. Your prompts and completions never touch our network.

What if I outgrow Observe?

Add the SDK wrapper for traces. Still no traffic detour — we just collect spans. Move to the gateway only when you need hard caps or semantic cache.

Can I downgrade?

Yes. Remove the env var or admin key and you're back where you started. No data lock-in — every dashboard exports to CSV/Parquet.

What about latency?

Tier 1: zero added latency (we never see the request). Tier 2: <2ms span emit overhead. Tier 3: <15ms p50 gateway overhead with streaming passthrough.

Where do prompts live?

Tier 1: nowhere — we only see billing aggregates. Tier 2: sampled spans with optional PII redaction. Tier 3: hashed by default; full bodies opt-in per-team.

Land with Observe. Expand when ready.

Paste an admin key, see your spend in 5 minutes. We earn the right to govern.

See pricing →