SDK Quick Start

Wrap your LLM client with the SuperPenguin Python SDK to get per-request cost tracking, customer attribution, and spend analytics.

1. Install the SDK & get your key

Install the SDK with pip install superpenguin, then go to SDK Keys and create an SDK API key. You'll get a key starting with sp_. Copy it — it's only shown once.
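From a terminal, that looks like the following (the key value is a placeholder; the SP_API_KEY environment variable is the alternative to passing the key in code, as noted under Troubleshooting):

```shell
# Install the SDK
pip install superpenguin

# Optional: supply the key via environment variable instead of sp.init(api_key=...)
export SP_API_KEY="sp_your_key_here"
```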

2. Wrap your LLM client

Initialize the SDK with your sp_ key, then wrap your OpenAI, Anthropic, or LiteLLM client. No base URL changes needed — calls go directly to the provider.

Python (OpenAI)

import superpenguin as sp
from openai import OpenAI

sp.init(api_key="sp_...")

client = sp.wrap(OpenAI())

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)

3. Add metadata for attribution

Set default metadata on your client, or pass per-call metadata via extra_body to attribute costs to customers, features, teams, or environments.

Python (OpenAI)

# Set default metadata for all calls from this client
client = sp.wrap(OpenAI(), metadata={
    "customer_id": "cust_acme_123",
    "feature": "doc_summary",
    "team": "product",
    "environment": "production",
})

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this"}],
)
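For per-call attribution via extra_body, a minimal sketch: it assumes the SDK reads an sp_metadata key inside extra_body (the sp_metadata name appears under Prompt Tracking in these docs), and the with_sp_metadata helper is hypothetical, not part of the SDK.

```python
# Hypothetical helper (not part of the SDK): builds the extra_body
# payload for one request. The "sp_metadata" key name is an assumption
# based on the sp_metadata field mentioned elsewhere in these docs.
def with_sp_metadata(**metadata):
    return {"extra_body": {"sp_metadata": metadata}}

kwargs = with_sp_metadata(customer_id="cust_beta_456", feature="search")
# Then splat into a normal call:
# client.chat.completions.create(model="gpt-4o", messages=[...], **kwargs)
```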

Metadata Fields

| Field | Type | Purpose |
| --- | --- | --- |
| prompt_key | string | Identifies the prompt; appears on the Prompts analytics page |
| prompt_version | string or number | Version identifier of the prompt (e.g. "1", "beta", "2.1") |
| customer_id | string | End customer or account consuming the AI call |
| feature | string | Product feature name (e.g., search, support_agent) |
| team | string | Internal team owning the feature |
| environment | string | production, staging, dev, etc. |
| * | string | Any other key is stored as a custom tag |

All fields are optional. Start with none and add them incrementally.

4. View your data

Go to the Attribution page. You'll see:

- KPI Cards: total SDK spend, request count, average cost per request
- Breakdown Tabs: slice by model, provider, customer, feature, team, or environment
- Drilldowns: click any row to see nested attribution (e.g., models per customer)
- Recent Requests: individual request log with tokens, cost, latency, and metadata

Switch to the Reconciliation tab to compare SDK-estimated costs against your actual provider bills.

Prompt Tracking

Track cost and performance per prompt by passing prompt_key and optionally prompt_version in your request metadata.

How it works

  1. Add prompt_key to your metadata or sp_metadata
  2. The SDK logs each request with that key
  3. Open the Prompts page to see cost, request count, and active version per prompt
  4. Click into a prompt to compare versions side-by-side (avg cost, request volume)

Version comparison

Use prompt_version to A/B test prompt changes. Versions can be numbers (1, 2) or strings ("beta", "v2.1"). The Prompts detail page shows a bar chart comparing cost and volume across versions.
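For instance, a simple 50/50 split between two versions can feed prompt_version. This is only an illustration: the prompt texts, the split ratio, and the pick_version helper are invented for the sketch.

```python
import random

# Two candidate prompts; texts here are placeholders.
PROMPTS = {
    "1": "Summarize the article in three bullet points.",
    "2": "Summarize the article in one short paragraph.",
}

def pick_version(roll=None):
    """50/50 split between versions "1" and "2"."""
    roll = random.random() if roll is None else roll
    return "2" if roll < 0.5 else "1"

version = pick_version()
metadata = {"prompt_key": "summarize-article", "prompt_version": version}
# Pass `metadata` with each request; the Prompts detail page then
# compares cost and volume across the two versions.
```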

Recommended naming

- summarize-article: kebab-case, descriptive
- support-agent-v2: include context if needed
- onboarding.welcome: dot notation for grouping
- extract-invoice-data: action-oriented names

Cost Estimation

The SDK includes a built-in pricing table for automatic cost estimation. Models with known pricing:

| Model Prefix | Provider |
| --- | --- |
| gpt-*, o3-*, o4-* | OpenAI |
| claude-* | Anthropic |
| gemini-* | Google |
| grok-* | xAI |

Unknown models still get tracked (tokens, latency, metadata) — cost shows as $0 until pricing is added.
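Prefix-based estimation works roughly like the sketch below. The prefixes mirror the table above, but the per-million-token prices are made-up placeholders for illustration, not SuperPenguin's actual pricing table.

```python
# Illustrative pricing table: model prefix -> (input $/1M tokens, output $/1M tokens).
# Rates are placeholders for this sketch, not real prices.
PRICES = {
    "gpt-": (2.50, 10.00),
    "o3-": (2.00, 8.00),
    "o4-": (1.10, 4.40),
    "claude-": (3.00, 15.00),
    "gemini-": (1.25, 5.00),
    "grok-": (2.00, 10.00),
}

def estimate_cost(model, prompt_tokens, completion_tokens):
    """Prefix-match the model name; unknown models return $0 but are still logged."""
    for prefix, (in_price, out_price) in PRICES.items():
        if model.startswith(prefix):
            return (prompt_tokens * in_price + completion_tokens * out_price) / 1_000_000
    return 0.0
```

With these placeholder rates, estimate_cost("gpt-4o", 1000, 500) is 0.0025 + 0.0050 = $0.0075, while an unrecognized model name yields $0.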

What Gets Logged

Every request logs: provider, model, token counts, estimated cost, latency, status code, streaming/tools/vision flags, and all metadata fields.

Never logged

Prompt content, response content, images, audio, tool arguments, or function results. The SDK does not capture conversation data.

Troubleshooting

| Problem | Fix |
| --- | --- |
| sp.init() has not been called | Call sp.init(api_key="sp_...") or set the SP_API_KEY environment variable |
| Unsupported client type | sp.wrap() supports OpenAI, AsyncOpenAI, Anthropic, and AsyncAnthropic |
| Attribution page is empty | Data appears within seconds; try refreshing |
| Cost shows as $0 | Model may not be in the pricing table yet; tokens and latency still track correctly |