Configuration
Punk is configured through environment variables and tenant settings.
Local Defaults
bun install
bun run dev
Default local behavior:
- Port: 4100.
- Database: data/punk.db.
- Provider: offline mock simulator when no matching live provider key is available.
- Auth: open dev mode when PUNK_API_KEY is absent.
- Worker: embedded in the API process.
- Learning tick: every 10 seconds.
- Web fetch runtime: built-in page-to-context adapter.
- Governance: embedded policy engine with YAML policies.
Environment Variables
| Variable | Default | Purpose |
|---|---|---|
| PUNK_PORT | 4100 | HTTP port. |
| PUNK_HOST | Bun default | Optional listen host. Use 127.0.0.1 when Punk sits behind Caddy, Nginx, or an SSH tunnel. |
| PUNK_API_KEY | unset | Bootstrap bearer token. When set, /api/* and /v1/* require auth. |
| PUNK_DATABASE_URL | unset | Postgres/Neon-compatible database URL. Overrides SQLite. Serverless deployments also accept the DATABASE_URL a Neon integration injects (PUNK_DATABASE_URL wins when both are set). |
| PUNK_DB_PATH | data/punk.db | SQLite database path. |
| PUNK_PROVIDER | inferred | Set to mock to force offline behavior. |
| OPENAI_API_KEY | unset | Platform key for live OpenAI models. |
| OPENAI_BASE_URL | OpenAI default | Optional OpenAI-compatible base URL. |
| ANTHROPIC_API_KEY | unset | Platform key for the native Anthropic backend (claude-* models). |
| ANTHROPIC_BASE_URL | Anthropic default | Optional Anthropic-compatible base URL. |
| OPENROUTER_API_KEY | unset | Platform key for OpenRouter model slugs, including DeepSeek, Kimi/Moonshot, Google, Anthropic, OpenAI, Qwen, and other routed providers. |
| OPENROUTER_BASE_URL | https://openrouter.ai/api | OpenRouter base URL. https://openrouter.ai/api/v1 is also accepted. |
| OPENROUTER_SITE_URL | PUNK_APP_BASE_URL | Optional attribution header sent to OpenRouter. |
| OPENROUTER_APP_NAME | Punk Chorus | Optional OpenRouter app title header. |
| PUNK_CHORUS_SOTA_SYNTHESIS_MODEL | inferred from configured providers | Default final answer model for maximum-quality Chorus runs. |
| PUNK_CHORUS_SOTA_PANEL_MODELS | inferred from configured providers | Comma-separated default candidate panel models for maximum-quality Chorus SOTA mix runs. |
| PUNK_CHORUS_AGENT_MODEL | claude-sonnet-4-6 | Default delegate model for Chorus Anthropic tool-declaring agent steps, including Claude Code launched with punk/chorus. Request-level chorus_agent_model and the tenant setting override it. |
| DEEPSEEK_API_KEY | unset | Direct DeepSeek OpenAI-compatible backend for deepseek-* or deepseek:* model ids. |
| DEEPSEEK_BASE_URL | https://api.deepseek.com | Direct DeepSeek base URL. |
| MOONSHOT_API_KEY / KIMI_API_KEY | unset | Direct Moonshot/Kimi OpenAI-compatible backend for kimi-*, kimi:*, moonshot-*, or moonshot:* model ids. |
| MOONSHOT_BASE_URL / KIMI_BASE_URL | https://api.moonshot.ai | Direct Moonshot/Kimi base URL. |
| GOOGLE_GEMINI_API_KEY / GEMINI_API_KEY | unset | Direct Gemini OpenAI-compatible backend for gemini-* or gemini:* model ids. |
| GEMINI_BASE_URL | https://generativelanguage.googleapis.com/v1beta/openai | Direct Gemini OpenAI-compatible base URL. |
| PUNK_PROVIDER_TIMEOUT_MS | 60000 | Per-call provider timeout. A hung primary aborts at this bound (AbortController) and triggers failover. 0 disables the timeout. See Provider Failover. |
| PUNK_FAILOVER_TO_MOCK | auto | Whether the deterministic mock may serve as the final failover backstop. Auto = true in open-dev/no-real-key setups, false once a real provider is configured. Explicit true/false always wins. See Provider Failover. |
| PUNK_ENCRYPTION_KEY | dev-only fallback | 32-byte base64 key for the credentials vault (AES-256-GCM). Authenticated deployments must set it before storing credentials; open dev mode may use the dev fallback. |
| PUNK_DOCS_DIR | docs | Public docs markdown source directory. |
| PUNK_MARKETING_HOST | punktechnologies.com in entrypoints | Host that serves the marketing splash at /; set to an empty value when embedding Punk somewhere that should not host-split marketing. |
| PUNK_MEET_HOST | unset | Optional explicit host that serves the product presentation deck at /. Hosts beginning with meet. serve the deck automatically; the deck is also available at /meet for preview. |
| PUNK_APP_HOST | app.punktechnologies.com in entrypoints | Dashboard host used by the marketing splash's /app redirect. |
| PUNK_POLICIES_DIR | policies | Directory containing policy YAML. |
| PUNK_AUTO_PROMOTE | false | Allows hands-free promotion for side-effect-free artifacts that pass gates. |
| PUNK_LEARN_INTERVAL_MS | 10000 | Learning tick enqueue cadence. |
| PUNK_EMBEDDED_WORKER | true | Set to false when the API should not start its embedded job worker because one or more standalone bun run worker processes are running. |
| PUNK_WORKER_POLL_MS | 500 | Embedded/standalone worker poll interval. |
| PUNK_WORKER_CONCURRENCY | 2 | Standalone worker claim-loop concurrency. |
| PUNK_CRON_SECRET | unset | Secret-gates /api/v1/internal/tick; unset leaves the endpoint 404/off. |
| CRON_SECRET | unset | Vercel-provided bearer secret for cron invocations; set equal to PUNK_CRON_SECRET. |
| PUNK_RETENTION_DAYS | 90 | Retention sweep window. |
| PUNK_RATE_LIMIT_RPM | 300 | /api/v1/* requests per caller per minute; 0 disables. |
| PUNK_CHAT_RATE_LIMIT_RPM | 600 | /v1/* gateway requests per caller per minute; 0 disables. |
| PUNK_ALLOW_PRIVATE_WEB_FETCH | false | Allows web fetch to private/loopback addresses outside true open dev mode. |
| PUNK_ALLOW_PRIVATE_WEBHOOKS | false | Allows webhook URLs to private addresses. Always off unless explicitly true. |
| PUNK_ALLOW_PUBLIC_SIGNUP | false | Enables open self-serve signup and email verification. |
| PUNK_APP_BASE_URL | https://app.punktechnologies.com | Base URL used in invite and verification links. |
| RESEND_API_KEY | unset | Sends email through Resend; unset uses console transport. |
| PUNK_EMAIL_FROM | Punk <noreply@punktechnologies.com> | Sender identity for Resend email. |
| PUNK_BILLING_DISABLED | false | true → all plan quotas unlimited (self-host). Open dev mode bypasses quotas regardless. See Billing & Usage. |
| STRIPE_SECRET_KEY | unset | Enables the Stripe Checkout / portal / webhook billing endpoints. Unset → plan changes apply directly (free-plan complete). |
| STRIPE_WEBHOOK_SECRET | unset | Stripe webhook signature signing secret. |
| STRIPE_PRICE_PRO | unset | Stripe price id used as the Pro checkout line item. |
| STRIPE_PRICE_ENTERPRISE | unset | Stripe price id for Enterprise (if sold self-serve). |
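
A few of the defaults above can be read as a small resolver. The following is a minimal sketch, not Punk's actual loader: `resolveConfig`, `intFromEnv`, and the `ResolvedConfig` shape are hypothetical names, the environment is passed in as a plain object, and falling back to the default on a malformed number is an assumption. The sketch also accepts DATABASE_URL everywhere, whereas the table only promises it for serverless deployments.

```typescript
// Hypothetical helper: names and shapes are illustrative, not Punk internals.
type Env = Record<string, string | undefined>;

interface ResolvedConfig {
  port: number;
  database: { kind: "postgres"; url: string } | { kind: "sqlite"; path: string };
  learnIntervalMs: number;
  embeddedWorker: boolean;
}

// Read an integer variable, using the documented default when it is unset,
// empty, or not a number (the last case is an assumption).
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number.parseInt(raw, 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

function resolveConfig(env: Env): ResolvedConfig {
  // PUNK_DATABASE_URL wins over an injected DATABASE_URL; either overrides SQLite.
  const url = env["PUNK_DATABASE_URL"] || env["DATABASE_URL"];
  return {
    port: intFromEnv(env, "PUNK_PORT", 4100),
    database: url
      ? { kind: "postgres", url }
      : { kind: "sqlite", path: env["PUNK_DB_PATH"] || "data/punk.db" },
    learnIntervalMs: intFromEnv(env, "PUNK_LEARN_INTERVAL_MS", 10000),
    // The embedded worker stays on unless the variable is exactly "false".
    embeddedWorker: env["PUNK_EMBEDDED_WORKER"] !== "false",
  };
}
```

Calling `resolveConfig({})` yields the local defaults from the start of this page: port 4100, SQLite at data/punk.db, a 10-second learning tick, and an embedded worker.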
Provider Modes
| Setup | Behavior |
|---|---|
| No matching live key | Mock provider, fully offline for that model family. |
| OPENAI_API_KEY set | Live OpenAI backend for gpt-*, o*, and unqualified default model ids. |
| ANTHROPIC_API_KEY set | Live Anthropic Messages backend for claude-* models and /v1/messages. |
| OPENROUTER_API_KEY set | Live OpenRouter backend for provider/model slugs such as deepseek/deepseek-v4-pro, moonshotai/kimi-k2.7-code, or google/gemini-*. |
| DEEPSEEK_API_KEY set | Live direct DeepSeek backend for deepseek-* or deepseek:* model ids. |
| MOONSHOT_API_KEY or KIMI_API_KEY set | Live direct Moonshot/Kimi backend for kimi-*, kimi:*, moonshot-*, or moonshot:* model ids. |
| GOOGLE_GEMINI_API_KEY or GEMINI_API_KEY set | Live direct Gemini backend for gemini-* or gemini:* model ids. |
| PUNK_PROVIDER=mock | Forces mock even if an API key is present. |
Use mock mode for docs, demos, CI, and deterministic local testing.
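
The table can be read as a two-step decision: map the model id to a provider family, then check whether that family is live. A minimal sketch under stated assumptions: `familyForModel` and `providerMode` are hypothetical names, and the exact patterns and their precedence (for example, treating any id containing `/` as an OpenRouter slug) are guesses made for illustration, not Punk's registry logic.

```typescript
type Family = "openai" | "anthropic" | "openrouter" | "deepseek" | "moonshot" | "gemini";

// Map a model id to the family the table associates with it.
// Patterns and ordering are assumptions for illustration.
function familyForModel(model: string): Family {
  if (model.includes("/")) return "openrouter"; // provider/model slug
  if (/^claude-/.test(model)) return "anthropic";
  if (/^deepseek[-:]/.test(model)) return "deepseek";
  if (/^(kimi|moonshot)[-:]/.test(model)) return "moonshot";
  if (/^gemini[-:]/.test(model)) return "gemini";
  return "openai"; // gpt-*, o*, and unqualified default model ids
}

// Decide whether a request is served live or by the offline mock.
// `liveFamilies` holds the families that have a usable key.
function providerMode(
  model: string,
  liveFamilies: ReadonlySet<Family>,
  forcedProvider?: string, // value of PUNK_PROVIDER, if any
): "live" | "mock" {
  if (forcedProvider === "mock") return "mock"; // PUNK_PROVIDER=mock always wins
  return liveFamilies.has(familyForModel(model)) ? "live" : "mock";
}
```

With only an OpenAI key configured, `providerMode("gpt-4o", new Set<Family>(["openai"]))` is live, while a claude-* model resolves to mock for that family.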
Provider Keys (BYOK)
Tenants can bring their own provider API keys instead of (or alongside) the gateway's env keys. A tenant key is a credential in the encrypted vault whose name is exactly the provider family (openai, anthropic, openrouter, deepseek, moonshot, or gemini), with provider set to the same value and secret set to { "value": "<api key>" }:
curl -X POST http://localhost:4100/api/v1/credentials \
-H 'content-type: application/json' \
-d '{ "name": "openai", "provider": "openai", "secret": { "value": "sk-…" } }'
Behavior:
- Live calls for that tenant use the tenant's key; the run's trace records keySource: "tenant" (vs "platform") on model.requested, and the route explanation says so.
- Tenants without a key use the platform env key, unchanged.
- A tenant key takes a family live even when the platform has no env key for it; PUNK_PROVIDER=mock still forces mock for everyone.
- Resolution is cached for 60 seconds; storing or deleting a key through the API takes effect immediately.
- Keys are AES-256-GCM encrypted at rest and never returned by the API. Manage them in the dashboard under Governance → Provider keys (one-time entry, masked after).
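
The precedence in the list above can be sketched as a single function. This is an illustration only: `resolveProviderKey` and `KeyResolution` are hypothetical names, and the 60-second cache is left out for brevity.

```typescript
type KeySource = "tenant" | "platform";

interface KeyResolution {
  mode: "live" | "mock";
  keySource?: KeySource; // what the trace would record on model.requested
  apiKey?: string;
}

// Hypothetical resolver for one provider family.
function resolveProviderKey(
  tenantKey: string | undefined,      // BYOK credential from the vault, if stored
  platformKey: string | undefined,    // gateway env key, if set
  forcedProvider: string | undefined, // value of PUNK_PROVIDER
): KeyResolution {
  // PUNK_PROVIDER=mock forces mock for everyone, tenant key or not.
  if (forcedProvider === "mock") return { mode: "mock" };
  // A tenant key takes the family live even without a platform key.
  if (tenantKey) return { mode: "live", keySource: "tenant", apiKey: tenantKey };
  if (platformKey) return { mode: "live", keySource: "platform", apiKey: platformKey };
  return { mode: "mock" };
}
```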
Provider Failover
When the primary provider for a model hard-fails, Punk transparently fails over to a backup instead of erroring to the customer. The request follows a failover chain built per request:
- Primary: the provider the registry picks for the requested model (honoring the tenant's BYOK key).
- Cross-family backup: for OpenAI/Anthropic families, a configured provider in the other family serves a mapped sibling model. Specialist and OpenRouter model ids can still fall back to the mock backstop when policy allows, but Punk does not silently translate a DeepSeek/Kimi slug to a different commercial provider unless the route records that substitution.
- Mock backstop: the deterministic mock, appended last when PUNK_FAILOVER_TO_MOCK allows it, so a request never dies for lack of a provider.
The route still reports live; the failover is surfaced in the run's route explanation (failover.attempts + failover.servedBy), a model.failover trace event, and the dashboard's Overview/Runs/run-detail failover badges.
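
The per-request chain described above can be sketched as an ordered list of attempts. This is a minimal illustration, assuming hypothetical `Attempt` and `buildFailoverChain` names. Having the mock attempt reuse the requested model name is also an assumption.

```typescript
interface Attempt {
  provider: string; // e.g. "openai", "anthropic", "mock"
  model: string;
  role: "primary" | "cross-family" | "mock-backstop";
}

interface Target {
  provider: string;
  model: string;
}

// Build the ordered chain: primary, optional cross-family backup, optional mock.
function buildFailoverChain(
  primary: Target,
  backup: Target | undefined, // undefined when no other-family provider is configured
  allowMock: boolean,         // the resolved PUNK_FAILOVER_TO_MOCK decision
): Attempt[] {
  const chain: Attempt[] = [
    { provider: primary.provider, model: primary.model, role: "primary" },
  ];
  if (backup) {
    chain.push({ provider: backup.provider, model: backup.model, role: "cross-family" });
  }
  if (allowMock) {
    // Appended last, so it only serves when every real provider has failed.
    chain.push({ provider: "mock", model: primary.model, role: "mock-backstop" });
  }
  return chain;
}
```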
Cross-family model map
A backup serves the equivalent model in the other family, not the requested model name (which the backup wouldn't recognize):
| Requested | Backup serves |
|---|---|
| gpt-4o | claude-sonnet-4-6 |
| gpt-4o-mini | claude-haiku-4-5 |
| gpt-4.1 | claude-sonnet-4-6 |
| claude-sonnet-4-6 | gpt-4o |
| claude-haiku-4-5 | gpt-4o-mini |
| claude-opus-4-8 | gpt-4o |
Unmapped models fall back to the family default sibling (claude-sonnet-4-6 / gpt-4o). The served model determines cost (estimateCost runs on the model that actually answered).
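
As a lookup, the map and its family-default fallback might look like the sketch below. `backupModelFor` is a hypothetical name, and the patterns used to recognize an OpenAI or Anthropic model id are assumptions. Ids from other families return `undefined`, which matches the rule that specialist slugs are not silently translated.

```typescript
// Explicit pairs from the table above.
const CROSS_FAMILY_MAP = new Map<string, string>([
  ["gpt-4o", "claude-sonnet-4-6"],
  ["gpt-4o-mini", "claude-haiku-4-5"],
  ["gpt-4.1", "claude-sonnet-4-6"],
  ["claude-sonnet-4-6", "gpt-4o"],
  ["claude-haiku-4-5", "gpt-4o-mini"],
  ["claude-opus-4-8", "gpt-4o"],
]);

// Return the sibling model a cross-family backup would serve, if any.
function backupModelFor(requested: string): string | undefined {
  const mapped = CROSS_FAMILY_MAP.get(requested);
  if (mapped !== undefined) return mapped;
  // Unmapped models fall back to the family default sibling.
  if (/^claude-/.test(requested)) return "gpt-4o";
  if (/^(gpt-|o\d)/.test(requested)) return "claude-sonnet-4-6"; // pattern is an assumption
  return undefined; // other families: no cross-family sibling
}
```

Because cost follows the served model, a caller would estimate cost on the value returned here rather than on the requested id.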
What triggers failover
| Error | Fails over? |
|---|---|
| Network / connection error (no HTTP status) | Yes |
| Timeout (PUNK_PROVIDER_TIMEOUT_MS exceeded) | Yes |
| HTTP 5xx | Yes |
| HTTP 429 (rate limit) | Yes; try a backup family that isn't throttled |
| HTTP 400 (invalid request) | No; would fail identically downstream |
| HTTP 401 / 403 (auth) | No; caller's/config's fault, surfaced, never looped (a bad BYOK key included) |
A non-retriable error, or an exhausted chain, finalizes the run as failed (with the run's run.completed event and route explanation intact, so the run is never abandoned). When PUNK_FAILOVER_TO_MOCK=false and all real providers fail, the last real provider's error is returned.
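
The trigger table reduces to a small predicate. A minimal sketch, assuming a hypothetical `ProviderError` shape. The table does not cover 4xx statuses other than 400, 401, 403, and 429, so treating the rest as non-retriable is an assumption.

```typescript
interface ProviderError {
  status?: number;    // HTTP status; undefined for network/connection errors
  timedOut?: boolean; // true when PUNK_PROVIDER_TIMEOUT_MS was exceeded
}

// Decide whether the chain should advance to the next attempt.
function shouldFailOver(err: ProviderError): boolean {
  if (err.timedOut) return true;
  if (err.status === undefined) return true; // no HTTP status: network error
  if (err.status === 429) return true;       // rate limit: try another family
  if (err.status >= 500 && err.status <= 599) return true;
  return false; // 400, 401, 403, and (by assumption) any other status
}
```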
Mock backstop gating
Silently returning simulated content in production can be worse than a clean error, so the mock backstop is a deliberate, gated choice. PUNK_FAILOVER_TO_MOCK defaults to true only when no real provider is configured for the request (open dev); once a platform key or tenant BYOK key exists it defaults to false. Set it to true explicitly to opt back in.
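
The gating rule can be written as one function. A minimal sketch, assuming a hypothetical `failoverToMockAllowed` helper. Treating any value other than true or false as auto is an assumption.

```typescript
// `setting` is the raw PUNK_FAILOVER_TO_MOCK value; `hasRealProvider` is true
// when a platform key or tenant BYOK key exists for the request.
function failoverToMockAllowed(setting: string | undefined, hasRealProvider: boolean): boolean {
  const value = (setting ?? "auto").trim().toLowerCase();
  if (value === "true") return true;   // explicit value always wins
  if (value === "false") return false;
  return !hasRealProvider;             // auto: only in no-real-key setups
}
```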
Streaming boundary
Failover for streaming applies only before the first byte reaches the client: Punk establishes the stream across the chain (pulling the first chunk), and if that errors retriably it advances to the next attempt. Once the first chunk is in hand, Punk commits to that provider; a later mid-stream error finalizes the run failed (you can't un-send bytes).
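
The commit-on-first-chunk rule can be modeled with iterators. The sketch below is a synchronous simplification (real provider streams are asynchronous), and `establishStream`, `replay`, and `StreamFactory` are hypothetical names. It shows only the boundary: failover is possible while pulling the first chunk and not afterwards.

```typescript
// Each attempt is a function that opens a stream of text chunks.
type StreamFactory = () => Iterator<string>;

interface EstablishedStream {
  servedBy: number;         // index of the attempt that produced the first chunk
  chunks: Iterable<string>; // first chunk plus the rest of that provider's stream
}

// Yield the already-pulled first chunk, then drain the same provider.
// A mid-stream error propagates to the caller; there is no failover here.
function* replay(first: IteratorResult<string>, rest: Iterator<string>): Generator<string> {
  let current = first;
  while (!current.done) {
    yield current.value;
    current = rest.next();
  }
}

function establishStream(
  attempts: StreamFactory[],
  isRetriable: (err: unknown) => boolean,
): EstablishedStream {
  let lastError: unknown = new Error("empty failover chain");
  for (let i = 0; i < attempts.length; i++) {
    try {
      const stream = attempts[i]();
      const first = stream.next(); // pull the first chunk before committing
      return { servedBy: i, chunks: replay(first, stream) };
    } catch (err) {
      lastError = err;
      if (!isRetriable(err)) break; // non-retriable: surface immediately
    }
  }
  throw lastError;
}
```

If the first attempt throws before yielding anything, the second attempt serves the whole response. If it throws after its first chunk, the consumer sees the error and the run is finalized as failed.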
Auth Modes
Open Dev Mode
When PUNK_API_KEY is unset:
- Headerless requests are accepted.
- Requests resolve to the default tenant.
- Requests run with admin privileges.
- Private web fetches are allowed for local demo/fixture use.
Do not expose open dev mode publicly.
Protected Mode
When PUNK_API_KEY is set:
- Protected routes require bearer auth.
- The bootstrap token is admin.
- You can create tenant API keys through /api/v1/keys.
- Private web fetches are blocked unless PUNK_ALLOW_PRIVATE_WEB_FETCH=true.
Login Mode
Login mode turns on automatically once any user exists, or when PUNK_REQUIRE_LOGIN=true:
- Private web fetches are blocked unless PUNK_ALLOW_PRIVATE_WEB_FETCH=true.
- Headerless and cookieless /api/v1/* requests return 401.
- The dashboard at / serves the login page until a punk_session cookie resolves (the page is always available at /login).
- /docs, /health, the marketing splash, and /v1/* gateway endpoints keep their existing rules; gateway traffic stays API-key territory.
- Bearer tokens keep working unchanged; sessions and keys coexist.
Bootstrap the first admin from the environment (idempotent on every boot; it never resets a changed password):
| Variable | Purpose |
|---|---|
| PUNK_ADMIN_EMAIL | Email of the bootstrap admin user. |
| PUNK_ADMIN_PASSWORD | Password for the bootstrap admin (argon2id-hashed at rest). |
| PUNK_REQUIRE_LOGIN | true forces login mode even with zero users. |
Sessions are HttpOnly cookies (SameSite=Lax, 30 days, Secure over HTTPS); the session token is shown once and stored only as a hash. Login attempts are rate-limited to 10 per minute per IP. Users, roles, and password resets are managed in the dashboard under Governance → Users, or via /api/v1/users (admin only).
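
The three modes and the private web fetch rule can be summarized in a short sketch. `authMode`, `privateWebFetchAllowed`, and `AuthInputs` are hypothetical names. Modeling the modes as mutually exclusive, with login mode checked first, is a simplification: in practice sessions and API keys coexist.

```typescript
type AuthMode = "open-dev" | "protected" | "login";

interface AuthInputs {
  apiKey?: string;               // PUNK_API_KEY
  requireLogin?: string;         // PUNK_REQUIRE_LOGIN
  userCount: number;             // number of users that exist
  allowPrivateWebFetch?: string; // PUNK_ALLOW_PRIVATE_WEB_FETCH
}

function authMode(inputs: AuthInputs): AuthMode {
  // Login mode turns on once any user exists, or when forced by the env.
  if (inputs.userCount > 0 || inputs.requireLogin === "true") return "login";
  if (inputs.apiKey) return "protected";
  return "open-dev";
}

// Private web fetches are allowed in open dev mode, otherwise only by opt-in.
function privateWebFetchAllowed(inputs: AuthInputs): boolean {
  return authMode(inputs) === "open-dev" || inputs.allowPrivateWebFetch === "true";
}
```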
Tenant API Keys
API keys can be:
- Admin or non-admin.
- observe or optimize.
- Pinned to an app id.
- Revoked.
The full token is returned once and stored only as a hash.
Tenant Settings
Settings are managed through /api/v1/settings.
| Setting | Purpose |
|---|---|
| retention_days | Tenant retention window. |
| redaction | Redact SDK tool/side-effect payloads before trace append. |
| webhook_url | Public webhook destination. |
| webhook_secret | HMAC/signing secret; not returned by GET. |
| approval_exception_ttl_hours | Duration for approved policy exceptions. |
| canary_enabled | Promotion enters canary rollout rather than full traffic. |
| model_substitutions | Map requested models to cheaper alternatives for shadow comparison and eventual serving. |
| model_substitution_enabled | Allow earned substitutions to serve once evidence gates pass. |
| semantic_cache | off, shadow, or serve; controls semantic-cache evidence and serving. |
| tripwire_action | alert (default: detect, signal, lower trust) or block (a fired tripwire also blocks the run). |
| streaming_dlp | true/false (default false). Mask PII/secrets in live response chunks at egress, not just the stored trace. |
| memory_quarantine | true/false (default false). Enforce trust-lane gating: low-trust memory influence cannot drive a high-impact action. |
| memory_quarantine_min_level | Integer 0–4 (default 3). Side-effect level at/above which memory quarantine bites. |
| cross_tenant_learning | true/false (default false). Opt in to anonymized cross-tenant aggregate learning. Shares only anonymized pattern shapes (hashed fingerprints) + success/savings rates (never prompts, outputs, messages, or your identity). The global signal informs prioritization only; your tenant's own evidence is still required before serving. |
Tripwires, streaming DLP, memory quarantine, and cross-tenant learning are all opt-in and default-off, so existing behavior is unchanged until you turn them on. See Governance for the governance items.
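
The documented defaults for the opt-in governance settings, and the way memory_quarantine_min_level gates an action, can be sketched as follows. `GovernanceSettings`, `resolveSettings`, and `quarantineBlocks` are hypothetical names, and the validation behavior (throwing a RangeError) is an assumption. The real settings are managed through /api/v1/settings.

```typescript
interface GovernanceSettings {
  tripwire_action: "alert" | "block";
  streaming_dlp: boolean;
  memory_quarantine: boolean;
  memory_quarantine_min_level: number; // integer 0 to 4
  cross_tenant_learning: boolean;
}

// Defaults as listed in the table above.
const DEFAULT_SETTINGS: GovernanceSettings = {
  tripwire_action: "alert",
  streaming_dlp: false,
  memory_quarantine: false,
  memory_quarantine_min_level: 3,
  cross_tenant_learning: false,
};

// Merge a partial update over the defaults and validate the level range.
function resolveSettings(patch: Partial<GovernanceSettings>): GovernanceSettings {
  const merged: GovernanceSettings = { ...DEFAULT_SETTINGS, ...patch };
  const level = merged.memory_quarantine_min_level;
  if (!Number.isInteger(level) || level < 0 || level > 4) {
    throw new RangeError("memory_quarantine_min_level must be an integer from 0 to 4");
  }
  return merged;
}

// Quarantine bites when it is enabled, the action was influenced by
// low-trust memory, and its side-effect level is at or above the minimum.
function quarantineBlocks(
  settings: GovernanceSettings,
  sideEffectLevel: number,
  lowTrustInfluence: boolean,
): boolean {
  return (
    settings.memory_quarantine &&
    lowTrustInfluence &&
    sideEffectLevel >= settings.memory_quarantine_min_level
  );
}
```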
Database Choice
Use SQLite for:
- Local development.
- Demos.
- Single-process prototypes.
Use Postgres for:
- Deployed environments.
- Separate worker processes.
- Multi-instance API.
- More reliable job processing.
Policy Directory
Default policies are read from policies/.
Policy files:
- Must end in .yaml or .yml.
- Are loaded in sorted order.
- Are skipped with a warning when invalid.
- Can be hot-reloaded by the governance runtime when invoked.
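
The loading rules above can be sketched as a pure function over file names and contents. `loadPolicies` and `PolicyFile` are hypothetical names, the parser is injected so any YAML library could be plugged in, and plain code-unit ordering is an assumption about what "sorted order" means.

```typescript
interface PolicyFile {
  name: string; // file name inside the policy directory
  text: string; // raw file contents
}

// Keep only .yaml/.yml files, process them in sorted order, and skip
// files the parser rejects, reporting each one through `warn`.
function loadPolicies<T>(
  files: ReadonlyArray<PolicyFile>,
  parse: (text: string) => T,
  warn: (message: string) => void,
): T[] {
  const candidates = files
    .filter((file) => /\.ya?ml$/.test(file.name))
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
  const loaded: T[] = [];
  for (const file of candidates) {
    try {
      loaded.push(parse(file.text));
    } catch (err) {
      warn(`skipping invalid policy ${file.name}: ${String(err)}`);
    }
  }
  return loaded;
}
```

Because the function does no caching, calling it again with fresh directory contents is one simple way to model a reload.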