Environment variables, auth modes, database choices, and settings.

Configuration

Punk is configured through environment variables and tenant settings.

Local Defaults

bun install
bun run dev

Default local behavior:

  • Port: 4100.
  • Database: data/punk.db.
  • Provider: offline mock simulator when no matching live provider key is available.
  • Auth: open dev mode when PUNK_API_KEY is absent.
  • Worker: embedded in the API process.
  • Learning tick: every 10 seconds.
  • Web fetch runtime: built-in page-to-context adapter.
  • Governance: embedded policy engine with YAML policies.
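The defaults above can be read as simple fallbacks on the environment. The sketch below is illustrative only: `LocalConfig`, `resolveLocalConfig`, and the field names are hypothetical and are not Punk's internal config type. It covers only the defaults that map directly to a documented variable.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical shape for illustration; not Punk's real config type.
interface LocalConfig {
  port: number;
  dbPath: string;
  openDevAuth: boolean;
  embeddedWorker: boolean;
  learnIntervalMs: number;
}

// Read each value from the environment, falling back to the documented default.
function resolveLocalConfig(env: Env): LocalConfig {
  return {
    port: Number(env.PUNK_PORT ?? "4100"),
    dbPath: env.PUNK_DB_PATH ?? "data/punk.db",
    // Open dev mode applies when no bootstrap token is configured.
    openDevAuth: !env.PUNK_API_KEY,
    // The worker stays embedded unless explicitly switched off.
    embeddedWorker: env.PUNK_EMBEDDED_WORKER !== "false",
    learnIntervalMs: Number(env.PUNK_LEARN_INTERVAL_MS ?? "10000"),
  };
}
```

Passing the environment as a parameter (rather than reading a global) keeps the sketch easy to exercise with different inputs.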

Environment Variables

| Variable | Default | Purpose |
| --- | --- | --- |
| PUNK_PORT | 4100 | HTTP port. |
| PUNK_HOST | Bun default | Optional listen host. Use 127.0.0.1 when Punk sits behind Caddy, Nginx, or an SSH tunnel. |
| PUNK_API_KEY | unset | Bootstrap bearer token. When set, /api/* and /v1/* require auth. |
| PUNK_DATABASE_URL | unset | Postgres/Neon-compatible database URL. Overrides SQLite. Serverless deployments also accept the DATABASE_URL a Neon integration injects (PUNK_DATABASE_URL wins when both are set). |
| PUNK_DB_PATH | data/punk.db | SQLite database path. |
| PUNK_PROVIDER | inferred | mock; forces offline behavior when set. |
| OPENAI_API_KEY | unset | Platform key for live OpenAI models. |
| OPENAI_BASE_URL | OpenAI default | Optional OpenAI-compatible base URL. |
| ANTHROPIC_API_KEY | unset | Platform key for the native Anthropic backend (claude-* models). |
| ANTHROPIC_BASE_URL | Anthropic default | Optional Anthropic-compatible base URL. |
| OPENROUTER_API_KEY | unset | Platform key for OpenRouter model slugs, including DeepSeek, Kimi/Moonshot, Google, Anthropic, OpenAI, Qwen, and other routed providers. |
| OPENROUTER_BASE_URL | https://openrouter.ai/api | OpenRouter base URL. https://openrouter.ai/api/v1 is also accepted. |
| OPENROUTER_SITE_URL | PUNK_APP_BASE_URL | Optional attribution header sent to OpenRouter. |
| OPENROUTER_APP_NAME | Punk Chorus | Optional OpenRouter app title header. |
| PUNK_CHORUS_SOTA_SYNTHESIS_MODEL | inferred from configured providers | Default final answer model for maximum-quality Chorus runs. |
| PUNK_CHORUS_SOTA_PANEL_MODELS | inferred from configured providers | Comma-separated default candidate panel models for maximum-quality Chorus SOTA mix runs. |
| PUNK_CHORUS_AGENT_MODEL | claude-sonnet-4-6 | Default delegate model for Chorus Anthropic tool-declaring agent steps, including Claude Code launched with punk/chorus. Request-level chorus_agent_model and the tenant setting override it. |
| DEEPSEEK_API_KEY | unset | Direct DeepSeek OpenAI-compatible backend for deepseek-* or deepseek:* model ids. |
| DEEPSEEK_BASE_URL | https://api.deepseek.com | Direct DeepSeek base URL. |
| MOONSHOT_API_KEY / KIMI_API_KEY | unset | Direct Moonshot/Kimi OpenAI-compatible backend for kimi-*, kimi:*, moonshot-*, or moonshot:* model ids. |
| MOONSHOT_BASE_URL / KIMI_BASE_URL | https://api.moonshot.ai | Direct Moonshot/Kimi base URL. |
| GOOGLE_GEMINI_API_KEY / GEMINI_API_KEY | unset | Direct Gemini OpenAI-compatible backend for gemini-* or gemini:* model ids. |
| GEMINI_BASE_URL | https://generativelanguage.googleapis.com/v1beta/openai | Direct Gemini OpenAI-compatible base URL. |
| PUNK_PROVIDER_TIMEOUT_MS | 60000 | Per-call provider timeout. A hung primary aborts at this bound (AbortController) and triggers failover. 0 disables the timeout. See Provider Failover. |
| PUNK_FAILOVER_TO_MOCK | auto | Whether the deterministic mock may serve as the final failover backstop. Auto = true in open-dev/no-real-key setups, false once a real provider is configured. Explicit true/false always wins. See Provider Failover. |
| PUNK_ENCRYPTION_KEY | dev-only fallback | 32-byte base64 key for the credentials vault (AES-256-GCM). Authenticated deployments must set it before storing credentials; open dev mode may use the dev fallback. |
| PUNK_DOCS_DIR | docs | Public docs markdown source directory. |
| PUNK_MARKETING_HOST | punktechnologies.com in entrypoints | Host that serves the marketing splash at /; set to an empty value when embedding Punk somewhere that should not host-split marketing. |
| PUNK_MEET_HOST | unset | Optional explicit host that serves the product presentation deck at /. Hosts beginning with meet. serve the deck automatically; the deck is also available at /meet for preview. |
| PUNK_APP_HOST | app.punktechnologies.com in entrypoints | Dashboard host used by the marketing splash's /app redirect. |
| PUNK_POLICIES_DIR | policies | Directory containing policy YAML. |
| PUNK_AUTO_PROMOTE | false | Allows hands-free promotion for side-effect-free artifacts that pass gates. |
| PUNK_LEARN_INTERVAL_MS | 10000 | Learning tick enqueue cadence. |
| PUNK_EMBEDDED_WORKER | true | Set to false when the API should not start its embedded job worker because one or more standalone bun run worker processes are running. |
| PUNK_WORKER_POLL_MS | 500 | Embedded/standalone worker poll interval. |
| PUNK_WORKER_CONCURRENCY | 2 | Standalone worker claim-loop concurrency. |
| PUNK_CRON_SECRET | unset | Secret-gates /api/v1/internal/tick; unset leaves the endpoint 404/off. |
| CRON_SECRET | unset | Vercel-provided bearer secret for cron invocations; set equal to PUNK_CRON_SECRET. |
| PUNK_RETENTION_DAYS | 90 | Retention sweep window. |
| PUNK_RATE_LIMIT_RPM | 300 | /api/v1/* requests per caller per minute; 0 disables. |
| PUNK_CHAT_RATE_LIMIT_RPM | 600 | /v1/* gateway requests per caller per minute; 0 disables. |
| PUNK_ALLOW_PRIVATE_WEB_FETCH | false | Allows web fetch to private/loopback addresses outside true open dev mode. |
| PUNK_ALLOW_PRIVATE_WEBHOOKS | false | Allows webhook URLs to private addresses. Always off unless explicitly true. |
| PUNK_ALLOW_PUBLIC_SIGNUP | false | Enables open self-serve signup and email verification. |
| PUNK_APP_BASE_URL | https://app.punktechnologies.com | Base URL used in invite and verification links. |
| RESEND_API_KEY | unset | Sends email through Resend; unset uses console transport. |
| PUNK_EMAIL_FROM | Punk <noreply@punktechnologies.com> | Sender identity for Resend email. |
| PUNK_BILLING_DISABLED | false | true → all plan quotas unlimited (self-host). Open dev mode bypasses quotas regardless. See Billing & Usage. |
| STRIPE_SECRET_KEY | unset | Enables the Stripe Checkout / portal / webhook billing endpoints. Unset → plan changes apply directly (free-plan complete). |
| STRIPE_WEBHOOK_SECRET | unset | Stripe webhook signature signing secret. |
| STRIPE_PRICE_PRO | unset | Stripe price id used as the Pro checkout line item. |
| STRIPE_PRICE_ENTERPRISE | unset | Stripe price id for Enterprise (if sold self-serve). |
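The database precedence described for PUNK_DATABASE_URL, DATABASE_URL, and PUNK_DB_PATH can be sketched as a small resolver. This is a simplified illustration, not Punk's code: `resolveDatabase` and `DatabaseTarget` are hypothetical names, and the sketch accepts DATABASE_URL everywhere, whereas the docs say it is honored only in serverless deployments.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical result type for illustration.
type DatabaseTarget =
  | { kind: "postgres"; url: string }
  | { kind: "sqlite"; path: string };

function resolveDatabase(env: Env): DatabaseTarget {
  // PUNK_DATABASE_URL wins when both URLs are set.
  // Assumption: DATABASE_URL is accepted unconditionally in this sketch.
  const url = env.PUNK_DATABASE_URL || env.DATABASE_URL;
  if (url) return { kind: "postgres", url };
  // No URL configured: fall back to SQLite at the configured or default path.
  return { kind: "sqlite", path: env.PUNK_DB_PATH || "data/punk.db" };
}
```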

Provider Modes

| Setup | Behavior |
| --- | --- |
| No matching live key | Mock provider, fully offline for that model family. |
| OPENAI_API_KEY set | Live OpenAI backend for gpt-*, o*, and unqualified default model ids. |
| ANTHROPIC_API_KEY set | Live Anthropic Messages backend for claude-* models and /v1/messages. |
| OPENROUTER_API_KEY set | Live OpenRouter backend for provider/model slugs such as deepseek/deepseek-v4-pro, moonshotai/kimi-k2.7-code, or google/gemini-*. |
| DEEPSEEK_API_KEY set | Live direct DeepSeek backend for deepseek-* or deepseek:* model ids. |
| MOONSHOT_API_KEY or KIMI_API_KEY set | Live direct Moonshot/Kimi backend for kimi-*, kimi:*, moonshot-*, or moonshot:* model ids. |
| GOOGLE_GEMINI_API_KEY or GEMINI_API_KEY set | Live direct Gemini backend for gemini-* or gemini:* model ids. |
| PUNK_PROVIDER=mock | Forces mock even if an API key is present. |

Use mock mode for docs, demos, CI, and deterministic local testing.
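One way to picture the table above is as a two-step lookup: classify the model id into a provider family, then check whether that family has a key. The sketch below is an approximation under stated assumptions: `familyFor`, `providerMode`, and `KEY_VARS` are hypothetical names, any id containing a slash is treated as an OpenRouter slug, and the real registry may apply additional rules (for example, tenant BYOK keys).

```typescript
type Env = Record<string, string | undefined>;
type Family = "openai" | "anthropic" | "openrouter" | "deepseek" | "moonshot" | "gemini";

// Environment variables that make each family live, per the table above.
const KEY_VARS: Record<Family, string[]> = {
  openai: ["OPENAI_API_KEY"],
  anthropic: ["ANTHROPIC_API_KEY"],
  openrouter: ["OPENROUTER_API_KEY"],
  deepseek: ["DEEPSEEK_API_KEY"],
  moonshot: ["MOONSHOT_API_KEY", "KIMI_API_KEY"],
  gemini: ["GOOGLE_GEMINI_API_KEY", "GEMINI_API_KEY"],
};

// Classify a model id by its prefix or slug form.
function familyFor(model: string): Family {
  if (model.includes("/")) return "openrouter"; // assumption: provider/model slug
  if (/^claude-/.test(model)) return "anthropic";
  if (/^deepseek[-:]/.test(model)) return "deepseek";
  if (/^(kimi|moonshot)[-:]/.test(model)) return "moonshot";
  if (/^gemini[-:]/.test(model)) return "gemini";
  return "openai"; // gpt-*, o*, and unqualified default ids
}

// Returns the live family, or "mock" when forced or when no matching key exists.
function providerMode(model: string, env: Env): Family | "mock" {
  if (env.PUNK_PROVIDER === "mock") return "mock";
  const family = familyFor(model);
  const hasKey = KEY_VARS[family].some((name) => !!env[name]);
  return hasKey ? family : "mock";
}
```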

Provider Keys (BYOK)

Tenants can bring their own provider API keys instead of (or alongside) the gateway's env keys. A tenant key is a credential in the encrypted vault named exactly after the provider family (openai, anthropic, openrouter, deepseek, moonshot, or gemini) with provider set to the same value and secret { "value": "<api key>" }:

curl -X POST http://localhost:4100/api/v1/credentials \
  -H 'content-type: application/json' \
  -d '{ "name": "openai", "provider": "openai", "secret": { "value": "sk-…" } }'

Behavior:

  • Live calls for that tenant use the tenant's key; the run's trace records keySource: "tenant" (vs "platform") on model.requested, and the route explanation says so.
  • Tenants without a key use the platform env key, unchanged.
  • A tenant key takes a family live even when the platform has no env key for it; PUNK_PROVIDER=mock still forces mock for everyone.
  • Resolution is cached for 60 seconds; storing or deleting a key through the API takes effect immediately.
  • Keys are AES-256-GCM encrypted at rest and never returned by the API. Manage them in the dashboard under Governance → Provider keys (one-time entry, masked after).
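The key-resolution rules above can be sketched in two parts: choosing the key source, and caching tenant lookups for 60 seconds with explicit invalidation. Everything here is hypothetical (`resolveKeySource`, `TenantKeyCache`, and the `load` callback that stands in for the vault lookup); it models the documented behavior rather than reproducing Punk's implementation.

```typescript
type KeySource = "tenant" | "platform" | "mock";

function resolveKeySource(opts: {
  forceMock: boolean; // PUNK_PROVIDER=mock
  tenantKey?: string;
  platformKey?: string;
}): KeySource {
  if (opts.forceMock) return "mock"; // forces mock for everyone
  if (opts.tenantKey) return "tenant"; // a BYOK key takes the family live
  if (opts.platformKey) return "platform";
  return "mock";
}

class TenantKeyCache {
  private readonly ttlMs: number;
  private readonly entries = new Map<string, { key: string | undefined; expiresAt: number }>();

  constructor(ttlMs: number = 60_000) {
    this.ttlMs = ttlMs;
  }

  // `now` is injected so the TTL can be exercised without waiting.
  get(tenantId: string, now: number, load: () => string | undefined): string | undefined {
    const hit = this.entries.get(tenantId);
    if (hit && hit.expiresAt > now) return hit.key;
    const key = load();
    this.entries.set(tenantId, { key, expiresAt: now + this.ttlMs });
    return key;
  }

  // Called when a key is stored or deleted through the API,
  // so the change takes effect immediately.
  invalidate(tenantId: string): void {
    this.entries.delete(tenantId);
  }
}
```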

Provider Failover

When the primary provider for a model hard-fails, Punk transparently fails over to a backup instead of erroring to the customer. The request follows a failover chain built per request:

  1. Primary: the provider the registry picks for the requested model (honoring the tenant's BYOK key).
  2. Cross-family backup: for OpenAI/Anthropic families, a configured provider in the other family serves a mapped sibling model. Specialist and OpenRouter model ids can still fall back to the mock backstop when policy allows, but Punk does not silently translate a DeepSeek/Kimi slug to a different commercial provider unless the route records that substitution.
  3. Mock backstop: the deterministic mock, appended last when PUNK_FAILOVER_TO_MOCK allows it, so a request never dies for lack of a provider.

The route still reports live; the failover is surfaced in the run's route explanation (failover.attempts + failover.servedBy), a model.failover trace event, and the dashboard's Overview/Runs/run-detail failover badges.
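As a rough illustration of the three steps, the chain could be assembled like this. The names (`Attempt`, `buildFailoverChain`) and the choice to pass the requested model to the mock are assumptions made for the sketch; only the ordering and the gating conditions come from the description above.

```typescript
interface Attempt {
  provider: string;
  model: string;
}

function buildFailoverChain(opts: {
  primary: Attempt;
  configured: string[]; // provider families that currently have a usable key
  siblingModel?: string; // mapped model in the other family, if one exists
  allowMock: boolean; // resolved value of PUNK_FAILOVER_TO_MOCK
}): Attempt[] {
  // 1. Primary always comes first.
  const chain: Attempt[] = [opts.primary];

  // 2. Cross-family backup applies only between the OpenAI and Anthropic families.
  const other =
    opts.primary.provider === "openai" ? "anthropic" :
    opts.primary.provider === "anthropic" ? "openai" :
    undefined;
  if (other && opts.siblingModel && opts.configured.includes(other)) {
    chain.push({ provider: other, model: opts.siblingModel });
  }

  // 3. Mock backstop is appended last, and only when allowed.
  if (opts.allowMock) {
    chain.push({ provider: "mock", model: opts.primary.model });
  }
  return chain;
}
```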

Cross-family model map

A backup serves the equivalent model in the other family, not the requested model name (which the backup wouldn't recognize):

| Requested | Backup serves |
| --- | --- |
| gpt-4o | claude-sonnet-4-6 |
| gpt-4o-mini | claude-haiku-4-5 |
| gpt-4.1 | claude-sonnet-4-6 |
| claude-sonnet-4-6 | gpt-4o |
| claude-haiku-4-5 | gpt-4o-mini |
| claude-opus-4-8 | gpt-4o |

Unmapped models fall back to the family default sibling (claude-sonnet-4-6 / gpt-4o). The served model determines cost (estimateCost runs on the model that actually answered).
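The map and its default can be expressed as a lookup with a fallback. `SIBLINGS` and `backupModel` are hypothetical names, and the sketch assumes it is only called with OpenAI-family or Anthropic-family model ids.

```typescript
// Cross-family siblings, copied from the table above.
const SIBLINGS: Record<string, string> = {
  "gpt-4o": "claude-sonnet-4-6",
  "gpt-4o-mini": "claude-haiku-4-5",
  "gpt-4.1": "claude-sonnet-4-6",
  "claude-sonnet-4-6": "gpt-4o",
  "claude-haiku-4-5": "gpt-4o-mini",
  "claude-opus-4-8": "gpt-4o",
};

function backupModel(requested: string): string {
  const mapped = SIBLINGS[requested];
  if (mapped) return mapped;
  // Unmapped: use the other family's default sibling.
  // Assumption: anything not starting with "claude-" is OpenAI-family here.
  return requested.startsWith("claude-") ? "gpt-4o" : "claude-sonnet-4-6";
}
```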

What triggers failover

| Error | Fails over? |
| --- | --- |
| Network / connection error (no HTTP status) | Yes |
| Timeout (PUNK_PROVIDER_TIMEOUT_MS exceeded) | Yes |
| HTTP 5xx | Yes |
| HTTP 429 (rate limit) | Yes; try a backup family that isn't throttled |
| HTTP 400 (invalid request) | No; would fail identically downstream |
| HTTP 401 / 403 (auth) | No; caller's/config's fault, surfaced, never looped (a bad BYOK key included) |

A non-retriable error, or an exhausted chain, finalizes the run as failed (with the run's run.completed event and route explanation intact, so the run is never abandoned). When PUNK_FAILOVER_TO_MOCK=false and all real providers fail, the last real provider's error is returned.
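The trigger table reduces to a small predicate. This is a sketch with hypothetical names (`ProviderError`, `failsOver`). The table does not list other 4xx statuses such as 404 or 422; treating them as non-retriable is an assumption of this example.

```typescript
// Hypothetical error shape: `status` is absent for network-level failures.
interface ProviderError {
  status?: number;
  timedOut?: boolean;
}

function failsOver(err: ProviderError): boolean {
  if (err.timedOut) return true; // PUNK_PROVIDER_TIMEOUT_MS exceeded
  if (err.status === undefined) return true; // network / connection error
  if (err.status === 429) return true; // rate limited: try another family
  if (err.status >= 500 && err.status <= 599) return true; // HTTP 5xx
  // 400, 401, 403 do not fail over.
  // Assumption: any other status is also treated as non-retriable.
  return false;
}
```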

Mock backstop gating

Silently returning simulated content in production can be worse than a clean error, so the mock backstop is a deliberate, gated choice. PUNK_FAILOVER_TO_MOCK defaults to true only when no real provider is configured for the request (open dev); once a platform key or tenant BYOK key exists it defaults to false. Set it explicitly to opt back in.
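The gating rule is a three-way decision: an explicit value always wins, and otherwise the default depends on whether a real provider exists for the request. A minimal sketch, where `failoverToMockAllowed` is a hypothetical name and `hasRealProvider` stands for "a platform key or tenant BYOK key is configured":

```typescript
function failoverToMockAllowed(
  setting: string | undefined, // raw PUNK_FAILOVER_TO_MOCK value
  hasRealProvider: boolean,
): boolean {
  if (setting === "true") return true; // explicit opt-in
  if (setting === "false") return false; // explicit opt-out
  // Auto (unset or any other value, by assumption): mock only when nothing real exists.
  return !hasRealProvider;
}
```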

Streaming boundary

Failover for streaming applies only before the first byte reaches the client: Punk establishes the stream across the chain (pulling the first chunk), and if that attempt fails with a retriable error it advances to the next attempt. Once the first chunk is in hand, Punk commits to that provider; a later mid-stream error finalizes the run as failed (you can't un-send bytes).
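The commit point can be illustrated with plain iterators. The sketch below uses synchronous iterators to stay short (real provider streams would be asynchronous), and `StreamFactory` and `establishStream` are hypothetical names. The key property is that only the first `next()` call sits inside the failover loop; later chunks are pulled lazily, so a mid-stream error reaches the consumer instead of triggering another attempt.

```typescript
type StreamFactory = () => Iterator<string>;

function establishStream(
  chain: StreamFactory[],
  isRetriable: (err: unknown) => boolean,
): { servedBy: number; chunks: Iterable<string> } {
  let lastError: unknown = new Error("empty failover chain");
  for (const [index, open] of chain.entries()) {
    try {
      const iterator = open();
      // Pull the first chunk before committing; this may throw.
      const first = iterator.next();
      const committed = function* (): Generator<string> {
        let step = first;
        while (!step.done) {
          yield step.value;
          // Errors here propagate to the consumer: no failover mid-stream.
          step = iterator.next();
        }
      };
      return { servedBy: index, chunks: committed() };
    } catch (err) {
      lastError = err;
      if (!isRetriable(err)) break; // non-retriable: stop walking the chain
    }
  }
  throw lastError;
}
```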

Auth Modes

Open Dev Mode

When PUNK_API_KEY is unset:

  • Headerless requests are accepted.
  • Requests resolve to the default tenant.
  • Requests run with admin privileges.
  • Private web fetches are allowed for local demo/fixture use.

Do not expose open dev mode publicly.

Protected Mode

When PUNK_API_KEY is set:

  • Protected routes require bearer auth.
  • The bootstrap token has admin privileges.
  • You can create tenant API keys through /api/v1/keys.
  • Private web fetches are blocked unless PUNK_ALLOW_PRIVATE_WEB_FETCH=true.

Login Mode

Login mode turns on automatically once any user exists, or when PUNK_REQUIRE_LOGIN=true:

  • Private web fetches are blocked unless PUNK_ALLOW_PRIVATE_WEB_FETCH=true.
  • Headerless and cookieless /api/v1/* requests return 401.
  • The dashboard at / serves the login page until a punk_session cookie resolves (the page is always available at /login).
  • /docs, /health, the marketing splash, and /v1/* gateway endpoints keep their existing rules; gateway traffic stays API-key territory.
  • Bearer tokens keep working unchanged; sessions and keys coexist.
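The private web fetch rule is the same across the three modes: allowed in true open dev mode, otherwise blocked unless explicitly enabled. A minimal sketch, assuming "true open dev mode" means no PUNK_API_KEY and login mode inactive; `loginModeActive` and `privateWebFetchAllowed` are hypothetical names, and `userCount` stands in for a lookup of existing users.

```typescript
type Env = Record<string, string | undefined>;

// Login mode turns on once any user exists, or when forced by the environment.
function loginModeActive(env: Env, userCount: number): boolean {
  return userCount > 0 || env.PUNK_REQUIRE_LOGIN === "true";
}

function privateWebFetchAllowed(env: Env, userCount: number): boolean {
  // The explicit flag enables private fetches in any mode.
  if (env.PUNK_ALLOW_PRIVATE_WEB_FETCH === "true") return true;
  // Otherwise only true open dev mode allows them (assumption stated above).
  return !env.PUNK_API_KEY && !loginModeActive(env, userCount);
}
```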

Bootstrap the first admin from the environment (idempotent on every boot; it never resets a changed password):

| Variable | Purpose |
| --- | --- |
| PUNK_ADMIN_EMAIL | Email of the bootstrap admin user. |
| PUNK_ADMIN_PASSWORD | Password for the bootstrap admin (argon2id-hashed at rest). |
| PUNK_REQUIRE_LOGIN | true forces login mode even with zero users. |

Sessions are HttpOnly cookies (SameSite=Lax, 30 days, Secure over HTTPS); the session token is shown once and stored only as a hash. Login attempts are rate-limited to 10 per minute per IP. Users, roles, and password resets are managed in the dashboard under Governance → Users, or via /api/v1/users (admin only).
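For reference, a Set-Cookie value with the attributes listed above might look like the output of the sketch below. `sessionCookie` is a hypothetical helper; `Path=/` and the use of `Max-Age` (rather than `Expires`) for the 30-day lifetime are assumptions, since the docs only state the attributes and duration.

```typescript
function sessionCookie(token: string, overHttps: boolean): string {
  const thirtyDaysInSeconds = 30 * 24 * 60 * 60;
  const parts = [
    `punk_session=${token}`,
    "Path=/", // assumption: cookie applies site-wide
    "HttpOnly",
    "SameSite=Lax",
    `Max-Age=${thirtyDaysInSeconds}`, // assumption: lifetime expressed via Max-Age
  ];
  // Secure is added only when the request arrived over HTTPS.
  if (overHttps) parts.push("Secure");
  return parts.join("; ");
}
```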

Tenant API Keys

API keys can be:

  • Admin or non-admin.
  • observe or optimize.
  • Pinned to an app id.
  • Revoked.

The full token is returned once and stored only as a hash.

Tenant Settings

Settings are managed through /api/v1/settings.

| Setting | Purpose |
| --- | --- |
| retention_days | Tenant retention window. |
| redaction | Redact SDK tool/side-effect payloads before trace append. |
| webhook_url | Public webhook destination. |
| webhook_secret | HMAC/signing secret; not returned by GET. |
| approval_exception_ttl_hours | Duration for approved policy exceptions. |
| canary_enabled | Promotion enters canary rollout rather than full traffic. |
| model_substitutions | Map requested models to cheaper alternatives for shadow comparison and eventual serving. |
| model_substitution_enabled | Allow earned substitutions to serve once evidence gates pass. |
| semantic_cache | off, shadow, or serve; controls semantic-cache evidence and serving. |
| tripwire_action | alert (default: detect, signal, lower trust) or block (a fired tripwire also blocks the run). |
| streaming_dlp | true/false (default false). Mask PII/secrets in live response chunks at egress, not just the stored trace. |
| memory_quarantine | true/false (default false). Enforce trust-lane gating: low-trust memory influence cannot drive a high-impact action. |
| memory_quarantine_min_level | Integer 0–4 (default 3). Side-effect level at/above which memory quarantine bites. |
| cross_tenant_learning | true/false (default false). Opt in to anonymized cross-tenant aggregate learning. Shares only anonymized pattern shapes (hashed fingerprints) + success/savings rates (never prompts, outputs, messages, or your identity). The global signal informs prioritization only; your tenant's own evidence is still required before serving. |

Tripwires, streaming DLP, memory quarantine, and cross-tenant learning are all opt-in and default-off, so existing behavior is unchanged until you turn them on. See Governance for the governance items.
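The governance-related defaults, and the way memory_quarantine interacts with memory_quarantine_min_level, can be sketched as follows. `GovernanceSettings`, `GOVERNANCE_DEFAULTS`, and `quarantineBlocks` are hypothetical names, and the boolean `lowTrustMemoryInfluence` is a simplification of whatever trust-lane signal Punk actually computes.

```typescript
// Subset of tenant settings, with the documented defaults.
interface GovernanceSettings {
  tripwire_action: "alert" | "block";
  streaming_dlp: boolean;
  memory_quarantine: boolean;
  memory_quarantine_min_level: number; // integer 0-4
  cross_tenant_learning: boolean;
}

const GOVERNANCE_DEFAULTS: GovernanceSettings = {
  tripwire_action: "alert",
  streaming_dlp: false,
  memory_quarantine: false,
  memory_quarantine_min_level: 3,
  cross_tenant_learning: false,
};

// Quarantine bites only when enabled, the action was influenced by low-trust
// memory, and the side-effect level is at or above the configured minimum.
function quarantineBlocks(
  settings: GovernanceSettings,
  sideEffectLevel: number,
  lowTrustMemoryInfluence: boolean,
): boolean {
  return (
    settings.memory_quarantine &&
    lowTrustMemoryInfluence &&
    sideEffectLevel >= settings.memory_quarantine_min_level
  );
}
```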

Database Choice

Use SQLite for:

  • Local development.
  • Demos.
  • Single-process prototypes.

Use Postgres for:

  • Deployed environments.
  • Separate worker processes.
  • Multi-instance API.
  • More reliable job processing.

Policy Directory

Default policies are read from policies/.

Policy files:

  • Must end in .yaml or .yml.
  • Are loaded in sorted order.
  • Are skipped with a warning if invalid.
  • Can be hot-reloaded by the governance runtime when invoked.
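The filename rules above amount to a filter and a sort. In this sketch `selectPolicyFiles` is a hypothetical helper; the extension match is case-sensitive and "sorted order" is taken to mean JavaScript's default string sort, both of which are assumptions. Validation and hot reload are out of scope.

```typescript
// Keep only .yaml/.yml names and return them in sorted order.
function selectPolicyFiles(names: string[]): string[] {
  return names.filter((name) => /\.ya?ml$/.test(name)).sort();
}
```

For example, a directory listing of `b.yml`, `a.yaml`, and `notes.txt` would load `a.yaml` first, then `b.yml`, and ignore `notes.txt`.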