HTTP endpoints, auth, identity headers, and response conventions.

API Reference

Base URL in local dev: http://localhost:4100.

The API is split into:

  • OpenAI-compatible gateway: /v1/chat/completions.
  • Anthropic-compatible gateway: /v1/messages.
  • Anthropic-compatible token counting: /v1/messages/count_tokens.
  • Chorus governed intelligence route: model: "punk/chorus" on supported gateway wires.
  • Punk management/runtime API: /api/v1/*.
  • Health check: /health.
  • Static dashboard and docs: /, /docs.

Authentication

If PUNK_API_KEY is not set and login mode has not been activated (either by creating a user account or by setting PUNK_REQUIRE_LOGIN=true), Punk runs in open dev mode:

  • Headerless protected requests are allowed.
  • The default tenant is used.
  • Requests are treated as admin.
  • Private web fetches are allowed for local demos.

If PUNK_API_KEY is set:

  • Protected /v1/* and /api/* routes require Authorization: Bearer <token>.
  • The bootstrap token is admin for the default tenant.
  • Tenant API keys created through /api/v1/keys are hashed at rest.

Dashboard login sessions can also authenticate /api/v1/* through the HttpOnly punk_session cookie. Gateway traffic under /v1/* stays API-key oriented. See Configuration for open dev, protected, and login mode details.

Identity Headers

| Header | Purpose |
| --- | --- |
| X-Punk-App | Logical app id. Optional but recommended. |
| X-Punk-Agent | Agent id/name for trust and audit. |
| X-Punk-Subject | Pseudonymous user/subject; also a cache-key safety dimension. |

If an API key is pinned to an app, the pinned app overrides X-Punk-App.
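As a minimal sketch of the authentication and identity rules above, the helper below assembles request headers. The function name punk_headers and its parameters are hypothetical; only the header names and the Bearer scheme come from this page.

```python
from typing import Dict, Optional


def punk_headers(
    token: Optional[str] = None,
    app: Optional[str] = None,
    agent: Optional[str] = None,
    subject: Optional[str] = None,
) -> Dict[str, str]:
    """Build headers for a Punk request (hypothetical helper)."""
    headers = {"Content-Type": "application/json"}
    # In open dev mode no token is needed, so Authorization is optional here.
    if token:
        headers["Authorization"] = f"Bearer {token}"
    # Identity headers are only sent when a value is supplied.
    for name, value in (
        ("X-Punk-App", app),
        ("X-Punk-Agent", agent),
        ("X-Punk-Subject", subject),
    ):
        if value:
            headers[name] = value
    return headers
```

If the key is pinned to an app, the X-Punk-App value sent this way is overridden by the pinned app.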

Response Headers

| Header | Meaning |
| --- | --- |
| x-punk-run-id | Run id for trace lookup, feedback, and replay bundle export. |
| x-punk-route | Selected route such as live, exact_cache, artifact, or blocked. |

These headers are exposed through CORS.
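A client might read these two headers from any response as sketched below. The function name and the returned dictionary keys are illustrative assumptions; the lookup is case-insensitive because HTTP header names are.

```python
from typing import Dict, Mapping, Optional


def punk_routing_info(headers: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Extract the run id and selected route from response headers."""
    # Normalize header names so "X-Punk-Route" and "x-punk-route" both match.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "run_id": lowered.get("x-punk-run-id"),
        "route": lowered.get("x-punk-route"),
    }
```

The run id can then be used with the run, feedback, and replay bundle endpoints under /api/v1/runs.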

Gateway

POST /v1/chat/completions

OpenAI-style chat completions wire format.

Minimum body:

{
  "model": "gpt-4o",
  "messages": [{ "role": "user", "content": "hello" }]
}

Behavior:

  • Uses live OpenAI backend when OPENAI_API_KEY is set for OpenAI/default model ids.
  • Uses OpenRouter, DeepSeek, or Moonshot/Kimi backends for their model id families when configured.
  • Uses deterministic mock provider when no matching live provider is configured.
  • Preserves OpenAI-shaped response body.
  • Supports streaming through gateway clients.
  • Records run, traces, route explanation, cost, latency, policy, and cache/artifact decisions.
  • Set model: "punk/chorus" to activate Chorus while preserving the OpenAI-shaped response body.
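The minimum body above could be sent with the Python standard library roughly as follows. This sketch only builds the request; build_chat_request is a hypothetical name, and the base URL is the local dev default from this page.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:4100"  # local dev default; change for other deployments


def build_chat_request(
    prompt: str, model: str = "gpt-4o", token: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/chat/completions."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    headers = {"Content-Type": "application/json"}
    if token:  # required only when PUNK_API_KEY is set
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the result to urllib.request.urlopen sends it. Passing model="punk/chorus" instead selects Chorus on the same endpoint.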

POST /v1/messages

Anthropic-compatible Messages endpoint.

Minimum body:

{
  "model": "claude-3-5-sonnet-latest",
  "max_tokens": 256,
  "messages": [{ "role": "user", "content": "hello" }]
}

Behavior:

  • Uses live Anthropic backend when ANTHROPIC_API_KEY is set for claude-* models.
  • Uses the configured provider registry for Chorus final answers, including Anthropic, OpenAI, OpenRouter, DeepSeek, and Moonshot/Kimi when their keys are present.
  • Uses deterministic mock provider when no matching live provider is configured.
  • Requires model, non-empty messages, and positive max_tokens.
  • Returns Anthropic-shaped responses and validation errors.
  • Supports Anthropic streaming events, including structured tool_use blocks and input_json_delta chunks when the upstream provider emits them.
  • Records the same runs, traces, route explanations, policy, cache, artifact, and cost data as the OpenAI-compatible gateway.
  • Set model: "punk/chorus" to activate Chorus while preserving the Anthropic-shaped response body.
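The three required fields can be checked on the client before sending. The sketch below mirrors the stated rule only; the function name and the message strings are assumptions, and the server's own validation errors are Anthropic-shaped and may be stricter.

```python
from typing import Any, Dict, List


def validate_messages_body(body: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the body looks sendable."""
    problems = []
    model = body.get("model")
    if not isinstance(model, str) or not model:
        problems.append("model is required")
    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("messages must be a non-empty list")
    max_tokens = body.get("max_tokens")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens <= 0:
        problems.append("max_tokens must be a positive integer")
    return problems
```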

POST /v1/messages/count_tokens

Anthropic-compatible token count endpoint. This is useful for Anthropic SDKs and Claude Code clients that preflight prompt size before sending a message request.

Minimum body:

{
  "model": "claude-3-5-sonnet-latest",
  "messages": [{ "role": "user", "content": "hello" }]
}

Behavior:

  • Accepts the same Anthropic-shaped system, messages, tools, tool_choice, and structured content blocks Punk understands for /v1/messages.
  • Returns { "input_tokens": number }.
  • Does not create a run, write trace events, charge usage, or route to a live provider.
  • Applies the same /v1/* auth boundary as the gateway.
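A preflight check might look like the sketch below. Both helper names are hypothetical. Dropping fields outside the documented list (such as max_tokens) is a conservative choice made for this example, not a documented requirement.

```python
from typing import Any, Dict

# Fields this page lists as accepted by /v1/messages/count_tokens, plus model.
COUNT_FIELDS = ("model", "system", "messages", "tools", "tool_choice")


def to_count_tokens_body(messages_body: Dict[str, Any]) -> Dict[str, Any]:
    """Derive a count_tokens body from a /v1/messages body."""
    return {key: messages_body[key] for key in COUNT_FIELDS if key in messages_body}


def fits_budget(count_response: Dict[str, Any], max_input_tokens: int) -> bool:
    """Check the { "input_tokens": number } response against a caller-chosen limit."""
    return count_response["input_tokens"] <= max_input_tokens
```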

Chorus: model: "punk/chorus"

Chorus is Punk's governed intelligence route for harder work that benefits from routing, evidence, verification, cost controls, and receipts.

It is selected by model id, not by a separate endpoint:

| Wire | Endpoint | Response shape |
| --- | --- | --- |
| OpenAI-style chat | POST /v1/chat/completions | OpenAI chat completion |
| Anthropic-style messages | POST /v1/messages | Anthropic message |

The route records Chorus-specific trace events and receipts:

| Trace event | Purpose |
| --- | --- |
| chorus.contract | Request classification, policy, budget, and evidence requirements. |
| chorus.claim_graph | Claim-level work plan. |
| chorus.route_selected | Sparse solver assignments and rejected alternatives. |
| chorus.verifier | Verifier results. |
| chorus.research_pack | Source-backed research plan, source cards, and evidence gaps when research mode is enabled. |
| chorus.live_synthesis | Final-answer model, provider, token, cost, and latency metadata when live answer routing is requested. |
| chorus.agent_delegate | Delegate model, provider, key source, wire, and tool count for Anthropic tool-declaring agent steps. |
| chorus.tool_plan | Tool calls returned by the delegate before they are serialized as Anthropic tool_use blocks. |
| chorus.ledger | Accepted/rejected evidence, costs, latency, and confidence. |
| chorus.receipt | Exportable receipt linked to the final answer hash. |

Retrieve receipts through /api/v1/receipts/:runId, /api/v1/runs/:runId/receipt, or /api/v1/runs/:runId/evidence-packet.

Chorus variants are selected with request fields, not separate model ids:

| Focus | Typical controls |
| --- | --- |
| Fast | latency_mode: "fast", optional quality_mode: "economy" |
| Balanced | latency_mode: "balanced", quality_mode: "balanced" |
| Deep reasoning | latency_mode: "deep", quality_mode: "frontier_optional" |
| Source-backed research | research_mode: "som", receipt_mode: "full" |
| Maximum quality | latency_mode: "maximum_quality", quality_mode: "maximum_quality", optional live_synthesis_model |
| Private/local | local_only: true, optional allowed_model_classes: ["local", "open_weight"] |
| Shadow evaluation | shadow_mode: true, circuit_mode: "learn" |

Core request controls:

| Field | Values |
| --- | --- |
| budget_limit_usd | number |
| latency_mode | fast, balanced, deep, maximum_quality |
| quality_mode | economy, balanced, frontier_optional, maximum_quality |
| receipt_mode | off, summary, full |
| circuit_mode | off, reuse, learn |
| shadow_mode | boolean |
| research_mode | off, som, deep |
| research_max_queries / research_max_sources | numbers |
| live_synthesis_model / live_synthesis_required / live_synthesis_max_tokens | final-answer controls |
| chorus_agent_model | delegate model for Anthropic tool-declaring agent steps |
| local_only / allowed_model_classes / blocked_providers | model supply and policy constraints |
| chorus | customer metadata preserved in receipts and evidence packets |
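To illustrate how the controls combine with the model id, the sketch below builds an OpenAI-style Chorus body and rejects values outside the enumerations listed above. It assumes the controls are top-level fields of the request body. The helper name chorus_body is hypothetical.

```python
from typing import Any, Dict

# Enumerated values copied from the core request controls on this page.
ALLOWED_VALUES = {
    "latency_mode": {"fast", "balanced", "deep", "maximum_quality"},
    "quality_mode": {"economy", "balanced", "frontier_optional", "maximum_quality"},
    "receipt_mode": {"off", "summary", "full"},
    "circuit_mode": {"off", "reuse", "learn"},
    "research_mode": {"off", "som", "deep"},
}


def chorus_body(prompt: str, **controls: Any) -> Dict[str, Any]:
    """Build a Chorus request body; assumes controls sit at the top level."""
    for field, value in controls.items():
        allowed = ALLOWED_VALUES.get(field)
        if allowed is not None and value not in allowed:
            raise ValueError(f"{field}={value!r} is not one of {sorted(allowed)}")
    body: Dict[str, Any] = {
        "model": "punk/chorus",
        "messages": [{"role": "user", "content": prompt}],
    }
    body.update(controls)
    return body
```

For example, chorus_body("Summarize the incident", research_mode="som", receipt_mode="full") corresponds to the source-backed research focus.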

Health

GET /health

Returns gateway health and current provider information.

{
  "ok": true,
  "version": "0.1.0",
  "provider": "mock",
  "plasmate": false
}

GET /api/v1/readiness

Admin endpoint used by the dashboard's Production readiness panel. It summarizes whether the current deployment is ready to expose to real users.

Checks include:

  • Dashboard/API auth.
  • Credential vault encryption.
  • Live model provider configuration.
  • Provider failover posture.
  • Background job draining.
  • Private web fetch/webhook posture.
  • Public marketing/app host split.
  • Billing and quota enforcement.

Example response:

{
  "generatedAt": 1760000000000,
  "summary": { "ready": 6, "attention": 1, "info": 1, "publicReady": false },
  "items": [
    {
      "id": "providers",
      "label": "Live model providers",
      "status": "ready",
      "message": "Configured: OpenAI, Anthropic.",
      "action": null,
      "actionHref": "#/governance",
      "docsHref": "/docs/configuration"
    }
  ]
}
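A deployment script could gate on this response as sketched below. The sketch assumes item status values mirror the summary keys ("ready", "attention", "info"), which the example suggests but does not state. The function name is illustrative.

```python
from typing import Any, Dict, List


def items_needing_attention(readiness: Dict[str, Any]) -> List[str]:
    """Return labels of readiness items whose status is "attention" (assumed value)."""
    return [
        item.get("label", item.get("id", "?"))
        for item in readiness.get("items", [])
        if item.get("status") == "attention"
    ]


def is_public_ready(readiness: Dict[str, Any]) -> bool:
    """Read summary.publicReady, defaulting to False when it is absent."""
    return bool(readiness.get("summary", {}).get("publicReady", False))
```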

API Keys

Admin required.

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /api/v1/keys | Create a tenant API key; token returned once. |
| GET | /api/v1/keys | List tenant API keys. |
| POST | /api/v1/keys/:id/revoke | Revoke a key. |

Create key body:

{
  "name": "staging observe",
  "mode": "observe",
  "appId": "support-agent",
  "admin": false
}

Modes:

  • observe: record what Punk would have done, but return the live response and do not block.
  • optimize: allow caches, artifacts, and policy enforcement.

Auth Sessions And Users

Session login is for dashboard humans. User management is admin-only.

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /api/v1/auth/login | Create a punk_session cookie. Body: { email, password }. |
| POST | /api/v1/auth/logout | Delete the active session cookie. |
| GET | /api/v1/auth/me | Return the current user, tenant, admin flag, and password-change state. |
| POST | /api/v1/auth/change-password | Change the logged-in user's password. Body: { current, new }. |
| GET | /api/v1/users | List users; admin required. |
| POST | /api/v1/users | Create a user; admin required. Body: { email, name?, role?, tempPassword }. |
| DELETE | /api/v1/users/:id | Delete a user; admin required. |
| POST | /api/v1/users/:id/reset-password | Reset password; returns a one-time tempPassword; admin required. |

Organizations And Invites

Organizations are the tenant boundary for dashboard users. A session has one active organization; API keys remain tenant-scoped.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/orgs | List organizations for the current user. |
| POST | /api/v1/orgs/switch | Switch the session's active organization. Body: { orgId }. |
| POST | /api/v1/orgs | Create an organization. |
| GET | /api/v1/orgs/active | Read active org, members, and current role. |
| PATCH | /api/v1/orgs/active | Rename active org; owner/admin required. |
| DELETE | /api/v1/orgs/active/members/:userId | Remove a member and invalidate their sessions; owner required. |
| DELETE | /api/v1/orgs/active | Delete the active org and cascade tenant data; owner required. |
| POST | /api/v1/orgs/active/invites | Invite a member by email; owner/admin required. |
| GET | /api/v1/orgs/active/invites | List active org invites; owner/admin required. |
| POST | /api/v1/orgs/active/invites/:id/revoke | Revoke an invite; owner/admin required. |
| GET | /api/v1/invites/:token | Inspect an invite before accepting. |
| POST | /api/v1/invites/:token/accept | Accept an invite and create or attach a user. |
| POST | /api/v1/auth/signup | Public signup when enabled with PUNK_ALLOW_PUBLIC_SIGNUP=true. |
| GET | /api/v1/auth/verify/:token | Verify signup email. |
| GET | /api/v1/orgs/active/onboarding | Read zero-state onboarding checklist. |
| POST | /api/v1/orgs/active/seed-demo | Seed demo workflow and agent for the active org. |

Billing And Usage

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/usage | Month-to-date usage, quota, spend, savings, and trend. |
| GET | /api/v1/usage/attribution | Usage grouped by route, app, agent, model, and workflow. |
| GET | /api/v1/plans | Available plans and limits. |
| POST | /api/v1/orgs/active/plan | Change the active org plan directly when Stripe is not enabled; admin required. |
| POST | /api/v1/billing/checkout | Create a Stripe Checkout session when Stripe is enabled. |
| POST | /api/v1/billing/portal | Create a Stripe customer portal session when Stripe is enabled. |
| POST | /api/v1/billing/webhook | Stripe webhook endpoint. |

Settings

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/settings | List tenant settings; secrets are redacted as [set]. |
| PUT | /api/v1/settings | Update one setting; admin required. |

Known settings:

| Key | Value |
| --- | --- |
| retention_days | Positive number. |
| redaction | Boolean. |
| webhook_url | Public HTTP(S) URL or null. |
| webhook_secret | String or null; never echoed back. |
| approval_exception_ttl_hours | Positive number. |
| canary_enabled | Boolean. |
| model_substitutions | Object map of requested model to cheaper model, such as { "gpt-4o": "gpt-4o-mini" }. |
| model_substitution_enabled | Boolean; required before earned model substitutions can serve traffic. |
| semantic_cache | "off", "shadow", or "serve". |
| tripwire_action | "alert" or "block". |
| streaming_dlp | Boolean; masks sensitive values in live response chunks and non-streaming responses. |
| memory_quarantine | Boolean; gates high-impact actions influenced by low-trust memory. |
| memory_quarantine_min_level | Integer side-effect level 0-4; default is 3. |
| cross_tenant_learning | Boolean; opt in to anonymized shape-level aggregate learning. |

Savings And Opportunities

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/savings | Tenant value rollup: spend, savings, optimized share, cache/artifact hit rates, and web-context token savings. |
| GET | /api/v1/opportunities | Rank not-yet-promoted patterns by estimated value and next unblocker. |

Runs

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/runs | List runs. Query: limit, offset, route, status. |
| GET | /api/v1/runs/:id | Get run, trace events, and side effects. |
| GET | /api/v1/runs/:id/integrity | Verify trace integrity hash chain. |
| GET | /api/v1/runs/:id/receipt | Get the Chorus receipt for a run when present. |
| GET | /api/v1/receipts/:id | Direct Chorus receipt lookup by run id. |
| GET | /api/v1/runs/:id/replay-bundle | Export replay evidence for a run. |
| GET | /api/v1/runs/:id/evidence-packet | Export a support/security evidence packet: run, route explanation, integrity result, replay bundle when available, side effects, audit rows, and trace events. |
| POST | /api/v1/runs/:id/feedback | Submit rating/correction. |

Feedback body:

{
  "type": "rating",
  "rating": -1,
  "correction": "The correct classification is billing."
}

Negative feedback on artifact-served runs creates a failed live evaluation for that artifact.
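The feedback call can be assembled from the x-punk-run-id of an earlier response, as in the sketch below. The helper name is hypothetical. It covers only the "rating" type shown in the example body, since other type values are not listed here.

```python
from typing import Any, Dict, Optional, Tuple
from urllib.parse import quote


def feedback_request(
    run_id: str, rating: int, correction: Optional[str] = None
) -> Tuple[str, Dict[str, Any]]:
    """Return the path and JSON body for POST /api/v1/runs/:id/feedback."""
    body: Dict[str, Any] = {"type": "rating", "rating": rating}
    if correction is not None:
        body["correction"] = correction
    # Percent-encode the run id so it is safe as a single path segment.
    return f"/api/v1/runs/{quote(run_id, safe='')}/feedback", body
```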

Patterns

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/patterns | List discovered patterns. |
| GET | /api/v1/patterns/:id | Pattern detail with artifacts and example runs. |
| GET | /api/v1/patterns/:id/evidence | Pattern evidence: attempts, evaluations, preference, and aggregate signal when opted in. |
| POST | /api/v1/patterns/:id/synthesize | Synthesize candidate artifact; admin required. |
| POST | /api/v1/patterns/:id/ignore | Mark pattern negative and cache that decision; admin required. |

Artifacts

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/artifacts | List artifacts. |
| GET | /api/v1/artifacts/:id | Artifact detail, evaluations, and pattern. |
| POST | /api/v1/artifacts/:id/promote | Promote artifact; admin required. |
| POST | /api/v1/artifacts/:id/rollback | Retire artifact; admin required. |
| POST | /api/v1/artifacts/:id/quarantine | Quarantine artifact; admin required. |
| POST | /api/v1/artifacts/:id/replay | Re-run replay suite; admin required. |

Replay body:

{
  "runIds": ["run_..."]
}

If runIds is omitted, Punk uses artifact provenance and holdout runs.

Approvals

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/approvals | List approvals. Query: status. |
| POST | /api/v1/approvals/:id/approve | Approve; admin required. |
| POST | /api/v1/approvals/:id/reject | Reject; admin required. |

Decision body:

{
  "reason": "Approved for this deployment window."
}

Learning

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/learning/report | Last learning report and history. |
| GET | /api/v1/learning/attempts | Synthesis attempt log. Query: patternId, limit. |
| GET | /api/v1/learning/global-insights | Opt-in anonymized aggregate-learning insights. |
| POST | /api/v1/learning/tick | Run learning loop now; admin required. |

Cache

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/cache/stats | Cache hit/miss stats. |
| POST | /api/v1/cache/invalidate | Invalidate cache entries; admin required. |

Invalidate body:

{
  "cacheType": "som"
}

Omit cacheType to clear all cache types for the tenant.

Governance And Audit

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/audit | List audit events. Query: limit, decision. |
| GET | /api/v1/agent-identities | List agent trust identities and trust state. |
| GET | /api/v1/policies | List loaded policies. |
| GET | /api/v1/tripwires | List planted tripwires. |
| POST | /api/v1/tripwires | Plant a tripwire; admin required. |
| DELETE | /api/v1/tripwires/:id | Delete a tripwire; admin required. |
| POST | /api/v1/tripwires/:id/arm | Arm a tripwire; admin required. |
| POST | /api/v1/tripwires/:id/disarm | Disable a tripwire; admin required. |
| GET | /api/v1/tripwire-events | List tripwire firing events. |
| POST | /api/v1/runs/:runId/memory | Record memory/context influence for memory quarantine. |

Workflows And Credentials

Workflows are interpreted graphs. llm nodes loop back through the gateway; web_fetch nodes create compact page context; tool_call nodes use registered MCP servers. See Workflows for graph configuration, node config, scheduling, and template details.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/workflows | List workflows. Query: kind=workflow or kind=agent. |
| POST | /api/v1/workflows | Create workflow; admin required. |
| GET | /api/v1/workflows/:id | Read workflow. |
| PATCH | /api/v1/workflows/:id | Update workflow and bump version; admin required. |
| DELETE | /api/v1/workflows/:id | Delete workflow and unschedule it; admin required. |
| POST | /api/v1/workflows/:id/run | Execute synchronously; admin required. Body: { input?, trigger? }. |
| GET | /api/v1/workflows/:id/runs | Workflow run history. Query: limit. |
| GET | /api/v1/workflows/:id/savings | Cost/savings rollup with optimizedShare. |
| GET | /api/v1/workflow-runs/:id | Workflow run plus node-level trace events. |
| GET | /api/v1/workflow-templates | Built-in templates. |
| POST | /api/v1/workflow-templates/:id/instantiate | Create from a template; admin required. Body: { name? }. |
| POST | /api/v1/workflows/export | Export all workflows, or selected ids; read-only. |
| POST | /api/v1/workflows/import | Atomic import; admin required. |
| GET | /api/v1/credentials | List stored credentials, masked. Query: provider. |
| POST | /api/v1/credentials | Store credential; admin required. Body: { name, provider, secret }. |
| DELETE | /api/v1/credentials/:id | Delete credential; admin required. |

Chat And Agents

Conversations are chat threads whose assistant replies are real gateway runs with route, cost, and savings recorded per message. Agents are cron-schedulable single-task runners built on the workflow engine. See Chat & Agents for full semantics.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/conversations | List conversations. |
| POST | /api/v1/conversations | Create conversation. Body: { title?, model?, system? }. |
| GET | /api/v1/conversations/:id | Read conversation with messages. |
| DELETE | /api/v1/conversations/:id | Delete conversation and its messages. |
| POST | /api/v1/conversations/:id/messages | Send a turn and store the assistant reply. Body: { content }. |
| GET | /api/v1/agents | List agents. |
| POST | /api/v1/agents | Create agent; admin required. Body: { name, instructions, prompt, model?, scheduleCron?, description?, enabled? }. |
| GET | /api/v1/agents/:id | Read agent. |
| PATCH | /api/v1/agents/:id | Update agent; admin required. |
| DELETE | /api/v1/agents/:id | Delete agent and schedule; admin required. |
| POST | /api/v1/agents/:id/run | Execute now; admin required. Body: { input? }. |

MCP Servers

Registered MCP servers back workflow tool_call nodes. Registry mutations are admin-only. Tool execution is governed before connection.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/mcp/servers | List registered MCP servers. |
| POST | /api/v1/mcp/servers | Register stdio or HTTP server; admin required. |
| GET | /api/v1/mcp/servers/:id | Read server config. |
| PATCH | /api/v1/mcp/servers/:id | Update server config and evict pooled connection; admin required. |
| DELETE | /api/v1/mcp/servers/:id | Delete server; admin required. |
| POST | /api/v1/mcp/servers/:id/test | Connect, list tools, persist status/tool count; admin required. |
| GET | /api/v1/mcp/servers/:id/tools | List tools, cached from the last test and refreshed when stale. |

Jobs

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/jobs | List jobs and stats. Query: status, limit. |

Web Context

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/web/snapshots | List stored structured page snapshots. |
| POST | /api/v1/web/fetch | Fetch a URL and return compact page context. |
| POST | /api/v1/web/sessions | Open a governed stateful web session. Body: { url }. |
| GET | /api/v1/web/sessions | List active tenant sessions. |
| POST | /api/v1/web/sessions/:id/act | Execute a governed action. Body: { intent: { action, target, value? } }. |
| DELETE | /api/v1/web/sessions/:id | Close a session. |

Fetch body:

{
  "url": "https://example.com",
  "bypassCache": false
}

Response includes:

  • som: structured page snapshot
  • source: adapter name, builtin, or cache
  • cached
  • htmlBytes
  • somBytes
  • tokensSavedEstimate
  • diff when a previous snapshot exists
  • context: compact prompt-ready rendering

SDK Trace And Tool Cache

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /api/v1/ingest/prompt | Side-load an external prompt as an observed run. Body: { source, prompt, sessionId?, metadata? }. |
| POST | /api/v1/trace | Append an SDK trace event to a run. |
| POST | /api/v1/tool-cache/check | Check read-only tool-result cache. |
| POST | /api/v1/tool-cache/store | Store read-only tool result. |

Most users should call these through @punk/sdk instead of raw HTTP.

Errors

Errors are JSON objects with an error key. Provider-compatible validation errors use the corresponding OpenAI-style or Anthropic-style error shape.

Common statuses:

  • 400: invalid request body or unsupported setting.
  • 401: missing or invalid bearer token.
  • 403: policy block, private URL blocked, or admin key required.
  • 404: tenant-scoped record not found.
  • 429: rate limit exceeded; retry-after header included.
  • 502: upstream web fetch failure.
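A client might turn these conventions into handling logic along the lines of the sketch below. The function names and action labels are assumptions made for illustration. Because the error value may be a plain string or a provider-style object, both are handled. Only a numeric retry-after value is parsed; anything else falls back to a one-second delay chosen for this example.

```python
from typing import Any, Dict, Mapping, Tuple


def error_message(payload: Dict[str, Any]) -> str:
    """Pull a readable message out of an error payload with an "error" key."""
    err = payload.get("error")
    if isinstance(err, dict):
        # OpenAI-style and Anthropic-style errors nest a "message" field.
        return str(err.get("message", err))
    return str(err)


def next_step(status: int, headers: Mapping[str, str]) -> Tuple[str, float]:
    """Map a status code to an (action, delay_seconds) pair."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if status == 429:
        try:
            delay = float(lowered.get("retry-after", ""))
        except ValueError:
            delay = 1.0  # assumed fallback when the header is absent or not numeric
        return ("retry", delay)
    actions = {
        400: "fix_request",
        401: "check_token",
        403: "forbidden",
        404: "not_found",
        502: "upstream_failed",
    }
    return (actions.get(status, "unknown"), 0.0)
```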