Punk Docs

@punk/sdk API reference

The TypeScript client for the Punk gateway. Zero dependencies; works in Bun and Node 18+ with global fetch. For a guided tour, read Onboarding.

import { Punk } from "@punk/sdk";

Common response types (Run, Pattern, Artifact, SavingsSummary, SomSnapshot, and related results) are exported from the package.

Constructor

new Punk(opts?: PunkOptions)

Option	Type	Default	Sent as
`baseUrl`	`string`	`"http://localhost:4100"`	not sent as a header (trailing slashes stripped)
`apiKey`	`string`	none	`Authorization: Bearer <apiKey>` on every request
`app`	`string`	`"default-app"`	`X-Punk-App` on `chat`
`agent`	`string`	none	`X-Punk-Agent` on `chat`
`subject`	`string`	none	`X-Punk-Subject` on `chat`; `subject` field on tool-cache calls

The client is stateless. Construct one per (app, agent, subject) identity. apiKey is only needed when the gateway sets PUNK_API_KEY.
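The option table above can be read as a mapping from constructor options to request headers. The following is a minimal sketch of that mapping, not the SDK's source: `resolveBaseUrl` and `chatHeaders` are hypothetical helper names, and the sketch covers only the identity and auth headers listed in the table.

```typescript
// Hypothetical sketch mirroring the option table; not the SDK's implementation.
interface PunkOptions {
  baseUrl?: string;
  apiKey?: string;
  app?: string;
  agent?: string;
  subject?: string;
}

// Apply the documented default and strip trailing slashes.
function resolveBaseUrl(opts: PunkOptions = {}): string {
  return (opts.baseUrl ?? "http://localhost:4100").replace(/\/+$/, "");
}

// Build the headers the table says accompany a `chat` request.
function chatHeaders(opts: PunkOptions = {}): Record<string, string> {
  const headers: Record<string, string> = {
    "X-Punk-App": opts.app ?? "default-app",
  };
  // Optional values are only sent when they were provided.
  if (opts.apiKey) headers["Authorization"] = `Bearer ${opts.apiKey}`;
  if (opts.agent) headers["X-Punk-Agent"] = opts.agent;
  if (opts.subject) headers["X-Punk-Subject"] = opts.subject;
  return headers;
}
```

Because the headers are fixed by the options, two distinct identities would need two client instances rather than one client mutated between calls.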

Chat

`chat(params: ChatParams): Promise<ChatResult>`

POST /v1/chat/completions (OpenAI-compatible) with the X-Punk-* identity headers. Forces stream: false. For streaming, use any OpenAI client pointed at the gateway instead.

interface ChatParams {
  model: string;
  messages: Array<{ role: string; content: string }>;
  temperature?: number;
  response_format?: unknown;
  // Chorus requests may also include budget, latency, quality, research,
  // receipt, policy, evaluation, and live-answer controls.
}

interface ChatResult {
  content: string; // choices[0].message.content, "" if absent
  runId: string;   // x-punk-run-id response header, "" if absent
  route: string;   // x-punk-route response header, "live" if absent
  raw: any;        // full OpenAI-shaped response body
}

Errors: throws on any non-2xx (including policy blocks, which return the verdict in the body).
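The fallbacks in the ChatResult comments can be illustrated with a small pure function. This is a sketch under assumptions: `toChatResult` is a hypothetical name, and `headers` is a plain lowercase-keyed map standing in for the real response headers.

```typescript
interface ChatResult {
  content: string;
  runId: string;
  route: string;
  raw: any;
}

// Hypothetical helper: derives a ChatResult from response headers and an
// OpenAI-shaped body, applying the documented defaults for missing values.
function toChatResult(
  headers: Record<string, string | undefined>,
  body: any,
): ChatResult {
  return {
    content: body?.choices?.[0]?.message?.content ?? "", // "" if absent
    runId: headers["x-punk-run-id"] ?? "",               // "" if absent
    route: headers["x-punk-route"] ?? "live",            // "live" if absent
    raw: body,
  };
}
```

A caller that needs to distinguish a real live route from a missing header would have to inspect the response itself, since both surface as "live" here.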

Chorus helper

Use PUNK_CHORUS_MODEL or punkChorusChat() when calling Chorus through the OpenAI-style chat wire:

import { Punk, PUNK_CHORUS_MODEL, punkChorusChat } from "@punk/sdk";

const punk = new Punk(); // default options: gateway at http://localhost:4100

const r = await punk.chat(punkChorusChat({
  messages: [{ role: "user", content: "Build a source-backed answer with a receipt." }],
  budget_limit_usd: 0.25,
  latency_mode: "balanced",
  quality_mode: "maximum_quality",
  receipt_mode: "full",
  research_mode: "som",
  chorus: { requestId: "req_123" }
}));

punkChorusChat() is a small convenience wrapper that sets model: "punk/chorus" while preserving the rest of the chat request. See Chorus for the control fields.

Tool tracing

`traceTool<TArgs, TResult>(def: ToolDefinition<TArgs, TResult>): TracedTool<TArgs, TResult>`

Wraps a tool function so invocations are traced into a run and read-only results participate in the tool-result cache.

interface ToolDefinition<TArgs, TResult> {
  name: string;
  sideEffectLevel?: SideEffectLevel; // 0–4; default 3 (conservative)
  ttlSeconds?: number;               // level <= 1 + ttl > 0 => cacheable
  execute: (args: TArgs) => Promise<TResult> | TResult;
}

type TracedTool<TArgs, TResult> =
  (args: TArgs, ctx?: { runId?: string }) => Promise<TResult>;

Behavior of the returned function, in order:

1. **Cache check** (only if sideEffectLevel <= 1 and ttlSeconds > 0): POST /api/v1/tool-cache/check with { toolName, subject, args }. On a hit, returns the cached result without executing; if a runId was given, traces tool.completed with cached: true. Network failure degrades to a miss.
2. **Trace tool.called** with { name, args, sideEffectLevel }, only when ctx.runId is provided.
3. **Trace side_effect.planned** with { toolName, level, payload }, only for sideEffectLevel >= 2, before execution, so policy and evidence review can account for it.
4. **Execute** def.execute(args).
5. **Trace tool.completed** with { name, result }.
6. **Cache store** (cacheable tools only): POST /api/v1/tool-cache/store with the result and TTL.

Guarantees: without ctx.runId the tool executes untraced; trace and cache failures are swallowed (telemetry never breaks the tool call); errors thrown by execute propagate to the caller unchanged.
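The ordering above can be sketched with an in-memory stand-in for the gateway. Everything here is illustrative: `sketchTraceTool` and `Gateway` are hypothetical names, the cache key ignores subject and TTL, and the side_effect.planned payload is assumed to be the tool arguments. The real SDK makes HTTP calls and swallows their failures; this sketch has no failure path to swallow.

```typescript
type Level = 0 | 1 | 2 | 3 | 4;

interface ToolDef<A, R> {
  name: string;
  sideEffectLevel?: Level;
  ttlSeconds?: number;
  execute: (args: A) => Promise<R> | R;
}

interface TraceRecord { runId: string; type: string; payload: Record<string, unknown> }

// Hypothetical in-memory gateway: a trace ledger and a result cache.
interface Gateway {
  events: TraceRecord[];
  cache: Map<string, unknown>;
}

function sketchTraceTool<A, R>(gw: Gateway, def: ToolDef<A, R>) {
  const level = def.sideEffectLevel ?? 3; // conservative default
  const cacheable = level <= 1 && (def.ttlSeconds ?? 0) > 0;
  // Tracing only happens when a runId is supplied.
  const emit = (runId: string | undefined, type: string, payload: Record<string, unknown>) => {
    if (runId) gw.events.push({ runId, type, payload });
  };

  return async (args: A, ctx?: { runId?: string }): Promise<R> => {
    const key = `${def.name}:${JSON.stringify(args)}`;
    // 1. Cache check: a hit returns without executing.
    if (cacheable && gw.cache.has(key)) {
      const cached = gw.cache.get(key) as R;
      emit(ctx?.runId, "tool.completed", { name: def.name, result: cached, cached: true });
      return cached;
    }
    // 2. tool.called
    emit(ctx?.runId, "tool.called", { name: def.name, args, sideEffectLevel: level });
    // 3. side_effect.planned, before execution, for level >= 2 only
    if (level >= 2) {
      emit(ctx?.runId, "side_effect.planned", { toolName: def.name, level, payload: args });
    }
    // 4. Execute; errors propagate to the caller unchanged.
    const result = await def.execute(args);
    // 5. tool.completed
    emit(ctx?.runId, "tool.completed", { name: def.name, result });
    // 6. Cache store for cacheable tools
    if (cacheable) gw.cache.set(key, result);
    return result;
  };
}
```

Calling a level-0 tool with a TTL twice with the same arguments would execute it once and record tool.called, tool.completed, then a cached tool.completed.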

`trace(runId: string, type: TraceEventType | string, payload: Record<string, unknown>): Promise<void>`

POST /api/v1/trace with { runId, type, payload }. Appends a trace event to a run's ledger. Throws on non-2xx (unlike the internal tracing in traceTool, which is best-effort).

Feedback

`feedback(runId: string, rating: 1 | -1, correction?: string): Promise<void>`

POST /api/v1/runs/:id/feedback with { type: "rating", rating, correction }. Corrections are the strongest learning signal. They count against pattern stability and artifact confidence. Throws on non-2xx.

Memory quarantine

`punk.memory.recordInfluence(runId, { source, trustLane, contentHash? })`

POST /api/v1/runs/:runId/memory. Declare what memory/context influenced a run, tagged with its trust lane (untrusted | observed | verified | human_approved). Recording is always allowed: it's cheap telemetry, useful even when enforcement is off.

When the tenant enables memory_quarantine, a low-trust influence (untrusted/observed) on a run gates that run's high-impact (side-effect level ≥ threshold) tool actions to approval_required, so untrusted web content can't trigger a payment. A verified/human_approved influence on the same run covers it. See Governance § Memory Quarantine.

await punk.memory.recordInfluence(runId, { source: "web:example.com", trustLane: "untrusted" });

Web Fetch

`fetchSom(url: string, opts?: { bypassCache?: boolean }): Promise<WebFetchResult>`

POST /api/v1/web/fetch. Fetches a page and returns compact structured page context instead of raw HTML.

interface WebFetchResult {
  som: SomSnapshot;            // structured page snapshot, with meta byte counts
  source: string;              // adapter name or "cache"
  cached: boolean;             // served from the web snapshot cache
  htmlBytes: number;
  somBytes: number;
  tokensSavedEstimate: number; // raw-HTML tokens you didn't spend
  diff?: SomDiff;              // semantic diff vs. previous snapshot (on refetch)
  context: string;             // compact prompt-ready text rendering
}

bypassCache: true forces a refetch; when a prior snapshot exists, diff reports semantically weighted changes (pricing changed is high-significance; footer noise is low) and an aggregate driftScore in [0,1]. Throws on non-2xx.

Web sessions & actions: `punk.web.*`

The perception-to-action loop: open a stateful session, act on structured page element ids, observe the result. Actions are protocol-level (follow links, fill/submit forms) and governed server-side.

punk.web.openSession(url): Promise<WebSessionOpenResult>   // POST /api/v1/web/sessions
punk.web.act(sessionId, intent): Promise<WebActResult>     // POST /api/v1/web/sessions/:id/act
punk.web.closeSession(sessionId): Promise<{ ok: boolean }> // DELETE /api/v1/web/sessions/:id
punk.web.listSessions()                                    // GET /api/v1/web/sessions

interface WebActionIntent {
  action: "click" | "type" | "select" | "submit";
  target: string;   // element id e_... (or region id r_form... for submit)
  value?: string;   // for type/select
}

interface WebActResult {
  result: WebActionResult; // { ok, action, target, resolved?, navigated?, url, error?, posted? }
  som: SomSnapshot;        // fresh structured snapshot after the action
  diff?: SomDiff;          // semantic diff vs. the pre-action snapshot
  context: string;         // prompt-ready rendering of the fresh snapshot
}

Governance levels: type/select and form-local click actions (checkbox, radio, reset) are level 0 (session-local form state); navigation clicks are level 1 (read:web); submit and submit-button clicks are **level 3 (write:web)**, gated by the same policy engine as chat tools. Successful form submissions include posted, the serialized field set that was sent (name -> value), so operators can inspect the write payload.

A policy deny or approval_required verdict on a web write returns 403 with the verdict. Observe-mode keys can never perform web writes (403, "observe-mode keys cannot perform web writes"), though their reads run normally. Every action is audited, and every navigation destination (session open, link hrefs, form actions) is SSRF-guarded. Idle sessions auto-close after 5 minutes. Sessions are tenant-private: another tenant's key sees 404.
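For client-side reasoning (for example, anticipating which actions may come back as 403), the levels described above can be mirrored in a small function. This is a sketch only: the gateway remains the authority, and `ClickKind`, `ClassifiedIntent`, `webActionLevel`, and `isWebWrite` are hypothetical names that do not appear in the SDK.

```typescript
// Assumed labels for the three click cases the text distinguishes.
type ClickKind = "form-local" | "navigation" | "submit-button";

type ClassifiedIntent =
  | { action: "type" | "select" | "submit" }
  | { action: "click"; clickKind: ClickKind };

// Mirror of the documented governance levels.
function webActionLevel(intent: ClassifiedIntent): 0 | 1 | 3 {
  switch (intent.action) {
    case "type":
    case "select":
      return 0; // session-local form state
    case "submit":
      return 3; // write:web
    case "click":
      if (intent.clickKind === "submit-button") return 3; // write:web
      if (intent.clickKind === "navigation") return 1;    // read:web
      return 0; // checkbox, radio, reset
  }
}

// Level 3 actions are the ones subject to policy gating as web writes.
function isWebWrite(intent: ClassifiedIntent): boolean {
  return webActionLevel(intent) === 3;
}
```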

Read APIs

`savings(): Promise<SavingsSummary>`

GET /api/v1/savings. Tenant rollup: totalRuns, liveRuns, optimizedRuns, blockedRuns, totalCostUsd, totalSavedUsd, ghostSavedUsd (observe-mode "would have saved" accounting), totalSavedMs, cacheHitRate, artifactHitRate, and web-context token savings.

`patterns(): Promise<Pattern[]>`

GET /api/v1/patterns, unwraps { patterns } ([] if absent). Each Pattern carries its lifecycle state (observed → candidate → … → promoted, or negative/retired), fingerprints, runCount, cost/latency averages, stabilityScore, and optimizableScore.

`artifacts(): Promise<Artifact[]>`

GET /api/v1/artifacts, unwraps { artifacts } ([] if absent). Each Artifact carries state, type, confidence, and evidence counters for an optimized route.

`artifactDetail(id: string): Promise<ArtifactDetail>`

GET /api/v1/artifacts/:id.

interface ArtifactDetail {
  artifact: Artifact;
  evaluations: ArtifactEvaluation[]; // evidence rows
  pattern: Pattern | null;           // the source pattern
}

`runDetail(id: string): Promise<RunDetail>`

GET /api/v1/runs/:id.

interface RunDetail {
  run: Run;                        // includes routeExplanation
  events: TraceEvent[];            // the full append-only trace
  sideEffects: SideEffectRecord[]; // planned/executed/suppressed/blocked
}

run.routeExplanation is the audit story: route, reason, rejected alternatives, policy verdict, cache/artifact details, estimated savings, fallback.

`receipt(id: string): Promise<PunkReceipt>`

GET /api/v1/receipts/:id. Returns the Chorus receipt for a run when one exists.

`evidencePacket(runId: string): Promise<EvidencePacket>`

GET /api/v1/runs/:runId/evidence-packet. Returns a support/security evidence packet with route explanation, integrity result, replay material when available, side effects, audit rows, trace events, and Chorus material when present.

`cacheStats(): Promise<CacheStats>`

GET /api/v1/cache/stats → { stats: Array<{ cacheType, entries, hits }> } per tier (exact_response, tool_result, som, negative, …).

Learning lifecycle

`learningTick(): Promise<LearningReport>`

POST /api/v1/learning/tick. Forces one learning pass (it also runs on a timer inside the gateway). Returns at least:

interface LearningReport {
  artifactsSynthesized: number;
  promotionsEligible: string[]; // artifact ids that passed the gates
  autoPromoted: string[];       // promoted hands-free (PUNK_AUTO_PROMOTE)
  [key: string]: unknown;
}

`promoteArtifact(id: string): Promise<Artifact>`

POST /api/v1/artifacts/:id/promote, unwraps { artifact }. The gateway enforces promotion evidence; side-effect-bearing artifacts additionally require operator action. Throws on non-2xx, including "gate not satisfied" rejections.

MCP registry helpers

punk.mcp covers the small SDK surface for external MCP servers used by workflow tool_call nodes:

await punk.mcp.listServers();
await punk.mcp.createServer({
  name: "internal-tools",
  transport: "http",
  url: "https://mcp.example.com/mcp",
  headers: { Authorization: "cred:cred_123" }
});
await punk.mcp.testServer("mcp_123");

Registry mutations are admin-only. cred:<id> values resolve stored credentials at connect time.

Prompt ingest

`ingestPrompt(source, prompt, opts?): Promise<{ runId: string }>`

POST /api/v1/ingest/prompt. Side-loads an externally handled prompt as a completed observed run, useful for Claude Code hooks or other interfaces where Punk observes and audits work it did not execute directly.

await punk.ingestPrompt("claude-code", prompt, {
  sessionId: "local-session",
  metadata: { project: "support" }
});

Tool-result cache (low level)

traceTool calls these for you; they're public for manual integration.

`toolCacheCheck(toolName: string, args: unknown): Promise<{ hit: boolean; result?: unknown }>`

POST /api/v1/tool-cache/check with { toolName, subject, args }. Never throws. Any failure returns { hit: false }.

`toolCacheStore(toolName: string, args: unknown, result: unknown, ttlSeconds?: number): Promise<void>`

POST /api/v1/tool-cache/store with { toolName, subject, args, result, ttlSeconds }. Never throws. Caching is an optimization, not a failure mode.
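A manual integration might combine the two calls into a check, execute, store sequence. The sketch below is one way to do that, assuming only the two signatures documented above: `ToolCache` is a hypothetical structural interface and `cachedCall` a hypothetical helper. A Punk instance is assumed, not confirmed, to satisfy the interface.

```typescript
// Structural view of the two documented methods.
interface ToolCache {
  toolCacheCheck(toolName: string, args: unknown): Promise<{ hit: boolean; result?: unknown }>;
  toolCacheStore(toolName: string, args: unknown, result: unknown, ttlSeconds?: number): Promise<void>;
}

// Hypothetical helper: return a cached result when available, otherwise
// run the tool and store its result for later calls.
async function cachedCall<A, R>(
  cache: ToolCache,
  toolName: string,
  args: A,
  ttlSeconds: number,
  run: (args: A) => Promise<R>,
): Promise<R> {
  const lookup = await cache.toolCacheCheck(toolName, args);
  if (lookup.hit) return lookup.result as R;
  const result = await run(args);
  // Per the text above, the store call never throws, so no try/catch here.
  await cache.toolCacheStore(toolName, args, result, ttlSeconds);
  return result;
}
```

Since both cache calls are documented as never throwing, the only error path in this helper is the tool function itself.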

Error behavior summary

Surface	On failure
`chat`, `trace`, `feedback`, `fetchSom`, web sessions, MCP helpers, prompt ingest, all read APIs, `learningTick`, `promoteArtifact`	throws `Error("Punk API <METHOD> <path> failed: <status> <statusText>")`, with the first 500 chars of the response body appended
Tracing inside `traceTool`	swallowed; the tool call succeeds untraced
`toolCacheCheck`	degrades to `{ hit: false }`
`toolCacheStore`	swallowed
`def.execute` inside a traced tool	propagates unchanged

There are no retries in the SDK; the gateway is local-first and the router fails open server-side.
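Because thrown errors follow the message format in the table above, a caller could recover the method, path, and status by parsing the message. This is a sketch that relies only on the documented prefix; `parsePunkError` is a hypothetical helper, and it deliberately ignores the status text and appended body, whose exact layout is not specified here.

```typescript
interface PunkErrorInfo {
  method: string;
  path: string;
  status: number;
}

// Matches the documented prefix: "Punk API <METHOD> <path> failed: <status>".
const PUNK_ERROR_PREFIX = /^Punk API (\S+) (\S+) failed: (\d{3})\b/;

// Hypothetical helper: returns null for anything that is not a Punk API error.
function parsePunkError(err: unknown): PunkErrorInfo | null {
  const message = err instanceof Error ? err.message : String(err);
  const match = PUNK_ERROR_PREFIX.exec(message);
  if (!match) return null;
  const [, method = "", path = "", status = ""] = match;
  return { method, path, status: Number(status) };
}
```

For example, a caller could treat a parsed status of 403 on chat or a web action as a policy verdict to surface, rather than a transport failure.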

Properties

punk.baseUrl, punk.app, punk.agent, punk.subject are readable on the instance. The API key is private.
