@punk/sdk API reference
The TypeScript client for the Punk gateway. Zero dependencies; works in Bun and Node 18+ with global fetch. For a guided tour, read Onboarding.
import { Punk } from "@punk/sdk";
Common response types (Run, Pattern, Artifact, SavingsSummary, SomSnapshot, and related results) are exported from the package.
Constructor
new Punk(opts?: PunkOptions)
| Option | Type | Default | Sent as |
|---|---|---|---|
| baseUrl | string | "http://localhost:4100" | not sent as a header (trailing slashes stripped) |
| apiKey | string | none | Authorization: Bearer <apiKey> on every request |
| app | string | "default-app" | X-Punk-App on chat |
| agent | string | none | X-Punk-Agent on chat |
| subject | string | none | X-Punk-Subject on chat; subject field on tool-cache calls |
The client is stateless. Construct one per (app, agent, subject) identity. apiKey is only needed when the gateway sets PUNK_API_KEY.
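The option-to-header mapping in the table can be sketched as two pure functions. This is an illustration rather than the SDK's source: normalizeBaseUrl and chatHeaders are hypothetical names, and the sketch assumes that headers for unset options are omitted instead of being sent empty.

```typescript
// Hypothetical sketch of the option-to-header mapping described above.
interface PunkOptions {
  baseUrl?: string;
  apiKey?: string;
  app?: string;
  agent?: string;
  subject?: string;
}

// Strip trailing slashes so request paths can be appended directly.
function normalizeBaseUrl(baseUrl: string = "http://localhost:4100"): string {
  return baseUrl.replace(/\/+$/, "");
}

// Build the headers a chat request would carry for a given identity.
// Assumption: headers for unset options are omitted rather than sent empty.
function chatHeaders(opts: PunkOptions = {}): Record<string, string> {
  const headers: Record<string, string> = {
    "X-Punk-App": opts.app ?? "default-app",
  };
  if (opts.apiKey) headers["Authorization"] = `Bearer ${opts.apiKey}`;
  if (opts.agent) headers["X-Punk-Agent"] = opts.agent;
  if (opts.subject) headers["X-Punk-Subject"] = opts.subject;
  return headers;
}

// Example identity: an API key and an agent name, no subject.
const exampleHeaders = chatHeaders({ apiKey: "secret", agent: "triage-bot" });
```

Because the client is stateless, a second identity would simply be a second set of options passed to a second instance.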
Chat
chat(params: ChatParams): Promise<ChatResult>
POST /v1/chat/completions (OpenAI-compatible) with the X-Punk-* identity headers. Forces stream: false. For streaming, use any OpenAI client pointed at the gateway instead.
interface ChatParams {
  model: string;
  messages: Array<{ role: string; content: string }>;
  temperature?: number;
  response_format?: unknown;
  // Chorus requests may also include budget, latency, quality, research,
  // receipt, policy, evaluation, and live-answer controls.
}
interface ChatResult {
  content: string; // choices[0].message.content, "" if absent
  runId: string;   // x-punk-run-id response header, "" if absent
  route: string;   // x-punk-route response header, "live" if absent
  raw: any;        // full OpenAI-shaped response body
}
Errors: throws on any non-2xx (including policy blocks, which return the verdict in the body).
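How the three ChatResult fallbacks combine can be sketched as follows. toChatResult is a hypothetical helper, not an SDK export; it takes an already-parsed body and a header lookup function so that it runs without a gateway. The "cache" route value in the example is illustrative only.

```typescript
// Hypothetical sketch: derive a ChatResult-shaped object from a parsed
// response body plus a header lookup.
interface ChatResultSketch {
  content: string;
  runId: string;
  route: string;
  raw: unknown;
}

type HeaderLookup = (name: string) => string | null;

function toChatResult(body: any, header: HeaderLookup): ChatResultSketch {
  return {
    // choices[0].message.content, or "" when any link in the chain is missing
    content: body?.choices?.[0]?.message?.content ?? "",
    // x-punk-run-id, or "" when the header is absent
    runId: header("x-punk-run-id") ?? "",
    // x-punk-route, reported as "live" when the header is absent
    route: header("x-punk-route") ?? "live",
    raw: body,
  };
}

// An empty body with no headers falls back on every field.
const emptyResult = toChatResult({}, () => null);

// A populated body with only the route header set ("cache" is a made-up value).
const routedResult = toChatResult(
  { choices: [{ message: { role: "assistant", content: "hi" } }] },
  (name) => (name === "x-punk-route" ? "cache" : null),
);
```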
Chorus helper
Use PUNK_CHORUS_MODEL or punkChorusChat() when calling Chorus through the OpenAI-style chat wire:
import { PUNK_CHORUS_MODEL, punkChorusChat } from "@punk/sdk";
const r = await punk.chat(punkChorusChat({
  messages: [{ role: "user", content: "Build a source-backed answer with a receipt." }],
  budget_limit_usd: 0.25,
  latency_mode: "balanced",
  quality_mode: "maximum_quality",
  receipt_mode: "full",
  research_mode: "som",
  chorus: { requestId: "req_123" }
}));
punkChorusChat() is a small convenience wrapper that sets model: "punk/chorus" while preserving the rest of the chat request. See Chorus for the control fields.
Tool tracing
traceTool<TArgs, TResult>(def: ToolDefinition<TArgs, TResult>): TracedTool<TArgs, TResult>
Wraps a tool function so invocations are traced into a run and read-only results participate in the tool-result cache.
interface ToolDefinition<TArgs, TResult> {
  name: string;
  sideEffectLevel?: SideEffectLevel; // 0–4; default 3 (conservative)
  ttlSeconds?: number;               // level <= 1 + ttl > 0 => cacheable
  execute: (args: TArgs) => Promise<TResult> | TResult;
}
type TracedTool<TArgs, TResult> =
  (args: TArgs, ctx?: { runId?: string }) => Promise<TResult>;
Behavior of the returned function, in order:
- Cache check (only if sideEffectLevel <= 1 and ttlSeconds > 0): POST /api/v1/tool-cache/check with { toolName, subject, args }. On a hit, returns the cached result without executing; if a runId was given, traces tool.completed with cached: true. Network failure degrades to a miss.
- **Trace tool.called** with { name, args, sideEffectLevel }, only when ctx.runId is provided.
- **Trace side_effect.planned** with { toolName, level, payload }, only for sideEffectLevel >= 2, before execution, so policy and evidence review can account for it.
- Execute def.execute(args).
- **Trace tool.completed** with { name, result }.
- Cache store (cacheable tools only): POST /api/v1/tool-cache/store with the result and TTL.
Guarantees: without ctx.runId the tool executes untraced; trace and cache failures are swallowed (telemetry never breaks the tool call); errors thrown by execute propagate to the caller unchanged.
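The step order and guarantees above can be modeled without any network calls. The sketch below is a hypothetical model, not the SDK's implementation: traceToolSteps and isCacheable are made-up names, and the function returns the list of steps that would run for a given tool definition, run id, and cache outcome.

```typescript
// Hypothetical model of the traceTool step order. It records step names
// instead of performing network calls or executing the tool.
type SideEffectLevelSketch = 0 | 1 | 2 | 3 | 4;

interface ToolShape {
  sideEffectLevel?: SideEffectLevelSketch; // default 3 (conservative)
  ttlSeconds?: number;
}

// Cacheable means: level <= 1 and a positive TTL.
function isCacheable(def: ToolShape): boolean {
  const level = def.sideEffectLevel ?? 3;
  return level <= 1 && (def.ttlSeconds ?? 0) > 0;
}

function traceToolSteps(
  def: ToolShape,
  runId: string | undefined,
  cacheHit: boolean,
): string[] {
  const level = def.sideEffectLevel ?? 3;
  const traced = runId !== undefined; // without a runId nothing is traced
  const steps: string[] = [];

  if (isCacheable(def)) {
    steps.push("cache.check");
    if (cacheHit) {
      if (traced) steps.push("trace tool.completed (cached: true)");
      return steps; // cached result is returned; execute is skipped
    }
  }
  if (traced) steps.push("trace tool.called");
  if (traced && level >= 2) steps.push("trace side_effect.planned");
  steps.push("execute");
  if (traced) steps.push("trace tool.completed");
  if (isCacheable(def)) steps.push("cache.store");
  return steps;
}

// Read-only tool with a TTL, traced, cache miss: all cache and trace steps run.
const readOnlyMiss = traceToolSteps({ sideEffectLevel: 0, ttlSeconds: 60 }, "run_1", false);

// Default level (3) and no runId: the tool simply executes.
const untracedWrite = traceToolSteps({}, undefined, false);
```

Note that a cacheable tool never emits side_effect.planned, since caching requires level <= 1 and the planned event requires level >= 2.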
trace(runId: string, type: TraceEventType | string, payload: Record<string, unknown>): Promise<void>
POST /api/v1/trace with { runId, type, payload }. Appends a trace event to a run's ledger. Throws on non-2xx (unlike the internal tracing in traceTool, which is best-effort).
Feedback
feedback(runId: string, rating: 1 | -1, correction?: string): Promise<void>
POST /api/v1/runs/:id/feedback with { type: "rating", rating, correction }. Corrections are the strongest learning signal. They count against pattern stability and artifact confidence. Throws on non-2xx.
Memory quarantine
punk.memory.recordInfluence(runId, { source, trustLane, contentHash? })
POST /api/v1/runs/:runId/memory. Declare what memory/context influenced a run, tagged with its trust lane (untrusted | observed | verified | human_approved). Recording is always allowed: it's cheap telemetry, useful even when enforcement is off.
When the tenant enables memory_quarantine, a low-trust influence (untrusted/observed) on a run gates that run's high-impact (side-effect level ≥ threshold) tool actions to approval_required, so untrusted web content can't trigger a payment. A verified/human_approved influence on the same run covers it. See Governance § Memory Quarantine.
await punk.memory.recordInfluence(runId, { source: "web:example.com", trustLane: "untrusted" });
Web Fetch
fetchSom(url: string, opts?: { bypassCache?: boolean }): Promise<WebFetchResult>
POST /api/v1/web/fetch. Fetches a page and returns compact structured page context instead of raw HTML.
interface WebFetchResult {
  som: SomSnapshot;            // structured page snapshot, with meta byte counts
  source: string;              // adapter name or "cache"
  cached: boolean;             // served from the web snapshot cache
  htmlBytes: number;
  somBytes: number;
  tokensSavedEstimate: number; // raw-HTML tokens you didn't spend
  diff?: SomDiff;              // semantic diff vs. previous snapshot (on refetch)
  context: string;             // compact prompt-ready text rendering
}
bypassCache: true forces a refetch. When a prior snapshot exists, diff reports semantically weighted changes (a pricing change is high-significance; footer noise is low-significance) and an aggregate driftScore in [0,1]. Throws on non-2xx.
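One way a caller might act on a refetch result is sketched below. The types are minimal local stand-ins for the SDK's, the sketch assumes driftScore is a field on diff, and needsReprompt, compressionRatio, and the 0.3 threshold are hypothetical choices rather than anything the SDK prescribes.

```typescript
// Minimal local stand-ins for the SDK types; only the fields used here.
interface SomDiffSketch {
  driftScore: number; // aggregate drift in [0, 1]; field location is assumed
}
interface WebFetchResultSketch {
  cached: boolean;
  htmlBytes: number;
  somBytes: number;
  diff?: SomDiffSketch;
  context: string;
}

// Hypothetical policy: rebuild the prompt only when the page changed enough.
// With no diff (first fetch, or nothing to compare against), report false.
function needsReprompt(result: WebFetchResultSketch, threshold = 0.3): boolean {
  return result.diff !== undefined && result.diff.driftScore >= threshold;
}

// Fraction of the raw HTML size that the structured snapshot avoided.
function compressionRatio(result: WebFetchResultSketch): number {
  return result.htmlBytes === 0 ? 0 : 1 - result.somBytes / result.htmlBytes;
}

// Made-up sample values for illustration.
const refetched: WebFetchResultSketch = {
  cached: false,
  htmlBytes: 1000,
  somBytes: 250,
  diff: { driftScore: 0.8 },
  context: "Pricing: $12/mo",
};
const shouldReprompt = needsReprompt(refetched);
const ratio = compressionRatio(refetched);
```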
Web sessions & actions: punk.web.*
The perception-to-action loop: open a stateful session, act on structured page element ids, observe the result. Actions are protocol-level (follow links, fill/submit forms) and governed server-side.
punk.web.openSession(url): Promise<WebSessionOpenResult> // POST /api/v1/web/sessions
punk.web.act(sessionId, intent): Promise<WebActResult> // POST /api/v1/web/sessions/:id/act
punk.web.closeSession(sessionId): Promise<{ ok: boolean }> // DELETE /api/v1/web/sessions/:id
punk.web.listSessions() // GET /api/v1/web/sessions
interface WebActionIntent {
  action: "click" | "type" | "select" | "submit";
  target: string; // element id e_... (or region id r_form... for submit)
  value?: string; // for type/select
}
interface WebActResult {
  result: WebActionResult; // { ok, action, target, resolved?, navigated?, url, error?, posted? }
  som: SomSnapshot;        // fresh structured snapshot after the action
  diff?: SomDiff;          // semantic diff vs. the pre-action snapshot
  context: string;         // prompt-ready rendering of the fresh snapshot
}
Governance levels: type/select and form-local click actions (checkbox, radio, reset) are level 0 (session-local form state); a navigation click is level 1 (read:web); submit and submit-button clicks are **level 3 (write:web)**, gated by the same policy engine as chat tools. Successful form submissions include posted, the serialized field set that was sent (name -> value), so operators can inspect the write payload.
A policy deny or approval_required verdict on a web write returns 403 with the verdict. Observe-mode keys can never perform web writes (403, "observe-mode keys cannot perform web writes"), though their reads run normally. Every action is audited, and every navigation destination (session open, link hrefs, form actions) is SSRF-guarded. Idle sessions auto-close after 5 minutes. Sessions are tenant-private: another tenant's key sees 404.
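The level assignment can be summarized as a small lookup. This is a sketch of the rules as described here, not gateway code: the gateway resolves what a click target is server-side, so ClickTargetKind, webActionLevel, and observeModeAllows are hypothetical names introduced only for illustration.

```typescript
type WebAction = "click" | "type" | "select" | "submit";

// Hypothetical classification of what a click target resolves to. The SDK
// intent carries only an element id; the gateway does this resolution.
type ClickTargetKind = "checkbox" | "radio" | "reset" | "link" | "submit-button";

function webActionLevel(action: WebAction, clickTarget?: ClickTargetKind): 0 | 1 | 3 {
  if (action === "type" || action === "select") return 0; // session-local form state
  if (action === "submit") return 3; // write:web
  // Remaining case: action === "click"
  switch (clickTarget) {
    case "checkbox":
    case "radio":
    case "reset":
      return 0; // form-local click
    case "link":
      return 1; // navigation, read:web
    case "submit-button":
      return 3; // write:web, same as submit
    default:
      return 3; // assumption: treat an unknown click target conservatively
  }
}

// Observe-mode keys can read (levels 0 and 1) but never perform web writes.
function observeModeAllows(level: 0 | 1 | 3): boolean {
  return level < 3;
}

const linkClickLevel = webActionLevel("click", "link");
const submitLevel = webActionLevel("submit");
```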
Read APIs
savings(): Promise<SavingsSummary>
GET /api/v1/savings. Tenant rollup: totalRuns, liveRuns, optimizedRuns, blockedRuns, totalCostUsd, totalSavedUsd, ghostSavedUsd (observe-mode "would have saved" accounting), totalSavedMs, cacheHitRate, artifactHitRate, and web-context token savings.
patterns(): Promise<Pattern[]>
GET /api/v1/patterns, unwraps { patterns } ([] if absent). Each Pattern carries its lifecycle state (observed → candidate → … → promoted, or negative/retired), fingerprints, runCount, cost/latency averages, stabilityScore, and optimizableScore.
artifacts(): Promise<Artifact[]>
GET /api/v1/artifacts, unwraps { artifacts } ([] if absent). Each Artifact carries state, type, confidence, and evidence counters for an optimized route.
artifactDetail(id: string): Promise<ArtifactDetail>
GET /api/v1/artifacts/:id.
interface ArtifactDetail {
  artifact: Artifact;
  evaluations: ArtifactEvaluation[]; // evidence rows
  pattern: Pattern | null;           // the source pattern
}
runDetail(id: string): Promise<RunDetail>
GET /api/v1/runs/:id.
interface RunDetail {
  run: Run;                        // includes routeExplanation
  events: TraceEvent[];            // the full append-only trace
  sideEffects: SideEffectRecord[]; // planned/executed/suppressed/blocked
}
run.routeExplanation is the audit story: route, reason, rejected alternatives, policy verdict, cache/artifact details, estimated savings, fallback.
receipt(id: string): Promise<PunkReceipt>
GET /api/v1/receipts/:id. Returns the Chorus receipt for a run when one exists.
evidencePacket(runId: string): Promise<EvidencePacket>
GET /api/v1/runs/:runId/evidence-packet. Returns a support/security evidence packet with route explanation, integrity result, replay material when available, side effects, audit rows, trace events, and Chorus material when present.
cacheStats(): Promise<CacheStats>
GET /api/v1/cache/stats → { stats: Array<{ cacheType, entries, hits }> } per tier (exact_response, tool_result, som, negative, …).
Learning lifecycle
learningTick(): Promise<LearningReport>
POST /api/v1/learning/tick. Forces one learning pass (it also runs on a timer inside the gateway). Returns at least:
interface LearningReport {
  artifactsSynthesized: number;
  promotionsEligible: string[]; // artifact ids that passed the gates
  autoPromoted: string[];       // promoted hands-free (PUNK_AUTO_PROMOTE)
  [key: string]: unknown;
}
promoteArtifact(id: string): Promise<Artifact>
POST /api/v1/artifacts/:id/promote, unwraps { artifact }. The gateway enforces promotion evidence; side-effect-bearing artifacts additionally require operator action. Throws on non-2xx, including "gate not satisfied" rejections.
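A learning pass and manual promotion can be connected by working out which eligible artifacts were not already promoted automatically. The sketch below is a hypothetical caller-side helper (pendingPromotions is not an SDK function), it uses a local copy of the report shape, and it assumes that an id listed in autoPromoted needs no further promoteArtifact call. The artifact ids are made up.

```typescript
// Local copy of the documented report shape, so the sketch is self-contained.
interface LearningReportSketch {
  artifactsSynthesized: number;
  promotionsEligible: string[];
  autoPromoted: string[];
  [key: string]: unknown;
}

// Artifact ids that passed the gates but were not promoted hands-free.
// Assumption: ids in autoPromoted need no further promotion.
function pendingPromotions(report: LearningReportSketch): string[] {
  const alreadyPromoted = new Set(report.autoPromoted);
  return report.promotionsEligible.filter((id) => !alreadyPromoted.has(id));
}

// Made-up report for illustration.
const sampleReport: LearningReportSketch = {
  artifactsSynthesized: 2,
  promotionsEligible: ["art_1", "art_2"],
  autoPromoted: ["art_1"],
};
const pending = pendingPromotions(sampleReport);
```

Each id in the resulting list would be a candidate for promoteArtifact, which can still reject it if the gateway's evidence gate is not satisfied.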
MCP registry helpers
punk.mcp covers the small SDK surface for external MCP servers used by workflow tool_call nodes:
await punk.mcp.listServers();
await punk.mcp.createServer({
  name: "internal-tools",
  transport: "http",
  url: "https://mcp.example.com/mcp",
  headers: { Authorization: "cred:cred_123" }
});
await punk.mcp.testServer("mcp_123");
Registry mutations are admin-only. cred:<id> values resolve stored credentials at connect time.
Prompt ingest
ingestPrompt(source, prompt, opts?): Promise<{ runId: string }>
POST /api/v1/ingest/prompt. Side-loads an externally handled prompt as a completed observed run, useful for Claude Code hooks or other interfaces where Punk observes and audits work it did not execute directly.
await punk.ingestPrompt("claude-code", prompt, {
  sessionId: "local-session",
  metadata: { project: "support" }
});
Tool-result cache (low level)
traceTool calls these for you; they're public for manual integration.
toolCacheCheck(toolName: string, args: unknown): Promise<{ hit: boolean; result?: unknown }>
POST /api/v1/tool-cache/check with { toolName, subject, args }. Never throws. Any failure returns { hit: false }.
toolCacheStore(toolName: string, args: unknown, result: unknown, ttlSeconds?: number): Promise<void>
POST /api/v1/tool-cache/store with { toolName, subject, args, result, ttlSeconds }. Never throws. Caching is an optimization, not a failure mode.
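For manual integration, the two calls typically bracket the tool execution: check, execute on a miss, then store. The sketch below shows that pattern against a narrowed interface with the same two method signatures. cachedCall and inMemoryClient are hypothetical names; the in-memory client is only a stand-in so the example runs without a gateway, and it ignores TTL expiry.

```typescript
// The two low-level calls, narrowed to what this sketch needs.
interface ToolCacheClient {
  toolCacheCheck(toolName: string, args: unknown): Promise<{ hit: boolean; result?: unknown }>;
  toolCacheStore(toolName: string, args: unknown, result: unknown, ttlSeconds?: number): Promise<void>;
}

// Hypothetical helper: return the cached result on a hit, otherwise
// execute the tool and store what it returned.
async function cachedCall<T>(
  client: ToolCacheClient,
  toolName: string,
  args: unknown,
  ttlSeconds: number,
  execute: () => Promise<T>,
): Promise<T> {
  const cached = await client.toolCacheCheck(toolName, args);
  if (cached.hit) return cached.result as T;
  const result = await execute(); // errors from execute propagate to the caller
  await client.toolCacheStore(toolName, args, result, ttlSeconds);
  return result;
}

// In-memory stand-in for a client instance (ignores TTL expiry).
function inMemoryClient(): ToolCacheClient {
  const store = new Map<string, unknown>();
  const key = (toolName: string, args: unknown) => `${toolName}:${JSON.stringify(args)}`;
  return {
    async toolCacheCheck(toolName, args) {
      const k = key(toolName, args);
      return store.has(k) ? { hit: true, result: store.get(k) } : { hit: false };
    },
    async toolCacheStore(toolName, args, result) {
      store.set(key(toolName, args), result);
    },
  };
}
```

Since both SDK methods are documented as never throwing, the helper needs no try/catch around them; only failures from execute reach the caller.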
Error behavior summary
| Surface | On failure |
|---|---|
| chat, trace, feedback, fetchSom, web sessions, MCP helpers, prompt ingest, all read APIs, learningTick, promoteArtifact | throws Error("Punk API <METHOD> <path> failed: <status> <statusText>"), with the first 500 chars of the response body appended |
| Tracing inside traceTool | swallowed; the tool call succeeds untraced |
| toolCacheCheck | degrades to { hit: false } |
| toolCacheStore | swallowed |
| def.execute inside a traced tool | propagates unchanged |
There are no retries in the SDK; the gateway is local-first and the router fails open server-side.
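Because failures surface as a plain Error with a structured message, a caller that wants its own retry policy can recover the HTTP status from that message. The helpers below are a hypothetical caller-side sketch: punkErrorMessage only reproduces the documented message shape for demonstration (the ": " separator before the body excerpt is an assumption), and punkErrorStatus and isRetryable are not part of the SDK.

```typescript
// Build a message in the documented shape, for demonstration only.
// Assumption: the body excerpt is joined with ": "; the docs say only that
// the first 500 chars are appended.
function punkErrorMessage(
  method: string,
  path: string,
  status: number,
  statusText: string,
  body = "",
): string {
  const prefix = `Punk API ${method} ${path} failed: ${status} ${statusText}`;
  return body ? `${prefix}: ${body.slice(0, 500)}` : prefix;
}

// Hypothetical helper: recover the HTTP status from a thrown Error.
function punkErrorStatus(err: unknown): number | undefined {
  if (!(err instanceof Error)) return undefined;
  const match = /^Punk API \S+ \S+ failed: (\d{3})\b/.exec(err.message);
  return match ? Number(match[1]) : undefined;
}

// One possible caller-side policy: retry only on 5xx, never on policy
// blocks or other 4xx responses.
function isRetryable(err: unknown): boolean {
  const status = punkErrorStatus(err);
  return status !== undefined && status >= 500;
}

const policyBlock = new Error(
  punkErrorMessage("POST", "/v1/chat/completions", 403, "Forbidden", "x".repeat(600)),
);
const gatewayDown = new Error(
  punkErrorMessage("GET", "/api/v1/savings", 503, "Service Unavailable"),
);
```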
Properties
punk.baseUrl, punk.app, punk.agent, punk.subject are readable on the instance. The API key is private.