Role-specific local walkthroughs for chat, workflows, app developers, and operators.

Punk in 30 Minutes

Punk is the runtime between your agents and the world. It observes real work, governs risky actions, turns web pages into compact agent context, learns repeated patterns, proves cheaper routes with evidence, and explains every routing decision.

This guide gets a local Punk checkout running, then gives each user type a simple first path. Everything works offline with the mock provider; set provider keys only when you want live model calls.

For a full pilot and production rollout plan, use the Onboarding Guide after this quickstart.

Hosted reference: punktechnologies.com. Local default: http://localhost:4100.

What you need: Bun 1.2+ and a Punk checkout.

0-5 Min: Start Punk

Start the gateway:

bun install
bun run dev

Open http://localhost:4100.

If the org is blank, Overview shows a Getting started panel. Click SEED DEMO to create the support-triage workflow template plus a demo agent, or jump straight into Chat.

To drive the full optimization loop with repeatable demo traffic, keep the gateway running and use a second terminal:

bun run demo

The dashboard seed gives you objects to inspect. The CLI demo drives live traffic, cache hits, web fetches, repeated work, evidence review, promotion, and optimized traffic at near-zero cost. Run the demo a second time and the optimized share climbs because Punk remembers what it proved.

The dashboard sections map to the system:

| Section | First thing to check |
| --- | --- |
| Overview | Savings, route mix, recent activity. |
| Runs | Every model request, route explanation, trace, cost, and latency. |
| Patterns | Repeated request shapes Punk discovered. |
| Artifacts | Proven optimized routes, evidence, promote/rollback. |
| Learning | Evidence, blockers, and confidence trajectory for repeated work. |
| Web | Compact page snapshots and token savings from web fetches. |
| Workflows | Built-in workflow creator, templates, runs, and node timelines. |
| Agents | Simple scheduled one-task runners built on workflows. |
| Chat | Conversation UI where every assistant reply is a real gateway run. |
| Governance | Policies, audit, users, API keys, MCP servers, credentials. |
| Billing | Plan, usage, quotas, spend, savings, and Stripe-backed upgrade flow when enabled. |
| Approvals | Human decisions for policy exceptions and artifact promotion. |

After the common setup, choose the path that matches your role.

Path A: Chat User Or Evaluator

Use this path if you want to see the runtime without wiring an app.

  1. Open http://localhost:4100/#/chat.
  2. Click NEW CHAT.
  3. Ask a concrete repeatable question, for example:
Classify this support ticket: Customer cannot reset password after SSO migration.
Return JSON with category and priority.
  4. Ask the same question again in a new chat.
  5. Look at the route and cost badge under each assistant reply.

What you should see:

  • The first reply is a real gateway run.
  • The repeated reply can route through exact_cache.
  • Each assistant message links back to the underlying run.
  • The run detail shows the route explanation, alternatives, policy verdict, cost, and trace events.

Turn a useful chat into a scheduled agent:

  1. Open the conversation.
  2. Click SAVE AS AGENT.
  3. Review the prefilled agent form.
  4. Add a cron schedule if needed, or leave it blank for on-demand runs.
  5. Click CREATE AGENT.
  6. Click RUN NOW.

What happened: the chat system prompt became the agent instructions, the last user message became the prompt template, and the agent was stored as a kind: "agent" workflow with a fixed start -> llm -> output graph.
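To make that mapping concrete, here is a minimal sketch of how a saved chat could translate into an agent-shaped workflow. Only kind: "agent" and the start -> llm -> output shape come from the description above; the field names (nodes, edges, config, instructions, promptTemplate, schedule) and the chatToAgent helper are assumptions made for illustration, not Punk's stored schema.

```typescript
// Hypothetical shapes for illustration only; Punk's real workflow schema may differ.
type NodeKind = "start" | "llm" | "output";

interface AgentWorkflow {
  kind: "agent";
  schedule?: string; // cron expression; left out for on-demand runs
  nodes: { id: string; kind: NodeKind; config?: Record<string, string> }[];
  edges: { from: string; to: string }[];
}

// Maps a saved chat onto the fixed start -> llm -> output graph.
function chatToAgent(systemPrompt: string, lastUserMessage: string, schedule?: string): AgentWorkflow {
  const workflow: AgentWorkflow = {
    kind: "agent",
    nodes: [
      { id: "start", kind: "start" },
      // The system prompt becomes the instructions; the last user message becomes the template.
      { id: "llm", kind: "llm", config: { instructions: systemPrompt, promptTemplate: lastUserMessage } },
      { id: "output", kind: "output" },
    ],
    edges: [
      { from: "start", to: "llm" },
      { from: "llm", to: "output" },
    ],
  };
  if (schedule) workflow.schedule = schedule;
  return workflow;
}

const agent = chatToAgent(
  "You triage support tickets.",
  "Classify this support ticket: Customer cannot reset password after SSO migration.",
  "0 9 * * 1-5",
);
console.log(JSON.stringify(agent, null, 2));
```

Leaving the third argument out models the on-demand case, where no cron schedule is stored.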

After you have a few repeated runs, open http://localhost:4100/#/learning to see which patterns were stable, what evidence exists, and why an optimization is or is not eligible.

Read next: Chat & Agents.

Path B: Workflow Builder

Use this path if you want to build multi-step agent jobs in the dashboard.

  1. Open http://localhost:4100/#/workflows.
  2. Start from the template gallery. Pick one:
| Template | Use it for |
| --- | --- |
| support-triage | Classify a ticket, branch on priority, optionally notify. |
| web-research | Fetch a URL as compact page context, summarize it with an LLM node. |
| pricing-monitor | Fetch pricing pages and extract structured plan data. |
  3. Click USE TEMPLATE.
  4. Open the created workflow.
  5. Click RUN and provide JSON input.

Example for support-triage:

{
  "ticket": {
    "subject": "SSO reset broken",
    "description": "The customer cannot reset a password after SSO migration."
  }
}

Example for web-research or pricing-monitor:

{
  "url": "https://example.com"
}
  6. Open the run timeline from the result.

What you should see:

  • Every workflow node emits workflow.node.started and workflow.node.completed or workflow.node.failed.
  • Every llm node is a real gateway run, so repeated node work can be cached, learned, and routed through proven optimized paths.
  • Workflow cost and savings roll up from child gateway runs.
  • The editor produces a graph, not generated code. Punk validates the graph and then interprets it.

To edit from scratch:

  1. Click NEW WORKFLOW.
  2. Add nodes from the palette.
  3. Connect output ports to target nodes.
  4. Configure nodes in the inspector.
  5. Click SAVE.
  6. Click RUN.

Useful node kinds:

| Node | Use it when |
| --- | --- |
| llm | You need a governed model call whose output can be cached and learned. |
| web_fetch | You need a URL turned into compact page context. |
| choice | You need branching. |
| tool_call | You need a registered MCP tool. |
| notify | You need an outbound webhook. |
| output | You are done and want to return a value. |
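Because a workflow is a validated graph rather than generated code, it can help to picture the kind of structural checks such a graph might go through. The sketch below is an illustration under assumptions: the Graph, GraphNode, and GraphEdge shapes and the validateGraph function are invented here, the start kind is taken from the start -> llm -> output description of agents, and Punk's actual validation rules are not documented on this page.

```typescript
// Hypothetical graph shape for illustration; not Punk's actual workflow IR.
type Kind = "start" | "llm" | "web_fetch" | "choice" | "tool_call" | "notify" | "output";
interface GraphNode { id: string; kind: Kind }
interface GraphEdge { from: string; to: string }
interface Graph { nodes: GraphNode[]; edges: GraphEdge[] }

// Returns a list of problems; an empty list means the graph passed these checks.
function validateGraph(graph: Graph): string[] {
  const problems: string[] = [];
  const ids = new Set(graph.nodes.map((n) => n.id));
  if (ids.size !== graph.nodes.length) problems.push("duplicate node id");

  const starts = graph.nodes.filter((n) => n.kind === "start");
  if (starts.length !== 1) problems.push("expected exactly one start node");
  if (!graph.nodes.some((n) => n.kind === "output")) problems.push("expected at least one output node");

  for (const e of graph.edges) {
    if (!ids.has(e.from) || !ids.has(e.to)) problems.push(`edge ${e.from} -> ${e.to} references a missing node`);
  }

  // Walk outward from the start node and flag anything that can never run.
  const reachable = new Set<string>();
  const queue = starts.map((n) => n.id);
  while (queue.length > 0) {
    const id = queue.shift()!;
    if (reachable.has(id)) continue;
    reachable.add(id);
    for (const e of graph.edges) if (e.from === id) queue.push(e.to);
  }
  for (const n of graph.nodes) {
    if (!reachable.has(n.id)) problems.push(`node ${n.id} is unreachable from start`);
  }
  return problems;
}

// A web-research style graph: fetch a page, summarize it, return the result.
const research: Graph = {
  nodes: [
    { id: "start", kind: "start" },
    { id: "fetch", kind: "web_fetch" },
    { id: "summarize", kind: "llm" },
    { id: "done", kind: "output" },
  ],
  edges: [
    { from: "start", to: "fetch" },
    { from: "fetch", to: "summarize" },
    { from: "summarize", to: "done" },
  ],
};
console.log(validateGraph(research)); // []
```

Returning a list of problems instead of throwing on the first one suits an editor, which can then surface every issue at once.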

Read next: Workflows and Governance.

Path C: App Developer

Use this path if you already have an agent or app calling a model provider.

OpenAI-Compatible Apps

Change only the base URL:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:4100/v1",
  apiKey: process.env.PUNK_API_KEY ?? "punk-local",
  defaultHeaders: {
    "X-Punk-App": "support-app",
    "X-Punk-Agent": "support-agent",
    "X-Punk-Subject": "user-123"
  }
});

Every existing client.chat.completions.create(...) call now flows through Punk. If the gateway is started with PUNK_API_KEY set, pass that same key as the SDK's apiKey; upstream provider keys stay on the Punk gateway, not in your app.

Anthropic-Compatible Apps

Punk also exposes an Anthropic-compatible endpoint:

POST http://localhost:4100/v1/messages

Use a claude-* model. With ANTHROPIC_API_KEY set on the gateway, Punk routes to the live Anthropic backend; otherwise the mock provider keeps local testing offline. With the official Anthropic SDK, set baseURL to http://localhost:4100 and authToken to PUNK_API_KEY when the gateway requires bearer auth.
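For a quick test without the Anthropic SDK, a plain fetch call is enough. This is a minimal sketch, assuming the endpoint accepts the standard Anthropic Messages body (model, max_tokens, messages) and a bearer token when auth is enabled. The model id and max_tokens value are example placeholders, and buildMessagesRequest and sendMessage are names invented for this sketch.

```typescript
// Builds (but does not send) a request for the Anthropic-compatible endpoint.
function buildMessagesRequest(prompt: string, token?: string) {
  const headers: Record<string, string> = { "content-type": "application/json" };
  // Only needed when the gateway requires bearer auth.
  if (token) headers["authorization"] = `Bearer ${token}`;
  return {
    url: "http://localhost:4100/v1/messages",
    init: {
      method: "POST",
      headers,
      body: JSON.stringify({
        model: "claude-3-5-sonnet-20241022", // example id; substitute any claude-* model
        max_tokens: 256,                     // placeholder value
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Sends the request; call this only with a gateway running locally.
async function sendMessage(prompt: string, token?: string): Promise<unknown> {
  const { url, init } = buildMessagesRequest(prompt, token);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`gateway returned ${res.status}`);
  return res.json();
}
```

Without ANTHROPIC_API_KEY on the gateway, the same call should be served by the mock provider, so it is safe to try offline.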

For Chorus evaluation, keep model: "punk/chorus" and set live_synthesis_model when you want to test a specific configured solver lane. The response still comes back through the same gateway wire.

Read The First Route Explanation

Send the same request twice. Each response includes:

| Header | Meaning |
| --- | --- |
| x-punk-run-id | The run id for trace lookup. |
| x-punk-route | The selected route, such as live, exact_cache, semantic_cache, artifact, hybrid_artifact, model_substitution, or blocked. |
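One way to watch the route change is to send the same request twice and print both headers. The sketch below assumes the OpenAI-compatible chat path is /v1/chat/completions (the path the OpenAI SDK appends to its base URL) and reuses the punk-local key and gpt-4o model from the examples in this guide. readPunkHeaders, sendOnce, and compareRoutes are hypothetical helper names.

```typescript
const GATEWAY = "http://localhost:4100";

// Pulls the two Punk headers off any object with a Headers-style get().
function readPunkHeaders(headers: { get(name: string): string | null }) {
  return { runId: headers.get("x-punk-run-id"), route: headers.get("x-punk-route") };
}

// Sends one chat completion through the gateway and returns its routing headers.
async function sendOnce(prompt: string, apiKey = "punk-local") {
  const res = await fetch(`${GATEWAY}/v1/chat/completions`, {
    method: "POST",
    headers: { "content-type": "application/json", authorization: `Bearer ${apiKey}` },
    body: JSON.stringify({ model: "gpt-4o", messages: [{ role: "user", content: prompt }] }),
  });
  return readPunkHeaders(res.headers);
}

// Requires a running gateway; not invoked automatically.
async function compareRoutes(prompt: string) {
  const first = await sendOnce(prompt);
  const second = await sendOnce(prompt);
  console.log("first:", first.route, "second:", second.route);
  return { first, second };
}
```

With the gateway running, call compareRoutes with a repeatable prompt and compare the two routes; the repeated request can come back through a cache route.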

Inspect the run, replacing <runId> with the x-punk-run-id value from the response:

curl -s http://localhost:4100/api/v1/runs/<runId> | jq .run.routeExplanation
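The same lookup can be done from application code. This sketch mirrors the curl command above; it assumes the response JSON nests the explanation under run.routeExplanation, as the jq filter implies, and that a bearer token is needed only in protected mode. runDetailUrl and routeExplanation are names chosen for this example.

```typescript
// Builds the run-detail URL used by the curl command above.
function runDetailUrl(runId: string, base = "http://localhost:4100"): string {
  return `${base}/api/v1/runs/${encodeURIComponent(runId)}`;
}

// Fetches a run and returns its routeExplanation field, like `jq .run.routeExplanation`.
async function routeExplanation(runId: string, token?: string): Promise<unknown> {
  const headers: Record<string, string> = {};
  if (token) headers["authorization"] = `Bearer ${token}`;
  const res = await fetch(runDetailUrl(runId), { headers });
  if (!res.ok) throw new Error(`run lookup failed: ${res.status}`);
  const body = (await res.json()) as { run?: { routeExplanation?: unknown } };
  return body.run?.routeExplanation;
}
```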

Add Tool Tracing When It Matters

The base-URL swap covers model traffic only. The SDK adds tool tracing, side-effect classification, tool-result caching, feedback, and web fetch helpers.

import { Punk } from "@punk/sdk";

const punk = new Punk({
  app: "support-app",
  agent: "support-agent",
  subject: "user-123"
});

const lookupAccount = punk.traceTool({
  name: "crm.lookupAccount",
  sideEffectLevel: 1,
  ttlSeconds: 300,
  execute: async (args: { accountId: string }) => crm.get(args.accountId)
});

const sendEmail = punk.traceTool({
  name: "email.send",
  sideEffectLevel: 3,
  execute: async (args: { to: string; body: string }) => mailer.send(args)
});

Side-effect rule of thumb:

| Level | Meaning | Punk behavior |
| --- | --- | --- |
| 0 | Pure computation | Cacheable and replayable. |
| 1 | Read-only external | Cacheable with TTL and replayable. |
| 2 | Reversible/idempotent write | Requires careful policy. |
| 3 | User-visible write | Not cached; suppressed in replay/shadow; policy-gated. |
| 4 | High-impact write | Live plus approval by default. |

Undeclared tools default to level 3.
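The rule of thumb can be read as a lookup with a conservative default. The sketch below encodes the levels as data. It is not part of @punk/sdk: LEVELS and behaviorFor are hypothetical names, and treating levels 2 and 4 as non-cacheable is an assumption, since the guidance above only states cacheability for levels 0, 1, and 3.

```typescript
type SideEffectLevel = 0 | 1 | 2 | 3 | 4;

interface LevelBehavior {
  meaning: string;
  cacheable: boolean;         // may a previous result be reused?
  approvalByDefault: boolean; // does the call need approval unless policy says otherwise?
}

const LEVELS: Record<SideEffectLevel, LevelBehavior> = {
  0: { meaning: "Pure computation", cacheable: true, approvalByDefault: false },
  1: { meaning: "Read-only external", cacheable: true, approvalByDefault: false },
  // Assumption: writes are never served from cache, even reversible ones.
  2: { meaning: "Reversible/idempotent write", cacheable: false, approvalByDefault: false },
  3: { meaning: "User-visible write", cacheable: false, approvalByDefault: false },
  4: { meaning: "High-impact write", cacheable: false, approvalByDefault: true },
};

// Undeclared tools are treated as level 3.
const UNDECLARED_DEFAULT: SideEffectLevel = 3;

function behaviorFor(level?: SideEffectLevel): LevelBehavior {
  return LEVELS[level ?? UNDECLARED_DEFAULT];
}

console.log(behaviorFor(1).cacheable); // true
console.log(behaviorFor().meaning);    // User-visible write
```

Defaulting to level 3 means a tool that forgets to declare its level is never cached, which is the safer failure mode.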

Close The Loop With Feedback

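// `punk` is the client created in the previous example; supply your own messages array.
const r = await punk.chat({ model: "gpt-4o", messages: [...] });
await punk.feedback(r.runId, 1); // positive signal
await punk.feedback(r.runId, -1, "correct answer here"); // negative signal with a correction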

Feedback affects pattern stability and artifact confidence.

Read next: SDK, API, and examples/.

Path D: Operator Or Admin

Use this path if you are preparing a real deployment or tenant.

  1. Protect gateway traffic. PUNK_API_KEY gates /v1/* and /api/v1/* bearer clients:
PUNK_API_KEY=replace-me bun run dev
  2. Bootstrap dashboard login for human admins. This is idempotent and never resets a changed password:
PUNK_ADMIN_EMAIL=admin@example.com \
PUNK_ADMIN_PASSWORD='replace-me' \
PUNK_REQUIRE_LOGIN=true \
bun run dev
  3. Set a real credentials-vault key before storing provider keys, MCP secrets, or OAuth tokens:
PUNK_ENCRYPTION_KEY="$(openssl rand -base64 32)" bun run dev
  4. Use the dashboard Governance section to configure the tenant:
| Object | Why |
| --- | --- |
| Organization | Rename the active org, invite users, review per-org roles. |
| API keys | Tenant/app-scoped bearer tokens. |
| Users | Human login sessions. |
| Provider keys | Tenant BYOK credentials for configured provider families. |
| MCP servers | External tools available to workflow tool_call nodes. |
| Credentials | Stored secrets referenced as cred:<id>. |
  5. Open http://localhost:4100/#/billing and decide quota posture:
  • Self-host or internal deployment: set PUNK_BILLING_DISABLED=true if all quotas should be unlimited.
  • Hosted billing: set Stripe env vars and let the Billing view create checkout/portal sessions.
  • Free/dev: leave Stripe unset; the local dashboard can still switch plans directly.
  6. For production storage, set Postgres:
PUNK_DATABASE_URL='postgres://...' bun run dev
  7. Pick the job runner shape:
  • Persistent API process: the embedded worker drains learning, webhook, retention, and scheduled workflow jobs.
  • Scale-out deployment: run additional workers against the same Postgres database:
bun run worker
  • Serverless/Vercel-style deployment: configure PUNK_CRON_SECRET and CRON_SECRET so /api/v1/internal/tick can drain jobs once per minute.
  8. For safer artifact rollout, enable canaries:
curl -X PUT http://localhost:4100/api/v1/settings \
  -H 'content-type: application/json' \
  -H 'authorization: Bearer replace-me' \
  -d '{ "key": "canary_enabled", "value": true }'

Also decide tenant settings for redaction, retention_days, semantic_cache, model_substitutions, model_substitution_enabled, webhook_url, and approval_exception_ttl_hours.
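If you script tenant setup, the curl command above generalizes to a small helper. This is a sketch that assumes every setting is written with the same PUT and { key, value } body shown for canary_enabled. The value shapes for the other keys are not specified in this guide, so check Configuration before setting them. settingRequest and putSetting are hypothetical names.

```typescript
// Builds the same PUT request as the curl command above for any settings key.
function settingRequest(key: string, value: unknown, token: string, base = "http://localhost:4100") {
  return {
    url: `${base}/api/v1/settings`,
    init: {
      method: "PUT",
      headers: { "content-type": "application/json", authorization: `Bearer ${token}` },
      body: JSON.stringify({ key, value }),
    },
  };
}

// Sends the request; call this only with a gateway running.
async function putSetting(key: string, value: unknown, token: string): Promise<void> {
  const { url, init } = settingRequest(key, value, token);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`setting ${key} failed: ${res.status}`);
}
```

For example, putSetting("canary_enabled", true, "replace-me") reproduces the curl call.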

Read next: Configuration, Governance, Accounts & Orgs, Billing & Usage, and Troubleshooting.

How Optimization Is Approved

Whether traffic comes from the gateway, chat, an agent, or a workflow, the promotion model is the same:

  1. Punk observes repeated request shapes.
  2. The learning loop groups them into patterns.
  3. Punk prepares candidate optimized routes when the task is stable enough.
  4. Evidence checks confirm the candidate is safe to serve.
  5. A human or configured policy promotes it when approval is required.
  6. Matching future traffic routes through the cheapest safe path.
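The steps above behave like a series of gates that a candidate route must clear in order. The sketch below models that ordering only. The Candidate fields, the Stage names, and stageOf are invented for illustration and do not reflect Punk's internal data model.

```typescript
// Hypothetical candidate record; field names are invented for this sketch.
interface Candidate {
  stable: boolean;           // step 3: the task is stable enough
  evidencePassed: boolean;   // step 4: evidence checks confirmed it is safe to serve
  approvalRequired: boolean; // step 5: whether a promotion decision is needed
  approved: boolean;         // step 5: a human or configured policy promoted it
}

type Stage = "observing" | "needs_evidence" | "awaiting_approval" | "promoted";

// Returns the first gate the candidate has not yet cleared.
function stageOf(c: Candidate): Stage {
  if (!c.stable) return "observing";
  if (!c.evidencePassed) return "needs_evidence";
  if (c.approvalRequired && !c.approved) return "awaiting_approval";
  return "promoted";
}

console.log(stageOf({ stable: true, evidencePassed: true, approvalRequired: true, approved: false }));
// awaiting_approval
```

Only a candidate that reaches the final stage would serve matching traffic in step 6; anything earlier keeps routing live.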

Force a learning pass when you want to inspect current evidence:

curl -X POST http://localhost:4100/api/v1/learning/tick

When the gateway requires bearer auth (PUNK_API_KEY is set), add the header Authorization: Bearer <token>.

Environment Quick Reference

| Area | Variables | Effect |
| --- | --- | --- |
| Provider | OPENAI_API_KEY / OPENAI_BASE_URL | Live OpenAI provider. |
| Provider | ANTHROPIC_API_KEY / ANTHROPIC_BASE_URL | Live Anthropic Messages provider. |
| Provider | OPENROUTER_API_KEY / OPENROUTER_BASE_URL | Live OpenRouter portfolio provider for DeepSeek, Kimi/Moonshot, Google, Anthropic, OpenAI, and other routed model slugs. |
| Provider | DEEPSEEK_API_KEY / DEEPSEEK_BASE_URL | Direct DeepSeek provider. |
| Provider | MOONSHOT_API_KEY or KIMI_API_KEY | Direct Moonshot/Kimi provider. |
| Provider | PUNK_PROVIDER=mock | Force the offline provider. |
| Storage | PUNK_DATABASE_URL | Postgres/Neon-compatible storage. |
| Storage | PUNK_DB_PATH | SQLite path, default data/punk.db. |
| Auth | PUNK_API_KEY | Require bearer auth for protected routes and gateway calls. |
| Auth | PUNK_ADMIN_EMAIL / PUNK_ADMIN_PASSWORD / PUNK_REQUIRE_LOGIN=true | Bootstrap dashboard login and force login mode. |
| Identity | PUNK_ALLOW_PUBLIC_SIGNUP / PUNK_APP_BASE_URL | Enable public signup and set invite/verification links. |
| Email | RESEND_API_KEY / PUNK_EMAIL_FROM | Send real invite/verification email instead of console logs. |
| Secrets | PUNK_ENCRYPTION_KEY | 32-byte base64 key for stored credentials. |
| Learning | PUNK_AUTO_PROMOTE / PUNK_LEARN_INTERVAL_MS | Hands-free promotion and learning cadence. |
| Workers | PUNK_WORKER_POLL_MS / PUNK_WORKER_CONCURRENCY | Embedded or standalone worker polling and concurrency. |
| Serverless | PUNK_CRON_SECRET / CRON_SECRET | Enable the secret-gated tick endpoint for scheduled jobs on Vercel-style deployments. |
| Retention | PUNK_RETENTION_DAYS | Trace/audit retention sweep window. |
| Safety | PUNK_ALLOW_PRIVATE_WEB_FETCH / PUNK_ALLOW_PRIVATE_WEBHOOKS | Explicit private-network escape hatches; leave off in production unless intended. |
| Billing | PUNK_BILLING_DISABLED / STRIPE_SECRET_KEY / STRIPE_WEBHOOK_SECRET / STRIPE_PRICE_* | Disable quotas for self-hosting or enable Stripe billing. |
| Governance | PUNK_POLICIES_DIR | Optional policy directory. |
| Public site | PUNK_MARKETING_HOST / PUNK_MEET_HOST / PUNK_APP_HOST / PUNK_DOCS_DIR | Serve the marketing splash or product deck by host, redirect /app, or override docs source. |
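Since PUNK_ENCRYPTION_KEY has to be a 32-byte base64 value, a small pre-flight check can catch a truncated or mistyped key before anything is stored. The sketch below uses Node-compatible modules that Bun also provides. generateEncryptionKey and isValidEncryptionKey are hypothetical helpers, and how Punk itself validates the key is not covered in this guide.

```typescript
import { randomBytes } from "node:crypto";
import { Buffer } from "node:buffer";

// Generates a fresh 32-byte key encoded as base64, like `openssl rand -base64 32`.
function generateEncryptionKey(): string {
  return randomBytes(32).toString("base64");
}

// True when the value decodes to exactly 32 bytes. Buffer's base64 decoder is
// lenient about stray characters, so treat this as a sanity check only.
function isValidEncryptionKey(value: string | undefined): boolean {
  if (!value) return false;
  return Buffer.from(value, "base64").length === 32;
}

console.log(isValidEncryptionKey(generateEncryptionKey())); // true
console.log(isValidEncryptionKey("too-short"));             // false
```

At startup you could pass process.env.PUNK_ENCRYPTION_KEY to isValidEncryptionKey and exit early when it returns false.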
  • Docs Home: reader paths and full docs map.
  • Onboarding Guide: extended pilot and rollout plan from workflow selection to production operation.
  • Chat & Agents: chat economics, save-as-agent, scheduled task agents.
  • Workflows: workflow creator, IR, templates, scheduling, MCP tools.
  • SDK: TypeScript client reference.
  • API: HTTP routes, auth, identity headers, response conventions.
  • Configuration: env vars, auth modes, database choices, tenant settings.
  • Governance: policies, trust, approvals, audit, observe mode.
  • Accounts & Orgs: users, orgs, invitations, public signup, email.
  • Billing & Usage: plans, quotas, usage metering, Stripe.