Docs/services-architecture

Unified Service Layer — Architecture & Backend Integration Guide

Audience: Clawvard backend engineers.

Scope: This layer is for composed workflows, long-running jobs, and external-API integrations that need to live in the Clawvard credit loop. It is NOT for LLM / STT / TTS / image-gen passthroughs — those remain on Model Service (new-api at token.clawvard.school, sk-xxx keys, billed by token-usage cron sync). Do not register LLM passthroughs here.

What belongs here: video editing pipelines, multi-step agent workflows, Shotstack/Mux/Replicate/Runway integrations, anything gated by course purchase, anything that needs fixed-cost billing or progress polling.

What belongs in the Model Service: anything that looks like "send a request to an OpenAI-compatible endpoint and get an answer."

After reading this you should be able to add a new service in ~30 minutes (proxy) or 2-4 hours (job).


1. Mental model

┌─────────────────────────────────────────────────────┐
│                 @clawvard/sdk                       │
│   cv.video.removeSilence()   cv.workflow.run(...)   │
└────────────────────────┬────────────────────────────┘
                         │ HTTPS (Bearer sk-xxx)
                         ▼
┌─────────────────────────────────────────────────────┐
│        POST /api/services/invoke/{group}/{method}   │
│   ─ resolveUser (cookie or bearer key)              │
│   ─ dispatcher.dispatch()                           │
└────────────────────────┬────────────────────────────┘
                         ▼
┌─────────────────────────────────────────────────────┐
│          src/lib/services/dispatcher.ts             │
│   1. Look up service in registry by id              │
│   2. Validate input (optional hook)                 │
│   3. Check + charge credits (fixed-cost)            │
│   4. Route to runtime:                              │
│       ├─ proxy: fetch third-party API → JSON        │
│       └─ job:   insert row → kick worker → 202      │
└────────────────────────┬────────────────────────────┘
          ┌──────────────┴──────────────┐
          ▼                             ▼
   ┌───────────────────┐      ┌─────────────────┐
   │ Third-party API   │      │  service_jobs   │
   │ (Shotstack, Mux,  │      │  + worker queue │
   │  Replicate, …)    │      │                 │
   └───────────────────┘      └─────────────────┘

One registry → one invocation URL → one SDK method per service. Adding a service is 3 lines in src/lib/services/registry.ts for proxy services, or 3 lines + a handler function for job services.

Not here: LLM / STT / TTS / image-gen passthroughs. Those go through the Model Service (https://token.clawvard.school, sk-xxx keys, per-token billing). The unified layer is strictly fixed-cost.


2. File layout

| File | Purpose |
| --- | --- |
| src/lib/services/types.ts | ServiceDefinition, ServiceMeta, ProxyHandler, JobHandler, runtime contexts |
| src/lib/services/registry.ts | The only file you edit when adding a service. Appends to SERVICES[] |
| src/lib/services/dispatcher.ts | Lookup / validate / charge / route. Shared by every invocation |
| src/app/api/services/invoke/[group]/[method]/route.ts | HTTP entry point. Never edit |
| src/app/api/services/catalog/route.ts | GET /api/services/catalog — public listing |
| src/app/api/services/jobs/[jobId]/route.ts | GET /api/services/jobs/:id — poll endpoint |
| packages/sdk/src/index.ts | @clawvard/sdk — typed client for all services |
| docs/services-architecture.md | You are here |

3. Anatomy of a service

interface ServiceDefinition {
  meta: {
    id: string;                // "video.remove-silence"
    group: string;             // "video"
    method: string;            // "removeSilence"  (camelCase in SDK)
    summary: string;           // one-liner for catalog
    description?: string;      // longer markdown
    costCredits: number;       // fixed price per invocation. 0 = free.
    approxCostHintUsd?: number;
    access?: {
      beta?: boolean;
      requiresCoursePurchase?: string;    // "course-id"
      minAccountTier?: "free" | "pro" | "enterprise";
    };
  };
  handler: ProxyHandler | JobHandler;
  validateInput?: (input: unknown) => string | null;
}
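The optional validateInput hook returns null to accept, or a short error string that the dispatcher surfaces as a 400 before any credits are charged. A minimal sketch (the guard shape below is hypothetical, not the real video.render validator):

```typescript
// Hypothetical validator for a video.render-style input. Return null to accept,
// or a short message the dispatcher turns into a 400 (no charge happens first).
function validateVideoRenderInput(input: unknown): string | null {
  const i = input as { clips?: Array<{ asset?: { src?: string }; length?: number }> };
  if (!Array.isArray(i?.clips) || i.clips.length === 0) {
    return "clips must be a non-empty array";
  }
  if (i.clips.some((c) => typeof c?.length !== "number" || c.length <= 0)) {
    return "every clip needs a positive length in seconds";
  }
  return null;
}
```

Wire it up via the validateInput field on the ServiceDefinition.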

id / group / method relationship

| id | group | method | SDK call | HTTP path |
| --- | --- | --- | --- | --- |
| video.render | video | render | cv.video.render(...) | POST /api/services/invoke/video/render |
| video.remove-silence | video | removeSilence | cv.video.removeSilence(...) | POST /api/services/invoke/video/remove-silence |
| workflow.podcast2blog | workflow | podcast2blog | cv.workflow.run("workflow.podcast2blog", ...) | POST /api/services/invoke/workflow/podcast2blog |

Rules:

  • id is kebab-case, dot-separated, globally unique.
  • group = first segment of id.
  • method is the SDK camelCase form; the URL path segment is the kebab-case tail of the id (removeSilence in the SDK, remove-silence in the URL).
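The camelCase-to-kebab-case relationship shown in the table boils down to a one-liner. Illustrative only: the registry stores id, group, and method explicitly rather than deriving one from the other.

```typescript
// Maps the SDK's camelCase method name to the kebab-case URL segment
// (the tail of the service id). Sketch, not production code.
function methodToUrlSegment(method: string): string {
  return method.replace(/[A-Z]/g, (ch) => `-${ch.toLowerCase()}`);
}

methodToUrlSegment("removeSilence"); // → "remove-silence"
```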

4. Adding a PROXY service (30 min task)

Proxy services forward a request to a third-party API (Shotstack, Mux, Replicate, Runway, ElevenLabs direct, etc.). The dispatcher charges costCredits up-front and refunds on upstream failure (4xx/5xx or network error) — our credits are the user-facing price; the third-party COGS is ours to manage.

Each proxy handler carries its own upstream URL + auth, so adding a new provider does not require touching dispatcher code.

Example: Shotstack video render

// in src/lib/services/registry.ts, inside SERVICES[]
{
  meta: {
    id: "video.render",
    group: "video",
    method: "render",
    summary: "Render a timeline to MP4 via Shotstack.",
    costCredits: 50,
    approxCostHintUsd: 0.05,
  },
  handler: {
    kind: "proxy",
    upstreamPath: "/render",
    upstreamBaseUrlEnv: "SHOTSTACK_BASE_URL",        // e.g. https://api.shotstack.io/v1
    upstreamAuth: {
      type: "header",
      headerName: "x-api-key",
      envVar: "SHOTSTACK_API_KEY",
    },
    transformRequest: (input) => ({
      timeline: input,
      output: { format: "mp4" },
    }),
    transformResponse: (res) => ({ renderId: (res as { id: string }).id }),
  },
},

Then add the SDK method:

// in packages/sdk/src/index.ts
class VideoNamespace {
  constructor(private readonly c: HttpClient) {}
  render(input: VideoRenderInput): Promise<VideoRenderOutput> {
    return this.c.invoke("video", "render", input);
  }
}

Add the input/output types at the bottom of index.ts next to the existing ones. No route file edits, no migration.
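For reference, a hypothetical input/output pair matching the registry entry above; the real shapes live at the bottom of packages/sdk/src/index.ts and may differ.

```typescript
// Hypothetical shapes: adjust to match the upstream Shotstack timeline schema.
export interface VideoRenderInput {
  clips: Array<{ asset: { type: "video" | "image"; src: string }; length: number }>;
}

export interface VideoRenderOutput {
  renderId: string; // produced by transformResponse in the registry entry
}
```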

Upstream URL + auth

Every proxy handler can configure two things that used to be global:

| Field | Meaning |
| --- | --- |
| upstreamPath | Absolute URL OR relative path — relative paths are joined to process.env[upstreamBaseUrlEnv] |
| upstreamBaseUrlEnv | Env var name that holds the base URL. Required when upstreamPath is relative |
| upstreamAuth.type: "bearer" | Adds Authorization: Bearer $env |
| upstreamAuth.type: "header" | Adds $headerName: $env (use for x-api-key etc.) |
| upstreamAuth.envVar | Env var name holding the secret |

Each third-party integration gets its own env vars — don't share secrets across providers.
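Putting the table together, the dispatcher's URL and auth resolution can be approximated like this. This is a sketch; the actual code in dispatcher.ts may differ in details.

```typescript
// Sketch of resolving a proxy handler's upstream URL + auth headers from env.
// Field names match the table above; error handling is omitted for brevity.
function buildUpstream(handler: {
  upstreamPath: string;
  upstreamBaseUrlEnv?: string;
  upstreamAuth?: { type: "bearer" | "header"; headerName?: string; envVar: string };
}): { url: string; headers: Record<string, string> } {
  const isAbsolute = handler.upstreamPath.startsWith("http");
  const base = handler.upstreamBaseUrlEnv ? process.env[handler.upstreamBaseUrlEnv] ?? "" : "";
  const url = isAbsolute ? handler.upstreamPath : `${base}${handler.upstreamPath}`;

  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (handler.upstreamAuth) {
    const secret = process.env[handler.upstreamAuth.envVar] ?? "";
    if (handler.upstreamAuth.type === "bearer") {
      headers["Authorization"] = `Bearer ${secret}`;
    } else {
      headers[handler.upstreamAuth.headerName ?? "x-api-key"] = secret;
    }
  }
  return { url, headers };
}
```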

Request/response transforms

Optional transformRequest / transformResponse hooks let you map between the public SDK shape and whatever the upstream wants. Keep these pure — no side effects, no I/O.

When NOT to use a proxy

If the work takes > 10 seconds, use a job handler instead — proxy handlers hold the HTTP connection open until the upstream responds, and Vercel Fluid Compute caps at 300 seconds. Anything video-transcode grade should be a job (start → poll for result).


5. Adding a JOB service (2-4 hour task)

Job services run > 10 seconds (video processing, multi-step workflows, etc.). The dispatcher charges costCredits up-front, stores a row in service_jobs, kicks your execute function, and refunds automatically on failure.

Required: the service_jobs table. Its migration, supabase/migrations/20260425000001_service_jobs.sql, is already applied. The dispatcher reads/writes it via the persistence helpers at the bottom of dispatcher.ts (no TODO stubs left).

Example: add the video remove-silence service

// in src/lib/services/registry.ts, inside SERVICES[]
{
  meta: {
    id: "video.remove-silence",
    group: "video",
    method: "removeSilence",
    summary: "Auto-detect and cut silent sections out of a video clip.",
    costCredits: 20,
    access: { beta: true },
  },
  handler: {
    kind: "job",
    timeoutSec: 300,
    execute: async (input, ctx) => {
      const { inputUrl, silenceThresholdDb = -40, minSilenceMs = 500 } =
        input as VideoRemoveSilenceInput;

      await ctx.updateProgress?.(0.05, "downloading");
      const localPath = await downloadToScratch(inputUrl);

      await ctx.updateProgress?.(0.25, "transcribing");
      // If you need transcription, hit the Model Service directly with a
      // server-side sk-xxx key (OpenAI-compatible /v1/audio/transcriptions).
      // The Model Service handles token-usage billing itself — do not charge
      // again here.
      const transcript = await transcribeViaTokenRelay(localPath);

      await ctx.updateProgress?.(0.6, "detecting silence");
      const windows = detectSilenceWindows(transcript, silenceThresholdDb, minSilenceMs);

      await ctx.updateProgress?.(0.85, "cutting with ffmpeg");
      const outputUrl = await runFfmpegCut(localPath, windows);

      return {
        outputUrl,
        cutSeconds: sumWindows(windows),
        segmentsRemoved: windows.length,
      };
    },
  },
},

Job execution environment

Vercel Fluid Compute gives you up to 300 seconds (maxDuration = 300 on the invoke route). Beyond that, move the actual compute to one of:

| Option | When |
| --- | --- |
| Vercel Queues (beta) | Any async work > 5 min; persistent, native fit |
| Modal / RunPod | GPU-bound work (video transcode, Stable Video Diffusion) |
| Shotstack / Mux / Descript API | You don't own the pipeline — proxy to a video SaaS and pass the cost through |

The handler's job is to start the work and return — long-running code should live in a worker. The dispatcher calls execute inside a waitUntil() so the HTTP response returns 202 immediately. If execute runs past 300s, the platform kills the function instance, but work handed off to a queue worker continues.
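The start-and-return shape looks roughly like this. In the sketch, waitUntil is injected to keep it platform-agnostic (on Vercel it comes from @vercel/functions), and the success/failure hooks stand in for the dispatcher's real persistence helpers.

```typescript
// Sketch only: respond with the 202 body immediately, let the platform keep
// the execute() promise alive in the background.
type WaitUntil = (p: Promise<unknown>) => void;

function kickJob(
  jobId: string,
  execute: () => Promise<unknown>,
  waitUntil: WaitUntil,
  hooks: {
    onSuccess: (jobId: string, output: unknown) => void; // hypothetical persistence hook
    onFailure: (jobId: string, err: unknown) => void;    // hypothetical refund + status hook
  },
): { jobId: string; status: "pending" } {
  waitUntil(
    execute()
      .then((output) => hooks.onSuccess(jobId, output))
      .catch((err) => hooks.onFailure(jobId, err)),
  );
  // The caller sends this body with HTTP 202; the promise keeps running.
  return { jobId, status: "pending" };
}
```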

Progress reporting

Inside execute, call ctx.updateProgress?.(pct, note) to persist a status update. The SDK surfaces this via .onProgress(cb):

const result = await cv.video.removeSilence({ inputUrl })
  .onProgress((pct, note) => console.log(`${(pct * 100).toFixed(0)}% — ${note}`))
  .wait();

Failure & refund

Throw from execute to fail the job. The dispatcher:

  1. Sets status = 'failed', stores error_message.
  2. Refunds costCredits automatically (via grantCredits with a ${serviceId}.refund type — shows up in the user's transaction history as "Refund for failed ...").

Don't catch errors and return them — just throw. Idempotent refund depends on the dispatcher seeing the exception.


6. Credits integration

Every service in this layer is fixed-price. The dispatcher:

  1. Reads balance via getBalance(userEmail).
  2. If insufficient → 402 insufficient_credits — no work done, no charge.
  3. Calls spendCredits(user, costCredits, serviceId, invocationId, desc).
  4. Runs the proxy or job. On failure (upstream 4xx/5xx, network error, or a thrown job handler), the dispatcher refunds automatically via grantCredits with a ${serviceId}.refund type.

Free services set costCredits: 0 and skip the charge/refund dance.

Why no pay-as-you-go here? Per-token / per-second billing is the Model Service's job (sk-xxx keys, per-token billing). Keeping this layer fixed-price means SDK users can predict cost up-front and we don't duplicate the metering infra.

Adding a new credit-transaction type

If your service should show up with a distinct label in the user's credit history, add it to TYPE_LABELS in src/components/dashboard/CreditsContent.tsx:

"video_remove_silence": { en: "Remove Silence", zh: "去静音" },

7. Authentication

The invoke endpoint accepts two auth forms via resolveUser(request):

  1. Session cookie — for in-dashboard / web calls. No explicit headers; Supabase session cookie is read automatically.
  2. Bearer API key, sent as Authorization: Bearer sk-xxx. The key is created at /token-relay and scoped to a specific user.

Inside the dispatcher, ctx.userEmail is always populated. Services that need to know the caller identity read it from there.

Rate limiting is not done inside the dispatcher — add it at the route layer (POST /api/services/invoke/[group]/[method]/route.ts) if a specific service needs tighter limits than the global Model Service per-key throttle.
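If you do add a per-service limit, a fixed-window sketch keyed on `${userEmail}:${serviceId}` could look like this. The real helper (checkRateLimitAsync) lives in the codebase and likely differs; the in-memory Map here only illustrates the keying.

```typescript
// Hypothetical fixed-window rate limiter, keyed per user + service.
// In-memory only: a real deployment would back this with Redis or similar.
const windows = new Map<string, { count: number; resetAt: number }>();

function allowRequest(
  userEmail: string,
  serviceId: string,
  limit = 10,
  windowMs = 60_000,
): boolean {
  const key = `${userEmail}:${serviceId}`;
  const now = Date.now();
  const w = windows.get(key);
  if (!w || now >= w.resetAt) {
    windows.set(key, { count: 1, resetAt: now + windowMs });
    return true;
  }
  if (w.count >= limit) return false;
  w.count += 1;
  return true;
}
```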


8. Course integration (next step, not in foundation yet)

The eventual hook:

// pseudo, pending course module
{
  meta: {
    id: "video.remove-silence",
    access: {
      requiresCoursePurchase: "new-media-editing-101",
    },
  },
  // ...
}

The dispatcher, before charging, checks course_enrollments for this user × course. Not purchased → 402 course_not_purchased with the course ID in the hint so the UI can upsell.

This is a TODO for the course module and isn't implemented in the foundation. Don't add the check to the dispatcher yet — the course team will wire it when the enrollment flow is ready.


9. Catalog / marketplace UI

GET /api/services/catalog

Returns every service minus the handler impl. The marketplace tile UI consumes this to render cards — service name, one-liner, credit cost, beta badge. Frontend team: see src/components/dashboard/ModelsMarketplace.tsx for how the existing LLM marketplace is built; the new unified marketplace will replace it.


10. Testing

Proxy service

# Assumes $API_KEY exported
curl -X POST https://clawvard.school/api/services/invoke/video/render \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "clips": [{ "asset": { "type": "video", "src": "..." }, "length": 10 }] }'
# => { "renderId": "..." }

Job service

# Start
curl -X POST https://clawvard.school/api/services/invoke/video/remove-silence \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "inputUrl": "https://..." }'
# => { "jobId": "...", "status": "pending", "pollUrl": "/api/services/jobs/..." }

# Poll
curl -H "Authorization: Bearer $API_KEY" \
  https://clawvard.school/api/services/jobs/<jobId>
# => { "status": "running", "progress": { "pct": 0.6, "note": "..." } }

SDK

import { Clawvard } from "@clawvard/sdk";
const cv = new Clawvard({ apiKey: "sk-xxx" });

// Proxy — round-trip JSON, credits charged once
const { renderId } = await cv.video.render({ clips: [/* ... */] });

// Job — auto-polls, surfaces progress, throws on failure
const result = await cv.video.removeSilence({ inputUrl: "..." })
  .onProgress((pct, note) => console.log(pct, note))
  .wait();

For LLM / STT / TTS / image-gen, don't call this layer — point the OpenAI SDK at https://token.clawvard.school with a sk-xxx key minted from /token-relay.


11. Checklist: adding a new service

  • Decide runtime: proxy (stateless, upstream bills) vs job (fixed cost, DB row)
  • Append ServiceDefinition to SERVICES in src/lib/services/registry.ts
  • Add SDK method + input/output types in packages/sdk/src/index.ts
  • (Job only) Implement execute. Call ctx.updateProgress?.() at milestones.
  • (Job only) Confirm the service_jobs migration is applied in your environment; the persistence helpers in dispatcher.ts and jobs/[jobId]/route.ts already run real SQL
  • (New credit type) Add label to TYPE_LABELS in CreditsContent.tsx
  • pnpm tsc --noEmit clean
  • curl test: both the success path and the insufficient-credits path
  • Update packages/sdk version + publish

12. FAQ

Q: Can one service call another? A: Yes. Import dispatch from @/lib/services/dispatcher and call it from your execute function. The inner call's credits are billed to the same user. Keep recursion shallow — workflow depth > 3 gets hard to reason about.

Q: How do I add per-service rate limiting? A: At the route layer. Wrap the dispatch call with checkRateLimitAsync keyed on ${userEmail}:${serviceId}.

Q: What about streaming job output (not just progress)? A: Out of scope. The SDK's .onProgress() hook gives you percentage + free-form note string which covers the long-running job case. If you need real token-by-token streaming, that's an LLM use-case and belongs on the Model Service — point the OpenAI SDK at token.clawvard.school and use the native streaming there.

Q: Can I version a service? A: Use the id suffix: video.render, video.render-v2. The SDK exposes both as separate methods. Mark the old one deprecated: true in meta and emit a warning in the handler.

Q: Why isn't LLM / STT / TTS / image-gen here? A: It's on the Clawvard Model Service (https://token.clawvard.school, sk-xxx keys). That layer handles per-token billing via cron quota sync — a model this layer is not built for. Keeping the two separate lets users swap OpenAI-SDK-compatible clients freely against Token Relay while this layer focuses on composed / long-running / third-party API work.


13. Pricing & usage APIs

Every service in this layer is fixed-price — the number in meta.costCredits is the authoritative price per successful call. A ¥ / $ equivalent is derived automatically using the conversion rates in src/lib/credit-pricing.ts:

| Unit | Rate | Example (costCredits: 50) |
| --- | --- | --- |
| credits | 1:1 | 50 credits |
| ¥ (CNY) | 10 credits = ¥1 | ≈ ¥5.00 |
| $ (USD) | 69 credits = $1 | ≈ $0.72 |

If you need to change a price, edit costCredits on the registry entry — the catalog + SDK pick up the new ¥/$ display automatically.
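The conversion implied by the rates (10 credits = ¥1, 69 credits = $1) boils down to the following. This is a sketch; the authoritative rates and rounding live in src/lib/credit-pricing.ts.

```typescript
// Derives the display prices from a fixed credit cost, rounded to 2 decimals.
function pricingFor(costCredits: number): { credits: number; cny: number; usd: number } {
  return {
    credits: costCredits,
    cny: Math.round((costCredits / 10) * 100) / 100, // 10 credits = ¥1
    usd: Math.round((costCredits / 69) * 100) / 100, // 69 credits = $1
  };
}

pricingFor(50); // → { credits: 50, cny: 5, usd: 0.72 }
```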

13.1 Catalog — GET /api/services/catalog

Public, no auth. Returns every service with pricing pre-computed:

curl https://clawvard.school/api/services/catalog
{
  "version": 1,
  "services": [
    {
      "id": "video.render",
      "group": "video",
      "method": "render",
      "summary": "Render a timeline to MP4 via Shotstack.",
      "costCredits": 50,
      "runtime": "proxy",
      "pricing": { "credits": 50, "cny": 5.00, "usd": 0.5 },
      "access": { "beta": true }
    }
  ]
}
// SDK — same data, typed
const services = await cv.catalog();
const render = services.find((s) => s.id === "video.render");
console.log(`${render?.pricing.credits} cr · ¥${render?.pricing.cny}`);

13.2 User usage — GET /api/services/usage

Auth required. Returns a totals block + one row per registered service. Services the user has never called show zero stats so the marketplace can render a complete "you vs. catalog" table.

curl -H "Authorization: Bearer $API_KEY" \
  https://clawvard.school/api/services/usage
{
  "version": 1,
  "totals": { "calls": 42, "refunded": 1, "netCalls": 41, "creditsSpent": 2050 },
  "services": [
    {
      "serviceId": "video.render",
      "calls": 40,
      "refunded": 1,
      "netCalls": 39,
      "creditsSpent": 1950,
      "lastCalledAt": "2026-04-24T08:31:20Z"
    },
    {
      "serviceId": "video.remove-silence",
      "calls": 2,
      "refunded": 0,
      "netCalls": 2,
      "creditsSpent": 100,
      "lastCalledAt": "2026-04-23T19:02:11Z"
    }
  ]
}
const { totals, services } = await cv.usage();
console.log(`You've spent ${totals.creditsSpent} cr this lifetime.`);

13.3 Per-service usage — GET /api/services/usage/{group}/{method}

Auth required. Single-service summary plus, when history=1, the N most recent invocation rows (useful for a "recent calls" drawer).

curl -H "Authorization: Bearer $API_KEY" \
  "https://clawvard.school/api/services/usage/video/render?history=1&limit=10"
{
  "version": 1,
  "serviceId": "video.render",
  "summary": { "serviceId": "video.render", "calls": 40, "refunded": 1, "netCalls": 39, "creditsSpent": 1950, "lastCalledAt": "..." },
  "history": [
    { "invocationId": "b7…", "creditsSpent": 50, "createdAt": "2026-04-24T08:31:20Z", "refunded": false },
    { "invocationId": "2c…", "creditsSpent": 50, "createdAt": "2026-04-24T08:15:02Z", "refunded": true }
  ]
}
const report = await cv.usageFor("video", "render", { includeHistory: true, limit: 10 });
for (const call of report.history ?? []) {
  console.log(call.invocationId, call.refunded ? "(refunded)" : "");
}

13.4 Where the numbers come from

No new table is introduced — usage is derived from credit_transactions:

| Query | Meaning |
| --- | --- |
| WHERE type = 'video.render' | every invocation (incl. later refunded) |
| WHERE type = 'video.render.refund' | refunds only |
| reference_id pair: <invocationId> / refund-<invocationId> | links the two |

This means zero ops cost for usage tracking — the dispatcher's existing spendCredits / grantCredits calls are already the source of truth. If you later need per-call duration or input/output size, add them as a separate service_invocations table; don't cram them into credit_transactions.
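In TypeScript terms, the per-service derivation from credit_transactions rows looks roughly like this. The row shape is assumed for the sketch; the real aggregation runs as SQL.

```typescript
// Derives one usage summary row from raw credit_transactions rows.
interface CreditTx { type: string; amount: number; created_at: string }

function usageFor(serviceId: string, rows: CreditTx[]) {
  const calls = rows.filter((r) => r.type === serviceId);
  const refunds = rows.filter((r) => r.type === `${serviceId}.refund`);
  const spent =
    calls.reduce((s, r) => s + r.amount, 0) -
    refunds.reduce((s, r) => s + r.amount, 0); // net of refunds
  const dates = calls.map((r) => r.created_at).sort();
  return {
    serviceId,
    calls: calls.length,
    refunded: refunds.length,
    netCalls: calls.length - refunds.length,
    creditsSpent: spent,
    lastCalledAt: dates.length ? dates[dates.length - 1] : null,
  };
}
```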

13.5 Changing prices safely

  1. Edit costCredits in src/lib/services/registry.ts.
  2. Ship. The catalog + SDK reflect the new price immediately.
  3. Past invocations keep their historical price (they're rows in credit_transactions with the amount that was actually charged).
  4. Users who had an in-flight job at deploy time: the job was charged at the old price and will refund at the old price if it fails.