Case study · anonymized | Social-video agency | Outcome · one caption pipeline

One caption workflow, every client clip captioned the same way

A social-video agency delivers short-form clips for many clients at once — each with its own brand caption style. Captioning was the inconsistent, manual step that ate junior-editor hours and produced off-brand variance. They standardized it on the ZapCap API. Names are anonymized; identifying figures are ranged.

  • per-client · A caption template per brand · Consistent every delivery
  • days → hours · Captioning turnaround · Per client batch
  • 0 · Render workers maintained · No ffmpeg / GPU queue in-house
  • per-min · Usage-based billing · Pass-through per client

01 The bottleneck

The agency ships a steady volume of short-form clips across a roster of clients, and every client has its own caption look — font, placement, color, the lot. Captioning was where consistency went to die: a manual editor step repeated per clip, per client.

Every junior editor captioned slightly differently. The same client could get three subtly off-brand looks in a week, revisions bounced back over caption styling, and the captioning step quietly consumed more hours of work that could not be billed than any other step in the studio.

How captioning worked before
  • Each clip captioned by hand, per client, per editor
  • Brand caption style re-created from memory each time
  • Revisions bounced back over styling inconsistencies
  • Captions sometimes overran the 9:16 safe zone
What was needed to standardize
  • One locked caption template per client brand
  • Consistent output regardless of which editor delivers
  • Async batch delivery wired into the delivery tool
  • Per-client usage tracking for pass-through billing
  • Captioning off the critical path for junior editors

02 The ZapCap workflow

Captioning moved out of the editor and into the agency's delivery tool. Each client is mapped to one locked caption template, so the editor just drops clips into the client folder and the tool handles the rest.

For each clip the tool POSTs the source URL to /videos/url, creates a task on the returned video with that client's fixed templateId, and stores the taskId against the delivery. The webhook handler verifies x-signature, dedupes on eventId, and files each finished renderUrl back into the client's delivery batch.

Deliverables ship as burned-in .mp4 (h264) by default. When a client needs captions as a separate layer, the same task requests the transparent (prores4444 / vp9 alpha) output instead.

Delivery render path
editor drops clip → POST /videos/url → POST /videos/:id/task · client template → webhook → into client delivery batch
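The first two hops of the render path can be sketched as follows. This is a minimal illustration rather than the agency's actual code: the endpoints and the `x-api-key` header come from this page, but the JSON body field `url`, the response field `taskId`, and the helper names (`buildVideoRequest`, `buildTaskRequest`, `submitClip`) are assumptions made for the example.

```typescript
// Sketch only. ASSUMPTIONS: the request field `url` and the response field
// `taskId` are illustrative names, not confirmed API fields.
const API_BASE = "https://api.zapcap.ai";

interface ClientRecord {
  name: string;
  templateId: string; // UUID taken from GET /templates
}

interface ApiRequest {
  method: "POST";
  url: string;
  body: Record<string, string>;
}

// Hop 1: register the clip by its source URL.
function buildVideoRequest(sourceUrl: string): ApiRequest {
  return { method: "POST", url: `${API_BASE}/videos/url`, body: { url: sourceUrl } };
}

// Hop 2: create the captioning task against the client's locked template.
function buildTaskRequest(videoId: string, client: ClientRecord): ApiRequest {
  return {
    method: "POST",
    url: `${API_BASE}/videos/${encodeURIComponent(videoId)}/task`,
    body: { templateId: client.templateId },
  };
}

// Sends one request as JSON and fails loudly on a non-2xx response.
async function send(req: ApiRequest, apiKey: string): Promise<any> {
  const res = await fetch(req.url, {
    method: req.method,
    headers: { "x-api-key": apiKey, "content-type": "application/json" },
    body: JSON.stringify(req.body),
  });
  if (!res.ok) throw new Error(`${req.url} failed with HTTP ${res.status}`);
  return res.json();
}

// Runs both hops and returns the ids the delivery tool stores.
async function submitClip(sourceUrl: string, client: ClientRecord, apiKey: string) {
  const video = await send(buildVideoRequest(sourceUrl), apiKey);
  const task = await send(buildTaskRequest(video.id, client), apiKey);
  return { videoId: video.id as string, taskId: task.taskId as string };
}
```

Keeping the request builders separate from `send` means the routing of a clip to its client template can be checked without touching the network.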

03 Technical implementation

The integration landed inside the existing delivery tool, not as a new service. Each client record gained a templateId field; dropping clips into a client folder triggers a task per clip against that client's template.

Failure handling. ZapCap is treated as a normal upstream: signed webhooks, eventId-based dedupe so retries never double-file a deliverable, and per-clip status so one failed render re-queues without holding up the rest of the batch.
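A webhook handler along these lines could look like the sketch below. The page only states that `x-signature` is verified and that events are deduped on `eventId`; the signing scheme used here (hex-encoded HMAC-SHA256 of the raw body with a shared secret), the payload field `status`, and the in-memory `Set` are assumptions for illustration. A real deployment would check the signing scheme against the API documentation and keep seen event ids in a durable store.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// ASSUMPTION: the signature is a hex HMAC-SHA256 of the raw request body.
function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const given = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  return given.length === expected.length && timingSafeEqual(given, expected);
}

// ASSUMPTION: payload shape; eventId, taskId and renderUrl are named on this
// page, `status` is an illustrative field name.
interface WebhookEvent {
  eventId: string;
  taskId: string;
  status: string;
  renderUrl?: string;
}

type Outcome = "rejected" | "duplicate" | "filed" | "ignored";

// In-memory for the sketch only; retries after a restart would not be caught.
const seenEventIds = new Set<string>();

function handleWebhook(
  rawBody: string,
  signatureHex: string,
  secret: string,
  fileRender: (taskId: string, renderUrl: string) => void,
): Outcome {
  if (!verifySignature(rawBody, signatureHex, secret)) return "rejected";
  const event = JSON.parse(rawBody) as WebhookEvent;
  // A retried delivery carries the same eventId and must not file twice.
  if (seenEventIds.has(event.eventId)) return "duplicate";
  seenEventIds.add(event.eventId);
  if (event.status === "completed" && event.renderUrl) {
    fileRender(event.taskId, event.renderUrl);
    return "filed";
  }
  return "ignored";
}
```

The signature check runs before the body is parsed, so an unsigned or tampered payload never reaches the dedupe store or the delivery batch.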

What was actually built
  • templateId per client (UUID from GET /templates) — one locked caption look per brand.
  • Folder-drop trigger in the delivery tool — one ZapCap task per clip routed to the client template.
  • Webhook handler · x-signature verified, eventId-deduped, renderUrl filed into the client batch.
  • Dual output — burned-in .mp4 by default, transparent overlay when a client wants a separate caption layer.
  • Per-client usage tag so per-minute usage maps to the client it can be passed through to.
  • Per-clip retry — one failed render re-queues without holding the rest of the delivery batch.
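The per-clip retry in the list above can be modeled as a small piece of batch state. The sketch below is one possible shape, not the agency's implementation: the status names, the `attempts` counter, and the `MAX_ATTEMPTS` cap are all hypothetical.

```typescript
// Hypothetical per-clip state kept by the delivery tool.
type ClipStatus = "queued" | "rendering" | "completed" | "failed";

interface ClipState {
  clipId: string;
  status: ClipStatus;
  attempts: number; // number of render tasks submitted so far
}

const MAX_ATTEMPTS = 3; // illustrative cap, not a documented limit

// Re-queues only the failed clips that still have attempts left.
// Every other clip in the batch is returned unchanged.
function requeueFailed(batch: ClipState[]): ClipState[] {
  return batch.map((clip): ClipState =>
    clip.status === "failed" && clip.attempts < MAX_ATTEMPTS
      ? { ...clip, status: "queued", attempts: clip.attempts + 1 }
      : clip,
  );
}

// Clips that can ship now, even while others are still retrying.
function deliverable(batch: ClipState[]): string[] {
  return batch.filter((clip) => clip.status === "completed").map((clip) => clip.clipId);
}
```

Because status lives on each clip rather than on the batch, a failed render changes only its own entry and the completed clips remain deliverable.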

04 What changed

Captioning stopped depending on which editor did the work. The client template is the single source of truth, so every delivery looks the same regardless of who built it — and styling revisions largely went away.

Billing got simpler too. Per-minute usage is tagged per client, so captioning cost can be passed through cleanly instead of disappearing into unbillable studio hours.
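As a rough illustration of the pass-through arithmetic, the sketch below totals rendered seconds per client and prices them at the $0.10/min base rate quoted on this page. The record shape and the rounding rule are assumptions: usage is billed here on fractional minutes and rounded to whole cents, which may not match how the actual invoice is computed.

```typescript
// Hypothetical usage record written by the delivery tool for each render.
interface UsageRecord {
  clientId: string;
  renderedSeconds: number;
}

interface ClientUsage {
  seconds: number;
  costCents: number;
}

// Base rate from this page: $0.10 per minute, held as integer cents.
const RATE_CENTS_PER_MINUTE = 10;

// Sums usage per client and prices the running total.
// ASSUMPTION: fractional minutes are billed and the total is rounded to cents.
function usageByClient(records: UsageRecord[]): Map<string, ClientUsage> {
  const totals = new Map<string, ClientUsage>();
  for (const record of records) {
    const previous = totals.get(record.clientId) ?? { seconds: 0, costCents: 0 };
    const seconds = previous.seconds + record.renderedSeconds;
    const costCents = Math.round((seconds / 60) * RATE_CENTS_PER_MINUTE);
    totals.set(record.clientId, { seconds, costCents });
  }
  return totals;
}
```

For example, two clips of 90 and 30 seconds for one client total 2 minutes, which prices at 20 cents under these assumptions.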

Before
  • Each clip captioned by hand, varying by editor
  • Off-brand styling and revision churn
  • Captioning ate unbillable junior-editor hours
  • Captions sometimes clipped the safe zone
  • Captioning cost buried in studio overhead
After
  • One locked template per client, consistent every time
  • Same output regardless of which editor delivers
  • Captioning off the editors' critical path
  • Captions respect the 9:16 safe zone in every template
  • Per-minute usage tagged and passed through per client

05 In their words

Captioning was the step where every editor did it a little differently and the client noticed. Now each client has one locked template and every delivery matches it. The captioning argument just stopped happening.

Head of production
Social-video agency · anonymized
Anonymization note: name, logo, and product references withheld pending written customer permission. We'll attach the real attribution here once consent is confirmed. — ZapCap content team
API workflow

Client clips, captioned in four calls

Per clip: pass the source URL, create a task with the client's locked caption template, receive the webhook, fetch the rendered MP4. ZapCap handles transcription, styling, and rendering so the delivery tool just routes each clip to the right client template.

  1. Upload your video

    POST the file to /videos. We stream it to storage and hand you back a videoId.

    POST /videos
  2. Create the captioning task

    One POST starts transcription, styling and rendering with your chosen template. Add a notification webhook to skip polling.

    POST /videos/:id/task
  3. Receive the webhook

    We POST status updates to your endpoint as the render moves through transcribing → rendering → completed.

    HOOK: POST → your URL
  4. Download the finished render

    Burned-in subtitles, served from a global CDN. No watermark. MP4 ready for any social platform.

    GET renderUrl
Step 1 / 4 · POST /videos · Upload your video (~2s)

```typescript
import { readFileSync } from "node:fs";

// Wrap the local clip in multipart form data under the "file" field.
const form = new FormData();
form.append(
  "file",
  new Blob([readFileSync("clip.mp4")]),
  "clip.mp4",
);

// Upload the clip; the `id` in the response is the videoId used by later calls.
const { id: videoId } = await fetch(
  "https://api.zapcap.ai/videos",
  {
    method: "POST",
    headers: { "x-api-key": process.env.ZAPCAP_KEY! },
    body: form,
  },
).then(r => r.json());
```
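Step 4, fetching the finished render, could be sketched like this. It assumes only what the page states, namely that `renderUrl` is a plain GET-able URL for an MP4; the helper names, the fallback file name, and the choice to buffer the whole file in memory are illustrative.

```typescript
import { writeFile } from "node:fs/promises";
import { join, posix } from "node:path";

// Derives a local file name from the render URL, ignoring any query string.
// Falls back to a generic name when the path does not end in .mp4.
function renderFileName(renderUrl: string, fallback = "render.mp4"): string {
  const name = posix.basename(new URL(renderUrl).pathname);
  return name.toLowerCase().endsWith(".mp4") ? name : fallback;
}

// Downloads the render into the client's delivery folder and returns the path.
// Buffers the file in memory, which is fine for short-form clips.
async function downloadRender(renderUrl: string, destDir: string): Promise<string> {
  const res = await fetch(renderUrl);
  if (!res.ok) throw new Error(`GET renderUrl failed with HTTP ${res.status}`);
  const dest = join(destDir, renderFileName(renderUrl));
  await writeFile(dest, Buffer.from(await res.arrayBuffer()));
  return dest;
}
```

A delivery tool would typically call `downloadRender` from the webhook handler once a completed event arrives, passing the client's batch folder as `destDir`.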

Illustrative outcomes after standardizing on the API

  • per-client · One locked caption template per brand
  • days → hours · Captioning turnaround per batch
  • 0 · Render workers maintained in-house
  • per-min · Usage passed through per client

Caption delivery questions

Each client maps to one locked templateId (a UUID from GET /templates). Every clip for that client renders against the same template, so the caption look is identical no matter which editor delivers.

Standardize caption delivery across every client

Backend-only API, webhook-native, usage-based billing from a $0.10/min base rate. One locked caption template per client, consistent on every delivery, without an in-house render pipeline.