Case study · anonymized | Social-video agency | Outcome · one caption pipeline

One caption workflow, every client clip captioned the same way

A social-video agency delivers short-form clips for many clients at once — each with its own brand caption style. Captioning was the inconsistent, manual step that ate junior-editor hours and produced off-brand variance. They standardized it on the ZapCap API. Names are anonymized; identifying figures are ranged.

per-client · A caption template per brand · Consistent every delivery

days → hours · Captioning turnaround · Per client batch

0 · Render workers maintained · No ffmpeg / GPU queue in-house

per-min · Usage-based billing · Pass-through per client

01 The bottleneck

The agency ships a steady volume of short-form clips across a roster of clients, and every client has its own caption look — font, placement, color, the lot. Captioning was where consistency went to die: a manual editor step repeated per clip, per client.

Every junior editor captioned slightly differently. The same client could get three subtly off-brand looks in a week, revisions bounced back over caption styling, and captioning quietly consumed more unbillable hours than any other step in the studio.

How captioning worked before

· Each clip captioned by hand, per client, per editor
· Brand caption style re-created from memory each time
· Revisions bounced back over styling inconsistencies
· Captions sometimes overran the 9:16 safe zone

What was needed to standardize

· One locked caption template per client brand
· Consistent output regardless of which editor delivers
· Async batch delivery wired into the delivery tool
· Per-client usage tracking for pass-through billing
· Captioning off the critical path for junior editors

02 The ZapCap workflow

Captioning moved out of the editor and into the agency's delivery tool. Each client is mapped to one locked caption template, so the editor just drops clips into the client folder and the tool handles the rest.

For each clip, the tool POSTs the source URL to /videos/url, creates a task with that client's fixed templateId, and stores the taskId against the delivery. The webhook handler verifies x-signature, dedupes on eventId, and files each finished renderUrl back into the client's delivery batch.

Deliverables ship as burned-in .mp4 (h264) by default. When a client needs captions as a separate layer, the same task requests the transparent (prores4444 / vp9 alpha) output instead.

Delivery render path

editor drops clip → POST /videos/url → POST /videos/:id/task · client template → webhook → into client delivery batch
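As a rough sketch of that per-clip path, the submit step inside the delivery tool might look like the following. The endpoints, the `x-api-key` header, and the `id` field on the upload response come from this page. The request body fields (`url`, `templateId`), the `taskId` response field, and the names `ClientRecord`, `HttpPost`, `makeZapcapPost`, and `submitClip` are assumptions made for illustration, so check them against the API reference before relying on them.

```typescript
// Hypothetical client record: each client carries one locked templateId.
interface ClientRecord {
  name: string;
  templateId: string; // UUID taken from GET /templates
}

// Minimal transport: POST a JSON body to a path, resolve with the parsed JSON.
type HttpPost = (path: string, body: unknown) => Promise<any>;

// Transport bound to the API host. Injecting it keeps submitClip easy to test.
function makeZapcapPost(apiKey: string): HttpPost {
  return async (path, body) => {
    const res = await fetch(`https://api.zapcap.ai${path}`, {
      method: "POST",
      headers: { "x-api-key": apiKey, "content-type": "application/json" },
      body: JSON.stringify(body),
    });
    if (!res.ok) throw new Error(`POST ${path} failed with ${res.status}`);
    return res.json();
  };
}

// Register the source URL, then create one task against the client's template.
async function submitClip(
  post: HttpPost,
  client: ClientRecord,
  sourceUrl: string,
): Promise<{ videoId: string; taskId: string }> {
  // Body field name `url` is an assumption.
  const { id: videoId } = await post("/videos/url", { url: sourceUrl });
  // Body field `templateId` and response field `taskId` are assumptions.
  const { taskId } = await post(`/videos/${videoId}/task`, {
    templateId: client.templateId,
  });
  // The caller stores taskId against the delivery so the webhook can match it.
  return { videoId, taskId };
}
```

Passing the transport in as an argument means the routing logic can be exercised with a fake `post` function, without touching the network.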

03 Technical implementation

The integration landed inside the existing delivery tool, not as a new service. Each client record gained a templateId field; dropping clips into a client folder triggers a task per clip against that client's template.

Failure handling. ZapCap is treated as a normal upstream: signed webhooks, eventId-based dedupe so retries never double-file a deliverable, and per-clip status so one failed render re-queues without holding up the rest of the batch.
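The failure handling described above could be sketched roughly as follows. Only `eventId`, `renderUrl`, and the `x-signature` header are named on this page; the rest of the payload shape (`taskId`, `status`) and the names `WebhookEvent`, `HandlerDeps`, and `handleWebhook` are assumptions. The signature check is injected rather than implemented, because the signing scheme is not described here.

```typescript
// Assumed webhook payload shape; only eventId and renderUrl are named in the text.
interface WebhookEvent {
  eventId: string;
  taskId: string;
  status: string;
  renderUrl?: string;
}

type HandleResult = "rejected" | "duplicate" | "ignored" | "filed";

interface HandlerDeps {
  // Checks the x-signature header value against the raw request body.
  verifySignature: (rawBody: string, signature: string) => boolean;
  // In-memory for the sketch; a real handler would use a durable store.
  seenEventIds: Set<string>;
  // Files the finished render into the client's delivery batch.
  fileRender: (taskId: string, renderUrl: string) => void;
}

function handleWebhook(
  deps: HandlerDeps,
  rawBody: string,
  signature: string | undefined,
): HandleResult {
  // 1. Reject anything that is unsigned or fails verification.
  if (!signature || !deps.verifySignature(rawBody, signature)) return "rejected";

  const event = JSON.parse(rawBody) as WebhookEvent;

  // 2. A retried delivery of an event already handled is dropped.
  if (deps.seenEventIds.has(event.eventId)) return "duplicate";

  // 3. Only completed renders with a URL are filed; other updates are ignored.
  let result: HandleResult = "ignored";
  if (event.status === "completed" && event.renderUrl) {
    deps.fileRender(event.taskId, event.renderUrl);
    result = "filed";
  }

  // 4. Mark as seen only after handling, so a throw in fileRender
  //    leaves the event eligible for the next retry.
  deps.seenEventIds.add(event.eventId);
  return result;
}
```

Recording the `eventId` after filing rather than before is a deliberate choice in this sketch: a crash mid-filing then results in a reprocessed retry instead of a silently lost deliverable.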

What was actually built

· templateId per client (UUID from GET /templates): one locked caption look per brand.
· Folder-drop trigger in the delivery tool: one ZapCap task per clip, routed to the client template.
· Webhook handler: x-signature verified, eventId-deduped, renderUrl filed into the client batch.
· Dual output: burned-in .mp4 by default, transparent overlay when a client wants a separate caption layer.
· Per-client usage tag, so per-minute usage maps to the client it can be passed through to.
· Per-clip retry: one failed render re-queues without holding the rest of the delivery batch.
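One way to picture the per-clip retry in the list above is a batch where each clip carries its own state, so a failure affects only that clip. This is a minimal sketch under stated assumptions: the state names, the `ClipJob` shape, the `MAX_ATTEMPTS` cap, and both helper functions are hypothetical and not taken from the agency's tool.

```typescript
// Hypothetical per-clip states tracked by the delivery tool (not ZapCap statuses).
type ClipState = "queued" | "processing" | "done" | "failed";

interface ClipJob {
  clipId: string;
  state: ClipState;
  attempts: number; // submissions made so far
  renderUrl?: string;
}

// Hypothetical cap so a permanently broken clip does not retry forever.
const MAX_ATTEMPTS = 3;

// Re-queue only failed clips that still have attempts left.
// Every other clip in the batch is returned unchanged.
function requeueFailed(batch: ClipJob[]): ClipJob[] {
  return batch.map((job) =>
    job.state === "failed" && job.attempts < MAX_ATTEMPTS
      ? { ...job, state: "queued" as const }
      : job,
  );
}

// Finished clips can ship without waiting for the rest of the batch.
function deliverable(batch: ClipJob[]): ClipJob[] {
  return batch.filter((job) => job.state === "done");
}
```

Because `requeueFailed` returns a new array and leaves non-failed jobs untouched, finished clips stay deliverable while their failed siblings go back into the queue.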

04 What changed

Captioning stopped depending on which editor did the work. The client template is the single source of truth, so every delivery looks the same regardless of who built it — and styling revisions largely went away.

Billing got simpler too. Per-minute usage is tagged per client, so captioning cost can be passed through cleanly instead of disappearing into unbillable studio hours.

Before

· Each clip captioned by hand, varying by editor
· Off-brand styling and revision churn
· Captioning ate unbillable junior-editor hours
· Captions sometimes clipped the safe zone
· Captioning cost buried in studio overhead

After

→ One locked template per client, consistent every time
→ Same output regardless of which editor delivers
→ Captioning off the editors' critical path
→ Captions respect the 9:16 safe zone in every template
→ Per-minute usage tagged and passed through per client

05 In their words

“Captioning was the step where every editor did it a little differently and the client noticed. Now each client has one locked template and every delivery matches it. The captioning argument just stopped happening.”

Head of production

Social-video agency · anonymized

Anonymization note: name, logo, and product references withheld pending written customer permission. We'll attach the real attribution here once consent is confirmed. — ZapCap content team

API workflow

Client clips, captioned in four calls

Per clip: pass the source URL, create a task with the client's locked caption template, receive the webhook, fetch the rendered MP4. ZapCap handles transcription, styling, and rendering so the delivery tool just routes each clip to the right client template.

1. Upload your video
POST the file to /videos. We stream it to storage and hand you back a videoId.
POST /videos

2. Create the captioning task
One POST starts transcription, styling and rendering with your chosen template. Add a notification webhook to skip polling.
POST /videos/:id/task

3. Receive the webhook
We POST status updates to your endpoint as the render moves through transcribing → rendering → completed.
HOOK: POST → your URL

4. Download the finished render
Burned-in subtitles, served from a global CDN. No watermark. MP4 ready for any social platform.
GET renderUrl


import { readFileSync } from "node:fs";

// Build a multipart form with the clip as the "file" field.
const form = new FormData();
form.append(
  "file",
  new Blob([readFileSync("clip.mp4")]),
  "clip.mp4",
);

// Upload the file; the `id` in the response is the videoId used by later calls.
const { id: videoId } = await fetch(
  "https://api.zapcap.ai/videos",
  {
    method: "POST",
    headers: { "x-api-key": process.env.ZAPCAP_KEY! },
    body: form,
  },
).then(r => r.json());


Illustrative outcomes after standardizing on the API

per-client
One locked caption template per brand

days → hours
Captioning turnaround per batch

0
Render workers maintained in-house

per-min
Usage passed through per client

Caption delivery questions

How do you keep each client on-brand?

Each client maps to one locked templateId (a UUID from GET /templates). Every clip for that client renders against the same template, so the caption look is identical no matter which editor delivers.

How does a clip get from the editor to a captioned deliverable?

Dropping a clip into the client folder creates one task with that client's template. It runs through pending → transcribing → transcriptionCompleted → rendering → completed, and the webhook files the finished renderUrl into the client batch. That whole path is the four-step flow on ZapCap’s video captioning API: three requests from the delivery tool plus one inbound webhook.
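The status sequence in that answer can be modelled as an ordered list, which is handy if webhook updates ever arrive out of order. This is a small sketch rather than anything from the API itself: only the five statuses named above are included, any failure status is left out because its name is not given here, and `TASK_STATUSES`, `isForwardTransition`, and `isTerminal` are hypothetical names.

```typescript
// The five statuses named in the text, in the order a task moves through them.
const TASK_STATUSES = [
  "pending",
  "transcribing",
  "transcriptionCompleted",
  "rendering",
  "completed",
] as const;

type TaskStatus = (typeof TASK_STATUSES)[number];

// True when `to` is strictly later in the lifecycle than `from`.
// A stored status is only overwritten by a later one, so a late-arriving
// "transcribing" update cannot roll back a task already marked "rendering".
function isForwardTransition(from: TaskStatus, to: TaskStatus): boolean {
  return TASK_STATUSES.indexOf(to) > TASK_STATUSES.indexOf(from);
}

// "completed" is the only terminal status modelled in this sketch.
function isTerminal(status: TaskStatus): boolean {
  return status === "completed";
}
```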

Can a client get captions as a separate layer?

Yes. Burned-in .mp4 (h264) is the default deliverable; transparent prores4444 / vp9 alpha is available when a client wants captions as a separate layer to composite themselves.

How is captioning billed per client?

Pricing is usage-based, from a base rate of $0.10 per minute, drawn from credits. Each task carries a client tag, so per-minute usage maps to the client it can be passed through to.
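To make the pass-through arithmetic concrete, here is a minimal sketch that totals tagged usage per client at the $0.10 per minute base rate quoted above. The `UsageRecord` shape and the `passThroughCents` helper are hypothetical, and prorating by the second and rounding to the nearest cent are assumptions; the actual billing granularity may differ.

```typescript
// Base rate from the text: $0.10 per minute, held in cents to avoid float drift.
const BASE_RATE_CENTS_PER_MINUTE = 10;

// Hypothetical usage record: one entry per rendered clip, tagged with its client.
interface UsageRecord {
  clientTag: string;
  durationSeconds: number;
}

// Sum seconds per client tag, then convert each total to cents.
function passThroughCents(records: UsageRecord[]): Record<string, number> {
  const secondsByClient: Record<string, number> = {};
  for (const record of records) {
    secondsByClient[record.clientTag] =
      (secondsByClient[record.clientTag] ?? 0) + record.durationSeconds;
  }

  const centsByClient: Record<string, number> = {};
  for (const tag of Object.keys(secondsByClient)) {
    // Assumption: prorate by the second and round to the nearest cent.
    centsByClient[tag] = Math.round(
      (secondsByClient[tag] / 60) * BASE_RATE_CENTS_PER_MINUTE,
    );
  }
  return centsByClient;
}
```

For instance, two clips of 90 seconds and 30 seconds tagged to the same client total 2 minutes, which comes to 20 cents at the base rate.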

Where this story connects

The capabilities and integration patterns behind every step in this case study.

Standardize caption delivery across every client

Backend-only API, webhook-native, usage-based billing from a $0.10/min base. One locked caption template per client, consistent on every delivery, without an in-house render pipeline.


