01The bottleneck
The team — a seed-stage AI video product for marketers — had captions on their public roadmap, voted high by every customer interview. The prototype existed: a Whisper call wired up to ffmpeg with a single hardcoded style, queued through a homegrown Redis worker, behind a polling endpoint.
It was the kind of "almost done" that doesn't ship. The product team wanted style variants. The infra team didn't want another queue. The CTO had no appetite for hiring against a captioning feature.
- ·One caption style, hardcoded font + color
- ·English-only, no language switching
- ·Polling-only — no webhook
- ·Captions sometimes overran the safe zone on 9:16 exports
- ·Style preset gallery in the editor UI
- ·Multilingual rendering with sane line breaks
- ·Async delivery without users staring at a spinner
- ·Per-render usage tracking for billing
- ·A render farm that wouldn't fall over on launch day
02The ZapCap workflow
Once they switched, the captioning path collapsed into three endpoints on their backend plus one ZapCap webhook.
Their existing CDN handled user uploads. Their backend forwarded the source URL to ZapCap, attached the user's selected style as a templateId, and stored the returned taskId against the user record. The webhook handler verified the signature, persisted the renderUrl, and pushed a notification.
Switching styles in the product UI became a one-field change in the request body. Multilingual? Set language. Transparent overlay for power users? Set outputMode.
03Technical implementation
Two engineers, one sprint. The captioning feature shipped behind a feature flag, opened to a small fraction of users on a Wednesday, and rolled to everyone the following week.
Failure handling. The team treated ZapCap as a normal upstream dependency: signed payloads, eventId-based dedupe on their handler, and an alarm on consecutive 5xx. No render-pipeline runbook to maintain.
- One backend route · POST /exports — wraps two ZapCap calls and persists the task ID against the user record.
- One webhook handler · /hooks/zapcap — HMAC-verifies, updates the export row, notifies the user.
- Style preset gallery — UI mirroring the ZapCap template list; one preset per editor card.
- Failure UI — typed error codes from ZapCap mapped to user-readable messages and a retry button.
- eventId dedupe on the webhook — webhook-side dedupe on retry storms.
- Credit-balance check before queueing — surfaces an upgrade prompt instead of a mid-render failure.
04What changed
Operationally the bigger win wasn't the launch — it was that captioning stopped being a project. Caption rendering became a primitive the product team could use without filing a ticket. Adding new styles is a UI change, not an infra change.
Billing slotted in cleanly. API credits at $0.10/min passed through their plan tiers — exports beyond a quota became a one-line upgrade prompt instead of a feature failure.
- ·Captioning marked "in progress" in roadmap for two quarters
- ·One hardcoded style, English only
- ·Polling spinner inside the editor
- ·Captions occasionally clipped the bottom safe zone
- ·Render queue paged the on-call engineer regularly
- →Shipped in one sprint, two engineers
- →Style preset gallery in the editor, multilingual rendering
- →Async delivery — push notification on completion
- →Captions respect the 9:16 safe zone in every template
- →Zero render-related pages since launch
05In their words
We were six weeks into building a captioning stack and not visibly closer to launching one. Switching to the ZapCap API turned the roadmap item into a webhook handler. The product team picked styles; we shipped.