Subtitle API

Video in, styled captioned video out

ZapCap renders finished, styled captioned videos through an API. Send a video URL, choose a caption style, and get a burned-in MP4 — or a transparent overlay — back. Not a transcript API. Not a generic video automation API.

Webhook-native, usage-based credits, bring your own transcript

Not just transcription

Four options you might be choosing between

If you're shipping captions to viewers, only one of these returns a video they can actually watch.

OPTION 01

Speech-to-text
Whisper · Deepgram · AssemblyAI
  • Returns text or SRT
  • You handle timing
  • You handle styling
  • You handle rendering

OPTION 02

Subtitle file
SRT / VTT generators
  • Returns timed text file
  • No styling
  • Ignored on social
  • You handle burn-in

OPTION 03 · YOU ARE HERE

ZapCap · styled subtitle rendering
Finished video, not text
  • Returns finished MP4
  • Styled captions baked in
  • Transparent / green-screen
  • Webhook-native

OPTION 04

Video automation
Creatomate · Shotstack · JSON2Video
  • Full video generation
  • Build templates
  • JSON timeline assembly
  • Captions are one element

API workflow

Captioned videos in a few calls

Upload a clip, fire a task, get a webhook. We handle transcription, styling, and rendering — you get the MP4 when it's ready.

  1. Upload your video

    POST the file to /videos. We stream it to storage and hand you back a videoId.

    POST /videos
  2. Create the captioning task

    One POST starts transcription, styling, and rendering with your chosen template. Add a notification webhook to skip polling.

    POST /videos/:id/task
  3. Receive the webhook

    We POST status updates to your endpoint as the render moves through transcribing → rendering → completed.

    POST → your URL (webhook)
  4. Download the finished render

    Burned-in subtitles, served from a global CDN. No watermark. MP4 ready for any social platform.

    GET renderUrl
import { readFileSync } from "node:fs";

// Build a multipart body containing the local clip.
const form = new FormData();
form.append(
  "file",
  new Blob([readFileSync("clip.mp4")]),
  "clip.mp4",
);

// Upload it; the JSON response carries the new video's id.
const { id: videoId } = await fetch(
  "https://api.zapcap.ai/videos",
  {
    method: "POST",
    headers: { "x-api-key": process.env.ZAPCAP_KEY! },
    body: form,
  },
).then((r) => r.json() as Promise<{ id: string }>);
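Step 3 of the workflow can be sketched as a small receiver. This is a minimal sketch, not ZapCap's documented contract: the payload fields (`taskId`, `status`, `renderUrl`) and the helper names are assumptions for illustration, and only the status names transcribing, rendering, and completed come from the steps above.

```typescript
// Hypothetical payload shape: these field names are assumptions, check the API reference.
interface StatusUpdate {
  taskId: string;
  status: string; // e.g. "transcribing" | "rendering" | "completed"
  renderUrl?: string;
}

// Returns the URL to download once the render is done, otherwise null.
function renderUrlIfDone(update: StatusUpdate): string | null {
  return update.status === "completed" && update.renderUrl ? update.renderUrl : null;
}

// Web-standard handler (Request in, Response out) that any fetch-style server can mount.
async function handleWebhook(req: Request): Promise<Response> {
  if (req.method !== "POST") return new Response(null, { status: 405 });
  let update: StatusUpdate;
  try {
    update = (await req.json()) as StatusUpdate;
  } catch {
    return new Response(null, { status: 400 }); // body was not valid JSON
  }
  const url = renderUrlIfDone(update);
  if (url) console.log(`task ${update.taskId} ready: ${url}`);
  return new Response(null, { status: 204 }); // acknowledge quickly
}
```

Keeping the decision logic in a pure function makes it easy to unit-test without standing up a server.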


Live render

Same clip, four caption styles, one field change

Switch the templateId in the request body. ZapCap returns a fully rendered MP4 — no client-side compositing.

Beast preset
Hormozi preset
Tracy preset
Devin preset

Source is a 9:16 vertical clip. Captions render into the final frames — downstream platforms don't need subtitle-track support.
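The "one field change" idea can be sketched as a helper that produces one task body per template. The helper name `variantBodies` is hypothetical; the body fields (`templateId`, `autoApprove`, `language`) are the ones used in this page's request examples, and the two IDs are copied from those examples without any claim about which preset each one selects.

```typescript
interface TaskBody {
  templateId: string;
  autoApprove: boolean;
  language: string;
}

// One request body per template; everything except templateId stays identical.
function variantBodies(templateIds: string[], language = "en"): TaskBody[] {
  return templateIds.map((templateId) => ({ templateId, autoApprove: true, language }));
}

const bodies = variantBodies([
  "46d20d67-255c-4c6a-b971-31fddcfea7f0",
  "d46bb0da-cce0-4507-909d-fa8904fb8ed7",
]);

console.log(bodies.length); // 2
```

Each body would then be sent in its own POST to /videos/:id/task for the same uploaded video.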

Styling as an API primitive

Style with one field

Override at the leaf.

Send a templateId for one of our presets, or override individual renderOptions for full control. Animation, emoji, keyword highlighting, position, font and color all toggle independently.

  • Templates — Beast, Hormozi, Tracy, Devin, plus 25 more (29 presets). Each preset captures a complete look.
  • Animation — word-by-word pops, karaoke fill, fade in/out, scale bumps.
  • Keyword emphasis — flag punchwords; ZapCap colors / scales / boxes them automatically.
  • Layout — font, color, stroke, shadow, max words per cue, vertical position with safe-zone math.
  • Aspect ratios — render 9:16, 1:1, 16:9 from one source.
Render options
{
  "templateId": "46d20d67-255c-4c6a-b971-31fddcfea7f0",
  "renderOptions": {
    "subsOptions": {
      "emphasizeKeywords": true,
      "animation": true,
      "displayWords": 3
    },
    "styleOptions": {
      "fontUppercase": true,
      "fontShadow": "l"
    }
  }
}
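To illustrate overriding a single leaf, here is a minimal client-side sketch that merges per-request overrides onto shared defaults. The `RenderOptions` type lists only the five fields shown in the JSON above, so it is likely incomplete, and `withOverrides` is a hypothetical helper rather than part of any ZapCap SDK.

```typescript
interface RenderOptions {
  subsOptions?: { emphasizeKeywords?: boolean; animation?: boolean; displayWords?: number };
  styleOptions?: { fontUppercase?: boolean; fontShadow?: string };
}

// Merge group by group so that overriding one leaf keeps its siblings.
function withOverrides(base: RenderOptions, overrides: RenderOptions): RenderOptions {
  return {
    subsOptions: { ...base.subsOptions, ...overrides.subsOptions },
    styleOptions: { ...base.styleOptions, ...overrides.styleOptions },
  };
}

const defaults: RenderOptions = {
  subsOptions: { emphasizeKeywords: true, animation: true, displayWords: 3 },
  styleOptions: { fontUppercase: true, fontShadow: "l" },
};

// Turn animation off for one request; every other option is inherited.
const calmer = withOverrides(defaults, { subsOptions: { animation: false } });
console.log(calmer.subsOptions); // { emphasizeKeywords: true, animation: false, displayWords: 3 }
```

A plain top-level spread would replace the whole `subsOptions` group, which is why the merge is done per group.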
Two transcript paths

Auto-transcribe — or bring your own

Auto-transcribe

ZapCap transcribes, splits, and times captions automatically. Edit any cue, approve the transcript, and trigger a render against it.

  • Inspect the transcript on the task status, then edit it via PUT /videos/:id/task/:taskId/transcript
  • Edit cues before render at no extra credit cost
  • Reuse one transcript across multiple template variants
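The edit-before-render step might look like the sketch below. The endpoint path is the one named above; everything else is an assumption for illustration: the `Cue` shape, the idea that the endpoint accepts a JSON array of cues, and the helper names `fixTerm` and `putTranscript`.

```typescript
// Hypothetical cue shape; the real transcript schema may differ.
interface Cue {
  text: string;
  start: number;
  end: number;
}

// Replace a misheard term in every cue without touching timings.
function fixTerm(cues: Cue[], wrong: string, right: string): Cue[] {
  return cues.map((cue) => ({ ...cue, text: cue.text.split(wrong).join(right) }));
}

// PUT the edited cues back to the transcript endpoint.
async function putTranscript(videoId: string, taskId: string, cues: Cue[], apiKey: string): Promise<void> {
  const res = await fetch(`https://api.zapcap.ai/videos/${videoId}/task/${taskId}/transcript`, {
    method: "PUT",
    headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify(cues), // assumption: the body is a JSON array of cues
  });
  if (!res.ok) throw new Error(`transcript update failed: ${res.status}`);
}
```

A typical use would be `putTranscript(videoId, taskId, fixTerm(cues, "Zap cap", "ZapCap"), apiKey)` before approving the transcript.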

Bring your own transcript

Send approved cues — from your translation vendor, internal CMS, or Whisper pipeline — and skip transcription entirely.

  • Supported via the SRT-to-burned-in workflow
  • Preserve approved product names, claims, disclaimers
  • Render the same transcript into N styles for A/B variants
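If your approved cues live in an SRT file, a small parser gets them into a structured form first. This is a minimal sketch of standard SRT parsing; the resulting `Cue` shape, and how ZapCap expects cues on the task's transcript field, are assumptions to check against the API reference.

```typescript
interface Cue {
  start: number; // seconds
  end: number; // seconds
  text: string;
}

// Convert an SRT timestamp such as "00:00:01,500" to seconds.
function srtTimeToSeconds(t: string): number {
  const m = /^(\d+):(\d{2}):(\d{2})[,.](\d{3})$/.exec(t.trim());
  if (!m) throw new Error(`bad SRT timestamp: ${t}`);
  const [, h, min, s, ms] = m;
  return Number(h) * 3600 + Number(min) * 60 + Number(s) + Number(ms) / 1000;
}

// Split an SRT document into cues; blocks are separated by blank lines.
function parseSrt(srt: string): Cue[] {
  return srt
    .replace(/\r\n/g, "\n")
    .trim()
    .split(/\n{2,}/)
    .filter((block) => block.trim() !== "")
    .map((block) => {
      const lines = block.split("\n");
      // lines[0] is the cue index, lines[1] the timing line, the rest is caption text
      const [start, end] = lines[1].split("-->").map(srtTimeToSeconds);
      return { start, end, text: lines.slice(2).join("\n") };
    });
}
```

The parser ignores the numeric cue index and does not handle optional position settings after the end timestamp.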
Output modes

Pick the format your downstream pipeline wants

Most common

Burned-in MP4

Captions permanently rendered into the source frames. The right choice for TikTok, Reels, Shorts, ad creative, and anywhere subtitle tracks get ignored.

.mp4 · H.264

Editor-friendly

Transparent overlay

Caption layer only, alpha-channel preserved. Drop it on a timeline in Premiere, Final Cut, DaVinci, or CapCut — no chroma key needed.

.mov ProRes 4444 · .webm VP9 alpha

Compatibility

Green-screen layer

For editors or live tools that don't support alpha. Key out the green in OBS, a video switcher, or an NLE.

.mp4 · #04F404 backdrop

Need an accessibility track too? Burned-in captions are best for social distribution; pair with SRT/VTT export where viewer controls are required.

Multilingual rendering

Translate, restyle, re-render

Set language on the task and ZapCap transcribes, translates, and lays out captions for that target. CJK and Thai use language-aware line-breaking — not whitespace splitting.

  • Source video in one language, output captions in another
  • A dictionary of brand terms biases transcription toward your product names
  • Language-specific guides: Chinese, Japanese, Thai

Ready in two mins
两分钟就好
พร้อมในสองนาที

Simple API credits

Per-minute, usage-based credits

Pay for what you render. Transparent overlays and high-resolution exports use different multipliers — see pricing for the full table.

  • Top up credits to keep tasks flowing in production
  • Volume credits available at scale
  • No per-seat fee — pay for renders, not users
$0.10 / min

Indicative starting rate. Final pricing depends on render mode and output format.
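For rough budgeting, the indicative rate can be turned into a quick estimate. This sketch assumes billing is prorated by total duration, which the page does not state, and treats `multiplier` as a placeholder for the render-mode multipliers whose real values are in the pricing table.

```typescript
const CENTS_PER_MINUTE = 10; // the indicative $0.10 / min starting rate

// Estimate spend in dollars for a batch of clips, given their durations in seconds.
function estimateCost(durationsSec: number[], multiplier = 1): number {
  const minutes = durationsSec.reduce((sum, s) => sum + s, 0) / 60;
  return Math.round(minutes * CENTS_PER_MINUTE * multiplier) / 100;
}

console.log(estimateCost([90, 30, 60])); // 3 minutes at the base rate: 0.3
```

Working in whole cents before dividing keeps floating-point noise out of the result.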

Customer · Anonymized

A European e-commerce group cut localized caption variant production from days to hours by moving captioning to the ZapCap API

Multi-market product videos now flow through one webhook-driven pipeline. Approved transcripts in, brand-styled MP4s out — across markets, with consistent caption styling.

  • days → hrs: localized variant lead time
  • 1 source: re-rendered into multiple markets
  • webhook: no polling, no manual exports
  • consistent: brand style preserved across markets

Developer quickstart

Render styled subtitles from one task endpoint

ZapCap is plain HTTP: upload a video, create a captioning task, then collect the burned-in MP4, transparent overlay, or green-screen layer from status or webhook delivery.

  • POST /videos or /videos/url creates the source video record
  • POST /videos/{videoId}/task accepts templateId, language, transcript, and renderOptions
  • GET /videos/{videoId}/task/{taskId} returns status, transcript, and downloadUrl

VIDEO_ID=$(curl -s -X POST "https://api.zapcap.ai/videos/url" \
  -H "x-api-key: $ZAPCAP_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://cdn.example.com/source.mp4"}' | jq -r .id)

TASK_ID=$(curl -s -X POST "https://api.zapcap.ai/videos/$VIDEO_ID/task" \
  -H "x-api-key: $ZAPCAP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "templateId": "d46bb0da-cce0-4507-909d-fa8904fb8ed7",
    "autoApprove": true,
    "language": "en",
    "renderOptions": {
      "subsOptions": { "animation": true, "displayWords": 3 }
    }
  }' | jq -r .taskId)
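Without a webhook, the last step is to poll the task until the render is ready. The sketch below uses the GET endpoint and the `status` and `downloadUrl` fields listed above, with "completed" as the terminal status from the workflow section. The polling interval, the attempt cap, and the helper names are assumptions, and failure statuses are not handled because their names are not given on this page.

```typescript
interface TaskState {
  status: string;
  downloadUrl?: string;
}

// True once the task reports completion and carries a download URL.
function isDone(task: TaskState): task is TaskState & { downloadUrl: string } {
  return task.status === "completed" && typeof task.downloadUrl === "string";
}

// Poll the task status until the render is ready, then return its download URL.
async function waitForRender(
  videoId: string,
  taskId: string,
  apiKey: string,
  { intervalMs = 5000, maxAttempts = 60 } = {},
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetch(`https://api.zapcap.ai/videos/${videoId}/task/${taskId}`, {
      headers: { "x-api-key": apiKey },
    });
    if (!res.ok) throw new Error(`status check failed: ${res.status}`);
    const task = (await res.json()) as TaskState;
    if (isDone(task)) return task.downloadUrl;
    await new Promise((resolve) => setTimeout(resolve, intervalMs)); // wait before retrying
  }
  throw new Error("render did not complete in time");
}
```

In production a webhook avoids this loop entirely; polling is mainly useful for scripts and local testing.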

About the Subtitle API

A Subtitle API is an HTTP interface for adding subtitles to video programmatically. ZapCap is a styled subtitle rendering API — it accepts a video and returns a captioned MP4 (or a transparent overlay), with the captions baked into the frames. That is different from a transcription API (which returns text) or a subtitle-file API (which returns SRT/VTT).

Start rendering captions through the API

Create a key on a Pro plan and buy credits in the dashboard.