SRT to Burned-In Subtitles API

Your approved transcript in, a styled MP4 out

Already have an SRT — from your translation vendor, your CMS, or a reviewed Whisper pass? ZapCap renders it into a styled, burned-in MP4 without retranscribing a single word. The exact wording you approved is what gets baked into the frames.

Bring your own SRT · no retranscription · exact wording preserved · styled burn-in
You already have the words

The transcript is done. Now you need the render

When your captions are written, reviewed, and signed off, the last thing you want is an API that re-listens to the audio and rewrites them. You need styling and burn-in — not another transcription pass.

OPTION 01

Re-transcribe and hope
Auto-caption tools
  • Ignores your approved SRT
  • Re-listens to the audio
  • Can change names and claims
  • Another review cycle

OPTION 02

DIY ffmpeg burn-in
Self-hosted render workers
  • Respects your SRT
  • You build font + layout logic
  • No animated / styled presets
  • You run the render infra

OPTION 03 · YOU ARE HERE

ZapCap · SRT → styled burn-in
Your words, our styling and render
  • Renders your SRT verbatim
  • No retranscription
  • Styled presets or overrides
  • Finished MP4 returned

OPTION 04

Subtitle file generator
SRT / VTT tools
  • Makes the file
  • Doesn't render video
  • No styling
  • You still need burn-in
API workflow

From your SRT to a styled MP4

Upload the video, create a task that carries your approved cues and a templateId, then poll for status or wait for a webhook. ZapCap skips transcription and renders your transcript, styled, into the frames.

  1. Upload your video

    POST the file to /videos. We stream it to storage and hand you back a videoId.

    POST /videos
  2. Create the captioning task

    One POST carries your approved transcript and starts styling and rendering with your chosen template. Add a notification webhook to skip polling.

    POST /videos/:id/task
  3. Receive the webhook

    We POST status updates to your endpoint as the task moves through transcribing → rendering → completed (with your own transcript, transcription is skipped).

    POST → your URL
  4. Download the finished render

    Burned-in subtitles, served from a global CDN. No watermark. MP4 ready for any social platform.

    GET renderUrl
import { readFileSync } from "node:fs";

// Build a multipart body that carries the source video.
const form = new FormData();
form.append(
  "file",
  new Blob([readFileSync("clip.mp4")]),
  "clip.mp4",
);

// Upload the file; the `id` in the response is the videoId used in later steps.
const { id: videoId } = await fetch(
  "https://api.zapcap.ai/videos",
  {
    method: "POST",
    headers: { "x-api-key": process.env.ZAPCAP_KEY! },
    body: form,
  },
).then(r => r.json());

POST /videos·Upload your video
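
Steps 2 through 4 can be sketched in the same style. This is a minimal sketch, not a reference client: the status path (GET /videos/:id/task/:taskId), the taskId, status and renderUrl response fields, and the "failed" status value are assumptions to check against the API reference. The HTTP function is passed in as a parameter so you can hand it fetch in production or a stub in tests.

```typescript
// Minimal sketch of steps 2-4. ASSUMPTIONS (not confirmed by this page):
// the GET status path, the `taskId`, `status` and `renderUrl` response
// fields, and the "failed" status value.
const API_BASE = "https://api.zapcap.ai";

type TranscriptEntry = {
  type: "word" | "punctuation";
  text: string;
  start_time: number;
  end_time: number;
};

interface TaskBody {
  templateId: string;
  autoApprove: boolean;
  transcript: TranscriptEntry[];
}

interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<any>;
}

type Http = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string },
) => Promise<HttpResponse>;

// Step 2: create the task with your approved transcript attached.
async function createTask(
  http: Http,
  apiKey: string,
  videoId: string,
  body: TaskBody,
): Promise<string> {
  const res = await http(`${API_BASE}/videos/${videoId}/task`, {
    method: "POST",
    headers: { "x-api-key": apiKey, "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`create task failed: HTTP ${res.status}`);
  const { taskId } = await res.json(); // assumed field name
  return taskId;
}

// Steps 3-4 without a webhook: poll until the render completes, then return its URL.
async function waitForRender(
  http: Http,
  apiKey: string,
  videoId: string,
  taskId: string,
  intervalMs = 3000,
  maxAttempts = 200,
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await http(`${API_BASE}/videos/${videoId}/task/${taskId}`, {
      method: "GET",
      headers: { "x-api-key": apiKey },
    });
    if (!res.ok) throw new Error(`status check failed: HTTP ${res.status}`);
    const task = await res.json();
    if (task.status === "completed") return task.renderUrl; // assumed field name
    if (task.status === "failed") throw new Error("render failed"); // assumed status
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("timed out waiting for render");
}
```

In production you would pass fetch as the first argument; with a notification webhook configured, the polling loop becomes a fallback rather than the main path.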

Live render

One approved transcript, four styled outputs

The text below is fixed — it comes from your SRT. Only the templateId changes. ZapCap renders your exact wording into a finished MP4 in each style.

Devin preset
Tracy preset
Beast preset
Hormozi preset

The wording never changes between renders because it is your transcript, not a fresh transcription. Only the styling differs across the four outputs.
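
To make "only the templateId changes" concrete, here is a small sketch that builds one task body per preset from a single approved transcript. The template IDs are hypothetical placeholders (this page does not list the real ID for each preset), and buildStyledTasks is an illustrative helper name, not part of the API.

```typescript
type TranscriptEntry = {
  type: "word" | "punctuation";
  text: string;
  start_time: number;
  end_time: number;
};

interface TaskBody {
  templateId: string;
  autoApprove: boolean;
  transcript: TranscriptEntry[];
}

// HYPOTHETICAL placeholders: substitute the real template ID for each preset.
const PRESET_TEMPLATE_IDS: Record<string, string> = {
  devin: "<devin-template-id>",
  tracy: "<tracy-template-id>",
  beast: "<beast-template-id>",
  hormozi: "<hormozi-template-id>",
};

// Build one task body per preset. Every body shares the same transcript
// array, so the wording cannot drift between renders.
function buildStyledTasks(
  transcript: TranscriptEntry[],
  templateIds: Record<string, string>,
): Record<string, TaskBody> {
  const tasks: Record<string, TaskBody> = {};
  for (const [preset, templateId] of Object.entries(templateIds)) {
    tasks[preset] = { templateId, autoApprove: true, transcript };
  }
  return tasks;
}

const approved: TranscriptEntry[] = [
  { type: "word", text: "Approved", start_time: 0.12, end_time: 0.48 },
  { type: "word", text: "wording", start_time: 0.49, end_time: 0.92 },
  { type: "punctuation", text: ".", start_time: 0.92, end_time: 0.92 },
];

const styledTasks = buildStyledTasks(approved, PRESET_TEMPLATE_IDS);
```

Each value in styledTasks is a body you would POST to /videos/:id/task for the same uploaded video.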

Bring your own transcript

You supply the words, ZapCap supplies the render

Send your approved SRT

Provide cues from your translation vendor, internal CMS, legal-reviewed copy, or your own Whisper pipeline. ZapCap renders them as-is — no transcription step, no rewording.

  • Wording, timing, and line breaks honored from your file
  • Approved product names, claims, and disclaimers preserved verbatim
  • Render the same SRT into multiple styles and aspect ratios

Or let ZapCap transcribe

Don't have a transcript yet? ZapCap can transcribe, split, and time captions for you, but the point of this workflow is that you don't have to.

  • Available when you want auto-transcription instead
  • Inspect / edit cues via PUT /videos/:id/task/:taskId/transcript
  • Switch to bring-your-own once your copy is approved
Styling as an API primitive

Style your transcript

Without touching the words.

Your cues are fixed; the look is yours to choose. Send a templateId for a complete style, or override individual renderOptions. Styling never alters the text — it only changes how your approved words are rendered.

  • Templates — Beast, Hormozi, Tracy, Devin, plus 25 more (29 presets) — apply a finished look to your own transcript.
  • Animation — word-by-word pops, karaoke fill, fades — drives off your cue timing.
  • Keyword emphasis — flag punchwords to color / scale / box — without editing the words themselves.
  • Layout — font, color, stroke, shadow, words per cue, vertical position with safe-zone math.
  • Aspect ratios — render 9:16, 1:1, 16:9 from one source and one SRT.
Render options
{
  "templateId": "d46bb0da-cce0-4507-909d-fa8904fb8ed7",
  "renderOptions": {
    "subsOptions": {
      "emphasizeKeywords": false,
      "animation": true,
      "displayWords": 4
    },
    "styleOptions": {
      "fontUppercase": false,
      "fontShadow": "m"
    }
  }
}
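
If you apply overrides in code, a typed merge keeps one changed key from wiping out its siblings. The sketch below types only the fields visible in the example above; the full set of renderOptions keys and the allowed fontShadow values are not listed on this page, so treat the interfaces as partial assumptions. mergeRenderOptions is an illustrative helper, not part of the API.

```typescript
// Partial types covering only the fields shown in the example JSON.
interface SubsOptions {
  emphasizeKeywords?: boolean;
  animation?: boolean;
  displayWords?: number;
}

interface StyleOptions {
  fontUppercase?: boolean;
  fontShadow?: string; // "m" appears in the example; other values are not listed here
}

interface RenderOptions {
  subsOptions?: SubsOptions;
  styleOptions?: StyleOptions;
}

// Merge each option group separately so an override of one key
// leaves the other keys in that group untouched.
function mergeRenderOptions(base: RenderOptions, overrides: RenderOptions): RenderOptions {
  return {
    subsOptions: { ...base.subsOptions, ...overrides.subsOptions },
    styleOptions: { ...base.styleOptions, ...overrides.styleOptions },
  };
}

const baseOptions: RenderOptions = {
  subsOptions: { emphasizeKeywords: false, animation: true, displayWords: 4 },
  styleOptions: { fontUppercase: false, fontShadow: "m" },
};

// Show two words per cue instead of four; everything else stays as-is.
const tighter = mergeRenderOptions(baseOptions, { subsOptions: { displayWords: 2 } });
```

None of this touches the transcript: the merged object only goes into the renderOptions field of the task body.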
Output modes

Your transcript, rendered the way your pipeline wants

Most common

Burned-in MP4

Your approved transcript rendered into the frames, styled. The default for social, ads, and any platform that ignores subtitle tracks.

.mp4 · h264
Editor-friendly

Transparent overlay

A caption-only layer of your transcript with alpha preserved, to drop over your own edit in an NLE.

.mov ProRes 4444 · .webm VP9 alpha
Compatibility

Green-screen layer

For tools without alpha. Your transcript on a #04F404 canvas you key out downstream.

.mp4 · #04F404 backdrop

Burned-in is best for distribution. Since you started from an SRT, you already have an accessibility file you can store alongside the MP4 for closed-caption use.

Localized transcripts

Already localized? Render each language

Bring a separate, approved SRT per market and render each into a styled MP4 — the translation work stays with your localization vendor, and ZapCap renders exactly what they signed off. CJK and Thai use language-aware line-breaking.

  • One source video, one approved SRT per target market
  • Vendor-approved translations rendered verbatim, no machine rewrite
  • Language-aware layout for Chinese, Japanese, Thai cue text
As approved
完全照原稿
ตามที่อนุมัติ
Simple API credits

Per-minute, usage-based credits

Pay for the minutes you render. Bringing your own transcript skips transcription, but billing is by rendered minute — see pricing for the full table.

  • Top up credits to keep renders flowing in production
  • Volume credits available at scale
  • No per-seat fee — pay for renders, not users
$0.10 / min

Indicative starting rate. Final pricing depends on render mode and output format. API access requires a Pro plan plus credits.
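
For budgeting, the per-minute model reduces to simple arithmetic. The sketch below assumes the indicative $0.10 / min rate, per-second proration, and rounding up to the next cent; the proration and rounding rules are assumptions, and the pricing table remains the authority.

```typescript
// Indicative rate from this page, expressed in cents to avoid float drift.
const RATE_CENTS_PER_MINUTE = 10;

// Estimate the cost in cents for a batch of renders.
// ASSUMPTIONS: billing is prorated per second and rounded up to a whole cent.
function estimateCostCents(
  durationsSeconds: number[],
  rateCentsPerMinute: number = RATE_CENTS_PER_MINUTE,
): number {
  const totalSeconds = durationsSeconds.reduce((sum, d) => sum + d, 0);
  return Math.ceil((totalSeconds * rateCentsPerMinute) / 60);
}

// Three 90-second clips: 270 s = 4.5 min, so 45 cents at the indicative rate.
const batchCents = estimateCostCents([90, 90, 90]);
```

Rendering the same clip in several styles or aspect ratios counts each render separately under this model, so multiply accordingly.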

Customer · Anonymized

A localization team rendered vendor-approved subtitles into styled MP4s without a single retranscription, keeping legal-reviewed wording intact

Translators deliver approved SRTs; ZapCap renders each into a branded, burned-in MP4 per market. Because nothing is re-listened to, names, claims, and disclaimers reach the screen exactly as signed off.

  • 0 · Retranscriptions of approved copy
  • Verbatim · Wording preserved to the screen
  • 1 SRT · Rendered into multiple styles
  • Per market · A styled MP4 from each localized file
Developer quickstart

Send approved cues and skip retranscription

Create a task with your approved transcript entries, a templateId, and autoApprove set to true. ZapCap renders those cues into the video instead of re-listening to the audio.

  • Use your SRT/VTT parser to convert approved cues into transcript entries
  • Pass transcript on POST /videos/:id/task to preserve wording
  • Use webhooks or GET task status to collect the finished MP4
{
  "templateId": "d46bb0da-cce0-4507-909d-fa8904fb8ed7",
  "autoApprove": true,
  "transcript": [
    { "type": "word", "text": "Approved", "start_time": 0.12, "end_time": 0.48 },
    { "type": "word", "text": "wording", "start_time": 0.49, "end_time": 0.92 },
    { "type": "punctuation", "text": ".", "start_time": 0.92, "end_time": 0.92 }
  ],
  "notification": {
    "type": "webhook",
    "notificationsFor": ["render"],
    "recipient": "https://your.app/api/zapcap-webhook"
  }
}
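
One way to get from an SRT file to entries shaped like the example above is sketched here. SRT carries timing per cue, not per word, so this sketch spreads each cue's duration evenly across its words and splits trailing punctuation into its own entry. Both choices are assumptions, as is the function name srtToTranscript. How cue line breaks should be represented in transcript entries is not covered on this page, so the sketch simply joins a cue's lines.

```typescript
type TranscriptEntry = {
  type: "word" | "punctuation";
  text: string;
  start_time: number;
  end_time: number;
};

// Matches an SRT timing line such as "00:00:01,500 --> 00:00:02,500".
const TIMING =
  /(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})/;

const toSeconds = (h: string, m: string, s: string, ms: string): number =>
  Number(h) * 3600 + Number(m) * 60 + Number(s) + Number(ms) / 1000;

// Round to milliseconds so evenly divided slots stay tidy.
const round3 = (n: number): number => Math.round(n * 1000) / 1000;

function srtToTranscript(srt: string): TranscriptEntry[] {
  const entries: TranscriptEntry[] = [];
  // Normalize line endings, then split into cues on blank lines.
  const blocks = srt.replace(/\r\n?/g, "\n").trim().split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n");
    const timingIndex = lines.findIndex((line) => TIMING.test(line));
    if (timingIndex === -1) continue; // not a cue
    const t = TIMING.exec(lines[timingIndex])!;
    const start = toSeconds(t[1], t[2], t[3], t[4]);
    const end = toSeconds(t[5], t[6], t[7], t[8]);
    const tokens = lines.slice(timingIndex + 1).join(" ").split(/\s+/).filter(Boolean);
    if (tokens.length === 0) continue;
    // ASSUMPTION: each word gets an equal share of the cue's duration.
    const slot = (end - start) / tokens.length;
    tokens.forEach((token, i) => {
      const wordStart = round3(start + i * slot);
      const wordEnd = round3(start + (i + 1) * slot);
      const match = /^(.*?)([.,!?;:]+)$/.exec(token);
      const word = match ? match[1] : token;
      if (word) {
        entries.push({ type: "word", text: word, start_time: wordStart, end_time: wordEnd });
      }
      if (match) {
        // Zero-length punctuation at the word's end, as in the example above.
        entries.push({ type: "punctuation", text: match[2], start_time: wordEnd, end_time: wordEnd });
      }
    });
  }
  return entries;
}
```

If your vendor supplies word-level timings, use those instead of the even split; the even split only approximates when each word is spoken.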

About the SRT to Burned-In Subtitles API

On this workflow, ZapCap does not retranscribe your video. You supply approved cues and ZapCap renders them verbatim: it does not re-listen to the audio or rewrite your text. The wording, timing, and line breaks come from your file, so the captions on screen are exactly what you approved.

Render your approved transcript through the API

Create a key on a Pro plan and buy credits in the dashboard.