SRT to Burned-In Subtitles API

Your approved transcript in, a styled MP4 out

Already have an SRT — from your translation vendor, your CMS, or a reviewed Whisper pass? ZapCap renders it into a styled, burned-in MP4 without retranscribing a single word. The exact wording you approved is what gets baked into the frames.

Create API key · Read API docs

Bring your own SRT · no retranscription · exact wording preserved · styled burn-in


You already have the words

The transcript is done; now you need the render

When your captions are written, reviewed, and signed off, the last thing you want is an API that re-listens to the audio and rewrites them. You need styling and burn-in — not another transcription pass.

OPTION 01

Re-transcribe and hope

Auto-caption tools

Ignores your approved SRT
Re-listens to the audio
Can change names and claims
Another review cycle

OPTION 02

DIY ffmpeg burn-in

Self-hosted render workers

Respects your SRT
You build font + layout logic
No animated / styled presets
You run the render infra

OPTION 03 · YOU ARE HERE

ZapCap · SRT → styled burn-in

Your words, our styling and render

Renders your SRT verbatim
No retranscription
Styled presets or overrides
Finished MP4 returned

OPTION 04

Subtitle file generator

SRT / VTT tools

Makes the file
Doesn't render video
No styling
You still need burn-in

API workflow

How the SRT to Burned-In Subtitles API works

Upload the video, create a task that carries your approved cues and a templateId, then poll or get a webhook. ZapCap skips transcription and renders your transcript, styled, into the frames.

1
Upload your video
POST the file to /videos. We stream it to storage and hand you back a videoId.
POST /videos
2
Create the captioning task
One POST carries your approved transcript and starts styling and rendering with your chosen template. Add a notification webhook to skip polling.
POST /videos/:id/task
3
Receive the webhook
We POST status updates to your endpoint as the render moves through transcribing → rendering → completed.
HOOK: POST → your URL
4
Download the finished render
Burned-in subtitles, served from a global CDN. No watermark. MP4 ready for any social platform.
GET renderUrl

Step 1 / 4 · ~2s · POST /videos · Upload your video

import { readFileSync } from "node:fs";

// Wrap the local file in multipart form data under the "file" field.
const form = new FormData();
form.append(
  "file",
  new Blob([readFileSync("clip.mp4")]),
  "clip.mp4",
);

// Upload the video; the response id is the videoId used by later steps.
const { id: videoId } = await fetch(
  "https://api.zapcap.ai/videos",
  {
    method: "POST",
    headers: { "x-api-key": process.env.ZAPCAP_KEY! },
    body: form,
  },
).then(r => r.json());
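
Step 2 can be sketched as a small request builder. The body fields (templateId, autoApprove, transcript, notification) mirror the quickstart payload on this page, and the base URL and x-api-key header come from the upload example. The helper name buildTaskRequest and the TranscriptEntry type are hypothetical, and the response shape is not covered here, so treat this as a starting point rather than the official client.

```typescript
// Hypothetical helper: builds the step-2 request without sending it.
// Field names follow the quickstart payload shown on this page.
interface TranscriptEntry {
  type: "word" | "punctuation";
  text: string;
  start_time: number; // seconds
  end_time: number; // seconds
}

interface TaskRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildTaskRequest(
  videoId: string,
  apiKey: string,
  templateId: string,
  transcript: TranscriptEntry[],
  webhookUrl?: string,
): TaskRequest {
  const body: Record<string, unknown> = {
    templateId,
    autoApprove: true,
    transcript,
  };
  // Only attach the notification block when a webhook URL is supplied.
  if (webhookUrl !== undefined) {
    body["notification"] = {
      type: "webhook",
      notificationsFor: ["render"],
      recipient: webhookUrl,
    };
  }
  return {
    url: `https://api.zapcap.ai/videos/${encodeURIComponent(videoId)}/task`,
    init: {
      method: "POST",
      headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}
```

To send it, pass the two parts to fetch: `const { url, init } = buildTaskRequest(videoId, key, templateId, transcript); await fetch(url, init);`. Keeping the builder free of network calls makes the payload easy to inspect before anything is rendered.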

Live render

One approved transcript, four styled outputs

The text below is fixed — it comes from your SRT. Only the templateId changes. ZapCap renders your exact wording into a finished MP4 in each style.

Devin preset

Tracy preset

Beast preset

Hormozi preset

The wording never changes between renders because it is your transcript, not a fresh transcription. Only the styling differs across the four outputs.
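
As a rough illustration of "only the templateId changes", the sketch below builds one task body per style from a single transcript. The helper name buildStyleVariants is hypothetical, and any template IDs you pass in are your own; this page only shows one ID, in the render options example.

```typescript
// Hypothetical helper: one approved transcript, one task body per template.
interface TranscriptEntry {
  type: "word" | "punctuation";
  text: string;
  start_time: number; // seconds
  end_time: number; // seconds
}

interface TaskBody {
  templateId: string;
  autoApprove: boolean;
  transcript: TranscriptEntry[];
}

function buildStyleVariants(
  transcript: readonly TranscriptEntry[],
  templateIds: readonly string[],
): TaskBody[] {
  return templateIds.map((templateId) => ({
    templateId,
    autoApprove: true,
    // Copy each entry so the bodies do not share mutable objects.
    transcript: transcript.map((entry) => ({ ...entry })),
  }));
}
```

Each returned body differs only in templateId, so the wording sent for every render is identical by construction.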

Bring your own transcript

SRT to Burned-In Subtitles API transcript options

Send your approved SRT

Provide cues from your translation vendor, internal CMS, legal-reviewed copy, or your own Whisper pipeline. ZapCap renders them as-is — no transcription step, no rewording.

Wording, timing, and line breaks honored from your file
Approved product names, claims, and disclaimers preserved verbatim
Render the same SRT into multiple styles and aspect ratios
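
Before cues can be sent, the SRT file has to be read into structured cues. A minimal parser sketch, assuming well-formed SubRip input (index line, timing line, then text lines, with blank lines between cues); parseSrt, srtTimeToSeconds, and SrtCue are illustrative names, not part of the ZapCap API.

```typescript
// Illustrative SRT parser; names and types are assumptions for this sketch.
interface SrtCue {
  index: number;
  start: number; // seconds
  end: number; // seconds
  text: string; // original line breaks kept as "\n"
}

function srtTimeToSeconds(stamp: string): number {
  const m = /^(\d+):(\d{2}):(\d{2})[,.](\d{1,3})$/.exec(stamp.trim());
  if (m === null) throw new Error(`Bad SRT timestamp: ${stamp}`);
  const millis = Number((m[4] ?? "0").padEnd(3, "0"));
  return Number(m[1]) * 3600 + Number(m[2]) * 60 + Number(m[3]) + millis / 1000;
}

function parseSrt(srt: string): SrtCue[] {
  // Drop a leading BOM, normalize line endings, split on blank lines.
  const blocks = srt
    .replace(/^\uFEFF/, "")
    .replace(/\r\n?/g, "\n")
    .trim()
    .split(/\n\s*\n/);
  const cues: SrtCue[] = [];
  for (const block of blocks) {
    const lines = block.split("\n");
    if (lines.length < 2) continue; // skip empty or stray blocks
    const timing = /^(\S+)\s+-->\s+(\S+)/.exec(lines[1] ?? "");
    if (timing === null) {
      throw new Error(`Missing timing line in cue: ${lines[0] ?? ""}`);
    }
    cues.push({
      index: Number(lines[0]),
      start: srtTimeToSeconds(timing[1] ?? ""),
      end: srtTimeToSeconds(timing[2] ?? ""),
      text: lines.slice(2).join("\n"),
    });
  }
  return cues;
}
```

The parser never touches the cue text beyond joining its lines, which keeps the approved wording and line breaks intact for whatever conversion step follows.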

Or let ZapCap transcribe

Do not have a transcript yet? ZapCap can transcribe, split, and time captions for you — but on this workflow the point is that you do not have to.

Available when you want auto-transcription instead
Inspect / edit cues via PUT /videos/:id/task/:taskId/transcript
Switch to bring-your-own once your copy is approved

Styling as an API primitive

Style your transcript

Without touching the words.

Your cues are fixed; the look is yours to choose. Send a templateId for a complete style, or override individual renderOptions. Styling never alters the text — it only changes how your approved words are rendered.

Templates — Beast, Hormozi, Tracy, Devin, plus 25 more (29 presets) — apply a finished look to your own transcript.
Animation — word-by-word pops, karaoke fill, fades — drives off your cue timing.
Keyword emphasis — flag punchwords to color / scale / box — without editing the words themselves.
Layout — font, color, stroke, shadow, words per cue, vertical position with safe-zone math.
Aspect ratios — render 9:16, 1:1, 16:9 from one source and one SRT.


Render options

{
  "templateId": "d46bb0da-cce0-4507-909d-fa8904fb8ed7",
  "renderOptions": {
    "subsOptions": {
      "emphasizeKeywords": false,
      "animation": true,
      "displayWords": 4
    },
    "styleOptions": {
      "fontUppercase": false,
      "fontShadow": "m"
    }
  }
}
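
One way to manage overrides in client code is to keep a base set of renderOptions and layer per-render changes on top. The sketch below types only the fields shown in the example above; the real schema may have more, and mergeRenderOptions is a hypothetical helper, not an API call.

```typescript
// Covers only the fields visible in the example above; the full schema is not shown here.
interface RenderOptions {
  subsOptions?: {
    emphasizeKeywords?: boolean;
    animation?: boolean;
    displayWords?: number;
  };
  styleOptions?: {
    fontUppercase?: boolean;
    fontShadow?: string;
  };
}

// Hypothetical helper: overrides win field by field; untouched fields keep base values.
function mergeRenderOptions(base: RenderOptions, overrides: RenderOptions): RenderOptions {
  return {
    subsOptions: { ...base.subsOptions, ...overrides.subsOptions },
    styleOptions: { ...base.styleOptions, ...overrides.styleOptions },
  };
}
```

For example, merging `{ subsOptions: { displayWords: 2 } }` onto the options above changes the words per cue while leaving animation and fontShadow as they were. The transcript is not part of this object, so no override can alter the text.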

Output modes

SRT to Burned-In Subtitles API output modes

Most common

Burned-in MP4

Your approved transcript rendered into the frames, styled. The default for social, ads, and any platform that ignores subtitle tracks.

.mp4 · h264

Editor-friendly

Transparent overlay

A caption-only layer of your transcript with alpha preserved, to drop over your own edit in an NLE.

.mov ProRes 4444 · .webm VP9 alpha

Compatibility

Green-screen layer

For tools without alpha. Your transcript on a #04F404 canvas you key out downstream.

.mp4 · #04F404 backdrop

Burned-in is best for distribution. Since you started from an SRT, you already have an accessibility file you can store alongside the MP4 for closed-caption use.

Localized transcripts

Already localized? Render each language

Bring a separate, approved SRT per market and render each into a styled MP4 — the translation work stays with your localization vendor, and ZapCap renders exactly what they signed off. CJK and Thai use language-aware line-breaking.

One source video, one approved SRT per target market
Vendor-approved translations rendered verbatim, no machine rewrite
Language-aware layout for Chinese, Japanese, Thai cue text

As approved

完全照原稿

ตามที่อนุมัติ

Simple API credits

Per-minute, usage-based credits

Pay for the minutes you render. Bringing your own transcript skips transcription, but billing is by rendered minute — see pricing for the full table.

Top up credits to keep renders flowing in production
Volume credits available at scale
No per-seat fee — pay for renders, not users

$0.10 / min

Indicative starting rate. Final pricing depends on render mode and output format. API access requires a Pro plan plus credits.
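
For budgeting, the per-minute model reduces to simple arithmetic. The sketch below uses the indicative $0.10 rate quoted above and assumes fractional minutes are billed proportionally; how ZapCap actually rounds minutes, and how render mode affects the rate, is not stated on this page, so check the pricing table before relying on the numbers.

```typescript
// Rough budgeting sketch. Assumes proportional billing of fractional minutes,
// which is NOT confirmed by this page.
function estimateRenderCostUsd(
  durationsSeconds: readonly number[],
  ratePerMinuteUsd: number = 0.1,
): number {
  const totalMinutes = durationsSeconds.reduce((sum, s) => sum + s, 0) / 60;
  // Round to whole cents.
  return Math.round(totalMinutes * ratePerMinuteUsd * 100) / 100;
}
```

Under those assumptions, ten 60-second renders come to 10 minutes, or about $1.00 at the indicative rate.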

View full pricing · Talk to us about volume

Customer · Anonymized

A localization team rendered vendor-approved subtitles into styled MP4s without a single retranscription, keeping legal-reviewed wording intact

Translators deliver approved SRTs; ZapCap renders each into a branded, burned-in MP4 per market. Because nothing is re-listened to, names, claims, and disclaimers reach the screen exactly as signed off.

Read case study

0 retranscriptions of approved copy
Verbatim: wording preserved to the screen
1 SRT: rendered into multiple styles
Per market: a styled MP4 from each localized file

Developer quickstart

Send approved cues and skip retranscription

Create a task with your approved transcript entries, a templateId, and autoApprove set to true. ZapCap renders those cues into the video instead of re-listening to the audio.

Use your SRT/VTT parser to convert approved cues into transcript entries
Pass transcript on POST /videos/:id/task to preserve wording
Use webhooks or GET task status to collect the finished MP4

{
  "templateId": "d46bb0da-cce0-4507-909d-fa8904fb8ed7",
  "autoApprove": true,
  "transcript": [
    { "type": "word", "text": "Approved", "start_time": 0.12, "end_time": 0.48 },
    { "type": "word", "text": "wording", "start_time": 0.49, "end_time": 0.92 },
    { "type": "punctuation", "text": ".", "start_time": 0.92, "end_time": 0.92 }
  ],
  "notification": {
    "type": "webhook",
    "notificationsFor": ["render"],
    "recipient": "https://your.app/api/zapcap-webhook"
  }
}
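
The payload above uses word-level entries, while an SRT cue only has a start and end time for the whole line. One possible conversion, sketched below, splits each cue on whitespace, spreads the cue's duration evenly across its words, and emits trailing punctuation as zero-length entries at the end of the word, as in the example. The even spacing is an assumption of this sketch, not documented ZapCap behavior, and the sketch does not carry line breaks, since this page does not show how they are represented. Cue and cueToEntries are hypothetical names.

```typescript
interface TranscriptEntry {
  type: "word" | "punctuation";
  text: string;
  start_time: number; // seconds
  end_time: number; // seconds
}

interface Cue {
  start: number; // seconds
  end: number; // seconds
  text: string;
}

// Round to millisecond precision to avoid long floating-point tails.
function roundMs(seconds: number): number {
  return Math.round(seconds * 1000) / 1000;
}

// Hypothetical conversion: even per-word timing inside each cue (an assumption).
function cueToEntries(cue: Cue): TranscriptEntry[] {
  const tokens = cue.text.split(/\s+/).filter((t) => t.length > 0);
  if (tokens.length === 0) return [];
  const slot = (cue.end - cue.start) / tokens.length;
  const entries: TranscriptEntry[] = [];
  tokens.forEach((token, i) => {
    const start = roundMs(cue.start + i * slot);
    const end = roundMs(cue.start + (i + 1) * slot);
    // Separate trailing punctuation from the word it follows.
    const match = /^(.*?)([.,!?;:]+)$/.exec(token);
    const word = match ? (match[1] ?? "") : token;
    const punct = match ? (match[2] ?? "") : "";
    if (word.length > 0) {
      entries.push({ type: "word", text: word, start_time: start, end_time: end });
    }
    if (punct.length > 0) {
      entries.push({ type: "punctuation", text: punct, start_time: end, end_time: end });
    }
  });
  return entries;
}
```

Running every cue through cueToEntries and concatenating the results gives an array in the same shape as the transcript field above. If your source has real word-level timing (for example from a reviewed ASR pass), prefer that over the even split.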

About the SRT to Burned-In Subtitles API

ZapCap does not retranscribe your video on this workflow. You supply approved cues and ZapCap renders them verbatim; it does not re-listen to the audio or rewrite your text. The wording, timing, and line breaks come from your file, so the captions on screen are exactly what you approved.

You can bring approved cues from your translation vendor, your internal CMS, legal-reviewed copy, or your own Whisper / ASR pipeline. The point of the workflow is to preserve text that has already been written and signed off, including product names, claims, and disclaimers.

You can restyle captions without changing the words, because styling and wording are separate. Send a templateId and renderOptions for the look (animation, emphasis, font, shadow, position) and the text stays exactly as you provided it. You can render the same transcript into multiple styles.

Localized subtitles are supported. Bring a separate approved SRT per market and render each into a styled MP4. The translation stays with your localization vendor; ZapCap renders verbatim, with language-aware line-breaking for scripts like Chinese, Japanese, and Thai.

The output is a finished video with your transcript baked into the frames, an h264 MP4 by default. The same video captioning API task can also return a transparent overlay (ProRes 4444 or VP9 with alpha) or a green-screen layer if you composite captions downstream.

You can still offer closed captions, because you started from an SRT. Keep your SRT alongside the rendered MP4 to serve as a closed-caption / accessibility file where viewers need to toggle captions on and off.


Render your approved transcript through the API

Create a key on a Pro plan and buy credits in the dashboard.
