Use case · Localization

Multilingual subtitle rendering API with per-script layout

Multilingual rendering for CJK, Thai, and more

Translation is only half of localization; the subtitles still have to wrap, break, and render correctly in each script. ZapCap is the caption-rendering and translation API that handles the layout rules that languages like Chinese, Japanese, and Thai actually need, rather than naive space-splitting, which fails as soon as a script is written without spaces.

Per-script layout · translateTo · webhook delivery

The problem

Translation is easy; rendering it correctly is not

A localization pipeline that only translates produces subtitles that look wrong on screen. Each script has its own line-break and wrapping rules; reading speed differs; and the styling has to stay consistent while the language underneath changes completely.

Script-specific wrapping: CJK text breaks between characters, not at spaces, and most engines get this wrong.
Thai line breaks: Thai has no spaces between words, so naive wrapping mangles it.
Reading speed: characters-per-second limits differ by language.
Style consistency: one caption look across every language version.
Re-render churn: restyle once, and you re-render every language by hand.
Source of truth: one transcript, many translated renders to track.
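
To make the wrapping problem concrete, here is a minimal sketch that contrasts space-splitting with script-aware break opportunities. It is not ZapCap's implementation. It assumes a runtime that provides the standard `Intl.Segmenter` API (for example Node 16 or later), and the names `breakUnits` and `wrapCaption`, the locale list, and the character-count line width are all illustrative assumptions. A production renderer would measure glyph widths and apply per-script line-breaking prohibitions as well.

```typescript
// Illustrative only: function names, the locale list and the width model are assumptions.
const CHARACTER_BREAKING_LOCALES = new Set(["zh", "ja"]);

/** Split text into units between which a line break is allowed. */
function breakUnits(text: string, locale: string): string[] {
  // Chinese and Japanese may break between characters (graphemes).
  // Thai has no spaces, so we ask the runtime's ICU data for word boundaries.
  const granularity = CHARACTER_BREAKING_LOCALES.has(locale) ? "grapheme" : "word";
  const segmenter = new Intl.Segmenter(locale, { granularity });
  return Array.from(segmenter.segment(text), (s) => s.segment);
}

/** Greedy wrap: pack units into lines of at most maxChars UTF-16 code units. */
function wrapCaption(text: string, locale: string, maxChars: number): string[] {
  const lines: string[] = [];
  let current = "";
  for (const unit of breakUnits(text, locale)) {
    if (current !== "" && current.length + unit.length > maxChars) {
      lines.push(current.trim());
      current = "";
    }
    current += unit;
  }
  if (current.trim() !== "") lines.push(current.trim());
  return lines;
}

const chinese = "今天天气很好我们去公园";
console.log(chinese.split(" ").length);     // 1: there are no spaces to split on
console.log(wrapCaption(chinese, "zh", 6)); // ["今天天气很好", "我们去公园"]

const thai = "สวัสดีครับยินดีต้อนรับ";
console.log(wrapCaption(thai, "th", 10));   // lines end at word boundaries found by ICU
```

The exact Thai boundaries depend on the dictionary data shipped with the runtime, which is one reason a spot-check on real renders is still worthwhile.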

Reference architecture

Where ZapCap sits in your localization pipeline

Drop the API behind your localization tooling. One source transcript drives a translated render per target language, each laid out with the rules its script needs.

Your source: source video + transcript · target languages · source clip in your CDN
Your backend: localization job runner · holds ZAPCAP_API_KEY (server-only) · /webhooks/zapcap handler
ZapCap: POST /videos · task per language (translateTo) · per-script renderUrl

Bring your own approved transcript via the transcript param when terminology must be exact; the dictionary param feeds names as transcription hints. Use base language codes like zh — not zh-CN / zh-TW.
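
If your localization tooling stores regional tags, a small normalization step can reduce them to base codes before they reach translateTo. The sketch below is an assumption about how you might do that on your side; `toBaseLanguageCode` and `normalizeTargets` are hypothetical helper names, not part of the API.

```typescript
/** Reduce a tag such as "zh-CN" or "pt_BR" to its base language subtag. */
function toBaseLanguageCode(tag: string): string {
  const base = tag.trim().split(/[-_]/)[0].toLowerCase();
  if (!/^[a-z]{2,3}$/.test(base)) {
    throw new Error(`Not a recognizable language tag: "${tag}"`);
  }
  return base;
}

/** Normalize a target list, dropping duplicates that collapse to the same base code. */
function normalizeTargets(tags: string[]): string[] {
  return Array.from(new Set(tags.map(toBaseLanguageCode)));
}

console.log(normalizeTargets(["zh-CN", "zh-TW", "ja", "th-TH"])); // ["zh", "ja", "th"]
```

Note that zh-CN and zh-TW collapse into a single zh target here, so two requested variants become one render. Check that this matches what your audience expects before wiring it in.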

API workflow

How the multilingual subtitle rendering API works

Upload once, supply or approve the source transcript, then fan out a translated render per language. We handle the per-script layout; you collect the finished files via webhook.

1. Upload your video (POST /videos)
POST the source clip to /videos and get back a videoId. The transcript can be auto-generated or supplied via the transcript param when terminology must be exact.

2. Create the captioning task (POST /videos/:id/task)
Create one task per target language with translateTo. Reuse a single templateId so styling stays identical while the language changes; captions wrap according to the rules of each script.

3. Receive the webhook (POST → your URL)
We POST status updates as each language moves through transcribing → rendering → completed. Map taskId → language so each render files itself correctly.

4. Download the finished render (GET renderUrl)
Each finished render is its own MP4, laid out for its script. Collect them as a localized set, with no manual re-styling per language.

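Step 1 of 4: upload your video (POST /videos)

    import { readFileSync } from "node:fs";

    // Build a multipart body containing the source clip.
    const form = new FormData();
    form.append("file", new Blob([readFileSync("clip.mp4")]), "clip.mp4");

    // Upload once; the returned id is reused for every per-language task.
    const res = await fetch("https://api.zapcap.ai/videos", {
      method: "POST",
      // Server-only secret, named as in the architecture section above.
      headers: { "x-api-key": process.env.ZAPCAP_API_KEY! },
      body: form,
    });
    if (!res.ok) throw new Error(`Upload failed: ${res.status}`);
    const { id: videoId } = await res.json();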
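
Step 2 could then fan out one task per language along the following lines. This is a sketch under stated assumptions: the endpoint path, `translateTo`, `templateId`, and the `x-api-key` header come from this page, while the JSON body shape, the `taskId` field in the response, and the helper names are hypothetical and should be checked against the API docs.

```typescript
// Sketch of step 2. Assumed: JSON body shape, `taskId` response field, helper names.
interface TaskRequest {
  language: string;
  url: string;
  body: { templateId: string; translateTo: string };
}

/** Build one task request per target language, all sharing a single templateId. */
function buildTaskRequests(videoId: string, templateId: string, languages: string[]): TaskRequest[] {
  return languages.map((language) => ({
    language,
    url: `https://api.zapcap.ai/videos/${encodeURIComponent(videoId)}/task`,
    body: { templateId, translateTo: language },
  }));
}

/** Send the requests; one rejected language does not stop the others. */
async function createTasks(requests: TaskRequest[], apiKey: string) {
  const languageByTaskId = new Map<string, string>();
  const failedLanguages: string[] = [];
  for (const request of requests) {
    const res = await fetch(request.url, {
      method: "POST",
      headers: { "x-api-key": apiKey, "content-type": "application/json" },
      body: JSON.stringify(request.body),
    });
    if (!res.ok) {
      failedLanguages.push(request.language);
      continue;
    }
    // Assumption: the response carries the task identifier as `taskId`.
    const { taskId } = (await res.json()) as { taskId: string };
    languageByTaskId.set(taskId, request.language);
  }
  return { languageByTaskId, failedLanguages };
}

const requests = buildTaskRequests("VIDEO_ID", "TEMPLATE_ID", ["zh", "ja", "th"]);
console.log(requests.map((r) => `${r.language} -> ${r.url}`));
```

Keeping the taskId → language map from `createTasks` is what lets the webhook handler file each render under the right language later.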

State machine

Lifecycle of a per-language render

Each language is an independent task. Track them so your localization dashboard shows the full set filling in, language by language.

pending → transcribing → transcriptionCompleted → rendering → completed
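
One way to keep a dashboard consistent is to accept only forward moves through this lifecycle, so a late or duplicated webhook cannot push a language backwards. The sketch below uses the status names listed above; `failed` is an assumed name for the failure state, which this page does not spell out, and `rank` and `nextStatus` are hypothetical helpers.

```typescript
// Status names follow the lifecycle above; "failed" is an assumed name.
const LIFECYCLE = ["pending", "transcribing", "transcriptionCompleted", "rendering", "completed"] as const;
type TaskStatus = (typeof LIFECYCLE)[number] | "failed";

/** Position of a status in the lifecycle; "failed" is terminal and ranks last. */
function rank(status: TaskStatus): number {
  return status === "failed" ? LIFECYCLE.length : LIFECYCLE.indexOf(status);
}

/** Accept an update only if it moves the task forward. */
function nextStatus(current: TaskStatus, incoming: TaskStatus): TaskStatus {
  if (current === "completed" || current === "failed") return current; // terminal states
  return rank(incoming) > rank(current) ? incoming : current;
}

console.log(nextStatus("pending", "transcribing"));   // "transcribing"
console.log(nextStatus("rendering", "transcribing")); // "rendering" (stale update ignored)
```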

In your dashboard

Show a per-language grid so localization owners see which renders are ready, which are still processing, and which need a transcript fix.

On webhook

Pull renderUrl and transcriptUrl, file them under the language, and mark the localized set one step closer to complete.

On failure

Re-run only the failed language; every other render in the set is unaffected.
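
The three behaviours above can be sketched as a small per-language grid. The payload shape is an assumption: `taskId`, `renderUrl`, and `transcriptUrl` are named on this page, but the `status` field, the `failed` value, and every helper name here are hypothetical.

```typescript
// Assumed webhook payload shape; confirm field names against the API docs.
interface TaskEvent {
  taskId: string;
  status: string;
  renderUrl?: string;
  transcriptUrl?: string;
}

interface LanguageEntry {
  status: string;
  renderUrl?: string;
  transcriptUrl?: string;
}

type Grid = Map<string, LanguageEntry>; // language → latest known state

/** Start every target language as pending. */
function createGrid(languages: string[]): Grid {
  return new Map(languages.map((l): [string, LanguageEntry] => [l, { status: "pending" }]));
}

/** File an event under its language using the taskId → language map. */
function applyEvent(grid: Grid, languageByTaskId: Map<string, string>, event: TaskEvent): void {
  const language = languageByTaskId.get(event.taskId);
  if (language === undefined) return; // unknown task: ignore rather than guess
  grid.set(language, {
    status: event.status,
    renderUrl: event.renderUrl,
    transcriptUrl: event.transcriptUrl,
  });
}

/** Languages whose task failed and should be re-run on their own. */
function languagesToRerun(grid: Grid): string[] {
  return Array.from(grid).filter(([, entry]) => entry.status === "failed").map(([language]) => language);
}

/** True once every language in the set has a completed render. */
function isSetComplete(grid: Grid): boolean {
  return Array.from(grid.values()).every((entry) => entry.status === "completed");
}

const languageByTaskId = new Map([["task_a", "zh"], ["task_b", "th"]]);
const grid = createGrid(["zh", "th"]);
applyEvent(grid, languageByTaskId, { taskId: "task_a", status: "completed", renderUrl: "https://example.com/zh.mp4" });
applyEvent(grid, languageByTaskId, { taskId: "task_b", status: "failed" });
console.log(languagesToRerun(grid), isSetComplete(grid)); // ["th"] false
```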

Launch checklist

Multilingual subtitle rendering API launch checklist

A short list to keep multilingual rendering correct per script and consistent across the set. Transcript, layout, and delivery in one place.

Approved source transcript: supply your own via the transcript param where terminology must be exact.
Terminology in your dictionary: names and product terms as transcription hints before translation.
One styling preset across languages: a single templateId so every language version shares the same look.
Base language codes: use translateTo with codes like zh, not regional variants like zh-CN.
Per-language tracking: map taskId → language so each render files correctly.
Webhook signature verified: check x-signature on every payload; dedupe on eventId.
CJK / Thai spot-check: review wrapping on a sample render before localizing the whole library.
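
For the signature and dedupe items, a handler might look roughly like this. The signing scheme is an assumption: this page names the x-signature header and eventId but not the algorithm, so the sketch supposes a hex-encoded HMAC-SHA256 of the raw request body keyed with a shared secret. Confirm the real scheme in the API docs before relying on it. The helper names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// ASSUMPTION: x-signature is a hex HMAC-SHA256 of the raw body with a shared secret.
function isValidSignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// In-memory for illustration; use a persistent store in production.
const seenEventIds = new Set<string>();

/** Returns true the first time an eventId is seen, false for replays. */
function isFirstDelivery(eventId: string): boolean {
  if (seenEventIds.has(eventId)) return false;
  seenEventIds.add(eventId);
  return true;
}

// Usage with a locally signed payload.
const secret = "example-secret";
const rawBody = JSON.stringify({ eventId: "evt_1", taskId: "task_a", status: "completed" });
const signature = createHmac("sha256", secret).update(rawBody).digest("hex");
console.log(isValidSignature(rawBody, signature, secret)); // true
console.log(isFirstDelivery("evt_1"), isFirstDelivery("evt_1")); // true false
```

Verify against the raw bytes of the request body, before any JSON parsing, since re-serializing the payload can change the bytes that were signed.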

Build vs buy

The multilingual rendering stack, honestly

Build it yourself

In-house localization renderer

1. Translation wiring: vendors, terminology control, review loops.
2. Per-script layout engine: CJK character breaking, Thai word segmentation.
3. Reading-speed logic: characters-per-second limits per language.
4. Render workers: ffmpeg / libass with the right font coverage per script.
5. Style consistency: identical look across every language render.
6. Output storage: one source, many renders, organised per language.
7. Billing meter: per-minute counters across the library.
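
To give a sense of what item 3 involves when you build it yourself, here is a minimal reading-speed check. It is a sketch, not a description of how ZapCap does it. The `Cue` shape and function names are hypothetical, and the limit in the usage example is a placeholder, not a recommended value; real limits come from your own subtitle style guide.

```typescript
interface Cue {
  text: string;
  startMs: number;
  endMs: number;
}

/** Characters per second for one cue, counting Unicode code points and ignoring whitespace. */
function charactersPerSecond(cue: Cue): number {
  const durationSeconds = (cue.endMs - cue.startMs) / 1000;
  if (durationSeconds <= 0) throw new Error("Cue must have a positive duration");
  // Array.from iterates by code point, so characters outside the BMP count once.
  const visibleChars = Array.from(cue.text.replace(/\s/g, "")).length;
  return visibleChars / durationSeconds;
}

/** Cues that exceed the given per-language limit. */
function cuesOverLimit(cues: Cue[], maxCps: number): Cue[] {
  return cues.filter((cue) => charactersPerSecond(cue) > maxCps);
}

const cues: Cue[] = [
  { text: "今天天气很好我们去公园", startMs: 0, endMs: 1000 },    // 11 chars in 1 s
  { text: "今天天气很好我们去公园", startMs: 1000, endMs: 3000 }, // 11 chars in 2 s
];
const placeholderLimit = 9; // hypothetical value for illustration only
console.log(cuesOverLimit(cues, placeholderLimit).length); // 1
```

Counting code points is itself a simplification: Thai combining marks, for example, count separately here even though they do not occupy their own visual column.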

Use ZapCap

Multilingual rendering as a primitive

1. translateTo per task: a render per language, laid out for its script.
2. One templateId: consistent styling across the set.
3. Webhook handler: verify, file by language, assemble the set.

When another tool fits better: ZapCap ships both a web editor and this API and is best-in-class at captioning, but if you need screen and webcam recording plus a broad general video-editing toolset, a full creator app like VEED may suit. See our honest alternatives comparisons.

What changes when rendering becomes an API call.

1 source: transcript drives every language
Per language: one correctly laid-out render each
Per-script: CJK and Thai wrap correctly
~0 lines of ffmpeg / layout code

Customer · Anonymized

A localization team replaced a hand-tuned subtitle renderer with the ZapCap API and now produces per-language renders that wrap correctly in CJK and Thai from a single source transcript

Translation was never the blocker — getting subtitles to break and read correctly per script was. With layout handled by the API, the team renders the whole language set from one transcript and one styling preset.

1 transcript: source of truth per video
Per language: one render each
CJK / Thai: wrapped per script
Per-minute: billing passes through cleanly

For localization teams

Do you wrap CJK and Thai subtitles correctly?
Yes. Captions are laid out with per-script rules, so Chinese, Japanese, and Korean break by character and Thai breaks at word boundaries rather than on spaces. That is the difference between subtitles that read naturally and ones that break mid-word.

How do I target a specific language?
Set translateTo on the task to the target language code, for example zh. Use the base code rather than regional variants like zh-CN or zh-TW. Cantonese is not currently supported.

Can I supply my own approved translation or transcript?
Yes. When terminology has to be exact, supply your own approved transcript via the transcript param instead of relying on automatic transcription. The dictionary param separately feeds names and terms as hints to the transcription step.

Will every language version share the same styling?
Reuse a single templateId across every task. The font, colour, and placement stay identical while the text underneath changes per language, so the localized set looks like one consistent piece. Styling is a template parameter on ZapCap’s video captioning API, not a per-render decision.

How do I render many languages from one source?
Upload the source once with POST /videos, then create one task per target language, each with its own translateTo. Each render is independent and arrives via webhook, so a failure in one language never blocks the rest of the set.

What if I need screen recording and general video editing?
ZapCap ships both a web editor at zapcap.ai and this rendering API, and it is the more affordable, best-in-class choice for captioning. If your workflow needs screen and webcam recording plus a broad general video-editing toolset, a full creator app like VEED may suit better. Our honest comparison pages map out who beats us on which dimension.

Build the rest of the localization flow

E-commerce video localization API
TikTok Shop video localization API
Performance creative localization API
Agency video captioning API
AI video SaaS captioning API

Render subtitles that read right in every script

Backend-only API, webhook-native, from $0.10/min base usage pricing. One source transcript in, a correctly laid-out render per language out.