Free tool

Free YouTube Caption Generator — auto-transcribed and styled in your browser

Drop in a short video (up to 2 minutes), get word-level captions back in under a minute, and style them with the same renderer Komodo's paid Captions Studio uses. Download the SRT for free — no login. Render the styled video with a free Komodo account.

Why a styled caption matters

Captions that hold the eye, not just the meaning

Most people watch short-form video on mute. The caption is the hook — but a flat, auto-burned subtitle reads like a transcript. Styled captions (a coloured highlight on the active word, a soft pill behind the line, a font that holds its own at thumbnail size) hold attention through the first three seconds, which is where retention is won.

This free tool exists because most other free caption generators stop at the SRT: you get the transcript and you're on your own for the styling. Here you can adjust the font, the highlight colour, the pill and the outline, watch the changes update on your actual video, and then either download the SRT to upload separately or render the styled video with a free Komodo account.

How to use it

Caption a clip in four steps

  1. Upload a video. Up to 2 minutes, up to 200 MB. MP4, MOV, M4A, WebM all work — phone clips straight from camera roll are fine.
  2. Wait ~10–30 seconds. Whisper transcribes with word-level timing.
  3. Style the captions. Font, colour, highlight, pill — preview live on your video.
  4. Download. Grab the SRT (free) or render the styled video (free Komodo account).
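
The limits in step 1 can be written down as a simple pre-upload check. This is a minimal sketch for illustration only: the function name `check_upload`, the constants, the decimal reading of "200 MB" and the choice to check by file extension are all assumptions, not Komodo's actual validation logic.

```python
from pathlib import Path

# Limits as stated on this page; the names below are hypothetical.
MAX_DURATION_S = 120                  # 2 minutes
MAX_SIZE_BYTES = 200 * 1000 * 1000    # 200 MB (decimal megabytes assumed)
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".m4a", ".webm"}


def check_upload(filename: str, size_bytes: int, duration_s: float) -> list:
    """Return a list of problems; an empty list means the file is within the limits."""
    problems = []
    ext = Path(filename).suffix.lower()  # case-insensitive, so "clip.MOV" is accepted
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 200 MB")
    if duration_s > MAX_DURATION_S:
        problems.append("video is longer than 2 minutes")
    return problems
```

A 45-second, 50 MB phone clip named `clip.MOV` would pass this check with no problems reported.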

FAQ

Free YouTube caption generator — questions

Is this YouTube caption generator really free?
Yes. Auto-transcription (up to 2 minutes of video), full caption styling, and SRT download are free with no login. Rendering a styled video with the captions burned in requires a free Komodo account, because rendering video runs on real compute.
How accurate is the transcription?
It runs on OpenAI Whisper, one of the most accurate general-purpose speech-to-text models available. Whisper returns a start and end time for each word, which is what makes per-word highlighted captions possible. Clean audio (clear speech, modest background noise) gives the most accurate result.
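
To illustrate why word-level timing matters, here is a rough sketch of how a renderer could pick the word to highlight at a given playback time. The `Word` class and `active_word_index` function are hypothetical names for this example and do not describe Komodo's renderer; the sketch assumes the words are sorted by start time and do not overlap.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the clip
    end: float    # seconds from the start of the clip


def active_word_index(words: List[Word], t: float) -> Optional[int]:
    """Return the index of the word being spoken at time t, or None during a pause."""
    starts = [w.start for w in words]
    i = bisect_right(starts, t) - 1      # last word whose start is <= t
    if i >= 0 and t < words[i].end:
        return i
    return None                          # before the first word, or in a gap
```

Without per-word start and end times there would be nothing to search here, and the caption could only be highlighted a whole line at a time.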
Why a 2-minute limit?
The cap keeps the free tool genuinely free without restricting who can use it. The full Captions Studio inside Komodo handles videos of any length, and the free Komodo account already includes 5 minutes of captions a month at no cost.
Can I download the captions as an SRT file?
Yes — the Download SRT button is free, no account needed. The SRT contains the transcript with timing exactly as Whisper produced it.
What's the difference between SRT and a styled video?
An SRT is a plain-text subtitle file you can upload separately to YouTube. It gets the words on screen but carries no visual style. A styled video is the original clip with the captions burned in: your font, your colours, the per-word highlight, the pill and so on. The SRT is free here; the styled-video render is one click after creating a free Komodo account.
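
For readers who have not opened an SRT file before, the sketch below shows the general shape of the format: a running cue number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, the caption text, and a blank line between cues. The helper names `srt_timestamp` and `to_srt` and the sample cues are made up for illustration; this is not the tool's own export code.

```python
from typing import List, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(cues: List[Tuple[float, float, str]]) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) cues."""
    blocks = []
    for n, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{n}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)  # joining with a newline leaves one blank line between cues


print(to_srt([(0.0, 1.2, "Hello there"), (1.2, 2.0, "world")]))
```

The file holds only text and timing, which is why fonts, colours and highlights have to be burned into the video itself.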
Will my video be saved anywhere?
No. The video is sent once to OpenAI for transcription, then the transcript is returned and the video is discarded. Nothing is stored on Komodo's side from the free tool.

Also free

While you're here — the rest of Komodo's free toolkit

Free Thumbnail Maker →
Full design canvas, no login
Free SEO Generator →
AI titles, description, tags
Earnings Calculator →
Estimate YouTube ad revenue