YouTube Closed Captions: How They Work and How to Add Your Own

YouTube closed captions are the on-screen text that transcribes a video's spoken audio (and key sounds), toggled with the "CC" button. YouTube auto-generates them for most uploads, but auto-captions are often only ~70-90% accurate. For clean, edited captions you upload an SRT or VTT file. PlainScribe transcribes a video at up to 99% accuracy for $0.067/min ($4/hour) and exports SRT and VTT directly.

TL;DR

  • Closed captions can be turned on or off by the viewer with the CC button, unlike open captions which are burned permanently into the video.
  • YouTube auto-captions are free but imperfect — typically 70-90% accurate, with errors on names, accents, jargon, and overlapping speech.
  • Upload your own SRT/VTT file to control accuracy, punctuation, and timing; YouTube accepts both formats plus SBV.
  • PlainScribe costs $0.067/min (~$4 per audio hour), hits up to 99% accuracy, supports 47 languages, and exports SRT + VTT with no subscription.
  • Captions boost SEO and accessibility — search engines index caption text, and captions make videos usable by 430M+ people with hearing loss and anyone watching on mute.

What Are YouTube Closed Captions?

Closed captions are a synchronized text track displayed over a video that transcribes spoken dialogue and, ideally, relevant non-speech audio (music, applause, "[door slams]"). The "closed" part means they are optional: viewers click the CC icon to switch them on or off. That distinguishes them from open captions, which are permanently burned into the video frame and cannot be turned off.

On YouTube, captions come from three sources:

  1. Automatic captions: generated by YouTube's speech recognition after you upload, usually within minutes to a few hours.
  2. Uploaded caption files: an SRT, VTT, or SBV file you create and attach in YouTube Studio.
  3. Typed-in captions: manually entered and timed inside Studio's caption editor.

Auto-captions are the fastest, but accuracy drops sharply with accents, background noise, technical vocabulary, or fast/overlapping speech. Uploaded files give you full editorial control, which is why most professional channels create their own.
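
To make "uploaded caption file" concrete, here is a minimal Python sketch that writes SRT text from a list of timed cues. An SRT file is a series of blocks: a running index, a timing line in the form HH:MM:SS,mmm --> HH:MM:SS,mmm, the caption text, and a blank line. The function names and the sample cues are hypothetical, chosen only for illustration; they are not part of YouTube's or PlainScribe's tooling.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(cues: list[tuple[float, float, str]]) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) cues."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Blocks are separated by one blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical cues: one line of speech and one non-speech sound.
    print(build_srt([(0.0, 2.5, "Hello."), (2.5, 4.0, "[door slams]")]))
```

Saving the returned string to a file ending in .srt gives you something you can attach in YouTube Studio.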

Why Closed Captions Matter

  • Accessibility. The WHO estimates over 430 million people live with disabling hearing loss. Captions are how they access your content — and in many regions (US ADA, UK Equality Act) captions are a legal expectation for public-facing media.
  • Mute viewing. A large share of mobile and social viewing happens with the sound off. Captions keep those viewers watching.
  • SEO and discoverability. YouTube and Google index caption text. A complete, accurate transcript gives search real keywords to rank on, rather than the misheard words of an unedited auto-caption track.
  • Comprehension and retention. Viewers follow accents, jargon, and fast speech more easily when they can read along.

How to Add Closed Captions to a YouTube Video

You have two practical routes: auto-captions for speed, or uploaded files for accuracy.

Option A — Use YouTube auto-captions (fast, free, imperfect)

  1. Upload your video. YouTube generates automatic captions within minutes to a few hours.
  2. In YouTube Studio → Subtitles, open the auto-generated track.
  3. Review and edit the errors — names, numbers, punctuation, and any "[inaudible]" gaps.
  4. Publish the corrected track.

Option B — Upload a clean SRT/VTT file (recommended for quality)

  1. Transcribe your video in PlainScribe: upload the file (up to 200MB, MP4/MOV/MP3/WebM and more), and you get a timestamped transcript at up to 99% accuracy.
  2. Export as SRT or VTT — both are native YouTube formats.
  3. In YouTube Studio → Subtitles → Add language → Upload file, choose "With timing" and select your SRT/VTT.
  4. Save. The captions appear under the CC button, correctly timed.

This second path is also how you ship multiple languages: transcribe once, translate into any of PlainScribe's 47 languages, and upload a caption track per language.
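
Before uploading a caption file, it can help to run a quick sanity check on its timing. The Python sketch below looks for malformed timing lines, cues that end before they start, and cues that overlap the previous one. These particular checks are an assumption about what is worth catching early; they are not a description of the validation YouTube itself performs, and `check_srt` is a hypothetical helper name.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(h: str, m: str, s: str, ms: str) -> int:
    """Convert timestamp parts to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def check_srt(srt_text: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    previous_end = 0
    # Cues are separated by blank lines.
    blocks = [b for b in re.split(r"\n\s*\n", srt_text.strip()) if b.strip()]
    for number, block in enumerate(blocks, start=1):
        lines = block.splitlines()
        match = TIMING.fullmatch(lines[1].strip()) if len(lines) >= 2 else None
        if match is None:
            problems.append(f"cue {number}: missing or malformed timing line")
            continue
        parts = match.groups()
        start, end = to_ms(*parts[:4]), to_ms(*parts[4:])
        if end <= start:
            problems.append(f"cue {number}: end is not after start")
        if start < previous_end:
            problems.append(f"cue {number}: overlaps the previous cue")
        if len(lines) < 3:
            problems.append(f"cue {number}: no caption text")
        previous_end = max(previous_end, end)
    return problems
```

Running `check_srt(open("captions.srt", encoding="utf-8").read())` on each language track (the filename is a placeholder) and fixing anything it reports is a cheap step before the upload in YouTube Studio.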

Auto-Captions vs Uploaded Captions

| Factor | YouTube auto-captions | Uploaded SRT/VTT (e.g. from PlainScribe) |
|---|---|---|
| Cost | Free | $0.067/min (~$4/hr) |
| Accuracy | ~70-90%, varies with audio | Up to 99% |
| Punctuation/formatting | Limited | Full control |
| Languages | Auto-detect + auto-translate (rough) | 47 languages, human-quality translation |
| Editorial control | Edit in Studio | Edit before upload, reusable file |
| Best for | Quick drafts, casual uploads | Brand channels, courses, anything public-facing |

Verdict: Use auto-captions as a draft, but upload a clean SRT/VTT for anything that represents you professionally. PlainScribe's per-minute pricing means a 60-minute video costs about $4 to caption properly — far cheaper than a $24-$33/mo editor subscription you may barely use.

FAQs

What is the difference between closed captions and subtitles on YouTube? Closed captions transcribe all audio including sound effects and speaker cues, and assume the viewer may not hear the audio. Subtitles translate or transcribe dialogue only, assuming the viewer can hear but needs the words in text. On YouTube the feature is labeled "Subtitles/CC" and both are added the same way.

Are YouTube automatic captions accurate? They are usable but imperfect, typically landing around 70-90% accuracy depending on audio quality, accents, and vocabulary. They struggle with names, numbers, technical terms, and overlapping speech. Review and edit them, or upload a clean SRT/VTT file generated at up to 99% accuracy instead.
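
If you want to put a number on a caption track yourself, one common approach is word-level accuracy: compare the captions against a hand-corrected reference transcript and count word substitutions, insertions, and deletions. The Python sketch below does that with a standard edit-distance calculation. This article does not say how its accuracy figures were measured, so treat this as one reasonable way to check, not the method behind those numbers. `word_accuracy` is a hypothetical helper, and for simplicity it treats punctuation as part of each word.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - (word errors / reference words), floored at 0.0."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # Edit distance over words, kept one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # word missing from the captions
                current[j - 1] + 1,      # extra word in the captions
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current

    errors = previous[-1]
    return max(0.0, 1.0 - errors / len(ref))
```

For example, a 10-word reference with two misheard words scores 0.8, which corresponds to 80% accuracy.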

What file format does YouTube accept for captions? YouTube accepts several formats, most commonly SRT (SubRip) and VTT (WebVTT), as well as SBV. SRT and VTT are the safest choices. PlainScribe exports both, so you can upload directly.
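
SRT and VTT are close relatives: a WebVTT file starts with a WEBVTT header line and uses a period instead of a comma before the milliseconds in each timestamp. If you have one format and need the other, a conversion along these lines is usually enough for plain captions. This is a minimal Python sketch that assumes simple SRT input with no styling tags; `srt_to_vtt` is a hypothetical helper name.

```python
import re

# SRT timing line, e.g. "00:00:01,000 --> 00:00:04,000".
SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert plain SRT text to WebVTT text."""
    body = srt_text.replace("\r\n", "\n").strip()
    # Swap the comma before the milliseconds for a period.
    body = SRT_TIMING.sub(r"\1.\2 --> \3.\4", body)
    # The numeric SRT indexes are kept; WebVTT allows them as cue identifiers.
    return "WEBVTT\n\n" + body + "\n"
```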

Do closed captions help YouTube SEO? Yes. YouTube and Google index caption text, so an accurate transcript supplies real keywords for ranking and lets the algorithm understand your content. Accurate uploaded captions beat error-filled auto-captions for SEO because the indexed text actually matches what you said.

How much does it cost to caption a YouTube video? With PlainScribe it is $0.067 per minute, so a 60-minute video costs about $4 to transcribe and export as SRT/VTT. There is no subscription — you pay only for the minutes you process, and a $10 minimum buys roughly 150 minutes of credit.
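
The arithmetic behind those figures is simple enough to check yourself. The Python sketch below uses the $0.067 per-minute rate quoted in this article; it assumes straight per-minute billing with no rounding to billing increments, which the article does not specify. The function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-minute rate quoted in this article.
RATE_PER_MINUTE = Decimal("0.067")


def caption_cost(minutes) -> Decimal:
    """Cost in dollars for a given number of minutes, rounded to cents."""
    cost = Decimal(str(minutes)) * RATE_PER_MINUTE
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def minutes_for_budget(dollars) -> int:
    """Whole minutes a given dollar amount covers at the quoted rate."""
    return int(Decimal(str(dollars)) / RATE_PER_MINUTE)


if __name__ == "__main__":
    print(caption_cost(60))        # 4.02, i.e. about $4 for a 60-minute video
    print(minutes_for_budget(10))  # 149, i.e. roughly 150 minutes for $10
```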

Add Accurate Captions in Minutes

Start with 30 free minutes, no credit card required. Upload your video, get a transcript at up to 99% accuracy, export SRT or VTT, and drop it straight into YouTube Studio. For more, see the per-minute pricing, the comparisons of PlainScribe with Rev, Otter, and others, the in-depth guide to the auto-caption workflow ("YouTube automatic closed captioning"), and the guide to attaching your own files ("YouTube video captions").
