Closed Caption: Definition, Types, and How SDH Differs

A closed caption is a time-synchronized text track that the viewer can switch on or off. It reproduces a video's spoken dialogue along with non-speech information: sound effects, music, and speaker labels. "Closed" means hideable. For web and uploaded video, closed captions are typically delivered as a separate sidecar file (SRT or VTT); on broadcast TV they are carried inside the video signal. They exist primarily for deaf and hard-of-hearing access. With PlainScribe you can produce that file from any video at up to 99% accuracy for $0.067/min.

TL;DR

  • Definition: an optional, viewer-toggleable text track of dialogue plus non-speech sound, delivered as a sidecar SRT/VTT file.
  • CEA-608 vs CEA-708: the two U.S. broadcast caption standards — 608 is the legacy analog format; 708 is the modern digital one with styling and positioning.
  • SDH (Subtitles for the Deaf and Hard of Hearing) is the streaming-era equivalent that combines caption-style sound cues with subtitle-style delivery.
  • Make them yourself by transcribing audio into a timed file: PlainScribe does this at $0.067/min (about $4/hour), up to 99% accuracy, exporting SRT and VTT.
  • Free to try: 30 minutes, no credit card; files auto-delete after 7 days.

This is the precise, technical definition. For the plain-language meaning and everyday examples, start with the hub: what does closed caption mean.

The formal definition

A closed caption track consists of caption cues, each with three parts:

  1. Timing — a start and end timestamp pinning the text to the audio.
  2. Text — the dialogue, transcribed verbatim or lightly cleaned.
  3. Non-speech information — bracketed descriptions of sounds ([glass shatters]), music (♪ upbeat jazz ♪), and speaker labels when the speaker isn't obvious.
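As a rough illustration of the three parts, here is a minimal Python sketch of a cue and its SRT serialization. The names `CaptionCue`, `srt_timestamp`, and `to_srt` are hypothetical and chosen for this example; they are not part of PlainScribe or of any caption library. The sketch assumes non-speech information is stored inline in the cue text, as in the bracketed examples above.

```python
from dataclasses import dataclass


@dataclass
class CaptionCue:
    start: float  # timing: seconds from the start of the video
    end: float    # timing: seconds from the start of the video
    text: str     # dialogue and/or bracketed non-speech information


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[CaptionCue]) -> str:
    """Serialize cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        timing = f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}"
        blocks.append(f"{index}\n{timing}\n{cue.text}")
    return "\n\n".join(blocks) + "\n"


# Illustrative cues: one sound description, one line with a speaker label.
cues = [
    CaptionCue(1.0, 3.5, "[glass shatters]"),
    CaptionCue(3.6, 6.0, "MAYA: Who's there?"),
]
srt_text = to_srt(cues)
print(srt_text)
```

Each printed block is a cue number, a timing line, and the cue text, which is the layout SRT players expect.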

"Closed" distinguishes it from "open." A closed caption is decoded and displayed on demand by the player; an open caption is rendered into the video pixels and cannot be removed.

Closed captions vs. related terms

| Term | Toggleable | Non-speech audio | Same language as audio? |
|------|-----------|------------------|-------------------------|
| Closed captions | Yes | Yes | Usually |
| Open captions | No (burned in) | Usually | Usually |
| Subtitles | Yes | No | Often translated |
| SDH | Yes | Yes | Often translated |

Verdict: two traits together define closed captions. They are toggleable, and they include sound description. Drop the sound description and you have subtitles; remove the toggle and you have open captions. The full side-by-side is in closed vs open captions.

The technical standards

  • CEA-608 ("Line 21 captions"): the original U.S. analog-TV standard, monospaced and limited in styling. Still seen as a fallback.
  • CEA-708: the digital-TV standard, supporting fonts, colors, sizes, and on-screen positioning.
  • WebVTT (.vtt): the W3C standard for HTML5 web video, used by most modern web players.
  • SRT (.srt): an informal but near-universal sidecar format supported almost everywhere.

PlainScribe exports the two formats you'll actually attach to web and uploaded video — SRT and VTT (plus TXT and CSV). If you need to choose between them, see SRT vs VTT.
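The two sidecar formats are close relatives. In the simple case, a WebVTT file is an SRT file with a `WEBVTT` header line and a dot instead of a comma before the milliseconds. The Python sketch below illustrates that relationship; `srt_to_vtt` is a hypothetical helper written for this example, and it assumes plain SRT input with no styling tags or positioning data.

```python
import re

# Matches SRT timestamps such as 00:01:02,345 and captures both halves.
SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SRT text to WebVTT.

    The numeric SRT counters are kept; WebVTT accepts them as cue identifiers.
    """
    out = ["WEBVTT", ""]  # required header, then a blank line
    for line in srt_text.strip().splitlines():
        if "-->" in line:
            # Only timing lines are rewritten, so commas in dialogue survive.
            line = SRT_TIME.sub(r"\1.\2", line)
        out.append(line)
    return "\n".join(out) + "\n"


sample_srt = (
    "1\n00:00:01,000 --> 00:00:03,500\n[glass shatters]\n\n"
    "2\n00:00:03,600 --> 00:00:06,000\nMAYA: Wait, who's there?\n"
)
vtt_text = srt_to_vtt(sample_srt)
print(vtt_text)
```

A real converter would also need to handle styling tags and malformed input; this sketch covers only the header and timestamp differences.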

What about SDH?

SDH (Subtitles for the Deaf and Hard of Hearing) emerged because streaming platforms deliver subtitle files, not broadcast caption signals. SDH packages caption-style content — sound effects, speaker IDs, music notes — inside a subtitle delivery format, and is often offered in multiple languages. Practically, when you "turn on captions" on Netflix you're usually selecting an SDH track.

How to create a compliant closed caption file

  1. Transcribe the audio. Upload your video (up to 200MB) to PlainScribe; get timed text at up to 99% accuracy for $0.067/min.
  2. Add non-speech cues. Insert bracketed sound descriptions and speaker labels so the track qualifies as captions, not bare subtitles.
  3. Check timing and reading speed. Keep cues on screen long enough to read (under ~20 characters/second is a good target).
  4. Export SRT or VTT and attach it to your player.
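The reading-speed check in step 3 is simple arithmetic: characters in the cue divided by the seconds it stays on screen. The Python sketch below flags cues above the ~20 characters/second target. The function names and the `(start, end, text)` tuple layout are assumptions made for this example, and counting every character, spaces and punctuation included, is one convention among several.

```python
def chars_per_second(text: str, start: float, end: float) -> float:
    """Reading speed of one cue: characters shown per second on screen."""
    duration = end - start
    if duration <= 0:
        raise ValueError("cue must end after it starts")
    return len(text) / duration


def too_fast(cues, limit: float = 20.0):
    """Return the cues whose reading speed exceeds the limit."""
    return [
        cue for cue in cues
        if chars_per_second(cue[2], cue[0], cue[1]) > limit
    ]


# Illustrative cues as (start_seconds, end_seconds, text).
cues = [
    (0.0, 2.0, "[door slams]"),
    (10.0, 11.0, "I told you already, we are not going back there tonight."),
]

for start, end, text in too_fast(cues):
    print(f"Too fast ({start}s to {end}s): {text}")
```

Here the second cue is flagged because a full sentence is on screen for only one second. The usual fixes are to extend the cue's end time or split the text across two cues.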

For the end-to-end version with platform steps, see how to add captions to a video.

FAQs

What is the definition of a closed caption? A closed caption is a viewer-toggleable, time-synced text track that reproduces a video's dialogue and non-speech audio (sound effects, music, speaker labels), delivered as a separate file like SRT or VTT.

What is the difference between closed captions and SDH? Closed captions originated as a broadcast concept (CEA-608/708) carried in the video signal. SDH is the streaming-era equivalent: the same sound-inclusive content delivered in a subtitle file format, often in multiple languages.

What file formats are used for closed captions? On broadcast, CEA-608 and CEA-708. For web and uploaded video, WebVTT (.vtt) and SRT (.srt) are standard. PlainScribe exports SRT and VTT.

Are closed captions required by law? In many jurisdictions, yes — for broadcast and certain online content. Requirements vary by country and content type, so check your local accessibility regulations.

Can I make closed captions automatically? Yes. Automatic speech recognition produces the text and timing; you then add sound cues and review. PlainScribe automates the transcription at up to 99% accuracy.

Build an accurate caption file in minutes

Transcribe any video at up to 99% accuracy, add your sound cues, and export SRT/VTT — pay-as-you-go at $0.067/min, no subscription. Start free with 30 minutes, no credit card. See pricing or the tools page.
