Accessibility Rule: <video> elements must have captions

About This Accessibility Rule

Captions are the primary way deaf and hard-of-hearing users access the audio portion of video content. When a video lacks captions, these users can see the visual content but miss everything communicated through sound — including spoken dialogue, narration, music, ambient sounds, and sound effects that provide context or meaning. This creates a critical barrier to understanding.

This rule relates to WCAG 2.0/2.1/2.2 Success Criterion 1.2.2: Captions (Prerecorded), a Level A criterion that requires captions for all prerecorded audio content in synchronized media, except when the media is itself a clearly labeled alternative for text. Section 508 and EN 301 549 carry the same requirement. Because Level A is the minimum baseline for accessibility, a video without captions fails at the most basic level of conformance.

Captions vs. Subtitles

Captions and subtitles look similar on screen but serve different purposes:

Captions (kind="captions") are designed for deaf and hard-of-hearing users. They include all dialogue, identify who is speaking, and describe meaningful non-speech audio such as sound effects and music (e.g., "[door slams]", "[dramatic orchestral music]", "[audience applause]").
Subtitles (kind="subtitles") are language translations of dialogue and narration, intended for hearing users who don't understand the spoken language. Subtitles generally do not include non-speech audio descriptions.

For accessibility compliance, you must use kind="captions", not kind="subtitles".

What Makes Good Captions

Good captions go beyond transcribing dialogue. They should:

Identify who is speaking when it's not visually obvious
Include meaningful sound effects (e.g., "[phone ringing]", "[glass shattering]")
Describe music when it's relevant (e.g., "[soft piano music]", "[upbeat pop song playing]")
Note significant silence or pauses when they carry meaning
Be accurately synchronized with the audio
Use proper spelling, grammar, and punctuation

How to Fix the Problem

Add a <track> element inside your <video> element with the following attributes:

src — the URL of the caption file (typically in WebVTT .vtt format)
kind — set to "captions"
srclang — the language code of the captions (e.g., "en" for English)
label — a human-readable label for the track (e.g., "English")

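src is the only attribute the HTML specification requires on every <track>, but kind defaults to "subtitles" when it is omitted, so you must set kind="captions" explicitly to satisfy this rule. srclang and label are strongly recommended so that browsers and assistive technologies can present the track to users correctly.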
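As a rough illustration of what this rule looks for, the check can be sketched as a small static scan using Python's standard html.parser module. The names VideoCaptionChecker and videos_missing_captions are hypothetical, and this is not the implementation of any particular accessibility tool: it only inspects static markup and will not see tracks added by script.

```python
from html.parser import HTMLParser


class VideoCaptionChecker(HTMLParser):
    """Collects <video> elements that have no <track kind="captions"> child."""

    def __init__(self):
        super().__init__()
        self.failures = []      # parser positions of <video> tags lacking captions
        self._open_videos = []  # stack of [position, has_captions]

    def handle_starttag(self, tag, attrs):
        if tag == "video":
            self._open_videos.append([self.getpos(), False])
        elif tag == "track" and self._open_videos:
            attr_map = dict(attrs)
            # A missing kind means "subtitles", so only an explicit
            # kind="captions" with a non-empty src counts here.
            kind = (attr_map.get("kind") or "").strip().lower()
            if kind == "captions" and attr_map.get("src"):
                self._open_videos[-1][1] = True

    def handle_endtag(self, tag):
        if tag == "video" and self._open_videos:
            position, has_captions = self._open_videos.pop()
            if not has_captions:
                self.failures.append(position)


def videos_missing_captions(html):
    """Return the parser position of every <video> without a captions track."""
    checker = VideoCaptionChecker()
    checker.feed(html)
    checker.close()
    failures = list(checker.failures)
    # Treat a <video> that was never closed like any other.
    failures.extend(
        pos for pos, has_captions in checker._open_videos if not has_captions
    )
    return failures
```

Calling videos_missing_captions on a page's markup returns an empty list when every video carries a captions track. A passing result says nothing about caption quality, which still needs human review.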

Examples

Incorrect: Video with no captions

<video width="640" height="360" controls>
  <source src="presentation.mp4" type="video/mp4">
</video>

This video has no <track> element, so deaf and hard-of-hearing users cannot access any of the audio content.

Incorrect: Using subtitles instead of captions

<video width="640" height="360" controls>
  <source src="presentation.mp4" type="video/mp4">
  <track src="subs_en.vtt" kind="subtitles" srclang="en" label="English">
</video>

While this provides a text track, kind="subtitles" does not satisfy the captions requirement. Subtitles typically include only dialogue and won't convey non-speech audio information.

Correct: Video with captions

<video width="640" height="360" controls>
  <source src="presentation.mp4" type="video/mp4">
  <track src="captions_en.vtt" kind="captions" srclang="en" label="English">
</video>

Correct: Video with captions in multiple languages

<video width="640" height="360" controls>
  <source src="presentation.mp4" type="video/mp4">
  <track src="captions_en.vtt" kind="captions" srclang="en" label="English" default>
  <track src="captions_es.vtt" kind="captions" srclang="es" label="Español">
</video>

The default attribute marks the track the browser enables when the user's preferences do not indicate that another track is more appropriate. Only one captions or subtitles track per video may carry it.
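When the list of caption languages comes from data rather than hand-written markup, the <track> tags can be generated. The following is a minimal sketch, assuming a hypothetical helper named render_tracks that takes (src, srclang, label) tuples; it escapes attribute values and puts default on at most one tag.

```python
from html import escape


def render_tracks(tracks, default_lang=None):
    """Return one <track kind="captions"> tag per (src, srclang, label) tuple.

    At most one tag receives the default attribute: the first one whose
    srclang equals default_lang.
    """
    lines = []
    default_used = False
    for src, srclang, label in tracks:
        attrs = (
            f'src="{escape(src)}" kind="captions" '
            f'srclang="{escape(srclang)}" label="{escape(label)}"'
        )
        if srclang == default_lang and not default_used:
            attrs += " default"
            default_used = True
        # Two-space indent matches the markup examples in this document.
        lines.append(f"  <track {attrs}>")
    return "\n".join(lines)
```

Passing the English and Spanish tracks with default_lang="en" yields the two <track> lines used in the example markup, ready to be placed inside the <video> element.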

Example WebVTT Caption File

A basic captions_en.vtt file looks like this:

WEBVTT

00:00:01.000 --> 00:00:04.000
[upbeat music playing]

00:00:05.000 --> 00:00:08.000
Sarah: Welcome to our annual conference!

00:00:09.000 --> 00:00:12.000
[audience applause]

00:00:13.000 --> 00:00:17.000
Sarah: Today we'll explore three key topics.
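
A file like the one above can also be produced from timing data. This is a minimal sketch, assuming cues arrive as (start_ms, end_ms, text) tuples; format_timestamp and build_webvtt are hypothetical helper names, and only the basic cue layout shown above is covered (no cue identifiers, settings, or styling).

```python
def format_timestamp(total_ms):
    """Format integer milliseconds as a WebVTT hh:mm:ss.ttt timestamp."""
    seconds, ms = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


def build_webvtt(cues):
    """Build WebVTT text from (start_ms, end_ms, text) tuples."""
    blocks = ["WEBVTT"]
    previous_start = 0
    for start_ms, end_ms, text in cues:
        if end_ms <= start_ms:
            raise ValueError(f"cue must end after it starts: {text!r}")
        if start_ms < previous_start:
            raise ValueError(f"cues must be in start-time order: {text!r}")
        if "-->" in text:
            raise ValueError(f"cue text must not contain '-->': {text!r}")
        previous_start = start_ms
        timing = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
        blocks.append(f"{timing}\n{text}")
    # The header and each cue are separated by a blank line.
    return "\n\n".join(blocks) + "\n"
```

Working in integer milliseconds avoids floating-point rounding in the timestamps. Feeding in the four cues from the sample (for instance (1000, 4000, "[upbeat music playing]") for the first) reproduces the file shown above.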