# `<video>` elements must have captions

> Canonical HTML version: https://rocketvalidator.com/accessibility-validation/axe/4.11/video-caption
> Attribution: Rocket Validator (https://rocketvalidator.com)
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

Captions are the primary way deaf and hard-of-hearing users access the audio portion of video content. When a video lacks captions, these users can see the visual content but miss everything communicated through sound — including spoken dialogue, narration, music, ambient sounds, and sound effects that provide context or meaning. This creates a critical barrier to understanding.

This rule relates to **WCAG 2.0/2.1/2.2 Success Criterion 1.2.2: Captions (Prerecorded)** at Level A, which requires captions for all prerecorded audio content in synchronized media, except when the media is itself a clearly labeled alternative for text. It is also required under **Section 508** and **EN 301 549**. Because this is a Level A requirement, it is part of the minimum baseline for accessibility, and a video without captions leaves deaf and hard-of-hearing users without access to its audio.

## Captions vs. Subtitles

It's important to understand the difference between captions and subtitles, as they serve different purposes:

- **Captions** (`kind="captions"`) are designed for deaf and hard-of-hearing users. They include all dialogue, identify speakers, and describe meaningful non-speech audio such as sound effects, music, and other auditory cues (e.g., "[door slams]", "[dramatic orchestral music]", "[audience applause]").
- **Subtitles** (`kind="subtitles"`) are language translations of dialogue and narration, intended for hearing users who don't understand the spoken language. Subtitles generally do not include non-speech audio descriptions.

For accessibility compliance, you must use `kind="captions"`, not `kind="subtitles"`.
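
The distinction can be checked mechanically. The following is a minimal sketch, not the axe-core implementation: `effectiveKind` and `hasCaptionsTrack` are hypothetical helpers, and the plain objects passed in stand in for the attributes of a video's `<track>` elements. The sketch relies on two HTML behaviors: `kind` values are matched case-insensitively, and a `<track>` with no `kind` attribute is treated as subtitles.

```javascript
// Hypothetical helper: resolve the kind a browser would assign to a track.
// `track` is a plain object such as { kind: "captions" }.
function effectiveKind(track) {
  // A missing kind attribute falls back to "subtitles" in HTML.
  if (track.kind === undefined || track.kind === null) {
    return "subtitles";
  }
  return String(track.kind).toLowerCase();
}

// Returns true when at least one track is a captions track.
function hasCaptionsTrack(tracks) {
  return tracks.some((track) => effectiveKind(track) === "captions");
}
```

In a browser, the input could be built with `Array.from(video.querySelectorAll("track"), (t) => ({ kind: t.getAttribute("kind") }))`, where `video` is the element being checked.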

## What Makes Good Captions

Good captions go beyond transcribing dialogue. They should:

- Identify who is speaking when it's not visually obvious
- Include meaningful sound effects (e.g., "[phone ringing]", "[glass shattering]")
- Describe music when it's relevant (e.g., "[soft piano music]", "[upbeat pop song playing]")
- Note significant silence or pauses when they carry meaning
- Be accurately synchronized with the audio
- Use proper spelling, grammar, and punctuation

## How to Fix the Problem

Add a `<track>` element inside your `<video>` element with the following attributes:

- **`src`** — the URL of the caption file (typically in WebVTT `.vtt` format)
- **`kind`** — set to `"captions"`
- **`srclang`** — the language code of the captions (e.g., `"en"` for English)
- **`label`** — a human-readable label for the track (e.g., `"English"`)

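`src` is the only attribute HTML requires on every `<track>`, but this rule is only satisfied when `kind="captions"` is set explicitly, because a `<track>` without a `kind` attribute is treated as subtitles. `srclang` and `label` are strongly recommended so that browsers can identify the track's language and users can tell tracks apart in the captions menu.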
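
As an illustration of how the four attributes fit together, the sketch below generates the `<track>` markup from a small options object. `buildCaptionTrack`, `escapeAttribute`, and the `isDefault` option are hypothetical names introduced for this example and are not part of any library. The helper always emits `kind="captions"` explicitly.

```javascript
// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: returns the markup for one captions track.
function buildCaptionTrack({ src, srclang, label, isDefault = false }) {
  if (!src) {
    throw new Error("src is required on every <track> element");
  }
  const attributes = [`src="${escapeAttribute(src)}"`, 'kind="captions"'];
  if (srclang) attributes.push(`srclang="${escapeAttribute(srclang)}"`);
  if (label) attributes.push(`label="${escapeAttribute(label)}"`);
  if (isDefault) attributes.push("default");
  return `<track ${attributes.join(" ")}>`;
}
```

Calling `buildCaptionTrack({ src: "captions_en.vtt", srclang: "en", label: "English" })` returns `<track src="captions_en.vtt" kind="captions" srclang="en" label="English">`.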

## Examples

### Incorrect: Video with no captions

```html
<video width="640" height="360" controls>
  <source src="presentation.mp4" type="video/mp4">
</video>
```

This video has no `<track>` element, so deaf and hard-of-hearing users cannot access any of the audio content.

### Incorrect: Using subtitles instead of captions

```html
<video width="640" height="360" controls>
  <source src="presentation.mp4" type="video/mp4">
  <track src="subs_en.vtt" kind="subtitles" srclang="en" label="English">
</video>
```

While this provides a text track, `kind="subtitles"` does not satisfy the captions requirement. Subtitles typically include only dialogue and won't convey non-speech audio information.

### Correct: Video with captions

```html
<video width="640" height="360" controls>
  <source src="presentation.mp4" type="video/mp4">
  <track src="captions_en.vtt" kind="captions" srclang="en" label="English">
</video>
```

### Correct: Video with captions in multiple languages

```html
<video width="640" height="360" controls>
  <source src="presentation.mp4" type="video/mp4">
  <track src="captions_en.vtt" kind="captions" srclang="en" label="English" default>
  <track src="captions_es.vtt" kind="captions" srclang="es" label="Español">
</video>
```

The `default` attribute marks the track the browser should enable unless the user's preferences indicate that another track is more appropriate.
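
HTML allows at most one `default` among a video's captions and subtitles tracks, so it can be worth checking multi-language markup for conflicts. The following is a sketch under stated assumptions: `countDefaultTextTracks` and `hasConflictingDefaults` are hypothetical helpers, and each plain object stands in for one `<track>` element, with the boolean `default` attribute modeled as `true` or `false`.

```javascript
// Hypothetical helper: count captions/subtitles tracks marked as default.
// Each entry looks like { kind: "captions", default: true }.
function countDefaultTextTracks(tracks) {
  return tracks.filter((track) => {
    // A missing kind attribute is treated as "subtitles" in HTML.
    const kind = String(track.kind ?? "subtitles").toLowerCase();
    const isTextTrack = kind === "captions" || kind === "subtitles";
    return isTextTrack && track.default === true;
  }).length;
}

// More than one default among these tracks is a markup error.
function hasConflictingDefaults(tracks) {
  return countDefaultTextTracks(tracks) > 1;
}
```

In a browser, the input could be collected with `Array.from(video.querySelectorAll("track"), (t) => ({ kind: t.getAttribute("kind"), default: t.hasAttribute("default") }))`.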

### Example WebVTT Caption File

A basic `captions_en.vtt` file looks like this:

```
WEBVTT

00:00:01.000 --> 00:00:04.000
[upbeat music playing]

00:00:05.000 --> 00:00:08.000
Sarah: Welcome to our annual conference!

00:00:09.000 --> 00:00:12.000
[audience applause]

00:00:13.000 --> 00:00:17.000
Sarah: Today we'll explore three key topics.
```

Notice how the captions identify the speaker, describe non-speech sounds, and are synchronized to specific timestamps.
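
To sanity-check a caption file like the one above before publishing, a small script can read the cues and flag obvious timing mistakes. This is a minimal sketch, not a complete WebVTT parser: `parseTimestamp`, `parseCues`, and `findCueProblems` are hypothetical helpers, cue settings and identifiers are ignored, and blocks without a timing line (such as `NOTE` or `STYLE`) are skipped. It cannot judge whether the caption text is accurate.

```javascript
// Convert a WebVTT timestamp ("hh:mm:ss.ttt" or "mm:ss.ttt") to seconds.
function parseTimestamp(stamp) {
  const match = /^(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})$/.exec(stamp);
  if (!match) return NaN;
  const [, hours = "0", minutes, seconds, millis] = match;
  return (
    Number(hours) * 3600 +
    Number(minutes) * 60 +
    Number(seconds) +
    Number(millis) / 1000
  );
}

// Split the file into blank-line-separated blocks and read each cue.
function parseCues(vttText) {
  const blocks = vttText.replace(/\r\n?/g, "\n").split(/\n{2,}/);
  const header = blocks.shift() ?? "";
  if (!/^\uFEFF?WEBVTT(?:[ \t\n]|$)/.test(header)) {
    throw new Error("file does not start with WEBVTT");
  }
  const cues = [];
  for (const block of blocks) {
    const lines = block.split("\n");
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue; // NOTE, STYLE, or empty block
    const [startStamp, , endStamp] = lines[timingIndex].trim().split(/\s+/);
    cues.push({
      start: parseTimestamp(startStamp),
      end: parseTimestamp(endStamp),
      text: lines.slice(timingIndex + 1).join("\n").trim(),
    });
  }
  return cues;
}

// Report cues with unreadable times, reversed times, or no text.
function findCueProblems(cues) {
  const problems = [];
  cues.forEach((cue, index) => {
    const name = `cue ${index + 1}`;
    if (Number.isNaN(cue.start) || Number.isNaN(cue.end)) {
      problems.push(`${name}: unreadable timestamp`);
    } else if (cue.end <= cue.start) {
      problems.push(`${name}: end time is not after start time`);
    }
    if (cue.text === "") problems.push(`${name}: no caption text`);
  });
  return problems;
}
```

Running `findCueProblems(parseCues(text))` on the example file returns an empty array, since all four cues have readable timestamps, an end time after the start time, and non-empty text.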
