Accessibility Guides for deaf
Learn how to identify and fix common accessibility issues flagged by Axe Core, so your pages are inclusive and usable for everyone.
When an <audio> element lacks a captions track, all of the information it conveys — dialogue, narration, sound effects, musical cues, and speaker identification — becomes completely inaccessible to users who are deaf or deafblind. This is considered a critical accessibility issue because it blocks entire groups of users from accessing content.
This rule relates to WCAG Success Criterion 1.2.1: Audio-only and Video-only (Prerecorded) (Level A), which requires an alternative for time-based media that presents equivalent information for prerecorded audio-only content. It also falls under Section 508 requirements and EN 301 549. Level A criteria represent the most fundamental accessibility requirements; failing to meet them means significant barriers exist for users with disabilities.
Captions vs. Subtitles
It’s important to understand that captions and subtitles are not the same thing:
- Captions (kind="captions") are designed for deaf and hard-of-hearing users. They include dialogue, speaker identification, sound effects (e.g., “[door slams]”), musical cues (e.g., “[soft piano music]”), and other meaningful audio information.
- Subtitles (kind="subtitles") are language translations intended for hearing users who don’t understand the spoken language. They typically include only dialogue and narration.
Because of this distinction, you must use kind="captions", not kind="subtitles", to satisfy this rule.
How to Fix It
- Create a captions file (typically in WebVTT .vtt format) that includes all meaningful audio information: who is speaking, what they say, and relevant non-speech sounds.
- Add a <track> element inside your <audio> element.
- Set the kind attribute to "captions".
- Set the src attribute to the path of your captions file.
- Use the srclang attribute to specify the language of the captions.
- Use the label attribute to give the track a human-readable name.
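Only src is strictly required for a <track> element to be valid HTML, but this rule depends on kind: when kind is omitted, the browser treats the track as subtitles, which does not satisfy the check. Set kind="captions" explicitly, and include srclang and label so that browsers can identify the track's language and show users a readable name.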
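The steps above can also be approximated with an automated check. The following Python sketch is illustrative only and is not how Axe Core implements the rule. The names AudioCaptionChecker and audio_without_captions are hypothetical, and the code assumes that every <audio> element has a closing tag.

```python
from html.parser import HTMLParser


class AudioCaptionChecker(HTMLParser):
    """Collects <audio> elements that contain no <track kind="captions">.

    Hypothetical helper for illustration; not part of Axe Core.
    """

    def __init__(self):
        super().__init__()
        self.missing = []   # (line, column) of each failing <audio>
        self._open = []     # stack of [position, has_captions] per open <audio>

    def handle_starttag(self, tag, attrs):
        if tag == "audio":
            self._open.append([self.getpos(), False])
        elif tag == "track" and self._open:
            attr = dict(attrs)
            kind = (attr.get("kind") or "").strip().lower()
            # A missing kind defaults to subtitles, so it must be explicit.
            if kind == "captions" and attr.get("src"):
                self._open[-1][1] = True

    def handle_endtag(self, tag):
        if tag == "audio" and self._open:
            position, has_captions = self._open.pop()
            if not has_captions:
                self.missing.append(position)


def audio_without_captions(html):
    """Return the positions of <audio> elements lacking a captions track."""
    checker = AudioCaptionChecker()
    checker.feed(html)
    checker.close()
    return checker.missing
```

Calling audio_without_captions(page_source) returns an empty list when every <audio> element has a captions track. A check like this only confirms that the markup is present; it cannot tell whether the captions file itself is accurate or complete.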
Examples
Incorrect: <audio> with no captions track
<audio controls>
<source src="podcast.mp3" type="audio/mpeg">
</audio>
This fails the rule because there is no <track> element providing captions.
Incorrect: <track> with wrong kind value
<audio controls>
<source src="podcast.mp3" type="audio/mpeg">
<track src="subs_en.vtt" kind="subtitles" srclang="en" label="English">
</audio>
This fails because kind="subtitles" does not satisfy the captions requirement. Subtitles are not a substitute for captions.
Correct: <audio> with a captions track
<audio controls>
<source src="podcast.mp3" type="audio/mpeg">
<track src="captions_en.vtt" kind="captions" srclang="en" label="English Captions">
</audio>
Correct: <audio> with multiple caption tracks for different languages
<audio controls>
<source src="interview.mp3" type="audio/mpeg">
<track src="captions_en.vtt" kind="captions" srclang="en" label="English Captions">
<track src="captions_es.vtt" kind="captions" srclang="es" label="Subtítulos en español">
</audio>
Providing captions in multiple languages ensures broader accessibility and is especially helpful when your audience speaks different languages.
Example WebVTT captions file
A basic captions_en.vtt file might look like this:
WEBVTT

00:00:01.000 --> 00:00:04.000
[Upbeat intro music]

00:00:04.500 --> 00:00:07.000
Host: Welcome to the show, everyone.

00:00:07.500 --> 00:00:10.000
Guest: Thanks for having me!

00:00:10.500 --> 00:00:12.000
[Audience applause]
Notice how the captions include speaker identification (Host:, Guest:), non-speech sounds ([Upbeat intro music], [Audience applause]), and the full dialogue. This level of detail is what makes captions effective for deaf and deafblind users.
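To review a captions file like the one above, it can help to load the cues into a script. The sketch below is a simplified reader, not a full WebVTT parser: it ignores cue settings, styling, and NOTE blocks. The function names are hypothetical, and treating a cue written entirely in square brackets as a non-speech sound is an assumption based on the convention used in this example.

```python
import re

# Matches "HH:MM:SS.mmm --> HH:MM:SS.mmm"; the hours part is optional in WebVTT.
CUE_TIMING = re.compile(
    r"^((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)


def to_seconds(timestamp):
    """Convert "HH:MM:SS.mmm" or "MM:SS.mmm" to seconds."""
    seconds = 0.0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_cues(vtt_text):
    """Return (start, end, text) tuples from simple WebVTT text."""
    cues = []
    current = None
    for line in vtt_text.splitlines():
        stripped = line.strip()
        match = CUE_TIMING.match(stripped)
        if match:
            current = [to_seconds(match.group(1)), to_seconds(match.group(2)), []]
            cues.append(current)
        elif not stripped:
            current = None              # a blank line ends the current cue
        elif current is not None:
            current[2].append(stripped)  # cue text may span several lines
    return [(start, end, "\n".join(lines)) for start, end, lines in cues]


def non_speech_cues(cues):
    """Cues written entirely in square brackets, e.g. "[Audience applause]"."""
    return [text for _, _, text in cues if text.startswith("[") and text.endswith("]")]
```

If non_speech_cues returns an empty list for a recording that clearly contains music or sound effects, the file may be closer to subtitles than captions and is worth reviewing by hand.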
HTML lists rely on a parent-child relationship between the list container (<ul> or <ol>) and its items (<li>). When an <li> element appears outside of a valid list parent, the browser and assistive technologies lose the semantic meaning of that element. The markup is technically invalid HTML, and the content is no longer recognized as part of a list structure.
This issue primarily affects screen reader users. When a screen reader encounters a properly structured list, it announces the list type and the total number of items — for example, “list, 3 items.” As the user navigates through the list, the screen reader announces each item’s position, such as “1 of 3.” This context is essential for understanding the structure of the content, anticipating its length, and navigating efficiently. Without a <ul> or <ol> parent, none of this information is communicated, leaving the user with no way to know the items are related or how many there are.
This rule relates to WCAG 2.2 Success Criterion 1.3.1: Info and Relationships (Level A), which requires that information, structure, and relationships conveyed through visual presentation are also available programmatically. A visual list of items implies a relationship between those items. Using correct semantic markup ensures that relationship is exposed to assistive technologies, not just conveyed visually through bullet points or numbering.
How to fix it
- Identify any <li> elements that are not wrapped in a <ul> or <ol>.
- Determine whether the list is unordered (no meaningful sequence) or ordered (sequence matters).
- Wrap the <li> elements in the appropriate parent: <ul> for unordered lists or <ol> for ordered lists.
Choose <ul> when the order of items doesn’t matter (e.g., a list of ingredients). Choose <ol> when the order is meaningful (e.g., step-by-step instructions).
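The first step, finding <li> elements without a list parent, can be sketched in Python with the standard library's html.parser. This is a rough illustration rather than Axe Core's implementation. OrphanListItemFinder and find_orphan_list_items are hypothetical names, and the simple element stack assumes reasonably well-formed markup; it is not a full HTML5 tree builder and may misreport items that follow unclosed elements such as <p>.

```python
from html.parser import HTMLParser

# Void elements never have a closing tag, so they are not pushed on the stack.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "source", "track", "wbr",
}
# Only the two parents discussed in this guide are accepted (an assumption).
LIST_PARENTS = {"ul", "ol"}


class OrphanListItemFinder(HTMLParser):
    """Records the position of every <li> whose parent is not a list."""

    def __init__(self):
        super().__init__()
        self.orphans = []   # (line, column) of each orphaned <li>
        self._stack = []    # names of the currently open elements

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            # A new <li> implicitly closes a previous sibling left open.
            if self._stack and self._stack[-1] == "li":
                self._stack.pop()
            parent = self._stack[-1] if self._stack else None
            if parent not in LIST_PARENTS:
                self.orphans.append(self.getpos())
        if tag not in VOID_ELEMENTS:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open element, skipping unclosed children.
        if tag in self._stack:
            while self._stack.pop() != tag:
                pass


def find_orphan_list_items(html):
    """Return the positions of <li> elements outside a <ul> or <ol>."""
    finder = OrphanListItemFinder()
    finder.feed(html)
    finder.close()
    return finder.orphans
```

Each reported position is a (line, column) pair pointing at the offending <li>, which makes it easier to locate the markup that needs a <ul> or <ol> wrapper.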
Examples
Incorrect: <li> elements without a list parent
These list items have no <ul> or <ol> container, so they are not recognized as a list by assistive technologies.
<li>Coffee</li>
<li>Tea</li>
<li>Milk</li>
Correct: <li> elements inside a <ul>
Wrapping the items in a <ul> creates a valid unordered list that screen readers can announce properly.
<ul>
<li>Coffee</li>
<li>Tea</li>
<li>Milk</li>
</ul>
Correct: <li> elements inside an <ol>
When the order of the items is meaningful, use an <ol> instead.
<ol>
<li>Preheat the oven to 350°F.</li>
<li>Mix the dry ingredients.</li>
<li>Bake for 25 minutes.</li>
</ol>
Incorrect: <li> inside a <div> instead of a list parent
A <div> is not a valid parent for <li> elements, even if it looks correct visually.
<div>
<li>Item one</li>
<li>Item two</li>
</div>
Correct: Replace the <div> with a <ul>
<ul>
<li>Item one</li>
<li>Item two</li>
</ul>
Captions are the primary way deaf and hard-of-hearing users access the audio portion of video content. When a video lacks captions, these users can see the visual content but miss everything communicated through sound — including spoken dialogue, narration, music, ambient sounds, and sound effects that provide context or meaning. This creates a critical barrier to understanding.
This rule relates to WCAG 2.0/2.1/2.2 Success Criterion 1.2.2: Captions (Prerecorded) at Level A, which requires that captions be provided for all prerecorded audio content in synchronized media. It is also required under Section 508 and EN 301 549. Because this is a Level A requirement, it represents the minimum baseline for accessibility — failing to provide captions is one of the most impactful accessibility issues a video can have.
Captions vs. Subtitles
It’s important to understand the difference between captions and subtitles, as they serve different purposes:
- Captions (kind="captions") are designed for deaf and hard-of-hearing users. They include all dialogue plus descriptions of meaningful non-speech audio such as sound effects, music, speaker identification, and other auditory cues (e.g., “[door slams]”, “[dramatic orchestral music]”, “[audience applause]”).
- Subtitles (kind="subtitles") are language translations of dialogue and narration, intended for hearing users who don’t understand the spoken language. Subtitles generally do not include non-speech audio descriptions.
For accessibility compliance, you must use kind="captions", not kind="subtitles".
What Makes Good Captions
Good captions go beyond transcribing dialogue. They should:
- Identify who is speaking when it’s not visually obvious
- Include meaningful sound effects (e.g., “[phone ringing]”, “[glass shattering]”)
- Describe music when it’s relevant (e.g., “[soft piano music]”, “[upbeat pop song playing]”)
- Note significant silence or pauses when they carry meaning
- Be accurately synchronized with the audio
- Use proper spelling, grammar, and punctuation
How to Fix the Problem
Add a <track> element inside your <video> element with the following attributes:
- src: the URL of the caption file (typically in WebVTT .vtt format)
- kind: set to "captions"
- srclang: the language code of the captions (e.g., "en" for English)
- label: a human-readable label for the track (e.g., "English")
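Only src is required for the <track> element to be valid HTML, but kind matters for this rule: a <track> without a kind attribute is treated as subtitles, so kind="captions" must be set explicitly. The srclang and label attributes are strongly recommended so that browsers and assistive technologies can identify and present the track correctly.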
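When pages are generated from templates or scripts, the four attributes above can be written out by a small helper so that none is forgotten. The following Python sketch is one possible approach; track_tag is a hypothetical function, not part of any library, and it defaults kind to "captions" as an assumption that suits this rule.

```python
from html import escape


def track_tag(src, srclang, label, kind="captions"):
    """Render a <track> element with src, kind, srclang, and label.

    Hypothetical helper for illustration; values are HTML-escaped so
    that labels containing characters like & or " stay valid markup.
    """
    attributes = [
        ("src", src),
        ("kind", kind),
        ("srclang", srclang),
        ("label", label),
    ]
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attributes
    )
    return f"<track {rendered}>"
```

For example, track_tag("captions_en.vtt", "en", "English") produces a captions track equivalent to the one used in the correct examples in this guide.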
Examples
Incorrect: Video with no captions
<video width="640" height="360" controls>
<source src="presentation.mp4" type="video/mp4">
</video>
This video has no <track> element, so deaf and hard-of-hearing users cannot access any of the audio content.
Incorrect: Using subtitles instead of captions
<video width="640" height="360" controls>
<source src="presentation.mp4" type="video/mp4">
<track src="subs_en.vtt" kind="subtitles" srclang="en" label="English">
</video>
While this provides a text track, kind="subtitles" does not satisfy the captions requirement. Subtitles typically include only dialogue and won’t convey non-speech audio information.
Correct: Video with captions
<video width="640" height="360" controls>
<source src="presentation.mp4" type="video/mp4">
<track src="captions_en.vtt" kind="captions" srclang="en" label="English">
</video>
Correct: Video with captions in multiple languages
<video width="640" height="360" controls>
<source src="presentation.mp4" type="video/mp4">
<track src="captions_en.vtt" kind="captions" srclang="en" label="English" default>
<track src="captions_es.vtt" kind="captions" srclang="es" label="Español">
</video>
The default attribute marks the track that the browser should enable unless the user's preferences indicate that another track is more appropriate.
Example WebVTT Caption File
A basic captions_en.vtt file looks like this:
WEBVTT

00:00:01.000 --> 00:00:04.000
[upbeat music playing]

00:00:05.000 --> 00:00:08.000
Sarah: Welcome to our annual conference!

00:00:09.000 --> 00:00:12.000
[audience applause]

00:00:13.000 --> 00:00:17.000
Sarah: Today we'll explore three key topics.
Notice how the captions identify the speaker, describe non-speech sounds, and are synchronized to specific timestamps.
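If cue timings and text already exist in another form, such as a spreadsheet or a transcript with timestamps, a short script can write them out as WebVTT. The Python sketch below is a minimal illustration under that assumption. The function names are hypothetical, and it covers only the header, timings, and cue text, not cue settings or styling.

```python
def format_timestamp(seconds):
    """Format a number of seconds as a WebVTT "HH:MM:SS.mmm" timestamp."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def build_vtt(cues):
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples."""
    blocks = ["WEBVTT"]
    for start, end, text in cues:
        if end <= start:
            raise ValueError(f"cue must end after it starts: {text!r}")
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{timing}\n{text}")
    # The header and each cue are separated by a blank line.
    return "\n\n".join(blocks) + "\n"
```

Passing the cues from the example, such as (1, 4, "[upbeat music playing]") and (5, 8, "Sarah: Welcome to our annual conference!"), yields text that can be saved as captions_en.vtt. The speaker names and sound descriptions still have to be written by a person who has listened to the audio.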