# Document Language Declaration

> Canonical HTML version: https://rocketvalidator.com/glossary/document-language-declaration
> Attribution: Rocket Validator (https://rocketvalidator.com)
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

A document language declaration specifies the primary human language of a web page by setting the `lang` attribute on the `<html>` element, enabling browsers, screen readers, and other tools to process and present the content correctly.

Every HTML document should declare its primary language using the `lang` attribute on the root `<html>` element. This attribute accepts a valid BCP 47 language tag — such as `en` for English, `fr` for French, `es` for Spanish, or `zh-Hans` for Simplified Chinese — and tells user agents what language the page's content is written in. While it may seem like a minor detail, this single attribute has far-reaching effects on how assistive technologies, search engines, browsers, and translation tools interact with the page.

The language declaration is distinct from character encoding (`<meta charset="UTF-8">`) or content-type headers. While encoding determines how bytes map to characters, the language declaration identifies which human language those characters represent. Both are essential, but they serve different purposes.

## Why document language declaration matters

### Accessibility

Screen readers rely on the `lang` attribute to select the correct pronunciation engine and voice profile. When a page written in French lacks a language declaration — or incorrectly declares `en` — a screen reader may attempt to read the French text with English phonetic rules, producing garbled and unintelligible speech. For users who depend on auditory output, this makes the content effectively inaccessible.

The Web Content Accessibility Guidelines (WCAG) address this directly in two success criteria:

- **WCAG 3.1.1 — Language of Page (Level A):** The default human language of each web page must be programmatically determinable. This is satisfied by providing a valid `lang` attribute on the `<html>` element.
- **WCAG 3.1.2 — Language of Parts (Level AA):** When sections of a page are in a different language than the page's default, those sections must also have their language identified using the `lang` attribute on the appropriate element.

Failing criterion 3.1.1 is a Level A violation — the most fundamental tier — meaning any site that omits the document language declaration cannot claim even baseline WCAG conformance.

### HTML validation

The HTML specification encourages authors to specify the `lang` attribute on the root `<html>` element. HTML validators such as the W3C Nu HTML Checker flag a missing language declaration with a warning rather than an error: the document still parses and is still conforming, but omitting the attribute is considered a significant authoring oversight.

### Search engines and translation

Some search engines may treat the `lang` attribute as one signal among several when determining which language a page targets, although others (Google among them) state that they rely on the visible content of the page instead. Browser auto-translation features also consult the language declaration, alongside their own language detection, to decide whether to offer translation to the user.

## How document language declaration works

### Setting the page language

The `lang` attribute is placed on the `<html>` element and should contain a valid BCP 47 language tag. The simplest tags are two-letter ISO 639-1 codes, but you can be more specific by adding a region subtag (such as `US` or `BR`) or a script subtag (such as `Hans`).

| Language Tag | Meaning |
|---|---|
| `en` | English |
| `en-US` | English (United States) |
| `en-GB` | English (United Kingdom) |
| `pt-BR` | Portuguese (Brazil) |
| `zh-Hans` | Chinese (Simplified) |
| `ja` | Japanese |

For most use cases, a two-letter language code is sufficient. Add a region or script subtag only when the distinction matters for pronunciation, writing system, or content targeting.
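
As an illustration of a tag with a region subtag, here is a minimal sketch of a page aimed specifically at Brazilian Portuguese readers. The page text is made up for the example; only the `lang` value matters.

```html
<!DOCTYPE html>
<!-- "pt-BR" = language subtag "pt" (Portuguese) + region subtag "BR" (Brazil) -->
<html lang="pt-BR">
  <head>
    <meta charset="UTF-8">
    <title>Bem-vindo ao nosso site</title>
  </head>
  <body>
    <h1>Bem-vindo ao nosso site</h1>
    <p>Seu pedido chegará em até cinco dias úteis.</p>
  </body>
</html>
```

If the content were not specific to one region, `lang="pt"` alone would be the simpler choice.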

### Declaring language for parts of a page

When a portion of the page is in a different language, apply the `lang` attribute to the nearest enclosing element around that content. Screen readers will switch voice profiles at that boundary, ensuring correct pronunciation.
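
For short inline phrases, a `<span>` is usually the nearest enclosing element. The fragment below is a sketch that assumes it sits in the `<body>` of a page whose `<html>` element declares `lang="en"`; the sentence text and the `/fr/` URL are hypothetical.

```html
<!-- Inline phrase in another language: wrap only the foreign words -->
<p>
  The French phrase <span lang="fr">joie de vivre</span> has no exact
  English equivalent.
</p>

<!-- Link whose text is French: lang describes the link text,
     hreflang describes the language of the page it points to -->
<p>
  <a href="/fr/" lang="fr" hreflang="fr">Version française</a>
</p>
```

The `lang` attribute is inherited, so everything outside the marked elements remains English without any further markup.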

### XHTML considerations

For documents served as `application/xhtml+xml`, use `xml:lang` instead of or in addition to `lang`. If both attributes appear on the same element, they must have the same value. For standard HTML5 documents served as `text/html`, the `lang` attribute alone is sufficient.
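
A minimal sketch of an XHTML document that declares the language both ways is shown below, assuming it is served as `application/xhtml+xml`. The encoding is given in the XML declaration, and the two language attributes are kept identical.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<!-- xml:lang is the XML-level declaration; lang mirrors it with the same value -->
<html xmlns="http://www.w3.org/1999/xhtml" lang="fr" xml:lang="fr">
  <head>
    <title>Bienvenue sur notre site</title>
  </head>
  <body>
    <h1>Bienvenue sur notre site</h1>
    <p>Nous sommes heureux de vous accueillir.</p>
  </body>
</html>
```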

## Code examples

### Bad example — missing language declaration

```html
<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <title>Bienvenue sur notre site</title>
  </head>
  <body>
    <h1>Bienvenue sur notre site</h1>
    <p>Nous sommes heureux de vous accueillir.</p>
  </body>
</html>
```

This page is written in French but has no `lang` attribute. Without a declared language, a screen reader typically falls back to the user's default voice, so a reader configured for English would mispronounce the French text. HTML validators will flag the missing attribute, and the page fails WCAG 3.1.1.

### Bad example — incorrect language tag

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <title>Bienvenue sur notre site</title>
  </head>
  <body>
    <h1>Bienvenue sur notre site</h1>
    <p>Nous sommes heureux de vous accueillir.</p>
  </body>
</html>
```

The `lang` attribute is present but declares English for French content. A screen reader will apply English pronunciation rules, producing an equally poor result.

### Good example — correct language declaration

```html
<!DOCTYPE html>
<html lang="fr">
  <head>
    <meta charset="UTF-8">
    <title>Bienvenue sur notre site</title>
  </head>
  <body>
    <h1>Bienvenue sur notre site</h1>
    <p>Nous sommes heureux de vous accueillir.</p>
  </body>
</html>
```

The `lang="fr"` attribute correctly identifies the content as French. Screen readers will load a French voice, validators will no longer warn about a missing language declaration, and search engines and translation tools receive a correct language signal.

### Good example — mixed-language content

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <title>Welcome to our site</title>
  </head>
  <body>
    <h1>Welcome to our site</h1>
    <p>We are happy to have you here.</p>
    <blockquote lang="fr">
      <p>La vie est belle.</p>
    </blockquote>
    <p>The quote above is a well-known French expression.</p>
  </body>
</html>
```

The page default is English, but the French quote is wrapped in a `<blockquote>` with `lang="fr"`. A screen reader will switch to a French voice for the quoted text and revert to English afterward, satisfying both WCAG 3.1.1 and 3.1.2.

Setting the document language declaration is one of the simplest and highest-impact steps you can take toward building accessible, well-structured HTML. It takes a single attribute, yet it directly affects how assistive technology users experience your content.
