# This document appears to be written in X but the “html” start tag has “lang=Y”.

> Canonical HTML version: https://rocketvalidator.com/html-validation/this-document-appears-to-be-written-in-x-but-the-html-start-tag-has-langeqy
> Attribution: Rocket Validator (https://rocketvalidator.com)
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

The `lang` attribute on the `<html>` element tells browsers, search engines, and assistive technologies what language the page content is written in. The validator applies heuristic language detection to the text on the page. When the detected language (the X in the message) differs from the declared `lang` value (the Y), it reports this warning.

## Why This Matters

An incorrect `lang` attribute causes real problems for users and systems that rely on it:

- **Screen readers** use the `lang` attribute to select a voice and pronunciation rules. A French document marked as English will be read aloud with English pronunciation, which can make it very hard to understand.
- **Search engines** may use the language declaration as one signal when indexing pages and serving results to users searching in a specific language, although some rely mainly on their own language detection.
- **Browser features** like automatic translation prompts and spell-checking rely on the declared language.
- **Hyphenation and typographic rules** in CSS also depend on the declared language: `hyphens: auto` takes its hyphenation rules from it, and the `:lang()` selector matches against it.
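
As an illustration of the CSS point in the list above, here is a minimal sketch of a German page whose styling depends on the declared language. The sample text, the column width, and the quote characters are illustrative choices, not something the validator requires, and hyphenation support for a given language varies by browser.

```html
<!DOCTYPE html>
<html lang="de">
  <head>
    <meta charset="utf-8">
    <title>Silbentrennung</title>
    <style>
      /* Automatic hyphenation uses the rules for the declared language. */
      p {
        hyphens: auto;
        width: 12em;
      }
      /* :lang() matches elements by their declared or inherited language. */
      q:lang(de) {
        quotes: "„" "“" "‚" "‘";
      }
    </style>
  </head>
  <body>
    <p>Die Donaudampfschifffahrtsgesellschaft wurde <q>im Jahr 1829</q> gegründet.</p>
  </body>
</html>
```

If this page declared `lang="en"` instead, the browser would apply English hyphenation rules (or none) to the long German compound, and the `q:lang(de)` rule would not match.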

## Common Causes

1. **Copy-pasting a boilerplate** — Starting from an English template but writing content in another language without updating `lang`.
2. **Multilingual sites** — Using the same base template for all language versions without dynamically setting the `lang` value.
3. **Incorrect language subtag** — Using the wrong BCP 47 language tag (e.g., `lang="en"` instead of `lang="de"` for German content).

## When You Can Safely Ignore This Warning

This is a **warning**, not an error. The validator's language detection is heuristic and not always accurate. You may safely ignore it if:

- Your page contains very little text, making detection unreliable.
- The page has significant amounts of content in multiple languages, but the `lang` attribute correctly reflects the *primary* language.
- The detected language is simply wrong (e.g., short text snippets can confuse the detector).

If you're confident the `lang` attribute is correct, you can disregard the warning.

## How to Fix It

Identify the primary language of your document's content and set the `lang` attribute to the appropriate [BCP 47 language tag](https://www.ietf.org/rfc/bcp/bcp47.txt). Common tags include `en` (English), `fr` (French), `de` (German), `es` (Spanish), `pt` (Portuguese), `ja` (Japanese), and `zh` (Chinese).
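
A BCP 47 tag can also carry optional subtags after the primary language, such as a region (`pt-BR`, `en-GB`) or a script (`zh-Hans`). The sketch below uses `pt-BR` for a page in Brazilian Portuguese; the page content is made up for illustration. The primary language subtag (`pt` here) is what identifies the language, so it is presumably the part the validator compares against the detected language.

```html
<!DOCTYPE html>
<html lang="pt-BR">
  <head>
    <meta charset="utf-8">
    <title>Nosso site</title>
  </head>
  <body>
    <h1>Bem-vindo ao nosso site</h1>
    <p>Estamos felizes em receber você em nossa plataforma.</p>
  </body>
</html>
```

A region or script subtag is only worth adding when the distinction matters for your content, for example to pick the right spelling dictionary or font. The plain language tag is sufficient in most cases.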

## Examples

### Incorrect: Content in French, but `lang` set to English

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Mon site</title>
  </head>
  <body>
    <h1>Bienvenue sur notre site</h1>
    <p>Nous sommes ravis de vous accueillir sur notre plateforme.</p>
  </body>
</html>
```

This triggers the warning because the validator detects French content but sees `lang="en"`.

### Fixed: `lang` attribute matches the content language

```html
<!DOCTYPE html>
<html lang="fr">
  <head>
    <meta charset="utf-8">
    <title>Mon site</title>
  </head>
  <body>
    <h1>Bienvenue sur notre site</h1>
    <p>Nous sommes ravis de vous accueillir sur notre plateforme.</p>
  </body>
</html>
```

### Handling mixed-language content

If your page is primarily in one language but contains sections in another, set the `lang` attribute on the `<html>` element to the primary language and use `lang` on specific elements for the other language:

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Our Global Site</title>
  </head>
  <body>
    <h1>Welcome to our site</h1>
    <p>We are glad you are here.</p>
    <blockquote lang="fr">
      <p>La vie est belle.</p>
    </blockquote>
  </body>
</html>
```

This tells assistive technologies that the page is in English but that the blockquote should be read using French pronunciation rules. The validator should not flag this as a mismatch because the majority of the content is in English.
