# Legacy encoding “windows-1252” used. Documents must use UTF-8.

> Canonical HTML version: https://rocketvalidator.com/html-validation/legacy-encoding-windows-1252-used-documents-must-use-utf-8
> Attribution: Rocket Validator (https://rocketvalidator.com)
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

The HTML Living Standard mandates UTF-8 as the only permitted character encoding for HTML documents. Legacy encodings like `windows-1252`, `iso-8859-1`, and `shift_jis` were common in older web pages, but each of them covers only a limited subset of characters. UTF-8 can represent every character in the Unicode standard, so a single encoding works across all languages and scripts.

This issue typically arises from one or more of these causes:

1. **Missing or incorrect `<meta charset>` declaration**: Your document either lacks a charset declaration or explicitly declares a legacy encoding such as `<meta charset="windows-1252">`.
2. **File not saved as UTF-8**: Even with the correct `<meta>` tag, if your text editor saves the file in a different encoding, non-ASCII characters become garbled (mojibake).
3. **Server sends a conflicting `Content-Type` header**: A `charset` parameter in the HTTP `Content-Type` header takes precedence over the in-document charset declaration. If your server sends `Content-Type: text/html; charset=windows-1252`, the browser will use that encoding regardless of what the `<meta>` tag says.

## Why This Matters

- **Standards compliance**: The WHATWG HTML Living Standard explicitly states that documents must be encoded in UTF-8. Using a legacy encoding makes your document non-conforming.
- **Internationalization**: Legacy encodings like `windows-1252` only cover a limited set of Western European characters. If your content ever includes characters outside that range, such as emoji, CJK characters, Cyrillic, or Arabic, they cannot be stored in the file at all and won't render correctly.
- **Security**: When the encoding is mixed or ambiguous, the browser may decode bytes differently from the server-side code that sanitized them. Certain types of cross-site scripting (XSS) attacks exploit exactly this kind of encoding mismatch.
- **Consistency**: When the declared encoding doesn't match the actual file encoding, browsers may misinterpret characters, leading to garbled text that's difficult to debug.
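To make the internationalization point concrete, here is a small Python sketch that lists which characters in a string have no representation in a given legacy encoding. The function name `unencodable_chars` is a hypothetical helper introduced for this illustration.

```python
from typing import List


def unencodable_chars(text: str, encoding: str = "windows-1252") -> List[str]:
    """Return the distinct characters in `text` that `encoding` cannot represent."""
    missing: List[str] = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # The legacy code page has no byte for this character.
            if ch not in missing:
                missing.append(ch)
    return missing


# Western European text fits in windows-1252; CJK and emoji do not.
print(unencodable_chars("naïve café"))
print(unencodable_chars("naïve 日本 😀"))
```

Every string passed to the same function with `encoding="utf-8"` returns an empty list, since UTF-8 can encode any Unicode character.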

## How to Fix It

### Step 1: Declare UTF-8 in your HTML

Add a `<meta charset="utf-8">` tag as the first element inside `<head>`. It must appear within the first 1024 bytes of the document so browsers can detect it early.
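As a rough way to check the 1024-byte rule, the sketch below searches only the first 1024 bytes of a document for a charset declaration. It is a simplified regular-expression approximation and not the prescan algorithm browsers actually implement. The names `META_CHARSET` and `declared_charset` are assumptions made for this example.

```python
import re
from typing import Optional

# Matches both <meta charset="..."> and the http-equiv form, where
# charset=... appears inside the content attribute.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?\s*([A-Za-z0-9._:-]+)',
    re.IGNORECASE,
)


def declared_charset(html_bytes: bytes) -> Optional[str]:
    """Return the charset declared within the first 1024 bytes, or None."""
    head = html_bytes[:1024]  # anything declared later is ignored here
    match = META_CHARSET.search(head)
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


page = b'<!DOCTYPE html><html lang="en"><head><meta charset="utf-8">'
print(declared_charset(page))
```

A result of `None` for a page that does contain a declaration suggests the tag sits too far down, for example after a long comment block or inline script.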

### Step 2: Save the file as UTF-8

In most modern text editors and IDEs, you can set the file encoding:

- **VS Code**: Click the encoding label in the bottom status bar and select "Save with Encoding" → "UTF-8".
- **Sublime Text**: Go to File → Save with Encoding → UTF-8.
- **Notepad++**: Go to Encoding → Convert to UTF-8.

If your file already contains characters encoded in `windows-1252`, simply changing the declaration without re-encoding the file will cause those characters to display incorrectly. You need to convert the file's actual encoding.
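If you prefer to convert files with a script rather than an editor, the following Python sketch re-encodes a single file in place. It assumes the file really is `windows-1252`; the function name `convert_to_utf8` is hypothetical, and you should keep a backup before running anything like this on real content.

```python
from pathlib import Path


def convert_to_utf8(path, source_encoding: str = "windows-1252") -> bool:
    """Re-encode the file at `path` as UTF-8. Return True if it was rewritten."""
    file_path = Path(path)
    raw = file_path.read_bytes()

    # If the bytes already decode as UTF-8, leave the file alone.
    # Converting twice would produce mojibake.
    try:
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        pass

    # Decode with the assumed legacy encoding, then write UTF-8 bytes back.
    text = raw.decode(source_encoding)
    file_path.write_bytes(text.encode("utf-8"))
    return True
```

Working on bytes rather than text mode keeps line endings untouched. The script does not edit the `<meta>` tag, so the declaration inside the file still has to be updated separately.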

### Step 3: Check your server configuration

If your server sends a `charset` parameter in the `Content-Type` HTTP header, make sure it specifies UTF-8. For example, in Apache you can add this to your `.htaccess` file:

```apache
AddDefaultCharset UTF-8
```

In Nginx, you can set it in your server block:

```nginx
charset utf-8;
```
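After changing the server configuration, it helps to confirm what the `Content-Type` header actually says. The sketch below parses a header value with Python's standard `email.message` module. The helper names `charset_from_content_type` and `is_utf8_header` are assumptions for this example, and the header strings are sample values rather than output from a real server.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # Returns the charset lowercased, or None when no charset parameter exists.
    return msg.get_content_charset()


def is_utf8_header(header_value: str) -> bool:
    """True when the header explicitly declares UTF-8."""
    return charset_from_content_type(header_value) == "utf-8"


print(is_utf8_header("text/html; charset=UTF-8"))
print(is_utf8_header("text/html; charset=windows-1252"))
```

You can copy the header value from your browser's developer tools, under the network tab, or from any HTTP client that prints response headers.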

## Examples

### Incorrect: Legacy encoding declared

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="windows-1252">
    <title>My Page</title>
  </head>
  <body>
    <p>Hello world</p>
  </body>
</html>
```

This triggers the error because `windows-1252` is a legacy encoding.

### Incorrect: Using the long-form `http-equiv` with a legacy encoding

```html
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
```

This older syntax also triggers the error when it specifies a non-UTF-8 encoding.

### Correct: UTF-8 declared properly

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>My Page</title>
  </head>
  <body>
    <p>Hello world</p>
  </body>
</html>
```

The `<meta charset="utf-8">` tag appears as the first child of `<head>`, and the file itself should be saved with UTF-8 encoding.

### Correct: Using `http-equiv` with UTF-8

```html
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
```

While the shorter `<meta charset="utf-8">` form is preferred, this longer syntax is also valid as long as it specifies UTF-8.
