# Malformed byte sequence

> Canonical HTML version: https://rocketvalidator.com/html-validation/malformed-byte-sequence
> Attribution: Rocket Validator (https://rocketvalidator.com)
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

When a browser or validator reads your HTML file, it interprets the raw bytes according to a character encoding, most commonly UTF-8. Each encoding has rules about which byte sequences are valid. For example, in UTF-8, bytes above `0x7F` must follow specific multi-byte patterns. If the validator encounters a byte or sequence of bytes that violates these rules, it reports a "malformed byte sequence" error because it cannot decode the bytes into meaningful characters.

This problem commonly arises in a few scenarios:

- **Encoding mismatch:** Your file is saved as Windows-1252 or ISO-8859-1 (Latin-1) but the document declares UTF-8, or vice versa. Characters like curly quotes (`“` `”`), em dashes (`—`), or accented letters (`é`, `ñ`) are encoded differently across these encodings, producing invalid byte sequences when interpreted under the wrong one.
- **Copy-pasting from word processors:** Content copied from Microsoft Word or similar applications often includes "smart quotes" and special characters encoded in Windows-1252, which can produce malformed bytes in a UTF-8 file.
- **File corruption:** The file was partially corrupted during transfer (e.g., FTP in the wrong mode) or by a tool that modified it without respecting its encoding.
- **Mixed encodings:** Parts of the file were written or appended using different encodings, resulting in some sections containing invalid byte sequences.

This is a serious problem because browsers may display garbled text (mojibake), skip characters entirely, or substitute replacement characters (`�`). It also breaks accessibility tools like screen readers, which may mispronounce or skip corrupted text. Search engines may index garbled content, harming your SEO.
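
To see where a file breaks the UTF-8 rules described above, it can help to decode its bytes strictly and note where decoding fails. The following is a minimal Python sketch, not the validator's own algorithm; the function name `find_malformed_utf8` is hypothetical and chosen only for illustration.

```python
def find_malformed_utf8(data: bytes) -> list[int]:
    """Return the byte offsets where strict UTF-8 decoding fails."""
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            # Strict decoding raises on the first invalid sequence.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice we decoded.
            offsets.append(pos + err.start)
            pos += err.end  # resume scanning after the bad bytes
    return offsets


# "Resum\u00e9" is "Resumé"; the escape keeps this source file pure ASCII.
windows_bytes = "Resum\u00e9".encode("cp1252")  # é stored as the single byte 0xE9
utf8_bytes = "Resum\u00e9".encode("utf-8")      # é stored as 0xC3 0xA9

print(find_malformed_utf8(windows_bytes))
print(find_malformed_utf8(utf8_bytes))
```

To check a real file, read it in binary mode (for example `open(path, "rb").read()`) and pass the result to the function. An empty list means the bytes are valid UTF-8.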
## How to Fix It

1. **Declare UTF-8 encoding** in your HTML with `<meta charset="utf-8">` as the first element inside `<head>`.
2. **Save your file as UTF-8** in your text editor. Most editors have an option like "Save with Encoding" or "File Encoding" in the status bar or save dialog. Choose "UTF-8" or "UTF-8 without BOM."
3. **Re-encode the file** if it was originally saved in a different encoding. Tools like `iconv` on the command line can convert between encodings:

   ```
   iconv -f WINDOWS-1252 -t UTF-8 input.html -o output.html
   ```

4. **Replace problematic characters** by re-typing them or using HTML character references if needed.
5. **Check your server configuration.** If your server sends a `Content-Type` header with a charset that conflicts with the file's actual encoding (e.g., `Content-Type: text/html; charset=iso-8859-1` for a UTF-8 file), the validator will use the HTTP header's encoding, causing mismatches.

## Examples

### Incorrect: Encoding mismatch

A file saved in Windows-1252 but declaring UTF-8. The byte `0xE9` represents `é` in Windows-1252 but is an invalid lone byte in UTF-8, triggering the malformed byte sequence error.

```html
<!DOCTYPE html>
<meta charset="utf-8">
<title>My Page</title>

<p>Resumé</p>
```

### Correct: File properly saved as UTF-8

The same document, but the file is actually saved in UTF-8 encoding. The character `é` is stored as the two-byte sequence `0xC3 0xA9`, which is valid UTF-8.

```html
<!DOCTYPE html>
<meta charset="utf-8">
<title>My Page</title>

<p>Resumé</p>
```

### Alternative: Using character references

If you can't resolve the encoding issue immediately, you can use HTML character references to avoid non-ASCII bytes entirely:

```html
<p>Resum&#233;</p>
```

Or using the named reference:

```html
<p>Resum&eacute;</p>
```

Both render as "Resumé" regardless of file encoding, though this is a workaround; properly saving the file as UTF-8 is the preferred long-term solution.
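
If many characters need this workaround, the substitution can be scripted. The sketch below uses Python's built-in `xmlcharrefreplace` error handler to turn every non-ASCII character into a decimal character reference. The function name `to_ascii_with_char_refs` is hypothetical, and the sketch assumes the text has already been decoded correctly into a Python string.

```python
def to_ascii_with_char_refs(text: str) -> str:
    """Replace each non-ASCII character with a decimal character reference."""
    # The "xmlcharrefreplace" handler emits &#NNN; for anything ASCII cannot hold.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# "\u00e9" is é; the escape keeps this source file pure ASCII.
print(to_ascii_with_char_refs("<p>Resum\u00e9</p>"))
```

The result contains only ASCII bytes, which are the same in UTF-8, Windows-1252, and ISO-8859-1, so it decodes identically under any of them.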