HTML Checking for Large Sites
Rocket Validator integrates the W3C Validator HTML checker into an automated web crawler.
HTML issues checked by Rocket Validator.
The <meta charset> is expected to appear at the beginning of the document, within the first 1024 bytes. Consider moving it to the beginning of the <head> section, as in this example:
<head>
<meta charset="utf-8">
...
</head>
A character encoding declaration is a mechanism by which the character encoding used to store or transmit a document is specified. For HTML documents, the standard way to declare a document character encoding is by including a <meta> tag with a charset attribute, typically <meta charset="utf-8">.
According to the W3C standard:
The element containing the character encoding declaration must be serialized completely within the first 1024 bytes of the document.
In order to define the charset encoding of an HTML document, both of these options are valid, but only one of them must appear in the document:
<!-- This is the preferred way -->
<meta charset="UTF-8">
<!-- This is the older way, also valid -->
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
The <meta charset> tag, used to define the character encoding, must appear only once in a document, within the <head> section.
There can only be one visible <main> element in a document. If more are needed (for example for switching between them with JavaScript), only one can be visible, the others should be hidden toggling the hidden attribute.
Example of 2 main elements, where only one is visible:
<main>
<h1>Active main element</h1>
<!-- content -->
</main>
<main hidden>
<h1>Hidden main element</h1>
<!-- content -->
</main>
6,250 HTML checks per week. Fully automated.
Save time using our automated web checker. Let our crawler check your web pages on the W3C Validator.
A <link> element has been found in an invalid body context. Check the attributes of the <link> element and ensure it’s not within the <body> section.
If the element is within the <head> section, it may have been interpreted as a body context depending on previous elements. For example, while this <link> element is valid per se and is in the <head> section, it is deemed invalid because the previous <img> element made the validator consider it a body context:
<!DOCTYPE html>
<html lang="">
<head>
<title>Test</title>
<img src="photo.jpg" alt="A smiling cat" />
<link rel="canonical" href="https://example.com/" />
</head>
<body>
<p>Some content</p>
</body>
</html>
If we fix that document and move the <img> tag within the body, the issue raised about <meta> disappears because it’s now in a valid context:
<!DOCTYPE html>
<html lang="">
<head>
<title>Test</title>
<link rel="canonical" href="https://example.com/" />
</head>
<body>
<p>Some content</p>
<img src="photo.jpg" alt="A smiling cat" />
</body>
</html>
A <link> element that is using the preload value in the rel attribute is missing the as attribute, used to indicate the type of the resource.
The preload value of the <link> element’s rel attribute lets you declare fetch requests in the HTML’s <head>, specifying resources that your page will need very soon, which you want to start loading early in the page lifecycle, before browsers’ main rendering machinery kicks in. This ensures they are available earlier and are less likely to block the page’s render, improving performance.
The as attribute specifies the type of content being loaded by the <link>, which is necessary for request matching, application of correct content security policy, and setting of correct Accept request header.
25,000 HTML checks per month. Fully automated.
Save time using our automated web checker. Let our crawler check your web pages on the W3C Validator.
The only value admitted for the attribute content in a <meta http-equiv="X-UA-Compatible"> is currently IE=edge. You’re probably seeing this issue because the page being validated includes the following meta tag:
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />
As the Google Chrome Frame plugin was discontinued on February 25, 2014, this is longer supported so you should change that meta tag to:
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
According to this article in Wikipedia:
Google Chrome Frame was a plug-in designed for Internet Explorer based on the open-source Chromium project, first announced on September 22, 2009. It went stable in September 2010, on the first birthday of the project. It was discontinued on February 25, 2014 and is no longer supported.
The plug-in worked with Internet Explorer 6, 7, 8 and 9. It allowed suitably coded web pages to be displayed in Internet Explorer by Google Chrome’s versions of the WebKit layout engine and V8 JavaScript engine.
According to the HTML specification, text must not contain control characters other than space characters.
In particular, the control character 
escapes the Unicode control character “CARRIAGE RETURN” are not allowed.
An alternative is using the character reference which escapes the Unicode control character “LINE FEED” that is defined to be a space character, so it’s allowed in HTML text.
The <script> tag allows authors to include dynamic scripts and data blocks in their documents. When the src is present, this tag accepts a type attribute which must be either:
- an empty string
- text/javascript (that’s the default, so it can be omitted)
- module
Examples:
<!-- This is valid, without a type it defaults to JavaScript -->
<script src="app.js"></script>
<!-- This is valid, but will warn that it can be omitted -->
<script type="text/javascript" src="app.js"></script>
<!-- An empty attribute is valid, but will warn that it can be omitted -->
<script type="" src="app.js"></script>
<!-- The module keyword is also valid as a type -->
<script type="module" src="app.js"></script>
<!-- Any other type is invalid -->
<script type="wrong" src="app.js"></script>
<script type="text/html" src="app.js"></script>
<script type="image/jpeg" src="app.js"></script>
An HTML tag could not be parsed, most probably because of a typo.
Still checking your large sites one page at a time?
Save time using our automated web checker. Let our crawler check your web pages on the W3C Validator.