What mistakes does the Hreflang Testing Tool look for?

There are many mistakes you could make when implementing Hreflang tags on your website. The online Hreflang testing tool tries to catch as many of them as it can. Here’s a list of what it checks, with notes on how to fix each type of problem.

Page-level errors

Some errors can be noticed simply by looking at an individual page. These are:

  • Broken pages: When we crawl the pages in a sitemap, we often find some that are broken, e.g. 404 (page not found) errors, 301/302 redirects, or pages that are completely blank or contain no HTML markup.
  • An Hreflang tag with broken markup, e.g. a <link> tag with a missing href attribute.
  • Incorrect Hreflang: The value of an hreflang attribute must be a language code in ISO 639-1 format, optionally followed by a hyphen and a region code in ISO 3166-1 Alpha 2 format. For example, “en-US” is correct but “en-UK” is incorrect; the correct value for the UK is “en-GB”.
  • Page not linking to itself: When you implement Hreflang tags on a page (say Page A), you include <link>s to the versions of page A in other languages, so you link to pages B, C and D. But search engine guidelines specify that page A must also link to itself (specifying the language used on that page, of course).
  • Missing x-default: Google’s guidelines also recommend including an “x-default” entry. It names the default page to show to users whose language is not among the languages you have pages for. Usually this is the English page, and usually it’s the page that is in the XML sitemap.
  • Same page, multiple languages: Sometimes when Hreflang tags are incorrectly implemented, all (or several) language versions point to the same page (see example here). You will see this error if two different languages, say en and fr, point to the same page. However, if two hreflang attributes share the same high-level language, say en-US and en-GB, they can point to the same page and the tool will not report an error.
  • Duplicate (or multiple different) Hreflang tags for the same language
  • HTML lang attribute does not match hreflang: The “lang” attribute of the <html> tag on the page is different from the “hreflang” attribute for that page in the <link> tag. This error is usually caused by a CMS (content management system) template problem. The <html> tag has an optional lang attribute that specifies what language the page is in. This attribute is generated by the back-end CMS, and most marketers don’t pay any attention to it because it’s not an important SEO tag like robots, description or hreflang. Since all pages served by the CMS tend to use the same hard-coded lang value, we find that pages in German, French etc. continue to use <html lang="en"> even when their hreflang attribute is correct.
  • Hreflang in HTML and HTTP headers: This is rare but some sites specify Hreflang tags in both their HTML and the HTTP headers returned by the URL. Use only one and keep it simple for yourself and for search engines.
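
Several of the page-level checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the testing tool's actual code: the names check_page, HreflangParser and KNOWN_BAD_REGIONS are hypothetical, the value check accepts only a two-letter language with an optional two-letter region (script subtags are ignored), the list of wrong region codes contains just the “UK” example from above, and URLs are compared as exact strings.

```python
import re
from html.parser import HTMLParser

# Simplified shape: "x-default", or a two-letter language code with an
# optional two-letter region code. Assumption: script subtags are ignored.
HREFLANG_SHAPE = re.compile(r"^(x-default|[a-z]{2}(-[a-z]{2})?)$")
# Hypothetical, deliberately incomplete map of wrong region codes to fixes.
KNOWN_BAD_REGIONS = {"uk": "gb"}


class HreflangParser(HTMLParser):
    """Collects the <html lang> value and every <link rel="alternate" hreflang>."""

    def __init__(self):
        super().__init__()
        self.html_lang = None
        self.alternates = []  # (hreflang, href) pairs in document order

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "html":
            self.html_lang = attributes.get("lang")
        elif tag == "link" and "hreflang" in attributes:
            if (attributes.get("rel") or "").lower() == "alternate":
                code = (attributes.get("hreflang") or "").strip().lower()
                self.alternates.append((code, attributes.get("href")))


def check_page(html, page_url):
    """Return a list of human-readable problems found on one page."""
    parser = HreflangParser()
    parser.feed(html)
    problems = []
    urls_by_code = {}      # hreflang value -> every href declared for it
    languages_by_url = {}  # href -> set of high-level languages pointing at it
    for code, href in parser.alternates:
        if not href:
            problems.append(f"hreflang '{code}' has no href")
            continue
        language, _, region = code.partition("-")
        if not HREFLANG_SHAPE.match(code):
            problems.append(f"'{code}' is not a valid hreflang value")
        elif region in KNOWN_BAD_REGIONS:
            problems.append(f"'{code}' should use region '{KNOWN_BAD_REGIONS[region]}'")
        urls_by_code.setdefault(code, []).append(href)
        if code != "x-default":
            languages_by_url.setdefault(href, set()).add(language)
    for code, hrefs in urls_by_code.items():
        if len(hrefs) > 1:
            problems.append(f"hreflang '{code}' appears {len(hrefs)} times")
    for href, languages in languages_by_url.items():
        if len(languages) > 1:
            problems.append(f"{href} is listed for several languages: {sorted(languages)}")
    own_codes = [code for code, href in parser.alternates if href == page_url]
    if not own_codes:
        problems.append("page does not link to itself")
    if "x-default" not in urls_by_code:
        problems.append("no x-default entry")
    html_language = (parser.html_lang or "").lower().partition("-")[0]
    own_languages = {c.partition("-")[0] for c in own_codes if c != "x-default"}
    if html_language and own_languages and html_language not in own_languages:
        problems.append(f"<html lang='{parser.html_lang}'> does not match hreflang {sorted(own_languages)}")
    return problems


# A German page with several of the mistakes described above.
SAMPLE = """
<html lang="en">
<head>
<link rel="alternate" hreflang="de" href="https://example.com/de/" />
<link rel="alternate" hreflang="en-UK" href="https://example.com/uk/" />
<link rel="alternate" hreflang="fr" href="https://example.com/de/" />
</head>
</html>
"""
problems = check_page(SAMPLE, "https://example.com/de/")
for problem in problems:
    print(problem)
```

The sample page links to itself, so that check passes, but the sketch should flag the “en-UK” value, the URL shared by de and fr, the missing x-default, and the mismatched lang attribute. A real checker would also need to normalize URLs (trailing slashes, letter case, redirects) before comparing them.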

Errors related to a set of pages

Other types of errors only show up when you look at a set of pages together. All pages in a set have the same content, just in different languages, which is why the set is examined collectively. All such pages should point to each other (and to themselves). What’s more, they should point to the canonical version of each other. The errors we look for are:

  • Pages not linking to each other (aka no return tags or missing return tags): All pages in a set must link to each other (and to themselves). Sometimes we see the default page (say Page A) linking to pages B, C and D, but each of those only links back to page A. That is a mistake. The correct way to implement it is to have the exact same Hreflang tags on all pages in a set. Remembering this will greatly simplify your implementation. This error is explained in detail in this blog post.
  • Not linking to the canonical version: You have a set of pages all linking to each other. Wonderful! But sometimes when we crawl these pages, we discover that a page declares a canonical URL that is different from the URL used in the Hreflang tags (or in the sitemap you are testing). This is a mistake because you should give search engines only the canonical version of a URL, both in your sitemap and in any hreflang tags. The search engine crawler discovers the other versions of that page (the ones that point to the canonical version) on its own while spidering the web, on your website or elsewhere. You do not want to include non-canonical versions of a page in your sitemap or in any structured data that you provide to search engines (like hreflang tags).
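
The two set-level checks above can be illustrated with a short Python sketch. It assumes the pages have already been crawled and their tags extracted into plain dictionaries; the function names missing_annotations and non_canonical_targets and the example URLs are hypothetical and not part of the testing tool. It relies on the rule stated above that every page in a set should carry the exact same Hreflang tags.

```python
def missing_annotations(annotations):
    """Report, per page, the (hreflang, href) pairs that other pages declare but it lacks.

    `annotations` maps each page URL to the {hreflang: href} pairs found on it.
    """
    # Every page in the set should carry the union of all declared pairs.
    expected = set()
    for links in annotations.values():
        expected |= set(links.items())
    report = {}
    for page, links in annotations.items():
        absent = expected - set(links.items())
        if absent:
            report[page] = sorted(absent)
    return report


def non_canonical_targets(annotations, canonicals):
    """List hreflang targets whose declared canonical URL is a different URL.

    `canonicals` maps a crawled URL to the canonical URL that page declares.
    """
    targets = {href for links in annotations.values() for href in links.values()}
    return sorted(
        (url, canonicals[url])
        for url in targets
        if url in canonicals and canonicals[url] != url
    )


A = "https://example.com/"
B = "https://example.com/de/"
C = "https://example.com/fr/"

# Page A links to everything, but B and C only link to A and themselves.
annotations = {
    A: {"en": A, "de": B, "fr": C},
    B: {"en": A, "de": B},
    C: {"en": A, "fr": C},
}
# Page C declares a canonical URL that differs from the one in the tags.
canonicals = {A: A, B: B, C: "https://example.com/fr/index.html"}

print(missing_annotations(annotations))
print(non_canonical_targets(annotations, canonicals))
```

In this example the first check reports that the German page lacks the fr entry and the French page lacks the de entry, and the second reports that the French URL in the tags is not the canonical one. If two pages disagree about the URL for a language, both variants end up in the expected set and are reported as missing on the other page.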

Other Errors

Other errors we check for are a byproduct of crawling the pages supplied. These are not related specifically to Hreflang:

  • Invalid (malformed) canonical URL
  • More than one canonical URL specified for a given page
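
As a rough sketch of these two canonical checks, the Python below counts the canonical tags on a page and applies a simple shape test to each URL. The names CanonicalParser, looks_like_absolute_url and check_canonical are hypothetical, and treating only absolute http(s) URLs with a host name as well-formed is an assumption made for this example, not necessarily the rule the testing tool applies.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values:
            self.canonicals.append(attributes.get("href") or "")


def looks_like_absolute_url(url):
    # Assumption: "well-formed" means an absolute http(s) URL with a host name.
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_canonical(html):
    """Return a list of canonical-related problems found on one page."""
    parser = CanonicalParser()
    parser.feed(html)
    problems = []
    if len(parser.canonicals) > 1:
        problems.append(f"{len(parser.canonicals)} canonical URLs specified")
    for url in parser.canonicals:
        if not looks_like_absolute_url(url):
            problems.append(f"malformed canonical URL: '{url}'")
    return problems


# Two canonical tags, the second with a mistyped scheme.
SAMPLE = """
<head>
<link rel="canonical" href="https://example.com/de/">
<link rel="canonical" href="htp:/example.com/de">
</head>
"""
print(check_canonical(SAMPLE))
```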

Further Reading

SEO consultant Aleyda Solis also has a write-up about the most common Hreflang mistakes she encounters.

Published by

Nick Jasuja

Nick Jasuja is the founder of Hreflang.org and Diffen, the world's largest collection of unbiased comparisons. Diffen's expansion to Spanish was Nick's first foray into international SEO and motivated him to launch the Hreflang testing tool. You can find him on Twitter @thisislobo.