In a previous post on how redirects cause Hreflang problems, we looked at how automatic redirection based on IP address geolocation or cookies can lead to bots being unable to crawl and access all your pages. In this post, we will look at other kinds of redirects and canonicals, and the Hreflang return tag problems they cause.
The TL;DR version of this post is: Make sure all URLs you link to via Hreflang are the direct destination URLs for the respective pages. None of them should be a redirect (even if it’s a 301). Also make sure that none of these pages have a canonical pointing elsewhere.
The best way to think of Hreflang is as a cluster. Each page in the cluster links to itself and to all other pages in the cluster. All is good in our little world.
When Googlebot crawls any of these pages, it finds references to the other two, and when it follows these links it finds return tags to corroborate that all pages do indeed belong in the same cluster.
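To make the return-tag idea concrete, here is a minimal sketch of the reciprocity check. It assumes the hreflang annotations have already been collected into a dictionary; the function name `missing_return_tags` and the `example.com` URLs are placeholders for illustration, not part of any real tool.

```python
def missing_return_tags(pages):
    """Return (source, target) pairs where target does not link back to source.

    `pages` maps each page URL to its hreflang annotations,
    which are themselves a mapping of language code -> URL.
    """
    missing = []
    for source, alternates in pages.items():
        for target in alternates.values():
            if target == source:
                continue  # self-reference, nothing to reciprocate
            # A target with no known annotations cannot confirm the link either.
            if source not in pages.get(target, {}).values():
                missing.append((source, target))
    return missing


# Hypothetical three-page cluster (placeholder URLs).
EN = "https://example.com/en/"
DE = "https://example.com/de/"
FR = "https://example.com/fr/"
full = {"en": EN, "de": DE, "fr": FR}

healthy = {EN: dict(full), DE: dict(full), FR: dict(full)}
print(missing_return_tags(healthy))  # []

# The German page forgets to list the English page.
broken = {EN: dict(full), DE: {"de": DE, "fr": FR}, FR: dict(full)}
print(missing_return_tags(broken))  # [(EN, DE)]
```

In the healthy cluster every page lists itself and the other two, so nothing is reported. In the broken one, the English page points at the German page but gets no link back.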
But if you have redirects or a canonical pointing elsewhere, things get messy.
Now Googlebot is going to start complaining about “no return tags” on your English site for your German or French site. Say “de” redirects to “de2”: technically, the page “de” doesn’t exist; only “de2” exists, and your English page is not linking to it. So the return tag from the English page to the German page that actually exists is missing.
The same problem occurs if any of the pages in the cluster use a rel canonical to indicate that it’s actually a duplicate of another page.
Googlebot does not count “fr” as a page because it’s a duplicate; the page that’s indexed is “fr-real”. There are no return tags to “fr-real” from either the en or de pages. Cue the Search Console errors for missing return tags.
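Both failure modes come down to the same thing: the URL listed in the hreflang annotation is not the URL that ends up being treated as the page. The sketch below models that with two plain dictionaries, one for redirects and one for canonicals. It is a simplification: it treats a canonical as if it were always honored, and the names `resolve` and `non_final_targets`, along with the `example.com` URLs, are assumptions made for illustration.

```python
def resolve(url, redirects, canonicals):
    """Follow redirects and canonicals until the URL stops changing."""
    seen = set()
    while url not in seen:  # the `seen` set guards against loops
        seen.add(url)
        if url in redirects:
            url = redirects[url]
        elif canonicals.get(url, url) != url:
            url = canonicals[url]
        else:
            break
    return url


def non_final_targets(hreflang, redirects, canonicals):
    """List (lang, listed_url, final_url) for annotations that are not final."""
    problems = []
    for lang, url in hreflang.items():
        final = resolve(url, redirects, canonicals)
        if final != url:
            problems.append((lang, url, final))
    return problems


# Annotations on the English page (placeholder URLs).
hreflang = {
    "en": "https://example.com/en/",
    "de": "https://example.com/de/",
    "fr": "https://example.com/fr/",
}
redirects = {"https://example.com/de/": "https://example.com/de2/"}
canonicals = {"https://example.com/fr/": "https://example.com/fr-real/"}

for lang, listed, final in non_final_targets(hreflang, redirects, canonicals):
    print(f"{lang}: {listed} should be {final}")
```

Here the "de" and "fr" annotations are both flagged, while "en" passes because it is neither redirected nor canonicalized elsewhere.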
The Hreflang.org testing tool detects these types of errors. For example, an SEO posted this question on Google product forums asking why they were seeing no return tag errors in Google Search Console. I looked at their robots.txt and found their sitemap. Then I plugged the URL of the sitemap into the Hreflang testing tool and got these results. You can see an example of the errors below; the US version of the page is a redirect. The other error in that cluster is that the en-GB page in the cluster is a duplicate of another page:
So the takeaways:
- No page in the hreflang cluster should be a redirect to another page.
- No page in the hreflang cluster should be a duplicate of another page, i.e., have a canonical tag pointing to a different page.
- All pages in the cluster should link to themselves and to each other. They should all be valid pages that return a 200 OK response.
- Sites change; pages move around; shit happens. Use a tool to automatically test all your pages for Hreflang errors periodically.
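As a rough idea of what such a periodic check might look like for a single page, here is a sketch using only Python's standard library `html.parser`. It assumes the page has already been fetched without following redirects, so the status code and HTML are passed in directly. The class `HeadLinkParser` and the function `audit_page` are hypothetical names, and this is not how the Hreflang.org tool is implemented.

```python
from html.parser import HTMLParser


class HeadLinkParser(HTMLParser):
    """Collect rel="alternate" hreflang links and the rel="canonical" link."""

    def __init__(self):
        super().__init__()
        self.hreflang = {}
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        href = a.get("href")
        if not href:
            return
        if "alternate" in rel and a.get("hreflang"):
            self.hreflang[a["hreflang"].lower()] = href
        elif "canonical" in rel:
            self.canonical = href


def audit_page(url, status, html):
    """Return a list of problems for one already-fetched page."""
    parser = HeadLinkParser()
    parser.feed(html)
    problems = []
    if status != 200:
        problems.append(f"status {status}, expected 200")
    if parser.canonical and parser.canonical != url:
        problems.append(f"canonical points to {parser.canonical}")
    if url not in parser.hreflang.values():
        problems.append("no self-referencing hreflang")
    return problems


# Placeholder markup for a French page that declares itself a duplicate.
page = """<head>
<link rel="canonical" href="https://example.com/fr-real/">
<link rel="alternate" hreflang="en" href="https://example.com/en/">
<link rel="alternate" hreflang="fr" href="https://example.com/fr/">
</head>"""

print(audit_page("https://example.com/fr/", 200, page))
# ['canonical points to https://example.com/fr-real/']
```

The comparison is an exact string match, so a real check would probably want to normalize URLs (trailing slashes, relative links) before comparing. Running it over every URL in a sitemap on a schedule would cover the first three takeaways.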