Hreflang and Duplicate Content

The Situation
A multinational company has one website for each country it operates in. These websites use country-specific TLDs (e.g. example.au, example.ca, example.co.uk, example.de and so on). The websites contain information about the company’s various products. The product pages for .de are in German, but the pages for .ca, .au, .co.uk and .us are all in English, so these English pages have essentially the same content. The company wants its .au site to appear in SERPs in Australia, its .ca site in Canada, and so on. So it creates separate websites, uses the same content on the product pages of each site, and uses Hreflang with “en-US”, “en-CA”, “en-GB” and “en-AU”.
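To make the setup concrete, here is a minimal sketch of the annotations such a company would put on its English product pages. The domains follow the example above, but the /widget path and the helper name hreflang_links are made up for illustration.

```python
from html import escape

# Hypothetical product URLs, one per English-language country site.
# The /widget path is an assumption; only the domains come from the example.
ALTERNATES = {
    "en-US": "https://example.us/widget",
    "en-CA": "https://example.ca/widget",
    "en-GB": "https://example.co.uk/widget",
    "en-AU": "https://example.au/widget",
}


def hreflang_links(alternates):
    """Build the <link> elements that every page in the cluster would carry."""
    return [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in alternates.items()
    ]


if __name__ == "__main__":
    # Each of the four pages lists all four alternates, including itself.
    print("\n".join(hreflang_links(ALTERNATES)))
```

Every page in the cluster carries the same set of tags, so each page points at the others and they point back. That reciprocity is what the return tag check relies on.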

What Google Does
Googlebot finds all these pages that are Hreflang’ed but are essentially duplicate content. So Google folds the duplicates into a single URL and may end up showing your .co.uk site in SERPs in Australia. What’s more, when Google folds duplicate pages into a different page, any Hreflang tags on those duplicate pages are ignored. As a result, the Hreflang’ed pages get the missing return tag error in Search Console.

Here’s an example of this happening. In that forum thread, Google’s John Mueller explains:

What’s happening here is that we’re getting a bit confused with all the duplicate URLs that you have on your site. For every country, you seem to have the same content. We’re taking some of these duplicates, and folding them into a single URL to make things easier. Because of that, we don’t see the return links from those duplicates.
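The folding described above can be sketched as a toy model. This is not Google's algorithm, only the bookkeeping: once a duplicate is folded into another URL, its Hreflang tags are treated as ignored, so the pages pointing at it no longer see a return link. The function name, the URLs and the folded_into mapping are all hypothetical.

```python
def missing_return_tags(annotations, folded_into):
    """List (page, alternate) pairs where the alternate's return link is not seen.

    annotations: page URL -> set of alternate URLs it declares via Hreflang.
    folded_into: duplicate URL -> URL it was folded into. Tags on a folded
                 duplicate are treated as ignored (toy assumption).
    """
    # Only pages that were not folded away still contribute their tags.
    visible = {
        page: alts for page, alts in annotations.items()
        if page not in folded_into
    }
    missing = []
    for page, alts in visible.items():
        for alt in alts:
            if alt == page:
                continue  # a self-reference needs no return link
            if page not in visible.get(alt, set()):
                missing.append((page, alt))
    return sorted(missing)


# Hypothetical cluster: four English pages that all reference each other.
URLS = {
    "https://example.us/widget",
    "https://example.ca/widget",
    "https://example.co.uk/widget",
    "https://example.au/widget",
}
ANNOTATIONS = {url: set(URLS) for url in URLS}

if __name__ == "__main__":
    # Suppose the .au page is folded into the .co.uk page as a duplicate.
    folded = {"https://example.au/widget": "https://example.co.uk/widget"}
    for page, alt in missing_return_tags(ANNOTATIONS, folded):
        print(f"{page} -> {alt}: no return tag seen")
```

With nothing folded the cluster is fully reciprocal and the function reports no errors. Folding a single duplicate is enough to produce an error on every remaining page that references it.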

And here is John Mueller talking about this exact same issue during one of the Webmaster hangouts:

And here he is again saying pretty much the same thing: https://www.youtube.com/watch?v=isW-Ke-AJJU&t=27m15s

The Takeaway
The takeaway from all of this is: Do not use Hreflang as an excuse for duplicate content. In other words, do not create separate pages when the content is exactly the same, just to geo-target using Hreflang. If your product descriptions are in English and are exactly the same, do not create different versions like “en-US”, “en-CA”, “en-AU” etc., because it is an abuse of Hreflang. By using Hreflang you are (falsely) claiming that you have customized your content for that language/region. Google will notice: it folds the duplicates together and ignores their Hreflang tags.

To quote John Mueller again,

My recommendation would be to only include unique & relevant content within your site — only include countries that you really have something unique for. This helps our algorithms (we don’t have to filter out duplicates & pick one of the URLs for you), and helps you too (reduces the bloat in your website, it makes the remaining pages a bit “stronger”, and makes it easier to diagnose technical issues like these).
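One way to act on this advice is to check, before adding Hreflang, which of your regional versions are actually identical. Below is a rough sketch that groups versions by a hash of their normalized text. The page texts and function names are invented for illustration, and it only catches exact matches after whitespace and case normalization. Search engines use their own duplicate detection, which this does not try to imitate.

```python
import hashlib
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def duplicate_groups(pages):
    """Group Hreflang codes whose page text is identical after normalization.

    pages: Hreflang code -> page text. Returns a sorted list of sorted code
    lists, one list per group containing more than one code.
    """
    by_digest = defaultdict(list)
    for code, text in pages.items():
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        by_digest[digest].append(code)
    return sorted(sorted(codes) for codes in by_digest.values() if len(codes) > 1)


# Made-up product copy for illustration only.
PAGES = {
    "en-US": "The Widget 3000 ships in two days.",
    "en-CA": "The Widget 3000 ships in two days.",
    "en-GB": "The  Widget 3000 ships in two days. ",
    "en-AU": "The Widget 3000 ships in two days. Prices in AUD include GST.",
    "de-DE": "Der Widget 3000 wird in zwei Tagen geliefert.",
}

if __name__ == "__main__":
    for group in duplicate_groups(PAGES):
        print("Identical content:", ", ".join(group))
```

In this made-up data the en-AU and de-DE pages differ from the rest, so they are the only versions that would justify their own Hreflang entries. The other three English pages are candidates for consolidating into one.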

Published by

Nick Jasuja

Nick Jasuja is the founder of Hreflang.org and Diffen, the world's largest collection of unbiased comparisons. Diffen's expansion to Spanish was Nick's first foray into international SEO and motivated him to launch the Hreflang testing tool. You can find him on Twitter @thisislobo.