The Hreflang testing tool now has a new feature that many SEOs needed, especially those who work on large websites with thousands of URLs — exporting test results to an Excel file.
When you analyze a large number of errors, it’s not always easy to present all the information in a digestible format. I’m under no illusion that the web UI is great. It’s functional, and it is best suited to showing a long list of “All OK” results. But if there are errors (and a large number of hreflang links per page makes at least some errors more likely), the web UI fails to present the results in a way that the user can act on.
So the Export to Excel feature will be very useful. Here’s what you can expect in the Excel file, along with screenshots from a real Excel file generated when testing 110 Expedia pages for Hreflang errors. There are 4 worksheets in the Excel file:
This worksheet contains a list of all URLs submitted for testing, and (almost) all Hreflang links found on those URLs. There is no information about which pages have problems and which ones are implemented correctly.
Say your website is primarily in English, and translations are available in German, French, Spanish, Portuguese, Italian and Arabic. You submit your English sitemap to Hreflang.org for analysis. All your English URLs will be listed in Column A. Column B contains the self-reported language (plus optional region or script) code for that page. So if http://www.example.com/page1.html includes an hreflang link pointing to itself (which it should), and that link specifies hreflang="en", then we understand that this page is in English, and “en” is what you will find in Column B. If the page does not reference itself in its Hreflang tags, Column B will remain blank.
Columns C, D and so on hold the URLs of the other language versions of the main (submitted) URL listed in Column A. So in our example, you will have columns for ar, de, es, fr, it, and pt. They are arranged alphabetically by language code.
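To make the column layout concrete, here is a minimal Python sketch of how such a sheet could be assembled. It is only an illustration of the layout described above, not the tool's actual code: the function name `build_sheet`, the header labels, the translated URLs and the input shape (a dict of submitted URL to its hreflang links) are all assumptions.

```python
def build_sheet(pages):
    """pages: dict mapping submitted URL -> {hreflang code: target URL}.

    Hypothetical input shape, assumed for this sketch.
    """
    # Column B: the code of the link that points back to the page itself,
    # or an empty string if the page has no self-referencing link.
    self_codes = {
        url: next((code for code, target in links.items() if target == url), "")
        for url, links in pages.items()
    }
    # Columns C onward: one column per alternate language code,
    # arranged alphabetically by code.
    alt_codes = sorted({
        code
        for url, links in pages.items()
        for code, target in links.items()
        if target != url
    })
    header = ["URL", "Self-reported hreflang"] + alt_codes  # labels are assumed
    rows = [
        [url, self_codes[url]] + [
            links[code] if code in links and links[code] != url else ""
            for code in alt_codes
        ]
        for url, links in pages.items()
    ]
    return header, rows


header, rows = build_sheet({
    "http://www.example.com/page1.html": {
        "en": "http://www.example.com/page1.html",
        "de": "http://www.example.com/de/page1.html",
        "ar": "http://www.example.com/ar/page1.html",
    },
})
print(header)
print(rows[0])
```

For the single page above, the header comes out as URL, self-reported code, then ar and de, and the row has “en” in the second position because the page links to itself with that code.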
Here’s an example from the Expedia test:
The most common Hreflang implementation error is the “No Return Tag Found” error, and it can be a bear to troubleshoot. This worksheet will help make that much easier.
The correct way to implement Hreflang tags is to use the same set of tags on all pages that have the same content. I like to use the terms cluster, cluster leader and cluster member to conceptualize this.
Hreflang Cluster: A set of pages that have the same content, but in different languages. These pages are supposed to link to each other (and themselves) using hreflang tags.
Cluster Leader: One member of the cluster that was submitted by the user for testing. The tool finds other cluster members by looking at the Hreflang tags found on the cluster leader.
Cluster Members: All pages found in the Hreflang tags of the cluster leader. All of them must also be crawled and verified.
So we start with the submitted page (the cluster leader), note all the Hreflang tags found on that page, and then crawl all those pages (cluster members) and expect to find the same set of Hreflang tags on them. There are 3 possible errors when comparing the set of hreflang tags found on a cluster member with those on a cluster leader:
- Missing Tags: Some tags found on the cluster leader may be missing on the cluster member page.
- Extra Tags: Some tags found on the cluster member page may be missing on the cluster leader page.
- Mismatched Tags: The language code matches (say hreflang="it"), but the cluster leader and the cluster member point to different pages for that code.
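The three checks above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the tool's implementation: the function name `compare_hreflang` is made up, each page's tags are assumed to be a dict with one URL per hreflang code, and URLs are compared as exact strings with no normalization.

```python
def compare_hreflang(leader_tags, member_tags):
    """Compare the hreflang tags of a cluster member against its cluster leader.

    Both arguments map hreflang code -> target URL (assumed shape).
    """
    # Missing: codes on the leader that the member does not have.
    missing = sorted(code for code in leader_tags if code not in member_tags)
    # Extra: codes on the member that the leader does not have.
    extra = sorted(code for code in member_tags if code not in leader_tags)
    # Mismatched: same code on both pages, but pointing to different URLs.
    mismatched = sorted(
        code for code in leader_tags
        if code in member_tags and leader_tags[code] != member_tags[code]
    )
    return {"missing": missing, "extra": extra, "mismatched": mismatched}


# Hypothetical example URLs.
leader = {
    "en": "http://www.example.com/page1.html",
    "it": "http://www.example.com/it/page1.html",
    "de": "http://www.example.com/de/page1.html",
}
member = {
    "en": "http://www.example.com/page1.html",
    "it": "http://www.example.com/it/pagina1.html",
    "fr": "http://www.example.com/fr/page1.html",
}
print(compare_hreflang(leader, member))
```

Here the member lacks the "de" tag (missing), carries an "fr" tag the leader does not have (extra), and disagrees with the leader about the "it" URL (mismatched).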
The Return-Tag-Errors worksheet lists all such errors for all cluster members, along with the URL of the cluster leader that was used to compare.
Here’s an example from the Expedia test:
Why a Separate Worksheet?
The next worksheet (All-Other-Errors) also lists URLs and the errors/warnings found on them. Then why do we need a separate worksheet for Return Tag errors? And why do we list the cluster leader URL in the last column? It’s because a cluster member may belong to more than one cluster. This is only a problem when there are errors in the Hreflang implementation, but we all know that’s pretty common. By listing both the cluster leader and the cluster member, we establish a reference point. If one of the cluster leaders is incorrect, you’ll know which errors to ignore. [This is a complicated issue and my explanation here is too short to fully address all the complexities. Let me know if you’d like an elaboration.]
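One way to use the cluster leader column is to find the members that were compared against more than one leader, since those are the rows where a faulty leader can produce misleading errors. The sketch below assumes you have already read the worksheet into (member URL, error, leader URL) tuples; that tuple shape, the function name `leaders_per_member` and the example URLs are assumptions for illustration only.

```python
from collections import defaultdict


def leaders_per_member(rows):
    """rows: iterable of (member_url, error, leader_url) tuples (assumed shape)."""
    leaders = defaultdict(set)
    for member_url, _error, leader_url in rows:
        leaders[member_url].add(leader_url)
    # Keep only members that were compared against more than one cluster leader.
    return {
        member: sorted(found)
        for member, found in leaders.items()
        if len(found) > 1
    }


# Hypothetical rows.
rows = [
    ("http://www.example.com/de/page1.html", "Missing Tags", "http://www.example.com/page1.html"),
    ("http://www.example.com/de/page1.html", "Extra Tags", "http://www.example.com/page2.html"),
    ("http://www.example.com/fr/page1.html", "Missing Tags", "http://www.example.com/page1.html"),
]
print(leaders_per_member(rows))
```

In this example only the German page is reported, because it shows up under two different cluster leaders. Which of those leaders is the wrong one still has to be decided by hand.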
This is probably the easiest sheet to understand. All errors and warnings found on all pages are listed here. Pages without warnings or errors are not included. If a page has more than one error, the errors are listed as bullet points. To see each bullet point on its own line, you’ll have to format Column B: select the whole column by clicking on its header (B), then click Wrap Text.
Here’s an example: