Canonical URLs: Why They Are Important and How to Use Them
While it’s understood that SEO is crucial for a business to grow, it’s not always clear how to fortify your SEO arsenal. Earning those all-important Google rankings can be challenging to say the least. Different theories exist for pleasing Google’s algorithm. Search engines don’t provide clear-cut information on how they rank websites; however, they do provide guidance that can help companies keep their rankings high. Google’s SEO guides detail technical actions that can help strengthen SEO.
One of these technical strategies is canonicalization. Canonical tags play a vital role in any SEO strategy. They keep a website’s rankings from suffering when different URLs point to the same content. While Google provides guidelines for keeping URL structure simple, sometimes multiple URLs point to a single page. When that happens, search engines see undesirable duplicate content and can have trouble choosing which URL to show for a search.
Canonicalization leads search engines to your selected URL, the one that should be referenced and indexed. Further, it prioritizes the crawling of high-priority pages over low-quality duplicate content pages. So, users searching for your content will land on your preferred page. This makes URL canonicalization a crucial component of your SEO strategy. Keep reading to learn more about canonical tags and URLs and how to maximize their potential.
What Is Canonicalization?
Canonicalization is the process of marking one URL as the preferred version of a page, typically by adding a canonical tag. The principal SEO benefit of canonicalization is that search engines always know which version of a particular page is canonical. When you tag a page as canonical, search engines understand that it is your preferred page version. They treat all other versions as duplicates, which aids in consolidating link signals.
What Are Canonical Tags?
Canonical tags are placed in the <head> section of a web page’s HTML code. They typically look something like this:
<link rel="canonical" href="https://www.yourwebsite.com/article-title" />
The URL within this tag can be the page on which the tag appears, or whichever page you want to stand as the main page for the content.
What Are Canonical URLs?
A canonical URL is the URL you select as the main page for a piece of content. Once you choose the main page, all the other versions of the same content are recognized as duplicates. With a canonical URL, you can avoid duplicate content issues when you specify the preferred version of a particular page.
How Canonicalization Improves SEO
Because canonicalization means all signals to the duplicate versions are treated as links to the selected canonical version, you are essentially consolidating multiple link signals to a single page. Consolidating pages boosts the SEO of the canonical page. It boosts SEO in other ways as well, including the following:
- Select which pages should get ranked
Canonicalization essentially informs search engines which page(s) to index and rank over their variants. Only the page you select ranks in search results, so it loses no traffic to its duplicates.
- Manage content syndication
Content syndication is an effective strategy that boosts brand awareness, website traffic, and SEO. However, it also risks content being marked as a duplicate unless, of course, you also employ canonicalization. As we have discussed, canonicalization consolidates link signals to the canonical URL.
- Prevent crawling of duplicate pages
When search engine bots crawl duplicate pages, the pages inevitably compete with one another for ranking in the Search Engine Results Pages (SERPs). Because duplicate content splits ranking signals and wastes crawl budget, canonicalization is especially important to SEO in this regard.
A Note About Duplicate Content
When thinking about duplicate content as it relates to canonicalization, remember the concept refers to duplicate URLs more so than duplicate pages. You likely have duplicate content even though you took the time to create unique pages throughout your website.
Duplicate content exists primarily due to Content Management Systems (CMS) like WordPress. CMSs create multiple URLs when you have different versions of your site that can be indexed. This especially applies if you have a separate mobile website. For example, the URLs below may point to the same content, though the URLs themselves are not the same.
They look the same to viewers; however, search engines recognize them as duplicate pages. Take another look at the list above, and note the following:
- The second and third URLs point to a www and non-www variant of the website, respectively.
- The first and second URLs are duplicates because the CMS assigned a category to one of them.
- The second and fourth URLs are duplicates, except that one uses HTTP and the other HTTPS.
- The fifth URL is the mobile version of your website and exists on a sub-domain.
- The sixth URL’s only difference is that the word NAME is in all capital letters, thus creating a duplicate.
Because some duplicate content is necessary, canonicalization is required to point search engines to your designated main page and identify the remaining versions as copies.
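To make the idea concrete, here is a minimal Python sketch of how URL variants like the ones described above collapse into one preferred form. The example.com URLs and the normalize and group_duplicates helpers are hypothetical. The rules it applies (force HTTPS, drop a www. or m. prefix, lowercase the path) are assumptions about one site's preferences, not anything Google prescribes. Category-path duplicates are left out because no generic rule can detect them.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical URL variants that could all serve the same page.
VARIANTS = [
    "http://www.example.com/product-name",
    "http://example.com/product-name",
    "https://www.example.com/product-name",
    "https://m.example.com/product-name",
    "https://www.example.com/product-NAME",
]


def normalize(url):
    """Collapse scheme, host-prefix, and letter-case variants into one form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Assumed site preference: no "www." and no mobile "m." subdomain.
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]
            break
    # Assumed site preference: HTTPS and an all-lowercase path.
    return urlunsplit(("https", host, parts.path.lower(), parts.query, ""))


def group_duplicates(urls):
    """Map each normalized URL to the list of variants that collapse into it."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize(url), []).append(url)
    return groups


if __name__ == "__main__":
    for canonical, variants in group_duplicates(VARIANTS).items():
        print(canonical, "<-", len(variants), "variants")
```

Any group with more than one variant is a candidate for a canonical tag or a redirect.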
How to Implement Canonicals
There are five established ways to specify canonical URLs. These are called canonicalization signals and consist of the following:
- HTML tag (rel=canonical)
- HTTP header
- Sitemap
- 301 redirect
- Internal links
Note: Google details each signal’s features, benefits, and disadvantages in its official documentation.
1. Using rel="canonical" HTML Tags
Using a rel=canonical tag is the most straightforward way to specify a canonical URL. To do so, just add this code to the <head> section of any duplicate pages:
<link rel="canonical" href="https://example.com/canonical-page/" />
As a more specific example, say you have an eCommerce site that sells slippers, and you have several URLs that point toward your catalog of slippers by color. So, you want https://yourstore.com/slippers/pink-slippers/ to be the canonical URL for pink slippers, even though your page’s content is accessible through other URLs like https://yourstore.com/offers/pink-slippers/, for instance.
Add this canonical tag to all your duplicate pages:
<link rel="canonical" href="https://yourstore.com/slippers/pink-slippers/" />
Now, if you’re using a CMS like WordPress, you don’t need to hand-code the HTML on your pages. An easier way is to install Yoast SEO or another SEO plugin that automatically adds self-referencing canonical tags. You can set custom canonicals using the “Advanced” section on each page if you wish.
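If you are on a custom CMS and do need to generate the tag yourself, the sketch below shows one way a template helper might build it in Python. The canonical_link_tag function is a hypothetical name, not part of any plugin. The one detail worth noting is that the URL is HTML-escaped so that characters such as & in a query string don't break the attribute.

```python
from html import escape


def canonical_link_tag(canonical_url):
    """Return a rel="canonical" link element for the <head> section."""
    # escape(..., quote=True) protects &, <, > and quotes inside the attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}" />'


if __name__ == "__main__":
    print(canonical_link_tag("https://yourstore.com/slippers/pink-slippers/"))
```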
2. HTTP Header for PDFs and Non-HTML Documents
For documents like PDFs and other non-HTML files, you can’t place canonical tags in the page header because there is no <head> section. If you have a non-HTML file and wish to specify a canonical URL for it, you can send a rel="canonical" HTTP header with the file instead. On an Apache server, this is typically configured in your website’s .htaccess file. The response for the file should include the header below:
Link: <http://www.yourwebsite.com/downloads/filename.pdf>; rel="canonical"
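If your files are served by application code rather than by static server configuration, the same header can be attached there. The Python sketch below only builds the header as a name and value pair. The function names are hypothetical, and how you attach the pair to a response depends on your framework.

```python
# The canonical URL from the example above.
CANONICAL_PDF_URL = "http://www.yourwebsite.com/downloads/filename.pdf"


def canonical_link_header(canonical_url):
    """Build the (name, value) pair for a rel="canonical" Link header."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


def response_headers_for_pdf(canonical_url):
    """Headers a server might send along with the PDF bytes."""
    return [
        ("Content-Type", "application/pdf"),
        canonical_link_header(canonical_url),
    ]


if __name__ == "__main__":
    for name, value in response_headers_for_pdf(CANONICAL_PDF_URL):
        print(f"{name}: {value}")
```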
3. Sitemaps
Google reviews all URLs in a sitemap and considers them canonical, at least potentially. If Google identifies duplicate content on pages within the sitemap, it decides which duplicate content version is canonical. So, it is good practice to exclude non-canonical URLs from sitemaps because Google assumes any within the sitemap are potential canonical versions.
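As a rough illustration, the Python sketch below writes a sitemap that lists only the URLs that are their own canonical. The build_sitemap helper and the page list are hypothetical. In practice your CMS or SEO plugin usually generates the sitemap for you.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical (page URL, canonical URL) pairs taken from the slippers example.
PAGES = [
    ("https://yourstore.com/slippers/pink-slippers/",
     "https://yourstore.com/slippers/pink-slippers/"),
    ("https://yourstore.com/offers/pink-slippers/",
     "https://yourstore.com/slippers/pink-slippers/"),
]


def build_sitemap(pages):
    """Return sitemap XML that includes only self-canonical URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, canonical in pages:
        if url == canonical:  # skip duplicates that point elsewhere
            loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
            loc.text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap(PAGES))
```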
4. 301 Redirects
If you’re familiar with 301 redirects, then you know to use them when you wish to divert traffic away from a duplicate URL and to your chosen URL, which is the canonical version.
You can also do the same for secure HTTPS vs. HTTP versions of your site using the 301 redirect to point users toward the HTTPS site. The same applies to www and non-www versions of your site. Choose one canonical version and redirect the others to that version.
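The redirect decision itself is simple enough to sketch in Python. The example below assumes the site prefers HTTPS and the www host. Both are hypothetical choices for illustration, as are the names used. In production, this logic normally lives in your web server or CDN configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed preferences for this illustration.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.yourwebsite.com"
ALIAS_HOSTS = {"yourwebsite.com", "www.yourwebsite.com"}


def redirect_for(url):
    """Return (301, target) if the URL should redirect, otherwise None."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host not in ALIAS_HOSTS:
        return None  # not one of our hosts, so leave it alone
    if parts.scheme == CANONICAL_SCHEME and host == CANONICAL_HOST:
        return None  # already the canonical version
    target = urlunsplit(
        (CANONICAL_SCHEME, CANONICAL_HOST, parts.path, parts.query, parts.fragment)
    )
    return (301, target)


if __name__ == "__main__":
    print(redirect_for("http://yourwebsite.com/sample-page/"))
```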
5. Internal Links
While you may never have thought of it this way, how you link from one page to another through your website is a canonicalization signal. The more consistently you link to your preferred URLs, the easier it is for search engines to identify your preferred canonical URL. Also, keep in mind that Google prefers HTTPS URLs over HTTP URLs and favors well-titled URLs. Following both principles makes those canonicalization signals stronger.
Canonicalization Tips and Best Practices
Now that we have established how vital canonicalization is to SEO, here are a few best practices for adding canonical tags and URLs to your pages.
1. Only One Canonical URL Per Page
You should have only one canonical URL per page. If you add more than that, you’ll end up doing more harm than good. Google will detect the multiples and ignore them all.
2. Use the Appropriate Domain Protocol
When adding a canonical URL, take care that it’s the correct domain protocol for your website, either HTTP or HTTPS.
If you switch to Secure Sockets Layer (SSL), ensure that you don’t select any non-SSL (HTTP) URLs in the canonical tags. Doing so sends search engines conflicting signals about which version to index. Likewise, if you are on a secure domain, use this version of your URL:
<link rel="canonical" href="https://yourwebsite.com/sample-page/" />
3. Use the Appropriate WWW or Non-WWW Version
Each version is recognized differently, so be sure you select the most appropriate one for your website.
4. Use Full URLs
When adding a URL to your canonical tag, use the whole URL and not just a portion of it. This ensures that the link is interpreted correctly. So, you should use the following structure:
<link rel="canonical" href="https://yoursite.com/sample-page/" />
Avoid a relative path like this one, which can be misinterpreted:
<link rel="canonical" href="/sample-page/" />
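If your templates only know the path, you can resolve it against the site's base URL before writing the tag. The Python sketch below uses the standard library's urljoin for that. The helper names and the base URL are hypothetical.

```python
from urllib.parse import urljoin, urlsplit


def is_absolute(href):
    """True when the href carries both a scheme and a host."""
    parts = urlsplit(href)
    return bool(parts.scheme and parts.netloc)


def to_absolute(page_url, href):
    """Resolve a relative href against the URL of the page it appears on."""
    return urljoin(page_url, href)


if __name__ == "__main__":
    href = "/sample-page/"
    if not is_absolute(href):
        href = to_absolute("https://yoursite.com/blog/post/", href)
    print(href)
```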
5. Use Self-Referential Canonical URLs
Self-referential canonical URLs clarify which page should be indexed and what the URL will be when proper indexing occurs. Even if you’re only concerned with one page, there are likely different URL variations that can call up that page. For instance, upper and lowercase versions of the same URL will call up the same page.
Basically, a self-referential canonical tag on a page points to itself. So, let’s say the URL is https://yourwebsite.com/sample-page, then a self-referencing canonical on that page would be:
<link rel="canonical" href="https://yourwebsite.com/sample-page" />
Most CMSs add self-referencing URLs automatically, but you may still need to hardcode them if you’re using a custom CMS.
6. Avoid Referencing a 301 Redirect
Your canonical tag should never point to a URL that 301-redirects somewhere else. The goal is to reference the final destination URL directly, not any URL that redirects to it.
7. Use Lowercase URLs
Google sometimes recognizes uppercase and lowercase URLs as separate URLs. A good rule of thumb is to use all lowercase URLs on your server. Then, use only lowercase URLs for your canonical tags.
How to Audit and Fix Canonical URLs
Perhaps the simplest way to audit your website for canonical tag errors is to view your page source. You can do this by right-clicking on your webpage and choosing the option to view the page source. Then, press Ctrl+F (Cmd+F on a Mac) and search for “canonical.” Check the results to ensure that the URL in the href= attribute is the URL of your preferred indexed page.
For a more thorough and automated audit of your canonical tags, you can use audit tools like Screaming Frog or the Semrush Site Audit tool. Running a site audit report for your website will activate several checks that involve canonical tags. Next are a few issues that site audits can identify so you can correct them.
Accelerated Mobile Pages (AMPs) With No Canonical Tag
When AMPs don’t have canonical tags, a site audit tool will flag this for you. While you should have canonicalization in place between AMP and non-AMP versions of your site pages, the audit tool will catch any discrepancies or missing tags.
To correct this, add a rel="canonical" tag in the <head> section of each AMP that points to the non-AMP page.
No Canonical Tags for Duplicate Content
A site audit tool will identify duplicate content and recommend adding a canonical tag or redirecting the page.
Pages with Broken Canonical Tags
A broken canonical tag points to a web page that doesn’t exist. Search engines can’t treat that target as the canonical URL, and the dead link disrupts the crawling and indexing of your site content. A site audit will identify these so you can amend them immediately.
Pages With Multiple Canonical URLs
The audit tool identifies this error when you have more than one canonical URL on a page. Fixing it simply means removing the duplicate tags and leaving only one in place.
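For a sense of what these checks involve, here is a minimal Python sketch that scans a page's HTML for canonical tags and reports three of the issues above: a missing tag, multiple tags, and a relative URL. The class and function names are hypothetical, and this is not how any particular audit tool works. It doesn't check whether the canonical URL actually resolves, since that requires a network request, and it doesn't confirm that the tag sits inside the <head> section.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.hrefs.append(attr.get("href") or "")


def audit_canonicals(html_source):
    """Return (hrefs, issues) for one page's HTML source."""
    collector = CanonicalCollector()
    collector.feed(html_source)
    issues = []
    if not collector.hrefs:
        issues.append("missing canonical tag")
    if len(collector.hrefs) > 1:
        issues.append("multiple canonical tags")
    for href in collector.hrefs:
        parts = urlsplit(href)
        if not (parts.scheme and parts.netloc):
            issues.append(f"relative or empty canonical URL: {href!r}")
    return collector.hrefs, issues


if __name__ == "__main__":
    page = '<head><link rel="canonical" href="/sample-page/" /></head>'
    print(audit_canonicals(page))
```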
Time to Review Your Canonical URLs
As you now understand, canonicalization is a must if your goal is to supercharge your website’s SEO. It plays a vital role in maintaining high rankings in the SERPs, consolidating your link signals, and conserving your crawl budget by prioritizing the pages that Google needs to crawl. More importantly, it helps your website avoid the ranking problems associated with duplicate content.
If you want to learn more about canonicalization or inquire about a professional site audit, take a look at the services we have to offer at Searched & Found.