Canonicalization: A Way to Handle Duplicate Content

Canonicalization is the process of telling search engines which URL is the preferred one when the same content is reachable at multiple URLs.

Use case scenario: you have mobile and desktop versions of your pages.

  • www.yourwebsite.com/m/
  • www.yourwebsite.com/

You publish a page about a Hawaii vacation. The same content is then served at two different URLs.

  • www.yourwebsite.com/m/hawai-vacation.html
  • www.yourwebsite.com/hawai-vacation.html

This is where the problem arises: there are two different URLs, but the content is the same, so search engines see duplicate content. To counter this, you need to set a preferred URL. This process is called canonicalization.
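The mapping from a duplicate URL to its preferred URL can be sketched in Python. This is only a minimal illustration: it assumes the mobile pages live under the /m/ path prefix, as in the example above, and the name preferred_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: mobile pages live under the /m/ prefix, as in the example URLs.
MOBILE_PREFIX = "/m/"


def preferred_url(url: str) -> str:
    """Return the desktop URL for a mobile URL; other URLs are returned unchanged."""
    parts = urlsplit(url)
    path = parts.path
    if path.startswith(MOBILE_PREFIX):
        # Drop the "/m" segment but keep the slash that starts the rest of the path.
        path = path[len(MOBILE_PREFIX) - 1:]
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(preferred_url("http://www.yourwebsite.com/m/hawai-vacation.html"))
```

Both the mobile and the desktop URL map to the same preferred URL, which is the value you would then declare as canonical.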

Here is Google's official definition:

Canonical URL: A canonical URL is the URL of the page that Google thinks is most representative from a set of duplicate pages on your site. For example, if you have URLs for the same page (example.com?dress=1234 and example.com/dresses/1234), Google chooses one as canonical. The pages don’t need to be absolutely identical; minor changes in sorting or filtering of list pages don’t make the page unique (for example, sorting by price or filtering by item color). The canonical URL can be in a different domain than a duplicate URL.

Source: Google’s blog on Consolidate Duplicate URLs

Correct method of canonicalization

To specify your preferred version, add this <link> tag:

<link rel="canonical" href="http://www.yourwebsite.com/hawai-vacation.html"/>

Place it inside the <head> section of the duplicate page, http://www.yourwebsite.com/m/hawai-vacation.html.

Google will then understand that the duplicate refers to the canonical URL, http://www.yourwebsite.com/hawai-vacation.html. Additional URL properties, like PageRank and related signals, are transferred as well.
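To confirm that a duplicate page really declares the canonical URL you intended, you can parse its HTML and read the tag back. The sketch below uses only Python's standard html.parser module; the class and function names (CanonicalFinder, find_canonical) are hypothetical, and fetching the page is left out.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are routed here as well.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the declared canonical URL, or None if the page declares none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    page = (
        "<html><head>"
        '<link rel="canonical" href="http://www.yourwebsite.com/hawai-vacation.html" />'
        "</head><body></body></html>"
    )
    print(find_canonical(page))
```

Running this against the HTML of the /m/ page should return the desktop URL; a result of None means the tag is missing.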

Important note: the URL you declare as the canonical version should also be the one listed in your sitemap.xml, not the duplicate.
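A quick consistency check for this note can be sketched as follows. It assumes a standard sitemap that uses the sitemaps.org 0.9 namespace; the function names (sitemap_urls, missing_from_sitemap) are hypothetical, and the sitemap content is passed in as a string rather than downloaded.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (sitemaps.org protocol 0.9).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> set:
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def missing_from_sitemap(canonical_urls, sitemap_xml: str) -> list:
    """Return the canonical URLs that the sitemap does not list."""
    listed = sitemap_urls(sitemap_xml)
    return [url for url in canonical_urls if url not in listed]


if __name__ == "__main__":
    sitemap = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://www.yourwebsite.com/hawai-vacation.html</loc></url>"
        "</urlset>"
    )
    print(missing_from_sitemap(["http://www.yourwebsite.com/hawai-vacation.html"], sitemap))
```

An empty result means every canonical URL you checked is present in the sitemap.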

You can read more about it in another Google blog post on canonicalization, linked at the end of this article.

Valid reasons for keeping similar or duplicate pages

There are valid reasons why your site might have different URLs that point to the same page, or have duplicate or very similar pages at different URLs. Here are the most common reasons:

  • To support multiple device types:
    • https://example.com/news/koala-rampage
    • https://m.example.com/news/koala-rampage
    • https://amp.example.com/news/koala-rampage
  • To enable dynamic URLs for things like search parameters or session IDs:
    • https://www.example.com/products?category=dresses&color=green
    • https://example.com/dresses/cocktail?gclid=ABCD
    • https://www.example.com/dresses/green/greendress.html
  • If your blog system automatically saves multiple URLs as you position the same post under multiple sections.
    • https://blog.example.com/dresses/green-dresses-are-awesome/
    • https://blog.example.com/green-things/green-dresses-are-awesome/
  • If your server is configured to serve the same content for www/non-www http/https variants:
    • http://example.com/green-dresses
    • https://example.com/green-dresses
    • http://www.example.com/green-dresses
  • If content you provide on a blog for syndication to other sites is replicated in part or in full on those domains:
    • https://news.example.com/green-dresses-for-every-day-155672.html (syndicated post)
    • https://blog.example.com/dresses/green-dresses-are-awesome/3245/ (original post)
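Some of the variants listed above (http/https, www/non-www, and tracking parameters such as gclid) can be detected automatically by normalizing URLs and grouping the ones that collapse to the same key. The sketch below is an illustration under stated assumptions, not a rule Google applies: it treats https without www as the normalized form and treats only gclid as a tracking parameter. Both choices are site-specific, and the function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: query parameters that never change the page content.
TRACKING_PARAMS = {"gclid"}


def normalize(url: str) -> str:
    """Collapse scheme, www prefix, and tracking parameters into one key."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    # Assumption: https is the preferred scheme; fragments are dropped.
    return urlunsplit(("https", host, parts.path, urlencode(kept), ""))


def group_duplicates(urls) -> dict:
    """Map each normalized key to the list of original URLs that share it."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return dict(groups)


if __name__ == "__main__":
    variants = [
        "http://example.com/green-dresses",
        "https://example.com/green-dresses",
        "http://www.example.com/green-dresses",
    ]
    for key, members in group_duplicates(variants).items():
        print(key, len(members))
```

Any group with more than one member is a candidate for a single canonical URL. Duplicates that differ in path, such as the blog-section and syndication examples, cannot be found this way and need a content comparison instead.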

Google's blog post on canonicalization: https://developers.google.com/search/blog/2009/02/specify-your-canonical
