Canonicalization is a way to deal with the problem of multiple URLs serving the same content.
Use Case Scenario: You have mobile and desktop versions of your pages.
You create content about a Hawaii vacation. The same content is served at two different URLs, one for desktop and one for mobile.
This is where the issue arises: two different URLs, but the same content. Search engines treat this as duplicate content. To counter it, you need to tell them which URL you prefer. This process is called canonicalization.
Here is what Google says about it officially.
Canonical URL: A canonical URL is the URL of the page that Google thinks is most representative from a set of duplicate pages on your site. For example, if you have URLs for the same page (example.com/dresses/1234), Google chooses one as canonical. The pages don’t need to be absolutely identical; minor changes in sorting or filtering of list pages don’t make the page unique (for example, sorting by price or filtering by item color). The canonical URL can be in a different domain than a duplicate URL.
(Source: Google’s blog on Consolidate Duplicate URLs)
Correct method of canonicalization
Now, you can simply add this <link /> tag to specify your preferred version:
<link rel="canonical" href="http://www.yourwebsite.com/hawai-vacation.html"/>
Place it inside the <head> section of the duplicate content URL, http://www.yourwebsite.com/m/hawai-vacation.html, and Google will understand that the duplicates all refer to the canonical URL, http://www.yourwebsite.com/hawai-vacation.html. Additional URL properties, like PageRank and related signals, are transferred as well.
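To confirm that a duplicate page really declares the preferred URL, you could check its markup with a short script. The following is only a minimal sketch using Python's standard library; the names CanonicalFinder, find_canonical and mobile_page are made up for illustration, and the sample markup is a hypothetical version of the mobile page. It does not check that the tag sits inside the <head> section.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens; compare case-insensitively
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonical(html_text):
    """Return the single canonical URL declared in html_text, or None."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    # More than one canonical tag is ambiguous, so treat it as "not declared"
    if len(finder.canonicals) == 1:
        return finder.canonicals[0]
    return None


# Hypothetical markup for the mobile (duplicate) page
mobile_page = """
<html>
  <head>
    <title>Hawaii Vacation</title>
    <link rel="canonical" href="http://www.yourwebsite.com/hawai-vacation.html"/>
  </head>
  <body>...</body>
</html>
"""

print(find_canonical(mobile_page))
```

If the function returns None for a page you expected to be canonicalized, the tag is either missing or declared more than once.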
Important Note: The URL that you choose as the canonical version should also be the one listed in your sitemap.xml.
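As a rough illustration of the note above, a sitemap that lists only the canonical URL could be generated like this. It is a sketch, not a complete sitemap tool: build_sitemap is a hypothetical helper, and only the required <loc> element of the sitemaps.org format is written.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls):
    """Return sitemap XML (as a string) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url in canonical_urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Only the preferred (desktop) URL is listed; the /m/ duplicate is left out
sitemap_xml = build_sitemap(["http://www.yourwebsite.com/hawai-vacation.html"])
print(sitemap_xml)
```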
You can read more about canonicalization in another Google blog post. Here is the link.
Valid reasons for keeping similar or duplicate pages
There are valid reasons why your site might have different URLs that point to the same page, or have duplicate or very similar pages at different URLs. Here are the most common reasons:
- To support multiple device types.
- To enable dynamic URLs for things like search parameters or session IDs.
- If your blog system automatically saves multiple URLs as you position the same post under multiple sections.
- If your server is configured to serve the same content for www/non-www and http/https variants.
- If content you provide on a blog for syndication to other sites is replicated in part or in full on those domains.
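Several of the cases above (device-specific paths, session parameters, www/non-www and http/https variants) come down to mapping many URL variants onto one preferred URL. The sketch below shows one way to express that mapping in Python. Everything in it is an assumption made for illustration: the preferred form is http plus www, mobile pages live under /m/, and the parameters named sessionid and sort never change the content. Your own site's rules will differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed site rules (hypothetical, adjust for a real site)
PREFERRED_SCHEME = "http"
PREFERRED_HOST = "www.yourwebsite.com"
SITE_HOSTS = {"yourwebsite.com", "www.yourwebsite.com"}
MOBILE_PREFIX = "/m/"
IGNORED_PARAMS = {"sessionid", "sort"}


def preferred_url(url):
    """Map a duplicate URL variant onto the preferred (canonical) URL."""
    parts = urlsplit(url)
    if (parts.hostname or "") not in SITE_HOSTS:
        return url  # not one of our hosts; leave it untouched
    path = parts.path or "/"
    # Mobile variant: /m/page.html -> /page.html
    if path.startswith(MOBILE_PREFIX):
        path = "/" + path[len(MOBILE_PREFIX):]
    # Drop parameters that only carry session or sorting state
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in IGNORED_PARAMS]
    # Force the preferred scheme and host, and discard any fragment
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, urlencode(kept), ""))


print(preferred_url("https://yourwebsite.com/m/hawai-vacation.html?sessionid=abc123"))
```

A function like this could feed the href of the canonical tag on every variant, so that all duplicates point to the same preferred URL.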