What is Duplicate Content? Best Practices for SEO
In the world of SEO, we’re always told to avoid duplicate content because the results can be undesirable. But why? What is duplicate content? How does it affect SEO? And how does it affect SEO? See what I did there?
As you can probably deduce from the term, duplicate content is a piece of content that appears in more than one place on the internet. For instance, if you publish an article on your blog and also paste it into its own standalone web page, you’ve got duplicate content.
For a more official explanation, here’s how Google defines it:
Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar.
So why is this so bad? Well, you have to put yourself in Google’s shoes. They want to provide the best possible experience for their users, which means serving the most relevant, high-quality results for every search query. To fully understand duplicate content and how it affects SEO, we have to dive a little deeper.
Related: Technical SEO Checklist & Guide
The Issue With Duplicate Content
Duplicate content creates problems for Google and the website owner.
For search engines like Google, it can be difficult to determine:
- Which content to include in their index.
- Which version they should rank.
- Whether to provide link juice (authority, link equity, trust, etc.) to a single page or multiple pages.
Google is wary when content is posted twice because there’s a chance it may be spam, stolen content, or a website owner trying to inflate traffic. Unfortunately, it could also just be an innocent web development mistake. Either way, Google’s algorithms cannot make subjective decisions about which version of the content is more relevant, so Google sometimes falls back on the catch-all approach of not ranking duplicate content at all.
For website owners, duplicate content damages rankings and traffic for the following reasons:
- It’s bad practice for search engines to show multiple copies of the same content. Since algorithms cannot choose which version is better, the owner of the content may end up with no version ranking at all.
- Link equity, or “link juice” as SEOs call it, becomes diluted. When a piece of content exists in multiple versions, people who share it have to pick a link, and they won’t all pick the same one. Inbound links and traffic for the same piece of content end up split across multiple URLs. Link juice is a big part of ranking high in SERPs, so a single piece of content that lives in multiple places never reaches its full potential visibility.
How Duplicate Content Occurs
URL Variations
There are many ways that a website’s URL can be displayed differently. For example, http://website.com and https://website.com point to the same place, but search engines read them as two different URLs, and if the same content is served at both, it can be treated as duplicate content.
URL parameters generated for tracking can also cause problems. For example, have you ever seen links like this that lead to the same page?
http://website.com/?utm_source=blog3&utm_page=email&utm_information=information
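To a search engine, each of the following can look like a separate page, even though a visitor would treat them all as the same one (website.com here is just a placeholder):

http://website.com/page
https://website.com/page
https://www.website.com/page
https://website.com/page/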
Geographically Different Domains
Different countries have localized domains such as .ca for Canada and .de for Germany. If your content is the same across these different domains, it can create duplicate content problems.
Copied Content
Google doesn’t take kindly to plagiarism. If your content is taken directly from another source, Google may choose not to rank you, or may push your pages further down the results.
Many websites these days are content curators. Google accepts content curation as long as it offers more than simply copying and pasting; adding a fresh perspective or something of value will get you a pass in Google’s books.
How To Fix Duplicate Content Issues
The goal of repairing duplicate content is to tell search engines which version is the right one. In other words, which web page deserves the credit.
301 redirect
Setting up a 301 (permanent) redirect from the duplicate pages to the original takes search bots to the source and tells them which page should be ranked. Think of it like Google is lost and you’re giving it directions to the nearest town.
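As an illustration, on an Apache server a permanent redirect can be added to the site’s .htaccess file. This is a minimal sketch with placeholder paths and domain; the exact setup depends on how your site is hosted:

# Send visitors and search bots from the duplicate URL to the original
Redirect 301 /duplicate-page https://website.com/original-page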
Rel=”canonical”
This funny-looking tag sits in a web page’s HTML head and tells search bots that the page is a copy of an original, and at which URL the original content lives. Google then knows which page to hand the link juice to.
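For example, if website.com/duplicate-page carries the same content as website.com/original-page (placeholder URLs), the duplicate page’s head section would include a tag like this:

<link rel="canonical" href="https://website.com/original-page" />

Search bots that crawl the duplicate then know to consolidate ranking signals on the original URL.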
Rewrite the content
Google has no problem with you taking the concept and structure of a piece of content and rewriting it with your own perspective or additions. Think of this as paraphrasing.
Related: Using Rel=Canonical Tag For SEO-Friendly Cross-Domain Duplicate Content
Meta Robots Noindex
This tag tells search bots to crawl a page but leave it out of their index, so the duplicate never appears in search results. The drawback is inefficiency: bots still spend time crawling pages that will never rank, and most SEOs scoff at the idea of anything that wastes crawl time.
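A minimal example, placed in the head section of the duplicate page:

<meta name="robots" content="noindex, follow">

The noindex value keeps the page out of the index, while follow still lets bots pass along the links on the page.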
Wrapping Up
It can be challenging for web designers and owners to avoid duplicate content.
You have to put yourself in the shoes of a search engine rather than a website owner. SEO specialists at Power Digital Marketing can help you avoid common mistakes when managing your website, such as URL variations.
If you suspect that your website already has duplicate content problems, we can help. Duplicate content is bad for your rankings, so root it out before Google stops ranking your pages.