What is duplicate content?

Search engines treat as duplicate content any text that is repeated, in whole or in part, at more than one web address (URL), whether those addresses belong to the same website (internal) or to different websites (external).

Contrary to what many people think, most of the duplicate content detected by Google occurs within a single website.
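To make the idea of "totally or partially repeated" text more concrete, here is a minimal sketch in Python that compares two texts by their overlapping three-word sequences (shingles) and reports a similarity score. This is only an illustration of how overlap can be measured. It is not how Google detects duplicates, and the function names and the shingle size are our own assumptions.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences (shingles) in a text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Very short texts yield a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=3):
    """Jaccard similarity of two texts: 0.0 (no overlap) to 1.0 (identical)."""
    sa, sb = shingles(a, size), shingles(b, size)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


if __name__ == "__main__":
    original = "the quick brown fox jumps"
    copy = "the quick brown fox sleeps"
    print(similarity(original, copy))  # share 2 of 4 distinct shingles
```

A score close to 1.0 suggests that two pages share most of their text. Any threshold for calling them "duplicates" would be a judgment call for each site.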

How does duplicate content affect SEO?

Duplicate content, whether internal or external, can seriously harm organic rankings. Once Google’s crawlers detect it, the search engine can respond in several ways:

  • It can filter such content so that it does not appear in search results.
  • If the copying is systematic, Google can use the Panda algorithm to directly penalize the pages that engage in this practice.
  • If plagiarism is reported, a Google reviewer can investigate and decide to apply a manual penalty.

Why does duplicate content occur?

To answer this question properly, we must distinguish between two types of duplicate content: internal and external.

1. Internal duplicate content

Most of the duplicate content automatically detected by Google is caused by misuse of URL parameters or by poor organization or management of content.

These are the main causes:

  • Having a non-canonical domain. Our website may be reachable both with and without the “www” prefix, which produces pages with identical content at different URLs, so there is a risk that Google will treat them as duplicates.
  • Allowing Google’s crawlers to access parts of our website that are under test. These areas sometimes contain content that can be considered duplicate.
  • Having URLs with different endings for each country or region. For Google, a page on a domain ending in .es is different from the same page on a domain ending in .uk. If for any reason we publish identical content on both, Google may consider it duplicate.
  • Organizing our content poorly, whether through badly classified categories, missing or poorly planned meta descriptions, or identical content reused in several posts. This is very common in online stores, which tend to copy product descriptions word for word.
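The first cause above (the same page reachable with and without “www”, or with extra URL parameters) can be spotted with a small script. The sketch below, in Python, normalizes URLs and groups those that collapse to the same address. The list of ignored parameters, the choice of "https" and the decision to drop “www” are assumptions made for the example, not rules taken from Google.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that do not change the content of a page.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def normalize(url):
    """Reduce a URL to one comparable form (assumed rules, see above)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # treat www and non-www as the same site
    # Drop ignored parameters and sort the rest so order does not matter.
    params = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    )
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(("https", host, path, urlencode(params), ""))


def group_duplicates(urls):
    """Return groups of URLs that normalize to the same address."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    crawled = [
        "http://www.example.com/shoes/?utm_source=x&color=red",
        "https://example.com/shoes?color=red",
        "https://example.com/hats",
    ]
    for target, members in group_duplicates(crawled).items():
        print(target, "<-", members)
```

Each group it reports is a candidate for a canonical tag or a 301 redirect, which are described in the measures further down.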

2. External duplicate content

Sometimes duplicate content occurs between entirely different websites managed by different administrators. This usually happens for two main reasons:

  1. Because we copy an entire post, or one or more fragments of it, to write an article, or because other websites copy us. There are even programs dedicated to duplicating content automatically using spam methods.
  2. Because we use syndication strategies. Sometimes, to gain visibility, we send our content to other websites, which can cause problems when the full text of an article is republished without a link to the original, or with an incorrect one.

Some keys to solving duplicate content

Duplicate content is not an easy problem to solve, since controlling and monitoring it takes a lot of time. But given the negative consequences it can have for the SEO of your page, it is highly advisable to take the following measures:

  • Use the “rel="canonical"” link element to tell the search engine which version of a page (for example, the URL with the “www” prefix or the one without it) we want indexed. To do this, just add one line of code to the page’s “head” section or send the equivalent HTTP header.
  • Create 301 redirects to send Google’s robots to the page we want indexed, which is especially useful when we have moved content from one URL to another. On Apache, these redirects are defined in the “.htaccess” file.
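  • Use the “nofollow” attribute on links that we do not want Google’s robots to follow, for example because the target pages are under construction, review or testing. Keep in mind that “nofollow” is only a hint about links; to keep a page itself out of the index, use the “noindex” robots meta tag.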
  • Use the “hreflang” attribute when different URLs serve content aimed at different countries or languages.
  • Improve our titles, post categories and meta descriptions. For this, we can use Google Search Console (formerly Google Webmaster Tools).
  • Optimize our internal and external linking strategy.
  • To avoid external duplicate content, be very careful not to plagiarize articles from other websites, not even a fragment.
  • Check that nobody is copying us, using automatic tools such as Copyscape.
  • If we detect that someone is plagiarizing us, demand that they remove the content. If necessary, we can also report the page to Google.
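As an illustration of the canonical, hreflang and 301 measures above, the following Python sketch builds the corresponding “head” elements and decides when a request should be redirected to the preferred domain. The domain "www.example.com", the sample URLs and the function names are hypothetical. In practice the redirect is usually configured on the web server (for example in “.htaccess” on Apache) rather than in application code.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # hypothetical preferred domain


def canonical_link(url):
    """Build the <link rel="canonical"> element for the page's <head>."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


def hreflang_links(alternates):
    """Build one <link rel="alternate"> element per language or region.

    `alternates` maps a language-region code (e.g. "es-ES") to its URL.
    """
    return [
        f'<link rel="alternate" hreflang="{escape(code, quote=True)}" '
        f'href="{escape(url, quote=True)}">'
        for code, url in alternates.items()
    ]


def redirect_for(url):
    """Return (301, target) if the URL is not on the preferred host, else None."""
    parts = urlsplit(url)
    if parts.netloc.lower() == CANONICAL_HOST:
        return None
    target = urlunsplit((parts.scheme, CANONICAL_HOST, parts.path, parts.query, ""))
    return 301, target


if __name__ == "__main__":
    print(canonical_link("https://www.example.com/shoes"))
    for tag in hreflang_links({
        "es-ES": "https://www.example.es/zapatos",
        "en-GB": "https://www.example.co.uk/shoes",
    }):
        print(tag)
    print(redirect_for("https://example.com/shoes?color=red"))
```

The generated tags go inside the “head” section of each page. The redirect helper only computes the status code and destination, and it is up to the server or framework to send the actual response.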