Just like in the Matrix, duplicate content can be a really bad thing. If your website has it, your search engine rankings will suffer. It may have crept in accidentally, or your SEO team may have added it for some bizarre reason. Either way, it is something that should be avoided at all costs. But what if you can’t avoid it, or you already have a large amount of it? How can you get rid of it? In this blog post, we look at the best ways of dealing with duplicate content without any help from Neo…
What is Duplicate Content?
Duplicate content refers to blocks of text that appear on more than one page. Google (and other search engines) pick this up and can penalise it, because duplicated text has been used in Black Hat SEO to manipulate website rankings.
What Kind of Things Count as Duplicate Content?
There are numerous ways for duplicate content to arise, and it can appear anywhere on your website. Some of the most common places you will notice it are:
- Discussion forums (when comments are quoted)
  - When your forum creates multiple pages for the same thread, you can use a canonical tag (shown below) to point to the original post page. This will show search engines where the original content is.
- Shop items (descriptions of products)
  - Avoiding duplicate content on an e-commerce website can be tricky. Be sure not to list the same product more than once, and keep each product under a specific category, so there is one original page to go to.
- Print versions of website pages (PDFs)
  - Some websites duplicate their pages as print-friendly versions. As with discussion forums above, you can use a canonical tag to show search engines where the original content comes from.
- When someone has scraped content
  - This is a black hat technique that should be avoided as much as possible. If you must use recycled content, be sure to cite it appropriately and link to the original content.
- Copied and pasted content from another website
  - As with scraped content, be sure to use citations and a link to the original content.
- Spun content
  - Spun content is where a computer program swaps common words, such as then, how and when, for alternative words with the same meaning. This is also a black hat technique and should be avoided at all costs. Another reason to avoid it is that when you swap a lot of words for alternative phrases, the content sometimes stops making sense.
Some of these methods are a lot worse than others, but the principle still stands. To ensure that we do not get any kind of penalty from search engines, we can do the following to our websites.
Introduce Canonical Tags
A canonical tag is a small piece of code that tells search engines which page is the original source of the content. It looks a little something like this:
<link rel="canonical" href="https://www.bravr.com/" />
This is helpful when, for example, you would like to update an old blog post. You can add this snippet of code to the old blog post, with its href set to the newer page. This indicates to search engines where the newer, more informative piece of text is, and that the duplicate content is not intentional.
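If your pages are generated by a script or template, the tag can be built in code rather than typed by hand. Here is a minimal Python sketch of that idea. The function name `canonical_tag` is our own invention for illustration, not part of any SEO tool or library, and the output would still need to be placed inside the page's head section.

```python
from html import escape


def canonical_tag(original_url: str) -> str:
    """Build a canonical link element that points at the original page."""
    # escape() with quote=True makes characters such as & and " safe
    # to place inside the href attribute.
    safe_url = escape(original_url, quote=True)
    return f'<link rel="canonical" href="{safe_url}" />'


# Example: the duplicate page declares the original page as its source.
print(canonical_tag("https://www.bravr.com/"))
```

Every duplicate version of a page (a print-friendly copy, for instance) would carry the same tag, with the href always set to the one original URL.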
If you have multiple pages that are exactly the same, consider deleting the extras, especially as they are holding your website back and not contributing any quality to it.
Once you have removed these pages, be sure to place a 301 redirect from each one to the remaining page. This matters when a removed page has already been indexed, because search engines won’t necessarily know where the remaining page is.
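How you set up a 301 redirect depends on your server or CMS, so treat the following as a rough sketch rather than a recipe. It uses only the Python standard library, and the paths in `REDIRECTS` are made-up examples: swap in the pages you actually removed and the page you kept.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping: removed duplicate page -> remaining original page.
REDIRECTS = {
    "/old-duplicate-page/": "/original-page/",
    "/print/original-page/": "/original-page/",
}


def lookup_redirect(path: str):
    """Return the remaining page for a removed path, or None if there is none."""
    return REDIRECTS.get(path)


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = lookup_redirect(self.path)
        if target is not None:
            # 301 tells browsers and search engines the move is permanent.
            self.send_response(301)
            self.send_header("Location", target)
            self.end_headers()
        else:
            # This sketch serves no real pages, so everything else is a 404.
            self.send_error(404)


def serve(port: int = 8000) -> None:
    """Run the sketch locally; this call blocks until interrupted."""
    HTTPServer(("localhost", port), RedirectHandler).serve_forever()
```

Calling `serve()` and then visiting `http://localhost:8000/old-duplicate-page/` should send the browser on to `/original-page/`. On a live website you would normally configure the same mapping in your web server or CMS rather than run a script like this.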