So first off, what is duplicate content?
Essentially, duplicate content is content that appears in more than one place on the Internet. But this may not be as cut and dried as it seems. Pages that are too similar, even if they aren’t identical, may be considered duplicates of one another.
When thinking about duplicate content, it’s important to remember that it’s not just about what human visitors see when they go to your site and compare two pages. It’s also about what search engines and crawlers see when they access those pages. Since they can’t see the rendered page, they typically go off of the source code of the page, and if that code is too similar, the crawler may think that it’s looking at two versions of the same page.
Imagine that you go to a bakery and there are two cupcakes in front of you that look almost identical. They don’t have any signs. How do you know which one you want? That’s what happens when a search engine encounters two pages that are too similar.
This confusion between pieces of content can lead to things like ranking issues, because search engines may not be able to figure out which page they should rank or they may rank the incorrect page. Within the Moz tools, we have a 90% threshold for duplicate content, which means that any pages with code that is at least 90% the same will be flagged as duplicates of one another.
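To make the 90% idea concrete, here is a minimal sketch of how two pages' source code could be compared against a threshold. This is an illustration only, not how the Moz tools actually compute similarity: the comparison method (Python's standard `difflib.SequenceMatcher`) and the names `similarity`, `is_duplicate`, and `DUPLICATE_THRESHOLD` are assumptions made for the example.

```python
from difflib import SequenceMatcher

# The 90% figure comes from the text above. How similarity is measured
# here is an assumption, not the actual method used by any crawler.
DUPLICATE_THRESHOLD = 0.90


def similarity(source_a: str, source_b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two page sources."""
    # autojunk=False turns off difflib's heuristic that ignores very
    # frequent characters in long inputs, so the ratio stays predictable.
    matcher = SequenceMatcher(None, source_a, source_b, autojunk=False)
    return matcher.ratio()


def is_duplicate(source_a: str, source_b: str,
                 threshold: float = DUPLICATE_THRESHOLD) -> bool:
    """Flag two pages as duplicates when their source meets the threshold."""
    return similarity(source_a, source_b) >= threshold


if __name__ == "__main__":
    # Two hypothetical pages that share a template and differ by one word.
    page_a = "<html><body><nav>Home | Shop</nav><p>Red cupcakes</p></body></html>"
    page_b = "<html><body><nav>Home | Shop</nav><p>Blue cupcakes</p></body></html>"
    print(f"similarity: {similarity(page_a, page_b):.2f}")
    print(f"flagged as duplicates: {is_duplicate(page_a, page_b)}")
```

Notice that the shared template (navigation, markup) counts toward the score just as much as the visible text does. That is why two pages with different body copy can still look nearly identical to a crawler reading the source.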