We humans often find it frustrating to listen to people repeat themselves. Likewise, search engines are frustrated by websites that do the same. This problem is called duplicate content, defined as web content that is exactly duplicated or substantially similar, yet located at different URLs.
Indexing such content wastes a search engine’s storage, processing capacity, and computation time. As a result, if a website contains an excess of duplicate content, at least some of its pages may rank worse than they otherwise would. While the SEO community “jury” is still out on whether the various search engines apply an explicit penalty, there is broad agreement that duplicate content can be detrimental to website rankings.
It is not necessary to eliminate all duplicate content to make a website search-engine-friendly, but it is desirable to eliminate as much of it as possible. There are many sources of duplicate content, but some are common to almost any website and thus merit discussion.
Considering the ways in which duplicate content winds up on a website, it’s clear that some are an unavoidable by-product of providing a richer online experience. One of the most common in this category is the “printer friendly” page. Many websites provide two versions of every web page: one for the screen and one for printing. It is not incorrect to do so (unless you are a CSS-zealot), but all printer-friendly URLs should be excluded from the view of search engine spiders so that they are not treated as duplicate content. So, let’s examine ways to exclude URLs from the view of search engine spiders.
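As a quick illustration of what "excluded from the view of spiders" can mean in practice, here is a minimal sketch that checks URLs against a robots.txt rule set using Python's standard library. It assumes, purely for illustration, that the printer-friendly pages live under a hypothetical /print/ directory on www.example.com and that robots.txt is the exclusion method chosen; the function name is_crawlable is likewise made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: assumes printer-friendly pages live under /print/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the robots.txt content as an iterable of lines,
    # so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/article.html",
        "http://www.example.com/print/article.html",
    ):
        print(url, "crawlable" if is_crawlable(ROBOTS_TXT, url) else "excluded")
```

With these assumed rules, the screen version stays visible to spiders while its printer-friendly twin is excluded. A check like this only confirms what the rules say; whether a given spider honors them is up to that spider.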

