A Duplicate Content Problem You May Not Know You Have

Duplicate content can be found in many guises and wreak havoc on a website. Some of it is easy to identify, such as publishing identical product descriptions across a number of pages (for example when items differ only in color), using session IDs as part of your shopping cart software, or having two different domains with exactly the same pages (for example a .co.uk site for a UK market and a .com site for a US market, with no discernible difference between them). However, one duplicate content cause that may have slipped under the radar is the use of ‘printer friendly’ buttons on regular website pages.

A printer-friendly option is a boon to most sites as it allows users to print off content to read later or store on file without the accompanying images and info that can cause a one-page article to eat up toner and whizz through printer paper unnecessarily. By offering a more concise version of the page without ads and blinking images, you make it easier for your reader to enjoy the material in a way that suits them. Many still feel more comfortable with a hard copy document rather than reading lengthy text passages on a monitor that can cause eye-strain, or that can only be read when a PC is available.

However, while this simpler version of the page is a plus point for the user, it can be a Trojan horse for the website owner by leading to duplicate content issues in the eyes of the search engines. A printer-friendly page will be almost identical to the original screen version but may have an extra /print/ folder in the URL or otherwise reference print in the file name. So www.mywebsite.com/news1.html would be the full version of the page with navigation systems and images, and www.mywebsite.com/print/news1.html would be the same page minus images and navigation. To a search engine spider, these two pages are effectively identical, which poses a problem when it is attempting to sort and filter web information to provide a useful cross-section of information to the user.

If a search engine finds two pages that are strikingly similar, it may be forced to choose which one to include in its index and which one to ignore. This becomes a problem when printer-friendly pages are thrown into the equation because you may end up with your simple, pared-down printer page being indexed and your main page not. All search engine traffic would then be directed to the non-branded page without navigation or adverts, leaving visitors stuck in a single corner of the site with no means of venturing out to your other, more lucrative pages.

If your site is design heavy but you also publish a lot of news and otherwise useful information, it is advisable to include a printer-friendly button on each page, simply to ensure your site is tailored to the needs of all your visitors. However, certain steps must be taken to prevent the search engines seeing these simplified pages as a duplicate version of your main page. To do this:

1.  Include a standard robot-barring instruction: a robots meta tag, pasted into the source of each printer-friendly page, that tells search engines not to index it.

2.  Amend your robots.txt file to disallow access to any page within a /print directory, for example with the line Disallow: /print/ under User-agent: *. All printer-friendly pages would then need to be created in a /print directory for the instruction to be effective.

3.  When you place your ‘printer friendly version’ button on the main page, make the link a nofollow link. You can do this by including rel="nofollow" after the start of the link code. For example: <a rel="nofollow" href="http://www.mywebsite.com/print/news1.html">
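
The snippet mentioned in step 1 does not appear on this page, so what follows is a hedged reconstruction rather than the author's original code. It assumes the commonly used robots meta tag (the exact content values are an assumption) and reuses the article's example URL for the nofollow link in step 3; the http:// scheme and the link text are illustrative.

```html
<!-- Printer-friendly page (e.g. /print/news1.html): place inside <head>. -->
<!-- Assumed values: asks compliant crawlers not to index this page -->
<!-- and not to follow the links on it. -->
<meta name="robots" content="noindex, nofollow">

<!-- Main page (e.g. /news1.html): link to the print version, marked nofollow. -->
<!-- The link text is illustrative. -->
<a rel="nofollow" href="http://www.mywebsite.com/print/news1.html">Printer-friendly version</a>
```

One point to keep in mind: a crawler that obeys a robots.txt block on /print/ never fetches those pages, so it would not see the meta tag on them. In practice the meta tag and the robots.txt rule work more as alternatives than as layers.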

Follow these simple steps and you will be able to please all sides — the search engines whose indexing you crave and the visitors whose attention you need!

About the Author

Rebecca is the managing director of search engine optimization agency Dakota Digital, a full-service agency offering SEO, online PR, web copywriting, media relationship management, and social media strategy. Rebecca works directly with each client to increase online visibility, brand profile, and search engine rankings. She has headed a number of international campaigns for large brands.

5 Comments

  1. Nice overview, Rebecca! Another avenue you could take would be to ensure that your print pages have a rel="canonical" link pointing back to the non-print version of the page.

  2. Interesting article. It's got me thinking of my own websites and why I'm not ranking for one of them. I use testimonials on the bottom of my sites. And I include all of them on each site. You think something as simple as that could be the reason why Google is not giving me love? Is it better to put different ones on different sites?

  3. Personally, I like setting up a separate CSS to kick in when the media is print rather than screen. It strips out much of the formatting, navigation and images, leaving you with a very clean, easily printed page, and there's no duplicate content issue, because it's the same page at the same URL. It just looks different.

  4. Lawks! That is the exact opposite of what I would recommend. Yes, the dup-content problem exists, but don't try to block Google's access to the pages as you are then missing a HUGE internal linking boost. At most, put a noindex meta tag in the printer-friendly page, but ensure that Google (etc.) finds the page and follows links to/from it. Having created the pages, make sure the headline is a link back to the primary page and scatter a few suitable in-text links to other pages on your site. With good CSS, these links won't show up when printed, but will substantially boost your internal link juice.

  5. Hi Rebecca! Are you sure that having 2 different domains with exactly the same pages is a duplicate content issue? (for example a .co.uk site for a UK market and .com site for a US market, neither of which have any discernible difference). I've read at SEOmoz that it wasn't a problem now...