6 Tips To Improve Crawling And Boost SEO


Search engines frequently crawl website pages to determine which ones to index in their search listings. Search engine crawlers, also known as robots or spiders, find, download, and store the pages they consider important, such as a site's homepage. Pages the search engine deems irrelevant may not be downloaded at all.

After crawling a page, the search engine analyzes it to determine whether it is significant enough to index. As time progresses, the search engine re-requests previously downloaded pages to check for content updates, allocating a certain amount of crawl bandwidth for these periodic reviews based on each page's perceived relevance. Every download consumes bandwidth, and once a website's allocation is exhausted, no more pages are crawled until the next review.

Since allotted crawl bandwidth is limited, it is crucial to direct the crawlers to the content you want included in the search engine's index and to keep them away from unnecessary or duplicate content.

Here are 6 tips to enhance SEO by improving crawling:

1.  Steer Crawlers in the Right Direction

You can help crawlers find and focus on the content you want by providing a sitemap in either XML or HTML format. The sitemap should always list your most important content. To keep search engines away from content you do not want crawled, use a robots.txt file, or add rel="nofollow" to links to pages you do not want indexed, such as the Login, Terms of Use, and Privacy Policy pages. A robots.txt file can also block crawling of internal site-search result pages.
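As a quick illustration, a minimal robots.txt might look like the sketch below. The paths shown are hypothetical placeholders; substitute your own site's structure:

    # Hypothetical robots.txt -- adjust paths to match your own site
    User-agent: *
    Disallow: /login/
    Disallow: /terms-of-use/
    Disallow: /privacy-policy/
    # Keep internal site-search result pages out of the crawl
    Disallow: /search/

    # Point crawlers at your XML sitemap
    Sitemap: https://www.example.com/sitemap.xml

The Sitemap directive is supported by the major search engines and saves crawlers from having to discover the sitemap on their own.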

Avoid 302 (temporary) redirects; use permanent 301 redirects instead. For pages that no longer exist, either return a single proper 404 page or, better yet, 301 redirect the old addresses to related content on the site. Limit client-side generation of links to other pages, and avoid links implemented in JavaScript, AJAX, or Flash without an HTML equivalent on the page.
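For example, on an Apache web server the redirects could be set up in an .htaccess file along these lines (the paths here are hypothetical):

    # Permanently redirect a retired page to its closest replacement
    Redirect 301 /old-services.html /services/
    # Consolidate an entire removed section onto one related page
    RedirectMatch 301 ^/archive/.* /blog/

Other servers (nginx, IIS) have equivalent directives; the key point is the 301 status code, which tells crawlers to transfer the old URL's standing to the new one.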

2.  Increase Page Importance

Crawlers begin with the pages they deem important and return to those pages most often. To increase a page's importance, reduce the number of clicks from the homepage needed to reach it, even when the content sits deep within your site. Also increase the number of internal and external links pointing directly at the page, and avoid using nofollow on internal links to important content.
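In HTML terms, the difference is simply whether an internal link carries the nofollow hint. A hypothetical example:

    <!-- Good: a plain internal link the crawler can follow and credit -->
    <a href="/guides/site-crawling/">Site Crawling Guide</a>

    <!-- Avoid on important internal content: nofollow discourages crawling -->
    <a href="/guides/site-crawling/" rel="nofollow">Site Crawling Guide</a>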

3.  Increase Pages Crawled per Session

Maximize the number of pages crawled each time a search engine reviews your content by minimizing bloat. Reduce the kilobyte size of your pages by stripping unnecessary whitespace from the HTML, and move shared styles and scripts into common external CSS and JavaScript files so the search engine does not repeatedly download the same code embedded in every page's HTML.
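The before-and-after below sketches the idea; the file names are hypothetical:

    <!-- Before: the same styles and scripts embedded in every page -->
    <head>
      <style>/* hundreds of lines repeated on each page */</style>
      <script>/* and hundreds more */</script>
    </head>

    <!-- After: shared external files, fetched once instead of per page -->
    <head>
      <link rel="stylesheet" href="/css/site.css">
      <script src="/js/site.js"></script>
    </head>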

4.  Avoid Duplicate Content

Multiple pages on your website with exactly the same content will not improve your search results; they will, however, waste your site's crawl bandwidth. In many cases, complete copies of a website exist under different domain names. Only one version of your domain should appear in the search index, so 301 redirect all other domains to the appropriate one. If duplicate pages exist on your main domain, 301 redirect requests to the relevant original page, or add a canonical link element (the "canonical tag") to the duplicate pages to indicate the original source. Finally, replace any session variables in URLs with cookies for tracking users, as session IDs in URLs often cause search engines to crawl duplicate content.
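The canonical hint is a single line in the head of each duplicate page (the URL here is a placeholder):

    <!-- On the duplicate page, identify the original version of the content -->
    <link rel="canonical" href="https://www.example.com/original-page/">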

5.  On-Page Factors

As a general rule of thumb, all content displayed to the user should be contained in the HTML of the requested page. Do not load content with frames or iframes, which can cause stray page fragments to end up in the search index. Avoid AJAX and JavaScript for loading links or content, since they make it hard for search engines to find the links. And for pages that belong in the index, make sure they never carry a META tag instructing the search engine not to index them.
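The tag to watch for is the robots META tag; a page carrying the line below will be kept out of the index:

    <!-- Keeps this page out of the search index -- never ship this
         on a page you want to rank -->
    <meta name="robots" content="noindex">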

6.  Detect and Avoid Crawler Problems

Register your website with Google's webmaster tools. It's quick, and the reporting will identify the crawling problems the search engine encounters. Use this information to implement changes, then check back to see whether Google is still encountering the issues.

Stay away from spider traps: internal-link black holes on dynamic pages that generate countless URLs by endlessly adding parameters or subdirectories, sending crawlers into infinite loops. Likewise, avoid elements such as calendars whose forward and backward links continue indefinitely.
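If a trap cannot be removed, a robots.txt rule can at least keep crawlers out of it. A hypothetical example (note that wildcard patterns are an extension honored by the major engines, not part of the original robots.txt standard):

    User-agent: *
    # Calendar pages link forward and backward forever
    Disallow: /calendar/
    # Parameter permutations that spawn endless URL variants
    Disallow: /*?sort=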

Concluding Remarks

To sum up, finding and resolving crawler issues can significantly increase search traffic to your site. Depending on your content management system, fixing these problems may take some effort, but it will pay off in the long run.

About the Author

John Fairley is the Director of Digital Services at Walker Sands Communications, a full-service marketing and public relations firm focused on delivering growth for business-to-business clients.
