Improving Your Clickthrough Rates With mod_rewrite


Few would disagree that the primary goal of organic search engine optimization is to attain the highest possible rankings for desirable key phrases in a given search engine. Most search engine marketing professionals also focus on creating titles and descriptions that contain the targeted key phrases and are provocative enough to encourage click-throughs. Unfortunately, fewer devote any effort to doing the same for URLs.

If one views a search engine result as real estate, and a high placement as something like owning a billboard at the intersection of Hollywood and Vine, it soon becomes clear that ignoring the URL is tantamount to wasting 10-20% of that real estate. Users look not only at the title and description of a website, but also at the URL. Too often, we associate rewriting URLs on dynamic sites (that is, changing a URL from something like “products.php?category_id=1&product_id=2” into “/Products/Electronics-1/Sony-32-Inch-Wega-2.html”) only with the goal of improving a site’s rankings. That goal is legitimate, as keywords in a URL may improve rankings a bit, and the removal of parameters may be especially helpful if a dynamic URL has many variables. But it is not the only benefit. The URL is one of only a few lines of “real estate” that one receives when a high ranking is attained. A URL like “products.php?category_id=1&product_id=2” will not encourage clicks, just as a boring, uninformative title or description will not.

It is also worth noting that URLs are much more difficult to revise than titles and descriptions once a site is launched and indexed. Thus, when designing a new site, special care should be devoted to them. Changing URLs later requires one to redirect all of the old URLs to the new ones (typically using “301” redirects), which can be extremely tedious and has the potential to influence rankings for the worse if done improperly.

This short tutorial provides code and examples in PHP in coordination with mod_rewrite to illustrate how one can create keyword-rich rewritten URLs. The code can be quickly adapted to ASP.NET and IISRewrite (http://www.qwerksoft.com/products/iisrewrite/) for those who use Microsoft-based servers.

The first step is to write some functions to create the URLs. These functions should be placed in a file called “mod_rewrite.inc.php” and included on any page that uses them, via the PHP function “require_once()”. Assuming we are dealing with a site with products placed in categories, our first function looks like this:

function _clean_url_text($string)
{
    $string = str_replace(' ', '-', $string);

    // characters to be permitted in the URL
    $acceptable_characters = 'a-zA-Z0-9-';

    // replace all characters that are NOT in the set of a-z, A-Z, 0-9, or a dash with nothing, i.e. delete them.
    $NOT_acceptable_characters_regex = "/[^$acceptable_characters]/";

    return preg_replace($NOT_acceptable_characters_regex, '', $string);
}

This function changes spaces to dashes and deletes every other character that is invalid or not aesthetically pleasing in a URL string. It retains all alphanumeric characters as they are. The function call to “preg_replace” could be replaced with simpler code in this case, but using it here will familiarize you with regular expressions, which are required for the tool used later in the process, mod_rewrite. In this case, the regular expression “[^a-zA-Z0-9-]” indicates to the regular expression engine (called by preg_replace) that all characters not (^) in the set of letters, numbers, and dashes are matches. We indicate that they should be replaced with '', the empty string, so they will be removed. All spaces have already been changed to dashes through “str_replace()”, so spaces are retained as dashes. Here are some examples of this transformation:

echo _clean_url_text("carpenter's tools"); // == 'carpenters-tools'
echo _clean_url_text('black and white'); // == 'black-and-white'
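One edge case is worth seeing in code: when a deleted character sits between two spaces, as in “black & white”, the function above leaves two dashes in a row. The sketch below is a hypothetical variant (the name “_clean_url_text_collapsed” and its two extra steps are assumptions for illustration, not part of the tutorial's code) that collapses runs of dashes and trims them from the ends.

```php
<?php
// Hypothetical variant of _clean_url_text(); the name and the two extra
// steps are assumptions for illustration, not part of the original code.
function _clean_url_text_collapsed($string)
{
    // Same two steps as the original: spaces become dashes, then anything
    // outside a-z, A-Z, 0-9 and the dash is deleted.
    $string = str_replace(' ', '-', $string);
    $string = preg_replace('/[^a-zA-Z0-9-]/', '', $string);

    // Extra step: collapse runs of dashes left behind by deleted characters.
    $string = preg_replace('/-+/', '-', $string);

    // Extra step: drop any dash at the very start or end.
    return trim($string, '-');
}

echo _clean_url_text_collapsed('black & white'), "\n";  // black-white
echo _clean_url_text_collapsed(' belt  sander '), "\n"; // belt-sander
```

Whether the extra steps are worth it depends on how messy the product and category names in the database are; for clean names the original function gives the same result.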

This function is called by another function that is used to create product URLs:

function make_product_url($category_name, $category_id, $product_name, $product_id)
{
    $clean_category_name = _clean_url_text($category_name);

    $clean_product_name = _clean_url_text($product_name);

    return '/Products/' . $clean_category_name . '-' . $category_id . '/' . $clean_product_name . '-' . $product_id . '.html';
}

Here are some examples of what this function returns:

echo make_product_url("carpenter's tools", 1, 'belt sander', 2);
// == '/Products/carpenters-tools-1/belt-sander-2.html'

echo make_product_url('hammers', 3, 'big and mighty hammer', 4);
// == '/Products/hammers-3/big-and-mighty-hammer-4.html'

The website application should be retrofitted with “make_product_url()”. Every place in the application’s code that outputs a product URL should be changed to call it. For example (assume the variables below refer to product database information):

echo "/products.php?category_id=$category_id&product_id=$product_id";

should be changed to:

echo make_product_url($category_name, $category_id, $product_name, $product_id);
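As a rough illustration of the retrofit, the sketch below builds HTML links for a list of products. The helper “render_product_link()”, the array keys, and the sample rows are all assumptions made up for this example; the two tutorial functions are repeated only so that the snippet runs on its own, whereas a real page would pull them in from “mod_rewrite.inc.php”.

```php
<?php
// Copies of the tutorial's two functions so that the sketch runs on its own;
// in the real application they would come from mod_rewrite.inc.php.
function _clean_url_text($string)
{
    $string = str_replace(' ', '-', $string);
    return preg_replace('/[^a-zA-Z0-9-]/', '', $string);
}

function make_product_url($category_name, $category_id, $product_name, $product_id)
{
    return '/Products/' . _clean_url_text($category_name) . '-' . $category_id
         . '/' . _clean_url_text($product_name) . '-' . $product_id . '.html';
}

// Hypothetical helper: builds one HTML link for a product row.
// The array keys are assumed column names, not taken from the article.
function render_product_link(array $row)
{
    $url = make_product_url(
        $row['category_name'], $row['category_id'],
        $row['product_name'], $row['product_id']
    );
    // Escape both the URL and the visible text before putting them in HTML.
    return '<a href="' . htmlspecialchars($url) . '">'
         . htmlspecialchars($row['product_name']) . '</a>';
}

// Made-up sample rows standing in for a database result set.
$rows = array(
    array('category_name' => "carpenter's tools", 'category_id' => 1,
          'product_name'  => 'belt sander',       'product_id'  => 2),
    array('category_name' => 'hammers',           'category_id' => 3,
          'product_name'  => 'big and mighty hammer', 'product_id' => 4),
);

foreach ($rows as $row) {
    echo render_product_link($row), "\n";
}
```

Routing every link through one helper like this keeps the URL format in a single place, which matters given how costly URLs are to change once a site is indexed.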

The next step is to instruct the web server, Apache, to map the new URLs to a script file on the site. The script file mappings will be similar to the URLs in the code that were just replaced with “make_product_url()”. This is where mod_rewrite comes in.

We must create or edit a file named “.htaccess” in the root directory of our site. This file is for Apache, not PHP, and does not need to be included anywhere using “require_once()”. The first character, “.”, indicates it is a hidden file and must be included in the file name. First, check whether the file already exists. If it does not, create it; in either case, add the following lines:

RewriteEngine On

RewriteRule ^Products/.*-([0-9]+)/.*-([0-9]+)\.html /products.php?category_id=$1&product_id=$2

The first line simply turns on the rewriting engine. This process incurs some minor overhead, so Apache requires that we explicitly turn it on.

The second line, for those less familiar with regular expressions, requires explanation. It checks if any incoming URL matches a regular expression pattern, and if so, it delegates further processing to a script.

The first character, “^”, means that it should only be a match if the next character, “P”, is at the beginning of the URL. This is a different use of “^” than before, where it meant “not.” Here, it is not found within brackets “[]”, as previously, and is technically a completely different operator with a different meaning. This operator makes the rule more specific, as a URL may contain another URL as a substring, in which case it should not match.
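The two roles of “^” can be checked directly in PHP, since “preg_match()” and “preg_replace()” use the same style of regular expression. This is only a small experiment; the variable names and the sample strings are made up for illustration.

```php
<?php
// Outside brackets, ^ anchors the match to the start of the string.
$anchored_at_start = preg_match('#^Products/#', 'Products/hammers-3/');     // 1
$anchored_inside   = preg_match('#^Products/#', 'Old/Products/hammers-3/'); // 0

// Without the anchor, the same text is found anywhere in the string.
$unanchored_inside = preg_match('#Products/#', 'Old/Products/hammers-3/');  // 1

// Inside brackets, ^ negates the set: "any character that is not a digit".
$digits_only = preg_replace('/[^0-9]/', '', 'hammers-3');                   // '3'

echo $anchored_at_start, ' ', $anchored_inside, ' ', $unanchored_inside, ' ', $digits_only, "\n";
// prints: 1 0 1 3
```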

The characters “Products/” assert that the next characters in the URL string match those characters exactly, including the capital “P”. The next characters, “.*”, match any run of characters of any length, including none. Then, “-” asserts that a dash should be present next.

The characters “([0-9]+)” are the crux of this process. The parentheses indicate that the regular expression engine should store whatever the enclosed pattern matches in a variable called $1. The enclosed pattern, “[0-9]+”, matches a run of one or more digits (the “+” requires at least one, whereas “*” would also allow none).

The “/” character asserts that the next character should be a slash. The next part of the pattern is handled essentially the same way as the previous, except that the result of “([0-9]+)” will be stored in variable $2 this time.

By now the pattern should become clear. The final characters, “\.html”, mean that the matched portion of the URL should finish with “.html”. The “\” indicates that “.” is a literal dot, whereas “.” alone means “any character”, as used previously in the “.*” expressions.
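The pattern can be tried out in PHP before it goes anywhere near a live server. The sketch below wraps it in “#” delimiters for “preg_match()”; the helper “match_product_path()” is a hypothetical name introduced for this experiment. It assumes the rule lives in the site root’s “.htaccess”, where Apache matches the pattern against the path without its leading slash.

```php
<?php
// The pattern from the RewriteRule, wrapped in # delimiters for PHP.
$pattern = '#^Products/.*-([0-9]+)/.*-([0-9]+)\.html#';

// Hypothetical helper for experimenting with the pattern outside Apache.
function match_product_path($pattern, $path)
{
    if (preg_match($pattern, $path, $m)) {
        // $m[1] and $m[2] play the role of mod_rewrite's $1 and $2.
        return array('category_id' => $m[1], 'product_id' => $m[2]);
    }
    return null; // no match: the rule would not apply to this URL
}

// Matches: category_id is '1', product_id is '2'.
print_r(match_product_path($pattern, 'Products/carpenters-tools-1/belt-sander-2.html'));

// No match: the pattern is case-sensitive, so a lowercase "p" fails.
var_dump(match_product_path($pattern, 'products/hammers-3/big-hammer-4.html'));

// No match: the category segment has no trailing "-number".
var_dump(match_product_path($pattern, 'Products/hammers/big-hammer-4.html'));
```

Note that the captured values are the numeric IDs only; the keyword text in the URL is matched by “.*” and then ignored.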

The next part of the line, “/products.php?category_id=$1&product_id=$2”, plugs the variables that we stored into our script URL mapping, and indicates to the web server that requests by a browser for URLs matching that previous pattern should be routed to that particular script with those numbers substituted for $1 and $2.
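On the receiving end, “products.php” sees the substituted values as ordinary query-string parameters in “$_GET”. The sketch below shows one cautious way to read them; the helper “read_product_ids()” is a hypothetical name, and the simulated array stands in for “$_GET” so that the snippet can run from the command line.

```php
<?php
// Hypothetical helper for the top of products.php: pulls the two IDs out of
// the query string and rejects anything that is not a plain string of digits.
function read_product_ids(array $query)
{
    $ids = array();
    foreach (array('category_id', 'product_id') as $key) {
        // \z anchors at the very end, so a trailing newline is rejected too.
        if (!isset($query[$key]) || !is_string($query[$key])
            || !preg_match('/^[0-9]+\z/', $query[$key])) {
            return null; // missing or malformed parameter
        }
        $ids[$key] = (int) $query[$key];
    }
    return $ids;
}

// After the rewrite, a request for
// /Products/hammers-3/big-and-mighty-hammer-4.html
// reaches the script with $_GET looking like this:
$simulated_get = array('category_id' => '3', 'product_id' => '4');

print_r(read_product_ids($simulated_get));
```

The check is still worthwhile even though the rewrite rule only captures digits, because the old “products.php?category_id=...” address remains directly reachable.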

The end result of our labors is a URL that is more enticing and also potentially more search-engine-optimized. The same process can be used to rewrite product-listing pages or any other page on the site that contains dynamic parameters. One thing to note, however, is that rewriting URLs does not solve any duplicate content problems, nor does it address the topic of URL migration. These topics will be discussed in future articles.

About the Author

Jaimie Sirovich is a search marketing consultant. Officially he is a computer programmer, but he claims to enjoy marketing much more. At present, Jaimie is focused on helping clients sell everywhere, and achieve multi-channel integration with major websites such as eBay, Amazon, and even Craigslist. He is the author of Search Engine Optimization With PHP. Jaimie can also be reached directly at jaimie@seoegghead.com.
