Why you should or should not use clean urls

Posted on March 7th, 2007 by Luke Visinoni

While perusing various web forums, I see the subject of clean urls come up quite often, and there seems to be quite a bit of hype and misunderstanding surrounding the topic. In this article, I’d like to nail down precisely what clean urls are and why you should or should not use them in your particular situation.

What are clean urls?

“Clean url” is a term for urls that are free of cruft. That is to say, they contain only the identifying information needed to pull dynamic content from a website. If you have a simple static web site with only html pages and no dynamic content, you needn’t worry about clean urls. They really won’t benefit you much.

An example of a clean url would be:

http://www.example.com/articles/334

As opposed to:

http://www.example.com/articles.php?session_id=5f9f095ce435a5a2c92fa4b&action=view&encoding=utf8&id=334

The former url carries only the information absolutely necessary to access article #334. The latter contains what is called a query string: a string of name and value pairs, separated from one another by ampersands and appended to the url after a question mark. Both of these urls lead to the exact same article. It should be quite obvious which one is easier to read, copy, remember, or type into the address bar.
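
To make the name and value pairs concrete, here is a minimal Python sketch that pulls them out of the two example urls using the standard library. The helper name `query_pairs` is my own invention for illustration, and the sketch says nothing about how the site behind these urls actually works.

```python
from urllib.parse import urlsplit, parse_qs


def query_pairs(url):
    """Return the query string of `url` as a dict of name -> value."""
    query = urlsplit(url).query
    # parse_qs yields a list of values per name; keep the first for brevity.
    return {name: values[0] for name, values in parse_qs(query).items()}


long_url = ("http://www.example.com/articles.php"
            "?session_id=5f9f095ce435a5a2c92fa4b"
            "&action=view&encoding=utf8&id=334")
clean_url = "http://www.example.com/articles/334"

print(query_pairs(long_url))   # four name/value pairs, only "id" identifies the article
print(query_pairs(clean_url))  # empty: the clean url has no query string at all
```

Of the four pairs in the long url, only `id` is needed to identify the article, and that is the one piece of information the clean url keeps.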

Other common names for clean urls include “neat urls”, “pretty urls”, and “short urls”.

A common misconception

Before we get into the details of clean urls, I’d like to take a moment to debunk (or at the very least clear up) a myth I’ve seen floating around the net about why you should use clean urls.

Search engines don’t understand query strings

This myth comes from the fact that search engine spiders are programmed to be wary of sites with query strings. The reason is that a query string can take a potentially infinite number of possible values, which could send a spider into a never-ending crawl. This is what is referred to as a “spider trap”.

Take this url for instance:

http://www.example.com/page.php?q=some+stuff

A spider will look at that and see: 

http://www.example.com/page.php?q={potentially_infinite}

If a spider knows that it is crawling a potentially endless supply of content, it will limit the number of links it explores, and possibly how often it comes back, resulting in an incomplete crawl.

So, this myth is more a misunderstanding than a rumor. Your site will still be crawled if it contains query strings; it just may not be crawled as thoroughly or as often.
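
The limiting behaviour described above can be illustrated with a toy sketch. To be clear, this is not how any real search engine works. The class name `CrawlBudget` and the cap of two variants per page are assumptions made up for this example. The only point is that urls differing only in their query string can be treated as one open-ended page and cut off after a few visits.

```python
from collections import Counter
from urllib.parse import urlsplit


class CrawlBudget:
    """Toy model: cap how many query-string variants of one page get crawled."""

    def __init__(self, max_variants_per_page=2):
        self.max_variants = max_variants_per_page
        self.seen = Counter()

    def should_crawl(self, url):
        parts = urlsplit(url)
        if not parts.query:
            # No query string, so nothing open-ended to guard against.
            return True
        # Group every variant of the same script under one key.
        page = (parts.netloc, parts.path)
        self.seen[page] += 1
        return self.seen[page] <= self.max_variants


budget = CrawlBudget(max_variants_per_page=2)
variants = [f"http://www.example.com/page.php?q=term{i}" for i in range(5)]
crawled = [url for url in variants if budget.should_crawl(url)]
print(len(crawled), "of", len(variants), "query-string variants crawled")
```

In this sketch a clean url such as http://www.example.com/articles/334 is never counted against the budget, which matches the idea that query strings get a less thorough crawl rather than none at all.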

Reasons you should use clean urls

There are many reasons why you should use clean urls when you are developing a website. Here, I will try to cover the most common of them.

Keywords in the url increase your search engine ranking

I have read on numerous search engine optimization blogs and websites that adding keywords to your url increases your search engine ranking substantially. I personally don’t have enough experience in the matter to give a definitive answer. I plan on putting this theory to the test, so in part two of this article (which will have to come in a few months), I’ll let you know the results.

Although I can’t give you a definitive answer about whether, or by how much, keywords in your url improve search engine rankings, I can give you a few other reasons why you might want to include them.

  1. Keywords in the url are bolded in Google’s search results.

    This just adds an extra visual mark of relevance to the user who is scanning through Google’s search results.

  2. Human readability.

    This comes into play when a user sees a link to your web page with no descriptive text alongside it. If your web page url contains keywords, you don’t need a description because it’s all right there in the url. For instance, I bet you can predict what might lie behind a url that looks like this:

    http://www.example.com/articles/290/how_to_shop_for_plane_tickets
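
A url like the one above can be generated from the article title. The following Python sketch shows one way to do it. The function names `slugify` and `article_url` are hypothetical, and the underscore separator is chosen only to match the example url, so treat it as an illustration rather than a recommendation.

```python
import re


def slugify(title, separator="_"):
    """Turn an article title into a lowercase, url-safe run of keywords."""
    # Keep only runs of letters and digits; punctuation and spaces are dropped.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return separator.join(words)


def article_url(article_id, title):
    """Build a clean url that carries both the id and the keywords."""
    return f"http://www.example.com/articles/{article_id}/{slugify(title)}"


print(article_url(290, "How to Shop for Plane Tickets"))
# http://www.example.com/articles/290/how_to_shop_for_plane_tickets
```

Because the id is still in the url, the keywords are there purely for the reader. The article can be looked up by id even if the title changes later.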

Clean urls are user-friendly

The importance of good usability and user-friendliness on the web is enormous. Most people have a hard enough time getting to the internet, so it is your responsibility as a good web developer to make it as easy as possible for them once they get there. If your website is not user-friendly, what is the point of even trying to drive search engine traffic to it? When I come across a site that is impossible to use, I leave very quickly regardless of whether it was the first result in a Google search or not, and I’d bet a pretty penny you do too.

There are a few ways that clean urls are more user-friendly than standard urls.

  1. Hackability

    If done right, clean urls are hackable and can be guessed by the user. Let’s say that your site is selling clothing and the user arrives at a url like the following:

    http://www.example.com/products/womens

    It is fairly reasonable for the user to guess that changing “womens” to “mens” will take them to a page with men’s products. It is also fairly reasonable for the user to guess that removing “womens” entirely will take them to a page listing all products, or possibly all categories.

  2. Virality

    Clean urls are just plain easier to say, remember, write down, copy, send to a friend and just about everything else a user might do with a url.
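
Hackability only works if the site actually answers at every level of the path. Here is a minimal Python sketch of that idea for the clothing example. The `CATEGORIES` table and the `resolve` function are assumptions of mine for illustration, not part of any particular framework or store.

```python
from urllib.parse import urlsplit

# Hypothetical category table for the clothing store example.
CATEGORIES = {"womens": "Women's products", "mens": "Men's products"}


def resolve(url):
    """Map a clean product url to a description of the page it should show."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if segments == ["products"]:
        # The user chopped off the category: show the level above it.
        return "All categories: " + ", ".join(sorted(CATEGORIES))
    if len(segments) == 2 and segments[0] == "products" and segments[1] in CATEGORIES:
        return CATEGORIES[segments[1]]
    return "404 Not Found"


print(resolve("http://www.example.com/products/womens"))
print(resolve("http://www.example.com/products/mens"))
print(resolve("http://www.example.com/products"))
```

Each guess a user is likely to make, swapping one segment or removing the last one, leads somewhere sensible instead of to an error page.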

Conclusion

Whether or not clean urls make a difference to search engines, they definitely make a difference to humans, and ultimately that is what really matters. Satisfying the user with quality content and a good user interface (which may or may not include clean urls) is the best way to keep them coming back as well as linking back to your website, the two things which help the most in search engine optimization.

Further reading

http://www.seobook.com/

http://www.mattcutts.com/blog/

10 Responses to “Why you should or should not use clean urls”

  1. Well said, especially the conclusion.

  2. Good Stuff. I give you an “A+”

  3. Very nice article :-)

  4. I can’t believe you didn’t tell me you had a blog!!!

  5. Nicely done. I think I’ll start using them more :-)

  9. I prefer the “http://www.domain.com/articles/my-amazing-article-title.html” usage. There really is no need for an id to reside anywhere in the URL. Is it that much more difficult to look up via “title_to_url()” than to just use an id? I am aware of the “it must be unique” over-head…but that is just a single query at time of article creation/update.

    If you’re going to play with URL rewriting, why not do it as clean as possible?

    Good article and topic.

  10. Sean – That is a very valid point. In general I try to avoid the id as well, but there are definitely circumstances where this is not possible. For instance, we use the ecommerce platform, Miva Merchant. Within this platform, in order to see a specific product, you MUST supply the product code. You can either add keywords to the end of the url, or you can make the ids of the products verbose (such as black-leather-pants instead of LP15542BK). We have been known to do both.
