Protecting your users from themselves

Posted on June 4th, 2007 by Luke Visinoni

I think we all know that security on the internet, as well as anywhere else, is a big deal. As a web developer, it is your job to protect your users’ data from identity thieves and ill-intentioned e-villains. One thing you’d be wise to remember, though, is that a user’s biggest enemy is generally himself. It is all too easy to forget that not everybody takes security as seriously as you do. Most people won’t even consider security until it bites them in the arse, and when it does, guess who they’re going to blame.

But my data isn’t worth anything to the bad guys!

This is an argument I hear very often. The fact is, almost any website that stores a user’s email and password can be very desirable to identity thieves or other evil-doers. Why? Because, like I said, users aren’t especially worried about security. Many users use the same email address, username, and password for everything, whether it be something as critical as online banking or as trivial as a user account on a blog.

Let’s say you are in charge of running a company blog (sort of like this one), and your database is compromised. If you haven’t taken the proper steps to protect your users’ data, the thief now has all the time in the world to try his newly acquired list of usernames, email addresses, and passwords on more lucrative and commonly used services such as eBay and PayPal. These companies go to great lengths to protect their users from fraudsters, but in the end, the users have cut their own throats by using the same password for every account they have.

So what can I do?

There really is no quick and dirty answer to this. Security is an ever-evolving topic. The thing you need to remember is to stay on top of things. Make yourself aware of current security issues and best practices in whatever language or application you may be using. To solve the problem regarding the compromised database, you could use a hash function. What is a hash function? According to Wikipedia:

A hash function is a reproducible method of turning some kind of data into a (relatively) small number that may serve as a digital “fingerprint” of the data. The algorithm “chops and mixes” (i.e., substitutes or transposes) the data to create such fingerprints. The fingerprints are called hash sums, hash values, hash codes or simply hashes.

Basically, a cryptographic hash function takes an arbitrary amount of input and produces a fixed-length value that is computationally infeasible to reverse. It is effectively a “fingerprint” of the data provided. So, instead of storing your users’ passwords in plain text, you first run each password through a hashing function such as md5, sha1 or sha256 and then store the resulting hash.
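To make the idea concrete, here is a minimal sketch in Python (my choice of language for illustration; the article itself does not prescribe one). The function name hash_password is hypothetical, and sha256 is simply one of the algorithms named above.

```python
import hashlib


def hash_password(password: str) -> str:
    """Return the hex-encoded SHA-256 digest of a password.

    hash_password is a hypothetical helper name used for illustration.
    """
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Inputs of very different lengths produce digests of the same length.
short_digest = hash_password("abc")
long_digest = hash_password("a much longer passphrase than the first one")
print(len(short_digest), len(long_digest))
```

The value returned by hash_password is what would go into the database column instead of the password itself.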

But if all I store is a hash, how will I verify the user has entered the right password?

Well, it’s rather simple really: just hash the incoming credentials the same way you did when you stored the password in the first place. So if you stored a sha1 hash of the user’s password when they registered, create a sha1 hash of the password they enter and compare the two. If the two hashes match, bingo! You have a correct password. The best part about this is that nobody (not even administrators) needs to know the users’ passwords.
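The check described above might look something like this in Python. It is only a sketch: hash_password and verify_password are hypothetical names, the stored hash is assumed to be a hex string, and sha1 is used purely to match the example in the text. The comparison uses hmac.compare_digest from the standard library, which compares in constant time, although a plain == would give the same answer.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Hex-encoded SHA-1 digest, as stored at registration (hypothetical helper)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def verify_password(stored_hash: str, candidate: str) -> bool:
    """Hash the submitted password the same way and compare the two hashes."""
    return hmac.compare_digest(stored_hash, hash_password(candidate))


# At registration: store only the hash.
stored = hash_password("s3cret")

# At login: re-hash whatever the user typed and compare.
print(verify_password(stored, "s3cret"))
print(verify_password(stored, "S3cret"))
```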

If the hash isn’t reversible, how will I send a user their password should they forget it?

Unfortunately, you don’t. You will need to generate a new password for them if they forget it. This can be an automated process of course. It is relatively simple to generate a decent series of characters for use as a password.
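As a rough illustration of the automated step, the following Python sketch generates a random replacement password. The name generate_password, the default length of 12, and the letters-and-digits alphabet are all assumptions made for the example; the secrets module is used because it draws from a cryptographically secure source.

```python
import secrets
import string

# Assumed alphabet for illustration: upper- and lower-case letters plus digits.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 12) -> str:
    """Return a random password of the given length (hypothetical helper)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


new_password = generate_password()
print(len(new_password))
```

The new password would then be hashed and stored like any other, and sent to the user through whatever channel the site already uses.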

I hear that SHA1 and MD5 have been “compromised”?

Technically yes, the security of sha1 and md5 has been compromised by cryptography researchers and rainbow tables, although the use of a salt mitigates many of these weaknesses, rainbow tables in particular. Also, if you are concerned about the risk of using md5 or sha1, there are stronger algorithms such as sha256. These issues are beyond the scope of this article and I am hardly an expert in the field of cryptography, so I encourage you to read the following articles on the subject.
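For readers who have not met the term, a salt is a random value stored alongside each hash and mixed into the password before hashing, so that two users with the same password end up with different hashes and a precomputed table is of little use. The Python sketch below shows one simple way this could work; the function names, the 16-byte salt length, and the salt-then-password ordering are assumptions for illustration, not a recommendation from the article.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple


def hash_with_salt(password: str, salt: Optional[bytes] = None) -> Tuple[str, str]:
    """Return (salt_hex, digest_hex); a fresh random salt is made if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)  # assumed salt length
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt.hex(), digest


def verify_with_salt(password: str, salt_hex: str, stored_digest: str) -> bool:
    """Re-hash the candidate password with the stored salt and compare."""
    _, digest = hash_with_salt(password, bytes.fromhex(salt_hex))
    return hmac.compare_digest(digest, stored_digest)


# The same password hashed twice gets two different salts and digests.
salt_a, digest_a = hash_with_salt("letmein")
salt_b, digest_b = hash_with_salt("letmein")
print(digest_a != digest_b)
print(verify_with_salt("letmein", salt_a, digest_a))
```

Both the salt and the digest are stored; the salt is not a secret, it only needs to be different for each user.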

The point is…

You may not think your data is important, and therefore not worth protecting, but you’ve got an obligation to your users to protect their data. You would be surprised just how clever data thieves can be. Stay on your toes!

2 Responses to “Protecting your users from themselves”

  1. Very nice article. This should be recommended reading for all new developers.

  2. [...] A colleague of mine, Luke Visinoni, has written an article entited “Hash functions - Protecting your users from themself” in which he discusses the benefits of, and reasoning behind, hashing a users sensitive information when developing web applications. [...]
