Protecting Your Users’ Passwords – Part 2

Last week I showed you how NOT to store your users’ passwords in your database: the biggest sin of all is storing them as plaintext and the ‘false sense of security’ solution is to apply a hashing algorithm to them.

We saw that we can use a common hashing algorithm (the algorithm I used is called MD5: http://en.wikipedia.org/wiki/MD5) to turn “donkey” into “9443b0fceb8c03b6a514a706ea69df0b” and I told you that there’s no programmatic way to turn that back into “donkey” – the hashing algorithm is one-way. However, if you did last week’s homework and pasted that hash into a search engine, you’ll have found that you got many results. Why?
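If you’d like to reproduce that hash yourself, here is a minimal sketch using Python’s standard hashlib module. Python is my choice for illustration only, and the helper name md5_hex is made up for this example.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

digest = md5_hex("donkey")
print(digest)       # compare this with the hash quoted above
print(len(digest))  # an MD5 hex digest is always 32 characters long

# The same input always gives the same hash, which is what makes
# hashes searchable in the first place.
print(md5_hex("donkey") == digest)
```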

A little history: when the early password-hashing schemes were created, they were designed to be computationally “expensive” on the hardware of the day. That means they took a lot of processor power (and hence time) to calculate. This was deliberate – a user only has to log in occasionally, so it didn’t matter if it took 2 or 3 seconds to check their password. The excellent side effect of this delay was that it prevented a hacker from guessing your password by brute force. Even assuming you’d been silly and used a dictionary word as your password, a hacker couldn’t break into your account by trying every word in the dictionary, as they’d be there for a very long time. A quick calculation with my machine’s dictionary says that, at 3 seconds per attempt, it would take 3.4 days to try every dictionary word. Unfortunately for hashing algorithms, computers have become very much faster in the last 20 years – even my little laptop can generate a hash in 0.04 seconds. Suddenly the time to run through the entire dictionary has shrunk to about an hour and our apparent security has vanished.
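The arithmetic behind those figures can be checked with a few lines of Python. The dictionary size is not stated above, so the sketch works backwards from the 3.4 days to an implied word count (an inference, roughly 98,000 words) and then reapplies it at 0.04 seconds per hash.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the post.
old_seconds_per_hash = 3.0
old_total_days = 3.4
new_seconds_per_hash = 0.04

# Implied dictionary size: an inference from the figures above,
# not a number taken from the post.
implied_words = old_total_days * SECONDS_PER_DAY / old_seconds_per_hash

# Time to hash the same dictionary on a modern laptop.
new_total_minutes = implied_words * new_seconds_per_hash / 60

print(f"Implied dictionary size: about {implied_words:,.0f} words")
print(f"Time at 0.04 s per hash: about {new_total_minutes:.0f} minutes")
```

That comes out at a little over an hour, which matches the estimate above.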

Things get even worse though. If you have a dictionary word as your password and I have access to a hash of it, I can tell you your password in just 5 seconds. I paste the hash into a search engine – one click on “search” and I have your password. What’s happened is that hackers have done all the hard work up front – they’ve already run entire dictionaries through the common hashing algorithms and posted the lists of words and hashes on the internet, where search engines have found and indexed them. So although it’s technically true that we can’t take a hash value and “unhash” it, hackers do have access to functionality that performs a similar job – for single words.
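To make the “hard work up front” idea concrete, here is a small Python sketch of a precomputed lookup table. The five-word list stands in for a real dictionary file, and the names WORDS, LOOKUP and crack are hypothetical, chosen just for this illustration.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# A tiny stand-in for a real dictionary file (hypothetical word list).
WORDS = ["aardvark", "donkey", "password", "vote", "zebra"]

# The up-front work: hash every word once and index the words by hash.
LOOKUP = {md5_hex(word): word for word in WORDS}

def crack(hash_hex: str):
    """Return the word that produced hash_hex, or None if it is not in the table."""
    return LOOKUP.get(hash_hex)

stolen_hash = md5_hex("donkey")        # pretend this came from a leaked database
print(crack(stolen_hash))              # donkey
print(crack(md5_hex("aardvark50")))    # None: not in our tiny table
```

Nothing is being “unhashed” here. The table simply remembers which word produced which hash, which is all a search engine index of hash lists is doing too.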

“OK”, I hear you say, “but I’d never be stupid enough to just use a plain dictionary word as my password – I’ll put a number on the end of it”. Right then… that might help, but it might not… 8339e38c61175dbd07846ad70dc226b2 and 2484b2d1aec71de2ca87f88af401a6af are hashes of dictionary words with numbers on the end and both are indexed by Google (vote1234 and password99 in case you can’t be bothered checking). Although if your password is “aardvark50” then you’re safe as its hash 0913c211b2eaa2a8b3b11fe53bdf9b4f doesn’t appear on the internet (until now of course because Google will index this blog post and your secret will soon be out!).

So how should we, as programmers, prevent our users’ passwords being cracked like this? The answer is surprisingly simple. We concatenate the password with some other information before we hash it.

The best approach is two-pronged. Firstly, we concatenate the password with a fixed nonsense string, e.g. “78g^&FB%V^&I” – this ensures that, however simple a password the user has entered, we’ve created something that’s pretty much guaranteed never to have existed as a string before in the history of the Internet. Secondly, we also concatenate it with a piece of information that’s specific to that user on our site, e.g. their username. This is just icing on the cake to make sure that the hashing is different for each user – so if two users choose the same password, their hashes will still be different. The procedure is the same as before: we apply this “super-hash” to the password that the user initially sets before we store it in our database, and we apply the same “super-hash” to the user’s password attempt before we check it against the database entry.

So now, if user “smith” sets their password as “donkey”, the hash that we’re storing is the hash of “smithdonkey78g^&FB%V^&I”. Good luck finding an online hash dictionary that contains THAT!
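Here is a minimal Python sketch of that “super-hash”, assuming the username + password + fixed string order used in the example above. The function names super_hash and check_password are hypothetical, and MD5 is used only to stay consistent with the rest of this post; any other hashlib algorithm could be dropped in. (The per-user part of this trick is commonly called a “salt”.)

```python
import hashlib
import hmac

# The fixed nonsense string from the post.
FIXED_STRING = "78g^&FB%V^&I"

def super_hash(username: str, password: str) -> str:
    """Hash username + password + fixed string, in the order used in the example above."""
    combined = username + password + FIXED_STRING
    return hashlib.md5(combined.encode("utf-8")).hexdigest()

def check_password(username: str, attempt: str, stored_hash: str) -> bool:
    """Apply the same super-hash to the attempt and compare it with the stored value."""
    return hmac.compare_digest(super_hash(username, attempt), stored_hash)

stored = super_hash("smith", "donkey")             # what goes in the database
print(check_password("smith", "donkey", stored))   # True
print(check_password("smith", "monkey", stored))   # False
print(super_hash("jones", "donkey") == stored)     # False: same password, different user
```

The comparison uses hmac.compare_digest rather than == so that the check takes the same time whether or not the hashes match; that is an extra precaution of mine rather than something the approach above requires.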

Incidentally, my previous post is currently the second result on Google for “9443b0fceb8c03b6a514a706ea69df0b” (the hash of “donkey”), and I’ve actually had incoming traffic from that search term, so we KNOW that people really are using search engines to crack hashed passwords like this. Consider yourself warned and make your code secure.
