In this article, I’ll go over how websites keep your passwords safe from malicious hackers.
First, let’s understand what happens when a hacker breaks into a website’s password database. There are two ways the hack could take place:
1 – Best-case scenario (for the hacker): The hacker gets access to the website’s database and sees every password in plain text, with the username it belongs to right next to it. Now the hacker has all the usernames and passwords for the entire site. But it gets worse! Users often re-use the same password across multiple websites. The hacker tries all the user logins on a different website and strikes gold on finding out that bezos39 uses the same password on American Express’ website. The hacker lives happily ever after. (Well, until the FBI shows up…)
2 – Worst-case scenario (for the hacker): The hacker gets access to the website’s database and looks for the passwords, except every entry is a very long number… The hacker is disappointed to figure out that these long values are hashes.
Clearly, storing hashes in the database protects the user’s data better than storing the passwords as plain text.
So, what are hashes and how do they protect the website’s database from hackers? A hash is just a number, which is calculated by a hash function. But what is a hash function, you ask? A hash function takes an input (in this case a password from a login screen) and runs it through an algorithm that calculates a number specific to that input. For example, if your password was “password1234%”, it would always be something like “323290432” after being hashed.
Hash(password1234%) = 323290432
Every time I use the same hash function and the same input, I will always get the same return value.
Hash(password1234%) = 323290432 – it’s still the same value!
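To see this "same input, same output" behavior for yourself, here is a minimal Python sketch. The article's Hash function is not a specific algorithm, so this assumes SHA-256 from Python's standard hashlib module as a stand-in, and hash_password is a made-up helper name. Real hashes are usually shown as long hexadecimal strings rather than short numbers like "323290432".

```python
import hashlib


def hash_password(password: str) -> str:
    """Return the SHA-256 digest of the password as a hex string.

    SHA-256 is an assumption here, standing in for the article's Hash().
    """
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


first = hash_password("password1234%")
second = hash_password("password1234%")
other = hash_password("password1234")  # one character shorter

print(first == second)  # True: same input, same hash
print(first == other)   # False: different input, different hash
print(len(first))       # 64: a SHA-256 digest is 64 hex characters
```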
Let’s recall our worst-case scenario: the hacker got access to the website’s database and found long number values. Now, let’s imagine a case where the hacker doesn’t give up after that. The hacker copies your password’s hash “323290432” and pastes it in the website’s password field. Since he also saw your username, he carefully types it into the username field. The hacker is disappointed when he gets a prompt saying the password is incorrect.
So, why didn’t the hacker’s plan work? Well, let’s think about what happened when the hacker pasted “323290432” into the website’s password field. The website plugs “323290432” into its hash function (since the program has no reason to believe that “323290432” is something other than a password) and gets a long number back:
Hash(323290432) = 943028424889
The website then finds the record for the username that was entered and compares this new number with the hash stored for that user.
For each user in database:
    If user.username isEqualTo enteredUsername:
        If user.hash isEqualTo “943028424889”:
            logUserIn()
        Else:
            promptUser(“Incorrect Password”)
        Stop
promptUser(“Username not found”)
The hash stored for your account is “323290432”, not “943028424889”, so the website shows the “Incorrect Password” prompt.
Side note: Not all hash functions use the same algorithm, and the hash “323290432” was just an example.
Summary: The idea is to store the hash of the password, instead of the password itself. Then when a user attempts to log in, the site will hash the password that the user entered, and check whether it matches the hash stored for that username.
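The store-then-compare idea can be sketched in a few lines of Python. This is only an illustration under assumptions: SHA-256 stands in for the article's Hash, a plain dictionary stands in for the website's database, and register and login are hypothetical helper names.

```python
import hashlib


def hash_password(password: str) -> str:
    # SHA-256 is an assumed stand-in for the article's Hash().
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical in-memory "database": username -> stored hash.
database = {}


def register(username: str, password: str) -> None:
    # Only the hash is stored, never the password itself.
    database[username] = hash_password(password)


def login(username: str, password: str) -> str:
    stored_hash = database.get(username)
    if stored_hash is None:
        return "Username not found"
    # Whatever was typed gets hashed before the comparison.
    if hash_password(password) != stored_hash:
        return "Incorrect Password"
    return "Welcome"


register("bezos39", "password1234%")
stolen_hash = database["bezos39"]  # what the hacker copied from the database

print(login("bezos39", "password1234%"))  # Welcome
print(login("bezos39", stolen_hash))      # Incorrect Password
print(login("nobody", "password1234%"))   # Username not found
```

Pasting the stolen hash fails because the site hashes it a second time, and the hash of a hash does not match what is stored.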
So far, we’ve seen that a website (or a mobile app) can make its password database more secure by using hashes. This is because if a hacker breaks in, he just gets the hashes of the passwords, not the passwords themselves.
However, even this is not a completely foolproof strategy.
Let’s say you and your imaginary friend, Bertrand, have the same password: “password1234%”. It looks like a reasonably strong password, but hashing it exposes a flaw. When we hash that password, we get the same value for you and Bertrand, since it’s the same password.
You: Hash(password1234%) = 323290432
Bertrand: Hash(password1234%) = 323290432
That doesn’t sound like an issue, but if we take a second to think about it, the problem becomes apparent. If you and Bertrand have the same hash, anybody who looks at the hashes can tell that you have the same password, so cracking one account cracks both. It gets worse because people tend to use common passwords like “password” a lot. If, say, 10% of the hashes are identical, the hacker can reasonably guess that those users picked “password” or “123456789” or one of the other super common passwords that people use.
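Here is a small Python sketch of how little work it takes to spot shared passwords in a table of unsalted hashes. The usernames and passwords are made up for illustration, and SHA-256 is again an assumed stand-in for the article's Hash.

```python
import hashlib
from collections import defaultdict


def hash_password(password: str) -> str:
    # SHA-256 is an assumed stand-in for the article's Hash().
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical leaked table: username -> unsalted hash.
leaked = {
    "you": hash_password("password1234%"),
    "bertrand": hash_password("password1234%"),
    "alice": hash_password("123456789"),
    "carol": hash_password("correct horse"),
    "dave": hash_password("123456789"),
}

# Group usernames by hash; any group larger than one shares a password.
users_by_hash = defaultdict(list)
for username, pw_hash in leaked.items():
    users_by_hash[pw_hash].append(username)

shared = sorted(sorted(users) for users in users_by_hash.values() if len(users) > 1)
print(shared)  # [['alice', 'dave'], ['bertrand', 'you']]
```

The hacker never had to reverse a single hash to learn which accounts fall together.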
To make it more secure, we can add a salt! But once again, I’m sure you’re wondering, what exactly is a salt? A salt is a random value created to make hashes unique. If my salt is “h32L38230__39”, then I can either prepend it (add it before) or append it (add it after) to the password, and that gives me a new, longer input for the hash function.
So, if I prepended that salt to your password for example, I would get “h32L38230__39password1234%”. I could then use a different salt and prepend that new, different salt to Bertrand’s password, which would get me something like “8j43i_4aennm#password1234%”. After passing it through my hash function, the output hashes would be unique thanks to the salt added to the original password.
User | Salt | Hash |
You | h32L38230__39 | 345978352 |
Bertrand | 8j43i_4aennm# | 130873302 |
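A rough Python sketch of the same idea: each user gets a fresh random salt, the salt is prepended to the password, and the result is hashed. The assumptions here are SHA-256 as the hash, secrets.token_hex as the salt generator, and the helper names hash_with_salt and new_record, none of which come from the article. Real systems usually go further and use a deliberately slow password-hashing function, but the salting step looks much the same.

```python
import hashlib
import secrets


def hash_with_salt(salt: str, password: str) -> str:
    """Prepend the salt to the password, then hash the combined string."""
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


def new_record(password: str) -> dict:
    # 16 random bytes, written as 32 hex characters, generated per user.
    salt = secrets.token_hex(16)
    return {"salt": salt, "hash": hash_with_salt(salt, password)}


you = new_record("password1234%")
bertrand = new_record("password1234%")

# Same password, different salts, so the stored hashes no longer match.
print(you["salt"] != bertrand["salt"])
print(you["hash"] != bertrand["hash"])
```

Both lines should print True, since two independently generated 16-byte salts are overwhelmingly unlikely to collide.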
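Keep in mind that not every salt is “h32L38230__39” or “8j43i_4aennm#”; those are just examples. A salt is generated randomly for each user and kept in the database next to that user’s hash, so the website can use it again at login. Salts are chosen to make it harder for a hacker to figure out a user’s password. But how?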
One way that hackers try to recover passwords from hashes is by putting together a massive list of commonly used words and passwords. For example, the world’s most used password is “123456”. With that in mind, a hacker would add “123456” to the list. The hacker can then pre-compute the hashes for each of these passwords using a bunch of common hashing algorithms. Then, on seeing a hash in the database like “234729888”, the hacker can check the table for that exact hash. If it’s there, the hacker knows which password and algorithm may have generated that hash.
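The pre-computed table described above can be sketched like this in Python. The password list is a tiny made-up sample, and the three algorithms (SHA-1, SHA-256, SHA-512 from hashlib) are just an assumed selection of "common hashing algorithms".

```python
import hashlib

# Tiny illustrative sample; a real list would hold millions of entries.
common_passwords = ["123456", "password", "123456789", "qwerty"]
algorithms = ["sha1", "sha256", "sha512"]

# Pre-compute once: hash -> (password, algorithm that produced it).
lookup_table = {
    hashlib.new(algo, pw.encode("utf-8")).hexdigest(): (pw, algo)
    for algo in algorithms
    for pw in common_passwords
}

# A hash found in a hypothetical stolen database of unsalted hashes.
stolen_hash = hashlib.sha1(b"123456").hexdigest()
print(lookup_table.get(stolen_hash))  # ('123456', 'sha1')

# A password that is not on the list cannot be looked up this way.
unknown_hash = hashlib.sha256(b"N0t-in-the-list!").hexdigest()
print(lookup_table.get(unknown_hash))  # None
```

The expensive hashing work is done once up front; after that, every stolen hash costs only a dictionary lookup.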
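Salted hashes help protect against this, because now there are *two* inputs into determining the hash: the user’s password and the salt. A table that was pre-computed from passwords alone no longer matches anything, so the hacker would have to redo all that hashing work separately for every user’s salt. This makes it much harder for the hacker to determine the password, and all the hashes are now likely different (i.e., there’s no more “10% of the users have the same hash”).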
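To make that concrete, here is a short Python sketch in which the user's password is the very common "123456", yet the hacker's pre-computed table comes up empty because the stored hash was salted. It reuses the example salt from the table above; SHA-256 and the helper names hash_with_salt and verify are assumptions for illustration.

```python
import hashlib


def hash_with_salt(salt: str, password: str) -> str:
    """Prepend the salt to the password, then hash the combined string."""
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


# What the website stores for this user: the salt and the salted hash.
salt = "h32L38230__39"
stored_hash = hash_with_salt(salt, "123456")

# The hacker's table, pre-computed from common passwords with no salt.
precomputed = {
    hashlib.sha256(pw.encode("utf-8")).hexdigest(): pw
    for pw in ["123456", "password"]
}
print(precomputed.get(stored_hash))  # None: the table misses


def verify(salt: str, stored_hash: str, attempt: str) -> bool:
    # The website re-applies the stored salt before comparing.
    return hash_with_salt(salt, attempt) == stored_hash


print(verify(salt, stored_hash, "123456"))    # True
print(verify(salt, stored_hash, "password"))  # False
```

The website can still check logins because it knows the salt; the salt does not need to be secret, it only needs to differ from user to user.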
All that to say, if you’re storing user passwords as plain text, please stop! Use salted hashes – if you ever get hacked, you’ll be glad you did. As for your own personal passwords, longer is always better. Even better, use a mix of lower and uppercase letters, numbers, symbols, etc. The more complex your password is, the harder it will be to crack.