A common problem in web development is to implement user authentication and access controls, typically accomplished through sign-up and log-in forms. Though these systems are simple enough in theory, engineering one that lives up to application security standards is a daunting undertaking.
Without a great deal of care and sophistication, authentication systems can be as fragile as a cardboard lemonade stand in a category five hurricane. However, for everything that can go wrong, there is an effective (and often simple) way to achieve a higher level of security and resilience.
**At a Glance**
1. Secure Password Storage in 2015
2. Persistent Authentication ("Remember Me" Checkboxes with Long-Term Cookies) Done Right
3. Account Recovery ("Forgot Your Password?")
If you're confused about any of the terms used on this page, please feel free to consult our [guide to cryptography terms and concepts](https://paragonie.com/blog/2015/08/you-wouldnt-base64-a-password-cryptography-decoded) first.
## Passwords: Hashes, Salts, and Policies
The year was 2004. Already, [collisions in the MD5 hash function](http://cryptography.hyperlink.cz/2004/otherformats.html) were being circulated, spelling near-certain doom for the future of this and related cryptographic hash functions. Five years earlier, Niels Provos presented bcrypt at USENIX 99. The RFC for PBKDF2 had already been published for four years.
Would you believe that there are still web programmers who use fast cryptographic hash functions such as MD5 and SHA1 for password storage in 2015? **It has been clear to security experts for a long time that this is a bad idea.**
### Acceptable Password Storage Systems
There are only four password hashing algorithms that are currently trusted by professional cryptographers and security researchers to protect users' passwords:
* `Argon2` (winner of the [Password Hashing Competition](https://password-hashing.net))
* `bcrypt`
* `scrypt`
* `PBKDF2` (**P**assword-**B**ased **K**ey **D**erivation **F**unction #2)
For most PHP developers who cannot install PECL packages in their production environments, [scrypt](https://pecl.php.net/package/scrypt) is not an option. **If you can use scrypt, please do.**
Given the choice between bcrypt and PBKDF2, developers should choose bcrypt. Furthermore, they should **use the existing `password_hash()` and `password_verify()` API** instead of writing their own `crypt()`-based implementation.
Developers should refrain from generating their own salts; let `password_hash()` take care of that instead.
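As a minimal sketch of that advice, the wrapper below leans entirely on PHP's built-in password API. The function names `registerPassword()` and `loginPassword()` are hypothetical, and the opportunistic re-hash via `password_needs_rehash()` is an addition that the text above does not prescribe.

```php
<?php
/**
 * Hash a new password. password_hash() generates the salt itself and
 * embeds it (along with the algorithm and cost) in the returned string.
 */
function registerPassword(string $password): string
{
    return password_hash($password, PASSWORD_DEFAULT);
}

/**
 * Verify a login attempt.
 *
 * Returns the hash that should be stored from now on (possibly upgraded),
 * or null if the password was wrong.
 */
function loginPassword(string $password, string $storedHash): ?string
{
    if (!password_verify($password, $storedHash)) {
        return null;
    }
    // If PASSWORD_DEFAULT or its cost has changed since this hash was
    // created, re-hash now while the plaintext is available.
    if (password_needs_rehash($storedHash, PASSWORD_DEFAULT)) {
        return password_hash($password, PASSWORD_DEFAULT);
    }
    return $storedHash;
}
```

Note that no salt is generated or passed anywhere in this code; the API handles it.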
Developers in other languages should refer to our guide on [how to safely store your users' passwords](https://paragonie.com/blog/2016/02/how-safely-store-password-in-2016).
### Limitations of bcrypt
There are two caveats to bcrypt that every developer should be aware of: it truncates passwords at 72 characters, and it also truncates at the first `NUL` byte. (The 72-character figure assumes a single-byte character encoding; multibyte characters will hit the limit sooner.) Many developers try to work around the 72-character limit by pre-hashing the user's password, which can trigger the second problem. **A dangerous example follows**:
```php
// DANGEROUS: the raw (binary) SHA-256 output may contain NUL bytes.
$stored = password_hash(hash('sha256', $_POST['password'], true), PASSWORD_DEFAULT);

// ...

if (password_verify(hash('sha256', $_POST['password'], true), $stored)) {
    // Success :D
} else {
    // Failure :(
}
```
There is a nontrivial chance that one of the raw bytes in the hash will be `0x00`. The earlier this byte appears in the string, the cheaper it becomes to find a collision, because bcrypt only hashes the bytes that precede it; the cost drops exponentially with each byte that is cut off.
For example, both `1]W` and `@1$` produce a SHA-256 hash output that begins with `ab00`.
The solution, therefore, would be to pass the raw SHA-256 hash outputs through `base64_encode()` before passing them to bcrypt:
```php
$stored = password_hash(
    base64_encode(
        hash('sha256', $_POST['password'], true)
    ),
    PASSWORD_DEFAULT
);

// ...

if (password_verify(
    base64_encode(
        hash('sha256', $_POST['password'], true)
    ),
    $stored
)) {
    // Success :D
} else {
    // Failure :(
}
```
The above example will not truncate at 72 characters and is fully binary-safe, so early null bytes will not lead to security weaknesses. The best of both worlds.
Additionally, you may want to use SHA-384 instead of SHA-256, since [SHA-256 is vulnerable to length-extension attacks](https://blog.skullsecurity.org/2012/everything-you-need-to-know-about-hash-length-extension-attacks) and SHA-384 is not.
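A hedged sketch of the SHA-384 variant follows. The helper names (`prehashPassword()`, `hashLongPassword()`, `verifyLongPassword()`) are hypothetical. The 48 raw bytes of a SHA-384 digest become 64 base64 characters, which still fits under bcrypt's 72-character limit.

```php
<?php
/**
 * Pre-hash with SHA-384 and base64-encode the raw digest.
 * The result is always 64 characters long and contains no NUL bytes.
 */
function prehashPassword(string $password): string
{
    return base64_encode(hash('sha384', $password, true));
}

function hashLongPassword(string $password): string
{
    return password_hash(prehashPassword($password), PASSWORD_DEFAULT);
}

function verifyLongPassword(string $password, string $storedHash): bool
{
    return password_verify(prehashPassword($password), $storedHash);
}
```

With this wrapper, two 100-character passwords that differ only in their last character no longer verify against each other, which they would if bcrypt silently truncated both at 72 characters.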
### To Pepper Or Not To Pepper?
Sometimes, developers come up with the idea of adding another layer of complexity to an otherwise straightforward security feature.
The topic of adding a *pepper* (a secret key known only to PHP and not to the database) to frustrate brute force attacks rears its head in programmer forums quite frequently.
In the above example, adding a pepper could mean replacing `hash('sha256', $_POST['password'], true)` with `hash_hmac('sha256', $_POST['password'], CONSTANT_SECRET_KEY, true)`. **We do not recommend this approach.**
Peppers do not add any meaningful security above and beyond the salt that `password_hash()` generates for you. If your database and web application reside on the same hardware, an attacker who can access the database is probably not far away from accessing your PHP source code and reading the pepper. Finally, relying on a static HMAC key means never being able to easily rotate the key in the event of a partial compromise without resetting every user's password or holding onto the old key forever.
A much better solution, which is especially useful if you employ hardware separation, is to encrypt the hashes before you insert them in your database. With this safeguard in place, even if an attacker finds a way to dump all of your database tables, they first have to decrypt the hashes before they can even begin to crack them. With the PHP and the database on separate hardware, this becomes much more secure.
The advantage of encryption over an HMAC key is that an encryption key is agile. You can decrypt the hashes and re-encrypt them with a new key without having to know anyone's password.
However, that being said, **please do not roll your own encryption library.** We highly recommend [Defuse Security's PHP encryption library](https://github.com/defuse/php-encryption).
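To illustrate why an encryption key is agile, here is a rough sketch of encrypting a password hash and later rotating the key. It is an assumption-laden example: to stay self-contained it uses PHP's bundled sodium extension (`sodium_crypto_secretbox()`, PHP 7.2+) rather than the Defuse library recommended above, and the function names `encryptHash()`, `decryptHash()`, and `rotateKey()` are hypothetical.

```php
<?php
/** Encrypt a password hash before it is written to the database. */
function encryptHash(string $passwordHash, string $key): string
{
    $nonce = random_bytes(SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);
    $ciphertext = sodium_crypto_secretbox($passwordHash, $nonce, $key);
    // Store the nonce alongside the ciphertext; it is not secret.
    return base64_encode($nonce . $ciphertext);
}

/** Decrypt a stored value back into the password hash. */
function decryptHash(string $stored, string $key): string
{
    $raw = base64_decode($stored, true);
    $minLength = SODIUM_CRYPTO_SECRETBOX_NONCEBYTES + SODIUM_CRYPTO_SECRETBOX_MACBYTES;
    if ($raw === false || strlen($raw) < $minLength) {
        throw new RuntimeException('Malformed ciphertext');
    }
    $nonce = substr($raw, 0, SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);
    $ciphertext = substr($raw, SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);
    $plain = sodium_crypto_secretbox_open($ciphertext, $nonce, $key);
    if ($plain === false) {
        throw new RuntimeException('Decryption failed: wrong key or tampered data');
    }
    return $plain;
}

/** Re-encrypt under a new key without ever seeing the user's password. */
function rotateKey(string $stored, string $oldKey, string $newKey): string
{
    return encryptHash(decryptHash($stored, $oldKey), $newKey);
}
```

A key of the right size can be produced with `sodium_crypto_secretbox_keygen()`. Rotation only touches ciphertexts and keys, which is exactly what a static HMAC pepper cannot offer.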
Finally, our team wrote an open source library called [PasswordLock](https://github.com/paragonie/password_lock) that does everything mentioned so far: Bcrypt-SHA2-Base64, wrapped in the authenticated encryption library we recommend. Usage example:
```php
use \ParagonIE\PasswordLock\PasswordLock;

define('PASSWORD_KEY', \hex2bin('0102030405060708090a0b0c0d0e0f10'));
// Even better: use a configuration file stored outside your document root
// and not checked into version control

$store_me = PasswordLock::hashAndEncrypt($_POST['password'], PASSWORD_KEY);

if (PasswordLock::decryptAndVerify($_POST['password'], $store_me, PASSWORD_KEY)) {
    // Success! :D
} else {
    // Failure :(
}
```
### Password Policies
**Who needs 'em?**
Password policies (especially [shameful](https://defuse.ca/password-policy-hall-of-shame.htm) ones) are *usually* a dead give-away that an application doesn't employ proper password hashing. Sometimes the best password policy is to not have one in the first place.
Establishing minimum requirements (e.g. password must be at least 12 characters long) is fine, but dictating which characters are allowed or required or enforcing a maximum password length less than 64 is not. In general, a password policy should not enforce maximums, only enforce minimums (within reason).
A really good way to provide feedback to users about the strength of their passwords is Dropbox's [zxcvbn](https://github.com/dropbox/zxcvbn) library.
Bonus points go to any web apps that go the extra mile to educate users about the benefits of password managers (e.g. [KeePass](http://keepass.info) or KeePassX).
#### Reasonable Password Policy Example
1. Passwords must be between 12 and 4,096 characters in length.
2. Passwords can contain any characters (including Unicode).
3. We strongly encourage the use of a password manager like [KeePass](http://keepass.info) or KeePassX to generate and store your passwords.
4. Your [zxcvbn](https://github.com/dropbox/zxcvbn) password strength must be at least level 3 (on the 0-4 scale).
That's it. Don't tell people what their password can or cannot contain. Don't refuse longer passwords. Do stop people from shooting themselves in the foot, but don't interfere beyond what's necessary to prevent foot-bullets.
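The example policy above could be enforced with something like the following sketch. The function name `checkPasswordPolicy()` and the constants are hypothetical, and the zxcvbn score is assumed to be computed elsewhere (for instance, by the JavaScript library in the browser or by a server-side port) and passed in as an integer from 0 to 4.

```php
<?php
const MIN_PASSWORD_LENGTH = 12;
const MAX_PASSWORD_LENGTH = 4096;
const MIN_ZXCVBN_SCORE = 3;

/**
 * @param string $password      Candidate password; any characters are allowed.
 * @param int    $strengthScore zxcvbn score (0-4), computed by the caller.
 * @return string[] Human-readable problems; an empty array means "acceptable".
 */
function checkPasswordPolicy(string $password, int $strengthScore): array
{
    $problems = [];

    // Count Unicode code points; fall back to bytes if the input is not valid UTF-8.
    $length = preg_match_all('/./su', $password);
    if ($length === false) {
        $length = strlen($password);
    }

    if ($length < MIN_PASSWORD_LENGTH) {
        $problems[] = 'Password must be at least ' . MIN_PASSWORD_LENGTH . ' characters long.';
    }
    if ($length > MAX_PASSWORD_LENGTH) {
        $problems[] = 'Password must be at most ' . MAX_PASSWORD_LENGTH . ' characters long.';
    }
    if ($strengthScore < MIN_ZXCVBN_SCORE) {
        $problems[] = 'Password is too easy to guess; consider using a password manager.';
    }

    // Deliberately absent: any rule about which characters are required or forbidden.
    return $problems;
}
```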
"Remember Me" - Long-Term Persistent Authentication
Short-term user authentication typically employs [sessions](https://paragonie.com/blog/2015/04/fast-track-safe-and-secure-php-sessions), while long-term authentication relies on a long-lived cookie being stored on the user's browser, separate from their session identifier. Users typically experience this feature as a checkbox labelled, "Remember me on this computer." Implementing a Remember Me feature without building a trivially exploitable backdoor requires a minor engineering feat.
### Naive Solution: Just Store User Credentials in a Cookie
Any solution for long-term authentication that looks like `remember_user=1337` is wide open for abuse. Since administrator accounts typically have low User IDs, `remember_user=1` will almost certainly log you into a privileged user account.
### Persistent Authentication Tokens
Another common strategy, much less susceptible to attack, is to just generate a unique token when a user checks the "Remember Me" box, store the unique token in a cookie, and have a database table that associates tokens with each user's account. There are a number of things that could still go wrong here, but it is unquestionably an improvement over the previous strategy.
#### Problem 1: Insufficient Randomness
Although many developers understand the need for unpredictability in security tokens, many do not know how to actually achieve this goal. A not-too-uncommon code snippet for generating unique tokens looks something like this.
```php
// INSECURE: mt_rand() is not a cryptographically secure generator.
function generateInsecureToken($length = 20)
{
    $buf = '';
    for ($i = 0; $i < $length; ++$i) {
        $buf .= chr(mt_rand(0, 255));
    }
    return bin2hex($buf);
}
```
The `mt_rand()` function is **not** suitable for security purposes. If you need to generate a random number in PHP, you want one of the following:
* [RandomLib](https://github.com/ircmaxell/RandomLib)
* `random_bytes($length)` (PHP 7, or available in PHP 5 via [random_compat](https://github.com/paragonie/random_compat))
* Raw bytes read from `/dev/urandom`
* `mcrypt_create_iv($length, MCRYPT_DEV_URANDOM);`
* `openssl_random_pseudo_bytes($length);`
Doing it correctly looks like this:
```php
function generateToken($length = 20)
{
    return bin2hex(random_bytes($length));
}
```
#### Problem 2: Timing Leaks
Suppose you're using a cryptographically secure random number generator, your cookie looks like `rememberme=WBWgm2oMFxsiGRGQNJ6n8gtN3gOuQ2wjN8ZRjZtU0Mn`, and you're storing these tokens in a database table that looks like this:
```sql
CREATE TABLE `auth_tokens` (
    `id` integer(11) UNSIGNED NOT NULL AUTO_INCREMENT,
    `token` char(43), -- wide enough for the 43-character example token
    `userid` integer(11) UNSIGNED NOT NULL,
    `expires` integer(11), -- or datetime
    PRIMARY KEY (`id`)
);
```
(And a look-up query might look something like this...)
```sql
SELECT * FROM auth_tokens WHERE token = 'WBWgm2oMFxsiGRGQNJ6n8gtN3gOuQ2wjN8ZRjZtU0Mn';
```
**Watch out, an esoteric and nontrivial attack still exists.**
This may seem fine at first glance, but this actually leaks timing information due to the way strings are compared in database operations.
To clarify: if one changes the first byte in the `rememberme` cookie from a `W` to an `X`, the comparison will fail *slightly* faster than if the last character were incremented from `n` to `o`. Anthony Ferrara covered this topic in his blog post, [It's All About Time](http://blog.ircmaxell.com/2014/11/its-all-about-time.html).
On modern hardware, this timing difference is only significant at the nanosecond scale. This is not a simple or easy attack to pull off, but writing an authentication library that takes unnecessary risks does not make sense to us.
**Side Note**: This timing leak behavior is not any deficit of database server software. Searching a database is not the sort of operation you want to be done in constant time. Doing so would open the door to denial-of-service attacks. The potential to leak meaningful information out of a timing difference also depends on what type of index the database uses internally.
Even if the query doesn't find a valid entry for the supplied remember me token, attackers get unlimited tries: they can keep re-sending a slightly different cookie until they get their desired result, especially if your application is not tracking and rate-limiting automatic authentications.
To make sure our "remember me" tokens are iron-clad, let's abstract the look-up from the verification and make sure we do so in constant-time. `hash_equals()` is useful here!
### Proactively Secure Long-Term User Authentication
What follows is our proposed strategy for handling "remember me" cookies in a web application without leaking any useful information (even timing information) to an attacker, while still being fast and efficient (to prevent denial of service attacks).
Our proposed strategy deviates from the above simple token-based automatic login system in one crucial way: Instead of only storing a random `token` in a cookie, we store `selector:validator`.
`selector` is a unique ID to facilitate database look-ups, while preventing the unavoidable timing information from impacting security. (This is preferable to simply using the database `id` field, which leaks the number of active users on the application.)
```sql
CREATE TABLE `auth_tokens` (
    `id` integer(11) UNSIGNED NOT NULL AUTO_INCREMENT,
    `selector` char(12),
    `hashedValidator` char(64),
    `userid` integer(11) UNSIGNED NOT NULL,
    `expires` datetime,
    PRIMARY KEY (`id`)
);
```
On the database side of things, the `validator` is not stored wholesale; instead, the SHA-256 hash of `validator` is stored in the database, while the plaintext is stored (with the `selector`) in the user's cookie. With this fail-safe in place, if somehow the `auth_tokens` table is leaked, immediate widespread user impersonation is prevented.
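Issuing a token under this scheme might look like the sketch below. The function name `issueRememberMeToken()`, the 30-day default lifetime, and the returned array layout are assumptions for illustration; the sizes are chosen to fit the `char(12)` and `char(64)` columns in the table above.

```php
<?php
/**
 * Create a new selector:validator pair for a user.
 *
 * Returns the value to place in the cookie and the row to insert into
 * `auth_tokens`. The plaintext validator appears only in the cookie.
 */
function issueRememberMeToken(int $userId, int $lifetimeSeconds = 2592000): array
{
    $selector  = bin2hex(random_bytes(6));  // 12 hex characters
    $validator = bin2hex(random_bytes(32)); // sent to the browser, never stored

    return [
        'cookie' => $selector . ':' . $validator,
        'row' => [
            'selector'        => $selector,
            'hashedValidator' => hash('sha256', $validator), // 64 hex characters
            'userid'          => $userId,
            'expires'         => gmdate('Y-m-d H:i:s', time() + $lifetimeSeconds),
        ],
    ];
}
```

The caller would insert `row` with a prepared statement and send `cookie` to the browser, ideally with the `Secure` and `HttpOnly` flags set.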
The automatic login algorithm looks something like:
1. Separate `selector` from `validator`.
2. Grab the row in `auth_tokens` for the given selector. If none is found, abort.
3. Hash the `validator` provided by the user's cookie with SHA-256.
4. Compare the SHA-256 hash we generated with the hash stored in the database, using `hash_equals()`.
5. If step 4 passes, associate the current session with the appropriate user ID.
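The steps above can be sketched as follows. This is an illustration under assumptions, not a drop-in implementation: `verifyRememberMeToken()` is a hypothetical name, the database look-up is abstracted behind a caller-supplied callback, and the expiry check is an addition based on the `expires` column (assumed to hold a UTC datetime string).

```php
<?php
/**
 * @param string   $cookieValue    Raw "selector:validator" cookie value.
 * @param callable $findBySelector function (string $selector): ?array, returning
 *                                 the matching auth_tokens row or null.
 * @param int|null $now            Current Unix time (injectable for testing).
 * @return int|null The user ID to associate with the session, or null on failure.
 */
function verifyRememberMeToken(string $cookieValue, callable $findBySelector, ?int $now = null): ?int
{
    // Step 1: separate selector from validator.
    $parts = explode(':', $cookieValue, 2);
    if (count($parts) !== 2) {
        return null;
    }
    [$selector, $validator] = $parts;

    // Step 2: grab the row for the given selector; abort if none is found.
    $row = $findBySelector($selector);
    if ($row === null) {
        return null;
    }

    // Addition: reject expired tokens.
    $expires = strtotime($row['expires'] . ' UTC');
    if ($expires === false || $expires < ($now ?? time())) {
        return null;
    }

    // Step 3: hash the validator supplied by the cookie.
    $calculated = hash('sha256', $validator);

    // Step 4: constant-time comparison against the stored hash.
    if (!hash_equals($row['hashedValidator'], $calculated)) {
        return null;
    }

    // Step 5: the caller associates the session with this user ID.
    return (int) $row['userid'];
}
```

Only the selector ever reaches the database query, so any timing difference in the look-up reveals nothing about the validator.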
Since this post was originally published, our strategy has been implemented in [Gatekeeper](https://github.com/psecio/gatekeeper), should you need a drop-in solution.
**Important**: If the user should ever change their password, you should invalidate all existing long-term authentication tokens for that user.
## Account Recovery
Let's not mince words: **Password reset features are a back-door.** For many apps and services, they are inappropriate and should not be implemented.
Generally, there are two things wrong with account recovery systems:
1. They ask terrible security questions, the answers to which are usually not secrets known only to the user. ("What is your mother's maiden name?" Your Facebook friends probably know!)
2. They rely on unreliable second authentication factors (e.g. a random token sent to the user's email address or cell phone).
The security question problem is pretty self-explanatory, but the second implies that having access to a user's email account or cell phone grants an attacker access to every application or service that user has an account with. **This is very bad.**
We recommend the following:
1. Don't ever implement back-doors if you can help it.
2. Don't ask any security questions if the average user is likely to post the answer on the Internet.
3. (Optional) Allow your users to attach a GnuPG public key to their profile. When an account recovery request is issued for their account, encrypt the account recovery token with their public key so only someone in possession of their private key can access it. **We do this for our projects.**
If you find yourself absolutely needing to implement an account recovery back-door (most apps implement them whether they're needed or not) and your users aren't technical enough to use GnuPG, the best you can do is to generate a random token (generated using a cryptographically secure pseudorandom number generator, as above) and email it to the user. When they fulfill this requirement, give them the ability to set a new password. **Never send them their old password (which is only possible if you aren't hashing them).** If you have access to their old password, you have failed to be a responsible web developer.
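If you do end up emailing a random token, a minimal sketch might look like this. Several details are assumptions not stated above: the function names, the one-hour lifetime, and storing only a SHA-256 hash of the token (borrowed from the "remember me" design) so that a leaked table cannot be used to reset passwords.

```php
<?php
/**
 * Generate an account recovery token.
 *
 * 'emailToken' goes into the email to the user; 'row' is what gets stored.
 */
function createRecoveryToken(int $userId, int $ttlSeconds = 3600): array
{
    $token = bin2hex(random_bytes(32)); // CSPRNG-backed, 64 hex characters

    return [
        'emailToken' => $token,
        'row' => [
            'userid'      => $userId,
            'hashedToken' => hash('sha256', $token),
            'expires'     => time() + $ttlSeconds,
        ],
    ];
}

/**
 * Check a submitted token against the stored row for that user.
 * On success, the caller lets the user set a NEW password and deletes the row.
 */
function redeemRecoveryToken(string $submitted, array $row, ?int $now = null): bool
{
    if ($row['expires'] < ($now ?? time())) {
        return false;
    }
    return hash_equals($row['hashedToken'], hash('sha256', $submitted));
}
```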
Note that sending sensitive information through email forces you to trust STARTTLS, which is a form of [opportunistic encryption](https://adamcaudill.com/2014/02/25/on-opportunistic-encryption) (great against passive observers, but falls apart in the face of an active attacker). The industry does not currently have a more reliable and widely deployed solution (except, as mentioned above, GnuPG), although some initiatives (e.g. [SMIMP](https://github.com/smimp/smimp_spec)) seek to change this deficit. If an email client that supports [STARTTLS Everywhere](https://github.com/EFForg/starttls-everywhere) is available for your application, use it.
## Closing Thoughts
Even if you implement all of the solutions we offer and follow all of our recommendations, in the end you cannot protect users from their own security mistakes. It's a good idea to log *all* authentication attempts (even successful ones).
Paragon Initiative Enterprises provides [technology consulting services](https://paragonie.com/services) to businesses with attention to [security](https://paragonie.com/service/appsec) above and beyond compliance. We lead the industry in web application security (as evidenced by, among other things, our model for a proactively secure "remember me" checkbox and cookie system).
We offer secure web-based business solutions, custom-tailored web, mobile, desktop, and server applications, as well as code auditing and penetration testing services.