Token-based authentication schemes (i.e. how you would implement "remember me" cookies or password reset URLs) typically suffer from a design constraint that can leave applications vulnerable to timing attacks.
Fortunately, our team has identified a simple and effective mitigation strategy we call split tokens, which you should strongly consider adopting anywhere your software uses token-based authentication.
The Perils of Token-Based Authentication
When you log in with a username and a password, you typically do something like this:
- Run a SQL query like
SELECT userid, password_hash FROM user_accounts WHERE username = :username
- Validate the password e.g.
password_verify($password, $storedPasswordHash)
- If the password is valid, you would then associate the current session with the user's ID
The details matter here: Only your username is used in the SQL query. Your password is verified outside of the database query (and, if you're a PHP developer using password_verify(), the validation is also constant-time).
Contrast this with a traditional token-based authentication scheme.
- Run a SQL query like
SELECT tokenid, userid FROM password_reset_tokens WHERE token = :token AND NOW() < expire_time
- If you get a result, the token was valid
The difference is subtle: You're using your secret (the token) in your SELECT query. Virtually every database system uses memcmp() to compare strings, which exits early as soon as it encounters a different byte value. This inevitably introduces a timing side-channel.
To be explicit: This is not a vulnerability of the database systems themselves; you generally want your search algorithms to be as fast as possible. Instead, it's a vulnerability of misusing a non-security feature in a security context.
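To make the leak concrete, here is a toy model of an early-exit comparison. It is not how any particular database implements string comparison, and the function name is made up for illustration; it only shows that the amount of work done depends on how many leading bytes of the guess are correct, which is what an attacker measures.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: compare two strings the way memcmp() does, stopping at
 * the first mismatch. Returns how many bytes were examined, which is a rough
 * stand-in for the time the comparison takes.
 */
function bytesExaminedBeforeMismatch(string $stored, string $guess): int
{
    $length = min(strlen($stored), strlen($guess));
    for ($i = 0; $i < $length; $i++) {
        if ($stored[$i] !== $guess[$i]) {
            return $i + 1; // stopped early on this byte
        }
    }
    return $length; // every compared byte matched
}
```

A guess that gets more leading bytes right is examined for longer, so an attacker who can measure response times precisely enough can recover the stored value one byte at a time.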
The Split Tokens Approach
Let's say you already have a token-based authentication scheme where you generate a hex-encoded 32-byte random string, store a copy in the database, and then give a copy to your user. How can this be designed to prevent timing leaks?
First, split the token into two parts: The selector (used in the SQL query) and the verifier (not used in the query). The verifier will not be stored directly in the database; instead, you will store a hash of the verifier alongside the selector.
Now your authentication protocol looks like this:
- Split the user-provided token into two parts
- Run a SQL query like
SELECT tokenid, validator, userid FROM password_reset_tokens WHERE selector = :selector AND NOW() < expire_time
- If you get a result, hash the user-provided verifier and compare it with the hash stored in the database using hash_equals()
- If both hashes match, the authentication was successful
That's all there is to it.
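The steps above can be sketched roughly as follows. This is a minimal sketch under some assumptions: the token is 64 hex characters (32 for the selector, then 32 for the verifier), the verifier is hashed with SHA-256, and $findBySelector is a hypothetical callback that runs the SELECT query shown above and returns the matching row (with validator and userid keys) or null. The function name is also made up for this example.

```php
<?php
declare(strict_types=1);

/**
 * Validate a split token. Returns the user ID on success, null on failure.
 *
 * $findBySelector is a hypothetical callback: it runs the SELECT query with
 * the selector and returns ['validator' => string, 'userid' => int] or null.
 */
function validateSplitToken(string $token, callable $findBySelector): ?int
{
    // Assumed layout: 32 hex chars of selector followed by 32 of verifier.
    if (preg_match('/^[0-9a-f]{64}$/', $token) !== 1) {
        return null;
    }
    $selector = substr($token, 0, 32);
    $verifier = substr($token, 32);

    // Only the selector ever reaches the database query.
    $row = $findBySelector($selector);
    if ($row === null) {
        return null;
    }

    // Hash the user-provided verifier and compare in constant time.
    $calculated = hash('sha256', $verifier);
    if (!hash_equals($row['validator'], $calculated)) {
        return null;
    }
    return (int) $row['userid'];
}
```

Passing the lookup in as a callback keeps the sketch independent of any particular database library; in a real application it would typically wrap a prepared statement.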
Questions and Answers
What Does this Accomplish?
The timing leak still exists, but it becomes useless.
A sophisticated attacker may still be able to find a valid selector by observing how long it takes for a guessed token to return a failure. However, with a split-token approach, the verifier is never leaked.
Consequently, you can only leak part of the split token, not all of it.
Where Should the Token Be Split?
Some guidelines:
- You want the verifier to be large enough to be unguessable. This means at least 16 bytes.
- There is no point in making the verifier larger than the output of the hash function you're using. For SHA256, this means at most 32 bytes.
- You want the selector to be large enough to be randomly generated but still unique.
  - If your database's primary key is 32 bits (SERIAL in PostgreSQL, INT(11) in MySQL, etc.), you want at least 8 bytes for the selector.
  - If your database's primary key is 64 bits (BIGSERIAL in PostgreSQL), you want at least 16 bytes for the selector.
- When in doubt, give more to the verifier than the selector.
Given the above, a 32-byte random string can reasonably be split in half (16 for the selector, 16 for the verifier).
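As a rough illustration of that even split, token creation might look like the sketch below. The function name and the array keys are hypothetical, and SHA-256 is assumed as the hash for the verifier.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: create a split token from 32 random bytes.
 * Returns the token to hand to the user plus the two values to store.
 */
function createSplitToken(): array
{
    $token = bin2hex(random_bytes(32)); // 64 hex characters in total

    $selector = substr($token, 0, 32);  // first 16 bytes, used in the query
    $verifier = substr($token, 32);     // last 16 bytes, never stored directly

    return [
        'token' => $token,                               // give this to the user
        'selector' => $selector,                         // store as-is
        'hashed_verifier' => hash('sha256', $verifier),  // store instead of the verifier
    ];
}
```

Only the selector and the hashed verifier go into the database; the full token exists only in the cookie or URL you give to the user.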
Why Should We Hash the Verifier?
The reason we hash the verifier is to prevent an attacker armed with a read-only SQL injection from retrieving a valid authentication token for another user. Naive token-based authentication systems offer no protection against this threat.
Is a Simple Hash Enough?
Yes. Don't bother with something like bcrypt; your input is a high entropy string, not a user-provided password, so that would only waste time and electricity.
If you're also concerned about read-write SQL injection being used to forge tokens for arbitrary user accounts, you may want to worry instead about the attacker using their access to compromise the filesystem and OS. You're probably milliseconds away from a total compromise anyway, so this may not help (unless your application and database are on separate bare-metal servers), but...
Instead of just a hash of the verifier, store an HMAC of the verifier, the user's database ID, and the token's expiration timestamp, keyed with a secret known only to the application (not the database).
For example:
<?php
declare(strict_types=1);

class ExampleTokenAuth
{
    /**
     * @var string
     */
    protected $tokenSigningKey;

    /* Snip! More code goes here. */

    public function getHashedVerifier(string $verifier, int $userID, DateTime $expiration): string
    {
        // Bind the verifier to the user and the expiration time, so that
        // changing either value in the database invalidates the stored HMAC.
        return \hash_hmac(
            'sha256',
            \json_encode([
                $verifier,
                $userID,
                $expiration->format('Y-m-d\TH:i:s')
            ]),
            $this->tokenSigningKey
        );
    }
}
As long as the signing key is never known to the database, and an attacker cannot escape from the database to the filesystem of the webserver, you can prevent token forgery this way. However, that's a narrow use case and generally you should focus on preventing SQL injection entirely.
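For completeness, here is one way the check at authentication time might look. This is a sketch, not part of the class above: both functions are hypothetical standalone helpers, the key is passed in explicitly, and the user ID and expiration are assumed to come from the row fetched by selector.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical standalone variant of getHashedVerifier(), for illustration.
 */
function hmacVerifier(string $verifier, int $userID, DateTimeInterface $expiration, string $key): string
{
    return hash_hmac(
        'sha256',
        json_encode(
            [$verifier, $userID, $expiration->format('Y-m-d\TH:i:s')],
            JSON_THROW_ON_ERROR
        ),
        $key
    );
}

/**
 * Recompute the HMAC from the user-provided verifier and the values in the
 * fetched row, then compare it with the stored HMAC in constant time.
 */
function isStoredHmacValid(
    string $stored,
    string $verifier,
    int $userID,
    DateTimeInterface $expiration,
    string $key
): bool {
    return hash_equals($stored, hmacVerifier($verifier, $userID, $expiration, $key));
}
```

Because the user ID and expiration are part of the HMAC input, an attacker who rewrites either column without knowing the key ends up with a row that no longer verifies.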