Token-based authentication schemes (i.e. how you would typically implement ["remember me" cookies](https://paragonie.com/blog/2015/04/secure-authentication-php-with-long-term-persistence#title.2.1) or [password reset URLs](https://paragonie.com/blog/2016/09/untangling-forget-me-knot-secure-account-recovery-made-simple)) typically suffer from a design constraint that can leave applications vulnerable to timing attacks.

Fortunately, our team has identified a simple and effective mitigation strategy we call **split tokens**, which you should strongly consider adopting anywhere your software uses token-based authentication.

## The Perils of Token-Based Authentication

When you log in with a username and a password, typically you do something like this:

1. Run a SQL query like `SELECT userid, password_hash FROM user_accounts WHERE username = :username`
2. Validate the password, e.g. `password_verify($password, $storedPasswordHash)`
3. If the password is valid, associate the current session with the user's ID

The details matter here: Only your username is used in the SQL query. Your password is verified outside of the database query (and, if you're a PHP developer using `password_verify()`, the validation is also constant-time).

Contrast this with a traditional token-based authentication scheme:

1. Run a SQL query like `SELECT tokenid, userid FROM password_reset_tokens WHERE token = :token AND NOW() < expire_time`
2. If you get a result, the token was valid

The difference is subtle: You're using your secret (the token) in your `SELECT` query. [Virtually](https://github.com/postgres/postgres/search?q=memcmp&utf8=%E2%9C%93) [every](https://github.com/mysql/mysql-server/search?q=memcmp&utf8=%E2%9C%93) [database](https://github.com/mackyle/sqlite/search?utf8=%E2%9C%93&q=memcmp) system uses `memcmp()` to compare strings, which exits early if it encounters a different byte value.
This inevitably introduces a [timing side-channel](http://blog.ircmaxell.com/2014/11/its-all-about-time.html).

To be explicit: This is not a vulnerability of the database systems themselves; you generally want your search algorithms to be as fast as possible. Instead, it's a vulnerability of misusing a non-security feature in a security context.

## The Split Tokens Approach

Let's say you already have a token-based authentication scheme where you generate a hex-encoded 32-byte random string, store a copy in the database, and then give a copy to your user. How can this be designed to prevent timing leaks?

First, split the token into two parts: The **selector** (used in the SQL query) and the **verifier** (not used in the query). The verifier will not be stored directly in the database; instead, you will store a **hash of the verifier** alongside the selector.

Now your authentication protocol looks like this:

1. Split the user-provided token into two parts
2. Run a SQL query like `SELECT tokenid, validator, userid FROM password_reset_tokens WHERE selector = :selector AND NOW() < expire_time` (here the `validator` column holds the stored hash of the verifier)
3. If you get a result, hash the user-provided **verifier** and compare it with the hash stored in the database using [`hash_equals()`](https://secure.php.net/hash_equals)
4. If both hashes match, the authentication was successful

That's all there is to it.

## Questions and Answers

### What Does this Accomplish?

The timing leak still exists, but it becomes useless.

A sophisticated attacker may still be able to find a valid **selector** by observing how long it takes for a guessed token to return a failure. However, with a split-token approach, **the verifier is never leaked**, because it never takes part in the database comparison. Consequently, you can only leak part of the split token, not all of it.

### Where Should the Token Be Split?

Some guidelines:

* You want the verifier to be large enough to be unguessable. This means at least 16 bytes.
* There is no point in making the verifier larger than the output of the hash function you're using. For SHA256, this means at most 32 bytes.
* You want the selector to be large enough to be randomly generated but still unique.
  * If your database's primary key is 32 bits (`SERIAL` in PostgreSQL, `INT(11)` in MySQL, etc.), you want at least 8 bytes for the selector.
  * If your database's primary key is 64 bits (`BIGSERIAL` in PostgreSQL), you want at least 16 bytes for the selector.
* When in doubt, give more to the verifier than the selector.

Given the above, a 32-byte random string can reasonably be split in half (16 for the selector, 16 for the verifier).

### Why Should We Hash the Verifier?

The reason we hash the verifier is to prevent an attacker armed with a read-only SQL injection from retrieving a valid authentication token for another user. Naive token-based authentication systems offer no protection against this threat.

### Is a Simple Hash Enough?

Yes. Don't bother with something like bcrypt; your input is a high-entropy string, not a user-provided password, so that would only waste time and electricity.

If you're also concerned about read-write SQL injection being used to forge tokens for arbitrary user accounts, you may want to worry instead about the attacker using their access to compromise the filesystem and OS. You're probably milliseconds away from a total compromise anyway, so this may not help (unless your application and database are on separate bare-metal servers), but...

Instead of just a hash of the verifier, store an HMAC of the verifier, the user's database ID, and the token's expiration timestamp, keyed with a secret known only to the application (not the database). For example:
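```php
<?php
declare(strict_types=1);

class ExampleTokenAuth
{
    /**
     * Secret key known only to the application; never stored in the database.
     *
     * @var string
     */
    protected $tokenSigningKey;

    /* Snip! More code goes here. */

    /**
     * Computes the HMAC-SHA256 value stored in place of a plain hash of the verifier.
     */
    public function getHashedVerifier(string $verifier, int $userID, DateTime $expiration): string
    {
        return \hash_hmac(
            'sha256',
            // Bind the verifier to the user ID and expiration so neither can be altered.
            \json_encode([
                $verifier,
                $userID,
                $expiration->format('Y-m-d\TH:i:s')
            ]),
            $this->tokenSigningKey
        );
    }
}
```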
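To check a presented token under this HMAC variant, the application would recompute the HMAC from the user-provided verifier together with the user ID and expiration read from the database row, then compare the result against the stored value with `hash_equals()`. The following is a minimal sketch of that check, not a definitive implementation: `hmacVerifier()` is a hypothetical standalone stand-in for `getHashedVerifier()`, and the key, user IDs, and expiration date are made-up values for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical standalone equivalent of getHashedVerifier(), taking the key as a parameter.
function hmacVerifier(string $verifier, int $userID, DateTimeInterface $expiration, string $key): string
{
    return hash_hmac(
        'sha256',
        json_encode([$verifier, $userID, $expiration->format('Y-m-d\TH:i:s')], JSON_THROW_ON_ERROR),
        $key
    );
}

// Assumption: the signing key lives in application configuration, not in the database.
$key = random_bytes(32);

// Values as they would exist when the token is issued for (hypothetical) user 42.
$verifier = bin2hex(random_bytes(16));
$expires  = new DateTimeImmutable('2030-01-01 00:00:00');
$stored   = hmacVerifier($verifier, 42, $expires, $key); // what gets written to the row

// Validation: recompute from the presented verifier plus the row's user ID and expiration.
$valid = hash_equals($stored, hmacVerifier($verifier, 42, $expires, $key));

// If an attacker rewrites the row's user ID to 1, the recomputed HMAC no longer matches.
$forged = hash_equals($stored, hmacVerifier($verifier, 1, $expires, $key));

var_dump($valid, $forged);
```

Because the user ID and expiration are inputs to the HMAC, changing either one in the database invalidates the stored value unless the attacker also knows the signing key.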
As long as the signing key is never known to the database, and an attacker cannot escape from the database to the filesystem of the webserver, you can prevent token forgery this way. However, that's a narrow use case and generally you should focus on [preventing SQL injection entirely](https://paragonie.com/blog/2015/05/preventing-sql-injection-in-php-applications-easy-and-definitive-guide).
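
Putting the basic split-token protocol together, a minimal end-to-end sketch might look like the following. It assumes the 16/16-byte split suggested earlier and a plain SHA256 hash of the verifier. The function names `issueSplitToken()` and `validateSplitToken()` are hypothetical, and an in-memory array keyed by selector stands in for the `password_reset_tokens` table and its `SELECT ... WHERE selector = :selector` lookup.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: creates a split token for a user.
 * 'token' is handed to the user; 'row' is what would be inserted into the database.
 */
function issueSplitToken(int $userID, int $lifetimeSeconds = 3600): array
{
    $selector = bin2hex(random_bytes(16)); // used in the SQL lookup
    $verifier = bin2hex(random_bytes(16)); // never stored in plaintext

    return [
        'token' => $selector . $verifier,
        'row'   => [
            'selector'    => $selector,
            'validator'   => hash('sha256', $verifier),
            'userid'      => $userID,
            'expire_time' => time() + $lifetimeSeconds,
        ],
    ];
}

/**
 * Hypothetical helper: returns the user ID for a valid token, or null otherwise.
 * $table maps selector => row and stands in for the database query.
 */
function validateSplitToken(string $token, array $table): ?int
{
    // 16 + 16 random bytes, hex-encoded, is always 64 characters.
    if (strlen($token) !== 64) {
        return null;
    }
    $selector = substr($token, 0, 32);
    $verifier = substr($token, 32);

    // Only the selector takes part in the lookup.
    $row = $table[$selector] ?? null;
    if ($row === null || time() >= $row['expire_time']) {
        return null;
    }

    // The verifier is hashed and compared in constant time, outside the lookup.
    if (!hash_equals($row['validator'], hash('sha256', $verifier))) {
        return null;
    }
    return $row['userid'];
}

// Usage with a made-up user ID.
$issued = issueSplitToken(42);
$table  = [$issued['row']['selector'] => $issued['row']];

$ok  = validateSplitToken($issued['token'], $table);
// Correct selector, wrong verifier: the row is found but the hash comparison fails.
$bad = validateSplitToken($issued['row']['selector'] . str_repeat('0', 32), $table);

var_dump($ok, $bad);
```

In a real application the array lookup would be replaced by a parameterized query on the selector column, which is where any remaining timing leak would be confined.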