Back in 2017, we outlined the fundamentals of [searchable encryption with PHP and SQL](https://paragonie.com/blog/2017/05/building-searchable-encrypted-databases-with-php-and-sql). Shortly after, we implemented this design in a library we call [CipherSweet](https://github.com/paragonie/ciphersweet).

Our initial design constraints were as follows:

1. Only use the cryptography tools that are already widely available to developers.
2. Only use encryption modes that are [secure against chosen-ciphertext attacks](https://paragonie.com/blog/2015/05/using-encryption-and-authentication-correctly).
3. Treat [usability as a security property](https://xo.tc/avids-rule-of-usability.html).
4. Remain as loosely schema-agnostic as possible, so that it's possible to use our design in NoSQL contexts or wildly different SQL database layouts.
5. Be extensible, so that it may be integrated with many other products and services.

Today, we'd like to talk about some of the challenges we've encountered, as well as some of the features that have landed in CipherSweet since its inception, and how we believe they are beneficial for the adoption of usable cryptography at scale.

If you're not familiar with cryptography terms, you may find [this page useful](https://paragonie.com/blog/2015/08/you-wouldnt-base64-a-password-cryptography-decoded).

## Challenges in Searchable Encryption

As of the time of this writing, it's difficult to declare a "state of the art" design for searchable encryption, for two reasons:

1. Different threat models and operational requirements.
2. Ongoing academic research into different designs and attacks.

Cryptographers interested in encrypted search engines are likely invested in the ongoing research into **fully homomorphic encryption** (FHE), which allows the database server to perform calculations on the ciphertext and return an encrypted result to the application to decrypt.

Some projects (e.g. the encrypted camera app [Pixek](https://pixek.io) and much of the other work of [Seny Kamara](http://cs.brown.edu/~seny/), et al.) use a technique called *structured encryption* to accomplish encrypted search with a different threat model and set of operational requirements. Namely, the queries and tags are encrypted client-side and the server just acts as a data mule with no additional power to perform computations.

In either case, there are a few challenges that any proposed design must help its users overcome if it is to be used in the real world.

### Active Cryptanalytic Attacks

The most significant real-world deterrents to adopting fully homomorphic encryption today are:

1. Performance.
2. Cryptography implementation availability.

However, savvy companies will also list a third deterrent: [adaptive chosen-ciphertext attacks](https://paragonie.com/blog/2017/12/assuring-ciphertext-integrity-for-homomorphic-cryptosystems).

This can be a controversial point to raise, because its significance depends on your application's threat model. Some application developers *really* trust their database server to not lie to the application.

More generally, all forms of active attacks from a privileged but not omnipotent user (e.g. root access to the database server, but not root access on the client application software) should be considered when designing any kind of encrypted search feature.

### Small Input Domains

Let's say you're designing software for a hospital computer network and need to store protected health information with very few possible inputs (e.g. HIV status).

Even if you can encrypt this data securely (i.e. using AEAD and without message length oracles), any system that allows you to quickly search the database for a specific value (e.g. HIV Positive) introduces the risk of [leaking information through side-channels](http://www.cryptofails.com/post/70097430253/crypto-noobs-2-side-channel-attacks).
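To make the small-domain risk concrete, here is a minimal sketch of a naive search index built from a keyed hash of a two-valued field. This is an illustration only, not CipherSweet's design: the `naiveIndex()` helper, the key, and the sample records are all hypothetical. Even if the field itself were encrypted securely, the index column takes only two distinct values, so anyone who can read the database can split patients into two groups and guess which group is which from their sizes.

```php
<?php
declare(strict_types=1);

// Hypothetical secret key for the index. An attacker with only database
// access never sees it, and does not need it for this attack.
$indexKey = random_bytes(32);

/**
 * Naive deterministic index: HMAC-SHA256 of the plaintext value.
 * (Illustration only; this helper is not part of any library.)
 */
function naiveIndex(string $value, string $key): string
{
    return hash_hmac('sha256', $value, $key);
}

// Hypothetical records: mostly negative, a few positive.
$statuses = [
    'negative', 'negative', 'positive', 'negative',
    'negative', 'positive', 'negative',
];

$indexColumn = [];
foreach ($statuses as $status) {
    $indexColumn[] = naiveIndex($status, $indexKey);
}

// What the attacker sees: two distinct index values with skewed counts.
$frequencies = array_count_values($indexColumn);
arsort($frequencies);

foreach ($frequencies as $tag => $count) {
    echo substr($tag, 0, 16), '... appears ', $count, " time(s)\n";
}
```

The attacker never learns the key or decrypts anything, yet the counts alone partition the table into a larger group and a smaller one.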
### Information Leakage

Search operations are ripe for [oracles](https://security.stackexchange.com/a/10621/43688). In particular, order-revealing encryption techniques [leak your plaintext](https://eprint.iacr.org/2016/786.pdf), similar to [block ciphers in ECB mode](https://blog.filippo.io/the-ecb-penguin).

Any proposal for searchable encryption must be able to account for its information leakage and provide users a simple way of understanding and managing that risk.
CipherSweet can be installed with Composer:

```
composer require paragonie/ciphersweet
```
### Using CipherSweet
First, you need a **backend**, which handles all of the cryptographic heavy lifting.
We give you two to choose from, but there's also a [`BackendInterface`](https://github.com/paragonie/ciphersweet/blob/master/src/Contract/BackendInterface.php)
if anyone ever needs to define their own:

* **FIPSCrypto** only uses the algorithms approved for use by FIPS 140-2. Note that
  using this backend doesn't automatically make your application FIPS 140-2 certified.
* **ModernCrypto** uses libsodium, and is generally recommended in most situations.

Once you've chosen a backend, you're done thinking about cryptography algorithms. You
don't need to specify a cipher mode, or a hash function, or anything else. Instead,
the next step is to decide how you want to manage your keys.

In addition to a few generic options, CipherSweet provides a [`KeyProviderInterface`](https://github.com/paragonie/ciphersweet/blob/master/src/Contract/KeyProviderInterface.php)
to allow developers to integrate with their own custom key management solutions.

Finally, you just need to pass the **backend** and **key provider** to the **engine**.
From this point on, the engine is the only object you need to work with directly.

All together, it looks like this:
```php
<?php
use ParagonIE\CipherSweet\Backend\ModernCrypto;
use ParagonIE\CipherSweet\KeyProvider\StringProvider;
use ParagonIE\CipherSweet\CipherSweet;

// First, choose your backend:
$backend = new ModernCrypto();

// Next, your key provider:
$provider = new StringProvider(
    // The key provider stores the BackendInterface for internal use:
    $backend,
    // Example key, chosen randomly, hex-encoded:
    '4e1c44f87b4cdf21808762970b356891db180a9dd9850e7baf2a79ff3ab8a2fc'
);

// From this point forward, you only need your Engine:
$engine = new CipherSweet($provider);
```
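The hex-encoded key above is only an example and should not be reused. As a minimal sketch, a fresh 256-bit key in the same format can be generated with PHP's built-in CSPRNG. The variable names are illustrative, and where you keep the key (environment variable, secrets manager, and so on) depends on your deployment.

```php
<?php
declare(strict_types=1);

// 32 random bytes = 256 bits, from the operating system's CSPRNG.
$rawKey = random_bytes(32);

// Hex-encode the key so it can be stored as text and handed to a
// string-based key provider like the one shown above.
$hexKey = bin2hex($rawKey);

echo strlen($hexKey), " hex characters\n"; // prints "64 hex characters"
```

Generate the key once and store it outside of your database; generating a new key on every request would make previously stored ciphertexts undecryptable.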
Once you have a working CipherSweet engine, you have a lot of flexibility in how you use it.

In each of the following classes, you'll mostly use the following methods:

* `prepareForStorage()` on INSERT and UPDATE queries.
* `getAllBlindIndexes()` / `getBlindIndex()` for SELECT queries.
* `decryptValue()` / `decryptRow()` / `decryptManyRows()` for decrypting after the
  SELECT query.

The encrypt/decrypt APIs were named more verbosely than simply `encrypt()`/`decrypt()` to ensure that the intent is
communicated whenever a developer works with them.
#### EncryptedField: Searchable Encryption for a Single Column
**[`EncryptedField`](https://github.com/paragonie/ciphersweet/tree/master/docs#encryptedfield)**
is a minimalistic interface for encrypting a single column of a database table.
`EncryptedField` is designed for projects that only ever need to encrypt a single field, but still want to be able to search on the values of this field.
```php
<?php
use ParagonIE\CipherSweet\BlindIndex;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedField;
use ParagonIE\CipherSweet\Transformation\LastFourDigits;

/** @var CipherSweet $engine */
$ssn = (new EncryptedField($engine, 'contacts', 'ssn'))
    // Blind index over the full SSN (third argument: output size in bits):
    ->addBlindIndex(
        new BlindIndex('contact_ssn_full', [], 8)
    )
    // Blind index over only the last four digits of the SSN:
    ->addBlindIndex(
        new BlindIndex('contact_ssn_last_four', [new LastFourDigits()], 4)
    );
```
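A sketch of how this field might be used for the INSERT, SELECT, and decrypt steps follows. It assumes CipherSweet was installed with Composer and that the constructor signatures match the ones shown earlier in this post; check them against the version you have installed. The sample SSN and variable names are illustrative.

```php
<?php
use ParagonIE\CipherSweet\Backend\ModernCrypto;
use ParagonIE\CipherSweet\BlindIndex;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedField;
use ParagonIE\CipherSweet\KeyProvider\StringProvider;
use ParagonIE\CipherSweet\Transformation\LastFourDigits;

// Assumption: CipherSweet was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

// Engine setup, mirroring the constructor calls shown earlier in this post.
$backend = new ModernCrypto();
$engine = new CipherSweet(
    new StringProvider(
        $backend,
        '4e1c44f87b4cdf21808762970b356891db180a9dd9850e7baf2a79ff3ab8a2fc'
    )
);

// Field definition repeated from above so this snippet runs on its own.
$ssn = (new EncryptedField($engine, 'contacts', 'ssn'))
    ->addBlindIndex(new BlindIndex('contact_ssn_full', [], 8))
    ->addBlindIndex(
        new BlindIndex('contact_ssn_last_four', [new LastFourDigits()], 4)
    );

// INSERT / UPDATE: one call returns the ciphertext and all blind indexes.
// $indexes is keyed by index name; the exact shape of each entry may
// differ between CipherSweet versions.
list($ciphertext, $indexes) = $ssn->prepareForStorage('123-45-6789');

// SELECT: recompute one blind index from the search term, then use it in
// the WHERE clause instead of the plaintext.
$lastFour = $ssn->getBlindIndex('6789', 'contact_ssn_last_four');
var_dump($lastFour === $indexes['contact_ssn_last_four']);

// After the SELECT: decrypt the stored ciphertext.
var_dump($ssn->decryptValue($ciphertext) === '123-45-6789');
```

Because a blind index is truncated, a matching index value can be a false positive, so the application should decrypt the returned rows and compare the plaintext before trusting a match.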
#### EncryptedRow: Searchable Encryption for Many Columns in One Table
**[`EncryptedRow`](https://github.com/paragonie/ciphersweet/tree/master/docs#encryptedrow)**
is a more powerful API that operates on rows of data at a time.
`EncryptedRow` is designed for projects that encrypt multiple fields and/or wish to create [compound blind indexes](#compound-blind-indexes).
It also has built-in handling for integers, floating point numbers, and (nullable) boolean values.
This handling doesn't leak the size of the stored values through the ciphertext length:
```php
<?php
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedRow;

/** @var CipherSweet $engine */
$row = (new EncryptedRow($engine, 'contacts'))
    ->addTextField('first_name')
    ->addTextField('last_name')
    ->addTextField('ssn')
    ->addBooleanField('hivstatus')
    ->addFloatField('latitude')
    ->addFloatField('longitude')
    ->addIntegerField('birth_year');
```
`EncryptedRow` expects an array that maps column names to values, like so:
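```php
<?php
$input = [
    // Not registered on $row above, so it is passed through as-is:
    'contactid' => 12345,
    'first_name' => 'Jane',
    'last_name' => 'Doe',
    'ssn' => '123-45-6789',
    'hivstatus' => false,
    'latitude' => 52.52,
    'longitude' => -33.106,
    'birth_year' => 1988,
    // Also not registered, so it is passed through as-is:
    'extraneous' => true
];
```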
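As a sketch of the round trip, the row definition and input above can be passed through `prepareRowForStorage()` and `decryptRow()`. This assumes CipherSweet was installed with Composer and that the constructor signatures match the ones shown earlier in this post; the variable names are illustrative.

```php
<?php
use ParagonIE\CipherSweet\Backend\ModernCrypto;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedRow;
use ParagonIE\CipherSweet\KeyProvider\StringProvider;

// Assumption: CipherSweet was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

// Engine setup, mirroring the constructor calls shown earlier in this post.
$backend = new ModernCrypto();
$engine = new CipherSweet(
    new StringProvider(
        $backend,
        '4e1c44f87b4cdf21808762970b356891db180a9dd9850e7baf2a79ff3ab8a2fc'
    )
);

// A shortened version of the row definition above.
$row = (new EncryptedRow($engine, 'contacts'))
    ->addTextField('first_name')
    ->addTextField('last_name')
    ->addBooleanField('hivstatus')
    ->addIntegerField('birth_year');

$input = [
    'contactid' => 12345,
    'first_name' => 'Jane',
    'last_name' => 'Doe',
    'hivstatus' => false,
    'birth_year' => 1988,
];

// INSERT / UPDATE: registered fields come back encrypted, everything else
// is left untouched. $indexes stays empty until blind indexes are added.
list($encrypted, $indexes) = $row->prepareRowForStorage($input);

// After the SELECT: registered fields are decrypted back to their
// original PHP types.
$decrypted = $row->decryptRow($encrypted);

var_dump($encrypted['contactid']);   // unregistered column, unchanged
var_dump($decrypted['first_name']);
var_dump($decrypted['birth_year']);
```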
#### EncryptedMultiRows: Searchable Encryption for Many Tables
**[`EncryptedMultiRows`](https://github.com/paragonie/ciphersweet/tree/master/docs#encryptedmultirows)**
is a multi-row abstraction designed to make it easier to work on heavily-normalized databases
and integrate CipherSweet with ORMs (e.g. Eloquent).
Under the hood, it maintains an internal array of `EncryptedRow` objects (one for each table), so
the features that `EncryptedRow` provides are also usable from `EncryptedMultiRows`.
Anyone who has used `EncryptedRow` should find the `EncryptedMultiRows` API familiar.
```php
<?php
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedMultiRows;

/** @var CipherSweet $engine */
$rowSet = (new EncryptedMultiRows($engine))
    ->addTextField('contacts', 'first_name')
    ->addTextField('contacts', 'last_name')
    ->addTextField('contacts', 'ssn')
    ->addBooleanField('contacts', 'hivstatus')
    ->addFloatField('contacts', 'latitude')
    ->addFloatField('contacts', 'longitude')
    ->addIntegerField('contacts', 'birth_year')
    ->addTextField('foobar', 'test');
```
`EncryptedMultiRows` expects an array of table names mapped to an array that in turn maps columns to values,
like so:
```php
<?php
$input = [
    'contacts' => [
        'contactid' => 12345,
        'first_name' => 'Jane',
        'last_name' => 'Doe',
        'ssn' => '123-45-6789',
        'hivstatus' => null, // unknown
        'latitude' => 52.52,
        'longitude' => -33.106,
        'birth_year' => 1988,
        'extraneous' => true
    ],
    'foobar' => [
        'foobarid' => 23,
        'contactid' => 12345,
        'test' => 'paragonie'
    ]
];
```
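A sketch of the multi-table round trip follows, using a shortened version of the definition and input above. It assumes CipherSweet was installed with Composer and that the constructor signatures match the ones shown earlier in this post; the variable names are illustrative.

```php
<?php
use ParagonIE\CipherSweet\Backend\ModernCrypto;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedMultiRows;
use ParagonIE\CipherSweet\KeyProvider\StringProvider;

// Assumption: CipherSweet was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

// Engine setup, mirroring the constructor calls shown earlier in this post.
$backend = new ModernCrypto();
$engine = new CipherSweet(
    new StringProvider(
        $backend,
        '4e1c44f87b4cdf21808762970b356891db180a9dd9850e7baf2a79ff3ab8a2fc'
    )
);

$rowSet = (new EncryptedMultiRows($engine))
    ->addTextField('contacts', 'last_name')
    ->addBooleanField('contacts', 'hivstatus')
    ->addTextField('foobar', 'test');

$input = [
    'contacts' => [
        'contactid' => 12345,
        'last_name' => 'Doe',
        'hivstatus' => null, // unknown
    ],
    'foobar' => [
        'foobarid' => 23,
        'contactid' => 12345,
        'test' => 'paragonie',
    ],
];

// INSERT / UPDATE: every registered field in every table is encrypted.
list($encrypted, $indexes) = $rowSet->prepareForStorage($input);

// After the SELECT: decrypt all tables in one call.
$decrypted = $rowSet->decryptManyRows($encrypted);

var_dump($decrypted['contacts']['last_name']);
var_dump($decrypted['contacts']['hivstatus']); // nullable boolean
var_dump($decrypted['foobar']['test']);
```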
### Compound Blind Indexes

A compound blind index is calculated from more than one field of an `EncryptedRow`:

```php
<?php
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\Transformation\AlphaCharactersOnly;
use ParagonIE\CipherSweet\Transformation\FirstCharacter;
use ParagonIE\CipherSweet\Transformation\Lowercase;
use ParagonIE\CipherSweet\Transformation\LastFourDigits;
use ParagonIE\CipherSweet\EncryptedRow;

/** @var EncryptedRow $row */
$row->addCompoundIndex(
    $row->createCompoundIndex(
        'contact_first_init_last_name',
        ['first_name', 'last_name'],
        64, // 64 bits = 8 bytes
        true // use the fast hash
    )
        // Transforms run in the order they are added, per column:
        ->addTransform('first_name', new AlphaCharactersOnly())
        ->addTransform('first_name', new Lowercase())
        ->addTransform('first_name', new FirstCharacter())
        ->addTransform('last_name', new AlphaCharactersOnly())
        ->addTransform('last_name', new Lowercase())
);
```
This gives you a case-insensitive index of **first initial + last name**.
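As a sketch of how such an index might be queried, the snippet below computes the compound index for a stored row and for a differently formatted search term, then compares them. It assumes CipherSweet was installed with Composer and that the constructor signatures match the ones shown earlier in this post; the sample names are illustrative.

```php
<?php
use ParagonIE\CipherSweet\Backend\ModernCrypto;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedRow;
use ParagonIE\CipherSweet\KeyProvider\StringProvider;
use ParagonIE\CipherSweet\Transformation\AlphaCharactersOnly;
use ParagonIE\CipherSweet\Transformation\FirstCharacter;
use ParagonIE\CipherSweet\Transformation\Lowercase;

// Assumption: CipherSweet was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

// Engine setup, mirroring the constructor calls shown earlier in this post.
$backend = new ModernCrypto();
$engine = new CipherSweet(
    new StringProvider(
        $backend,
        '4e1c44f87b4cdf21808762970b356891db180a9dd9850e7baf2a79ff3ab8a2fc'
    )
);

$row = (new EncryptedRow($engine, 'contacts'))
    ->addTextField('first_name')
    ->addTextField('last_name');

// Same compound index as above: first initial + last name.
$row->addCompoundIndex(
    $row->createCompoundIndex(
        'contact_first_init_last_name',
        ['first_name', 'last_name'],
        64,
        true
    )
        ->addTransform('first_name', new AlphaCharactersOnly())
        ->addTransform('first_name', new Lowercase())
        ->addTransform('first_name', new FirstCharacter())
        ->addTransform('last_name', new AlphaCharactersOnly())
        ->addTransform('last_name', new Lowercase())
);

// Index value that would be stored alongside the row for "Jane Doe".
$fromStored = $row->getBlindIndex(
    'contact_first_init_last_name',
    ['first_name' => 'Jane', 'last_name' => 'Doe']
);

// Index value computed from a search for "J. DOE".
$fromSearch = $row->getBlindIndex(
    'contact_first_init_last_name',
    ['first_name' => 'J.', 'last_name' => 'DOE']
);

var_dump($fromSearch === $fromStored);
```

Both inputs reduce to the same transformed values (`j` and `doe`) before hashing, which is why the search term does not need to match the stored casing or punctuation.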
### Key/Backend Rotation

When you need to move to a new key or backend, a rotator re-encrypts existing data from the old engine to the new one:

```php
<?php
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\KeyRotation\FieldRotator;
use ParagonIE\CipherSweet\EncryptedField;

// 1. Set up
/**
 * @var string $ciphertext
 * @var CipherSweet $old
 * @var CipherSweet $new
 */
$oldField = new EncryptedField($old, 'contacts', 'ssn');
$newField = new EncryptedField($new, 'contacts', 'ssn');

$rotator = new FieldRotator($oldField, $newField);

// 2. Using the rotator: re-encrypt only if the ciphertext needs it
if ($rotator->needsReEncrypt($ciphertext)) {
    list($ciphertext, $indices) = $rotator->prepareForUpdate($ciphertext);
    // Then update this row in the database.
}
```
You can learn more about the various migration features [here](https://github.com/paragonie/ciphersweet/tree/master/docs#keybackend-rotation).
## Upcoming Developments in CipherSweet
One of the items on [our roadmap for PHP security in 2019](https://paragonie.com/blog/2019/01/our-php-security-roadmap-for-year-2019)
is to bring CipherSweet to your favorite framework, with as little friction as possible.

To this end, we will be releasing ORM integrations throughout Q1 2019, starting with Eloquent and Doctrine.

Additionally, we plan on shipping `KeyProvider` implementations to integrate with cloud KMS solutions and common HSM solutions (e.g. YubiHSM). These will be standalone packages that extend the core functionality of CipherSweet to allow businesses and government offices to meet their stringent security compliance requirements without polluting the main library with code to tolerate oddly-specific requirements.

When both of these developments have been completed, adopting searchable encryption in your PHP software should be as painless as possible.

Finally, we want to develop CipherSweet beyond the PHP language. We want to provide compatible implementations for Java, C#, and Node.js developers in our initial run, although we're happy to assist the open source community in developing and auditing compatible libraries in other languages.

Honorable mention: Ryan Littlefield has already started on an early [Python implementation of CipherSweet](https://github.com/rlittlefield/pyciphersweet).
### Support the Development of CipherSweet
If you'd like to support our development efforts, please consider purchasing an
[enterprise support contract](https://paragonie.com/enterprise) from our company.