On Sun, Jul 14, 2019 at 12:13:45PM -0400, Joe Conway wrote:
On 7/13/19 5:58 PM, Tomas Vondra wrote:
On Sat, Jul 13, 2019 at 02:41:34PM -0400, Joe Conway wrote:
[2] also says it provides additional support for AES 256. It also mentions
CBC versus XTS -- I came across this elsewhere and it bears discussion:
"Currently, the following pairs of encryption modes are supported:
AES-256-XTS for contents and AES-256-CTS-CBC for filenames
AES-128-CBC for contents and AES-128-CTS-CBC for filenames
Adiantum for both contents and filenames
If unsure, you should use the (AES-256-XTS, AES-256-CTS-CBC) pair.
AES-128-CBC was added only for low-powered embedded devices with crypto
accelerators such as CAAM or CESA that do not support XTS."
---
[2] also states this, which again makes me think of a table as being the
moral equivalent of a file:
"Unlike dm-crypt, fscrypt operates at the filesystem level rather than
at the block device level. This allows it to encrypt different files
with different keys and to have unencrypted files on the same
filesystem. This is useful for multi-user systems where each user’s
data-at-rest needs to be cryptographically isolated from the others.
However, except for filenames, fscrypt does not encrypt filesystem
metadata."
<snip>
[5] has this to say which seems independent of mode:
"When encrypting data with a symmetric block cipher, which uses blocks
of n bits, some security concerns begin to appear when the amount of
data encrypted with a single key comes close to 2n/2 blocks, i.e. n*2n/2
bits. With AES, n = 128 (AES-128, AES-192 and AES-256 all use 128-bit
blocks). This means a limit of more than 250 millions of terabytes,
which is sufficiently large not to be a problem. That's precisely why
AES was defined with 128-bit blocks, instead of the more common (at that
time) 64-bit blocks: so that data size is practically unlimited."
FWIW I was a bit confused at first, because the copy paste mangled the
formulas a bit - it should have been 2^(n/2) and n*2^(n/2).
Yeah, sorry about that.
But it goes on to say:
"I wouldn't use n*2^(n/2) bits in any sort of recommendation. Once you
reach that number of bits the probability of a collision will grow
quickly and you will be way over 50% probability of a collision by the
time you reach 2*n*2^(n/2) bits. In order to keep the probability of a
collision negligible I recommend encrypting no more than n*2^(n/4) bits
with the same key. In the case of AES that works out to 64GB"
It is hard to say if that recommendation is per key or per key+IV.
Hmm, yeah. The question is what collisions they have in mind? Presumably
it's AES(block1,key) = AES(block2,key) in which case it'd be with fixed
IV, so per key+IV.
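For reference, plugging n = 128 into that recommendation gives

    n * 2^(n/4) = 128 * 2^32 bits = 2^39 bits = 2^36 bytes = 64 GiB

i.e. roughly 68.7 GB, which is where the ~68 GB figure used further down
comes from.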
Seems likely.
But I did find that files in an encrypted file system are encrypted with
keys derived from a master key, and I view this as analogous to what we
are doing.
My understanding always was that we'd do something like that, i.e. we'd
have a master key (or perhaps multiple of them, for various users), but
the data would be encrypted with secondary (generated) keys, and those
secondary keys would be encrypted by the master key. At least that's
what was proposed at the beginning of this thread by Insung Moon.
In my email I linked the wrong page for [2]. The correct one is here:
[2] https://www.kernel.org/doc/html/latest/filesystems/fscrypt.html
Following that, I think we could end up with three tiers:
1. A master key encryption key (KEK): this is the key supplied by the
database admin using something akin to ssl_passphrase_command
2. A master data encryption key (MDEK): this is a key generated using a
cryptographically secure pseudo-random number generator. It is
encrypted using the KEK, probably with Key Wrap (KW) [KW], or maybe
better Key Wrap with Padding (KWP) [KWP].
3a. Per table data encryption keys (TDEK): use the MDEK and HKDF [HKDF] to
generate table-specific keys.
3b. WAL data encryption keys (WDEK): similarly, use the MDEK and HKDF to
generate new keys when needed for WAL (based on the other info above, we
need to change WAL keys every 68 GB, unless I read that wrong).
I believe that would allow us to have multiple keys, all derived securely
from the one MDEK using available info, similar to the way we intend to
use the LSN to derive the IVs -- perhaps table.oid for tables and
something else for WAL.
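To make the 3a/3b derivation a bit more concrete, here is a minimal sketch
(in Python) using HKDF [HKDF]. The salt/info strings, key sizes, and the
use of table.oid and a WAL key generation counter are just illustrative
assumptions, not a settled design:

# Sketch only: derive per-table and WAL keys from the MDEK using
# HKDF (RFC 5869). The MDEK itself would be generated with a CSPRNG
# and stored wrapped with the KEK (KW/KWP), which is not shown here.
import hmac, hashlib, os

HASH = hashlib.sha256

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # PRK = HMAC-Hash(salt, IKM)
    return hmac.new(salt, ikm, HASH).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    # OKM = first `length` bytes of T(1) || T(2) || ...
    okm, block, i = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([i]), HASH).digest()
        okm += block
        i += 1
    return okm[:length]

mdek = os.urandom(32)            # MDEK, 256 bits of key material

def derive_tdek(mdek: bytes, table_oid: int) -> bytes:
    # TDEK: bind the derived key to the table by mixing table.oid into info
    prk = hkdf_extract(b"TDE-salt", mdek)
    return hkdf_expand(prk, b"TDEK" + table_oid.to_bytes(4, "big"), 32)

def derive_wdek(mdek: bytes, generation: int) -> bytes:
    # WDEK: simple generation counter, bumped whenever we decide to
    # switch WAL keys (e.g. after ~68 GB of WAL)
    prk = hkdf_extract(b"TDE-salt", mdek)
    return hkdf_expand(prk, b"WDEK" + generation.to_bytes(8, "big"), 32)

A nice property of this is that the TDEK/WDEK never have to be stored
anywhere -- they can always be re-derived from the unwrapped MDEK plus
information we already have.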
We also need to figure out how/when to generate new WDEK. Maybe every
checkpoint, also meaning we would have to force a checkpoint every 68GB?
I think that very much depends on what exactly the 68GB refers to - key
or key+IV? If key+IV, then I suppose we can use the LSN as the IV and we
would not need to touch checkpoints. But it's not clear to me why we would
need to force checkpoints at all? Surely we can just write a WAL message
about switching to the new key, or something like that?
[HKDF]: https://tools.ietf.org/html/rfc5869
[KW]: https://tools.ietf.org/html/rfc3394
[KWP]: https://tools.ietf.org/html/rfc5649
But AFAICS the 2-tier key scheme is primarily motivated by operational
reasons, i.e. effort to rotate the master key etc. So I would not expect
to find recommendations to use multiple keys in sources primarily
dealing with cryptography.
It does in [2]
One extra thing we should consider is authenticated encryption. We can't
just encrypt the pages (no matter which AES mode is used - XTS/CBC/...),
as that does not provide integrity protection (i.e. can't detect when
the ciphertext was corrupted due to disk failure or intentionally). And
we can't quite rely on checksums, because those checksum the plaintext
and are stored encrypted.
I agree that authenticated encryption would be a good goal. I'm not sure
we need to require it for the first version, although it would mean
another option for the encryption type. That may be another good reason
to allow both AES 128 and AES 256 CTR/CBC in the first version, as it
will hopefully ensure that when we add different modes later it will be
less painful.
We could check the CRC prior to encryption and throw an ERROR if it is
not correct. After decryption we can check it again -- if it no longer
matches we would know there was a corruption or change of the
ciphertext, no?
Hmm, I guess the entire page of ciphertext could be faked including CRC,
so this would only really cover corruption, not an intentional change,
assuming the attacker does it properly.
I don't think any of the schemes discussed here provides protection
against this sort of replay attacks (i.e. replacing a page with an older
copy of the page). That would probably require having some global
checksum or something like that.
Which seems pretty annoying, because then the checksums won't verify the
data as sent to the storage system, and verifying checksums would require
access to all keys (how do you do that in offline mode?).
Given the scheme above I don't see why that would be an issue. The keys
are all accessible via the MDEK, which is in turn available via the KEK.
I just don't know how the offline tools will access the KMS to get the
keys. But maybe that's not an issue. But even then I think it's kinda
against the idea of checksums that they would not checksum what was sent
to the storage system.
But the main issue with checksum-then-encrypt is it's essentially
"MAC-then-Encrypt" and that does not provide Authenticated Encryption
security - see [1]. We should be looking at "Encrypt-then-MAC" instead,
in which case we'll need to store the MAC somewhere (probably in the
same place as the nonce/IV/key/... for each page).
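Roughly something like this -- a minimal sketch (in Python, using the
pyca/cryptography package and AES-CTR purely for illustration) just to
show the ordering; the key split, nonce construction and where the tag
lives are all open questions:

import hmac, hashlib
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def encrypt_page(enc_key: bytes, mac_key: bytes, nonce16: bytes, page: bytes):
    # 1. Encrypt the plaintext page (including its checksum).
    enc = Cipher(algorithms.AES(enc_key), modes.CTR(nonce16)).encryptor()
    ct = enc.update(page) + enc.finalize()
    # 2. MAC the ciphertext (plus the nonce), not the plaintext.
    tag = hmac.new(mac_key, nonce16 + ct, hashlib.sha256).digest()
    return ct, tag

def decrypt_page(enc_key: bytes, mac_key: bytes, nonce16: bytes,
                 ct: bytes, tag: bytes) -> bytes:
    # Verify the MAC first; refuse to decrypt corrupted/forged pages.
    expected = hmac.new(mac_key, nonce16 + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("page MAC verification failed")
    dec = Cipher(algorithms.AES(enc_key), modes.CTR(nonce16)).decryptor()
    return dec.update(ct) + dec.finalize()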
Yeah, that's why I think maybe this is a v2 feature.
Maybe - as long as we design it with enough flexibility to enable it
later, that might work. That depends on where we store the metadata,
etc.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services