On Fri, Jul 19, 2019 at 12:04:36PM +0200, Antonin Houska wrote:
> Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> > On Mon, Jul 15, 2019 at 03:42:39PM -0400, Bruce Momjian wrote:
> > > On Sat, Jul 13, 2019 at 11:58:02PM +0200, Tomas Vondra wrote:
> > > > One extra thing we should consider is authenticated encryption. We
> > > > can't just encrypt the pages (no matter which AES mode is used -
> > > > XTS/CBC/...), as that does not provide integrity protection (i.e. we
> > > > can't detect when the ciphertext was corrupted due to disk failure
> > > > or intentionally). And we can't quite rely on checksums, because the
> > > > checksum is computed on the plaintext and stored encrypted.
> > >
> > > Uh, if someone modifies a few bytes of the page, we will decrypt it,
> > > but the checksum (per-page or WAL) will not match our decrypted
> > > output. How would they make it match the checksum without already
> > > knowing the key? I read [1] but could not see that explained.
> > >
> > Our checksum is only 16 bits, so perhaps one way would be to just
> > generate 64k of randomly modified pages and hope one of them happens
> > to hit the right checksum value. Not sure how practical such an attack
> > is, but it does require just filesystem access.
> I don't think you can easily generate 64k of different checksums this
> way. If the data is random, I suppose that each set of 2^(128 - 16)
> blocks will contain the same checksum after decryption. Thus even if
> you generate 64k of different ciphertext blocks that contain the
> checksum, some (many?) checksums will be duplicates. Unfortunately the
> math to describe this problem does not seem to be trivial.
I'm not sure what your point is, or why you care about the 128 bits, but I
don't think the math is very complicated (and it's exactly the same with or
without encryption). The probability of a checksum collision for a randomly
modified page is 1/64k, so p = ~0.00153%. So the probability of *not*
getting a collision is (1-p) = 99.9985%, and with N pages the probability
of no collisions is pow((1-p),N), which behaves like this:
        N    pow((1-p),N)
   ----------------------
    10000        85%
    20000        73%
    30000        63%
    46000        49%
   200000         4%
So with a 1.6GB relation (200000 pages) you have about a 96% chance of a
checksum collision.
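
If anyone wants to double-check those numbers, this trivial standalone C
program (nothing PostgreSQL-specific, it just evaluates the formula above)
prints the same table:

    /* build: cc collision.c -lm */
    #include <stdio.h>
    #include <math.h>

    int
    main(void)
    {
        double p = 1.0 / 65536;     /* chance one random page matches */
        long   ns[] = {10000, 20000, 30000, 46000, 200000};

        for (int i = 0; i < 5; i++)
            printf("%8ld  %5.1f%%\n", ns[i],
                   100 * pow(1.0 - p, (double) ns[i]));
        return 0;
    }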
> Also note that if you try to generate ciphertext whose decryption will
> result in a particular checksum value, you can hardly control the other
> 14 bytes of the block, which in turn are used to verify the checksum.
But we don't care about those 14 bytes. In fact, we want the page header
(which includes both the checksum and the other 14B in the block) to
remain unchanged - the attack only needs to modify the remaining parts of
the 8kB page in a way that generates the same checksum on the plaintext.
And that's not that hard to do, IMHO, because the header is stored at the
beginning of the page. So we can just randomly modify the last AES block
(the last 16B on the page), which confines the corruption to that block.
Now, I'm not saying this attack is particularly practical - it would
generate a fair number of checksum failures before getting the first
collision, so it'd trigger quite a few alerts, I guess.
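
To illustrate, here's a rough standalone sketch of that brute force. It
simulates the attacker's position after decryption - randomizing the last
ciphertext block makes the last 16 plaintext bytes effectively random, so
the sketch just randomizes them directly. The 16-bit checksum is a made-up
stand-in, not the real pg_checksum_page(); only the 16-bit output size
matters for the argument:

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    #define BLCKSZ 8192

    /* stand-in 16-bit page checksum (FNV-1a folded to 16 bits) */
    static uint16_t
    checksum16(const uint8_t *page)
    {
        uint32_t h = 2166136261u;

        for (int i = 0; i < BLCKSZ; i++)
            h = (h ^ page[i]) * 16777619u;
        return (uint16_t) (h ^ (h >> 16));
    }

    int
    main(void)
    {
        uint8_t  page[BLCKSZ];
        long     attempts = 0;
        uint16_t target;

        srandom(42);
        for (int i = 0; i < BLCKSZ; i++)
            page[i] = random() & 0xFF;
        target = checksum16(page);  /* checksum stored in the header */

        /* randomize the last "AES block" until the checksum collides */
        do
        {
            attempts++;
            for (int i = BLCKSZ - 16; i < BLCKSZ; i++)
                page[i] = random() & 0xFF;
        } while (checksum16(page) != target);

        printf("collision after %ld attempts\n", attempts);
        return 0;
    }

On average that loop needs ~64k attempts, matching the math above.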
> > FWIW our CRC algorithm is not quite an HMAC, because it's neither
> > keyed nor a cryptographic hash algorithm. Now, maybe we don't want
> > authenticated encryption (e.g. XTS is not authenticated, unlike
> > GCM/CCM).
> I'm also not sure if we should try to guarantee data authenticity /
> integrity. As someone already mentioned elsewhere, a page MAC does not
> help if the whole page is replaced. (An extreme case is that an old
> filesystem snapshot containing the whole data directory is restored,
> although that will probably make the database crash soon.)
>
> We can guarantee integrity and authenticity of a backup, but that's a
> separate feature: someone may need it even though it's OK for them to
> run the cluster unencrypted.
Yes, I do agree with that. I think attempts to guarantee data authenticity
and/or integrity at the page level are mostly futile (replay attacks are an
example of why). IMHO we should consider that to be outside the threat
model TDE is expected to address.
IMO a better way to handle authenticity/integrity would be based on WAL,
which is essentially an authoritative log of operations. We should be able
to parse WAL, deduce the expected state (min LSN, checksum) for each page,
and validate the cluster state based on that.
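
Very roughly, the bookkeeping might look like the toy below. The real
thing would of course decode actual WAL (via xlogreader) instead of a
hard-coded array, and identify pages by (relfilenode, fork, block) rather
than a bare block number - this just shows the idea:

    #include <stdio.h>
    #include <stdint.h>

    typedef struct { uint32_t blkno; uint64_t lsn; } WalTouch;
    typedef struct { uint32_t blkno; uint64_t pd_lsn; } PageState;

    int
    main(void)
    {
        /* pretend these were decoded from WAL, in LSN order */
        WalTouch  wal[] = {{0, 100}, {1, 120}, {0, 150}, {2, 180}};
        /* pretend these pd_lsn values were read (and decrypted) from
         * disk; block 1 was replaced with a stale copy */
        PageState pages[] = {{0, 150}, {1, 90}, {2, 180}};
        uint64_t  expected[3] = {0, 0, 0};  /* newest WAL LSN per block */

        for (int i = 0; i < 4; i++)
            if (wal[i].lsn > expected[wal[i].blkno])
                expected[wal[i].blkno] = wal[i].lsn;

        for (int i = 0; i < 3; i++)
        {
            if (pages[i].pd_lsn < expected[pages[i].blkno])
                printf("block %u: stale page (pd_lsn %llu < %llu)\n",
                       pages[i].blkno,
                       (unsigned long long) pages[i].pd_lsn,
                       (unsigned long long) expected[pages[i].blkno]);
            else
                printf("block %u: ok\n", pages[i].blkno);
        }
        return 0;
    }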
I still think that having to decrypt the page in order to verify its
checksum (because the header is part of the encrypted page, and the
checksum is computed from the plaintext version) is not great.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services