Hi all,

We're working with Jack on a design for encryption of Iceberg data tables,
and have a question / decision point we'd like to bring to the community's
attention. It might be a bit exotic, but it is important, so we'd like to
raise it here. Any input on this subject, or pointers to relevant contacts /
sources, will be appreciated.

The text below is rather long; I tried to keep it as short as possible
while still explaining the question.

We use the standard envelope encryption approach, where the data is
encrypted with a "data encryption key" (DEK). There are lots of DEKs in a
table, because we must generate a key per file/column (this is related to
NIST requirements for cipher usage). Envelope encryption means that the
many DEKs are encrypted with a few MEKs ("master encryption keys"). There
could be just one MEK for the whole table, or for many tables; or a MEK per
sensitive column. The MEKs are managed in a KMS ("key management service")
- which stores them, and handles their access control.

DEKs encrypted with MEKs are stored close to the data; currently, in the
"key_metadata" field in the Iceberg manifest files.

Envelope encryption practice requires "key rotation", where the MEKs are
replaced from time to time (every few weeks or months, as a precaution, or
to limit the number of crypto operations as required by NIST; or after
being compromised). The MEK ID stays the same, but the key contents (and
version) change.

This means we have to delete all manifest files after the key rotation,
because they keep DEKs encrypted with the previous version of a MEK, which
is not safe anymore.
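
To make the problem concrete, here is a minimal Python sketch of the single
envelope case. It is illustrative only, not the proposed design: toy_wrap /
toy_unwrap are a hash-based stand-in for a real key-wrapping cipher such as
AES-GCM, mek_versions is a hypothetical in-memory KMS, and the file names
and the (version, wrapped DEK) layout of the key_metadata entries are
assumptions made for the example.

```python
import hashlib
import hmac
import secrets


def _pad(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy keystream from SHA-256; a placeholder for a real cipher, NOT secure.
    out, i = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + bytes([i])).digest()
        i += 1
    return out[:n]


def toy_wrap(wrapping_key: bytes, key: bytes) -> bytes:
    # Returns nonce || masked key || short integrity tag.
    nonce = secrets.token_bytes(8)
    body = bytes(a ^ b for a, b in zip(key, _pad(wrapping_key, nonce, len(key))))
    tag = hmac.new(wrapping_key, nonce + body, hashlib.sha256).digest()[:8]
    return nonce + body + tag


def toy_unwrap(wrapping_key: bytes, blob: bytes) -> bytes:
    nonce, body, tag = blob[:8], blob[8:-8], blob[-8:]
    good = hmac.new(wrapping_key, nonce + body, hashlib.sha256).digest()[:8]
    if not hmac.compare_digest(tag, good):
        raise ValueError("wrong wrapping key or corrupted blob")
    return bytes(a ^ b for a, b in zip(body, _pad(wrapping_key, nonce, len(body))))


# Hypothetical KMS state: one MEK ID, list index = MEK version.
mek_versions = [secrets.token_bytes(16)]
kms_calls = 0


def kms_wrap(dek: bytes):
    global kms_calls
    kms_calls += 1
    version = len(mek_versions) - 1
    return version, toy_wrap(mek_versions[version], dek)


def kms_unwrap(version: int, blob: bytes) -> bytes:
    global kms_calls
    kms_calls += 1
    return toy_unwrap(mek_versions[version], blob)


# One DEK per data file; key_metadata = (MEK version, DEK wrapped with the MEK).
manifest = {f"data-{i}.parquet": kms_wrap(secrets.token_bytes(16)) for i in range(5)}

# MEK rotation: same MEK ID, new key contents and version.
mek_versions.append(secrets.token_bytes(16))
current = len(mek_versions) - 1

# Every entry now references the previous MEK version.
stale = [path for path, (version, _) in manifest.items() if version != current]

# Manifests are immutable, so fixing this means writing new manifests,
# with one KMS unwrap and one KMS wrap per data file.
kms_calls = 0
new_manifest = {
    path: kms_wrap(kms_unwrap(version, blob))
    for path, (version, blob) in manifest.items()
}
print(len(stale), "stale entries,", kms_calls, "KMS calls to re-wrap them")
```

In this sketch the cost of a rotation grows with the number of data files,
both in KMS calls and in manifest entries that must be rewritten.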

To avoid this - and to minimize Iceberg-KMS interactions (KMSs can be
slow), we add a double envelope encryption mode (already in use in
Parquet), where DEKs are encrypted with an intermediate KEK ("key
encryption key"), which in turn is encrypted with a MEK. There are fewer
KEKs than DEKs (e.g. one KEK per writer process lifetime; or per day; or
per N DEKs; or per partition; or per table; etc), but there are more KEKs
than MEKs.

The question is - upon MEK rotation, do we have to replace KEKs?
If not, then we can keep DEKs encrypted with KEKs in the manifest files -
which do not have to be deleted / replaced upon MEK rotation. KEKs
encrypted with MEKs will be kept elsewhere, in a mutable/replaceable medium
(which is easier, because there are far fewer KEKs than DEKs).
If yes, then we either have to replace all manifest files in a table (once
every few weeks or months), or keep "key_metadata" outside manifests,
e.g. in new file types (like with the bloom filters). The size of the
key_metadata entry - per data file - ranges from a few dozen bytes to a few
dozen kilobytes.
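
For reference, the mechanics of the "if not" option can be sketched in
Python as below. This is only an illustration under stated assumptions, and
it does not answer the question itself: toy_wrap / toy_unwrap are a
hash-based stand-in for a real key-wrapping cipher, mek_versions is a
hypothetical in-memory KMS, kek_store stands for the mutable medium, and
the KEK ID, file names and entry layouts are invented for the example.

```python
import hashlib
import hmac
import secrets


def _pad(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy keystream from SHA-256; a placeholder for a real cipher, NOT secure.
    out, i = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + bytes([i])).digest()
        i += 1
    return out[:n]


def toy_wrap(wrapping_key: bytes, key: bytes) -> bytes:
    # Returns nonce || masked key || short integrity tag.
    nonce = secrets.token_bytes(8)
    body = bytes(a ^ b for a, b in zip(key, _pad(wrapping_key, nonce, len(key))))
    tag = hmac.new(wrapping_key, nonce + body, hashlib.sha256).digest()[:8]
    return nonce + body + tag


def toy_unwrap(wrapping_key: bytes, blob: bytes) -> bytes:
    nonce, body, tag = blob[:8], blob[8:-8], blob[-8:]
    good = hmac.new(wrapping_key, nonce + body, hashlib.sha256).digest()[:8]
    if not hmac.compare_digest(tag, good):
        raise ValueError("wrong wrapping key or corrupted blob")
    return bytes(a ^ b for a, b in zip(body, _pad(wrapping_key, nonce, len(body))))


# Hypothetical KMS state: one MEK ID, list index = MEK version.
mek_versions = [secrets.token_bytes(16)]

# One KEK for all files here (an arbitrary granularity for the sketch).
kek_id = "kek-0001"
kek = secrets.token_bytes(16)

# Mutable medium: KEK ID -> (MEK version, KEK wrapped with the MEK).
kek_store = {kek_id: (0, toy_wrap(mek_versions[0], kek))}

# Immutable manifests: key_metadata = (KEK ID, DEK wrapped with the KEK).
deks = {f"data-{i}.parquet": secrets.token_bytes(16) for i in range(1000)}
manifest = {path: (kek_id, toy_wrap(kek, dek)) for path, dek in deks.items()}
manifest_before = dict(manifest)

# MEK rotation: same MEK ID, new key contents and version.
mek_versions.append(secrets.token_bytes(16))
new_version = len(mek_versions) - 1

# Re-wrap only the KEKs. Note the KEK bytes themselves do not change;
# whether that is acceptable is exactly the open question.
rewrapped = 0
for kid, (version, wrapped_kek) in list(kek_store.items()):
    plain_kek = toy_unwrap(mek_versions[version], wrapped_kek)
    kek_store[kid] = (new_version, toy_wrap(mek_versions[new_version], plain_kek))
    rewrapped += 1


def read_dek(path: str) -> bytes:
    # Reader path: manifest entry -> KEK store -> KMS (MEK) -> KEK -> DEK.
    kid, wrapped_dek = manifest[path]
    version, wrapped_kek = kek_store[kid]
    return toy_unwrap(toy_unwrap(mek_versions[version], wrapped_kek), wrapped_dek)


print(rewrapped, "KEK entries re-wrapped;",
      "manifest unchanged:", manifest == manifest_before)
```

Here a rotation touches one kek_store entry and no manifest entries. In the
"if yes" option the KEK bytes would be replaced as well, so every wrapped
DEK would change, which is what forces new manifests or a separate
key_metadata file.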

Cheers, Gidon
