Hi Gidon and Huaxin,

Thanks for continuing the effort on Iceberg encryption support. I have not
had enough time to work on this area since the design discussion. So far I
have only managed to add key metadata for manifest files, and there are
quite a few changes in our internal branch that I need to port to open
source. I will start on that in the next few days.

Regarding the design, I wonder if we should start by defining the
actions API with a Spark implementation for file encryption key rotation,
and then discuss the user experience.

In the original design document, I think we did not reach a consensus with
the community on how to actually expose key rotation functionality.
In Spark, we can either do it through a DDL extension or implement it as a
procedure. Given that this is a long-running distributed operation, my
feeling is that the community will lean towards a procedure call.

We can continue this discussion while working on the detailed
implementation first. Let's set up a discussion on this so that we can
align our efforts.

Best,
Jack Ye


On Wed, Aug 25, 2021 at 4:19 AM Gidon Gershinsky <gg5...@gmail.com> wrote:

> Hi all,
>
> We have briefly discussed this subject in a June sync, with a decision to
> continue via the mailing list.
> There are a number of pull requests from Jack and myself that implement a
> set of disjoint elements from the high-level design
> <https://docs.google.com/document/d/1kkcjr9KrlB9QagRX3ToulG_Rf-65NMSlVANheDNzJq4/edit?usp=sharing>.
> Some low-level details, such as generation and propagation of data keys,
> are not covered in this document.
> I have created a short (and hopefully simple) doc
> <https://docs.google.com/document/d/19O_qiQumz_66CdWLpw38GFJEsUpnNxXckP9rnYIQnCo/edit?usp=sharing>
> that focuses on these details and describes the bottom-up approach to
> generation of data keys, encryption of data/delete files, and
> options/phases for optimization of key management. The scope of the
> document is intentionally narrow, and currently focuses on the simplest,
> minimal option. Reviews are very welcome. Later, this doc will be merged
> in (or referenced from) the master design document.
>
> Huaxin recently sent a PR with a basic encryption DDL; you can find it
> here <https://github.com/apache/iceberg/pull/3013>. Next week,
> I'll send a pull request with an implementation of the minimal encryption
> option. This pull request collects the basics from my PRs 2639, 2638, 2640
> and Jack's PR 2443, and adds the key generation and other code needed for
> an end-to-end implementation of the minimal design
> <https://docs.google.com/document/d/19O_qiQumz_66CdWLpw38GFJEsUpnNxXckP9rnYIQnCo/edit?usp=sharing>.
> This PR comes with an example proposed by Ryan - using a table encryption
> key from a keyfile ("pkcs12" format - the closest thing to the "pem" format
> for symmetric keys).
> Besides the minimal version, I have a draft implementation of more
> advanced data encryption options (including per-column keys, double
> wrapping and two-tier management - all described in the master design doc)
> - but let's take this one step at a time, starting with the simplest option.
>
> Cheers, Gidon
>
