Hi Ryan,

I believe I’ve addressed most of the questions below, at least in the Google 
Doc. I’ll go ahead and start on the implementation as outlined in the current 
document.

Let me know if there are any further concerns.

-Matt Cheah

From: Ryan Blue <rb...@netflix.com>
Reply-To: "rb...@netflix.com" <rb...@netflix.com>
Date: Monday, December 24, 2018 at 11:12 AM
To: Matt Cheah <mch...@palantir.com>
Cc: Iceberg Dev List <dev@iceberg.apache.org>, "Yifei Huang (PD)" 
<yif...@palantir.com>, Vinoo Ganesh <vgan...@palantir.com>
Subject: Re: Iceberg Encryption Proposal

Hi Matt,

Thanks for putting this proposal together! It all seems reasonable to me. I 
just have a few questions and comments about scope and use:

· Is encrypted Iceberg metadata out of scope?

· Are authentication tags out of scope (like those used in Parquet)?

· I think one requirement should be that Iceberg doesn’t necessarily leak 
the association of data files to keys. In that case, I’d prefer an opaque 
byte array of “key metadata” instead of the existing struct. That allows 
encrypting the key metadata later to avoid the leak.

· Using an opaque byte array would also support storing more than one 
encryption key reference for per-column encryption. If that were done, the key 
returned by the get/put API might need to be more flexible.

· This should also describe how to pass the key metadata to file formats 
that support encryption (or explicitly state that’s out of scope).

· I’d like a little more detail on how this could look up keys on the 
driver and distribute them to tasks safely, to avoid the thundering herd 
problem on the key server.
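
To make the last point concrete, here is a minimal sketch in Java of resolving 
keys once on the driver instead of from every task. It is only an illustration 
under stated assumptions: KeyService, FileEntry, and resolveKeysOnDriver are 
hypothetical names, not part of the proposal or of any Iceberg API, and key 
metadata is treated as an opaque byte array as suggested above. The sketch 
covers only deduplicating and batching the lookup; how the resolved keys are 
shipped to tasks securely is left open.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DriverKeyLookupSketch {

  /** Hypothetical key server client (an assumption, not an Iceberg interface). */
  public interface KeyService {
    // One round trip that resolves a whole batch of opaque key metadata blobs.
    Map<ByteBuffer, byte[]> fetchKeys(Set<ByteBuffer> keyMetadata);
  }

  /** Hypothetical stand-in for a data file entry carrying opaque key metadata. */
  public static final class FileEntry {
    final String path;
    final ByteBuffer keyMetadata;

    public FileEntry(String path, byte[] keyMetadata) {
      this.path = path;
      // ByteBuffer compares by content, so equal blobs collapse in a Set or Map.
      this.keyMetadata = ByteBuffer.wrap(keyMetadata.clone()).asReadOnlyBuffer();
    }
  }

  /** Driver side: dedupe key metadata, issue one batched request, map path to key. */
  public static Map<String, byte[]> resolveKeysOnDriver(
      List<FileEntry> files, KeyService service) {
    Set<ByteBuffer> unique = new LinkedHashSet<>();
    for (FileEntry file : files) {
      unique.add(file.keyMetadata);
    }

    // A single call from the driver, rather than one call per task.
    Map<ByteBuffer, byte[]> resolved = service.fetchKeys(unique);

    Map<String, byte[]> keysByPath = new HashMap<>();
    for (FileEntry file : files) {
      byte[] key = resolved.get(file.keyMetadata);
      if (key == null) {
        throw new IllegalStateException("No key returned for " + file.path);
      }
      keysByPath.put(file.path, key);
    }
    return keysByPath;
  }

  public static void main(String[] args) {
    // In-memory stand-in for a real key server; returns a dummy key per blob.
    KeyService fake = requested -> {
      Map<ByteBuffer, byte[]> out = new HashMap<>();
      for (ByteBuffer meta : requested) {
        out.put(meta, new byte[16]);
      }
      return out;
    };

    List<FileEntry> files = Arrays.asList(
        new FileEntry("a.parquet", "key-1".getBytes(StandardCharsets.UTF_8)),
        new FileEntry("b.parquet", "key-1".getBytes(StandardCharsets.UTF_8)),
        new FileEntry("c.parquet", "key-2".getBytes(StandardCharsets.UTF_8)));

    Map<String, byte[]> keys = resolveKeysOnDriver(files, fake);
    System.out.println("Resolved keys for " + keys.size() + " files");
  }
}
```

With this shape the key server sees one request per job, sized by the number 
of distinct key metadata values rather than by the number of tasks or files.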

Thanks!

rb

On Wed, Dec 12, 2018 at 11:44 AM Matt Cheah <mch...@palantir.com> wrote:

Hi everyone,

Encrypting data written to Iceberg tables is crucial for using this technology 
securely in industry settings. Towards that end, I’ve proposed an API for 
supporting encryption, including how users can implement their own custom 
encryption key providers and the metadata we’ll need to store in manifests.

You can find the full spec here: 
https://docs.google.com/document/d/1LptmFB7az2rLnou27QK_KKHgjcA5vKza0dWj4h8fkno/edit

The GitHub ticket tracking this is here: 
https://github.com/apache/incubator-iceberg/issues/20

Feel free to provide feedback in comments on the document.

Thanks!

-Matt Cheah

-- 
Ryan Blue
Software Engineer
Netflix

