Hi Stephane,

Thanks for pointing out the broken pictures; I have fixed those.

Regarding encrypting before or after batching the messages, you are
correct: I had not considered compression and how it changes things.
Encrypted data does not really compress well. My reasoning at the time
of writing was that if we encrypt the entire batch, we have to wait
for the batch to be full before we can start encrypting, whereas with
per-message encryption we can encrypt messages as they come in and
have them more or less ready for sending when the batch is complete.
However, I think the difference will probably not be that large (I
will do some testing), and it should be offset by encrypting once
instead of many times, since every encryption call carries a certain
overhead. Also, from a security perspective, encrypting longer chunks
of data is preferable, which is another benefit.
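
A minimal Java sketch of the ordering argument, for illustration only.
This is not Kafka code: the class and method names are hypothetical,
and AES-GCM and Deflater are stand-ins chosen for the example, not
choices made in the KIP. It builds a batch of similar records, then
compares compress-then-encrypt with encrypt-then-compress and checks
that the consumer side can reverse the first pipeline.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration; none of these names exist in Kafka.
class BatchEncryptionSketch {
    static final int IV_BYTES = 12;   // common IV length for GCM
    static final int TAG_BITS = 128;  // GCM authentication tag length

    public static SecretKey newKey() throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        return kg.generateKey();
    }

    // Builds a batch of similar, text-like records (assumed sample data).
    public static byte[] sampleBatch(int records) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < records; i++) {
            sb.append("{\"user\":\"user-").append(i % 10)
              .append("\",\"action\":\"click\",\"seq\":").append(i).append("}\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static byte[] compress(byte[] in) {
        Deflater d = new Deflater();
        d.setInput(in);
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!d.finished()) {
            out.write(buf, 0, d.deflate(buf));
        }
        d.end();
        return out.toByteArray();
    }

    public static byte[] decompress(byte[] in) throws DataFormatException {
        Inflater inf = new Inflater();
        inf.setInput(in);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!inf.finished()) {
            int n = inf.inflate(buf);
            if (n == 0 && inf.needsInput()) break; // truncated input
            out.write(buf, 0, n);
        }
        inf.end();
        return out.toByteArray();
    }

    // Returns IV || ciphertext (the ciphertext includes the GCM tag).
    public static byte[] encrypt(SecretKey key, byte[] plain) throws Exception {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = c.doFinal(plain);
        byte[] out = Arrays.copyOf(iv, iv.length + ct.length);
        System.arraycopy(ct, 0, out, iv.length, ct.length);
        return out;
    }

    public static byte[] decrypt(SecretKey key, byte[] data) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, key,
               new GCMParameterSpec(TAG_BITS, data, 0, IV_BYTES));
        return c.doFinal(data, IV_BYTES, data.length - IV_BYTES);
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = newKey();
        byte[] batch = sampleBatch(500);

        // Producer order proposed in the thread: compress, then encrypt.
        byte[] compressThenEncrypt = encrypt(key, compress(batch));
        // Reverse order: the ciphertext looks random, so Deflater gains little.
        byte[] encryptThenCompress = compress(encrypt(key, batch));

        System.out.println("raw batch:             " + batch.length);
        System.out.println("compress then encrypt: " + compressThenEncrypt.length);
        System.out.println("encrypt then compress: " + encryptThenCompress.length);

        // Consumer side applies the inverse steps: decrypt, then decompress.
        byte[] restored = decompress(decrypt(key, compressThenEncrypt));
        System.out.println("round trip ok: " + Arrays.equals(batch, restored));
    }
}
```

The exact sizes depend on the data and the codec, so treat the printed
numbers as indicative only.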

This does, however, take away the broker's ability to see the
individual records inside the encrypted batch, so the batch would need
to be stored and retrieved as a single record, just as is done for
compressed batches. I am not 100% sure that this won't create issues,
especially when considering transactions; I will need to look at the
compression code some more. In essence, though, since it works for
compression, I see no reason why it can't be made to work here.

On a different note, going down this route might make us reconsider
storing the key with the data, since storing one key per batch rather
than one per message might significantly reduce the storage overhead.
That would still be much higher than storing each key only once,
though.
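
A rough back-of-the-envelope sketch of that overhead, in Java. All
numbers are assumptions picked for illustration (a 256-byte wrapped
key, 1000 records of 200 bytes per batch); nothing here comes from the
KIP, and the class and method names are hypothetical.

```java
// Hypothetical helper; the sizes used in main are illustrative assumptions.
class KeyStorageOverhead {
    // Bytes spent on stored keys for one batch.
    public static long keyBytes(int recordsInBatch, int wrappedKeyBytes,
                                boolean keyPerRecord) {
        return keyPerRecord ? (long) recordsInBatch * wrappedKeyBytes
                            : wrappedKeyBytes;
    }

    // Key bytes relative to the payload bytes of the batch.
    public static double overheadRatio(long keyBytes, int recordsInBatch,
                                       int recordBytes) {
        return (double) keyBytes / ((long) recordsInBatch * recordBytes);
    }

    public static void main(String[] args) {
        int records = 1000;      // assumed records per batch
        int recordBytes = 200;   // assumed payload size per record
        int keyBytes = 256;      // assumed size of one wrapped key

        long perRecord = keyBytes(records, keyBytes, true);
        long perBatch = keyBytes(records, keyBytes, false);

        System.out.println("key per record: " + perRecord + " bytes, ratio "
                + overheadRatio(perRecord, records, recordBytes));
        System.out.println("key per batch:  " + perBatch + " bytes, ratio "
                + overheadRatio(perBatch, records, recordBytes));
    }
}
```

Under these assumptions, a key per record costs more than the payload
itself, while a key per batch is a small fraction of it; storing each
key once outside the data would be smaller still.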

Best regards,
Sönke

On Tue, Jun 19, 2018 at 5:59 AM, Stephane Maarek
<steph...@simplemachines.com.au> wrote:
> Hi Sonke
>
> Very much needed feature and discussion. FYI the image links seem broken.
>
> My 2 cents (if I understood correctly): you say "This process will be
> implemented after Serializer and Interceptors are done with the message
> right before it is added to the batch to be sent, in order to ensure that
> existing serializers and interceptors keep working with encryption just
> like without it."
>
> I think encryption should happen AFTER a batch is created, right before it
> is sent. Reason is that if we want to still keep advantage of compression,
> encryption needs to happen after it (and I believe compression happens on a
> batch level).
> So to me for a producer: serializer / interceptors => batching =>
> compression => encryption => send.
> and the inverse for a consumer.
>
> Regards
> Stephane
>
> On 19 June 2018 at 06:46, Sönke Liebau <soenke.lie...@opencore.com.invalid>
> wrote:
>
>> Hi everybody,
>>
>> I've created a draft version of KIP-317 which describes the addition
>> of transparent data encryption functionality to Kafka.
>>
>> Please consider this as a basis for discussion - I am aware that this
>> is not at a level of detail sufficient for implementation, but I
>> wanted to get some feedback from the community on the general idea
>> before spending more time on this.
>>
>> Link to the KIP is:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+transparent+data+encryption+functionality
>>
>> Best regards,
>> Sönke
>>



-- 
Sönke Liebau
Partner
Tel. +49 179 7940878
OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
