Hi Sönke,

Compressing before encrypting has its dangers as well. Suppose you use a compression format that adds a known magic header, together with a block cipher whose block size is small enough: the known plaintext at the start of every message then makes it much easier to figure out the encryption key. For instance, look at Snappy's stream identifier: https://github.com/google/snappy/blob/master/framing_format.txt . Based on this, you should only use block ciphers whose block size is much larger than 6 bytes. AES, for instance, should be good with its 128 bits = 16 bytes, but even this isn't entirely secure, as the first 6 bytes already leak some information; how much depends on the cipher.

Also, if we suppose that an adversary gains access to a broker and takes all the data, they'll have a much easier job decrypting it, as they'll have many more samples.

So overall we should define and document which encryption methods are compatible with the supported compression methods, and the level of security they provide, so that users are fully aware of the security implications.
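To make the known-header point concrete, here is a minimal Python sketch. It only counts how many bytes at the start of the first cipher block are predictable; it says nothing about how hard key recovery actually is for a given cipher. The prefix is taken from the framing format document linked above, where the stream identifier chunk is a 4-byte chunk header followed by the 6-byte payload "sNaPpY". The cipher table and the `known_fraction` helper are illustrative assumptions, not part of any proposal.

```python
# Known prefix of a Snappy framed stream, per framing_format.txt:
# chunk type 0xff, 3-byte little-endian length (6), then the payload "sNaPpY".
SNAPPY_STREAM_IDENTIFIER = b"\xff\x06\x00\x00" + b"sNaPpY"

# Block sizes in bytes for a few well-known block ciphers (illustrative list).
BLOCK_SIZES = {"DES": 8, "3DES": 8, "Blowfish": 8, "AES": 16}


def known_fraction(block_size: int,
                   known_prefix: bytes = SNAPPY_STREAM_IDENTIFIER) -> float:
    """Fraction of the first cipher block that an observer can predict."""
    return min(len(known_prefix), block_size) / block_size


for name, size in BLOCK_SIZES.items():
    share = known_fraction(size)
    print(f"{name}: {share:.1%} of the first {size}-byte block is known plaintext")
```

With 8-byte blocks the whole first block is known plaintext, while with AES part of the first block still comes from the compressed payload.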
Cheers,
Viktor

On Tue, Jun 19, 2018 at 11:55 AM Sönke Liebau <soenke.lie...@opencore.com.invalid> wrote:

> Hi Stephane,
>
> thanks for pointing out the broken pictures, I fixed those.
>
> Regarding encrypting before or after batching the messages, you are
> correct, I had not thought of compression and how this changes things.
> Encrypted data does not really compress well. My reasoning at the time
> of writing was that if we encrypt the entire batch we'd have to wait
> for the batch to be full before starting to encrypt. Whereas with per
> message encryption we can encrypt them as they come in and more or
> less have them ready for sending when the batch is complete.
> However I think the difference will probably not be that large (will
> do some testing) and offset by just encrypting once instead of many
> times, which has a certain overhead every time. Also, from a security
> perspective encrypting longer chunks of data is preferable - another
> benefit.
>
> This does however take away the ability of the broker to see the
> individual records inside the encrypted batch, so this would need to
> be stored and retrieved as a single record - just like is done for
> compressed batches. I am not 100% sure that this won't create issues,
> especially when considering transactions, I will need to look at the
> compression code some more. In essence though, since it works for
> compression I see no reason why it can't be made to work here.
>
> On a different note, going down this route might make us reconsider
> storing the key with the data, as this might significantly reduce
> storage overhead - still much higher than just storing them once
> though.
>
> Best regards,
> Sönke
>
> On Tue, Jun 19, 2018 at 5:59 AM, Stephane Maarek
> <steph...@simplemachines.com.au> wrote:
> > Hi Sonke
> >
> > Very much needed feature and discussion. FYI the image links seem broken.
> >
> > My 2 cents (if I understood correctly): you say "This process will be
> > implemented after Serializer and Interceptors are done with the message
> > right before it is added to the batch to be sent, in order to ensure that
> > existing serializers and interceptors keep working with encryption just
> > like without it."
> >
> > I think encryption should happen AFTER a batch is created, right before it
> > is sent. Reason is that if we want to still keep advantage of compression,
> > encryption needs to happen after it (and I believe compression happens on a
> > batch level).
> > So to me for a producer: serializer / interceptors => batching =>
> > compression => encryption => send.
> > and the inverse for a consumer.
> >
> > Regards
> > Stephane
> >
> > On 19 June 2018 at 06:46, Sönke Liebau <soenke.lie...@opencore.com.invalid>
> > wrote:
> >
> >> Hi everybody,
> >>
> >> I've created a draft version of KIP-317 which describes the addition
> >> of transparent data encryption functionality to Kafka.
> >>
> >> Please consider this as a basis for discussion - I am aware that this
> >> is not at a level of detail sufficient for implementation, but I
> >> wanted to get some feedback from the community on the general idea
> >> before spending more time on this.
> >>
> >> Link to the KIP is:
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+transparent+data+encryption+functionality
> >>
> >> Best regards,
> >> Sönke
> >>
> >
>
> --
> Sönke Liebau
> Partner
> Tel. +49 179 7940878
> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
>
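
The ordering argument in the quoted thread (compress first, then encrypt, because ciphertext does not compress) can be checked with a small Python sketch. Everything here is an assumption made for illustration: zlib stands in for whichever codec the producer is configured with, the batch is made-up JSON, and `os.urandom` stands in for the output of a good cipher, which should look like random bytes.

```python
import os
import zlib

# Hypothetical batch: repetitive serialized records, which compress well.
batch = b"".join(
    b'{"user": "alice", "event": "click", "seq": %d}\n' % i
    for i in range(1000)
)

# Stand-in for an encrypted batch: random bytes of the same length.
# (Assumption: real ciphertext is statistically similar to random data.)
ciphertext_like = os.urandom(len(batch))

# Compress the plaintext batch (the "compress, then encrypt" order) ...
compressed_plain = zlib.compress(batch)
# ... versus compressing data that has already been encrypted.
compressed_cipher = zlib.compress(ciphertext_like)

print("original batch:          ", len(batch), "bytes")
print("compressed plaintext:    ", len(compressed_plain), "bytes")
print("compressed 'ciphertext': ", len(compressed_cipher), "bytes")
```

The plaintext batch shrinks considerably, while the random stand-in stays at roughly its original size, which is why encryption has to come after compression if the compression benefit is to be kept.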