Thanks a lot. I think that's the only way to ensure GDPR compliance. In a second iteration, I'm thinking of anonymizing the data instead of removing it, perhaps by identifying PII fields through Avro custom types.
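A minimal sketch of that second iteration, under stated assumptions: the schema is an Avro-style record definition held as a plain dict, and PII fields are marked with a hypothetical custom attribute named "pii" (the attribute name, the `anonymize` helper, and the placeholder value are illustrative only, not part of Avro or Kafka).

```python
# Avro-style record schema as a plain dict. The "pii" attribute is a
# hypothetical custom field property used only for this illustration.
SCHEMA = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "user_id", "type": "string"},
        {"name": "email", "type": "string", "pii": True},
        {"name": "country", "type": "string"},
    ],
}

REDACTED = "<redacted>"  # assumed placeholder value


def pii_field_names(schema):
    """Collect the names of all fields flagged as PII in the schema."""
    return {f["name"] for f in schema["fields"] if f.get("pii")}


def anonymize(record, schema):
    """Return a copy of the record with every PII field overwritten."""
    out = dict(record)
    for name in pii_field_names(schema):
        if name in out:
            out[name] = REDACTED
    return out
```

Driving the redaction from the schema means a newly added PII field only needs to be flagged once, rather than listed in every consumer.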
Thanks again,

2017-11-28 15:54 GMT+01:00 Ben Stopford <b...@confluent.io>:

> You should also be able to manage this with a compacted topic. If you give
> each message a unique key you'd then be able to delete, or overwrite
> specific records. Kafka will delete them from disk when compaction runs. If
> you need to partition for ordering purposes you'd need to use a custom
> partitioner that extracts a partition key from the unique key before it
> does the hash.
>
> B
>
> On Sun, Nov 26, 2017 at 10:40 AM Wim Van Leuven
> <wim.vanleu...@highestpoint.biz> wrote:
>
> > Thanks, Lars, for the most interesting read!
> >
> > On Sun, 26 Nov 2017 at 00:38 Lars Albertsson <la...@mapflat.com> wrote:
> >
> > > Hi David,
> > >
> > > You might find this presentation useful:
> > > https://www.slideshare.net/lallea/protecting-privacy-in-practice
> > >
> > > It explains privacy building blocks primarily in a batch processing
> > > context, but most of the principles are applicable for stream
> > > processing as well, e.g. splitting non-PII and PII data ("ejected
> > > record" slide), encrypting PII data ("lost key" slide).
> > >
> > > Regards,
> > >
> > > Lars Albertsson
> > > Data engineering consultant
> > > www.mapflat.com
> > > https://twitter.com/lalleal
> > > +46 70 7687109
> > > Calendar: http://www.mapflat.com/calendar
> > >
> > > On Wed, Nov 22, 2017 at 7:46 PM, David Espinosa <espi...@gmail.com> wrote:
> > > > Hi all,
> > > > I would like to double check with you how we want to apply some GDPR
> > > > requirements to my Kafka topics. Specifically, the "right to be
> > > > forgotten", which forces us to delete some data contained in the
> > > > messages. So not deleting the message, but editing it.
> > > > To do that, my intention is to replicate the topic and apply a
> > > > transformation over it.
> > > > I think that frameworks like Kafka Streams or Apache Storm.
> > > >
> > > > Has anybody had to solve this problem?
> > > >
> > > > Thanks in advance.
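
For reference, the compacted-topic approach Ben describes can be sketched roughly as below. This is a plain-Python simulation, not Kafka client code. It assumes a hypothetical composite key format "<partition_key>:<record_id>", uses CRC32 as a stand-in for the hash a real partitioner would apply, and models a delete as a record whose value is None (a tombstone). Real compaction runs in the background and keeps tombstones for a configurable period before dropping them, which this sketch ignores.

```python
import zlib


def partition_for(unique_key, num_partitions):
    """Pick a partition from the partition-key part of a composite key.

    Assumed key format: "<partition_key>:<record_id>". Hashing only the
    prefix keeps all records of one partition key on the same partition,
    while the full key stays unique for compaction purposes.
    """
    partition_key = unique_key.split(":", 1)[0]
    # CRC32 is only a stand-in for the partitioner's real hash function.
    return zlib.crc32(partition_key.encode("utf-8")) % num_partitions


def compact(log):
    """Simulate compaction over a list of (key, value) records.

    Keeps only the latest record per key, in log order, and drops keys
    whose latest record is a tombstone (value None).
    """
    last_index = {key: i for i, (key, _) in enumerate(log)}
    return [
        (key, value)
        for i, (key, value) in enumerate(log)
        if last_index[key] == i and value is not None
    ]


log = [
    ("user1:a", "order placed"),
    ("user2:b", "order placed"),
    ("user1:a", None),           # tombstone: forget this record
    ("user1:c", "order shipped"),
]
compacted = compact(log)
```

After compaction only the records for "user2:b" and "user1:c" remain, and both "user1" keys map to the same partition, so per-user ordering is preserved.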