Our legal department's interpretation is that when an account is deleted, any data kept longer than K days must also be deleted. We set up our unredacted Kafka topics so their retention is never greater than K days. This simplifies the problem greatly.
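For illustration only (the topic name and the 30-day window are made up, not our actual values), capping a topic's retention with the Java AdminClient looks roughly like this; retention.ms is the broker-side setting that enforces the K-day bound:

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.Config;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;
    import java.util.Collections;
    import java.util.Properties;
    import java.util.concurrent.TimeUnit;

    public class CapRetention {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                // K = 30 days here; use whatever window legal signs off on.
                // Assumes cleanup.policy=delete, so the broker removes
                // segments older than retention.ms.
                ConfigResource topic = new ConfigResource(
                        ConfigResource.Type.TOPIC, "events.unredacted");
                Config config = new Config(Collections.singletonList(new ConfigEntry(
                        "retention.ms", Long.toString(TimeUnit.DAYS.toMillis(30)))));
                admin.alterConfigs(Collections.singletonMap(topic, config)).all().get();
            }
        }
    }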
Our solution is designed to limit the ability of services to see parts of the data they do not require to operate. It keeps the technical requirements simple (no key management, no library implementations in multiple languages, etc.), requires little coordination with other teams (they just change the topic they read from, which is a string), and fits cleanly within the Kafka ecosystem, letting teams use newer or older streaming technologies without requiring our data infrastructure team to support them. I am really proud of our solution because it doesn't try to boil the ocean.

On Thu, Nov 23, 2017 at 9:31 AM Wim Van Leuven <wim.vanleu...@highestpoint.biz> wrote:

> I think the best way to implement this is via envelope encryption: your
> system manages a key-encryption key (KEK), which is used to encrypt
> per-user/customer data-encryption keys (DEKs), which in turn encrypt the
> user's/customer's data.
>
> If the user/customer walks away, simply drop the DEK. Their data becomes
> undecryptable.
>
> You do have to implement re-encryption in case KEKs or DEKs become
> compromised.
>
> If you run in the cloud, AWS and GCloud have basic Key Management
> Services (KMS) to manage the KEKs, especially access to them and
> versioning.
>
> Their docs explain such a setup very well:
> https://cloud.google.com/kms/docs/envelope-encryption
>
> HTH
> -wim
>
> On Thu, Nov 23, 2017, 09:55 David Espinosa <espi...@gmail.com> wrote:
>
> > Hi Scott, and thanks for your reply.
> > From what you say, I guess that when you are asked to delete some user
> > data (that's the "right to be forgotten" in GDPR), what you are really
> > doing is blocking access to it. I had a similar approach, based on Greg
> > Young's idea of encrypting any private data and forgetting the key when
> > the data has to be deleted.
> > Sadly, after some checking, our legal department has concluded that this
> > approach blocks data but does not delete it, and as a consequence it
> > could cause us problems. If my guess about your solution is right, you
> > could have the same problems.
> >
> > Thanks
> >
> > 2017-11-22 19:59 GMT+01:00 Scott Reynolds <sreyno...@twilio.com.invalid>:
> >
> > > We are using Kafka Connect consumers that consume from the raw
> > > unredacted topic, apply transformations, and produce to a redacted
> > > topic. Using Kafka Connect lets us set it all up with an HTTP request
> > > and doesn't require additional infrastructure.
> > >
> > > Then we wrote a KafkaPrincipal builder to authenticate each consumer
> > > under its service name. The KafkaPrincipal class is specified in the
> > > server.properties file on the brokers. To provide topic-level access
> > > control we just configured SimpleAclAuthorizer. The net result is that
> > > most consumers can only read the redacted topics, and very few can
> > > read the unredacted ones.
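(For the curious: a redacting transform along those lines could look roughly like the sketch below. The "email" field name and "REDACTED" placeholder are made up, and it assumes Struct values whose PII fields are strings; it is not our exact code.)

    import org.apache.kafka.common.config.ConfigDef;
    import org.apache.kafka.connect.connector.ConnectRecord;
    import org.apache.kafka.connect.data.Field;
    import org.apache.kafka.connect.data.Struct;
    import org.apache.kafka.connect.transforms.Transformation;
    import java.util.Map;

    // Rewrites Struct records, blanking a hypothetical "email" field.
    public class RedactEmail<R extends ConnectRecord<R>> implements Transformation<R> {

        @Override
        public R apply(R record) {
            if (!(record.value() instanceof Struct)) {
                return record; // pass non-Struct values through untouched
            }
            Struct value = (Struct) record.value();
            Struct redacted = new Struct(value.schema());
            for (Field f : value.schema().fields()) {
                // assumes the PII field holds a string in the value schema
                redacted.put(f, "email".equals(f.name()) ? "REDACTED" : value.get(f));
            }
            return record.newRecord(record.topic(), record.kafkaPartition(),
                    record.keySchema(), record.key(),
                    value.schema(), redacted, record.timestamp());
        }

        @Override public ConfigDef config() { return new ConfigDef(); }
        @Override public void configure(Map<String, ?> configs) { }
        @Override public void close() { }
    }

Attached to a connector that copies the unredacted topic to the redacted one, a transform like this rewrites each record in flight.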
> > >
> > > On Wed, Nov 22, 2017 at 10:47 AM David Espinosa <espi...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > > I would like to double-check with you how to apply some GDPR
> > > > requirements to my Kafka topics, in concrete the "right to be
> > > > forgotten", which forces us to delete some data contained in the
> > > > messages. So not deleting the message, but editing it.
> > > > To do that, my intention is to replicate the topic and apply a
> > > > transformation over it, using a framework like Kafka Streams or
> > > > Apache Storm.
> > > >
> > > > Has anybody had to solve this problem?
> > > >
> > > > Thanks in advance.
> > >
> > > --
> > > Scott Reynolds
> > > Principal Engineer, Twilio
> > > MOBILE (630) 254-2474
> > > EMAIL sreyno...@twilio.com

--
Scott Reynolds
Principal Engineer, Twilio
MOBILE (630) 254-2474
EMAIL sreyno...@twilio.com
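A minimal sketch of the envelope-encryption (KEK/DEK) scheme Wim describes above, using only JDK crypto. The in-memory key map stands in for a real KMS or key store, and all names are illustrative:

    import javax.crypto.Cipher;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.GCMParameterSpec;
    import java.security.SecureRandom;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;

    // Envelope encryption: one KEK wraps a per-user DEK; the DEK encrypts
    // that user's payloads. The HashMap stands in for a real key store.
    public class EnvelopeSketch {
        private final SecretKey kek; // key-encryption key
        private final Map<String, byte[]> wrappedDeks = new HashMap<>();
        private final SecureRandom random = new SecureRandom();

        public EnvelopeSketch(SecretKey kek) { this.kek = kek; }

        public byte[] encryptForUser(String userId, byte[] plaintext) throws Exception {
            SecretKey dek = KeyGenerator.getInstance("AES").generateKey();
            Cipher wrap = Cipher.getInstance("AESWrap");
            wrap.init(Cipher.WRAP_MODE, kek);
            wrappedDeks.put(userId, wrap.wrap(dek)); // store DEK wrapped by the KEK
            byte[] iv = new byte[12];
            random.nextBytes(iv);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, dek, new GCMParameterSpec(128, iv));
            byte[] ct = c.doFinal(plaintext);
            byte[] out = new byte[iv.length + ct.length]; // prepend IV to ciphertext
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ct, 0, out, iv.length, ct.length);
            return out;
        }

        public byte[] decryptForUser(String userId, byte[] ivAndCiphertext) throws Exception {
            Cipher unwrap = Cipher.getInstance("AESWrap");
            unwrap.init(Cipher.UNWRAP_MODE, kek);
            SecretKey dek = (SecretKey) unwrap.unwrap(
                    wrappedDeks.get(userId), "AES", Cipher.SECRET_KEY);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, dek,
                    new GCMParameterSpec(128, Arrays.copyOfRange(ivAndCiphertext, 0, 12)));
            return c.doFinal(ivAndCiphertext, 12, ivAndCiphertext.length - 12);
        }

        // "Right to be forgotten": drop the DEK; the ciphertext stays on
        // disk but can no longer be decrypted.
        public void forgetUser(String userId) { wrappedDeks.remove(userId); }
    }

Dropping the wrapped DEK renders that user's ciphertext permanently unreadable, which is exactly the crypto-shredding approach David's legal department flagged as "blocking" rather than deleting, so check with legal before relying on it.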