Thanks a lot. I think that's the only way that ensures GDPR compliance.
In a second iteration, I'm thinking of anonymizing instead of removing,
perhaps identifying PII fields using Avro custom types.
Thanks again,
2017-11-28 15:54 GMT+01:00 Ben Stopford :
You should also be able to manage this with a compacted topic. If you give
each message a unique key you'd then be able to delete, or overwrite
specific records. Kafka will delete them from disk when compaction runs. If
you need to partition for ordering purposes you'd need to use a custom
partitio
Thanks, Lars, for the most interesting read!
On Sun, 26 Nov 2017 at 00:38 Lars Albertsson wrote:
Hi David,
You might find this presentation useful:
https://www.slideshare.net/lallea/protecting-privacy-in-practice
It explains privacy building blocks primarily in a batch processing
context, but most of the principles are applicable for stream
processing as well, e.g. splitting non-PII and PII
Sounds nice!
I'm discussing with a customer to create a fully anonymized stream for
future analytical purposes.
Remaining question: the anonymization algorithm/strategy that maintains
statistical relevance while being resilient against brute force.
Thoughts?
-wim
On Thu, 23 Nov 2017 at 19:03 Sc
Our legal department's interpretation is that when an account is deleted, any
data that is kept longer than K days must be deleted. We set up our unredacted
Kafka topics to never retain data for more than K days. This simplifies the
problem greatly.
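As a rough sketch of the K-day cap described above: Kafka's topic-level `retention.ms` setting takes a value in milliseconds, so the window has to be converted from days. The helper names below (`retention_ms`, `topic_config`) are hypothetical and not from the thread; how the resulting settings get applied to a topic depends on your tooling.

```python
# Hypothetical helpers (not from the thread): turn a K-day retention window
# into the topic-level settings Kafka expects.
MS_PER_DAY = 24 * 60 * 60 * 1000


def retention_ms(days: int) -> int:
    """Convert a retention window in days to milliseconds."""
    if days <= 0:
        raise ValueError("retention window must be a positive number of days")
    return days * MS_PER_DAY


def topic_config(days: int) -> dict:
    """Build topic settings for time-based deletion after `days` days."""
    return {
        # "delete" drops old log segments instead of compacting them.
        "cleanup.policy": "delete",
        # Kafka config values are passed as strings.
        "retention.ms": str(retention_ms(days)),
    }
```

Note that Kafka deletes whole log segments, so a record can outlive `retention.ms` until its segment is closed and becomes eligible for deletion; it may be worth leaving some margin below K.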
Our solution is designed to limit the ability of services to see
I think the best way to implement this is via envelope encryption: your
system manages a key encryption key (KEK), which is used to encrypt
per-user/customer data encryption keys (DEKs), which in turn are used to
encrypt the user's/customer's data.
If the user/customer walks away, simply drop the DEK. His da
Hi Scott and thanks for your reply.
From what you say, I guess that when you are asked to delete some user data
(that's the "right to be forgotten" in GDPR), what you are really
doing is blocking access to it. I had a similar approach, based on the
idea of Greg Young's solution of encrypting a
We are using Kafka Connect consumers that consume from the raw unredacted
topic, apply transformations, and produce to a redacted topic. Using
Kafka Connect allows us to set it all up with an HTTP request and doesn't
require additional infrastructure.
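To illustrate the kind of transformation meant here, the following is a minimal sketch of a field-masking step, written in plain Python rather than as an actual Kafka Connect transform. The field names in `PII_FIELDS` and the `redact` function are assumptions for illustration only; the thread does not say which fields are redacted or how.

```python
import copy

# Hypothetical PII field names; the real list is not given in the thread.
PII_FIELDS = frozenset({"email", "name", "phone"})


def redact(record: dict, pii_fields=PII_FIELDS, mask: str = "REDACTED") -> dict:
    """Return a copy of `record` with PII fields masked.

    Nested dicts are handled recursively; the input record is not modified.
    """
    out = {}
    for key, value in record.items():
        if key in pii_fields:
            out[key] = mask
        elif isinstance(value, dict):
            out[key] = redact(value, pii_fields, mask)
        else:
            out[key] = copy.deepcopy(value)
    return out
```

The function returns a new record instead of editing in place, so the raw topic's payload stays untouched while the masked copy goes to the redacted topic.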
Then we wrote a KafkaPrincipal builder to au
Hi all,
I would like to double-check with you how we plan to apply GDPR to our
Kafka topics, specifically the "right to be forgotten", which forces us to
delete some data contained in the messages. So not deleting the message,
but editing it.
For doing that, my intention is to replicate the top