echarso opened a new issue #2383: URL: https://github.com/apache/hudi/issues/2383
Hi! Thank you for the work you are doing!

I want to use CDC to capture changes in my DB. I will have two buckets:

1. A1 bucket for the historical load
2. A2 bucket for the captured changes

I will use Hudi to create my 'gold' dataset with the latest status. At some point I will receive a GDPR request to remove a person. I will keep those 'GDPR' messages in another S3 bucket, gdpr-bucket. When an event arrives in that bucket, I would like to trigger my anonymization / deletion operation.

As I understand it, I can use Hudi for soft and hard deletes of the rows that belong to that customer. After using Hudi for that, the information about this customer in the A1 and A2 buckets will no longer be available. Is that right?

However, can I use Hudi to anonymize only particular fields, namely the PI (personal identifier) fields, in my buckets?

Thank you. Hope my question makes sense :)
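
To make the question concrete, here is a minimal sketch of the two operations I have in mind against the 'gold' Hudi table: a hard delete of a customer's rows, and an upsert that blanks only the PI fields. The table name `gold_customers`, the key `customer_id`, the precombine field `updated_at`, and the helper functions are hypothetical names chosen for illustration. The `hoodie.*` keys are the standard Hudi Spark datasource write options. The Spark write itself is shown only as a comment because it needs a Spark session with the Hudi bundle.

```python
from typing import Any, Dict, Iterable


def hudi_write_options(table_name: str, record_key: str,
                       precombine_field: str, operation: str) -> Dict[str, str]:
    """Build Hudi datasource write options for an upsert or a hard delete."""
    if operation not in {"upsert", "delete"}:
        raise ValueError(f"unsupported operation: {operation}")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.operation": operation,
    }


def anonymize_record(record: Dict[str, Any], pi_fields: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of the record with only the PI fields set to None."""
    pi = set(pi_fields)
    return {name: (None if name in pi else value) for name, value in record.items()}


# Hard delete: rows with the given keys are removed from the table.
hard_delete_opts = hudi_write_options(
    "gold_customers", "customer_id", "updated_at", "delete")

# Field-level anonymization: upsert the same key with the PI columns blanked.
anonymize_opts = hudi_write_options(
    "gold_customers", "customer_id", "updated_at", "upsert")

masked = anonymize_record(
    {"customer_id": 42, "name": "Jane", "email": "jane@example.com",
     "country": "DE", "updated_at": 1},
    pi_fields=["name", "email"],
)

# With Spark and the Hudi bundle available (assumption), the write would be:
# df.write.format("hudi").options(**anonymize_opts).mode("append").save(base_path)
```

This only touches the Hudi table itself. Whether the raw files in the A1 and A2 buckets need separate handling is exactly what I am unsure about.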
