+1 (non-binding) from me on the interface. I'd like to see someone familiar
with the code comment on the approach, and note there are a couple of
different approaches: what's documented in the KIP, and what Xiaohe Dong was
working on here:
https://github.com/dongxiaohe/kafka/tree/dongxiaohe/log-cleaner-compaction-max-lifetime-2.0

If you have code working already, Xiongqi Wu, could you share a PR? I'd be
happy to start testing.

On Tue, Aug 28, 2018 at 5:57 AM xiongqi wu <xiongq...@gmail.com> wrote:

> Hi All,
>
> Do you have any additional comments on this KIP?
>
> On Thu, Aug 16, 2018 at 9:17 PM, xiongqi wu <xiongq...@gmail.com> wrote:
>
>> On 2): the offset map is built starting from the dirty segment, but the
>> compaction itself starts from the beginning of the log partition. That's
>> how it ensures the deletion of tombstoned keys. I will double check
>> tomorrow.
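(Aside: the two-phase pass described above, building a key-to-offset map from
the dirty section and then rescanning from the start of the partition, can be
sketched roughly as follows. This is an illustration only, not the actual
LogCleaner code, which is Scala and considerably more involved; the Msg type
and all names are made up for the example.)

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    class CleaningPassSketch {
        record Msg(String key, byte[] value, long offset) {}

        static List<Msg> clean(List<Msg> log, long firstDirtyOffset) {
            // Phase 1: key -> latest offset, collected from the dirty
            // section only.
            Map<String, Long> offsetMap = new HashMap<>();
            for (Msg m : log)
                if (m.offset() >= firstDirtyOffset)
                    offsetMap.put(m.key(), m.offset());

            // Phase 2: rescan from the beginning of the partition, dropping
            // any record shadowed by a newer offset for the same key. This
            // is why a tombstone in the dirty section can delete values that
            // sit in old, already-cleaned segments. (Tombstones themselves
            // are kept for delete.retention.ms; that bookkeeping is elided.)
            return log.stream()
                    .filter(m -> offsetMap.getOrDefault(m.key(), -1L) <= m.offset())
                    .collect(Collectors.toList());
        }
    }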
>> On Thu, Aug 16, 2018 at 6:46 PM Brett Rann <br...@zendesk.com.invalid> wrote:
>>
>>> To just clarify a bit on 1: whether there's an external storage/DB isn't
>>> relevant here. Compacted topics allow a tombstone record to be sent (a
>>> null value for a key), which currently will result in old values for
>>> that key being deleted if some conditions are met. There are existing
>>> controls to make sure the old values will stay around for a minimum time
>>> at least, but no dedicated control to ensure the tombstone will delete
>>> within a maximum time.
>>>
>>> One popular reason that a maximum time for deletion is desirable right
>>> now is GDPR with PII. But we're not proposing any GDPR awareness in
>>> Kafka, just being able to guarantee a max time after which a tombstoned
>>> key will be removed from the compacted topic.
>>>
>>> On 2): huh, I thought it kept track of the first dirty segment and
>>> didn't recompact older "clean" ones. But I didn't look at the code or
>>> test for that.
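(Aside: the tombstone Brett describes is just an ordinary produce with a null
value. A minimal sketch using the Java producer; the topic name, key, and
bootstrap address are placeholders:)

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class TombstoneExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // A null value marks the key for deletion: once the cleaner
                // runs (and delete.retention.ms passes), prior values for
                // "user-42" disappear from the compacted topic.
                producer.send(new ProducerRecord<>("pii-topic", "user-42", null));
            }
        }
    }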
>>> On Fri, Aug 17, 2018 at 10:57 AM xiongqi wu <xiongq...@gmail.com> wrote:
>>>
>>>> 1. The owner of the data (in this sense, Kafka is not the owner of the
>>>> data) should keep track of the lifecycle of the data in some external
>>>> storage/DB. The owner determines when to delete the data and sends the
>>>> delete request to Kafka. Kafka doesn't know about the content of the
>>>> data; it only provides a means for deletion.
>>>>
>>>> 2. Each time compaction runs, it will start from the first segment
>>>> (whether or not it has already been compacted). The time estimation
>>>> here is only used to determine whether we should run compaction on this
>>>> log partition, so we only need to estimate uncompacted segments.
>>>>
>>>> On Thu, Aug 16, 2018 at 5:35 PM, Dong Lin <lindon...@gmail.com> wrote:
>>>>
>>>>> Hey Xiongqi,
>>>>>
>>>>> Thanks for the update. I have two questions for the latest KIP.
>>>>>
>>>>> 1) The motivation section says that one use case is to delete PII
>>>>> (Personally Identifiable Information) data within 7 days while keeping
>>>>> non-PII data indefinitely in compacted format. I suppose the use case
>>>>> depends on the application to determine when to delete those PII data.
>>>>> Could you explain how the application can reliably determine the set
>>>>> of keys that should be deleted? Is the application required to re-read
>>>>> all messages from the topic after every restart and determine the keys
>>>>> to be deleted by looking at message timestamps, or is the application
>>>>> supposed to persist the key -> timestamp information in a separate
>>>>> persistent storage system?
>>>>>
>>>>> 2) It is mentioned in the KIP that "we only need to estimate earliest
>>>>> message timestamp for un-compacted log segments because the deletion
>>>>> requests that belong to compacted segments have already been
>>>>> processed". Not sure if that is correct. If a segment is compacted
>>>>> before the user sends a message to delete a key in this segment, it
>>>>> seems that we still need to ensure that the segment will be compacted
>>>>> again within the given time after the deletion is requested, right?
>>>>>
>>>>> Thanks,
>>>>> Dong
>>>>> On Thu, Aug 16, 2018 at 10:27 AM, xiongqi wu <xiongq...@gmail.com> wrote:
>>>>>
>>>>>> Hi Xiaohe,
>>>>>>
>>>>>> Quick notes:
>>>>>> 1) Use the minimum of segment.ms and max.compaction.lag.ms (see the
>>>>>> sketch after this message).
>>>>>>
>>>>>> 2) I am not sure I get your second question. First, we have jitter
>>>>>> when we roll the active segment. Second, on each compaction we
>>>>>> compact up to what the offset map allows. Together those will not
>>>>>> lead to a compaction storm over time. In addition, I expect
>>>>>> max.compaction.lag.ms to be set on the order of days.
>>>>>>
>>>>>> 3) I don't have access to the Confluent community Slack for now. I am
>>>>>> reachable via Google Hangouts. To avoid double effort, here is my
>>>>>> plan:
>>>>>> a) Collect more feedback and feature requirements on the KIP.
>>>>>> b) Wait until this KIP is approved.
>>>>>> c) Address any additional requirements in the implementation. (My
>>>>>> current implementation only complies with what is described in the
>>>>>> KIP now.)
>>>>>> d) Share the code with you and the community to see if you want to
>>>>>> add anything.
>>>>>> e) Submission through a committer.
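(Aside: point 1 above, rolling the active segment at the earlier of
segment.ms and max.compaction.lag.ms, amounts to a check like the one below.
Names are illustrative, not the actual broker code:)

    // Illustrative sketch only. The active segment can never be cleaned, so
    // under the proposal it must be rolled no later than
    // min(segment.ms, max.compaction.lag.ms) after creation; otherwise a
    // tombstone in it could outlive the promised maximum lag.
    final class RollCheckSketch {
        static boolean shouldRollActiveSegment(long segmentCreatedMs, long nowMs,
                                               long segmentMs, long maxCompactionLagMs) {
            long deadline = segmentCreatedMs + Math.min(segmentMs, maxCompactionLagMs);
            return nowMs >= deadline;
        }
    }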
>>>>>> On Wed, Aug 15, 2018 at 11:42 PM, XIAOHE DONG <dannyriv...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Xiongqi
>>>>>>>
>>>>>>> Thanks for thinking about implementing this as well. :)
>>>>>>>
>>>>>>> I was thinking about using `segment.ms` to trigger the segment roll.
>>>>>>> Also, its value can be the largest time bias for the record
>>>>>>> deletion. For example, if `segment.ms` is 1 day and
>>>>>>> `max.compaction.ms` is 30 days, the compaction may happen around day
>>>>>>> 31.
>>>>>>>
>>>>>>> Out of curiosity, is there a way we can do some performance testing
>>>>>>> for this, and any tools you can recommend? As you know, cleaning
>>>>>>> previously happened only by respecting the dirty ratio, but now it
>>>>>>> may happen any time the max lag has passed for a message. I wonder
>>>>>>> what would happen if clients send a huge amount of tombstone records
>>>>>>> at the same time.
>>>>>>>
>>>>>>> I am looking forward to a quick chat with you to avoid double effort
>>>>>>> on this. I am on the Confluent community Slack during work hours. My
>>>>>>> name is Xiaohe Dong. :)
>>>>>>>
>>>>>>> Rgds
>>>>>>> Xiaohe Dong
>>>>>>>
>>>>>>> On 2018/08/16 01:22:22, xiongqi wu <xiongq...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Brett,
>>>>>>>>
>>>>>>>> Thank you for your comments. I was thinking that since we already
>>>>>>>> have an immediate-compaction setting (setting the min dirty ratio
>>>>>>>> to 0), I would use "0" as the disabled state. I am OK to go with -1
>>>>>>>> (disabled) and 0 (immediate) instead.
>>>>>>>>
>>>>>>>> For the implementation, there are a few differences between mine
>>>>>>>> and Xiaohe Dong's:
>>>>>>>> 1) I use the estimated creation time of a log segment, instead of
>>>>>>>> the largest timestamp of a log, to determine compaction
>>>>>>>> eligibility, because a log segment might stay as the active segment
>>>>>>>> for up to "max compaction lag" (see the KIP for details, and the
>>>>>>>> sketch after this message).
>>>>>>>> 2) I measure how many bytes we must clean to honor the "max
>>>>>>>> compaction lag" rule, and use that to determine the order of
>>>>>>>> compaction.
>>>>>>>> 3) I force the active segment to roll to honor the "max compaction
>>>>>>>> lag".
>>>>>>>>
>>>>>>>> I can share my code so we can coordinate.
>>>>>>>>
>>>>>>>> I haven't thought about a new API to force a compaction. What is
>>>>>>>> the use case for that?
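(Aside: a rough sketch of the eligibility test implied by difference 1 above.
The segment's creation time is estimated rather than taken from its largest
timestamp, since a segment can sit as the active segment, un-rolled, for up
to the max compaction lag. All names are illustrative:)

    // Illustrative sketch only. A partition becomes due for compaction once
    // its earliest un-compacted segment has existed longer than
    // max.compaction.lag.ms; difference 2 then orders due partitions by how
    // many bytes must be cleaned.
    final class CompactionDueSketch {
        static boolean mustCompact(long estimatedSegmentCreationMs, long nowMs,
                                   long maxCompactionLagMs) {
            return nowMs - estimatedSegmentCreationMs > maxCompactionLagMs;
        }
    }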
>>>>>>>> On Wed, Aug 15, 2018 at 5:33 PM, Brett Rann <br...@zendesk.com.invalid> wrote:
>>>>>>>>
>>>>>>>>> We've been looking into this too.
>>>>>>>>>
>>>>>>>>> Mailing list:
>>>>>>>>> https://lists.apache.org/thread.html/ed7f6a6589f94e8c2a705553f364ef599cb6915e4c3ba9b561e610e4@%3Cdev.kafka.apache.org%3E
>>>>>>>>> Jira wish: https://issues.apache.org/jira/browse/KAFKA-7137
>>>>>>>>> Confluent Slack discussion:
>>>>>>>>> https://confluentcommunity.slack.com/archives/C49R61XMM/p1530760121000039
>>>>>>>>>
>>>>>>>>> A person on my team has started on code, so you might want to
>>>>>>>>> coordinate:
>>>>>>>>> https://github.com/dongxiaohe/kafka/tree/dongxiaohe/log-cleaner-compaction-max-lifetime-2.0
>>>>>>>>>
>>>>>>>>> He's been working with Jason Gustafson and James Chen around the
>>>>>>>>> changes. You can ping him on Confluent Slack as Xiaohe Dong.
>>>>>>>>>
>>>>>>>>> It's great to know others are thinking on it as well.
>>>>>>>>>
>>>>>>>>> You've added the requirement to force a segment roll, which we
>>>>>>>>> hadn't gotten to yet, which is great. I was content with it not
>>>>>>>>> including the active segment.
>>>>>>>>>
>>>>>>>>>> Adding topic level configuration "max.compaction.lag.ms", and
>>>>>>>>>> corresponding broker configuration
>>>>>>>>>> "log.cleaner.max.compaction.lag.ms", which is set to 0 (disabled)
>>>>>>>>>> by default.
>>>>>>>>>
>>>>>>>>> Glancing at some other settings, the convention seems to me to be
>>>>>>>>> -1 for disabled (or infinite, which is more meaningful here). 0 to
>>>>>>>>> me implies instant, a little quicker than 1.
>>>>>>>>>
>>>>>>>>> We've also been trying to think about a way to trigger compaction
>>>>>>>>> through an API call, which would need to be flagged somewhere (ZK
>>>>>>>>> admin space?), but we're struggling to think how that would be
>>>>>>>>> coordinated across brokers and partitions. Have you given any
>>>>>>>>> thought to that?
>>>>>>>>> On Thu, Aug 16, 2018 at 8:44 AM xiongqi wu <xiongq...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Eno, Dong,
>>>>>>>>>>
>>>>>>>>>> I have updated the KIP. We decided not to address the issue that
>>>>>>>>>> we might have for topics with both compaction and time-based
>>>>>>>>>> retention enabled (see rejected alternative item 2). This KIP
>>>>>>>>>> will only ensure the log can be compacted after a specified time
>>>>>>>>>> interval.
>>>>>>>>>>
>>>>>>>>>> As suggested by Dong, we will also enforce that
>>>>>>>>>> "max.compaction.lag.ms" is not less than "min.compaction.lag.ms".
>>>>>>>>>>
>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-354%3A+Time-based+log+compaction+policy
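(Aside: once the proposed config exists, setting it per topic should look
like any other topic config change through the AdminClient. A sketch against
the 2.0-era alterConfigs API, which replaces the whole config set for the
resource; the topic name and values are placeholders:)

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.Config;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class SetMaxCompactionLag {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder
            try (AdminClient admin = AdminClient.create(props)) {
                ConfigResource topic =
                        new ConfigResource(ConfigResource.Type.TOPIC, "pii-topic");
                Config cfg = new Config(Arrays.asList(
                        new ConfigEntry("cleanup.policy", "compact"),
                        // Proposed in KIP-354; must be >= min.compaction.lag.ms.
                        new ConfigEntry("max.compaction.lag.ms", "604800000"))); // 7 days
                admin.alterConfigs(Collections.singletonMap(topic, cfg)).all().get();
            }
        }
    }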
>>>>>>>>>> On Tue, Aug 14, 2018 at 5:01 PM, xiongqi wu <xiongq...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Per discussion with Dong, he made a very good point: if
>>>>>>>>>>> compaction and time-based retention are both enabled on a topic,
>>>>>>>>>>> the compaction might prevent records from being deleted on time.
>>>>>>>>>>> The reason is that when compacting multiple segments into one
>>>>>>>>>>> single segment, the newly created segment will have the same
>>>>>>>>>>> last-modified timestamp as the latest original segment. We lose
>>>>>>>>>>> the timestamps of all the original segments except the last one.
>>>>>>>>>>> As a result, records might not be deleted when they should be
>>>>>>>>>>> through time-based retention.
>>>>>>>>>>>
>>>>>>>>>>> With the current KIP proposal, if we want to ensure timely
>>>>>>>>>>> deletion, we have the following configurations:
>>>>>>>>>>> 1) Enable time-based log compaction only: deletion is done by
>>>>>>>>>>> overriding the same key.
>>>>>>>>>>> 2) Enable time-based log retention only: deletion is done by
>>>>>>>>>>> time-based retention.
>>>>>>>>>>> 3) Enable both log compaction and time-based retention: deletion
>>>>>>>>>>> is not guaranteed.
>>>>>>>>>>>
>>>>>>>>>>> I am not sure if we have use case 3 and also want deletion to
>>>>>>>>>>> happen on time. There are several options to address the
>>>>>>>>>>> deletion issue when both compaction and retention are enabled:
>>>>>>>>>>> A) During log compaction, look at record timestamps to delete
>>>>>>>>>>> expired records. This can be done in the compaction logic
>>>>>>>>>>> itself, or by using AdminClient.deleteRecords() (see the sketch
>>>>>>>>>>> after this message). But this assumes we have record timestamps.
>>>>>>>>>>> B) Retain the last-modified time of the original segments during
>>>>>>>>>>> log compaction. This requires extra metadata to record the
>>>>>>>>>>> information, or not grouping multiple segments into one during
>>>>>>>>>>> compaction.
>>>>>>>>>>>
>>>>>>>>>>> If we have use case 3 in general, I would prefer solution A and
>>>>>>>>>>> rely on record timestamps.
>>>>>>>>>>>
>>>>>>>>>>> Two questions:
>>>>>>>>>>> Do we have use case 3? Is it nice to have or must have?
>>>>>>>>>>> If we have use case 3 and want to go with solution A, should we
>>>>>>>>>>> introduce a new configuration to enforce deletion by timestamp?
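(Aside: the AdminClient.deleteRecords() mentioned in option A exists today
(KIP-107), but it truncates a partition up to an offset rather than deleting
by key or timestamp, so an application would still map timestamps to offsets
first. A minimal sketch; topic, partition, and offset are placeholders:)

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.RecordsToDelete;
    import org.apache.kafka.common.TopicPartition;

    public class DeleteRecordsExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder
            try (AdminClient admin = AdminClient.create(props)) {
                // Deletes every record in events-0 with an offset below 12345.
                admin.deleteRecords(Collections.singletonMap(
                        new TopicPartition("events", 0),
                        RecordsToDelete.beforeOffset(12345L))).all().get();
            }
        }
    }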
>>>>>>>>>>> On Tue, Aug 14, 2018 at 1:52 PM, xiongqi wu <xiongq...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Dong,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the comment. There are two retention policies: log
>>>>>>>>>>>> compaction and time-based retention.
>>>>>>>>>>>>
>>>>>>>>>>>> Log compaction: we have use cases that keep infinite retention
>>>>>>>>>>>> of a topic (compaction only). GDPR cares about deletion of PII
>>>>>>>>>>>> (personally identifiable information) data. Since Kafka doesn't
>>>>>>>>>>>> know which records contain PII, it relies on the upper layer to
>>>>>>>>>>>> delete those records. For those infinite-retention use cases,
>>>>>>>>>>>> Kafka needs to provide a way to enforce compaction on time.
>>>>>>>>>>>> That is what we try to address in this KIP.
>>>>>>>>>>>>
>>>>>>>>>>>> Time-based retention: there are also use cases where users of
>>>>>>>>>>>> Kafka might want to expire all their data. In those cases, they
>>>>>>>>>>>> can use time-based retention on their topics.
>>>>>>>>>>>>
>>>>>>>>>>>> Regarding your first question: if a user wants to delete a key
>>>>>>>>>>>> in a log-compacted topic, the user has to send a deletion using
>>>>>>>>>>>> the same key. Kafka only makes sure the deletion will happen
>>>>>>>>>>>> within a certain time period (like 2 days / 7 days).
>>>>>>>>>>>>
>>>>>>>>>>>> Regarding your second question: in most cases, we might want to
>>>>>>>>>>>> delete all duplicated keys at the same time. Compaction might
>>>>>>>>>>>> be more efficient, since we need to scan the log and find all
>>>>>>>>>>>> duplicates anyway. However, the expected use case is to set the
>>>>>>>>>>>> time-based compaction interval on the order of days, larger
>>>>>>>>>>>> than the "min compaction lag". We don't want log compaction to
>>>>>>>>>>>> happen frequently, since it is expensive. The purpose is to
>>>>>>>>>>>> help topics with a low production rate get compacted on time.
>>>>>>>>>>>> For topics with a "normal" incoming message rate, the "min
>>>>>>>>>>>> dirty ratio" might have triggered the compaction before this
>>>>>>>>>>>> time-based compaction policy takes effect.
>>>>>>>>>>>>
>>>>>>>>>>>> Eno: as I mentioned, we have long-retention use cases for
>>>>>>>>>>>> log-compacted topics, but we want to provide the ability to
>>>>>>>>>>>> delete certain PII records on time. Kafka itself doesn't know
>>>>>>>>>>>> whether a record contains sensitive information and relies on
>>>>>>>>>>>> the user for deletion.
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Aug 13, 2018 at 6:58 PM, Dong Lin <lindon...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hey Xiongqi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the KIP. I have two questions regarding the use
>>>>>>>>>>>>> case of meeting the GDPR requirement.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) If I recall correctly, one of the GDPR requirements is that
>>>>>>>>>>>>> we can not keep messages longer than e.g. 30 days in storage
>>>>>>>>>>>>> (e.g. Kafka). Say there exists a partition p0 which contains
>>>>>>>>>>>>> message1 with key1 and message2 with key2, and then the user
>>>>>>>>>>>>> keeps producing messages with key=key2 to this partition.
>>>>>>>>>>>>> Since message1 with key1 is never overridden, sooner or later
>>>>>>>>>>>>> we will want to delete message1 and keep the latest message
>>>>>>>>>>>>> with key=key2. But currently it looks like the log compaction
>>>>>>>>>>>>> logic in Kafka will always put these messages in the same
>>>>>>>>>>>>> segment. Will this be an issue?
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2) The current KIP intends to provide the capability to delete
>>>>>>>>>>>>> a given message in a log-compacted topic. Does such a use case
>>>>>>>>>>>>> also require Kafka to keep the messages produced before the
>>>>>>>>>>>>> given message? If yes, then we can probably just use
>>>>>>>>>>>>> AdminClient.deleteRecords() or time-based log retention to
>>>>>>>>>>>>> meet the use-case requirement. If no, do you know what GDPR's
>>>>>>>>>>>>> requirement is on time-to-deletion after the user explicitly
>>>>>>>>>>>>> requests the deletion (e.g. 1 hour, 1 day, 7 days)?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Dong
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Aug 13, 2018 at 3:44 PM, xiongqi wu <xiongq...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Eno,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The GDPR request we are getting here at LinkedIn is: if we
>>>>>>>>>>>>>> get a request to delete a record through a null key on a
>>>>>>>>>>>>>> log-compacted topic, we want to delete the record via
>>>>>>>>>>>>>> compaction within a given time period, like 2 days (whatever
>>>>>>>>>>>>>> is required by the policy).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There might be other issues (such as orphan log segments
>>>>>>>>>>>>>> under certain conditions) that lead to GDPR problems, but
>>>>>>>>>>>>>> those are more like something we need to fix anyway,
>>>>>>>>>>>>>> regardless of GDPR.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -- Xiongqi (Wesley) Wu
>>>>>>>>>>>>>> On Mon, Aug 13, 2018 at 2:56 PM, Eno Thereska <eno.there...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for the KIP. I'd like to see a more precise
>>>>>>>>>>>>>>> definition of what part of GDPR you are targeting, as well
>>>>>>>>>>>>>>> as some sort of verification that this KIP actually
>>>>>>>>>>>>>>> addresses the problem. Right now I find this a bit vague:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> "Ability to delete a log message through compaction in a
>>>>>>>>>>>>>>> timely manner has become an important requirement in some
>>>>>>>>>>>>>>> use cases (e.g., GDPR)"
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is there any guarantee that after this KIP the GDPR problem
>>>>>>>>>>>>>>> is solved, or do we need to do something else as well, e.g.,
>>>>>>>>>>>>>>> more KIPs?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> Eno
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Aug 9, 2018 at 4:18 PM, xiongqi wu <xiongq...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Kafka,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This KIP tries to address the GDPR concern of fulfilling a
>>>>>>>>>>>>>>>> deletion request on time through time-based log compaction
>>>>>>>>>>>>>>>> on a compaction-enabled topic:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-354%3A+Time-based+log+compaction+policy
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Any feedback will be appreciated.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Xiongqi (Wesley) Wu

--
Brett Rann
Senior DevOps Engineer
Zendesk International Ltd
395 Collins Street, Melbourne VIC 3000 Australia
Mobile: +61 (0) 418 826 017