Hey Luis, Thanks for the explanation. I'd suggest adding the use case to the motivation section.
I think my only hesitation about the header-based compaction is that it is the first time we are putting a schema requirement on header values. I wonder if it's better to leave Kafka agnostic. For example, maybe the compaction strategy could be a plugin which allows custom derivation of the compaction key. Say something like this: class CompactionKey { byte[] key; long version; } interface CompactionStrategy { CompactionKey deriveCompactionKey(Record record); } The idea is to leave schemas in the hands of users. Have you considered something like that? Thanks, Jason On Sat, Aug 11, 2018 at 2:04 AM, Luís Cabral <luis_cab...@yahoo.com.invalid> wrote: > Hi Jason, > > The initial (and still only) requirement I wanted out of this KIP was to > have the header strategy. > This is because I want to be able to version both by date/time or by > (externally provided) sequence, this is specially important if you are > running in multiple environments, which may cause the commonly known issue > of the clocks being slightly de-synchronized. > Having the timestamp strategy was added to the KIP as the result of the > discussions, where it was seen as a potential benefit for other clients who > may prefer that. > > Cheers, > Luís > > From: Jason Gustafson > Sent: 10 August 2018 22:38 > To: dev > Subject: Re: [VOTE] KIP-280: Enhanced log compaction > > Hi Luis, > > It's still not very clear to me why we need the header-based strategy. Can > you elaborate why having the timestamp-based approach alone is not > sufficient? The use case in the motivation just describes a "most recent > snapshot" use case. > > Thanks, > Jason > > On Thu, Aug 9, 2018 at 4:36 AM, Luís Cabral <luis_cab...@yahoo.com.invalid > > > wrote: > > > Hi, > > > > > > So, after a "short" break, I've finally managed to find time to resume > > this KIP. Sorry to all for the delay. > > > > Continuing the conversation of the configurations being global vs topic, > > I've checked this and it seems that they are only available globally. > > > > This configuration is passed to the log cleaner via > "CleanerConfig.scala", > > which only accepts global configurations. This seems intentional, as the > > log cleaner is not mutable and doesn't get instantiated that often. I > think > > that changing this to accept per-topic configuration would be very nice, > > but perhaps as a part of a different KIP. > > > > > > Following the Kafka documentation, these are the settings I'm referring > to: > > > > -- -- > > > > Updating Log Cleaner Configs > > > > Log cleaner configs may be updated dynamically at cluster-default > > level used by all brokers. The changes take effect on the next iteration > of > > log cleaning. One or more of these configs may be updated: > > > > * log.cleaner.threads > > > > * log.cleaner.io.max.bytes.per.second > > > > * log.cleaner.dedupe.buffer.size > > > > * log.cleaner.io.buffer.size > > > > * log.cleaner.io.buffer.load.factor > > > > * log.cleaner.backoff.ms > > > > -- -- > > > > > > > > Please feel free to confirm, otherwise I will update the KIP to reflect > > these configuration nuances in the next few days. > > > > > > Best Regards, > > > > Luis > > > > > > > > On Monday, July 9, 2018, 1:57:38 PM GMT+2, Andras Beni < > > andrasb...@cloudera.com.INVALID> wrote: > > > > > > > > > > > > Hi Luís, > > > > Can you please clarify how the header value has to be encoded in case log > > compaction strategy is 'header'. As I see current PR reads varLong in > > CleanerCache.extractVersion and read String and uses toLong in > > Cleaner.extractVersion while the KIP says no more than 'the header value > > (which must be of type "long")'. > > > > Otherwise +1 for the KIP > > > > As for current implementation: it seems in Cleaner class header key > > "version" is hardwired. > > > > Andras > > > > > > > > On Fri, Jul 6, 2018 at 10:36 PM Jun Rao <j...@confluent.io> wrote: > > > > > Hi, Guozhang, > > > > > > For #4, what you suggested could make sense for timestamp based de-dup, > > but > > > not sure how general it is since the KIP also supports de-dup based on > > > header. > > > > > > Thanks, > > > > > > Jun > > > > > > On Fri, Jul 6, 2018 at 1:12 PM, Guozhang Wang <wangg...@gmail.com> > > wrote: > > > > > > > Hello Jun, > > > > Thanks for your feedbacks. I'd agree on #3 that it's worth adding a > > > special > > > > check to not delete the last message, since although unlikely, it is > > > still > > > > possible that a new active segment gets rolled out but contains no > data > > > > yet, and hence the actual last message in this case would be in a > > > > "compact-able" segment. > > > > > > > > For the second part of #4 you raised, maybe we could educate users to > > > set " > > > > message.timestamp.difference.max.ms" to be no larger than " > > > > log.cleaner.delete.retention.ms" (its default value is > > Long.MAX_VALUE)? > > > A > > > > more aggressive approach would be changing the default value of the > > > former > > > > to be the value of the latter if: > > > > > > > > 1. cleanup.policy = compact OR compact,delete > > > > 2. log.cleaner.compaction.strategy != offset > > > > > > > > Because in this case I think it makes sense to really allow users > send > > > any > > > > data longer than "log.cleaner.delete.retention.ms", WDYT? > > > > > > > > > > > > Guozhang > > > > > > > > > > > > On Fri, Jul 6, 2018 at 11:51 AM, Jun Rao <j...@confluent.io> wrote: > > > > > > > > > Hi, Luis, > > > > > > > > > > 1. The cleaning policy is configurable at both global and topic > > level. > > > > The > > > > > global one has the name log.cleanup.policy and the topic level has > > the > > > > name > > > > > cleanup.policy by just stripping the log prefix. We can probably do > > the > > > > > same for the new configs. > > > > > > > > > > 2. Since this KIP may require an admin to configure a larger dedup > > > buffer > > > > > size, it would be useful to document this impact in the wiki and > the > > > > > release notes. > > > > > > > > > > 3. Yes, it's unlikely for the last message to be removed in the > > current > > > > > implementation since we never clean the active segment. However, in > > > > theory, > > > > > this can happen. So it would be useful to guard this explicitly. > > > > > > > > > > 4. Just thought about another issue. We probably want to be a bit > > > careful > > > > > with key deletion. Currently, one can delete a key by sending a > > message > > > > > with a delete tombstone (a null payload). To prevent a reader from > > > > missing > > > > > a deletion if it's removed too quickly, we depend on a > configuration > > > > > log.cleaner.delete.retention.ms (defaults to 1 day). The delete > > > > tombstone > > > > > will only be physically removed from the log after that amount of > > time. > > > > The > > > > > expectation is that a reader should finish reading to the end of > the > > > log > > > > > after consuming a message within that configured time. With the new > > > > > strategy, we have similar, but slightly different problems. The > first > > > > > problem is that the delete tombstone may be delivered earlier than > an > > > > > outdated record in offset order to a consumer. In order for the > > > consumer > > > > > not to take the outdated record, the consumer should cache the > > deletion > > > > > tombstone for some configured amount of time. We ca probably > > piggyback > > > > this > > > > > on log.cleaner.delete.retention.ms, but we need to document this. > > The > > > > > second problem is that once the delete tombstone is physically > > removed > > > > from > > > > > the log, how can we prevent outdated records to be added > (otherwise, > > > they > > > > > will never be garbage collected)? Not sure what's the best way to > do > > > > this. > > > > > One possible way is to push this back to the application and > require > > > the > > > > > user not to publish outdated records after > > > log.cleaner.delete.retention. > > > > ms > > > > > . > > > > > > > > > > Thanks, > > > > > > > > > > Jun > > > > > > > > > > On Wed, Jul 4, 2018 at 11:11 AM, Luís Cabral > > > > <luis_cab...@yahoo.com.invalid > > > > > > > > > > > wrote: > > > > > > > > > > > Hi Jun, > > > > > > > > > > > > -: 1. I guess both new configurations will be at the topic > level? > > > > > > > > > > > > They will exist in the global configuration, at the very least. > > > > > > I would like to have them on the topic level as well, but there > is > > an > > > > > > inconsistency between the cleanup/compaction properties that > exist > > > > “only > > > > > > globally” vs “globally + per topic”. > > > > > > I haven’t gotten around to investigating why, and if that reason > > > would > > > > > > then also impact the properties I’m suggesting. At first glance > > they > > > > seem > > > > > > to belong with the properties that are "only globally” > configured, > > > but > > > > > > Guozhang has written earlier with a suggestion of a compaction > > > property > > > > > > that works for both (though I haven’t had the time to look into > it > > > yet, > > > > > > unfortunately). > > > > > > > > > > > > -: 2. Since the log cleaner now needs to keep both the offset > and > > > > > another > > > > > > long (say timestamp) in the de-dup map, it reduces the number of > > keys > > > > > that > > > > > > we can keep in the map and thus may require more rounds of > > cleaning. > > > > This > > > > > > is probably not a big issue, but it will be useful to document > this > > > > > impact > > > > > > in the KIP. > > > > > > > > > > > > As a reader, I tend to prefer brief documentation on new features > > > (they > > > > > > tend to be too many for me to find the willpower to read a > 200-page > > > > essay > > > > > > about each one), so that influences me to avoid writing about > every > > > > > > micro-impact that may exist, and simply leave it inferred (as you > > > have > > > > > just > > > > > > done). > > > > > > But I also don’t feel strongly enough about it to argue either > way. > > > So, > > > > > > after reading my argument, if you still insist, I’ll happily add > > this > > > > > there. > > > > > > > > > > > > -: 3. With the new cleaning strategy, we want to be a bit careful > > > with > > > > > > removing the last message in a partition (which is possible now). > > We > > > > need > > > > > > to preserve the offset of the last message so that we don't reuse > > the > > > > > > offset for a different message. One way to simply never remove > the > > > last > > > > > > message. Another way (suggested by Jason) is to create an empty > > > message > > > > > > batch. > > > > > > > > > > > > That is a good point, but isn’t the last message always kept > > > > regardless? > > > > > > In all of my tests with this approach, never have I seen it being > > > > > removed. > > > > > > This is not because I made it so while changing the code, it was > > > simply > > > > > > like this before, which made me smile! > > > > > > Given these results, I just *assumed* (oops) that these scenarios > > > > > > represented the reality, so the compaction would only affected > the > > > > > “tail”, > > > > > > while the “head” remained untouched. Now that you say its > possible > > > that > > > > > the > > > > > > last message actually gets overwritten somehow, I guess a new > > bullet > > > > > point > > > > > > will have to be added to the KIP for this (after I’ve found the > > time > > > to > > > > > > review the portion of the code that enacts this behaviour). > > > > > > > > > > > > Kind Regards, > > > > > > Luís Cabral > > > > > > > > > > > > From: Jun Rao > > > > > > Sent: 03 July 2018 23:58 > > > > > > To: dev > > > > > > Subject: Re: [VOTE] KIP-280: Enhanced log compaction > > > > > > > > > > > > Hi, Luis, > > > > > > > > > > > > Thanks for the KIP. Overall, this seems a useful KIP. A few > > comments > > > > > below. > > > > > > > > > > > > 1. I guess both new configurations will be at the topic level? > > > > > > 2. Since the log cleaner now needs to keep both the offset and > > > another > > > > > long > > > > > > (say timestamp) in the de-dup map, it reduces the number of keys > > that > > > > we > > > > > > can keep in the map and thus may require more rounds of cleaning. > > > This > > > > is > > > > > > probably not a big issue, but it will be useful to document this > > > impact > > > > > in > > > > > > the KIP. > > > > > > 3. With the new cleaning strategy, we want to be a bit careful > with > > > > > > removing the last message in a partition (which is possible now). > > We > > > > need > > > > > > to preserve the offset of the last message so that we don't reuse > > the > > > > > > offset for a different message. One way to simply never remove > the > > > last > > > > > > message. Another way (suggested by Jason) is to create an empty > > > message > > > > > > batch. > > > > > > > > > > > > Jun > > > > > > > > > > > > On Sat, Jun 9, 2018 at 12:39 AM, Luís Cabral > > > > > <luis_cab...@yahoo.com.invalid > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > Any takers on having a look at this KIP and voting on it? > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP- > > > > > > > 280%3A+Enhanced+log+compaction > > > > > > > > > > > > > > Cheers, > > > > > > > Luis > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > -- Guozhang > > > > > > > > > > >