Hi all, Seeing that we got all out votes needed with 3 binding votes and 0 nonbinding. I consider this KIP passed.
Cheers, Richard On Fri, Oct 18, 2019 at 9:17 AM Guozhang Wang <wangg...@gmail.com> wrote: > Thanks Richard, I'm +1 on the KIP > > On Thu, Oct 17, 2019 at 3:51 PM Richard Yu <yohan.richard...@gmail.com> > wrote: > > > Hi Guozhang, Jason, > > > > I've updated the KIP to include this warning as well. > > If there is anything else that we need for it, let me know. :) > > Otherwise, we should vote this KIP in. > > > > Cheers, > > Richard > > > > On Thu, Oct 17, 2019 at 10:41 AM Jason Gustafson <ja...@confluent.io> > > wrote: > > > > > Hi Guozhang, > > > > > > It's a fair point. For control records, I think it's a non-issue since > > they > > > are tiny and not batched. So the only case where this might matter is > > large > > > batch deletions. I think the size difference is not a major issue > itself, > > > but I think it's worth mentioning in the KIP the risk of exceeding the > > max > > > message size. I think the code should probably make this more of a soft > > > limit when cleaning. We have run into scenarios in the past as well > where > > > recompression has actually increased message size. We may also want to > be > > > able to upconvert messages to the new format in the future in the > > cleaner. > > > > > > -Jason > > > > > > > > > > > > On Thu, Oct 17, 2019 at 9:08 AM Guozhang Wang <wangg...@gmail.com> > > wrote: > > > > > > > Here's my understanding: when log compaction kicks in, the system > time > > at > > > > the moment would be larger than the message timestamp to be > compacted, > > so > > > > the modification on the batch timestamp would practically be > increasing > > > its > > > > value, and hence the deltas for each inner message would be negative > to > > > > maintain their actual timestamp. Depending on the time diff between > the > > > > actual timestamp of the message and the time when log compaction > > happens, > > > > this negative delta can be large or small since it not long depends > on > > > the > > > > cleaner thread wakeup frequency but also dirty ratio etc. > > > > > > > > With varInt encoding, the num.bytes needed for encode an int varies > > from > > > 1 > > > > to 5 bytes; before compaction, the deltas should be relatively small > > > > positive values compared with the base timestamp, and hence most > > likely 1 > > > > or 2 bytes needed to encode, after compaction, the deltas could be > > > > relatively large negative values that may take more bytes to encode. > > > With a > > > > record batch of 512 in practice, and suppose after compaction each > > record > > > > would take 2 more byte for encoding deltas, that would be 1K more per > > > > batch. Usually it would not be too big of an issue with reasonable > > sized > > > > message, but I just wanted to point out this as a potential > regression. > > > > > > > > > > > > Guozhang > > > > > > > > On Wed, Oct 16, 2019 at 9:36 PM Richard Yu < > yohan.richard...@gmail.com > > > > > > > wrote: > > > > > > > > > Hi Guozhang, > > > > > > > > > > Your understanding basically is on point. > > > > > > > > > > I haven't looked into the details for what happens if we change the > > > base > > > > > timestamp and how its calculated, so I'm not clear on how small the > > > delta > > > > > (or big) is. > > > > > To be fair, would the the delta size pose a big problem if it takes > > up > > > > more > > > > > bytes to encode? > > > > > > > > > > Cheers, > > > > > Richard > > > > > > > > > > On Wed, Oct 16, 2019 at 7:36 PM Guozhang Wang <wangg...@gmail.com> > > > > wrote: > > > > > > > > > > > Hello Richard, > > > > > > > > > > > > Thanks for the KIP, I just have one clarification regarding "So > the > > > > idea > > > > > is > > > > > > to set the base timestamp to the delete horizon and adjust the > > deltas > > > > > > accordingly." My understanding is that during compaction, for > each > > > > > > compacted new segment, we would set its base offset of each batch > > as > > > > the > > > > > > delete horizon, which is the "current system time that cleaner > has > > > seen > > > > > so > > > > > > far", and adjust the delta timestamps of each of the inner > records > > of > > > > the > > > > > > batch (and practically the deltas will be all negative)? > > > > > > > > > > > > If that's case, could we do some back of the envelope calculation > > on > > > > > what's > > > > > > the possible smallest case of deltas? Note that since we use > varInt > > > for > > > > > > delta values for each record, the smaller the negative delta, > that > > > > would > > > > > > take more bytes to encode. > > > > > > > > > > > > Guozhang > > > > > > > > > > > > On Wed, Oct 16, 2019 at 6:48 PM Jason Gustafson < > > ja...@confluent.io> > > > > > > wrote: > > > > > > > > > > > > > +1. Thanks Richard. > > > > > > > > > > > > > > On Wed, Oct 16, 2019 at 10:04 AM Richard Yu < > > > > > yohan.richard...@gmail.com> > > > > > > > wrote: > > > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > > > Want to try to get this KIP wrapped up. So it would be great > if > > > we > > > > > can > > > > > > > get > > > > > > > > some votes. > > > > > > > > > > > > > > > > Cheers, > > > > > > > > Richard > > > > > > > > > > > > > > > > On Tue, Oct 15, 2019 at 12:58 PM Jun Rao <j...@confluent.io> > > > wrote: > > > > > > > > > > > > > > > > > Hi, Richard, > > > > > > > > > > > > > > > > > > Thanks for the updated KIP. +1 from me. > > > > > > > > > > > > > > > > > > Jun > > > > > > > > > > > > > > > > > > On Tue, Oct 15, 2019 at 12:46 PM Richard Yu < > > > > > > > yohan.richard...@gmail.com> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Hi Jun, > > > > > > > > > > > > > > > > > > > > I've updated the link accordingly. :) > > > > > > > > > > Here is the updated KIP link: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-534%3A+Retain+tombstones+and+transaction+markers+for+approximately+delete.retention.ms+milliseconds > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > Richard > > > > > > > > > > > > > > > > > > > > On Mon, Oct 14, 2019 at 5:12 PM Jun Rao < > j...@confluent.io> > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Hi, Richard, > > > > > > > > > > > > > > > > > > > > > > Thanks for the KIP. Looks good to me overall. A few > minor > > > > > > comments > > > > > > > > > below. > > > > > > > > > > > > > > > > > > > > > > 1. Could you change the title from "Retain tombstones" > to > > > > > "Retain > > > > > > > > > > > tombstones and transaction markers" to make it more > > > general? > > > > > > > > > > > > > > > > > > > > > > 2. Could you document which bit in the batch attribute > > will > > > > be > > > > > > used > > > > > > > > for > > > > > > > > > > the > > > > > > > > > > > new flag? The current format of the batch attribute is > > the > > > > > > > following. > > > > > > > > > > > > > > > > > > > > > > * > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ------------------------------------------------------------------------------------------------- > > > > > > > > > > > * | Unused (6-15) | Control (5) | Transactional (4) | > > > > > Timestamp > > > > > > > Type > > > > > > > > > > > (3) | Compression Type (0-2) | > > > > > > > > > > > > > > > > > > > > > > * > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ------------------------------------------------------------------------------------------------- > > > > > > > > > > > > > > > > > > > > > > 3. Could you provide the reasons for the rejected > > > proposals? > > > > > For > > > > > > > > > > > proposal 1, one reason is that it doesn't cover the > > > > transaction > > > > > > > > > > > markers. For proposal 2, one reason is that the > interval > > > > record > > > > > > > > header > > > > > > > > > > > could be exposed to the clients. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Jun > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Oct 14, 2019 at 4:42 PM Richard Yu < > > > > > > > > yohan.richard...@gmail.com > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > > > > > > > > > > > The discussion for KIP-534 seems to have concluded. > > > > > > > > > > > > So I wish to vote this in so that we can get it done. > > > Its a > > > > > > small > > > > > > > > bug > > > > > > > > > > > fix. > > > > > > > > > > > > :) > > > > > > > > > > > > > > > > > > > > > > > > Below is the KIP link: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-534%3A+Retain+tombstones+for+approximately+delete.retention.ms+milliseconds > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > Richard > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > -- Guozhang > > > > > > > > > > > > > > > > > > > > > > > -- > > > > -- Guozhang > > > > > > > > > > > > -- > -- Guozhang >