Just wanted to bump this thread for visbility. Thanks to everyone who has participated in the discussion so far.
Thanks, Farooq On 2023/02/14 19:32:53 Guozhang Wang wrote: > Thanks Farooq, that looks good to me. > > Guozhang > > On Sun, Feb 12, 2023 at 9:01 AM Dharin Shah <dh...@gmail.com> wrote: > > > > Hello Farooq, > > > > This is actually a great idea, we have dealt with this by using an array > > instead of a set. > > +1 to this :) > > > > Thank you, > > Dharin > > > > On Sun, Feb 12, 2023 at 8:11 PM Fq Public <fq...@gmail.com> wrote: > > > > > Hi Guozhang, > > > > > > Thanks for reading over my proposal! > > > > > > > Regarding the format, I'm just thinking if we can change the type of > > > "INT newDataLength" to UINT32? > > > > > > Good idea, I've updated the KIP to reflect UINT32 since it makes clear the > > > value can never be less than zero. > > > > > > > `.equals` default implementation on Object is by reference, so if the > > > groupBy did not generate a new object, that may still pass. This means > > > that > > > even if user does not implement the `.equals` function, if the same object > > > is returned then this feature would still be triggered, is that correct? > > > > > > Correct, I've updated the KIP to call out this edge-case clearly as > > > follows: > > > > > > > Since the default `.equals` implementation for an `Object` is by > > > reference, if a user's `groupBy` returns the same reference for the key, > > > then the oldKey and the newKey will naturally `.equals` each other. This > > > will result in a single event being sent to the repartition topic. This > > > change in behaviour should be considered a "bug-fix" rather than a > > > "breaking change" as the semantics of the operation remain unchanged, the > > > only thing that changes for users is they no longer see transient > > > "inconsistent" states. In the worst case, users in this situation will > > > need to update any strict tests that check specifically for the presence > > > of > > > transient "inconsistent" states. > > > > > > What do you think? > > > > > > Thanks, > > > Farooq > > > > > > On 2023/02/07 18:02:24 Guozhang Wang wrote: > > > > Hello Farooq, > > > > > > > > Thanks for the very detailed proposal! I think this is a great idea. > > > > Just a few thoughts: > > > > > > > > 1. I regret that we over-optimized the Changed serde format for > > > > footprint while making it less extensible. It seems to me that a two > > > > rolling bounce migration is unavoidable.. Regarding the format, I'm > > > > just thinking if we can change the type of "INT newDataLength" to > > > > UINT32? > > > > > > > > 2. `.equals` default implementation on Object is by reference, so if > > > > the groupBy did not generate a new object, that may still pass. This > > > > means that even if user does not implement the `.equals` function, if > > > > the same object is returned then this feature would still be > > > > triggered, is that correct? > > > > > > > > > > > > Guozhang > > > > > > > > On Mon, Feb 6, 2023 at 5:05 AM Fq Public <fq...@gmail.com> wrote: > > > > > > > > > > Hi everyone, > > > > > > > > > > I'd like to share a new KIP for discussion: > > > > > https://cwiki.apache.org/confluence/x/P5VbDg > > > > > > > > > > This could be considered mostly as a "bug fix" but we wanted to raise > > > a KIP > > > > > for discussion because it involves changes to the serialization format > > > of > > > > > an internal topic which raises backward compatibility considerations. > > > > > > > > > > Please take a look and let me know what you think. > > > > > > > > > > Thanks, > > > > > Farooq > > > > >