Hi everyone, I am fine with sticking with Materialized and adding the info to the javadoc then - so we keep the footprint smaller. I will then update the KIP to match what we discussed here and send it for a vote next week.
I will raise a new bug-fix ticket and change KAFKA-10383 <https://issues.apache.org/jira/browse/KAFKA-10383> to become a feature - if that's ok. Cheers, Marco On Wed, Apr 7, 2021 at 4:15 AM Matthias J. Sax <mj...@apache.org> wrote: > Just catching up here. > > I agree that we have two issue, and the first (align subscription store > to main store) can be done as a bug-fix. > > For the KIP (that addressed the second), I tend to agree that reusing > `Materialized` might be better as it would keep the API surface area > smaller. > > > -Matthias > > On 4/6/21 8:48 AM, John Roesler wrote: > > Hi Marco, > > > > Just a quick clarification: I just reviewed the Materialized class. It > looks like the only undesirable members are: > > 1. Retention > > 2. Key/Value serdes > > > > The underlying store type would be “KeyValueStore<Bytes,byte[]>” , for > which case the withRetention javadoc already says it’s ignored. > > > > Perhaps we could just stick with Materialized by adding a note to the > Key/Value serdes setters that they are ignored for FKJoin subscription > stores? > > > > Not as elegant as a new config class, but these config classes actually > bring a fair amount of complexity, so it might be nice to avoid a new one. > > > > Thanks, > > John > > > > On Tue, Apr 6, 2021, at 10:28, Marco Aurélio Lotz wrote: > >> Hi John / Guozhang, > >> > >> If I correctly understood John's message, he agrees on having the two > >> scenarios (piggy-back and api extension). In my view, these two > scenarios > >> are separate tasks - the first one is a bug-fix and the second is an > >> improvement on the current API. > >> > >> - bug-fix: On the current API, we change its implementation to piggy > back > >> on the materialization method provided to the materialized parameter. > This > >> way it will not be opinionated anymore and will not force RocksDb > >> persistence for subscription store. Thus an in-memory materialized > >> parameter would imply an in-memory subscription store, for example. > From my > >> understanding, the original implementation tried to be as unopionated > >> towards storage methods as possible - and the current implementation is > not > >> allowing that. Was that the case? We would still need to add this > >> modification to the update notes, since it may affect some deployments. > >> > >> - improvement: We extend the API to allow a user to fine tune different > >> materialization methods for subscription and join store. This is done by > >> adding a new parameter to the associated methods. > >> > >> Does it sound reasonable to you Guozhang? > >> On your question, does it make sense for an user to decide retention > >> policies (withRetention method) or caching for subscription stores? I > can > >> see why to finetune Logging for example, but in a first moment not these > >> other behaviours. That's why I am unsure about using Materialized class. > >> > >> @John, I will update the KIP with your points as soon as we clarify > this. > >> > >> Cheers, > >> Marco > >> > >> On Tue, Apr 6, 2021 at 1:17 AM Guozhang Wang <wangg...@gmail.com> > wrote: > >> > >>> Thanks Marco / John, > >>> > >>> I think the arguments for not piggy-backing on the existing > Materialized > >>> makes sense; on the other hand, if we go this route should we just use > a > >>> separate Materialized than using an extended / > >>> narrowed-scoped MaterializedSubscription since it seems we want to > allow > >>> users to fully customize this store? > >>> > >>> Guozhang > >>> > >>> On Thu, Apr 1, 2021 at 5:28 PM John Roesler <vvcep...@apache.org> > wrote: > >>> > >>>> Thanks Marco, > >>>> > >>>> Sorry if I caused any trouble! > >>>> > >>>> I don’t remember what I was thinking before, but reasoning about it > now, > >>>> you might need the fine-grained choice if: > >>>> > >>>> 1. The number or size of records in each partition of both tables is > >>>> small(ish), but the cardinality of the join is very high. Then you > might > >>>> want an in-memory table store, but an on-disk subscription store. > >>>> > >>>> 2. The number or size of records is very large, but the join > cardinality > >>>> is low. Then you might need an on-disk table store, but an in-memory > >>>> subscription store. > >>>> > >>>> 3. You might want a different kind (or differently configured) store > for > >>>> the subscription store, since it’s access pattern is so different. > >>>> > >>>> If you buy these, it might be good to put the justification into the > KIP. > >>>> > >>>> I’m in favor of the default you’ve proposed. > >>>> > >>>> Thanks, > >>>> John > >>>> > >>>> On Mon, Mar 29, 2021, at 04:24, Marco Aurélio Lotz wrote: > >>>>> Hi Guozhang, > >>>>> > >>>>> Apologies for the late answer. Originally that was my proposal - to > >>>>> piggyback on the provided materialisation method ( > >>>>> https://issues.apache.org/jira/browse/KAFKA-10383). > >>>>> John Roesler suggested to us to provide even further fine tuning on > API > >>>>> level parameters. Maybe we could see this as two sides of the same > >>> coin: > >>>>> > >>>>> - On the current API, we change it to piggy back on the > materialization > >>>>> method provided to the join store. > >>>>> - We extend the API to allow a user to fine tune different > >>>> materialization > >>>>> methods for subscription and join store. > >>>>> > >>>>> What do you think? > >>>>> > >>>>> Cheers, > >>>>> Marco > >>>>> > >>>>> On Thu, Mar 4, 2021 at 8:04 PM Guozhang Wang <wangg...@gmail.com> > >>> wrote: > >>>>> > >>>>>> Thanks Marco, > >>>>>> > >>>>>> Just a quick thought: what if we reuse the existing Materialized > >>>> object for > >>>>>> both subscription and join stores, instead of introducing a new > >>> param / > >>>>>> class? > >>>>>> > >>>>>> Guozhang > >>>>>> > >>>>>> On Tue, Mar 2, 2021 at 1:07 AM Marco Aurélio Lotz < > >>>> cont...@marcolotz.com> > >>>>>> wrote: > >>>>>> > >>>>>>> Hi folks, > >>>>>>> > >>>>>>> I would like to invite everyone to discuss further KIP-718: > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>> > >>>> > >>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-718%3A+Make+KTable+Join+on+Foreign+key+unopinionated > >>>>>>> > >>>>>>> I welcome all feedback on it. > >>>>>>> > >>>>>>> Kind Regards, > >>>>>>> Marco Lotz > >>>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> -- Guozhang > >>>>>> > >>>>> > >>>> > >>> > >>> > >>> -- > >>> -- Guozhang > >>> > >> >