I just re-read the KIP, and I am wondering why we would *only add* new methods.
Is it an expected use case to only change the main stores but not the subscription stores? Wondering if we should deprecate the existing methods that only accept a single `Materialized` parameter? -Matthias On 4/9/21 1:32 PM, Marco Aurélio Lotz wrote: > Hi everyone, > > I am fine with sticking with Materialized and adding the info to the > javadoc then - so we keep the footprint smaller. > I will then update the KIP to match what we discussed here and send it for > a vote next week. > > I will raise a new bug-fix ticket and change KAFKA-10383 > <https://issues.apache.org/jira/browse/KAFKA-10383> to become a feature - > if that's ok. > > Cheers, > Marco > > On Wed, Apr 7, 2021 at 4:15 AM Matthias J. Sax <mj...@apache.org> wrote: > >> Just catching up here. >> >> I agree that we have two issue, and the first (align subscription store >> to main store) can be done as a bug-fix. >> >> For the KIP (that addressed the second), I tend to agree that reusing >> `Materialized` might be better as it would keep the API surface area >> smaller. >> >> >> -Matthias >> >> On 4/6/21 8:48 AM, John Roesler wrote: >>> Hi Marco, >>> >>> Just a quick clarification: I just reviewed the Materialized class. It >> looks like the only undesirable members are: >>> 1. Retention >>> 2. Key/Value serdes >>> >>> The underlying store type would be “KeyValueStore<Bytes,byte[]>” , for >> which case the withRetention javadoc already says it’s ignored. >>> >>> Perhaps we could just stick with Materialized by adding a note to the >> Key/Value serdes setters that they are ignored for FKJoin subscription >> stores? >>> >>> Not as elegant as a new config class, but these config classes actually >> bring a fair amount of complexity, so it might be nice to avoid a new one. >>> >>> Thanks, >>> John >>> >>> On Tue, Apr 6, 2021, at 10:28, Marco Aurélio Lotz wrote: >>>> Hi John / Guozhang, >>>> >>>> If I correctly understood John's message, he agrees on having the two >>>> scenarios (piggy-back and api extension). In my view, these two >> scenarios >>>> are separate tasks - the first one is a bug-fix and the second is an >>>> improvement on the current API. >>>> >>>> - bug-fix: On the current API, we change its implementation to piggy >> back >>>> on the materialization method provided to the materialized parameter. >> This >>>> way it will not be opinionated anymore and will not force RocksDb >>>> persistence for subscription store. Thus an in-memory materialized >>>> parameter would imply an in-memory subscription store, for example. >> From my >>>> understanding, the original implementation tried to be as unopionated >>>> towards storage methods as possible - and the current implementation is >> not >>>> allowing that. Was that the case? We would still need to add this >>>> modification to the update notes, since it may affect some deployments. >>>> >>>> - improvement: We extend the API to allow a user to fine tune different >>>> materialization methods for subscription and join store. This is done by >>>> adding a new parameter to the associated methods. >>>> >>>> Does it sound reasonable to you Guozhang? >>>> On your question, does it make sense for an user to decide retention >>>> policies (withRetention method) or caching for subscription stores? I >> can >>>> see why to finetune Logging for example, but in a first moment not these >>>> other behaviours. That's why I am unsure about using Materialized class. >>>> >>>> @John, I will update the KIP with your points as soon as we clarify >> this. >>>> >>>> Cheers, >>>> Marco >>>> >>>> On Tue, Apr 6, 2021 at 1:17 AM Guozhang Wang <wangg...@gmail.com> >> wrote: >>>> >>>>> Thanks Marco / John, >>>>> >>>>> I think the arguments for not piggy-backing on the existing >> Materialized >>>>> makes sense; on the other hand, if we go this route should we just use >> a >>>>> separate Materialized than using an extended / >>>>> narrowed-scoped MaterializedSubscription since it seems we want to >> allow >>>>> users to fully customize this store? >>>>> >>>>> Guozhang >>>>> >>>>> On Thu, Apr 1, 2021 at 5:28 PM John Roesler <vvcep...@apache.org> >> wrote: >>>>> >>>>>> Thanks Marco, >>>>>> >>>>>> Sorry if I caused any trouble! >>>>>> >>>>>> I don’t remember what I was thinking before, but reasoning about it >> now, >>>>>> you might need the fine-grained choice if: >>>>>> >>>>>> 1. The number or size of records in each partition of both tables is >>>>>> small(ish), but the cardinality of the join is very high. Then you >> might >>>>>> want an in-memory table store, but an on-disk subscription store. >>>>>> >>>>>> 2. The number or size of records is very large, but the join >> cardinality >>>>>> is low. Then you might need an on-disk table store, but an in-memory >>>>>> subscription store. >>>>>> >>>>>> 3. You might want a different kind (or differently configured) store >> for >>>>>> the subscription store, since it’s access pattern is so different. >>>>>> >>>>>> If you buy these, it might be good to put the justification into the >> KIP. >>>>>> >>>>>> I’m in favor of the default you’ve proposed. >>>>>> >>>>>> Thanks, >>>>>> John >>>>>> >>>>>> On Mon, Mar 29, 2021, at 04:24, Marco Aurélio Lotz wrote: >>>>>>> Hi Guozhang, >>>>>>> >>>>>>> Apologies for the late answer. Originally that was my proposal - to >>>>>>> piggyback on the provided materialisation method ( >>>>>>> https://issues.apache.org/jira/browse/KAFKA-10383). >>>>>>> John Roesler suggested to us to provide even further fine tuning on >> API >>>>>>> level parameters. Maybe we could see this as two sides of the same >>>>> coin: >>>>>>> >>>>>>> - On the current API, we change it to piggy back on the >> materialization >>>>>>> method provided to the join store. >>>>>>> - We extend the API to allow a user to fine tune different >>>>>> materialization >>>>>>> methods for subscription and join store. >>>>>>> >>>>>>> What do you think? >>>>>>> >>>>>>> Cheers, >>>>>>> Marco >>>>>>> >>>>>>> On Thu, Mar 4, 2021 at 8:04 PM Guozhang Wang <wangg...@gmail.com> >>>>> wrote: >>>>>>> >>>>>>>> Thanks Marco, >>>>>>>> >>>>>>>> Just a quick thought: what if we reuse the existing Materialized >>>>>> object for >>>>>>>> both subscription and join stores, instead of introducing a new >>>>> param / >>>>>>>> class? >>>>>>>> >>>>>>>> Guozhang >>>>>>>> >>>>>>>> On Tue, Mar 2, 2021 at 1:07 AM Marco Aurélio Lotz < >>>>>> cont...@marcolotz.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi folks, >>>>>>>>> >>>>>>>>> I would like to invite everyone to discuss further KIP-718: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-718%3A+Make+KTable+Join+on+Foreign+key+unopinionated >>>>>>>>> >>>>>>>>> I welcome all feedback on it. >>>>>>>>> >>>>>>>>> Kind Regards, >>>>>>>>> Marco Lotz >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> -- Guozhang >>>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> -- Guozhang >>>>> >>>> >> >