I just re-read the KIP, and I am wondering why we would *only add* new
methods.

Is it an expected use case to only change the main stores but not the
subscription stores?

Wondering if we should deprecate the existing methods that only accept a
single `Materialized` parameter?


-Matthias

On 4/9/21 1:32 PM, Marco Aurélio Lotz wrote:
> Hi everyone,
> 
> I am fine with sticking with Materialized and adding the info to the
> javadoc then - so we keep the footprint smaller.
> I will then update the KIP to match what we discussed here and send it for
> a vote next week.
> 
> I will raise a new bug-fix ticket and change KAFKA-10383
> <https://issues.apache.org/jira/browse/KAFKA-10383> to become a feature -
> if that's ok.
> 
> Cheers,
> Marco
> 
> On Wed, Apr 7, 2021 at 4:15 AM Matthias J. Sax <mj...@apache.org> wrote:
> 
>> Just catching up here.
>>
>> I agree that we have two issue, and the first (align subscription store
>> to main store) can be done as a bug-fix.
>>
>> For the KIP (that addressed the second), I tend to agree that reusing
>> `Materialized` might be better as it would keep the API surface area
>> smaller.
>>
>>
>> -Matthias
>>
>> On 4/6/21 8:48 AM, John Roesler wrote:
>>> Hi Marco,
>>>
>>> Just a quick clarification: I just reviewed the Materialized class. It
>> looks like the only undesirable members are:
>>> 1. Retention
>>> 2. Key/Value serdes
>>>
>>> The underlying store type would be “KeyValueStore<Bytes,byte[]>” , for
>> which case the withRetention javadoc already says it’s ignored.
>>>
>>> Perhaps we could just stick with Materialized by adding a note to the
>> Key/Value serdes setters that they are ignored for FKJoin subscription
>> stores?
>>>
>>> Not as elegant as a new config class, but these config classes actually
>> bring a fair amount of complexity, so it might be nice to avoid a new one.
>>>
>>> Thanks,
>>> John
>>>
>>> On Tue, Apr 6, 2021, at 10:28, Marco Aurélio Lotz wrote:
>>>> Hi John / Guozhang,
>>>>
>>>> If I correctly understood John's message, he agrees on having the two
>>>> scenarios (piggy-back and api extension). In my view, these two
>> scenarios
>>>> are separate tasks - the first one is a bug-fix and the second is an
>>>> improvement on the current API.
>>>>
>>>> - bug-fix: On the current API, we change its implementation to piggy
>> back
>>>> on the materialization method provided to the materialized parameter.
>> This
>>>> way it will not be opinionated anymore and will not force RocksDb
>>>> persistence for subscription store. Thus an in-memory materialized
>>>> parameter would imply an in-memory subscription store, for example.
>> From my
>>>> understanding, the original implementation tried to be as unopionated
>>>> towards storage methods as possible - and the current implementation is
>> not
>>>> allowing that. Was that the case? We would still need to add this
>>>> modification to the update notes, since it may affect some deployments.
>>>>
>>>> - improvement: We extend the API to allow a user to fine tune different
>>>> materialization methods for subscription and join store. This is done by
>>>> adding a new parameter to the associated methods.
>>>>
>>>> Does it sound reasonable to you Guozhang?
>>>> On your question, does it make sense for an user to decide retention
>>>> policies (withRetention method) or caching for subscription stores? I
>> can
>>>> see why to finetune Logging for example, but in a first moment not these
>>>> other behaviours. That's why I am unsure about using Materialized class.
>>>>
>>>> @John, I will update the KIP with your points as soon as we clarify
>> this.
>>>>
>>>> Cheers,
>>>> Marco
>>>>
>>>> On Tue, Apr 6, 2021 at 1:17 AM Guozhang Wang <wangg...@gmail.com>
>> wrote:
>>>>
>>>>> Thanks Marco / John,
>>>>>
>>>>> I think the arguments for not piggy-backing on the existing
>> Materialized
>>>>> makes sense; on the other hand, if we go this route should we just use
>> a
>>>>> separate Materialized than using an extended /
>>>>> narrowed-scoped MaterializedSubscription since it seems we want to
>> allow
>>>>> users to fully customize this store?
>>>>>
>>>>> Guozhang
>>>>>
>>>>> On Thu, Apr 1, 2021 at 5:28 PM John Roesler <vvcep...@apache.org>
>> wrote:
>>>>>
>>>>>> Thanks Marco,
>>>>>>
>>>>>> Sorry if I caused any trouble!
>>>>>>
>>>>>> I don’t remember what I was thinking before, but reasoning about it
>> now,
>>>>>> you might need the fine-grained choice if:
>>>>>>
>>>>>> 1. The number or size of records in each partition of both tables is
>>>>>> small(ish), but the cardinality of the join is very high. Then you
>> might
>>>>>> want an in-memory table store, but an on-disk subscription store.
>>>>>>
>>>>>> 2. The number or size of records is very large, but the join
>> cardinality
>>>>>> is low. Then you might need an on-disk table store, but an in-memory
>>>>>> subscription store.
>>>>>>
>>>>>> 3. You might want a different kind (or differently configured) store
>> for
>>>>>> the subscription store, since it’s access pattern is so different.
>>>>>>
>>>>>> If you buy these, it might be good to put the justification into the
>> KIP.
>>>>>>
>>>>>> I’m in favor of the default you’ve proposed.
>>>>>>
>>>>>> Thanks,
>>>>>> John
>>>>>>
>>>>>> On Mon, Mar 29, 2021, at 04:24, Marco Aurélio Lotz wrote:
>>>>>>> Hi Guozhang,
>>>>>>>
>>>>>>> Apologies for the late answer. Originally that was my proposal - to
>>>>>>> piggyback on the provided materialisation method (
>>>>>>> https://issues.apache.org/jira/browse/KAFKA-10383).
>>>>>>> John Roesler suggested to us to provide even further fine tuning on
>> API
>>>>>>> level parameters. Maybe we could see this as two sides of the same
>>>>> coin:
>>>>>>>
>>>>>>> - On the current API, we change it to piggy back on the
>> materialization
>>>>>>> method provided to the join store.
>>>>>>> - We extend the API to allow a user to fine tune different
>>>>>> materialization
>>>>>>> methods for subscription and join store.
>>>>>>>
>>>>>>> What do you think?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Marco
>>>>>>>
>>>>>>> On Thu, Mar 4, 2021 at 8:04 PM Guozhang Wang <wangg...@gmail.com>
>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks Marco,
>>>>>>>>
>>>>>>>> Just a quick thought: what if we reuse the existing Materialized
>>>>>> object for
>>>>>>>> both subscription and join stores, instead of introducing a new
>>>>> param /
>>>>>>>> class?
>>>>>>>>
>>>>>>>> Guozhang
>>>>>>>>
>>>>>>>> On Tue, Mar 2, 2021 at 1:07 AM Marco Aurélio Lotz <
>>>>>> cont...@marcolotz.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi folks,
>>>>>>>>>
>>>>>>>>> I would like to invite everyone to discuss further KIP-718:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-718%3A+Make+KTable+Join+on+Foreign+key+unopinionated
>>>>>>>>>
>>>>>>>>> I welcome all feedback on it.
>>>>>>>>>
>>>>>>>>> Kind Regards,
>>>>>>>>> Marco Lotz
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> -- Guozhang
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -- Guozhang
>>>>>
>>>>
>>
> 

Reply via email to