I agree with Eno about the renaming. @Eno: can you add this to the Wiki Discussion page?
-Matthias

On 4/21/17 1:11 AM, Eno Thereska wrote:
> Hi Guozhang,
>
> Thanks for the feedback. Comments inline:
>
>> 1. Regarding the user-facing semantics, I thought we will claim that
>> "KTables generated from functions that do NOT specify a table name will NOT
>> be queryable"; but it seems you're proposing to claim it "may not possible
>> to be queryable", i.e. if users happen to know the internal name if it is
>> materialized, she can still query it. I feel its potential benefits are
>> well overwhelmed by the confusion it may introduce. So I'd suggest we just
>> be strict and say "no store name, not queryable".
>
> Sure. "No store name, not queryable" sounds fine. If the user is brave and
> digs deep they will be able to query these stores that are always created
> (like when we do aggregates), but I agree that there is no reason we need to
> make a promise to them if they don't provide a name. I'll change the wording.
>
>> 2. Is there a difference between "calling the overloaded function with
>> store name, but specify the value as null" and "calling the overloaded
>> function without store name"? I thought they will be implemented the same
>> way. But after reading through the wiki I'm not sure. So just clarifying.
>
> There is no difference. I'll clarify.
>
>> 3. Personally I'm still a bit preferring renaming "KTable#toStream" to sth.
>> like "KTable#getChangelog()" or "#toChangelog", since to me it feels more
>> understandable from user's point of view. WDPT?
>
> So I left this out of this KIP, since it's not directly related to the scope.
> Perhaps we can do it in a cleanup KIP?
>
> Thanks
> Eno
>
>>
>> Guozhang
>>
>> On Tue, Apr 11, 2017 at 11:53 AM, Matthias J. Sax <matth...@confluent.io> wrote:
>>
>>> +1
>>>
>>> On 4/11/17 10:34 AM, Eno Thereska wrote:
>>>> Hi Matthias,
>>>>
>>>>> On 11 Apr 2017, at 09:41, Matthias J. Sax <matth...@confluent.io> wrote:
>>>>>
>>>>> Not sure, if we are on the same page already?
>>>>>
>>>>>> "A __store__ can be queryable whether it's materialized or not"
>>>>>
>>>>> This does not make sense -- there is nothing like a non-materialized
>>>>> store -- only non-materialized KTables.
>>>>
>>>> Yes, there are stores that are simple views, i.e., non-materialized.
>>>> Damian has such a prototype for Global Tables (it didn't go into trunk).
>>>> It's still a store, e.g., a KeyValueStore, but when you do a get() it
>>>> recomputes the result on the fly (e.g., it applies a filter).
>>>>
>>>> Eno
>>>>
>>>>>
>>>>>> "Yes, there is nothing that will prevent users from querying
>>>>>> internally generated stores, but they cannot assume a store will
>>>>>> necessarily be queryable."
>>>>>
>>>>> That is what I disagree on. Stores should be queryable all the time.
>>>>>
>>>>> Furthermore, we should have all non-materialized KTables to be
>>>>> queryable, too.
>>>>>
>>>>> Or maybe there is just some misunderstanding going on, and there is some
>>>>> mix-up between "store" and "KTable".
>>>>>
>>>>> -Matthias
>>>>>
>>>>> On 4/11/17 9:34 AM, Eno Thereska wrote:
>>>>>> Hi Matthias,
>>>>>>
>>>>>> See my note: "A store can be queryable whether it's materialized or
>>>>>> not". I think we're on the same page. Stores with an internal name are
>>>>>> also queryable.
>>>>>>
>>>>>> I'm just pointing out that, although that is the case today and with
>>>>>> this KIP, I don't think we have an obligation to make stores with
>>>>>> internal names queryable in the future. However, that is a discussion
>>>>>> for a future point.
>>>>>>
>>>>>> Eno
>>>>>>
>>>>>>> On 11 Apr 2017, at 08:56, Matthias J. Sax <matth...@confluent.io> wrote:
>>>>>>>
>>>>>>> +1 on including GlobalKTable
>>>>>>>
>>>>>>> But I am not sure about the materialization / queryable question. For
>>>>>>> full consistency, all KTables should be queryable nevertheless if they
>>>>>>> are materialized or not.
>>>>>>> -- Maybe this is a second step though (even if
>>>>>>> I would like to get this done right away)
>>>>>>>
>>>>>>> If we don't want all KTables to be queryable, ie, only those KTables
>>>>>>> that are materialized, then we should have a clear definition about
>>>>>>> this, and only allow to query stores, the user did specify a name for.
>>>>>>> This will simplify the reasoning for users, what stores are queryable
>>>>>>> and what not. Otherwise, we still end up confusing user.
>>>>>>>
>>>>>>> -Matthias
>>>>>>>
>>>>>>> On 4/11/17 8:23 AM, Damian Guy wrote:
>>>>>>>> Eno, re: GlobalKTable - yeah that seems fine.
>>>>>>>>
>>>>>>>> On Tue, 11 Apr 2017 at 14:18 Eno Thereska <eno.there...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> About GlobalKTables, I suppose there is no reason why they cannot also use
>>>>>>>>> this KIP for consistency, e.g., today you have:
>>>>>>>>>
>>>>>>>>> public <K, V> GlobalKTable<K, V> globalTable(final Serde<K> keySerde,
>>>>>>>>>                                              final Serde<V> valSerde,
>>>>>>>>>                                              final String topic,
>>>>>>>>>                                              final String storeName)
>>>>>>>>>
>>>>>>>>> For consistency with the KIP you could also have an overload without the
>>>>>>>>> store name, for people who want to construct a global ktable, but don't
>>>>>>>>> care about querying it directly:
>>>>>>>>>
>>>>>>>>> public <K, V> GlobalKTable<K, V> globalTable(final Serde<K> keySerde,
>>>>>>>>>                                              final Serde<V> valSerde,
>>>>>>>>>                                              final String topic)
>>>>>>>>>
>>>>>>>>> Damian, what do you think? I'm thinking of adding this to KIP. Thanks to
>>>>>>>>> Michael for bringing it up.
>>>>>>>>>
>>>>>>>>> Eno
>>>>>>>>>
>>>>>>>>>> On 11 Apr 2017, at 06:13, Eno Thereska <eno.there...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Michael, comments inline:
>>>>>>>>>>
>>>>>>>>>>> On 11 Apr 2017, at 03:25, Michael Noll <mich...@confluent.io> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the updates, Eno!
>>>>>>>>>>>
>>>>>>>>>>> In addition to what has already been said: We should also explicitly
>>>>>>>>>>> mention that this KIP is not touching GlobalKTable. I'm sure that some
>>>>>>>>>>> users will throw KTable and GlobalKTable into one conceptual "it's all
>>>>>>>>>>> tables!" bucket and then wonder how the KIP might affect global tables.
>>>>>>>>>>
>>>>>>>>>> Good point, I'll add.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Damian wrote:
>>>>>>>>>>>> I think if no store name is provided users would still be able to query the
>>>>>>>>>>>> store, just the store name would be some internally generated name. They
>>>>>>>>>>>> would be able to discover those names via the IQ API.
>>>>>>>>>>>
>>>>>>>>>>> I, too, think that users should be able to query a store even if its name
>>>>>>>>>>> was internally generated. After all, the data is already there /
>>>>>>>>>>> materialized.
>>>>>>>>>>
>>>>>>>>>> Yes, there is nothing that will prevent users from querying internally
>>>>>>>>>> generated stores, but they cannot assume a store will necessarily be
>>>>>>>>>> queryable. So if it's there, they can query it. If it's not there, and
>>>>>>>>>> they didn't provide a queryable name, they cannot complain and say "hey,
>>>>>>>>>> where is my store". If they must absolutely be certain that a store is
>>>>>>>>>> queryable, then they must provide a queryable name.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Damian wrote:
>>>>>>>>>>>> I think for some stores it will make sense to not create a physical
>>>>>>>>>>>> store, i.e., for things like `filter`, as this will save the rocksdb
>>>>>>>>>>>> overhead. But I guess that is more of an implementation detail.
>>>>>>>>>>>
>>>>>>>>>>> I think it would help if the KIP would clarify what we'd do in such a
>>>>>>>>>>> case.
>>>>>>>>>>> For example, if the user did not specify a store name for
>>>>>>>>>>> `KTable#filter` -- would it be queryable? If so, would this imply we'd
>>>>>>>>>>> always materialize the state store, or...?
>>>>>>>>>>
>>>>>>>>>> I'll clarify in the KIP with some more examples. Materialization will be
>>>>>>>>>> an internal concept. A store can be queryable whether it's materialized or
>>>>>>>>>> not (e.g., through advanced implementations that compute the value of a
>>>>>>>>>> filter on the fly, rather than materialize the answer).
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Eno
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -Michael
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Apr 11, 2017 at 9:14 AM, Damian Guy <damian....@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Eno,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the update. I agree with what Matthias said. I wonder if the KIP
>>>>>>>>>>>> should talk less about materialization and more about querying? After all,
>>>>>>>>>>>> that is what is being provided from an end user's perspective.
>>>>>>>>>>>>
>>>>>>>>>>>> I think if no store name is provided users would still be able to query the
>>>>>>>>>>>> store, just the store name would be some internally generated name. They
>>>>>>>>>>>> would be able to discover those names via the IQ API.
>>>>>>>>>>>>
>>>>>>>>>>>> I think for some stores it will make sense to not create a physical store,
>>>>>>>>>>>> i.e., for things like `filter`, as this will save the rocksdb overhead. But
>>>>>>>>>>>> I guess that is more of an implementation detail.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Damian
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, 11 Apr 2017 at 00:36 Eno Thereska <eno.there...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Matthias,
>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, this still forces users, to provide a name for store that we
>>>>>>>>>>>>>> must materialize, even if users are not interested in querying the
>>>>>>>>>>>>>> stores. Thus, I would like to have overloads for all currently existing
>>>>>>>>>>>>>> methods having mandatory storeName parameter, with overloads, that do
>>>>>>>>>>>>>> not require the storeName parameter.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Oh yeah, absolutely, this is part of the KIP. I guess I didn't make it
>>>>>>>>>>>>> clear, I'll clarify.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Eno
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 10 Apr 2017, at 16:00, Matthias J. Sax <matth...@confluent.io> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for pushing this KIP Eno.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The update gives a very clear description about the scope, that is super
>>>>>>>>>>>>>> helpful for the discussion!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - To put it into my own words, the KIP focus is on enable to query all
>>>>>>>>>>>>>> KTables.
>>>>>>>>>>>>>> ** The ability to query a store is determined by providing a name for
>>>>>>>>>>>>>> the store.
>>>>>>>>>>>>>> ** At the same time, providing a name -- and thus making a store
>>>>>>>>>>>>>> queryable -- does not say anything about an actual materialization (ie,
>>>>>>>>>>>>>> being queryable and being materialized are orthogonal).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I like this overall a lot. However, I would go one step further.
>>>>>>>>>>>>>> Right now, you suggest to add new overload methods that allow users to specify
>>>>>>>>>>>>>> a storeName -- if `null` is provided and the store is not materialized,
>>>>>>>>>>>>>> we ignore it completely -- if `null` is provided but the store must be
>>>>>>>>>>>>>> materialized we generate an internal name. So far so good.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, this still forces users, to provide a name for store that we
>>>>>>>>>>>>>> must materialize, even if users are not interested in querying the
>>>>>>>>>>>>>> stores. Thus, I would like to have overloads for all currently existing
>>>>>>>>>>>>>> methods having mandatory storeName parameter, with overloads, that do
>>>>>>>>>>>>>> not require the storeName parameter.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Otherwise, we would still have some methods with optional storeName
>>>>>>>>>>>>>> parameter and other method with mandatory storeName parameter -- thus,
>>>>>>>>>>>>>> still some inconsistency.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 4/9/17 8:35 AM, Eno Thereska wrote:
>>>>>>>>>>>>>>> Hi there,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've now done a V2 of the KIP, that hopefully addresses the feedback in
>>>>>>>>>>>>>>> this discussion thread:
>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-114%3A+KTable+materialization+and+improved+semantics
>>>>>>>>>>>>>>> Notable changes:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - clearly outline what is in the scope of the KIP and what is not. We
>>>>>>>>>>>>>>> ran into the issue where lots of useful, but somewhat tangential
>>>>>>>>>>>>>>> discussions came up on interactive queries, declarative DSL etc.
>>>>>>>>>>>>>>> The exact scope of this KIP is spelled out.
>>>>>>>>>>>>>>> - decided to go with overloaded methods, not .materialize(), to stay
>>>>>>>>>>>>>>> within the spirit of the current declarative DSL.
>>>>>>>>>>>>>>> - clarified the deprecation plan
>>>>>>>>>>>>>>> - listed part of the discussion we had under rejected alternatives
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you have any further feedback on this, let's continue on this thread.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank you
>>>>>>>>>>>>>>> Eno
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 1 Feb 2017, at 09:04, Eno Thereska <eno.there...@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks everyone! I think it's time to do a V2 on the KIP so I'll do
>>>>>>>>>>>>>>>> that and we can see how it looks and continue the discussion from there.
>>>>>>>>>>>>>>>> Stay tuned.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> Eno
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 30 Jan 2017, at 17:23, Matthias J. Sax <matth...@confluent.io> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I think Eno's separation is very clear and helpful. In order to
>>>>>>>>>>>>>>>>> streamline this discussion, I would suggest we focus back on point (1)
>>>>>>>>>>>>>>>>> only, as this is the original KIP question.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Even if I started the DSL design discussion somehow, because I thought it
>>>>>>>>>>>>>>>>> might be helpful to resolve both in a single shot, I feel that we have
>>>>>>>>>>>>>>>>> too many options about DSL design and we should split it up in two
>>>>>>>>>>>>>>>>> steps. This will have the disadvantage that we will change the API
>>>>>>>>>>>>>>>>> twice, but still, I think it will be a more focused discussion.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I just had another look at the KIP, and it proposes 3 changes:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. add .materialized() -> IIRC it was suggested to name this
>>>>>>>>>>>>>>>>> .materialize() though (can you maybe update the KIP Eno?)
>>>>>>>>>>>>>>>>> 2. remove print(), writeAsText(), and foreach()
>>>>>>>>>>>>>>>>> 3. rename toStream() to toKStream()
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I completely agree with (2) -- not sure about (3) though because
>>>>>>>>>>>>>>>>> KStreamBuilder also has .stream() and .table() as methods.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> However, we might want to introduce a KStream#toTable() -- this was
>>>>>>>>>>>>>>>>> requested multiple times -- might also be part of a different KIP.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thus, we end up with (1). I would suggest to do a step backward here and
>>>>>>>>>>>>>>>>> instead of a discussion how to express the changes in the DSL (new
>>>>>>>>>>>>>>>>> overload, new methods...) we should discuss what the actual change
>>>>>>>>>>>>>>>>> should be. Like (1) materialize all KTable all the time (2) allow the user
>>>>>>>>>>>>>>>>> to force a materialization to enable querying the KTable (3) allow for
>>>>>>>>>>>>>>>>> queryable non-materialized KTable.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> One more question is, if we want to allow a user-forced materialization
>>>>>>>>>>>>>>>>> only as a local store without changelog, or both (together /
>>>>>>>>>>>>>>>>> independently)? We got some request like this already.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 1/30/17 3:50 AM, Jan Filipiak wrote:
>>>>>>>>>>>>>>>>>> Hi Eno,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> thanks for putting into different points.
>>>>>>>>>>>>>>>>>> I want to put a few remarks
>>>>>>>>>>>>>>>>>> inline.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best Jan
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 30.01.2017 12:19, Eno Thereska wrote:
>>>>>>>>>>>>>>>>>>> So I think there are several important discussion threads that are
>>>>>>>>>>>>>>>>>>> emerging here. Let me try to tease them apart:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 1. inconsistency in what is materialized and what is not, what is
>>>>>>>>>>>>>>>>>>> queryable and what is not. I think we all agree there is some
>>>>>>>>>>>>>>>>>>> inconsistency there and this will be addressed with any of the
>>>>>>>>>>>>>>>>>>> proposed approaches. Addressing the inconsistency is the point of the
>>>>>>>>>>>>>>>>>>> original KIP.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2. the exact API for materializing a KTable. We can specify 1) a
>>>>>>>>>>>>>>>>>>> "store name" (as we do today) or 2) have a ".materialize[d]" call or
>>>>>>>>>>>>>>>>>>> 3) get a handle from a KTable ".getQueryHandle" or 4) have a builder
>>>>>>>>>>>>>>>>>>> construct. So we have discussed 4 options. It is important to remember
>>>>>>>>>>>>>>>>>>> in this discussion that IQ is not designed for just local queries, but
>>>>>>>>>>>>>>>>>>> also for distributed queries. In all cases an identifying name/id is
>>>>>>>>>>>>>>>>>>> needed for the store that the user is interested in querying. So we
>>>>>>>>>>>>>>>>>>> end up with a discussion on who provides the name, the user (as done
>>>>>>>>>>>>>>>>>>> today) or if it is generated automatically (as Jan suggests, as I
>>>>>>>>>>>>>>>>>>> understand it).
>>>>>>>>>>>>>>>>>>> If it is generated automatically we need a way to
>>>>>>>>>>>>>>>>>>> expose these auto-generated names to the users and link them to the
>>>>>>>>>>>>>>>>>>> KTables they care to query.
>>>>>>>>>>>>>>>>>> Hi, the last sentence is what I am currently arguing against. The user
>>>>>>>>>>>>>>>>>> would never see a string-type identifier name or anything. All he gets
>>>>>>>>>>>>>>>>>> is the queryHandle if he executes a get(K) that will be an interactive
>>>>>>>>>>>>>>>>>> query get, with all the finding the right servers that currently have a
>>>>>>>>>>>>>>>>>> copy of this underlying store stuff going on. The nice part is that if
>>>>>>>>>>>>>>>>>> someone retrieves a queryHandle, you know that you have to materialize
>>>>>>>>>>>>>>>>>> (if you are not already) as queries will be coming. Taking away the
>>>>>>>>>>>>>>>>>> confusion mentioned in point 1 IMO.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 3. The exact boundary between the DSL, that is the processing
>>>>>>>>>>>>>>>>>>> language, and the storage/IQ queries, and how we jump from one to the
>>>>>>>>>>>>>>>>>>> other. This is mostly for how we get a handle on a store (so it's
>>>>>>>>>>>>>>>>>>> related to point 2), rather than for how we query the store. I think
>>>>>>>>>>>>>>>>>>> we all agree that we don't want to limit ways one can query a store
>>>>>>>>>>>>>>>>>>> (e.g., using gets or range queries etc) and the query APIs are not in
>>>>>>>>>>>>>>>>>>> the scope of the DSL.
>>>>>>>>>>>>>>>>>> Does the IQ work with range currently? The range would have to be
>>>>>>>>>>>>>>>>>> started on all stores and then merged by maybe the client.
>>>>>>>>>>>>>>>>>> Range forces a
>>>>>>>>>>>>>>>>>> flush to RocksDB currently so I am sure you would get a performance hit
>>>>>>>>>>>>>>>>>> right there. Time-windows might be okay, but I am not sure if the first
>>>>>>>>>>>>>>>>>> version should offer the user range access.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 4. The nature of the DSL and whether it's declarative enough, or
>>>>>>>>>>>>>>>>>>> flexible enough. Damian made the point that he likes the builder
>>>>>>>>>>>>>>>>>>> pattern since users can specify, per KTable, things like caching and
>>>>>>>>>>>>>>>>>>> logging needs. His observation (as I understand it) is that the
>>>>>>>>>>>>>>>>>>> processor API (PAPI) is flexible but doesn't provide any help at all
>>>>>>>>>>>>>>>>>>> to users. The current DSL provides declarative abstractions, but it's
>>>>>>>>>>>>>>>>>>> not fine-grained enough. This point is much broader than the KIP, but
>>>>>>>>>>>>>>>>>>> discussing it in this KIP's context is ok, since we don't want to make
>>>>>>>>>>>>>>>>>>> small piecemeal changes and then realise we're not in the spot we want
>>>>>>>>>>>>>>>>>>> to be.
>>>>>>>>>>>>>>>>>> This is indeed much broader. My guess here is that's why both APIs
>>>>>>>>>>>>>>>>>> exist and helping the users to switch back and forth might be a thing.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Feel free to pitch in if I have misinterpreted something.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>> Eno
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 30 Jan 2017, at 10:22, Jan Filipiak <jan.filip...@trivago.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Eno,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I have a really hard time understanding why we can't.
>>>>>>>>>>>>>>>>>>>> From my point
>>>>>>>>>>>>>>>>>>>> of view everything could be super elegant DSL only + public api for
>>>>>>>>>>>>>>>>>>>> the PAPI-people as already exist.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The above approach implementing a .get(K) on KTable is foolish in my
>>>>>>>>>>>>>>>>>>>> opinion as it would be too late to know that materialisation would be
>>>>>>>>>>>>>>>>>>>> required.
>>>>>>>>>>>>>>>>>>>> But having an API that allows to indicate I want to query this table
>>>>>>>>>>>>>>>>>>>> and then wrapping the say table's processorname can work out really
>>>>>>>>>>>>>>>>>>>> really nice. The only obstacle I see is people not willing to spend
>>>>>>>>>>>>>>>>>>>> the additional time in implementation and just want a quick shot
>>>>>>>>>>>>>>>>>>>> option to make it work.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> For me it would look like this:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> table = builder.table()
>>>>>>>>>>>>>>>>>>>> filteredTable = table.filter()
>>>>>>>>>>>>>>>>>>>> // Does the materialisation; really all names possible, but I'd
>>>>>>>>>>>>>>>>>>>> // rather hide the implication that it materializes
>>>>>>>>>>>>>>>>>>>> rawHandle = table.getQueryHandle()
>>>>>>>>>>>>>>>>>>>> // this would _not_ materialize again of course, the source or the
>>>>>>>>>>>>>>>>>>>> // aggregator would stay the only materialized processors
>>>>>>>>>>>>>>>>>>>> filteredHandle = filteredTable.getQueryHandle()
>>>>>>>>>>>>>>>>>>>> streams = new streams(builder)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This middle part is highly flexible I could imagine to force the user
>>>>>>>>>>>>>>>>>>>> to do something like this.
>>>>>>>>>>>>>>>>>>>> This implies to the user that his streams
>>>>>>>>>>>>>>>>>>>> need to be running
>>>>>>>>>>>>>>>>>>>> instead of propagating the missing initialisation back by exceptions.
>>>>>>>>>>>>>>>>>>>> Also, whether the user is forced to pass the appropriate streams instance
>>>>>>>>>>>>>>>>>>>> back can change.
>>>>>>>>>>>>>>>>>>>> I think it's possible to build multiple streams out of one topology
>>>>>>>>>>>>>>>>>>>> so it would be easiest to implement as well. This is just what I maybe
>>>>>>>>>>>>>>>>>>>> had liked the most
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> streams.start();
>>>>>>>>>>>>>>>>>>>> rawHandle.prepare(streams)
>>>>>>>>>>>>>>>>>>>> filteredHandle.prepare(streams)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> later the users can do
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> V value = rawHandle.get(K)
>>>>>>>>>>>>>>>>>>>> V value = filteredHandle.get(K)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This could free DSL users from anything like storenames and how and
>>>>>>>>>>>>>>>>>>>> what to materialize. Can someone indicate what the problem would be
>>>>>>>>>>>>>>>>>>>> implementing it like this.
>>>>>>>>>>>>>>>>>>>> Yes I am aware that the current IQ API will not support querying by
>>>>>>>>>>>>>>>>>>>> KTableProcessorName instead of statestoreName.
>>>>>>>>>>>>>>>>>>>> But I think that had
>>>>>>>>>>>>>>>>>>>> to change if you want it to be intuitive
>>>>>>>>>>>>>>>>>>>> IMO you gotta apply the filter at read time
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Looking forward to your opinions
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best Jan
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> #DeathToIQMoreAndBetterConnectors
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 30.01.2017 10:42, Eno Thereska wrote:
>>>>>>>>>>>>>>>>>>>>> Hi there,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The inconsistency will be resolved, whether with materialize or
>>>>>>>>>>>>>>>>>>>>> overloaded methods.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> With the discussion on the DSL & stores I feel we've gone in a
>>>>>>>>>>>>>>>>>>>>> slightly different tangent, which is worth discussing nonetheless.
>>>>>>>>>>>>>>>>>>>>> We have entered into an argument around the scope of the DSL. The
>>>>>>>>>>>>>>>>>>>>> DSL has been designed primarily for processing. The DSL does not
>>>>>>>>>>>>>>>>>>>>> dictate ways to access state stores or what kind of queries to
>>>>>>>>>>>>>>>>>>>>> perform on them. Hence, I see the mechanism for accessing storage as
>>>>>>>>>>>>>>>>>>>>> decoupled from the DSL.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> We could think of ways to get store handles from part of the DSL,
>>>>>>>>>>>>>>>>>>>>> like the KTable abstraction. However, subsequent queries will be
>>>>>>>>>>>>>>>>>>>>> store-dependent and not rely on the DSL, hence I'm not sure we get
>>>>>>>>>>>>>>>>>>>>> any grand-convergence DSL-Store here. So I am arguing that the
>>>>>>>>>>>>>>>>>>>>> current way of getting a handle on state stores is fine.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>> Eno
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 30 Jan 2017, at 03:56, Guozhang Wang <wangg...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thinking loud here about the API options (materialize v.s. overloaded
>>>>>>>>>>>>>>>>>>>>>> functions) and its impact on IQ:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 1. The first issue of the current DSL is that, there is
>>>>>>>>>>>>>>>>>>>>>> inconsistency upon whether / how KTables should be materialized:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>   a) in many cases the library HAS TO materialize KTables no
>>>>>>>>>>>>>>>>>>>>>>      matter what, e.g. KStream / KTable aggregation resulted KTables,
>>>>>>>>>>>>>>>>>>>>>>      and hence we enforce users to provide store names and throw RTE
>>>>>>>>>>>>>>>>>>>>>>      if it is null;
>>>>>>>>>>>>>>>>>>>>>>   b) in some other cases, the KTable can be materialized or not; for
>>>>>>>>>>>>>>>>>>>>>>      example in KStreamBuilder.table(), store names can be nullable
>>>>>>>>>>>>>>>>>>>>>>      and in which case the KTable would not be materialized;
>>>>>>>>>>>>>>>>>>>>>>   c) in some other cases, the KTable will never be materialized, for
>>>>>>>>>>>>>>>>>>>>>>      example KTable.filter() resulted KTables, and users have no
>>>>>>>>>>>>>>>>>>>>>>      options to enforce them to be materialized;
>>>>>>>>>>>>>>>>>>>>>>   d) this is related to a), where some KTables are required to be
>>>>>>>>>>>>>>>>>>>>>>      materialized, but we do not enforce users to provide a state
>>>>>>>>>>>>>>>>>>>>>>      store name, e.g. KTables involved in joins; a RTE will be thrown
>>>>>>>>>>>>>>>>>>>>>>      not immediately but later in this case.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 2. The second issue is related to IQ, where state stores are
>>>>>>>>>>>>>>>>>>>>>> accessed by their state store names; so only those KTables that have
>>>>>>>>>>>>>>>>>>>>>> user-specified state stores will be queryable. But because of 1)
>>>>>>>>>>>>>>>>>>>>>> above, many stores may not be of interest to users for IQ but they
>>>>>>>>>>>>>>>>>>>>>> still need to provide a (dummy?) state store name for them; while on
>>>>>>>>>>>>>>>>>>>>>> the other hand users cannot query some state stores, e.g. the ones
>>>>>>>>>>>>>>>>>>>>>> generated by KTable.filter() as there are no APIs for them to specify
>>>>>>>>>>>>>>>>>>>>>> a state store name.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 3. We are aware from user feedback that such backend details would
>>>>>>>>>>>>>>>>>>>>>> better be abstracted away from the DSL layer, where app developers
>>>>>>>>>>>>>>>>>>>>>> should just focus on processing logic, while state stores along with
>>>>>>>>>>>>>>>>>>>>>> their changelogs etc would better be in a different mechanism; same
>>>>>>>>>>>>>>>>>>>>>> arguments have been discussed for serdes / windowing triggers as well.
>>>>>>>>>>>>>>>>>>>>>> For serdes
>>>>>>>>>>>>>>>>>>>>>> specifically, we had a very long discussion about it and concluded
>>>>>>>>>>>>>>>>>>>>>> that, at least in Java7, we cannot completely abstract serde away in
>>>>>>>>>>>>>>>>>>>>>> the DSL, so we choose the other extreme to enforce users to be
>>>>>>>>>>>>>>>>>>>>>> completely aware of the serde requirements when some KTables may need
>>>>>>>>>>>>>>>>>>>>>> to be materialized via overloaded API functions. While for the state
>>>>>>>>>>>>>>>>>>>>>> store names, I feel it is a different argument than serdes (details
>>>>>>>>>>>>>>>>>>>>>> below).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> So to me, for either materialize() v.s. overloaded functions
>>>>>>>>>>>>>>>>>>>>>> directions, the first thing I'd like to resolve is the inconsistency
>>>>>>>>>>>>>>>>>>>>>> issue mentioned above. So in either case: KTable materialization will
>>>>>>>>>>>>>>>>>>>>>> not be affected by user providing state store name or not, but will
>>>>>>>>>>>>>>>>>>>>>> only be decided by the library when it is necessary. More
>>>>>>>>>>>>>>>>>>>>>> specifically, only join operator and builder.table() resulted KTables
>>>>>>>>>>>>>>>>>>>>>> are not always materialized, but are still likely to be materialized
>>>>>>>>>>>>>>>>>>>>>> lazily (e.g. when participated in a join operator).
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> For overloaded functions that would mean: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> a) we have an overloaded function for ALL operators >>> that >>>>>>>>> could >>>>>>>>>>>>>>>>>>>>>> result >>>>>>>>>>>>>>>>>>>>>> in a KTable, and allow it to be null (i.e. for the >>> function >>>>>>>>>>>>> without >>>>>>>>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>>>>>> param it is null by default); >>>>>>>>>>>>>>>>>>>>>> b) null-state-store-name do not indicate that a KTable >>> would >>>>>>>>>>>>>>>>>>>>>> not be >>>>>>>>>>>>>>>>>>>>>> materialized, but that it will not be used for IQ at >>> all >>>>>>>>>>>>> (internal >>>>>>>>>>>>>>>>>>>>>> state >>>>>>>>>>>>>>>>>>>>>> store names will be generated when necessary). >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> For materialize() that would mean: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> a) we will remove state store names from ALL operators >>> that >>>>>>>>>>>>> could >>>>>>>>>>>>>>>>>>>>>> result in a KTable. >>>>>>>>>>>>>>>>>>>>>> b) KTables that not calling materialized do not >>> indicate that >>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>> KTable >>>>>>>>>>>>>>>>>>>>>> would not be materialized, but that it will not be >>> used for >>>>>>>>> IQ >>>>>>>>>>>>> at all >>>>>>>>>>>>>>>>>>>>>> (internal state store names will be generated when >>>>>>>>> necessary). >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Again, in either ways the API itself does not "hint" >>> about >>>>>>>>>>>>> anything >>>>>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>>>> materializing a KTable or not at all; it is still >>> purely >>>>>>>>>>>>> determined >>>>>>>>>>>>>>>>>>>>>> by the >>>>>>>>>>>>>>>>>>>>>> library when parsing the DSL for now. 
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Following these thoughts, I feel that 1) we should >>> probably >>>>>>>>>>>>> change >>>>>>>>>>>>>>>>>>>>>> the name >>>>>>>>>>>>>>>>>>>>>> "materialize" since it may be misleading to users as >>> what >>>>>>>>>>>>> actually >>>>>>>>>>>>>>>>>>>>>> happened >>>>>>>>>>>>>>>>>>>>>> behind the scene, to e.g. Damian suggested >>>>>>>>>>>> "queryableStore(String >>>>>>>>>>>>>>>>>>>>>> storeName)", >>>>>>>>>>>>>>>>>>>>>> which returns a QueryableStateStore, and can replace >>> the >>>>>>>>>>>>>>>>>>>>>> `KafkaStreams.store` function; 2) comparing those two >>> options >>>>>>>>>>>>>>>>>>>>>> assuming we >>>>>>>>>>>>>>>>>>>>>> get rid of the misleading function name, I personally >>> favor >>>>>>>>> not >>>>>>>>>>>>>>>>>>>>>> adding more >>>>>>>>>>>>>>>>>>>>>> overloading functions as it keeps the API simpler. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Guozhang >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 28, 2017 at 2:32 PM, Jan Filipiak >>>>>>>>>>>>>>>>>>>>>> <jan.filip...@trivago.com> >>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> thanks for your mail, felt like this can clarify some >>>>>>>>> things! >>>>>>>>>>>>> The >>>>>>>>>>>>>>>>>>>>>>> thread >>>>>>>>>>>>>>>>>>>>>>> unfortunately split but as all branches close in on >>> what my >>>>>>>>>>>>>>>>>>>>>>> suggestion was >>>>>>>>>>>>>>>>>>>>>>> about Ill pick this to continue >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Of course only the table the user wants to query >>> would be >>>>>>>>>>>>>>>>>>>>>>> materialized. >>>>>>>>>>>>>>>>>>>>>>> (retrieving the queryhandle implies materialisation). 
>>> So In >>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> example of >>>>>>>>>>>>>>>>>>>>>>> KTable::filter if you call >>>>>>>>>>>>>>>>>>>>>>> getIQHandle on both tables only the one source that >>> is there >>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>> materialize and the QueryHandleabstraction would make >>> sure >>>>>>>>> it >>>>>>>>>>>>> gets >>>>>>>>>>>>>>>>>>>>>>> mapped >>>>>>>>>>>>>>>>>>>>>>> and filtered and what not uppon read as usual. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Of Course the Object you would retrieve would maybe >>> only >>>>>>>>> wrap >>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> storeName / table unique identifier and a way to >>> access the >>>>>>>>>>>>> streams >>>>>>>>>>>>>>>>>>>>>>> instance and then basically uses the same mechanism >>> that is >>>>>>>>>>>>>>>>>>>>>>> currently used. >>>>>>>>>>>>>>>>>>>>>>> From my point of view this is the least confusing way >>> for >>>>>>>>> DSL >>>>>>>>>>>>>>>>>>>>>>> users. If >>>>>>>>>>>>>>>>>>>>>>> its to tricky to get a hand on the streams instance >>> one >>>>>>>>> could >>>>>>>>>>>>> ask >>>>>>>>>>>>>>>>>>>>>>> the user >>>>>>>>>>>>>>>>>>>>>>> to pass it in before executing queries, therefore >>> making >>>>>>>>> sure >>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> streams >>>>>>>>>>>>>>>>>>>>>>> instance has been build. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> The effort to implement this is indeed some orders of >>>>>>>>>>>> magnitude >>>>>>>>>>>>>>>>>>>>>>> higher >>>>>>>>>>>>>>>>>>>>>>> than the overloaded materialized call. As long as I >>> could >>>>>>>>> help >>>>>>>>>>>>>>>>>>>>>>> getting a >>>>>>>>>>>>>>>>>>>>>>> different view I am happy. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Best Jan >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 28.01.2017 09:36, Eno Thereska wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Hi Jan, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I understand your concern. 
One implication of not >>> passing >>>>>>>>> any >>>>>>>>>>>>>>>>>>>>>>>> store name >>>>>>>>>>>>>>>>>>>>>>>> and just getting an IQ handle is that all KTables >>> would >>>>>>>>> need >>>>>>>>>>>>> to be >>>>>>>>>>>>>>>>>>>>>>>> materialised. Currently the store name (or proposed >>>>>>>>>>>>>>>>>>>>>>>> .materialize() call) >>>>>>>>>>>>>>>>>>>>>>>> act as hints on whether to materialise the KTable or >>> not. >>>>>>>>>>>>>>>>>>>>>>>> Materialising >>>>>>>>>>>>>>>>>>>>>>>> every KTable can be expensive, although there are >>> some >>>>>>>>> tricks >>>>>>>>>>>>> one >>>>>>>>>>>>>>>>>>>>>>>> can play, >>>>>>>>>>>>>>>>>>>>>>>> e.g., have a virtual store rather than one backed by >>> a >>>>>>>>> Kafka >>>>>>>>>>>>> topic. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> However, even with the above, after getting an IQ >>> handle, >>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>> user would >>>>>>>>>>>>>>>>>>>>>>>> still need to use IQ APIs to query the state. As >>> such, we >>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>>> still >>>>>>>>>>>>>>>>>>>>>>>> continue to be outside the original DSL so this >>> wouldn't >>>>>>>>>>>>> address >>>>>>>>>>>>>>>>>>>>>>>> your >>>>>>>>>>>>>>>>>>>>>>>> original concern. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> So I read this suggestion as simplifying the APIs by >>>>>>>>> removing >>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>> store >>>>>>>>>>>>>>>>>>>>>>>> name, at the cost of having to materialise every >>> KTable. >>>>>>>>> It's >>>>>>>>>>>>>>>>>>>>>>>> definitely an >>>>>>>>>>>>>>>>>>>>>>>> option we'll consider as part of this KIP. 
>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>>>>>>>>>>>> Eno >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 28 Jan 2017, at 06:49, Jan Filipiak < >>>>>>>>>>>>> jan.filip...@trivago.com> >>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> Hi Exactly >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I know it works from the Processor API, but my >>> suggestion >>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>>>> prevent >>>>>>>>>>>>>>>>>>>>>>>>> DSL users dealing with storenames what so ever. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> In general I am pro switching between DSL and >>> Processor >>>>>>>>> API >>>>>>>>>>>>>>>>>>>>>>>>> easily. (In >>>>>>>>>>>>>>>>>>>>>>>>> my Stream applications I do this a lot with >>> reflection and >>>>>>>>>>>>>>>>>>>>>>>>> instanciating >>>>>>>>>>>>>>>>>>>>>>>>> KTableImpl) Concerning this KIP all I say is that >>> there >>>>>>>>>>>> should >>>>>>>>>>>>>>>>>>>>>>>>> be a DSL >>>>>>>>>>>>>>>>>>>>>>>>> concept of "I want to expose this __KTable__. This >>> can be >>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>>> Method like >>>>>>>>>>>>>>>>>>>>>>>>> KTable::retrieveIQHandle():InteractiveQueryHandle, >>> the >>>>>>>>>>>> table >>>>>>>>>>>>>>>>>>>>>>>>> would know >>>>>>>>>>>>>>>>>>>>>>>>> to materialize, and the user had a reference to the >>> "store >>>>>>>>>>>>> and the >>>>>>>>>>>>>>>>>>>>>>>>> distributed query mechanism by the Interactive Query >>>>>>>>> Handle" >>>>>>>>>>>>>>>>>>>>>>>>> under the hood >>>>>>>>>>>>>>>>>>>>>>>>> it can use the same mechanism as the PIP people >>> again. 
>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I hope you see my point J >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Best Jan >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> #DeathToIQMoreAndBetterConnectors :) >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On 27.01.2017 21:59, Matthias J. Sax wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Jan, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> the IQ feature is not limited to Streams DSL but >>> can also >>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>> used for >>>>>>>>>>>>>>>>>>>>>>>>>> Stores used in PAPI. Thus, we need a mechanism >>> that does >>>>>>>>>>>> work >>>>>>>>>>>>>>>>>>>>>>>>>> for PAPI >>>>>>>>>>>>>>>>>>>>>>>>>> and DSL. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Nevertheless I see your point and I think we could >>>>>>>>> provide >>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>>>> better API >>>>>>>>>>>>>>>>>>>>>>>>>> for KTable stores including the discovery of remote >>>>>>>>> shards >>>>>>>>>>>> of >>>>>>>>>>>>>>>>>>>>>>>>>> the same >>>>>>>>>>>>>>>>>>>>>>>>>> KTable. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> @Michael: Yes, right now we do have a lot of >>> overloads >>>>>>>>> and >>>>>>>>>>>> I >>>>>>>>>>>>> am >>>>>>>>>>>>>>>>>>>>>>>>>> not a >>>>>>>>>>>>>>>>>>>>>>>>>> big fan of those -- I would rather prefer a builder >>>>>>>>>>>> pattern. >>>>>>>>>>>>>>>>>>>>>>>>>> But that >>>>>>>>>>>>>>>>>>>>>>>>>> might be a different discussion (nevertheless, if >>> we >>>>>>>>> would >>>>>>>>>>>>> aim >>>>>>>>>>>>>>>>>>>>>>>>>> for a API >>>>>>>>>>>>>>>>>>>>>>>>>> rework, we should get the changes with regard to >>> stores >>>>>>>>>>>> right >>>>>>>>>>>>>>>>>>>>>>>>>> from the >>>>>>>>>>>>>>>>>>>>>>>>>> beginning on, in order to avoid a redesign later >>> on.) 
>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> something like: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> stream.groupByKey() >>>>>>>>>>>>>>>>>>>>>>>>>> .window(TimeWindow.of(5000)) >>>>>>>>>>>>>>>>>>>>>>>>>> .aggregate(...) >>>>>>>>>>>>>>>>>>>>>>>>>> .withAggValueSerde(new CustomTypeSerde()) >>>>>>>>>>>>>>>>>>>>>>>>>> .withStoreName("storeName"); >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> (This would also reduce JavaDoc redundancy -- >>> maybe a >>>>>>>>>>>>> personal >>>>>>>>>>>>>>>>>>>>>>>>>> pain >>>>>>>>>>>>>>>>>>>>>>>>>> point right now :)) >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On 1/27/17 11:10 AM, Jan Filipiak wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Yeah, >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Maybe my bad that I refuse to look into IQ as i >>> don't >>>>>>>>> find >>>>>>>>>>>>> them >>>>>>>>>>>>>>>>>>>>>>>>>>> anywhere >>>>>>>>>>>>>>>>>>>>>>>>>>> close to being interesting. The Problem IMO is >>> that >>>>>>>>> people >>>>>>>>>>>>>>>>>>>>>>>>>>> need to know >>>>>>>>>>>>>>>>>>>>>>>>>>> the Store name), so we are working on different >>> levels >>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>> achieve a >>>>>>>>>>>>>>>>>>>>>>>>>>> single goal. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> What is your peoples opinion on having a method on >>>>>>>>> KTABLE >>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>>>> returns >>>>>>>>>>>>>>>>>>>>>>>>>>> them something like a Keyvalue store. There is of >>> course >>>>>>>>>>>>>>>>>>>>>>>>>>> problems like >>>>>>>>>>>>>>>>>>>>>>>>>>> "it cant be used before the streamthreads are >>> going and >>>>>>>>>>>>>>>>>>>>>>>>>>> groupmembership >>>>>>>>>>>>>>>>>>>>>>>>>>> is established..."
but the benefit would be that >>> for the >>>>>>>>>>>>> user >>>>>>>>>>>>>>>>>>>>>>>>>>> there is >>>>>>>>>>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>>>>> consistent way of saying "Hey I need it >>> materialized as >>>>>>>>>>>>>>>>>>>>>>>>>>> querries gonna >>>>>>>>>>>>>>>>>>>>>>>>>>> be comming" + already get a Thing that he can >>> execute >>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>> querries on >>>>>>>>>>>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>>>>>>>> 1 step. >>>>>>>>>>>>>>>>>>>>>>>>>>> What I think is unintuitive here is you need to >>> say >>>>>>>>>>>>>>>>>>>>>>>>>>> materialize on this >>>>>>>>>>>>>>>>>>>>>>>>>>> Ktable and then you go somewhere else and find >>> its store >>>>>>>>>>>>> name >>>>>>>>>>>>>>>>>>>>>>>>>>> and then >>>>>>>>>>>>>>>>>>>>>>>>>>> you go to the kafkastreams instance and ask for >>> the >>>>>>>>> store >>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>>>>>>>>>>> name. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> So one could the user help to stay in DSL land and >>>>>>>>>>>> therefore >>>>>>>>>>>>>>>>>>>>>>>>>>> maybe >>>>>>>>>>>>>>>>>>>>>>>>>>> confuse him less. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Best Jan >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> #DeathToIQMoreAndBetterConnectors :) >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> On 27.01.2017 16:51, Damian Guy wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I think Jan is saying that they don't always >>> need to be >>>>>>>>>>>>>>>>>>>>>>>>>>>> materialized, >>>>>>>>>>>>>>>>>>>>>>>>>>>> i.e., >>>>>>>>>>>>>>>>>>>>>>>>>>>> filter just needs to apply the ValueGetter, it >>> doesn't >>>>>>>>>>>>> need yet >>>>>>>>>>>>>>>>>>>>>>>>>>>> another >>>>>>>>>>>>>>>>>>>>>>>>>>>> physical state store. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, 27 Jan 2017 at 15:49 Michael Noll < >>>>>>>>>>>>> mich...@confluent.io> >>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Like Damian, and for the same reasons, I am more >>> in >>>>>>>>> favor >>>>>>>>>>>>> of >>>>>>>>>>>>>>>>>>>>>>>>>>>>> overloading >>>>>>>>>>>>>>>>>>>>>>>>>>>>> methods rather than introducing `materialize()`. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> FWIW, we already have a similar API setup for >>> e.g. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> `KTable#through(topicName, stateStoreName)`. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> A related but slightly different question is >>> what e.g. >>>>>>>>>>>> Jan >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Filipiak >>>>>>>>>>>>>>>>>>>>>>>>>>>>> mentioned earlier in this thread: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think we need to explain more clearly why >>> KIP-114 >>>>>>>>>>>>> doesn't >>>>>>>>>>>>>>>>>>>>>>>>>>>>> propose >>>>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>> seemingly simpler solution of always >>> materializing >>>>>>>>>>>>> tables/state >>>>>>>>>>>>>>>>>>>>>>>>>>>>> stores. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 27, 2017 at 4:38 PM, Jan Filipiak < >>>>>>>>>>>>>>>>>>>>>>>>>>>>> jan.filip...@trivago.com> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Yeah its confusing, Why shoudn't it be >>> querable by >>>>>>>>> IQ? >>>>>>>>>>>> If >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> you uses >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ValueGetter of Filter it will apply the filter >>> and >>>>>>>>>>>>> should be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> completely >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> transparent as to if another processor or IQ is >>>>>>>>>>>> accessing >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it? 
How >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> can >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> new method help? >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I cannot see the reason for the additional >>>>>>>>> materialize >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> method being >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> required! Hence I suggest leave it alone. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> regarding removing the others I dont have >>> strong >>>>>>>>>>>> opinions >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and it >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> seems to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> be unrelated. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best Jan >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 26.01.2017 20:48, Eno Thereska wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Forwarding this thread to the users list too >>> in case >>>>>>>>>>>>> people >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> like >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> comment. It is also on the dev list. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Eno >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Begin forwarded message: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> From: "Matthias J. 
Sax" < >>> matth...@confluent.io> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Subject: Re: [DISCUSS] KIP-114: KTable >>>>>>>>>>>> materialization >>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> improved >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> semantics >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Date: 24 January 2017 at 19:30:10 GMT >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> To: dev@kafka.apache.org >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Reply-To: dev@kafka.apache.org >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That not what I meant by "huge impact". >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I refer to the actions related to >>> materialize a >>>>>>>>>>>> KTable: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> creating a >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> RocksDB store and a changelog topic -- users >>> should >>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> aware about >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> runtime implication and this is better >>> expressed by >>>>>>>>>>>> an >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> explicit >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> method >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> call, rather than implicitly triggered by >>> using a >>>>>>>>>>>>> different >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> overload of >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a method. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 1/24/17 1:35 AM, Damian Guy wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think your definition of a huge impact and >>> mine >>>>>>>>> are >>>>>>>>>>>>> rather >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> different >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ;-P >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Overloading a few methods is not really a >>> huge >>>>>>>>>>>> impact >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> IMO. 
It is >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> also a >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sacrifice worth making for readability, >>> usability of >>>>>>>>>>>> the >>>>>>>>>>>>> API. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, 23 Jan 2017 at 17:55 Matthias J. >>> Sax < >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> matth...@confluent.io> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I understand your argument, but do not >>> agree with >>>>>>>>>>>> it. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Your first version (even if the "flow" is >>> not as >>>>>>>>>>>>> nice) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is more >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> explicit >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> than the second version. Adding a >>> stateStoreName >>>>>>>>>>>>> parameter >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is quite >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> implicit but has a huge impact -- thus, I >>> prefer >>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> rather more >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> verbose >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> but explicit version. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 1/23/17 1:39 AM, Damian Guy wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm not a fan of materialize. I think it >>>>>>>>> interrupts >>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> flow, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> i.e, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> table.mapValue(..).materialize().join(..).materialize() >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> compared to: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> table.mapValues(..).join(..) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I know which one i prefer. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> My preference is stil to provide >>> overloaded >>>>>>>>>>>> methods >>>>>>>>>>>>> where >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> people can >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> specify the store names if they want, >>> otherwise >>>>>>>>> we >>>>>>>>>>>>> just >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> generate >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> them. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, 23 Jan 2017 at 05:30 Matthias J. Sax >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <matth...@confluent.io >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> thanks for the KIP Eno! Here are my 2 >>> cents: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1) I like Guozhang's proposal about >>> removing >>>>>>>>>>>> store >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> name from >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> KTable >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> methods and generate internal names >>> (however, I >>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> do this >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> as >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> overloads). Furthermore, I would not >>> force >>>>>>>>> users >>>>>>>>>>>>> to call >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .materialize() >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> if they want to query a store, but add >>> one more >>>>>>>>>>>>> method >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .stateStoreName() >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that returns the store name if the >>> KTable is >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> materialized. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thus, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> also >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .materialize() must not necessarily have a >>> parameter >>>>>>>>>>>>> storeName >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (ie, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> we >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> should have some overloads here). >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would also not allow to provide a >>> null store >>>>>>>>>>>>> name (to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> indicate no >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> materialization if not necessary) but >>> throw an >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> exception. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This yields some simplification (see >>> below). >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2) I also like Guozhang's proposal about >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> KStream#toTable() >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. What will happen when you call >>> materialize >>>>>>>>> on >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> KTable >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that is >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> already >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> materialized? Will it create another >>>>>>>>> StateStore >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (providing >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> name >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> different), throw an Exception? >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Currently an exception is thrown, but >>> see >>>>>>>>> below. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we follow approach (1) from >>> Guozhang, there >>>>>>>>>>>> is >>>>>>>>>>>>> no >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> need to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> worry >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a second materialization and also no >>> exception >>>>>>>>>>>>> must be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> throws. A >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> call to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .materialize() basically sets a >>> "materialized >>>>>>>>>>>>> flag" (ie, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> idempotent >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> operation) and sets a new name. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 4) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Rename toStream() to toKStream() for >>>>>>>>> consistency. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Not sure whether that is really >>> required. We >>>>>>>>>>>> also >>>>>>>>>>>>> use >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> `KStreamBuilder#stream()` and >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> `KStreamBuilder#table()`, for >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> example, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> don't care about the "K" prefix. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Eno's reply: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think changing it to `toKStream` >>> would make >>>>>>>>> it >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> absolutely >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> clear >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> what >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> we are converting it to. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'd say we should probably change the >>>>>>>>>>>>> KStreamBuilder >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> methods >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (but >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this KIP). >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would keep #toStream(). (see below) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 5) We should not remove any methods but >>> only >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> deprecate them. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> A general note: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I do not understand your comments >>> "Rejected >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alternatives". You >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> say >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "Have >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the KTable be the materialized view" was >>>>>>>>>>>> rejected. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> But your >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> KIP >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> actually >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> does exactly this -- the changelog >>> abstraction >>>>>>>>> of >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> KTable is >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> secondary >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> after those changes and the "view" abstraction >>> is >>>>>>>>> what >>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> KTable is. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> And >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> just to be clear, I like this a lot: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - it aligns with the name KTable >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - is aligns with stream-table-duality >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - it aligns with IQ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would say that a KTable is a "view >>>>>>>>> abstraction" >>>>>>>>>>>>> (as >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> materialization is >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> optional). >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 1/22/17 5:05 PM, Guozhang Wang wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP Eno, I have a few >>> meta >>>>>>>>>>>> comments >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and a few >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> detailed >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> comments: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. I like the materialize() function in >>>>>>>>> general, >>>>>>>>>>>>> but >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> like >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> see >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> how other KTable functions should be >>> updated >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> accordingly. For >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> example, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >>> >> >> >> -- >> -- Guozhang >