Eno, re: GlobalKTable - yeah that seems fine.
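
To make the two overloads concrete, here is a minimal sketch of how the
store-name-less variant could delegate to an internally generated name. This is
not the Kafka Streams API: `GlobalTableOverloadSketch`, `TableHandle` and the
`internal-global-store-` prefix are hypothetical names for illustration, and the
serde parameters are left out to keep it self-contained.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a builder; NOT the real KStreamBuilder. */
final class GlobalTableOverloadSketch {

    // Assumed prefix for generated store names (illustrative only).
    static final String INTERNAL_PREFIX = "internal-global-store-";

    private final AtomicInteger nextId = new AtomicInteger();

    /** Minimal stand-in for a global table: topic plus backing store name. */
    static final class TableHandle {
        final String topic;
        final String storeName;
        final boolean userNamed; // true only if the caller chose the name

        TableHandle(String topic, String storeName, boolean userNamed) {
            this.topic = topic;
            this.storeName = storeName;
            this.userNamed = userNamed;
        }
    }

    /** Existing style: the caller supplies the store name. */
    TableHandle globalTable(String topic, String storeName) {
        Objects.requireNonNull(topic, "topic");
        Objects.requireNonNull(storeName, "storeName");
        return new TableHandle(topic, storeName, true);
    }

    /** Proposed style: no store name, so one is generated internally. */
    TableHandle globalTable(String topic) {
        Objects.requireNonNull(topic, "topic");
        String generated = INTERNAL_PREFIX + String.format("%010d", nextId.getAndIncrement());
        return new TableHandle(topic, generated, false);
    }

    public static void main(String[] args) {
        GlobalTableOverloadSketch builder = new GlobalTableOverloadSketch();
        TableHandle named = builder.globalTable("users", "users-store");
        TableHandle unnamed = builder.globalTable("users");
        System.out.println(named.storeName + " userNamed=" + named.userNamed);
        System.out.println(unnamed.storeName + " userNamed=" + unnamed.userNamed);
    }
}
```

Callers that never query the table use the one-argument form; callers that need
a stable, known name keep using the two-argument form.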

On Tue, 11 Apr 2017 at 14:18 Eno Thereska <eno.there...@gmail.com> wrote:

> About GlobalKTables, I suppose there is no reason why they cannot also use
> this KIP for consistency, e.g., today you have:
>
> public <K, V> GlobalKTable<K, V> globalTable(final Serde<K> keySerde,
>                                              final Serde<V> valSerde,
>                                              final String topic,
>                                              final String storeName)
>
> For consistency with the KIP you could also have an overload without the
> store name, for people who want to construct a global ktable, but don't
> care about querying it directly:
>
> public <K, V> GlobalKTable<K, V> globalTable(final Serde<K> keySerde,
>                                              final Serde<V> valSerde,
>                                              final String topic)
>
> Damian, what do you think? I'm thinking of adding this to KIP. Thanks to
> Michael for bringing it up.
>
> Eno
>
>
>
> > On 11 Apr 2017, at 06:13, Eno Thereska <eno.there...@gmail.com> wrote:
> >
> > Hi Michael, comments inline:
> >
> >> On 11 Apr 2017, at 03:25, Michael Noll <mich...@confluent.io> wrote:
> >>
> >> Thanks for the updates, Eno!
> >>
> >> In addition to what has already been said:  We should also explicitly
> >> mention that this KIP is not touching GlobalKTable.  I'm sure that some
> >> users will throw KTable and GlobalKTable into one conceptual "it's all
> >> tables!" bucket and then wonder how the KIP might affect global tables.
> >
> > Good point, I'll add.
> >
> >
> >>
> >> Damian wrote:
> >>> I think if no store name is provided users would still be able to query
> >> the
> >>> store, just the store name would be some internally generated name.
> They
> >>> would be able to discover those names via the IQ API.
> >>
> >> I, too, think that users should be able to query a store even if its
> name
> >> was internally generated.  After all, the data is already there /
> >> materialized.
> >
> > Yes, there is nothing that will prevent users from querying internally
> generated stores, but they cannot
> > assume a store will necessarily be queryable. So if it's there, they can
> query it. If it's not there, and they didn't
> > provide a queryable name, they cannot complain and say "hey, where is my
> store". If they must absolutely be certain that
> > a store is queryable, then they must provide a queryable name.
> >
> >
> >>
> >>
> >> Damian wrote:
> >>> I think for some stores it will make sense to not create a physical
> >> store, i.e.,
> >>> for things like `filter`, as this will save the RocksDB overhead. But I
> >> guess that
> >>> is more of an implementation detail.
> >>
> >> I think it would help if the KIP would clarify what we'd do in such a
> >> case.  For example, if the user did not specify a store name for
> >> `KTable#filter` -- would it be queryable?  If so, would this imply we'd
> >> always materialize the state store, or...?
> >
> > I'll clarify in the KIP with some more examples. Materialization will be
> an internal concept. A store can be queryable whether it's materialized or
> not
> > (e.g., through advanced implementations that compute the value of a
> filter on the fly, rather than materialize the answer).
> >
> > Thanks,
> > Eno
> >
> >
> >>
> >> -Michael
> >>
> >>
> >>
> >>
> >> On Tue, Apr 11, 2017 at 9:14 AM, Damian Guy <damian....@gmail.com>
> wrote:
> >>
> >>> Hi Eno,
> >>>
> >>> Thanks for the update. I agree with what Matthias said. I wonder if
> the KIP
> >>> should talk less about materialization and more about querying? After
> all,
> >>> that is what is being provided from an end-users perspective.
> >>>
> >>> I think if no store name is provided users would still be able to
> query the
> >>> store, just the store name would be some internally generated name.
> They
> >>> would be able to discover those names via the IQ API
> >>>
> >>> I think for some stores it will make sense to not create a physical
> store,
> >>> i.e., for things like `filter`, as this will save the RocksDB
> overhead. But
> >>> I guess that is more of an implementation detail.
> >>>
> >>> Cheers,
> >>> Damian
> >>>
> >>> On Tue, 11 Apr 2017 at 00:36 Eno Thereska <eno.there...@gmail.com>
> wrote:
> >>>
> >>>> Hi Matthias,
> >>>>
> >>>>> However, this still forces users, to provide a name for store that we
> >>>>> must materialize, even if users are not interested in querying the
> >>>>> stores. Thus, I would like to have overloads for all currently
> existing
> >>>>> methods having mandatory storeName parameter, with overloads, that do
> >>>>> not require the storeName parameter.
> >>>>
> >>>>
> >>>> Oh yeah, absolutely, this is part of the KIP. I guess I didn't make it
> >>>> clear, I'll clarify.
> >>>>
> >>>> Thanks
> >>>> Eno
> >>>>
> >>>>
> >>>>> On 10 Apr 2017, at 16:00, Matthias J. Sax <matth...@confluent.io>
> >>> wrote:
> >>>>>
> >>>>> Thanks for pushing this KIP Eno.
> >>>>>
> >>>>> The update gives a very clear description of the scope, which is
> super
> >>>>> helpful for the discussion!
> >>>>>
> >>>>> - To put it into my own words, the KIP's focus is on enabling users to query
> all
> >>>>> KTables.
> >>>>> ** The ability to query a store is determined by providing a name for
> >>>>> the store.
> >>>>> ** At the same time, providing a name -- and thus making a store
> >>>>> queryable -- does not say anything about an actual materialization
> (ie,
> >>>>> being queryable and being materialized are orthogonal).
> >>>>>
> >>>>>
> >>>>> I like this overall a lot. However, I would go one step further.
> Right
> >>>>> now, you suggest to add new overload methods that allow users to
> >>> specify
> >>>>> a storeName -- if `null` is provided and the store is not
> materialized,
> >>>>> we ignore it completely -- if `null` is provided but the store must
> be
> >>>>> materialized we generate an internal name. So far so good.
> >>>>>
> >>>>> However, this still forces users, to provide a name for store that we
> >>>>> must materialize, even if users are not interested in querying the
> >>>>> stores. Thus, I would like to have overloads for all currently
> existing
> >>>>> methods having mandatory storeName parameter, with overloads, that do
> >>>>> not require the storeName parameter.
> >>>>>
> >>>>> Otherwise, we would still have some methods with an optional storeName
> >>>>> parameter and other methods with a mandatory storeName parameter --
> thus,
> >>>>> still some inconsistency.
> >>>>>
> >>>>>
> >>>>> -Matthias
> >>>>>
> >>>>>
> >>>>> On 4/9/17 8:35 AM, Eno Thereska wrote:
> >>>>>> Hi there,
> >>>>>>
> >>>>>> I've now done a V2 of the KIP that hopefully addresses the feedback in
> >>>>>> this discussion thread:
> >>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-114%3A+KTable+materialization+and+improved+semantics
> >>>>>> Notable changes:
> >>>>>>
> >>>>>> - clearly outline what is in the scope of the KIP and what is not.
> We
> >>>> ran into the issue where lots of useful, but somewhat tangential
> >>>> discussions came up on interactive queries, declarative DSL etc. The
> >>> exact
> >>>> scope of this KIP is spelled out.
> >>>>>> - decided to go with overloaded methods, not .materialize(), to stay
> >>>> within the spirit of the current declarative DSL.
> >>>>>> - clarified the deprecation plan
> >>>>>> - listed part of the discussion we had under rejected alternatives
> >>>>>>
> >>>>>> If you have any further feedback on this, let's continue on this
> >>> thread.
> >>>>>>
> >>>>>> Thank you
> >>>>>> Eno
> >>>>>>
> >>>>>>
> >>>>>>> On 1 Feb 2017, at 09:04, Eno Thereska <eno.there...@gmail.com>
> >>> wrote:
> >>>>>>>
> >>>>>>> Thanks everyone! I think it's time to do a V2 on the KIP so I'll do
> >>>> that and we can see how it looks and continue the discussion from
> there.
> >>>> Stay tuned.
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> Eno
> >>>>>>>
> >>>>>>>> On 30 Jan 2017, at 17:23, Matthias J. Sax <matth...@confluent.io>
> >>>> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I think Eno's separation is very clear and helpful. In order to
> >>>>>>>> streamline this discussion, I would suggest we focus back on point
> >>> (1)
> >>>>>>>> only, as this is the original KIP question.
> >>>>>>>>
> >>>>>>>> Even if I started the DSL design discussion somehow, because I
> >>> thought
> >>>> it
> >>>>>>>> might be helpful to resolve both in a single shot, I feel that we
> >>> have
> >>>>>>>> too many options about DSL design and we should split it up in two
> >>>>>>>> steps. This will have the disadvantage that we will change the API
> >>>>>>>> twice, but still, I think it will be a more focused discussion.
> >>>>>>>>
> >>>>>>>> I just had another look at the KIP, and it proposes 3 changes:
> >>>>>>>>
> >>>>>>>> 1. add .materialized() -> IIRC it was suggested to name this
> >>>>>>>> .materialize() though (can you maybe update the KIP Eno?)
> >>>>>>>> 2. remove print(), writeAsText(), and foreach()
> >>>>>>>> 3. rename toStream() to toKStream()
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I completely agree with (2) -- not sure about (3) though because
> >>>>>>>> KStreamBuilder also has .stream() and .table() as methods.
> >>>>>>>>
> >>>>>>>> However, we might want to introduce a KStream#toTable() -- this
> was
> >>>>>>>> requested multiple times -- might also be part of a different KIP.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thus, we end up with (1). I would suggest to do a step backward
> here
> >>>> and
> >>>>>>>> instead of a discussion how to express the changes in the DSL (new
> >>>>>>>> overload, new methods...) we should discuss what the actual change
> >>>>>>>> should be. Like (1) materialize all KTables all the time (2) allow
> the
> >>>> user
> >>>>>>>> to force a materialization to enable querying the KTable (3) allow
> >>> for
> >>>>>>>> queryable non-materialized KTable.
> >>>>>>>>
> >>>>>>>> One more question is, if we want to allow a user-forced
> >>> materialization
> >>>>>>>> only as a local store without changelog, or both (together /
> >>>>>>>> independently)? We got some request like this already.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> -Matthias
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 1/30/17 3:50 AM, Jan Filipiak wrote:
> >>>>>>>>> Hi Eno,
> >>>>>>>>>
> >>>>>>>>> thanks for putting into different points. I want to put a few
> >>> remarks
> >>>>>>>>> inline.
> >>>>>>>>>
> >>>>>>>>> Best Jan
> >>>>>>>>>
> >>>>>>>>> On 30.01.2017 12:19, Eno Thereska wrote:
> >>>>>>>>>> So I think there are several important discussion threads that
> are
> >>>>>>>>>> emerging here. Let me try to tease them apart:
> >>>>>>>>>>
> >>>>>>>>>> 1. inconsistency in what is materialized and what is not, what
> is
> >>>>>>>>>> queryable and what is not. I think we all agree there is some
> >>>>>>>>>> inconsistency there and this will be addressed with any of the
> >>>>>>>>>> proposed approaches. Addressing the inconsistency is the point
> of
> >>>> the
> >>>>>>>>>> original KIP.
> >>>>>>>>>>
> >>>>>>>>>> 2. the exact API for materializing a KTable. We can specify 1) a
> >>>>>>>>>> "store name" (as we do today) or 2) have a ".materialize[d]"
> call
> >>> or
> >>>>>>>>>> 3) get a handle from a KTable ".getQueryHandle" or 4) have a
> >>> builder
> >>>>>>>>>> construct. So we have discussed 4 options. It is important to
> >>>> remember
> >>>>>>>>>> in this discussion that IQ is not designed for just local
> queries,
> >>>> but
> >>>>>>>>>> also for distributed queries. In all cases an identifying
> name/id
> >>> is
> >>>>>>>>>> needed for the store that the user is interested in querying. So
> >>> we
> >>>>>>>>>> end up with a discussion on who provides the name, the user (as
> >>> done
> >>>>>>>>>> today) or if it is generated automatically (as Jan suggests, as
> I
> >>>>>>>>>> understand it). If it is generated automatically we need a way
> to
> >>>>>>>>>> expose these auto-generated names to the users and link them to
> >>> the
> >>>>>>>>>> KTables they care to query.
> >>>>>>>>> Hi, the last sentence is what I am currently arguing against. The
> user
> >>>>>>>>> would never see a string-type identifier name or anything. All he
> >>>> gets
> >>>>>>>>> is the queryHandle if he executes a get(K) that will be an
> >>>> interactive
> >>>>>>>>> query get. with all the finding the right servers that currently
> >>>> have a
> >>>>>>>>> copy of this underlying store stuff going on. The nice part is
> that
> >>>> if
> >>>>>>>>> someone retrieves a queryHandle, you know that you have to
> >>>> materialized
> >>>>>>>>> (if you are not already) as queries will be coming. Taking away
> the
> >>>>>>>>> confusion mentioned in point 1 IMO.
> >>>>>>>>>>
> >>>>>>>>>> 3. The exact boundary between the DSL, that is the processing
> >>>>>>>>>> language, and the storage/IQ queries, and how we jump from one
> to
> >>>> the
> >>>>>>>>>> other. This is mostly for how we get a handle on a store (so
> it's
> >>>>>>>>>> related to point 2), rather than for how we query the store. I
> >>> think
> >>>>>>>>>> we all agree that we don't want to limit ways one can query a
> >>> store
> >>>>>>>>>> (e.g., using gets or range queries etc) and the query APIs are
> not
> >>>> in
> >>>>>>>>>> the scope of the DSL.
> >>>>>>>>> Does the IQ work with range currently? The range would have to be
> >>>>>>>>> started on all stores and then merged by maybe the client. Range
> >>>> force a
> >>>>>>>>> flush to RocksDB currently so I am sure you would get a
> performance
> >>>> hit
> >>>>>>>>> right there. Time-windows might be okay, but I am not sure if the
> >>>> first
> >>>>>>>>> version should offer the user range access.
> >>>>>>>>>>
> >>>>>>>>>> 4. The nature of the DSL and whether it's declarative enough, or
> >>>>>>>>>> flexible enough. Damian made the point that he likes the builder
> >>>>>>>>>> pattern since users can specify, per KTable, things like caching
> >>> and
> >>>>>>>>>> logging needs. His observation (as I understand it) is that the
> >>>>>>>>>> processor API (PAPI) is flexible but doesn't provide any help at
> >>> all
> >>>>>>>>>> to users. The current DSL provides declarative abstractions, but
> >>>> it's
> >>>>>>>>>> not fine-grained enough. This point is much broader than the
> KIP,
> >>>> but
> >>>>>>>>>> discussing it in this KIP's context is ok, since we don't want to
> >>>> make
> >>>>>>>>>> small piecemeal changes and then realise we're not in the spot
> we
> >>>> want
> >>>>>>>>>> to be.
> >>>>>>>>> This is indeed much broader. My guess here is that's why both
> APIs
> >>>>>>>>> exist and helping the users to switch back and forth might be a
> >>>> thing.
> >>>>>>>>>>
> >>>>>>>>>> Feel free to pitch in if I have misinterpreted something.
> >>>>>>>>>>
> >>>>>>>>>> Thanks
> >>>>>>>>>> Eno
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> On 30 Jan 2017, at 10:22, Jan Filipiak <
> jan.filip...@trivago.com
> >>>>
> >>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Hi Eno,
> >>>>>>>>>>>
> >>>>>>>>>>> I have a really hard time understanding why we can't. From my
> >>> point
> >>>>>>>>>>> of view everything could be super elegant DSL only + public api
> >>> for
> >>>>>>>>>>> the PAPI-people as already exist.
> >>>>>>>>>>>
> >>>>>>>>>>> The above approach implementing a .get(K) on KTable is foolish
> in
> >>>> my
> >>>>>>>>>>> opinion as it would be too late to know that materialisation
> would
> >>>> be
> >>>>>>>>>>> required.
> >>>>>>>>>>> But having an API that allows to indicate I want to query this
> >>>> table
> >>>>>>>>>>> and then wrapping the say table's processorname can work out
> >>> really
> >>>>>>>>>>> really nice. The only obstacle I see is people not willing to
> >>> spend
> >>>>>>>>>>> the additional time in implementation and just want a quick
> shot
> >>>>>>>>>>> option to make it work.
> >>>>>>>>>>>
> >>>>>>>>>>> For me it would look like this:
> >>>>>>>>>>>
> >>>>>>>>>>> table =  builder.table()
> >>>>>>>>>>> filteredTable = table.filter()
> >>>>>>>>>>> rawHandle = table.getQueryHandle() // Does the materialisation,
> >>>>>>>>>>> really all names possible but id rather hide the implication of
> >>> it
> >>>>>>>>>>> materializes
> >>>>>>>>>>> filteredTableHandle = filteredTable.getQueryHandle() // this
> >>> would
> >>>>>>>>>>> _not_ materialize again of course, the source or the aggregator
> >>>> would
> >>>>>>>>>>> stay the only materialized processors
> >>>>>>>>>>> streams = new streams(builder)
> >>>>>>>>>>>
> >>>>>>>>>>> This middle part is highly flexible. I could imagine forcing the
> >>>> user
> >>>>>>>>>>> to do something like this. This implies to the user that his
> >>> streams
> >>>>>>>>>>> need to be running
> >>>>>>>>>>> instead of propagating the missing initialisation back by
> >>>> exceptions.
> >>>>>>>>>>> Also if the users is forced to pass the appropriate streams
> >>>> instance
> >>>>>>>>>>> back can change.
> >>>>>>>>>>> I think its possible to build multiple streams out of  one
> >>> topology
> >>>>>>>>>>> so it would be easiest to implement as well. This is just what I
> >>>> maybe
> >>>>>>>>>>> had liked the most
> >>>>>>>>>>>
> >>>>>>>>>>> streams.start();
> >>>>>>>>>>> rawHandle.prepare(streams)
> >>>>>>>>>>> filteredHandle.prepare(streams)
> >>>>>>>>>>>
> >>>>>>>>>>> later the users can do
> >>>>>>>>>>>
> >>>>>>>>>>> V value = rawHandle.get(K)
> >>>>>>>>>>> V value = filteredHandle.get(K)
> >>>>>>>>>>>
> >>>>>>>>>>> This could free DSL users from anything like storenames and how
> >>> and
> >>>>>>>>>>> what to materialize. Can someone indicate what the problem
> would
> >>> be
> >>>>>>>>>>> implementing it like this.
> >>>>>>>>>>> Yes I am aware that the current IQ API will not support
> querying
> >>> by
> >>>>>>>>>>> KTableProcessorName instead of statestoreName. But I think
> that
> >>>> had
> >>>>>>>>>>> to change if you want it to be intuitive
> >>>>>>>>>>> IMO you gotta apply the filter read time
> >>>>>>>>>>>
> >>>>>>>>>>> Looking forward to your opinions
> >>>>>>>>>>>
> >>>>>>>>>>> Best Jan
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> #DeathToIQMoreAndBetterConnectors
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 30.01.2017 10:42, Eno Thereska wrote:
> >>>>>>>>>>>> Hi there,
> >>>>>>>>>>>>
> >>>>>>>>>>>> The inconsistency will be resolved, whether with materialize
> or
> >>>>>>>>>>>> overloaded methods.
> >>>>>>>>>>>>
> >>>>>>>>>>>> With the discussion on the DSL & stores I feel we've gone in a
> >>>>>>>>>>>> slightly different tangent, which is worth discussing
> >>> nonetheless.
> >>>>>>>>>>>> We have entered into an argument around the scope of the DSL.
> >>> The
> >>>>>>>>>>>> DSL has been designed primarily for processing. The DSL does
> not
> >>>>>>>>>>>> dictate ways to access state stores or what kind of queries to
> >>>>>>>>>>>> perform on them. Hence, I see the mechanism for accessing
> >>> storage
> >>>> as
> >>>>>>>>>>>> decoupled from the DSL.
> >>>>>>>>>>>>
> >>>>>>>>>>>> We could think of ways to get store handles from part of the
> >>> DSL,
> >>>>>>>>>>>> like the KTable abstraction. However, subsequent queries will
> be
> >>>>>>>>>>>> store-dependent and not rely on the DSL, hence I'm not sure we
> >>> get
> >>>>>>>>>>>> any grand-convergence DSL-Store here. So I am arguing that the
> >>>>>>>>>>>> current way of getting a handle on state stores is fine.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks
> >>>>>>>>>>>> Eno
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On 30 Jan 2017, at 03:56, Guozhang Wang <wangg...@gmail.com>
> >>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thinking out loud here about the API options (materialize v.s.
> >>>> overloaded
> >>>>>>>>>>>>> functions) and its impact on IQ:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 1. The first issue of the current DSL is that, there is
> >>>>>>>>>>>>> inconsistency upon
> >>>>>>>>>>>>> whether / how KTables should be materialized:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> a) in many cases the library HAS TO materialize KTables no
> >>>>>>>>>>>>> matter what,
> >>>>>>>>>>>>> e.g. KStream / KTable aggregation resulted KTables, and hence
> >>> we
> >>>>>>>>>>>>> enforce
> >>>>>>>>>>>>> users to provide store names and throw RTE if it is null;
> >>>>>>>>>>>>> b) in some other cases, the KTable can be materialized or
> not;
> >>>> for
> >>>>>>>>>>>>> example in KStreamBuilder.table(), store names can be
> nullable
> >>>> and
> >>>>>>>>>>>>> in which
> >>>>>>>>>>>>> case the KTable would not be materialized;
> >>>>>>>>>>>>> c) in some other cases, the KTable will never be
> materialized,
> >>>> for
> >>>>>>>>>>>>> example KTable.filter() resulted KTables, and users have no
> >>>> options to
> >>>>>>>>>>>>> enforce them to be materialized;
> >>>>>>>>>>>>> d) this is related to a), where some KTables are required to
> >>> be
> >>>>>>>>>>>>> materialized, but we do not enforce users to provide a state
> >>>> store
> >>>>>>>>>>>>> name,
> >>>>>>>>>>>>> e.g. KTables involved in joins; a RTE will be thrown not
> >>>>>>>>>>>>> immediately but
> >>>>>>>>>>>>> later in this case.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 2. The second issue is related to IQ, where state stores are
> >>>>>>>>>>>>> accessed by
> >>>>>>>>>>>>> their state store names; so only those KTables that have
> >>>> user-specified
> >>>>>>>>>>>>> state
> >>>>>>>>>>>>> stores will be queryable. But because of 1) above, many
> stores
> >>>> may
> >>>>>>>>>>>>> not be
> >>>>>>>>>>>>> interested to users for IQ but they still need to provide a
> >>>>>>>>>>>>> (dummy?) state
> >>>>>>>>>>>>> store name for them; while on the other hand users cannot
> query
> >>>>>>>>>>>>> some state
> >>>>>>>>>>>>> stores, e.g. the ones generated by KTable.filter() as there
> is
> >>> no
> >>>>>>>>>>>>> APIs for
> >>>>>>>>>>>>> them to specify a state store name.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 3. We are aware from user feedback that such backend details
> >>>> would be
> >>>>>>>>>>>>> better abstracted away from the DSL layer, where app
> >>>> developers
> >>>>>>>>>>>>> should
> >>>>>>>>>>>>> just focus on processing logic, while state stores along with
> >>>> their
> >>>>>>>>>>>>> changelogs etc would better be in a different mechanism; same
> >>>>>>>>>>>>> arguments
> >>>>>>>>>>>>> have been discussed for serdes / windowing triggers as well.
> >>> For
> >>>>>>>>>>>>> serdes
> >>>>>>>>>>>>> specifically, we had a very long discussion about it and
> >>>> concluded
> >>>>>>>>>>>>> that, at
> >>>>>>>>>>>>> least in Java7, we cannot completely abstract serde away in
> the
> >>>>>>>>>>>>> DSL, so we
> >>>>>>>>>>>>> choose the other extreme to enforce users to be completely
> >>> aware
> >>>> of
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>> serde requirements when some KTables may need to be
> >>> materialized
> >>>> via
> >>>>>>>>>>>>> overloaded API functions. While for the state store names, I
> >>> feel
> >>>>>>>>>>>>> it is a
> >>>>>>>>>>>>> different argument than serdes (details below).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> So to me, for either materialize() v.s. overloaded functions
> >>>>>>>>>>>>> directions,
> >>>>>>>>>>>>> the first thing I'd like to resolve is the inconsistency
> issue
> >>>>>>>>>>>>> mentioned
> >>>>>>>>>>>>> above. So in either case: KTable materialization will not be
> >>>> affected
> >>>>>>>>>>>>> by user
> >>>>>>>>>>>>> providing state store name or not, but will only be decided
> by
> >>>> the
> >>>>>>>>>>>>> library
> >>>>>>>>>>>>> when it is necessary. More specifically, only join operator
> and
> >>>>>>>>>>>>> builder.table() resulted KTables are not always materialized,
> >>> but
> >>>>>>>>>>>>> are still
> >>>>>>>>>>>>> likely to be materialized lazily (e.g. when participated in a
> >>>> join
> >>>>>>>>>>>>> operator).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> For overloaded functions that would mean:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> a) we have an overloaded function for ALL operators that
> could
> >>>>>>>>>>>>> result
> >>>>>>>>>>>>> in a KTable, and allow it to be null (i.e. for the function
> >>>> without
> >>>>>>>>>>>>> this
> >>>>>>>>>>>>> param it is null by default);
> >>>>>>>>>>>>> b) a null state store name does not indicate that a KTable would
> >>>>>>>>>>>>> not be
> >>>>>>>>>>>>> materialized, but that it will not be used for IQ at all
> >>>> (internal
> >>>>>>>>>>>>> state
> >>>>>>>>>>>>> store names will be generated when necessary).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> For materialize() that would mean:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> a) we will remove state store names from ALL operators that
> >>>> could
> >>>>>>>>>>>>> result in a KTable.
> >>>>>>>>>>>>> b) not calling materialize() on a KTable does not indicate that
> >>> a
> >>>>>>>>>>>>> KTable
> >>>>>>>>>>>>> would not be materialized, but that it will not be used for
> IQ
> >>>> at all
> >>>>>>>>>>>>> (internal state store names will be generated when
> necessary).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Again, either way the API itself does not "hint" about
> >>>> anything
> >>>>>>>>>>>>> for
> >>>>>>>>>>>>> materializing a KTable or not at all; it is still purely
> >>>> determined
> >>>>>>>>>>>>> by the
> >>>>>>>>>>>>> library when parsing the DSL for now.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Following these thoughts, I feel that 1) we should probably
> >>>> change
> >>>>>>>>>>>>> the name
> >>>>>>>>>>>>> "materialize" since it may be misleading to users as what
> >>>> actually
> >>>>>>>>>>>>> happened
> >>>>>>>>>>>>> behind the scene, to e.g. Damian suggested
> >>> "queryableStore(String
> >>>>>>>>>>>>> storeName)",
> >>>>>>>>>>>>> which returns a QueryableStateStore, and can replace the
> >>>>>>>>>>>>> `KafkaStreams.store` function; 2) comparing those two options
> >>>>>>>>>>>>> assuming we
> >>>>>>>>>>>>> get rid of the misleading function name, I personally favor
> not
> >>>>>>>>>>>>> adding more
> >>>>>>>>>>>>> overloading functions as it keeps the API simpler.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Guozhang
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Sat, Jan 28, 2017 at 2:32 PM, Jan Filipiak
> >>>>>>>>>>>>> <jan.filip...@trivago.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> thanks for your mail, felt like this can clarify some
> things!
> >>>> The
> >>>>>>>>>>>>>> thread
> >>>>>>>>>>>>>> unfortunately split but as all branches close in on what my
> >>>>>>>>>>>>>> suggestion was
> >>>>>>>>>>>>>> about, I'll pick this to continue
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Of course only the table the user wants to query would be
> >>>>>>>>>>>>>> materialized.
> >>>>>>>>>>>>>> (retrieving the queryhandle implies materialisation). So In
> >>> the
> >>>>>>>>>>>>>> example of
> >>>>>>>>>>>>>> KTable::filter if you call
> >>>>>>>>>>>>>> getIQHandle on both tables only the one source that is there
> >>>> would
> >>>>>>>>>>>>>> materialize and the QueryHandleabstraction would make sure
> it
> >>>> gets
> >>>>>>>>>>>>>> mapped
> >>>>>>>>>>>>>> and filtered and what not upon read as usual.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Of Course the Object you would retrieve would maybe only
> wrap
> >>>> the
> >>>>>>>>>>>>>> storeName / table unique identifier and a way to access the
> >>>> streams
> >>>>>>>>>>>>>> instance and then basically uses the same mechanism that is
> >>>>>>>>>>>>>> currently used.
> >>>>>>>>>>>>>> From my point of view this is the least confusing way for
> DSL
> >>>>>>>>>>>>>> users. If
> >>>>>>>>>>>>>> it's too tricky to get a handle on the streams instance one
> could
> >>>> ask
> >>>>>>>>>>>>>> the user
> >>>>>>>>>>>>>> to pass it in before executing queries, therefore making
> sure
> >>>> the
> >>>>>>>>>>>>>> streams
> >>>>>>>>>>>>>> instance has been built.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The effort to implement this is indeed some orders of
> >>> magnitude
> >>>>>>>>>>>>>> higher
> >>>>>>>>>>>>>> than the overloaded materialized call. As long as I could
> help
> >>>>>>>>>>>>>> getting a
> >>>>>>>>>>>>>> different view I am happy.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best Jan
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 28.01.2017 09:36, Eno Thereska wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi Jan,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I understand your concern. One implication of not passing
> any
> >>>>>>>>>>>>>>> store name
> >>>>>>>>>>>>>>> and just getting an IQ handle is that all KTables would
> need
> >>>> to be
> >>>>>>>>>>>>>>> materialised. Currently the store name (or proposed
> >>>>>>>>>>>>>>> .materialize() call)
> >>>>>>>>>>>>>>> act as hints on whether to materialise the KTable or not.
> >>>>>>>>>>>>>>> Materialising
> >>>>>>>>>>>>>>> every KTable can be expensive, although there are some
> tricks
> >>>> one
> >>>>>>>>>>>>>>> can play,
> >>>>>>>>>>>>>>> e.g., have a virtual store rather than one backed by a
> Kafka
> >>>> topic.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> However, even with the above, after getting an IQ handle,
> the
> >>>>>>>>>>>>>>> user would
> >>>>>>>>>>>>>>> still need to use IQ APIs to query the state. As such, we
> >>> would
> >>>>>>>>>>>>>>> still
> >>>>>>>>>>>>>>> continue to be outside the original DSL so this wouldn't
> >>>> address
> >>>>>>>>>>>>>>> your
> >>>>>>>>>>>>>>> original concern.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> So I read this suggestion as simplifying the APIs by
> removing
> >>>> the
> >>>>>>>>>>>>>>> store
> >>>>>>>>>>>>>>> name, at the cost of having to materialise every KTable.
> It's
> >>>>>>>>>>>>>>> definitely an
> >>>>>>>>>>>>>>> option we'll consider as part of this KIP.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks
> >>>>>>>>>>>>>>> Eno
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 28 Jan 2017, at 06:49, Jan Filipiak <
> >>>> jan.filip...@trivago.com>
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>> Hi Exactly
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I know it works from the Processor API, but my suggestion
> >>>> would
> >>>>>>>>>>>>>>>> prevent
> >>>>>>>>>>>>>>>> DSL users dealing with store names whatsoever.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> In general I am pro switching between DSL and Processor
> API
> >>>>>>>>>>>>>>>> easily. (In
> >>>>>>>>>>>>>>>> my Stream applications I do this a lot with reflection and
> >>>>>>>>>>>>>>>> instantiating
> >>>>>>>>>>>>>>>> KTableImpl) Concerning this KIP all I say is that there
> >>> should
> >>>>>>>>>>>>>>>> be a DSL
> >>>>>>>>>>>>>>>> concept of "I want to expose this __KTable__". This can be
> a
> >>>>>>>>>>>>>>>> Method like
> >>>>>>>>>>>>>>>> KTable::retrieveIQHandle():InteractiveQueryHandle, the
> >>> table
> >>>>>>>>>>>>>>>> would know
> >>>>>>>>>>>>>>>> to materialize, and the user had a reference to the "store
> >>>> and the
> >>>>>>>>>>>>>>>> distributed query mechanism by the Interactive Query
> Handle"
> >>>>>>>>>>>>>>>> under the hood
> >>>>>>>>>>>>>>>> it can use the same mechanism as the PIP people again.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I hope you see my point :)
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Best Jan
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> #DeathToIQMoreAndBetterConnectors :)
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On 27.01.2017 21:59, Matthias J. Sax wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Jan,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> the IQ feature is not limited to Streams DSL but can also
> >>> be
> >>>>>>>>>>>>>>>>> used for
> >>>>>>>>>>>>>>>>> Stores used in PAPI. Thus, we need a mechanism that does
> >>> work
> >>>>>>>>>>>>>>>>> for PAPI
> >>>>>>>>>>>>>>>>> and DSL.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Nevertheless I see your point and I think we could
> provide
> >>> a
> >>>>>>>>>>>>>>>>> better API
> >>>>>>>>>>>>>>>>> for KTable stores including the discovery of remote
> shards
> >>> of
> >>>>>>>>>>>>>>>>> the same
> >>>>>>>>>>>>>>>>> KTable.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> @Michael: Yes, right now we do have a lot of overloads
> and
> >>> I
> >>>> am
> >>>>>>>>>>>>>>>>> not a
> >>>>>>>>>>>>>>>>> big fan of those -- I would rather prefer a builder
> >>> pattern.
> >>>>>>>>>>>>>>>>> But that
> >>>>>>>>>>>>>>>>> might be a different discussion (nevertheless, if we
> would
> >>>> aim
> >>>>>>>>>>>>>>>>> for a API
> >>>>>>>>>>>>>>>>> rework, we should get the changes with regard to stores
> >>> right
> >>>>>>>>>>>>>>>>> from the
> >>>>>>>>>>>>>>>>> beginning on, in order to avoid a redesign later on.)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> something like:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> stream.groupByKey()
> >>>>>>>>>>>>>>>>>   .window(TimeWindow.of(5000))
> >>>>>>>>>>>>>>>>>   .aggregate(...)
> >>>>>>>>>>>>>>>>>   .withAggValueSerde(new CustomTypeSerde())
> >>>>>>>>>>>>>>>>>   .withStoreName("storeName");
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> (This would also reduce JavaDoc redundancy -- maybe a
> >>>> personal
> >>>>>>>>>>>>>>>>> pain
> >>>>>>>>>>>>>>>>> point right now :))
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On 1/27/17 11:10 AM, Jan Filipiak wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Yeah,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Maybe my bad that I refuse to look into IQ as I don't
> find
> >>>> them
> >>>>>>>>>>>>>>>>>> anywhere
> >>>>>>>>>>>>>>>>>> close to being interesting. The Problem IMO is that
> people
> >>>>>>>>>>>>>>>>>> need to know
> >>>>>>>>>>>>>>>>>> the Store name, so we are working on different levels
> to
> >>>>>>>>>>>>>>>>>> achieve a
> >>>>>>>>>>>>>>>>>> single goal.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> What is people's opinion on having a method on
> KTABLE
> >>>> that
> >>>>>>>>>>>>>>>>>> returns
> >>>>>>>>>>>>>>>>>> them something like a Keyvalue store. There is of course
> >>>>>>>>>>>>>>>>>> problems like
> >>>>>>>>>>>>>>>>>> "it can't be used before the stream threads are going and
> >>>>>>>>>>>>>>>>>> groupmembership
> >>>>>>>>>>>>>>>>>> is established..." but the benefit would be that for the
> >>>> user
> >>>>>>>>>>>>>>>>>> there is
> >>>>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>> consistent way of saying "Hey I need it materialized as
> >>>>>>>>>>>>>>>>>> queries are going to
> >>>>>>>>>>>>>>>>>> be coming" + already get a Thing that he can execute
> the
> >>>>>>>>>>>>>>>>>> queries on
> >>>>>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>>> 1 step.
> >>>>>>>>>>>>>>>>>> What I think is unintuitive here is you need to say
> >>>>>>>>>>>>>>>>>> materialize on this
> >>>>>>>>>>>>>>>>>> Ktable and then you go somewhere else and find its store
> >>>> name
> >>>>>>>>>>>>>>>>>> and then
> >>>>>>>>>>>>>>>>>> you go to the kafkastreams instance and ask for the
> store
> >>>> with
> >>>>>>>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>>>> name.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> So one could help the user stay in DSL land and
> >>> therefore
> >>>>>>>>>>>>>>>>>> maybe
> >>>>>>>>>>>>>>>>>> confuse him less.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Best Jan
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> #DeathToIQMoreAndBetterConnectors :)
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On 27.01.2017 16:51, Damian Guy wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I think Jan is saying that they don't always need to be
> >>>>>>>>>>>>>>>>>>> materialized,
> >>>>>>>>>>>>>>>>>>> i.e.,
> >>>>>>>>>>>>>>>>>>> filter just needs to apply the ValueGetter, it doesn't
> >>>> need yet
> >>>>>>>>>>>>>>>>>>> another
> >>>>>>>>>>>>>>>>>>> physical state store.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Fri, 27 Jan 2017 at 15:49 Michael Noll <
> >>>> mich...@confluent.io>
> >>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Like Damian, and for the same reasons, I am more in
> favor
> >>>> of
> >>>>>>>>>>>>>>>>>>>> overloading
> >>>>>>>>>>>>>>>>>>>> methods rather than introducing `materialize()`.
> >>>>>>>>>>>>>>>>>>>> FWIW, we already have a similar API setup for e.g.
> >>>>>>>>>>>>>>>>>>>> `KTable#through(topicName, stateStoreName)`.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> A related but slightly different question is what e.g.
> >>> Jan
> >>>>>>>>>>>>>>>>>>>> Filipiak
> >>>>>>>>>>>>>>>>>>>> mentioned earlier in this thread:
> >>>>>>>>>>>>>>>>>>>> I think we need to explain more clearly why KIP-114
> >>>> doesn't
> >>>>>>>>>>>>>>>>>>>> propose
> >>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>> seemingly simpler solution of always materializing
> >>>> tables/state
> >>>>>>>>>>>>>>>>>>>> stores.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Fri, Jan 27, 2017 at 4:38 PM, Jan Filipiak <
> >>>>>>>>>>>>>>>>>>>> jan.filip...@trivago.com>
> >>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>>>>> Yeah, it's confusing. Why shouldn't it be queryable by
> IQ?
> >>> If
> >>>>>>>>>>>>>>>>>>>>> you use
> >>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>> ValueGetter of Filter it will apply the filter and
> >>>> should be
> >>>>>>>>>>>>>>>>>>>>> completely
> >>>>>>>>>>>>>>>>>>>>> transparent as to if another processor or IQ is
> >>> accessing
> >>>>>>>>>>>>>>>>>>>>> it? How
> >>>>>>>>>>>>>>>>>>>>> can
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> new method help?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I cannot see the reason for the additional
> materialize
> >>>>>>>>>>>>>>>>>>>>> method being
> >>>>>>>>>>>>>>>>>>>>> required! Hence I suggest leave it alone.
> >>>>>>>>>>>>>>>>>>>>> Regarding removing the others, I don't have strong
> >>> opinions
> >>>>>>>>>>>>>>>>>>>>> and it
> >>>>>>>>>>>>>>>>>>>>> seems to
> >>>>>>>>>>>>>>>>>>>>> be unrelated.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Best Jan
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On 26.01.2017 20:48, Eno Thereska wrote:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Forwarding this thread to the users list too in case
> >>>> people
> >>>>>>>>>>>>>>>>>>>>> would
> >>>>>>>>>>>>>>>>>>>>>> like
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>> comment. It is also on the dev list.
> >>>>>>>>>>>>>>>>>>>>>> Thanks
> >>>>>>>>>>>>>>>>>>>>>> Eno
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Begin forwarded message:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> From: "Matthias J. Sax" <matth...@confluent.io>
> >>>>>>>>>>>>>>>>>>>>>>> Subject: Re: [DISCUSS] KIP-114: KTable
> >>> materialization
> >>>> and
> >>>>>>>>>>>>>>>>>>>>>>> improved
> >>>>>>>>>>>>>>>>>>>>>>> semantics
> >>>>>>>>>>>>>>>>>>>>>>> Date: 24 January 2017 at 19:30:10 GMT
> >>>>>>>>>>>>>>>>>>>>>>> To: dev@kafka.apache.org
> >>>>>>>>>>>>>>>>>>>>>>> Reply-To: dev@kafka.apache.org
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> That's not what I meant by "huge impact".
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> I refer to the actions related to materializing a
> >>> KTable:
> >>>>>>>>>>>>>>>>>>>>>>> creating a
> >>>>>>>>>>>>>>>>>>>>>>> RocksDB store and a changelog topic -- users should
> >>> be
> >>>>>>>>>>>>>>>>>>>>>>> aware of the
> >>>>>>>>>>>>>>>>>>>>>>> runtime implications and this is better expressed by
> >>> an
> >>>>>>>>>>>>>>>>>>>>>>> explicit
> >>>>>>>>>>>>>>>>>>>>>>> method
> >>>>>>>>>>>>>>>>>>>>>>> call, rather than implicitly triggered by using a
> >>>> different
> >>>>>>>>>>>>>>>>>>>>>>> overload of
> >>>>>>>>>>>>>>>>>>>>>>> a method.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> On 1/24/17 1:35 AM, Damian Guy wrote:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> I think your definition of a huge impact and mine
> are
> >>>> rather
> >>>>>>>>>>>>>>>>>>>>>>>> different
> >>>>>>>>>>>>>>>>>>>>>>>> ;-P
> >>>>>>>>>>>>>>>>>>>>>>>> Overloading a few methods  is not really a huge
> >>> impact
> >>>>>>>>>>>>>>>>>>>>>>>> IMO. It is
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> also a
> >>>>>>>>>>>>>>>>>>>>> sacrifice worth making for readability, usability of
> >>> the
> >>>> API.
> >>>>>>>>>>>>>>>>>>>>>>>> On Mon, 23 Jan 2017 at 17:55 Matthias J. Sax <
> >>>>>>>>>>>>>>>>>>>>>>>> matth...@confluent.io>
> >>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> I understand your argument, but do not agree with
> >>> it.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Your first version (even if the "flow" is not as
> >>>> nice)
> >>>>>>>>>>>>>>>>>>>>>>>>> is more
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> explicit
> >>>>>>>>>>>>>>>>>>>>> than the second version. Adding a stateStoreName
> >>>> parameter
> >>>>>>>>>>>>>>>>>>>>> is quite
> >>>>>>>>>>>>>>>>>>>>>>>>> implicit but has a huge impact -- thus, I prefer
> >>> the
> >>>>>>>>>>>>>>>>>>>>>>>>> rather more
> >>>>>>>>>>>>>>>>>>>>>>>>> verbose
> >>>>>>>>>>>>>>>>>>>>>>>>> but explicit version.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> On 1/23/17 1:39 AM, Damian Guy wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> I'm not a fan of materialize. I think it
> interrupts
> >>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>> flow,
> >>>>>>>>>>>>>>>>>>>>>>>>>> i.e,
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> table.mapValues(..).materialize().join(..).materialize()
> >>>>>>>>>>>>>>>>>>>>>>>>>> compared to:
> >>>>>>>>>>>>>>>>>>>>>>>>>> table.mapValues(..).join(..)
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> I know which one I prefer.
> >>>>>>>>>>>>>>>>>>>>>>>>>> My preference is still to provide overloaded
> >>> methods
> >>>> where
> >>>>>>>>>>>>>>>>>>>>>>>>>> people can
> >>>>>>>>>>>>>>>>>>>>>>>>>> specify the store names if they want, otherwise
> we
> >>>> just
> >>>>>>>>>>>>>>>>>>>>>>>>>> generate
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> them.
> >>>>>>>>>>>>>>>>>>>>> On Mon, 23 Jan 2017 at 05:30 Matthias J. Sax
> >>>>>>>>>>>>>>>>>>>>>>>>>> <matth...@confluent.io
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>>>>>>>>>>> thanks for the KIP Eno! Here are my 2 cents:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> 1) I like Guozhang's proposal about removing
> >>> store
> >>>>>>>>>>>>>>>>>>>>>>>>>>> name from
> >>>>>>>>>>>>>>>>>>>>>>>>>>> all
> >>>>>>>>>>>>>>>>>>>>>>>>>>> KTable
> >>>>>>>>>>>>>>>>>>>>>>>>>>> methods and generate internal names (however, I
> >>>> would
> >>>>>>>>>>>>>>>>>>>>>>>>>>> do this
> >>>>>>>>>>>>>>>>>>>>>>>>>>> as
> >>>>>>>>>>>>>>>>>>>>>>>>>>> overloads). Furthermore, I would not force
> users
> >>>> to call
> >>>>>>>>>>>>>>>>>>>>>>>>>>> .materialize()
> >>>>>>>>>>>>>>>>>>>>>>>>>>> if they want to query a store, but add one more
> >>>> method
> >>>>>>>>>>>>>>>>>>>>>>>>>>> .stateStoreName()
> >>>>>>>>>>>>>>>>>>>>>>>>>>> that returns the store name if the KTable is
> >>>>>>>>>>>>>>>>>>>>>>>>>>> materialized.
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Thus,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> also
> >>>>>>>>>>>>>>>>>>>>> .materialize() must not necessarily have a parameter
> >>>> storeName
> >>>>>>>>>>>>>>>>>>>>>>>>>>> (ie,
> >>>>>>>>>>>>>>>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>>>>>>>>>>> should have some overloads here).
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> I would also not allow to provide a null store
> >>>> name (to
> >>>>>>>>>>>>>>>>>>>>>>>>>>> indicate no
> >>>>>>>>>>>>>>>>>>>>>>>>>>> materialization if not necessary) but throw an
> >>>>>>>>>>>>>>>>>>>>>>>>>>> exception.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> This yields some simplification (see below).
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> 2) I also like Guozhang's proposal about
> >>>>>>>>>>>>>>>>>>>>>>>>>>> KStream#toTable()
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> 3)
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> 3. What will happen when you call materialize
> on
> >>>>>>>>>>>>>>>>>>>>>>>>>>> KTable
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> that is
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> already
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> materialized? Will it create another
> StateStore
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> (providing
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> name
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>>>>>>>>>> different), throw an Exception?
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Currently an exception is thrown, but see
> below.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> If we follow approach (1) from Guozhang, there
> >>> is
> >>>> no
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> need to
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> worry
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> about
> >>>>>>>>>>>>>>>>>>>>>>>>>>> a second materialization and also no exception
> >>>> must be
> >>>>>>>>>>>>>>>>>>>>>>>>>>> thrown. A
> >>>>>>>>>>>>>>>>>>>>>>>>>>> call to
> >>>>>>>>>>>>>>>>>>>>>>>>>>> .materialize() basically sets a "materialized
> >>>> flag" (ie,
> >>>>>>>>>>>>>>>>>>>>>>>>>>> idempotent
> >>>>>>>>>>>>>>>>>>>>>>>>>>> operation) and sets a new name.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> 4)
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Rename toStream() to toKStream() for
> consistency.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Not sure whether that is really required. We
> >>> also
> >>>> use
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> `KStreamBuilder#stream()` and
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> `KStreamBuilder#table()`, for
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> example,
> >>>>>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>>>>>> don't care about the "K" prefix.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Eno's reply:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> I think changing it to `toKStream` would make
> it
> >>>>>>>>>>>>>>>>>>>>>>>>>>> absolutely
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> clear
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> what
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> we are converting it to.
> >>>>>>>>>>>>>>>>>>>>>>>>>>> I'd say we should probably change the
> >>>> KStreamBuilder
> >>>>>>>>>>>>>>>>>>>>>>>>>>> methods
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> (but
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>>>>>>>>>>> this KIP).
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> I would keep #toStream(). (see below)
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> 5) We should not remove any methods but only
> >>>>>>>>>>>>>>>>>>>>>>>>>>> deprecate them.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> A general note:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> I do not understand your comments "Rejected
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Alternatives". You
> >>>>>>>>>>>>>>>>>>>>>>>>>>> say
> >>>>>>>>>>>>>>>>>>>>>>>>>>> "Have
> >>>>>>>>>>>>>>>>>>>>>>>>>>> the KTable be the materialized view" was
> >>> rejected.
> >>>>>>>>>>>>>>>>>>>>>>>>>>> But your
> >>>>>>>>>>>>>>>>>>>>>>>>>>> KIP
> >>>>>>>>>>>>>>>>>>>>>>>>>>> actually
> >>>>>>>>>>>>>>>>>>>>>>>>>>> does exactly this -- the changelog abstraction
> of
> >>>>>>>>>>>>>>>>>>>>>>>>>>> KTable is
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> secondary
> >>>>>>>>>>>>>>>>>>>>> after those changes and the "view" abstraction is
> what
> >>> a
> >>>>>>>>>>>>>>>>>>>>>>>>>>> KTable is.
> >>>>>>>>>>>>>>>>>>>>>>>>>>> And
> >>>>>>>>>>>>>>>>>>>>>>>>>>> just to be clear, I like this a lot:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> - it aligns with the name KTable
> >>>>>>>>>>>>>>>>>>>>>>>>>>> - it aligns with stream-table-duality
> >>>>>>>>>>>>>>>>>>>>>>>>>>> - it aligns with IQ
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> I would say that a KTable is a "view
> abstraction"
> >>>> (as
> >>>>>>>>>>>>>>>>>>>>>>>>>>> materialization is
> >>>>>>>>>>>>>>>>>>>>>>>>>>> optional).
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> On 1/22/17 5:05 PM, Guozhang Wang wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP Eno, I have a few meta
> >>> comments
> >>>>>>>>>>>>>>>>>>>>>>>>>>> and a few
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> detailed
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> comments:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. I like the materialize() function in
> general,
> >>>> but
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I would
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> like
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>> see
> >>>>>>>>>>>>>>>>>>>>>>>>>> how other KTable functions should be updated
> >>>>>>>>>>>>>>>>>>>>>>>>>> accordingly. For
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> example,
> >>>>>>>>>>>>>>>>
