Hi Matthias, The reason for overloading the methods with Materialized parameter is regarding the semantic of this class. The Materialized class allow to name a queryable store. if a name is set then it will be used both to name the state-store and the changelog-topic. If no name is given, then the provided Named will be used. This allow to name the operation without having a queriable store.
So if my analysis is correct, we will end up with : Generated | Named | Joined / Grouped | Materialized ------------------------------------------------------------------------------------------------- Node | X | X | X | ------------------------------------------------------------------------------------------------- Repartition Topic | X | | X | ------------------------------------------------------------------------------------------------- Queryable Store | | | | X ------------------------------------------------------------------------------------------------- State store | X | X | X | X ------------------------------------------------------------------------------------------------- Changelog Topic | X | X | X | X ------------------------------------------------------------------------------------------------- Le dim. 13 janv. 2019 à 03:23, Matthias J. Sax <matth...@confluent.io> a écrit : > Just catching up on this KIP again. > > One nit. The KIP says: > > > In addition, the generated names have a few disadvantages to guarantee > topology compatibilities. In fact, adding a new operator, using a > third-library doing some optimization to remove some operators or upgrading > to a new KafkaStreams version with internal API changes may changed suffix > indexing for a large amount of the processor names. This will in turn > change the internal state store names, as well as internal topic names as > well. > > > > This is not true any longer (I guess it was true, when the KIP was > initially proposed), because all stores/internal-topics can be named > since 2.1 release. I would suggest to remove the paragraph. > > Overall, I like the Named/NamedOperation design. > > What is unclear to me thought is, why we need new overloads for methods > that accept `Materialized`. To be more precise, I think it make sense to > add an overload that only takes `Named`, but not one that takes both > `Named` and `Materialized`. For example: > > KGroupedStream#count() // exists > KGroupedStream#count(Materialized) // exits > KGroupedStream#count(Named) // added (makes sense to me) > KGroupedStream#count(Named, Materialized) // added -- why? > > I would prefer to use `Materialized` to name the processor for this > case, too. Can you elaborate on the motivation? > > > -Matthias > > On 1/11/19 3:39 PM, Florian Hussonnois wrote: > > Hi Guozhang, > > > > I have updated the PR as well as the KIP. I should add more unit tests to > > covers all new methods. > > > > However, I still have one test in failure. The reason is that using > > Joined.name() in both potential repartition topic and processor nodes > leads > > to topology-incompatible. > > How should we deal with that ? > > > > Thanks, > > > > Le jeu. 10 janv. 2019 à 01:21, Guozhang Wang <wangg...@gmail.com> a > écrit : > > > >> Hello Florian, > >> > >> Just checking if have read about my previous email and if you feel happy > >> about it. We have the 2.2 KIP freeze deadline at 24th this month, while > the > >> PR itself is getting quite close. So it'll be great if we can get the > >> agreement on it and get it into 2.2.0 release. > >> > >> > >> Guozhang > >> > >> > >> On Mon, Dec 17, 2018 at 2:39 PM Guozhang Wang <wangg...@gmail.com> > wrote: > >> > >>> Hi Florian / John, > >>> > >>> Just wanted to throw a couple minor thoughts on the current proposal: > >>> > >>> 1) Regarding the interface / function name, I'd propose we call the > >>> interface `NamedOperation` which would be implemented by Produced / > >>> Consumed / Printed / Joined / Grouped / Suppressed (note I > intentionally > >>> exclude Materialized here since its semantics is quite), and have the > >>> default class that implements `NamedOperation` as `Named`, which would > be > >>> used in our adding overload functions. The main reason is to have > >>> consistency in naming. > >>> > >>> 2) As a minor tweak, I think it's better to use Joined.name() in both > its > >>> possibly generate repartition topic, as well as the map processor used > for > >>> group-by (currently this name is only used for the repartition topic). > >>> > >>> > >>> Florian: if you think this proposal makes sense, please feel free to go > >>> ahead and update the PR; after we made a first pass on it and feels > >>> confident about it, we can go ahead with the VOTING process. About the > >>> implementation of 2) above, this may be out of your implementation > scope, > >>> so feel free to leave it out side your PR while Bill who originally > worked > >>> on the Grouped KIP can make a follow-up PR for it. > >>> > >>> Guozhang > >>> > >>> On Fri, Dec 14, 2018 at 9:43 PM Guozhang Wang <wangg...@gmail.com> > wrote: > >>> > >>>> Hello Florian, > >>>> > >>>> Really appreciate you for your patience. > >>>> > >>>> I know that we've discussed about the approach to adding overloaded > >>>> functions and rejected it early on. But looking deeper into the > current PR > >>>> I realized that this approach has a danger of great API confusions to > users > >>>> (I tried to explain my thoughts in the PR, but it was not very clear) > --- > >>>> the basic idea is that, today we already have a few existing control > >>>> classes including Grouped, Joined, Suppressed that allow users to > specify > >>>> serdes etc, while also a "name" which can then be used to define the > >>>> processor name / internal topic names in the topology (the static > function > >>>> names are not consistent, which I think we should fix as well). And > Named > >>>> interface, by extending the lambda function interfaces like > ValueJoiner / > >>>> Predicate etc opens the door for another way to specify the names > again. > >>>> > >>>> So in order to achieve consistency, we are left with generally two > >>>> options: > >>>> > >>>> 1) only allow users to specify names via the lambda interfaces that > >>>> extends Named interface. This means we'd better remove the naming > mechanism > >>>> from the existing control objects to keep consistency. > >>>> > >>>> 2) only allow users to specify names via control classes, and we > >>>> introduce a new class (Named) for those which do not have one yet --- > this > >>>> leads to the overloaded functions. > >>>> > >>>> I did a quick count on the num.of overloaded functions, and summing > from > >>>> KTable (8) / KStream (15) / KGroupedStream (6) / KGroupedTable (6) / > >>>> TimeWindowedKStream (6) / SessionWindowedKStream (6) we got about 47 > >>>> overloaded functions (our guess was pretty close!) -- note this is > based on > >>>> John's proposal that we can let existing Grouped / Joined to extend > Named > >>>> and hence we only need overloaded functions with a default > NamedOperation > >>>> for those operators that do not have a control classes already. > >>>> > >>>> Thinking about this approach I feel it is not too bad compared with > >>>> either 1) above, which would require us to deprecate lot of public > >>>> functions around name(), or having a mixed mechanism for naming, which > >>>> could lead to very confusing behavior to users. Additionally, for most > >>>> users who would only want to specify the names for those stateful > >>>> operations which have internal topics / state stores and hence are > more > >>>> keen to upgrade compatibility, those added overloads would be > not-often > >>>> used functions for them anyways. And by letting existing control > classes to > >>>> extend Named, we can have a unified method name for static > constructor as > >>>> well. > >>>> > >>>> > >>>> > >>>> Guozhang > >>>> > >>>> > >>>> On Fri, Dec 14, 2018 at 10:24 AM John Roesler <j...@confluent.io> > wrote: > >>>> > >>>>> Hi Florian, > >>>>> > >>>>> Sorry about the run-around of rejecting the original proposal, > >>>>> only to return to it later on. Hopefully, it's more encouraging > >>>>> than frustrating that we're coming around to your initial way of > >>>>> thinking. > >>>>> > >>>>> Thanks! > >>>>> -John > >>>>> > >>>>> On Thu, Dec 13, 2018 at 4:28 PM Florian Hussonnois < > >>>>> fhussonn...@gmail.com> > >>>>> wrote: > >>>>> > >>>>>> Hi all, > >>>>>> > >>>>>> Thanks again. I agree with your propositions. > >>>>>> Also IMHO, overloading all methods (filter, map) to accept a new > >>>>> control > >>>>>> object seems to provide a more natural development experience for > >>>>> users. > >>>>>> > >>>>>> Actually, this was the first proposition for this KIP, but we have > >>>>> rejected > >>>>>> it because this solution led to adding a lot of new methods. > >>>>>> As you mentioned it, the API has evolve since the creation of this > >>>>> KIP - > >>>>>> some existing control objects already allow to customize internal > >>>>> names. We > >>>>>> should so keep on that strategy. > >>>>>> > >>>>>> If everyone is OK with that, I will update the KIP and the PR > >>>>> accordingly; > >>>>>> > >>>>>> Thanks. > >>>>>> > >>>>>> Le jeu. 13 déc. 2018 à 18:08, John Roesler <j...@confluent.io> a > >>>>> écrit : > >>>>>> > >>>>>>> Hi again, all, > >>>>>>> > >>>>>>> Matthias, I agree with you. > >>>>>>> > >>>>>>> Florian, thanks for your response. > >>>>>>> > >>>>>>> I think your proposal is the best way to address the ask for hiding > >>>>> the > >>>>>>> name() getter. But I'd like to question that ask and instead > >>>>> propose that > >>>>>>> we just make the name() getter part of the public API. > >>>>>>> > >>>>>>> The desire to "hide" the getters causes a lot of complexity in our > >>>>> code > >>>>>>> base, and it will become completely impractical with the mixin > >>>>> strategy > >>>>>> of > >>>>>>> Named. > >>>>>>> > >>>>>>> If we were to switch strategies back to mixing Named in to the > >>>>> control > >>>>>>> objects rather than the functions, then the path forward becomes > >>>>> quite > >>>>>>> clear. > >>>>>>> > >>>>>>> On the other hand, it seems harmless for anyone who wants to be > >>>>> able to > >>>>>>> query the name from a control object after setting it, so my vote > >>>>> would > >>>>>> be > >>>>>>> simply to keep the Named interface as: > >>>>>>> > >>>>>>> public interface Named<T extends Named<T>> { > >>>>>>> String name(); > >>>>>>> T withName(String name); > >>>>>>> } > >>>>>>> > >>>>>>> Under this proposal, we only mix Named in to the control objects, > >>>>> which > >>>>>>> means we have no need of default implementations anymore (because > >>>>> we can > >>>>>>> update all the control objects concurrently with adding this > >>>>> interface to > >>>>>>> them). > >>>>>>> > >>>>>>> This does hinge on switching over to a control-object-only > strategy, > >>>>>> which > >>>>>>> introduces the need to add about 50 new control object classes, > >>>>> which > >>>>>> would > >>>>>>> only serve to implement Named. As a middle ground, maybe we could > >>>>> just > >>>>>> add > >>>>>>> one generic control object class, like: > >>>>>>> > >>>>>>> public class NamedOperation implements Named<NamedOperation> { > >>>>>>> private final String name; > >>>>>>> private NamedOperation(final String name) { this.name = name; } > >>>>>>> public static NamedOperation name(final String name) { > >>>>>>> return new NamedOperation(name); > >>>>>>> } > >>>>>>> public String name() { return name; } > >>>>>>> public NamedOperation withName(final String name) { > >>>>>>> return new NamedOperation(name); > >>>>>>> } > >>>>>>> } > >>>>>>> > >>>>>>> And then, we'd add overloads for all the methods that don't have > >>>>> control > >>>>>>> objects already (for example, filter() ): > >>>>>>> > >>>>>>> // existing > >>>>>>> KStream<K, V> filter(Predicate<? super K, ? super V> predicate); > >>>>>>> > >>>>>>> // new > >>>>>>> KStream<K, V> filter(Predicate<? super K, ? super V> predicate, > >>>>>>> NamedOperation named); > >>>>>>> > >>>>>>> Additionally, in regard to Matthias's point about existing control > >>>>>> objects > >>>>>>> with naming semantics, they would extend Named (but not > >>>>> NamedOperation) > >>>>>> for > >>>>>>> uniformity. > >>>>>>> > >>>>>>> You provided a good approach to hide the getter with your > >>>>> SettableName > >>>>>>> class; I think what you proposed is the only way we could hide the > >>>>> name. > >>>>>>> In the end, though, it's a lot of complexity added (control object > >>>>> class > >>>>>>> hierarchy, inheritance, mutable state, internal casting) for > >>>>> something of > >>>>>>> dubious value: to be able to hide the name from someone *after they > >>>>>>> themselves have set it*. > >>>>>>> > >>>>>>> Although it'll be a pain, perhaps Matthias's suggestion to > >>>>> enumerate all > >>>>>>> the API methods is the best way to be sure we all agree on what's > >>>>> going > >>>>>> to > >>>>>>> happen. > >>>>>>> > >>>>>>> Thanks again for wrangling with this issue, > >>>>>>> -John > >>>>>>> > >>>>>>> On Thu, Dec 13, 2018 at 9:03 AM Matthias J. Sax < > >>>>> matth...@confluent.io> > >>>>>>> wrote: > >>>>>>> > >>>>>>>> Just catching up on this discussion. > >>>>>>>> > >>>>>>>> My overall personal take is, that I am not a big fan of the > >>>>> interface > >>>>>>>> `Named` that is used as a factory. I would rather prefer to add a > >>>>>>>> control object parameter to all methods that don't have one yet. > >>>>> This > >>>>>>>> KIP was started a while ago, and we added new naming capabilities > >>>>> in > >>>>>> the > >>>>>>>> meantime. Guozhang's example in the PR comment about naming in > >>>>>>>> stream-stream join shows, that we might end up in a confusion > >>>>> situation > >>>>>>>> for users if we use `Named`. Also, in 2.1, user can already name > >>>>> as > >>>>>>>> repartition-/changelog-topics and stores. Thus, KIP-307 boils > >>>>> down to > >>>>>>>> provide non-functional naming? > >>>>>>>> > >>>>>>>> Hence, for all methods that allow to specify names already, I > >>>>> don't see > >>>>>>>> any reason to change them, but use the existing API to also name > >>>>> the > >>>>>>>> processor(s) instead of allowing uses to specify a new name. > >>>>>>>> > >>>>>>>> About the inconsistency in method naming. I agree, that `as` is > >>>>> very > >>>>>>>> generic and maybe not the best choice. > >>>>>>>> > >>>>>>>> I think it might be helpful, to have a table overview in the KIP, > >>>>> that > >>>>>>>> list all existing static/non-static methods that allow to specify > >>>>> a > >>>>>>>> name, plus a columns with the new suggested naming for those > >>>>> methods? > >>>>>>>> > >>>>>>>> Thoughts? > >>>>>>>> > >>>>>>>> > >>>>>>>> -Matthias > >>>>>>>> > >>>>>>>> > >>>>>>>> On 12/12/18 12:45 AM, Florian Hussonnois wrote: > >>>>>>>>> Thank you very much for your feedbacks. > >>>>>>>>> > >>>>>>>>> Currently, there is still lot of discussions regarding the Named > >>>>>>>> interface. > >>>>>>>>> On the one hand we should provided consistency over the stream > >>>>> API > >>>>>> and > >>>>>>> on > >>>>>>>>> the other hand we should not break the semantic as John point > >>>>> it up. > >>>>>>>>> > >>>>>>>>> Guozhang, I'm sorry, but I'm little bit confused, maybe I missed > >>>>>>>> something. > >>>>>>>>> In your comment you have suggested that : > >>>>>>>>> * Produced/Consumed/Suppressed should extends Named > >>>>>>>>> * Named should have a private-package method to get the > >>>>> specified > >>>>>>>> processor > >>>>>>>>> name internally (processorName()) > >>>>>>>>> * Finally we should end up with something like : Named -> XXX > >>>>> -> > >>>>>>>>> XXXInternal or Named -> Produced -> ProducedInternal > >>>>>>>>> > >>>>>>>>> The objective behind that is to : > >>>>>>>>> * consolidate the internal method processorName() > >>>>>>>>> * consolidate the method withName that exists now existing into > >>>>>>> Produced, > >>>>>>>>> Consumed and Suppressed. > >>>>>>>>> > >>>>>>>>> But, Named is an interface so we can't define a private-package > >>>>>> method > >>>>>>> on > >>>>>>>>> it. Also, for example Produced and ProducedInternal are not in > >>>>> the > >>>>>> same > >>>>>>>>> package so having a private-package method doesn't really help. > >>>>>>>>> In addition, if we add the withName method into Named interface > >>>>> this > >>>>>>> can > >>>>>>>>> become confusing for developers because action interfaces > >>>>>> (ValueMapper, > >>>>>>>>> Reducer, etc) extend it. > >>>>>>>>> The interface would look like : > >>>>>>>>> > >>>>>>>>> public interface Named<T extends Named<T>> { > >>>>>>>>> default String name() { > >>>>>>>>> return null; > >>>>>>>>> } > >>>>>>>>> default Named<T> withName(final String name) { > >>>>>>>>> return null; > >>>>>>>>> } > >>>>>>>>> ... > >>>>>>>>> } > >>>>>>>>> > >>>>>>>>> So maybe instead of adding another method to Named we could > >>>>> create a > >>>>>>> new > >>>>>>>>> package-private class that could be extended by > >>>>>>>>> Produced/Consumed/Joined/Suppressed. For exemple, > >>>>>>>>> class SettableName<T extends SettableName<T>> implements Named { > >>>>>>>>> > >>>>>>>>> protected String processorName; > >>>>>>>>> > >>>>>>>>> SettableName(final SettableName settable) { > >>>>>>>>> this(Objects.requireNonNull(settable, "settable can't be > >>>>>>>>> null").name()); > >>>>>>>>> } > >>>>>>>>> > >>>>>>>>> SettableName(final String processorName) { > >>>>>>>>> this.processorName = processorName; > >>>>>>>>> } > >>>>>>>>> > >>>>>>>>> @Override > >>>>>>>>> public String name() { > >>>>>>>>> return processorName; > >>>>>>>>> } > >>>>>>>>> public T withName(final String processorName) { > >>>>>>>>> this.processorName = processorName; > >>>>>>>>> return (T)this; > >>>>>>>>> } > >>>>>>>>> } > >>>>>>>>> > >>>>>>>>> In that way, we will get : public class Produced implements > >>>>>>>>> SettableName<Produced> { ... > >>>>>>>>> > >>>>>>>>> WDYT? > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Le mar. 11 déc. 2018 à 02:46, Guozhang Wang <wangg...@gmail.com> > >>>>> a > >>>>>>>> écrit : > >>>>>>>>> > >>>>>>>>>> I had one meta comment on the PR: > >>>>>>>>>> > >>>>> https://github.com/apache/kafka/pull/5909#discussion_r240447153 > >>>>>>>>>> > >>>>>>>>>> On Mon, Dec 10, 2018 at 5:22 PM John Roesler < > >>>>> j...@confluent.io> > >>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> Hi Florian, > >>>>>>>>>>> > >>>>>>>>>>> I hope it's ok if I ask a few questions at this late stage... > >>>>>>>>>>> > >>>>>>>>>>> Comment 1 ====== > >>>>>>>>>>> > >>>>>>>>>>> It seems like the proposal is to add a new "Named" interface > >>>>> that > >>>>>> is > >>>>>>>>>>> intended to be mixed in with the existing API objects at > >>>>> various > >>>>>>>> points. > >>>>>>>>>>> > >>>>>>>>>>> Just to preface some of my comments, it looks like your KIP > >>>>> was > >>>>>>> created > >>>>>>>>>>> quite a while ago, so the API may have changed somewhat since > >>>>> you > >>>>>>>>>> started. > >>>>>>>>>>> > >>>>>>>>>>> As I see the API, there are a few different kinds of DSL > >>>>> method > >>>>>>>>>> arguments: > >>>>>>>>>>> * functions: things like Initializer, Aggregator, ValueJoiner, > >>>>>>>>>>> ForEachAction... All of these are essentially Streams-flavored > >>>>>>> Function > >>>>>>>>>>> interfaces with different arities, type bounds, and semantics. > >>>>>>>>>>> * config objects: things like Produced, Consumed, Joined, > >>>>>> Grouped... > >>>>>>>>>> These > >>>>>>>>>>> are containers for configurations, where the target of the > >>>>>>>> configuration > >>>>>>>>>> is > >>>>>>>>>>> the operation itself > >>>>>>>>>>> * raw configurations: things like a raw topic-name string and > >>>>>>>>>> Materialized: > >>>>>>>>>>> These are configurations for operations that have no config > >>>>> object, > >>>>>>> and > >>>>>>>>>> for > >>>>>>>>>>> various reasons, we didn't make one. The distinguishing > >>>>> feature is > >>>>>>> that > >>>>>>>>>> the > >>>>>>>>>>> target of the configuration is not the operation itself, but > >>>>> some > >>>>>>>> aspect > >>>>>>>>>> of > >>>>>>>>>>> it. For example, in Materialized, we are not setting the > >>>>> caching > >>>>>>>> behavior > >>>>>>>>>>> of, for example, an aggregation; we're setting the caching > >>>>> behavior > >>>>>>> of > >>>>>>>> a > >>>>>>>>>>> materialized state store attached to the aggregation. > >>>>>>>>>>> > >>>>>>>>>>> It seems like choosing to mix the Named interface in with the > >>>>>>> functions > >>>>>>>>>> has > >>>>>>>>>>> a couple of unfortunate side-effects: > >>>>>>>>>>> * Aggregator is not the only function passed to any of the > >>>>> relevant > >>>>>>>>>>> aggregate methods, so it seems a little arbitrary to pick that > >>>>>>> function > >>>>>>>>>>> over Initializer or Merger. > >>>>>>>>>>> * As you noted, branch() takes an array of Predicate, so we > >>>>> just > >>>>>>> ignore > >>>>>>>>>> the > >>>>>>>>>>> provided name(s), even though Predicate names are used > >>>>> elsewhere. > >>>>>>>>>>> * Not all things that we want to name have function arguments, > >>>>>>> notably > >>>>>>>>>>> source and sink, so we'd switch paradigms and use the config > >>>>> object > >>>>>>>>>>> instead. > >>>>>>>>>>> * Adding an extra method to the function interfaces means that > >>>>>> those > >>>>>>>> are > >>>>>>>>>> no > >>>>>>>>>>> longer SAM interfaces. You proposed to add a default > >>>>>> implementation, > >>>>>>> so > >>>>>>>>>> we > >>>>>>>>>>> could still pass a lambda if we don't want to set the name, > >>>>> but if > >>>>>> we > >>>>>>>>>> *do* > >>>>>>>>>>> want to set the name, we can no longer use lambdas. > >>>>>>>>>>> > >>>>>>>>>>> I think the obvious other choice would be to mix Named in > >>>>> with the > >>>>>>>> config > >>>>>>>>>>> objects instead, but this has one main downside of its own... > >>>>>>>>>>> * not every operator we wish to name has a config object. I > >>>>> don't > >>>>>>> know > >>>>>>>> if > >>>>>>>>>>> everyone involved is comfortable with adding a config object > >>>>> to > >>>>>> every > >>>>>>>>>>> operator that's missing one. > >>>>>>>>>>> > >>>>>>>>>>> Personally, I favor moving toward a more consistent state > >>>>> that's > >>>>>>>> forward > >>>>>>>>>>> compatible with any further changes we wish to make. I > >>>>> *think* that > >>>>>>>>>> giving > >>>>>>>>>>> every operator two forms (one with no config and one with a > >>>>> config > >>>>>>>>>> object) > >>>>>>>>>>> would be such an API. > >>>>>>>>>>> > >>>>>>>>>>> Comment 2 ========= > >>>>>>>>>>> > >>>>>>>>>>> Finally, just a minor comment: the static method in Named > >>>>> wouldn't > >>>>>>> work > >>>>>>>>>>> properly as defined. Assuming that we mix Named in with > >>>>> Produced, > >>>>>> for > >>>>>>>>>>> example, we'd need to be able to use it like: > >>>>>>>>>>>> kStream.to("out", Produced.with("myOut")) > >>>>>>>>>>> This doesn't work because with() returns a Named, but we need > >>>>> a > >>>>>>>> Produced. > >>>>>>>>>>> > >>>>>>>>>>> We can pull off a builder method in the interface, but not a > >>>>> static > >>>>>>>>>> method. > >>>>>>>>>>> To define a builder method in the interface that returns an > >>>>>> instance > >>>>>>> of > >>>>>>>>>> the > >>>>>>>>>>> concrete subtype, you have to use the "curiously recurring > >>>>> generic" > >>>>>>>>>>> pattern. > >>>>>>>>>>> > >>>>>>>>>>> It would look like: > >>>>>>>>>>> > >>>>>>>>>>> public interface Named<N extends Named<N>> { > >>>>>>>>>>> String name(); > >>>>>>>>>>> N withName(String name); > >>>>>>>>>>> } > >>>>>>>>>>> > >>>>>>>>>>> You can see where the name of the pattern comes from ;) > >>>>>>>>>>> An implementation would then look like: > >>>>>>>>>>> > >>>>>>>>>>> public class Produced implements Named<Produced> { > >>>>>>>>>>> String name() { return name; } > >>>>>>>>>>> Produced withName(final String name) { this.name = name; > >>>>> return > >>>>>>>> this; > >>>>>>>>>> } > >>>>>>>>>>> } > >>>>>>>>>>> > >>>>>>>>>>> Note that the generic parameter gets filled in properly in the > >>>>>>>>>> implementing > >>>>>>>>>>> class, so that you get the right return type out. > >>>>>>>>>>> > >>>>>>>>>>> It doesn't work at all with a static factory method at the > >>>>>> interface > >>>>>>>>>> level, > >>>>>>>>>>> so it would be up to Produced to define a static factory if it > >>>>>> wants > >>>>>>> to > >>>>>>>>>>> present one. > >>>>>>>>>>> > >>>>>>>>>>> ====== > >>>>>>>>>>> > >>>>>>>>>>> Those are my two feedbacks! > >>>>>>>>>>> > >>>>>>>>>>> I hope you find this helpful, rather than frustrating. I'm > >>>>> sorry I > >>>>>>>> didn't > >>>>>>>>>>> get a chance to comment sooner. > >>>>>>>>>>> > >>>>>>>>>>> Thanks for the KIP, I think it will be much nicer to be able > >>>>> to > >>>>>> name > >>>>>>>> the > >>>>>>>>>>> processor nodes. > >>>>>>>>>>> > >>>>>>>>>>> -John > >>>>>>>>>>> > >>>>>>>>>>> On Tue, Nov 27, 2018 at 6:34 PM Guozhang Wang < > >>>>> wangg...@gmail.com> > >>>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> Hi Florian, > >>>>>>>>>>>> > >>>>>>>>>>>> I've made a pass over the PR. There are some comments that > >>>>> are > >>>>>>> related > >>>>>>>>>> to > >>>>>>>>>>>> the function names which may be affecting the KIP wiki page, > >>>>> but > >>>>>>>>>> overall > >>>>>>>>>>> I > >>>>>>>>>>>> think it looks good already. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Guozhang > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On Fri, Nov 16, 2018 at 4:21 PM Guozhang Wang < > >>>>> wangg...@gmail.com > >>>>>>> > >>>>>>>>>>> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>>> Thanks Florian! I will take a look at the PR. > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Mon, Nov 12, 2018 at 2:44 PM Florian Hussonnois < > >>>>>>>>>>>> fhussonn...@gmail.com> > >>>>>>>>>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> Hi Matthias, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Sorry I was absent for a while. I have started a new PR > >>>>> for this > >>>>>>>>>> KIP. > >>>>>>>>>>> It > >>>>>>>>>>>>>> is > >>>>>>>>>>>>>> still in progress for now. I'm working on it. > >>>>>>>>>>>>>> https://github.com/apache/kafka/pull/5909 > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Le ven. 19 oct. 2018 à 20:13, Matthias J. Sax < > >>>>>>>>>> matth...@confluent.io> > >>>>>>>>>>> a > >>>>>>>>>>>>>> écrit : > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> What is the status of this KIP? > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On 7/19/18 5:17 PM, Guozhang Wang wrote: > >>>>>>>>>>>>>>>> Hello Florian, > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Sorry for being late... Found myself keep apologizing > >>>>> for late > >>>>>>>>>>>> replies > >>>>>>>>>>>>>>>> these days. But I do want to push this KIP's progress > >>>>> forward > >>>>>>>>>> as I > >>>>>>>>>>>>>> see it > >>>>>>>>>>>>>>>> very important and helpful feature for extensibility. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> About the exceptions, I've gone through them and > >>>>> hopefully it > >>>>>> is > >>>>>>>>>>> an > >>>>>>>>>>>>>>>> exhaustive list: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> 1. KTable#toStream() > >>>>>>>>>>>>>>>> 2. KStream#merge(KStream) > >>>>>>>>>>>>>>>> 3. KStream#process() / transform() / transformValues() > >>>>>>>>>>>>>>>> 4. KGroupedTable / KGroupedStream#count() > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Here's my reasoning: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> * It is okay not letting users to override the name for > >>>>> 1/2, > >>>>>>>>>> since > >>>>>>>>>>>>>> they > >>>>>>>>>>>>>>> are > >>>>>>>>>>>>>>>> too trivial to be useful for debugging, plus their > >>>>> processor > >>>>>>>>>> names > >>>>>>>>>>>>>> would > >>>>>>>>>>>>>>>> not determine any related topic / store names. > >>>>>>>>>>>>>>>> * For 3, I'd vote for adding overloaded functions with > >>>>> Named. > >>>>>>>>>>>>>>>> * For 4, if users really want to name the processor she > >>>>> can > >>>>>> call > >>>>>>>>>>>>>>>> aggregate() instead, so I think it is okay to skip this > >>>>> case. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Guozhang > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> On Fri, Jul 6, 2018 at 3:06 PM, Florian Hussonnois < > >>>>>>>>>>>>>>> fhussonn...@gmail.com> > >>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> The option #3 seems to be a good alternative and I find > >>>>> the > >>>>>> API > >>>>>>>>>>>> more > >>>>>>>>>>>>>>>>> elegant (thanks John). > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> But, we still have the need to overload some methods > >>>>> either > >>>>>>>>>>> because > >>>>>>>>>>>>>>> they do > >>>>>>>>>>>>>>>>> not accept an action instance or because they are > >>>>> translated > >>>>>> to > >>>>>>>>>>>>>> multiple > >>>>>>>>>>>>>>>>> processors. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> For example, this is the case for methods branch() and > >>>>>> merge(). > >>>>>>>>>>> We > >>>>>>>>>>>>>> could > >>>>>>>>>>>>>>>>> introduce a new interface Named (or maybe a different > >>>>> name ?) > >>>>>>>>>>> with > >>>>>>>>>>>> a > >>>>>>>>>>>>>>> method > >>>>>>>>>>>>>>>>> name(). All action interfaces could extend this one to > >>>>>>>>>> implement > >>>>>>>>>>>> the > >>>>>>>>>>>>>>> option > >>>>>>>>>>>>>>>>> 3). > >>>>>>>>>>>>>>>>> This would result by having the following overloads : > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Stream<K, V> merge(final Named name, final KStream<K, V> > >>>>>>>>>> stream); > >>>>>>>>>>>>>>>>> KStream<K, V>[] branch(final Named name, final > >>>>> Predicate<? > >>>>>>>>>> super > >>>>>>>>>>>> K, ? > >>>>>>>>>>>>>>> super > >>>>>>>>>>>>>>>>> V>... predicates) > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> N.B : The list above is not exhaustive > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> --------- > >>>>>>>>>>>>>>>>> user's code will become : > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> KStream<String, Integer> stream = > >>>>>>>>>> builder.stream("test"); > >>>>>>>>>>>>>>>>> KStream<String, Integer>[] branches = > >>>>>>>>>>>>>>>>> stream.branch(Named.with("BRANCH-STREAM-ON-VALUE"), > >>>>>>>>>>>>>>>>> Predicate.named("STREAM-PAIR-VALUE", > >>>>> (k, v) > >>>>>> -> > >>>>>>>>>> v > >>>>>>>>>>> % > >>>>>>>>>>>> 2 > >>>>>>>>>>>>>> == > >>>>>>>>>>>>>>>>> 0), > >>>>>>>>>>>>>>>>> Predicate.named("STREAM-IMPAIR-VALUE", > >>>>> (k, v) > >>>>>>>>>> -> > >>>>>>>>>>> v > >>>>>>>>>>>> % > >>>>>>>>>>>>>> 2 > >>>>>>>>>>>>>>> != > >>>>>>>>>>>>>>>>> 0)); > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> branches[0].to("pair"); > >>>>>>>>>>>>>>>>> branches[1].to("impair"); > >>>>>>>>>>>>>>>>> --------- > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> This is a mix of the options 3) and 1) > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Le ven. 6 juil. 2018 à 22:58, Guozhang Wang < > >>>>>>>>>> wangg...@gmail.com> > >>>>>>>>>>> a > >>>>>>>>>>>>>>> écrit : > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Hi folks, just to summarize the options we have so far: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> 1) Add a new "as" for KTable / KStream, plus adding new > >>>>>> fields > >>>>>>>>>>> for > >>>>>>>>>>>>>>>>>> operators-returns-void control objects (the current > >>>>> wiki's > >>>>>>>>>>>>>> proposal). > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Pros: no more overloads. > >>>>>>>>>>>>>>>>>> Cons: a bit departing with the current high-level API > >>>>> design > >>>>>>>>>> of > >>>>>>>>>>>> the > >>>>>>>>>>>>>>> DSL, > >>>>>>>>>>>>>>>>>> plus, the inconsistency between operators-returns-void > >>>>> and > >>>>>>>>>>>>>>>>>> operators-not-return-voids. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> 2) Add overloaded functions for all operators, that > >>>>> accepts > >>>>>> a > >>>>>>>>>>> new > >>>>>>>>>>>>>>> control > >>>>>>>>>>>>>>>>>> object "Described". > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Pros: consistent with current APIs. > >>>>>>>>>>>>>>>>>> Cons: lots of overloaded functions to add. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> 3) Add another default function in the interface > >>>>> (thank you > >>>>>>>>>> J8!) > >>>>>>>>>>>> as > >>>>>>>>>>>>>>> John > >>>>>>>>>>>>>>>>>> proposed. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Pros: no overloaded functions, no "Described". > >>>>>>>>>>>>>>>>>> Cons: do we lose lambda functions really (seems not if > >>>>> we > >>>>>>>>>>> provide > >>>>>>>>>>>> a > >>>>>>>>>>>>>>>>> "named" > >>>>>>>>>>>>>>>>>> for each func)? Plus "Described" may be more extensible > >>>>>> than a > >>>>>>>>>>>>>> single > >>>>>>>>>>>>>>>>>> `String`. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> My principle of considering which one is better depends > >>>>>>>>>>> primarily > >>>>>>>>>>>> on > >>>>>>>>>>>>>>> "how > >>>>>>>>>>>>>>>>>> to make advanced users easily use the additional API, > >>>>> while > >>>>>>>>>>>> keeping > >>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>>>> hidden from normal users who do not care at all". For > >>>>> that > >>>>>>>>>>>> purpose I > >>>>>>>>>>>>>>>>> think > >>>>>>>>>>>>>>>>>> 3) > 1) > 2). > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> One caveat though, is that changing the interface > >>>>> would not > >>>>>> be > >>>>>>>>>>>>>>>>>> binary-compatible though source-compatible, right? I.e. > >>>>>> users > >>>>>>>>>>> need > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>> recompile their code though no changes needed. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Another note: for 3), if we really want to keep > >>>>>> extensibility > >>>>>>>>>> of > >>>>>>>>>>>>>>>>> Described > >>>>>>>>>>>>>>>>>> we could do sth. like: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> --------- > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> public interface Predicate<K, V> { > >>>>>>>>>>>>>>>>>> // existing method > >>>>>>>>>>>>>>>>>> boolean test(final K key, final V value); > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> // new default method adds the ability to name the > >>>>>>>>>> predicate > >>>>>>>>>>>>>>>>>> default Described described() { > >>>>>>>>>>>>>>>>>> return new Described(null); > >>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> ---------- > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> where user's code becomes: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> stream.filter(named("key", (k, v) -> true)); // note > >>>>>> `named` > >>>>>>>>>>> now > >>>>>>>>>>>>>> just > >>>>>>>>>>>>>>>>>> sets a Described("key") in "described()". > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> stream.filter(described(Described.as("key", /* any > >>>>> other > >>>>>> fancy > >>>>>>>>>>>>>>>>> parameters > >>>>>>>>>>>>>>>>>> in the future*/), (k, v) -> true)); > >>>>>>>>>>>>>>>>>> ---------- > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> I feel it is not much likely that we'd need to extend > >>>>> it > >>>>>>>>>> further > >>>>>>>>>>>> in > >>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> future, so just a `String` would be good enough. But > >>>>> just > >>>>>>>>>>> listing > >>>>>>>>>>>>>> all > >>>>>>>>>>>>>>>>>> possibilities here. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Guozhang > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On Fri, Jul 6, 2018 at 8:19 AM, John Roesler < > >>>>>>>>>> j...@confluent.io > >>>>>>>>>>>> > >>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Hi Florian, > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Sorry I'm late to the party, but I missed the message > >>>>>>>>>>> originally. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Regarding the names, it's probably a good idea to > >>>>> stick to > >>>>>>>>>> the > >>>>>>>>>>>> same > >>>>>>>>>>>>>>>>>>> character set we're currently using: letters, > >>>>> numbers, and > >>>>>>>>>>>> hyphens. > >>>>>>>>>>>>>>> The > >>>>>>>>>>>>>>>>>>> names are used in Kafka topics, files and folders, and > >>>>>>>>>> RocksDB > >>>>>>>>>>>>>>>>> databases, > >>>>>>>>>>>>>>>>>>> and we also need them to work with the file systems of > >>>>>>>>>> Windows, > >>>>>>>>>>>>>> Linux, > >>>>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>> MacOS. My opinion is that with a situation like that, > >>>>> it's > >>>>>>>>>>> better > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>> conservative. It might also be a good idea to impose > >>>>> an > >>>>>> upper > >>>>>>>>>>>>>> limit on > >>>>>>>>>>>>>>>>>> name > >>>>>>>>>>>>>>>>>>> length to avoid running afoul of any of those systems. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> --- > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> It seems like there's a small debate between 1) > >>>>> adding a > >>>>>> new > >>>>>>>>>>>>>> method to > >>>>>>>>>>>>>>>>>>> KStream (and maybe KTable) to modify its name after > >>>>> the > >>>>>> fact, > >>>>>>>>>>> or > >>>>>>>>>>>> 2) > >>>>>>>>>>>>>>>>>>> piggy-backing on the config objects where they exist > >>>>> and > >>>>>>>>>> adding > >>>>>>>>>>>> one > >>>>>>>>>>>>>>>>> where > >>>>>>>>>>>>>>>>>>> they don't. To me, #2 is the better alternative even > >>>>> though > >>>>>>>>>> it > >>>>>>>>>>>>>>> produces > >>>>>>>>>>>>>>>>>>> more overloads and may be a bit awkward in places. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> The reason is simply that #1 is a high-level > >>>>> departure from > >>>>>>>>>> the > >>>>>>>>>>>>>>>>>>> graph-building paradigm we're using in the DSL. > >>>>> Consider: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Graph.node1(config).node2(config) > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> vs > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Graph.node1().config().node2().config() > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> We could have done either, but we picked the former. I > >>>>>> think > >>>>>>>>>>> it's > >>>>>>>>>>>>>>>>>> probably > >>>>>>>>>>>>>>>>>>> a good goal to try and stick to it so that developers > >>>>> can > >>>>>>>>>>> develop > >>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>> rely > >>>>>>>>>>>>>>>>>>> on their instincts for how the DSL will behave. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> I do want to present one alternative to adding new > >>>>> config > >>>>>>>>>>>> objects: > >>>>>>>>>>>>>> we > >>>>>>>>>>>>>>>>> can > >>>>>>>>>>>>>>>>>>> just add a "name()" method to all our "action" > >>>>> interfaces. > >>>>>>>>>> For > >>>>>>>>>>>>>>> example, > >>>>>>>>>>>>>>>>>>> I'll demonstrate how we can add a "name" to Predicate > >>>>> and > >>>>>>>>>> then > >>>>>>>>>>>> use > >>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> name a "KStream#filter" DSL operator: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> public interface Predicate<K, V> { > >>>>>>>>>>>>>>>>>>> // existing method > >>>>>>>>>>>>>>>>>>> boolean test(final K key, final V value); > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> // new default method adds the ability to name the > >>>>>>>>>>> predicate > >>>>>>>>>>>>>>>>>>> default String name() { > >>>>>>>>>>>>>>>>>>> return null; > >>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> // new static factory method adds the ability to > >>>>> wrap > >>>>>>>>>>> lambda > >>>>>>>>>>>>>>>>>> predicates > >>>>>>>>>>>>>>>>>>> with a named predicate > >>>>>>>>>>>>>>>>>>> static <K, V> Predicate<K, V> named(final String > >>>>> name, > >>>>>>>>>>> final > >>>>>>>>>>>>>>>>>>> Predicate<K, V> predicate) { > >>>>>>>>>>>>>>>>>>> return new Predicate<K, V>() { > >>>>>>>>>>>>>>>>>>> @Override > >>>>>>>>>>>>>>>>>>> public boolean test(final K key, final V > >>>>>> value) { > >>>>>>>>>>>>>>>>>>> return predicate.test(key, value); > >>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> @Override > >>>>>>>>>>>>>>>>>>> public String name() { > >>>>>>>>>>>>>>>>>>> return name; > >>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>> }; > >>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Then, here's how it would look to use it: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> // Anonymous predicates continue to work just fine > >>>>>>>>>>>>>>>>>>> stream.filter((k, v) -> true); > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> // Devs can swap in a Predicate that implements the > >>>>> name() > >>>>>>>>>>>> method. > >>>>>>>>>>>>>>>>>>> stream.filter(new Predicate<Object, Object>() { > >>>>>>>>>>>>>>>>>>> @Override > >>>>>>>>>>>>>>>>>>> public boolean test(final Object key, final Object > >>>>>>>>>> value) { > >>>>>>>>>>>>>>>>>>> return true; > >>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> @Override > >>>>>>>>>>>>>>>>>>> public String name() { > >>>>>>>>>>>>>>>>>>> return "hey"; > >>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>> }); > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> // Or they can wrap their existing lambda using the > >>>>> static > >>>>>>>>>>>> factory > >>>>>>>>>>>>>>>>> method > >>>>>>>>>>>>>>>>>>> stream.filter(named("key", (k, v) -> true)); > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Just a thought. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Overall, I think it's really valuable to be able to > >>>>> name > >>>>>> the > >>>>>>>>>>>>>>>>> processors, > >>>>>>>>>>>>>>>>>>> for all the reasons you mentioned in the KIP. So > >>>>> thank you > >>>>>>>>>> for > >>>>>>>>>>>>>>>>>> introducing > >>>>>>>>>>>>>>>>>>> this! > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>>>>>>> -John > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> On Thu, Jul 5, 2018 at 4:53 PM Florian Hussonnois < > >>>>>>>>>>>>>>>>> fhussonn...@gmail.com > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Hi, thank you very much for all you suggestions. I've > >>>>>>>>>> started > >>>>>>>>>>> to > >>>>>>>>>>>>>>>>> update > >>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>> KIP ( > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>> 307%3A+Allow+to+define+custom+processor+names+with+KStreams+DSL > >>>>>>>>>>>>>>>>>>>> ). > >>>>>>>>>>>>>>>>>>>> Also, I propose to rename the Processed class into > >>>>>>>>>> Described - > >>>>>>>>>>>>>> this > >>>>>>>>>>>>>>>>>> will > >>>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>> more meaningful (but this is just a detail). > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> I'm OK to not enforcing uppercase for specific names > >>>>> but > >>>>>>>>>>> should > >>>>>>>>>>>> we > >>>>>>>>>>>>>>>>>> allow > >>>>>>>>>>>>>>>>>>>> arbitrary names with whitespaces for example ? > >>>>> Currently, > >>>>>> I > >>>>>>>>>>>> can't > >>>>>>>>>>>>>>>>> tell > >>>>>>>>>>>>>>>>>> if > >>>>>>>>>>>>>>>>>>>> this can lead to some side effects ? > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Le lun. 11 juin 2018 à 01:31, Matthias J. Sax < > >>>>>>>>>>>>>> matth...@confluent.io > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>> écrit : > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Just catching up on this thread. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> I like the general idea. Couple of comments: > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> - I think that adding `Processed` (or maybe a > >>>>> different > >>>>>>>>>>> name?) > >>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>>> valid proposal for stateless operators that only > >>>>> have a > >>>>>>>>>>> single > >>>>>>>>>>>>>>>>>> overload > >>>>>>>>>>>>>>>>>>>>> atm. It would align with the overall API design. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> - for all methods with multiple existing > >>>>> overloads, we > >>>>>> can > >>>>>>>>>>>>>>>>> consider > >>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>> extend `Consumed`, `Produced`, `Materialized` etc > >>>>> to take > >>>>>>>>>> an > >>>>>>>>>>>>>>>>>> additional > >>>>>>>>>>>>>>>>>>>>> processor name (not sure atm how elegant this is; we > >>>>>> would > >>>>>>>>>>> need > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>> "play" with the API a little bit; the advantage > >>>>> would be, > >>>>>>>>>>> that > >>>>>>>>>>>> we > >>>>>>>>>>>>>>>>> do > >>>>>>>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>>> add more overloads what seems to be key for this > >>>>> KIP) > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> - operators return void: while I agree that the > >>>>> "name > >>>>>>>>>> first" > >>>>>>>>>>>>>>>>>> chaining > >>>>>>>>>>>>>>>>>>>>> idea is not very intuitive, it might still work, if > >>>>> we > >>>>>> name > >>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> method > >>>>>>>>>>>>>>>>>>>>> correctly (again, we would need to "play" with the > >>>>> API a > >>>>>>>>>>> little > >>>>>>>>>>>>>> bit > >>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>> see) > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> - for DSL operators that are translated to multiple > >>>>>> nodes: > >>>>>>>>>>> it > >>>>>>>>>>>>>>>>> might > >>>>>>>>>>>>>>>>>>>>> make sense to use the specified operator name as > >>>>> prefix > >>>>>> and > >>>>>>>>>>> add > >>>>>>>>>>>>>>>>>>>>> reasonable suffixes. For example, a join translates > >>>>> into > >>>>>> 5 > >>>>>>>>>>>>>>>>> operators > >>>>>>>>>>>>>>>>>>>>> that could be name "name-left-store-processor", > >>>>>>>>>>>>>>>>>>>>> "name-left-join-processor", > >>>>> "name-right-store-processor", > >>>>>>>>>>>>>>>>>>>>> "name-right-join-processor", and > >>>>>>>>>> "name-join-merge-processor" > >>>>>>>>>>>> (or > >>>>>>>>>>>>>>>>>>>>> similar). Maybe just using numbers might also work. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> - I think, we should strip the number suffixes if > >>>>> a user > >>>>>>>>>>>>>> provides > >>>>>>>>>>>>>>>>>>> names > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> - enforcing upper case seems to be tricky: for > >>>>> example, > >>>>>> we > >>>>>>>>>>> do > >>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>>> enforce upper case for store names and we cannot > >>>>> easily > >>>>>>>>>>> change > >>>>>>>>>>>> it > >>>>>>>>>>>>>>>>> as > >>>>>>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>>>>>>> would break compatibility -- thus, for consistency > >>>>>> reasons > >>>>>>>>>> we > >>>>>>>>>>>>>> might > >>>>>>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>>> want to do this > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> - for better understand of the impact of the KIP, > >>>>> it > >>>>>> would > >>>>>>>>>>> be > >>>>>>>>>>>>>>>>> quite > >>>>>>>>>>>>>>>>>>>>> helpful if you would list all method names that are > >>>>>>>>>> affected > >>>>>>>>>>> in > >>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> KIP > >>>>>>>>>>>>>>>>>>>>> (ie, list all newly added overloads) > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> On 5/31/18 6:40 PM, Guozhang Wang wrote: > >>>>>>>>>>>>>>>>>>>>>> Hi Florian, > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Re 1: I think changing the KStreamImpl / > >>>>> KTableImpl to > >>>>>>>>>> allow > >>>>>>>>>>>>>>>>>>> modifying > >>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>> processor name after the operator is fine as long > >>>>> as we > >>>>>> do > >>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> check > >>>>>>>>>>>>>>>>>>>>> again > >>>>>>>>>>>>>>>>>>>>>> when modifying that. In fact, we are having some > >>>>>> topology > >>>>>>>>>>>>>>>>>>> optimization > >>>>>>>>>>>>>>>>>>>>>> going on which may modify processor names in the > >>>>> final > >>>>>>>>>>>> topology > >>>>>>>>>>>>>>>>>>>> anyways ( > >>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/4983). > >>>>>> Semantically > >>>>>>>>>> I > >>>>>>>>>>>>>> think > >>>>>>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>>>>>> easier to understand to developers than "deciding > >>>>> the > >>>>>>>>>>>> processor > >>>>>>>>>>>>>>>>>> name > >>>>>>>>>>>>>>>>>>>> for > >>>>>>>>>>>>>>>>>>>>>> the next operator". > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Re 2: Yeah I'm thinking that for operators that > >>>>>> translates > >>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> multiple > >>>>>>>>>>>>>>>>>>>>>> processor names, we can still use the provided > >>>>> "hint" to > >>>>>>>>>>> name > >>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>> processor > >>>>>>>>>>>>>>>>>>>>>> names, e.g. for Joins we can name them as > >>>>>> `join-foo-this` > >>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>> `join-foo-that` etc if user calls `as("foo")`. > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Re 3: The motivation I had about removing the > >>>>> suffix is > >>>>>>>>>> that > >>>>>>>>>>>> it > >>>>>>>>>>>>>>>>> has > >>>>>>>>>>>>>>>>>>>> huge > >>>>>>>>>>>>>>>>>>>>>> restrictions on topology compatibilities: consider > >>>>> if > >>>>>> user > >>>>>>>>>>>> code > >>>>>>>>>>>>>>>>>>> added a > >>>>>>>>>>>>>>>>>>>>> new > >>>>>>>>>>>>>>>>>>>>>> operator, or library does some optimization to > >>>>> remove > >>>>>> some > >>>>>>>>>>>>>>>>>> operators, > >>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>> suffix indexing may be changed for a large amount > >>>>> of the > >>>>>>>>>>>>>>>>> processor > >>>>>>>>>>>>>>>>>>>> names: > >>>>>>>>>>>>>>>>>>>>>> this will in turn change the internal state store > >>>>> names, > >>>>>>>>>> as > >>>>>>>>>>>> well > >>>>>>>>>>>>>>>>> as > >>>>>>>>>>>>>>>>>>>>>> internal topic names as well, making the new > >>>>> application > >>>>>>>>>>>>>> topology > >>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>> incompatible with the ones. One rationale I had > >>>>> about > >>>>>> this > >>>>>>>>>>> KIP > >>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>>> that > >>>>>>>>>>>>>>>>>>>>>> aligned this effort, moving forward we can allow > >>>>> users > >>>>>> to > >>>>>>>>>>>>>>>>> customize > >>>>>>>>>>>>>>>>>>>>>> internal names so that they can still be reused > >>>>> even > >>>>>> with > >>>>>>>>>>>>>>>>> topology > >>>>>>>>>>>>>>>>>>>>> changes > >>>>>>>>>>>>>>>>>>>>>> (e.g. KIP-230), so I think removing the suffix > >>>>> index > >>>>>> would > >>>>>>>>>>> be > >>>>>>>>>>>>>>>>> more > >>>>>>>>>>>>>>>>>>>>>> applicable in the long run. > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Guozhang > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> On Thu, May 31, 2018 at 3:08 PM, Florian > >>>>> Hussonnois < > >>>>>>>>>>>>>>>>>>>>> fhussonn...@gmail.com> > >>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Hi , > >>>>>>>>>>>>>>>>>>>>>>> Thank you very much for your feedback. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> 1/ > >>>>>>>>>>>>>>>>>>>>>>> I agree that overloading most of the methods with > >>>>> a > >>>>>>>>>>> Processed > >>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>>> ideal. > >>>>>>>>>>>>>>>>>>>>>>> I've started modifying the KStream API and I got > >>>>> to the > >>>>>>>>>>> same > >>>>>>>>>>>>>>>>>>>> conclusion. > >>>>>>>>>>>>>>>>>>>>>>> Also ading a new method directly to KStreamImpl > >>>>> and > >>>>>>>>>>>> KTableImpl > >>>>>>>>>>>>>>>>>>> classes > >>>>>>>>>>>>>>>>>>>>>>> seems to be a better option. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> However a processor name cannot be redefined after > >>>>>>>>>> calling > >>>>>>>>>>> an > >>>>>>>>>>>>>>>>>>> operator > >>>>>>>>>>>>>>>>>>>>> (or > >>>>>>>>>>>>>>>>>>>>>>> maybe I miss something in the code). > >>>>>>>>>>>>>>>>>>>>>>> From my understanding, this will only set the > >>>>> KStream > >>>>>>>>>> name > >>>>>>>>>>>>>>>>>> property > >>>>>>>>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>> processor name previsouly added to the topology > >>>>>> builder - > >>>>>>>>>>>>>>>>> leading > >>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>> InvalidTopology exception. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> So the new method should actually defines the > >>>>> name of > >>>>>> the > >>>>>>>>>>>> next > >>>>>>>>>>>>>>>>>>>>> processor : > >>>>>>>>>>>>>>>>>>>>>>> Below is an example : > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> *stream.as <http://stream.as > >>>>>>>>>>>>>>>>>>> (Processed.name("MAPPE_TO_UPPERCASE")* > >>>>>>>>>>>>>>>>>>>>>>> * .map( (k, v) -> KeyValue.pair(k, > >>>>>>>>>>>> v.toUpperCase()))* > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> I think this approach could solve the cases for > >>>>> methods > >>>>>>>>>>>>>>>>> returning > >>>>>>>>>>>>>>>>>>>> void ? > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Regarding this new method we have two possible > >>>>>>>>>>>> implementations > >>>>>>>>>>>>>> : > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> 1. Adding a method like : withName(String > >>>>>>>>>> processorName) > >>>>>>>>>>>>>>>>>>>>>>> 2. or adding a method accepting an Processed > >>>>> object > >>>>>> : > >>>>>>>>>>>>>>>>>>>> as(Processed). > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> I think solution 2. is preferable as the Processed > >>>>>> class > >>>>>>>>>>>> could > >>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>> enriched > >>>>>>>>>>>>>>>>>>>>>>> further (in futur). > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> 2/ > >>>>>>>>>>>>>>>>>>>>>>> As Guozhang said some operators add internal > >>>>>> processors. > >>>>>>>>>>>>>>>>>>>>>>> For example the branch() method create one > >>>>>> KStreamBranch > >>>>>>>>>>>>>>>>> processor > >>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>> route > >>>>>>>>>>>>>>>>>>>>>>> records and one KStreamPassThrough processor for > >>>>> each > >>>>>>>>>>> branch. > >>>>>>>>>>>>>>>>>>>>>>> In that situation only the parent processor can be > >>>>>> named. > >>>>>>>>>>> For > >>>>>>>>>>>>>>>>>>> children > >>>>>>>>>>>>>>>>>>>>>>> processors we could keep the current behaviour > >>>>> that > >>>>>> add a > >>>>>>>>>>>>>> suffix > >>>>>>>>>>>>>>>>>>> (i.e > >>>>>>>>>>>>>>>>>>>>>>> KSTREAM-BRANCHCHILD-) > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> This also the case for the join() method that > >>>>> result to > >>>>>>>>>>>> adding > >>>>>>>>>>>>>>>>>>>> multiple > >>>>>>>>>>>>>>>>>>>>>>> processors to the topology (windowing, left/right > >>>>> joins > >>>>>>>>>>> and a > >>>>>>>>>>>>>>>>>> merge > >>>>>>>>>>>>>>>>>>>>>>> processor). > >>>>>>>>>>>>>>>>>>>>>>> I think, like for the branch method users could > >>>>> only > >>>>>>>>>>> define a > >>>>>>>>>>>>>>>>>>>> processor > >>>>>>>>>>>>>>>>>>>>>>> name prefix. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> 3/ > >>>>>>>>>>>>>>>>>>>>>>> I think we should still added a suffix like > >>>>>>>>>> "-0000000000" > >>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>> processor > >>>>>>>>>>>>>>>>>>>>>>> name and enforce uppercases as this will keep some > >>>>>>>>>>>> consistency > >>>>>>>>>>>>>>>>>> with > >>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>> ones generated by the API. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> 4/ > >>>>>>>>>>>>>>>>>>>>>>> Yes, the KTable interface should be modified like > >>>>>> KStream > >>>>>>>>>>> to > >>>>>>>>>>>>>>>>> allow > >>>>>>>>>>>>>>>>>>>>> custom > >>>>>>>>>>>>>>>>>>>>>>> processor names definition. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Le jeu. 31 mai 2018 à 19:18, Damian Guy < > >>>>>>>>>>>> damian....@gmail.com> > >>>>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>> écrit > >>>>>>>>>>>>>>>>>>>>> : > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Hi Florian, > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP. What about KTable and other > >>>>> DSL > >>>>>>>>>>>>>> interfaces? > >>>>>>>>>>>>>>>>>>> Will > >>>>>>>>>>>>>>>>>>>>> they > >>>>>>>>>>>>>>>>>>>>>>>> not want to be able to do the same thing? > >>>>>>>>>>>>>>>>>>>>>>>> It would be good to see a complete set of the > >>>>> public > >>>>>> API > >>>>>>>>>>>>>>>>> changes. > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>>>>> Damian > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 May 2018 at 19:45 Guozhang Wang < > >>>>>>>>>>>>>> wangg...@gmail.com > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Hello Florian, > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP. I have some meta feedbacks > >>>>> on the > >>>>>>>>>>>>>>>>> proposal: > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> 1. You mentioned that this `Processed` object > >>>>> will be > >>>>>>>>>>> added > >>>>>>>>>>>>>>>>> to a > >>>>>>>>>>>>>>>>>>> new > >>>>>>>>>>>>>>>>>>>>>>>>> overloaded variant of all the stateless > >>>>> operators, > >>>>>> what > >>>>>>>>>>>> about > >>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>> stateful > >>>>>>>>>>>>>>>>>>>>>>>>> operators? Would like to hear your opinions if > >>>>> you > >>>>>> have > >>>>>>>>>>>>>>>>> thought > >>>>>>>>>>>>>>>>>>>> about > >>>>>>>>>>>>>>>>>>>>>>>> that: > >>>>>>>>>>>>>>>>>>>>>>>>> note for stateful operators they will usually be > >>>>>> mapped > >>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> multiple > >>>>>>>>>>>>>>>>>>>>>>>>> processor node names, so we probably need to > >>>>> come up > >>>>>>>>>> with > >>>>>>>>>>>>>> some > >>>>>>>>>>>>>>>>>>> ways > >>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>> define all their names. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> 2. I share the same concern with Bill as for > >>>>> adding > >>>>>>>>>> lots > >>>>>>>>>>> of > >>>>>>>>>>>>>>>>> new > >>>>>>>>>>>>>>>>>>>>>>> overload > >>>>>>>>>>>>>>>>>>>>>>>>> functions into the stateless operators, as we > >>>>> have > >>>>>> just > >>>>>>>>>>>> spent > >>>>>>>>>>>>>>>>>>> quite > >>>>>>>>>>>>>>>>>>>>>>> some > >>>>>>>>>>>>>>>>>>>>>>>>> effort in trimming them since 1.0.0 release. If > >>>>> the > >>>>>>>>>> goal > >>>>>>>>>>> is > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> just > >>>>>>>>>>>>>>>>>>>>>>>> provide > >>>>>>>>>>>>>>>>>>>>>>>>> some "hints" on the generated processor node > >>>>> names, > >>>>>> not > >>>>>>>>>>>>>>>>> strictly > >>>>>>>>>>>>>>>>>>>>>>>> enforcing > >>>>>>>>>>>>>>>>>>>>>>>>> the exact names that to be generated, then how > >>>>> about > >>>>>> we > >>>>>>>>>>>> just > >>>>>>>>>>>>>>>>>> add a > >>>>>>>>>>>>>>>>>>>> new > >>>>>>>>>>>>>>>>>>>>>>>>> function to `KStream` and `KTable` classes like: > >>>>>>>>>>>>>>>>>> "as(Processed)", > >>>>>>>>>>>>>>>>>>>> with > >>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> semantics as "the latest operators that > >>>>> generate this > >>>>>>>>>>>> KStream > >>>>>>>>>>>>>>>>> / > >>>>>>>>>>>>>>>>>>>> KTable > >>>>>>>>>>>>>>>>>>>>>>>> will > >>>>>>>>>>>>>>>>>>>>>>>>> be named accordingly to this hint". > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> The only caveat, is that for all operators like > >>>>>>>>>>>> `KStream#to` > >>>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>>>>> `KStream#print` that returns void, this > >>>>> alternative > >>>>>>>>>> would > >>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>> work. > >>>>>>>>>>>>>>>>>>>>> But > >>>>>>>>>>>>>>>>>>>>>>>> for > >>>>>>>>>>>>>>>>>>>>>>>>> the current operators: > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> a. KStream#print, > >>>>>>>>>>>>>>>>>>>>>>>>> b. KStream#foreach, > >>>>>>>>>>>>>>>>>>>>>>>>> c. KStream#to, > >>>>>>>>>>>>>>>>>>>>>>>>> d. KStream#process > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> I personally felt that except `KStream#process` > >>>>> users > >>>>>>>>>>> would > >>>>>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>>> usually > >>>>>>>>>>>>>>>>>>>>>>>>> bother to override their names, and for > >>>>>>>>>> `KStream#process` > >>>>>>>>>>>> we > >>>>>>>>>>>>>>>>>> could > >>>>>>>>>>>>>>>>>>>> add > >>>>>>>>>>>>>>>>>>>>>>> an > >>>>>>>>>>>>>>>>>>>>>>>>> overload variant with the additional Processed > >>>>>> object. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> 3. In your example, the processor names are > >>>>> still > >>>>>> added > >>>>>>>>>>>> with > >>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>> suffix > >>>>>>>>>>>>>>>>>>>>>>>> like > >>>>>>>>>>>>>>>>>>>>>>>>> " > >>>>>>>>>>>>>>>>>>>>>>>>> -0000000000", is this intentional? If yes, why > >>>>> (I > >>>>>>>>>> thought > >>>>>>>>>>>>>> with > >>>>>>>>>>>>>>>>>>> user > >>>>>>>>>>>>>>>>>>>>>>>>> specified processor name hints we will not add > >>>>> suffix > >>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> distinguish > >>>>>>>>>>>>>>>>>>>>>>>>> different nodes of the same type any more)? > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Guozhang > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> On Tue, May 29, 2018 at 6:47 AM, Bill Bejeck < > >>>>>>>>>>>>>>>>> bbej...@gmail.com > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Hi Florian, > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP. I think being able to add > >>>>> more > >>>>>>>>>>>> context > >>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>> processor names would be useful. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> I like the idea of adding a > >>>>> "withProcessorName" to > >>>>>>>>>>>> Produced, > >>>>>>>>>>>>>>>>>>>> Consumed > >>>>>>>>>>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>>>>>> Joined. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> But instead of adding the "Processed" > >>>>> parameter to a > >>>>>>>>>>> large > >>>>>>>>>>>>>>>>>>>> percentage > >>>>>>>>>>>>>>>>>>>>>>>> of > >>>>>>>>>>>>>>>>>>>>>>>>>> the methods, which would result in overloaded > >>>>>> methods > >>>>>>>>>>>> (which > >>>>>>>>>>>>>>>>> we > >>>>>>>>>>>>>>>>>>>>>>> removed > >>>>>>>>>>>>>>>>>>>>>>>>>> quite a bit with KIP-182) what do you think of > >>>>>> adding > >>>>>>>>>> a > >>>>>>>>>>>>>>>>> method > >>>>>>>>>>>>>>>>>>>>>>>>>> to the AbstractStream class "withName(String > >>>>>>>>>>>>>> processorName)"? > >>>>>>>>>>>>>>>>>> BTW > >>>>>>>>>>>>>>>>>>>> I"m > >>>>>>>>>>>>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>>>>>>>> married to the method name, it's the best I > >>>>> can do > >>>>>> off > >>>>>>>>>>> the > >>>>>>>>>>>>>>>>> top > >>>>>>>>>>>>>>>>>> of > >>>>>>>>>>>>>>>>>>>> my > >>>>>>>>>>>>>>>>>>>>>>>>> head. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> For the methods that return void, we'd have to > >>>>> add a > >>>>>>>>>>>>>>>>> parameter, > >>>>>>>>>>>>>>>>>>> but > >>>>>>>>>>>>>>>>>>>>>>>> that > >>>>>>>>>>>>>>>>>>>>>>>>>> would at least cut down on the number of > >>>>> overloaded > >>>>>>>>>>>> methods > >>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>> API. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Just my 2 cents. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>>>>>>>>>>>>>> Bill > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, May 27, 2018 at 4:13 PM, Florian > >>>>> Hussonnois > >>>>>> < > >>>>>>>>>>>>>>>>>>>>>>>>> fhussonn...@gmail.com > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to start a new discussion on > >>>>> following > >>>>>>>>>>> KIP : > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>> 307%3A+Allow+to+define+custom+processor+names+with+KStreams+DSL > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> This is still a draft. > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Looking forward for your feedback. > >>>>>>>>>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>>>>>>>>>> Florian HUSSONNOIS > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>>>>>>>> -- Guozhang > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>>>>>> Florian HUSSONNOIS > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>>> Florian HUSSONNOIS > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>> -- Guozhang > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>> Florian HUSSONNOIS > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> -- > >>>>>>>>>>>>>> Florian HUSSONNOIS > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> -- > >>>>>>>>>>>>> -- Guozhang > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> -- > >>>>>>>>>>>> -- Guozhang > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> -- > >>>>>>>>>> -- Guozhang > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> Florian HUSSONNOIS > >>>>>> > >>>>> > >>>> > >>>> > >>>> -- > >>>> -- Guozhang > >>>> > >>> > >>> > >>> -- > >>> -- Guozhang > >>> > >> > >> > >> -- > >> -- Guozhang > >> > > > > > > -- Florian HUSSONNOIS