Re: [DISCUSS] KIP-307: Allow to define custom processor names with KStreams DSL

Florian Hussonnois Fri, 11 Jan 2019 15:40:19 -0800

Hi Guozhang,

I have updated the PR as well as the KIP. I should add more unit tests to
covers all new methods.


However, I still have one test in failure. The reason is that using
Joined.name() in both potential repartition topic and processor nodes leads
to topology-incompatible.
How should we deal with that ?

Thanks,

Le jeu. 10 janv. 2019 à 01:21, Guozhang Wang <[email protected]> a écrit :

> Hello Florian,
>
> Just checking if have read about my previous email and if you feel happy
> about it. We have the 2.2 KIP freeze deadline at 24th this month, while the
> PR itself is getting quite close. So it'll be great if we can get the
> agreement on it and get it into 2.2.0 release.
>
>
> Guozhang
>
>
> On Mon, Dec 17, 2018 at 2:39 PM Guozhang Wang <[email protected]> wrote:
>
>> Hi Florian / John,
>>
>> Just wanted to throw a couple minor thoughts on the current proposal:
>>
>> 1) Regarding the interface / function name, I'd propose we call the
>> interface `NamedOperation` which would be implemented by Produced /
>> Consumed / Printed / Joined / Grouped / Suppressed (note I intentionally
>> exclude Materialized here since its semantics is quite), and have the
>> default class that implements `NamedOperation` as `Named`, which would be
>> used in our adding overload functions. The main reason is to have
>> consistency in naming.
>>
>> 2) As a minor tweak, I think it's better to use Joined.name() in both its
>> possibly generate repartition topic, as well as the map processor used for
>> group-by (currently this name is only used for the repartition topic).
>>
>>
>> Florian: if you think this proposal makes sense, please feel free to go
>> ahead and update the PR; after we made a first pass on it and feels
>> confident about it, we can go ahead with the VOTING process. About the
>> implementation of 2) above, this may be out of your implementation scope,
>> so feel free to leave it out side your PR while Bill who originally worked
>> on the Grouped KIP can make a follow-up PR for it.
>>
>> Guozhang
>>
>> On Fri, Dec 14, 2018 at 9:43 PM Guozhang Wang <[email protected]> wrote:
>>
>>> Hello Florian,
>>>
>>> Really appreciate you for your patience.
>>>
>>> I know that we've discussed about the approach to adding overloaded
>>> functions and rejected it early on. But looking deeper into the current PR
>>> I realized that this approach has a danger of great API confusions to users
>>> (I tried to explain my thoughts in the PR, but it was not very clear) ---
>>> the basic idea is that, today we already have a few existing control
>>> classes including Grouped, Joined, Suppressed that allow users to specify
>>> serdes etc, while also a "name" which can then be used to define the
>>> processor name / internal topic names in the topology (the static function
>>> names are not consistent, which I think we should fix as well). And Named
>>> interface, by extending the lambda function interfaces like ValueJoiner /
>>> Predicate etc opens the door for another way to specify the names again.
>>>
>>> So in order to achieve consistency, we are left with generally two
>>> options:
>>>
>>> 1) only allow users to specify names via the lambda interfaces that
>>> extends Named interface. This means we'd better remove the naming mechanism
>>> from the existing control objects to keep consistency.
>>>
>>> 2) only allow users to specify names via control classes, and we
>>> introduce a new class (Named) for those which do not have one yet --- this
>>> leads to the overloaded functions.
>>>
>>> I did a quick count on the num.of overloaded functions, and summing from
>>> KTable (8) / KStream (15) / KGroupedStream (6) / KGroupedTable (6) /
>>> TimeWindowedKStream (6) / SessionWindowedKStream (6) we got about 47
>>> overloaded functions (our guess was pretty close!) -- note this is based on
>>> John's proposal that we can let existing Grouped / Joined to extend Named
>>> and hence we only need overloaded functions with a default NamedOperation
>>> for those operators that do not have a control classes already.
>>>
>>> Thinking about this approach I feel it is not too bad compared with
>>> either 1) above, which would require us to deprecate lot of public
>>> functions around name(), or having a mixed mechanism for naming, which
>>> could lead to very confusing behavior to users. Additionally, for most
>>> users who would only want to specify the names for those stateful
>>> operations which have internal topics / state stores and hence are more
>>> keen to upgrade compatibility, those added overloads would be not-often
>>> used functions for them anyways. And by letting existing control classes to
>>> extend Named, we can have a unified method name for static constructor as
>>> well.
>>>
>>>
>>>
>>> Guozhang
>>>
>>>
>>> On Fri, Dec 14, 2018 at 10:24 AM John Roesler <[email protected]> wrote:
>>>
>>>> Hi Florian,
>>>>
>>>> Sorry about the run-around of rejecting the original proposal,
>>>> only to return to it later on. Hopefully, it's more encouraging
>>>> than frustrating that we're coming around to your initial way of
>>>> thinking.
>>>>
>>>> Thanks!
>>>> -John
>>>>
>>>> On Thu, Dec 13, 2018 at 4:28 PM Florian Hussonnois <
>>>> [email protected]>
>>>> wrote:
>>>>
>>>> > Hi all,
>>>> >
>>>> > Thanks again. I agree with your propositions.
>>>> > Also IMHO, overloading all methods (filter, map) to accept a new
>>>> control
>>>> > object seems to provide a more natural development experience for
>>>> users.
>>>> >
>>>> > Actually, this was the first proposition for this KIP, but we have
>>>> rejected
>>>> > it because this solution led to adding a lot of new methods.
>>>> > As you mentioned it, the API has evolve since the creation of this
>>>> KIP -
>>>> > some existing control objects already allow to customize internal
>>>> names. We
>>>> > should so keep on that strategy.
>>>> >
>>>> > If everyone is OK with that, I will update the KIP and the PR
>>>> accordingly;
>>>> >
>>>> > Thanks.
>>>> >
>>>> > Le jeu. 13 déc. 2018 à 18:08, John Roesler <[email protected]> a
>>>> écrit :
>>>> >
>>>> > > Hi again, all,
>>>> > >
>>>> > > Matthias, I agree with you.
>>>> > >
>>>> > > Florian, thanks for your response.
>>>> > >
>>>> > > I think your proposal is the best way to address the ask for hiding
>>>> the
>>>> > > name() getter. But I'd like to question that ask and instead
>>>> propose that
>>>> > > we just make the name() getter part of the public API.
>>>> > >
>>>> > > The desire to "hide" the getters causes a lot of complexity in our
>>>> code
>>>> > > base, and it will become completely impractical with the mixin
>>>> strategy
>>>> > of
>>>> > > Named.
>>>> > >
>>>> > > If we were to switch strategies back to mixing Named in to the
>>>> control
>>>> > > objects rather than the functions, then the path forward becomes
>>>> quite
>>>> > > clear.
>>>> > >
>>>> > > On the other hand, it seems harmless for anyone who wants to be
>>>> able to
>>>> > > query the name from a control object after setting it, so my vote
>>>> would
>>>> > be
>>>> > > simply to keep the Named interface as:
>>>> > >
>>>> > > public interface Named<T extends Named<T>> {
>>>> > >   String name();
>>>> > >   T withName(String name);
>>>> > > }
>>>> > >
>>>> > > Under this proposal, we only mix Named in to the control objects,
>>>> which
>>>> > > means we have no need of default implementations anymore (because
>>>> we can
>>>> > > update all the control objects concurrently with adding this
>>>> interface to
>>>> > > them).
>>>> > >
>>>> > > This does hinge on switching over to a control-object-only strategy,
>>>> > which
>>>> > > introduces the need to add about 50 new control object classes,
>>>> which
>>>> > would
>>>> > > only serve to implement Named. As a middle ground, maybe we could
>>>> just
>>>> > add
>>>> > > one generic control object class, like:
>>>> > >
>>>> > > public class NamedOperation implements Named<NamedOperation> {
>>>> > >   private final String name;
>>>> > >   private NamedOperation(final String name) { this.name = name; }
>>>> > >   public static NamedOperation name(final String name) {
>>>> > >     return new NamedOperation(name);
>>>> > >   }
>>>> > >   public String name() { return name; }
>>>> > >   public NamedOperation withName(final String name) {
>>>> > >     return new NamedOperation(name);
>>>> > >   }
>>>> > > }
>>>> > >
>>>> > > And then, we'd add overloads for all the methods that don't have
>>>> control
>>>> > > objects already (for example, filter() ):
>>>> > >
>>>> > > // existing
>>>> > > KStream<K, V> filter(Predicate<? super K, ? super V> predicate);
>>>> > >
>>>> > > // new
>>>> > > KStream<K, V> filter(Predicate<? super K, ? super V> predicate,
>>>> > > NamedOperation named);
>>>> > >
>>>> > > Additionally, in regard to Matthias's point about existing control
>>>> > objects
>>>> > > with naming semantics, they would extend Named (but not
>>>> NamedOperation)
>>>> > for
>>>> > > uniformity.
>>>> > >
>>>> > > You provided a good approach to hide the getter with your
>>>> SettableName
>>>> > > class; I think what you proposed is the only way we could hide the
>>>> name.
>>>> > > In the end, though, it's a lot of complexity added (control object
>>>> class
>>>> > > hierarchy, inheritance, mutable state, internal casting) for
>>>> something of
>>>> > > dubious value: to be able to hide the name from someone *after they
>>>> > > themselves have set it*.
>>>> > >
>>>> > > Although it'll be a pain, perhaps Matthias's suggestion to
>>>> enumerate all
>>>> > > the API methods is the best way to be sure we all agree on what's
>>>> going
>>>> > to
>>>> > > happen.
>>>> > >
>>>> > > Thanks again for wrangling with this issue,
>>>> > > -John
>>>> > >
>>>> > > On Thu, Dec 13, 2018 at 9:03 AM Matthias J. Sax <
>>>> [email protected]>
>>>> > > wrote:
>>>> > >
>>>> > > > Just catching up on this discussion.
>>>> > > >
>>>> > > > My overall personal take is, that I am not a big fan of the
>>>> interface
>>>> > > > `Named` that is used as a factory. I would rather prefer to add a
>>>> > > > control object parameter to all methods that don't have one yet.
>>>> This
>>>> > > > KIP was started a while ago, and we added new naming capabilities
>>>> in
>>>> > the
>>>> > > > meantime. Guozhang's example in the PR comment about naming in
>>>> > > > stream-stream join shows, that we might end up in a confusion
>>>> situation
>>>> > > > for users if we use `Named`. Also, in 2.1, user can already name
>>>> as
>>>> > > > repartition-/changelog-topics and stores. Thus, KIP-307 boils
>>>> down to
>>>> > > > provide non-functional naming?
>>>> > > >
>>>> > > > Hence, for all methods that allow to specify names already, I
>>>> don't see
>>>> > > > any reason to change them, but use the existing API to also name
>>>> the
>>>> > > > processor(s) instead of allowing uses to specify a new name.
>>>> > > >
>>>> > > > About the inconsistency in method naming. I agree, that `as` is
>>>> very
>>>> > > > generic and maybe not the best choice.
>>>> > > >
>>>> > > > I think it might be helpful, to have a table overview in the KIP,
>>>> that
>>>> > > > list all existing static/non-static methods that allow to specify
>>>> a
>>>> > > > name, plus a columns with the new suggested naming for those
>>>> methods?
>>>> > > >
>>>> > > > Thoughts?
>>>> > > >
>>>> > > >
>>>> > > > -Matthias
>>>> > > >
>>>> > > >
>>>> > > > On 12/12/18 12:45 AM, Florian Hussonnois wrote:
>>>> > > > > Thank you very much for your feedbacks.
>>>> > > > >
>>>> > > > > Currently, there is still lot of discussions regarding the Named
>>>> > > > interface.
>>>> > > > > On the one hand we should provided consistency over the stream
>>>> API
>>>> > and
>>>> > > on
>>>> > > > > the other hand we should not break the semantic as John point
>>>> it up.
>>>> > > > >
>>>> > > > > Guozhang, I'm sorry, but I'm little bit confused, maybe I missed
>>>> > > > something.
>>>> > > > > In your comment you have suggested that :
>>>> > > > > * Produced/Consumed/Suppressed should extends Named
>>>> > > > > * Named should have a private-package method to get the
>>>> specified
>>>> > > > processor
>>>> > > > > name internally (processorName())
>>>> > > > > * Finally we should end up with something like :  Named -> XXX
>>>> ->
>>>> > > > > XXXInternal or Named -> Produced -> ProducedInternal
>>>> > > > >
>>>> > > > > The objective behind that is to :
>>>> > > > > * consolidate the internal method processorName()
>>>> > > > > * consolidate the method withName that exists now existing into
>>>> > > Produced,
>>>> > > > > Consumed and Suppressed.
>>>> > > > >
>>>> > > > > But, Named is an interface so we can't define a private-package
>>>> > method
>>>> > > on
>>>> > > > > it. Also, for example Produced and ProducedInternal are not in
>>>> the
>>>> > same
>>>> > > > > package so having a private-package method doesn't really help.
>>>> > > > > In addition, if we add the withName method into Named interface
>>>> this
>>>> > > can
>>>> > > > > become confusing for developers because action interfaces
>>>> > (ValueMapper,
>>>> > > > > Reducer, etc) extend it.
>>>> > > > > The interface would look like :
>>>> > > > >
>>>> > > > > public interface Named<T extends Named<T>> {
>>>> > > > >     default String name() {
>>>> > > > >         return null;
>>>> > > > >     }
>>>> > > > >     default Named<T> withName(final String name) {
>>>> > > > >         return null;
>>>> > > > >     }
>>>> > > > > ...
>>>> > > > > }
>>>> > > > >
>>>> > > > > So maybe instead of adding another method to Named we could
>>>> create a
>>>> > > new
>>>> > > > > package-private class that could be extended by
>>>> > > > > Produced/Consumed/Joined/Suppressed. For exemple,
>>>> > > > > class SettableName<T extends SettableName<T>> implements Named {
>>>> > > > >
>>>> > > > >     protected String processorName;
>>>> > > > >
>>>> > > > >     SettableName(final SettableName settable) {
>>>> > > > >         this(Objects.requireNonNull(settable, "settable can't be
>>>> > > > > null").name());
>>>> > > > >     }
>>>> > > > >
>>>> > > > >     SettableName(final String processorName) {
>>>> > > > >         this.processorName = processorName;
>>>> > > > >     }
>>>> > > > >
>>>> > > > >     @Override
>>>> > > > >     public String name() {
>>>> > > > >         return processorName;
>>>> > > > >     }
>>>> > > > >     public T withName(final String processorName) {
>>>> > > > >         this.processorName = processorName;
>>>> > > > >         return (T)this;
>>>> > > > >     }
>>>> > > > > }
>>>> > > > >
>>>> > > > > In that way, we will get : public class Produced implements
>>>> > > > > SettableName<Produced> { ...
>>>> > > > >
>>>> > > > > WDYT?
>>>> > > > >
>>>> > > > >
>>>> > > > > Le mar. 11 déc. 2018 à 02:46, Guozhang Wang <[email protected]>
>>>> a
>>>> > > > écrit :
>>>> > > > >
>>>> > > > >> I had one meta comment on the PR:
>>>> > > > >>
>>>> https://github.com/apache/kafka/pull/5909#discussion_r240447153
>>>> > > > >>
>>>> > > > >> On Mon, Dec 10, 2018 at 5:22 PM John Roesler <
>>>> [email protected]>
>>>> > > wrote:
>>>> > > > >>
>>>> > > > >>> Hi Florian,
>>>> > > > >>>
>>>> > > > >>> I hope it's ok if I ask a few questions at this late stage...
>>>> > > > >>>
>>>> > > > >>> Comment 1 ======
>>>> > > > >>>
>>>> > > > >>> It seems like the proposal is to add a new "Named" interface
>>>> that
>>>> > is
>>>> > > > >>> intended to be mixed in with the existing API objects at
>>>> various
>>>> > > > points.
>>>> > > > >>>
>>>> > > > >>> Just to preface some of my comments, it looks like your KIP
>>>> was
>>>> > > created
>>>> > > > >>> quite a while ago, so the API may have changed somewhat since
>>>> you
>>>> > > > >> started.
>>>> > > > >>>
>>>> > > > >>> As I see the API, there are a few different kinds of DSL
>>>> method
>>>> > > > >> arguments:
>>>> > > > >>> * functions: things like Initializer, Aggregator, ValueJoiner,
>>>> > > > >>> ForEachAction... All of these are essentially Streams-flavored
>>>> > > Function
>>>> > > > >>> interfaces with different arities, type bounds, and semantics.
>>>> > > > >>> * config objects: things like Produced, Consumed, Joined,
>>>> > Grouped...
>>>> > > > >> These
>>>> > > > >>> are containers for configurations, where the target of the
>>>> > > > configuration
>>>> > > > >> is
>>>> > > > >>> the operation itself
>>>> > > > >>> * raw configurations: things like a raw topic-name string and
>>>> > > > >> Materialized:
>>>> > > > >>> These are configurations for operations that have no config
>>>> object,
>>>> > > and
>>>> > > > >> for
>>>> > > > >>> various reasons, we didn't make one. The distinguishing
>>>> feature is
>>>> > > that
>>>> > > > >> the
>>>> > > > >>> target of the configuration is not the operation itself, but
>>>> some
>>>> > > > aspect
>>>> > > > >> of
>>>> > > > >>> it. For example, in Materialized, we are not setting the
>>>> caching
>>>> > > > behavior
>>>> > > > >>> of, for example, an aggregation; we're setting the caching
>>>> behavior
>>>> > > of
>>>> > > > a
>>>> > > > >>> materialized state store attached to the aggregation.
>>>> > > > >>>
>>>> > > > >>> It seems like choosing to mix the Named interface in with the
>>>> > > functions
>>>> > > > >> has
>>>> > > > >>> a couple of unfortunate side-effects:
>>>> > > > >>> * Aggregator is not the only function passed to any of the
>>>> relevant
>>>> > > > >>> aggregate methods, so it seems a little arbitrary to pick that
>>>> > > function
>>>> > > > >>> over Initializer or Merger.
>>>> > > > >>> * As you noted, branch() takes an array of Predicate, so we
>>>> just
>>>> > > ignore
>>>> > > > >> the
>>>> > > > >>> provided name(s), even though Predicate names are used
>>>> elsewhere.
>>>> > > > >>> * Not all things that we want to name have function arguments,
>>>> > > notably
>>>> > > > >>> source and sink, so we'd switch paradigms and use the config
>>>> object
>>>> > > > >>> instead.
>>>> > > > >>> * Adding an extra method to the function interfaces means that
>>>> > those
>>>> > > > are
>>>> > > > >> no
>>>> > > > >>> longer SAM interfaces. You proposed to add a default
>>>> > implementation,
>>>> > > so
>>>> > > > >> we
>>>> > > > >>> could still pass a lambda if we don't want to set the name,
>>>> but if
>>>> > we
>>>> > > > >> *do*
>>>> > > > >>> want to set the name, we can no longer use lambdas.
>>>> > > > >>>
>>>> > > > >>> I think the obvious other choice would be to mix Named in
>>>> with the
>>>> > > > config
>>>> > > > >>> objects instead, but this has one main downside of its own...
>>>> > > > >>> * not every operator we wish to name has a config object. I
>>>> don't
>>>> > > know
>>>> > > > if
>>>> > > > >>> everyone involved is comfortable with adding a config object
>>>> to
>>>> > every
>>>> > > > >>> operator that's missing one.
>>>> > > > >>>
>>>> > > > >>> Personally, I favor moving toward a more consistent state
>>>> that's
>>>> > > > forward
>>>> > > > >>> compatible with any further changes we wish to make. I
>>>> *think* that
>>>> > > > >> giving
>>>> > > > >>> every operator two forms (one with no config and one with a
>>>> config
>>>> > > > >> object)
>>>> > > > >>> would be such an API.
>>>> > > > >>>
>>>> > > > >>> Comment 2 =========
>>>> > > > >>>
>>>> > > > >>> Finally, just a minor comment: the static method in Named
>>>> wouldn't
>>>> > > work
>>>> > > > >>> properly as defined. Assuming that we mix Named in with
>>>> Produced,
>>>> > for
>>>> > > > >>> example, we'd need to be able to use it like:
>>>> > > > >>>>  kStream.to("out", Produced.with("myOut"))
>>>> > > > >>> This doesn't work because with() returns a Named, but we need
>>>> a
>>>> > > > Produced.
>>>> > > > >>>
>>>> > > > >>> We can pull off a builder method in the interface, but not a
>>>> static
>>>> > > > >> method.
>>>> > > > >>> To define a builder method in the interface that returns an
>>>> > instance
>>>> > > of
>>>> > > > >> the
>>>> > > > >>> concrete subtype, you have to use the "curiously recurring
>>>> generic"
>>>> > > > >>> pattern.
>>>> > > > >>>
>>>> > > > >>> It would look like:
>>>> > > > >>>
>>>> > > > >>> public interface Named<N extends Named<N>> {
>>>> > > > >>>   String name();
>>>> > > > >>>   N withName(String name);
>>>> > > > >>> }
>>>> > > > >>>
>>>> > > > >>> You can see where the name of the pattern comes from ;)
>>>> > > > >>> An implementation would then look like:
>>>> > > > >>>
>>>> > > > >>> public class Produced implements Named<Produced> {
>>>> > > > >>>   String name() { return name; }
>>>> > > > >>>   Produced withName(final String name) { this.name = name;
>>>> return
>>>> > > > this;
>>>> > > > >> }
>>>> > > > >>> }
>>>> > > > >>>
>>>> > > > >>> Note that the generic parameter gets filled in properly in the
>>>> > > > >> implementing
>>>> > > > >>> class, so that you get the right return type out.
>>>> > > > >>>
>>>> > > > >>> It doesn't work at all with a static factory method at the
>>>> > interface
>>>> > > > >> level,
>>>> > > > >>> so it would be up to Produced to define a static factory if it
>>>> > wants
>>>> > > to
>>>> > > > >>> present one.
>>>> > > > >>>
>>>> > > > >>> ======
>>>> > > > >>>
>>>> > > > >>> Those are my two feedbacks!
>>>> > > > >>>
>>>> > > > >>> I hope you find this helpful, rather than frustrating. I'm
>>>> sorry I
>>>> > > > didn't
>>>> > > > >>> get a chance to comment sooner.
>>>> > > > >>>
>>>> > > > >>> Thanks for the KIP, I think it will be much nicer to be able
>>>> to
>>>> > name
>>>> > > > the
>>>> > > > >>> processor nodes.
>>>> > > > >>>
>>>> > > > >>> -John
>>>> > > > >>>
>>>> > > > >>> On Tue, Nov 27, 2018 at 6:34 PM Guozhang Wang <
>>>> [email protected]>
>>>> > > > >> wrote:
>>>> > > > >>>
>>>> > > > >>>> Hi Florian,
>>>> > > > >>>>
>>>> > > > >>>> I've made a pass over the PR. There are some comments that
>>>> are
>>>> > > related
>>>> > > > >> to
>>>> > > > >>>> the function names which may be affecting the KIP wiki page,
>>>> but
>>>> > > > >> overall
>>>> > > > >>> I
>>>> > > > >>>> think it looks good already.
>>>> > > > >>>>
>>>> > > > >>>>
>>>> > > > >>>> Guozhang
>>>> > > > >>>>
>>>> > > > >>>>
>>>> > > > >>>> On Fri, Nov 16, 2018 at 4:21 PM Guozhang Wang <
>>>> [email protected]
>>>> > >
>>>> > > > >>> wrote:
>>>> > > > >>>>
>>>> > > > >>>>> Thanks Florian! I will take a look at the PR.
>>>> > > > >>>>>
>>>> > > > >>>>>
>>>> > > > >>>>>
>>>> > > > >>>>> On Mon, Nov 12, 2018 at 2:44 PM Florian Hussonnois <
>>>> > > > >>>> [email protected]>
>>>> > > > >>>>> wrote:
>>>> > > > >>>>>
>>>> > > > >>>>>> Hi Matthias,
>>>> > > > >>>>>>
>>>> > > > >>>>>> Sorry I was absent for a while. I have started a new PR
>>>> for this
>>>> > > > >> KIP.
>>>> > > > >>> It
>>>> > > > >>>>>> is
>>>> > > > >>>>>> still in progress for now. I'm working on it.
>>>> > > > >>>>>> https://github.com/apache/kafka/pull/5909
>>>> > > > >>>>>>
>>>> > > > >>>>>> Le ven. 19 oct. 2018 à 20:13, Matthias J. Sax <
>>>> > > > >> [email protected]>
>>>> > > > >>> a
>>>> > > > >>>>>> écrit :
>>>> > > > >>>>>>
>>>> > > > >>>>>>> What is the status of this KIP?
>>>> > > > >>>>>>>
>>>> > > > >>>>>>> -Matthias
>>>> > > > >>>>>>>
>>>> > > > >>>>>>> On 7/19/18 5:17 PM, Guozhang Wang wrote:
>>>> > > > >>>>>>>> Hello Florian,
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>> Sorry for being late... Found myself keep apologizing
>>>> for late
>>>> > > > >>>> replies
>>>> > > > >>>>>>>> these days. But I do want to push this KIP's progress
>>>> forward
>>>> > > > >> as I
>>>> > > > >>>>>> see it
>>>> > > > >>>>>>>> very important and helpful feature for extensibility.
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>> About the exceptions, I've gone through them and
>>>> hopefully it
>>>> > is
>>>> > > > >>> an
>>>> > > > >>>>>>>> exhaustive list:
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>> 1. KTable#toStream()
>>>> > > > >>>>>>>> 2. KStream#merge(KStream)
>>>> > > > >>>>>>>> 3. KStream#process() / transform() / transformValues()
>>>> > > > >>>>>>>> 4. KGroupedTable / KGroupedStream#count()
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>> Here's my reasoning:
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>> * It is okay not letting users to override the name for
>>>> 1/2,
>>>> > > > >> since
>>>> > > > >>>>>> they
>>>> > > > >>>>>>> are
>>>> > > > >>>>>>>> too trivial to be useful for debugging, plus their
>>>> processor
>>>> > > > >> names
>>>> > > > >>>>>> would
>>>> > > > >>>>>>>> not determine any related topic / store names.
>>>> > > > >>>>>>>> * For 3, I'd vote for adding overloaded functions with
>>>> Named.
>>>> > > > >>>>>>>> * For 4, if users really want to name the processor she
>>>> can
>>>> > call
>>>> > > > >>>>>>>> aggregate() instead, so I think it is okay to skip this
>>>> case.
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>> Guozhang
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>> On Fri, Jul 6, 2018 at 3:06 PM, Florian Hussonnois <
>>>> > > > >>>>>>> [email protected]>
>>>> > > > >>>>>>>> wrote:
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>>> Hi,
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> The option #3 seems to be a good alternative and I find
>>>> the
>>>> > API
>>>> > > > >>>> more
>>>> > > > >>>>>>>>> elegant (thanks John).
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> But, we still have the need to overload some methods
>>>> either
>>>> > > > >>> because
>>>> > > > >>>>>>> they do
>>>> > > > >>>>>>>>> not accept an action instance or because they are
>>>> translated
>>>> > to
>>>> > > > >>>>>> multiple
>>>> > > > >>>>>>>>> processors.
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> For example, this is the case for methods branch() and
>>>> > merge().
>>>> > > > >>> We
>>>> > > > >>>>>> could
>>>> > > > >>>>>>>>> introduce a new interface Named (or maybe a different
>>>> name ?)
>>>> > > > >>> with
>>>> > > > >>>> a
>>>> > > > >>>>>>> method
>>>> > > > >>>>>>>>> name(). All action interfaces could extend this one to
>>>> > > > >> implement
>>>> > > > >>>> the
>>>> > > > >>>>>>> option
>>>> > > > >>>>>>>>> 3).
>>>> > > > >>>>>>>>> This would result by having the following overloads  :
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> Stream<K, V> merge(final Named name, final KStream<K, V>
>>>> > > > >> stream);
>>>> > > > >>>>>>>>> KStream<K, V>[] branch(final Named name, final
>>>> Predicate<?
>>>> > > > >> super
>>>> > > > >>>> K, ?
>>>> > > > >>>>>>> super
>>>> > > > >>>>>>>>> V>... predicates)
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> N.B : The list above is  not exhaustive
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> ---------
>>>> > > > >>>>>>>>> user's code will become :
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>>         KStream<String, Integer> stream =
>>>> > > > >> builder.stream("test");
>>>> > > > >>>>>>>>>         KStream<String, Integer>[] branches =
>>>> > > > >>>>>>>>> stream.branch(Named.with("BRANCH-STREAM-ON-VALUE"),
>>>> > > > >>>>>>>>>                 Predicate.named("STREAM-PAIR-VALUE",
>>>> (k, v)
>>>> > ->
>>>> > > > >> v
>>>> > > > >>> %
>>>> > > > >>>> 2
>>>> > > > >>>>>> ==
>>>> > > > >>>>>>>>> 0),
>>>> > > > >>>>>>>>>                 Predicate.named("STREAM-IMPAIR-VALUE",
>>>> (k, v)
>>>> > > > >> ->
>>>> > > > >>> v
>>>> > > > >>>> %
>>>> > > > >>>>>> 2
>>>> > > > >>>>>>> !=
>>>> > > > >>>>>>>>> 0));
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>>         branches[0].to("pair");
>>>> > > > >>>>>>>>>         branches[1].to("impair");
>>>> > > > >>>>>>>>> ---------
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> This is a mix of the options 3) and 1)
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> Le ven. 6 juil. 2018 à 22:58, Guozhang Wang <
>>>> > > > >> [email protected]>
>>>> > > > >>> a
>>>> > > > >>>>>>> écrit :
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>>> Hi folks, just to summarize the options we have so far:
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> 1) Add a new "as" for KTable / KStream, plus adding new
>>>> > fields
>>>> > > > >>> for
>>>> > > > >>>>>>>>>> operators-returns-void control objects (the current
>>>> wiki's
>>>> > > > >>>>>> proposal).
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> Pros: no more overloads.
>>>> > > > >>>>>>>>>> Cons: a bit departing with the current high-level API
>>>> design
>>>> > > > >> of
>>>> > > > >>>> the
>>>> > > > >>>>>>> DSL,
>>>> > > > >>>>>>>>>> plus, the inconsistency between operators-returns-void
>>>> and
>>>> > > > >>>>>>>>>> operators-not-return-voids.
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> 2) Add overloaded functions for all operators, that
>>>> accepts
>>>> > a
>>>> > > > >>> new
>>>> > > > >>>>>>> control
>>>> > > > >>>>>>>>>> object "Described".
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> Pros: consistent with current APIs.
>>>> > > > >>>>>>>>>> Cons: lots of overloaded functions to add.
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> 3) Add another default function in the interface
>>>> (thank you
>>>> > > > >> J8!)
>>>> > > > >>>> as
>>>> > > > >>>>>>> John
>>>> > > > >>>>>>>>>> proposed.
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> Pros: no overloaded functions, no "Described".
>>>> > > > >>>>>>>>>> Cons: do we lose lambda functions really (seems not if
>>>> we
>>>> > > > >>> provide
>>>> > > > >>>> a
>>>> > > > >>>>>>>>> "named"
>>>> > > > >>>>>>>>>> for each func)? Plus "Described" may be more extensible
>>>> > than a
>>>> > > > >>>>>> single
>>>> > > > >>>>>>>>>> `String`.
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> My principle of considering which one is better depends
>>>> > > > >>> primarily
>>>> > > > >>>> on
>>>> > > > >>>>>>> "how
>>>> > > > >>>>>>>>>> to make advanced users easily use the additional API,
>>>> while
>>>> > > > >>>> keeping
>>>> > > > >>>>>> it
>>>> > > > >>>>>>>>>> hidden from normal users who do not care at all". For
>>>> that
>>>> > > > >>>> purpose I
>>>> > > > >>>>>>>>> think
>>>> > > > >>>>>>>>>> 3) > 1) > 2).
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> One caveat though, is that changing the interface
>>>> would not
>>>> > be
>>>> > > > >>>>>>>>>> binary-compatible though source-compatible, right? I.e.
>>>> > users
>>>> > > > >>> need
>>>> > > > >>>>>> to
>>>> > > > >>>>>>>>>> recompile their code though no changes needed.
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> Another note: for 3), if we really want to keep
>>>> > extensibility
>>>> > > > >> of
>>>> > > > >>>>>>>>> Described
>>>> > > > >>>>>>>>>> we could do sth. like:
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> ---------
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> public interface Predicate<K, V> {
>>>> > > > >>>>>>>>>>     // existing method
>>>> > > > >>>>>>>>>>     boolean test(final K key, final V value);
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>     // new default method adds the ability to name the
>>>> > > > >> predicate
>>>> > > > >>>>>>>>>>     default Described described() {
>>>> > > > >>>>>>>>>>         return new Described(null);
>>>> > > > >>>>>>>>>>     }
>>>> > > > >>>>>>>>>> }
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> ----------
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> where user's code becomes:
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> stream.filter(named("key", (k, v) -> true));   // note
>>>> > `named`
>>>> > > > >>> now
>>>> > > > >>>>>> just
>>>> > > > >>>>>>>>>> sets a Described("key") in "described()".
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> stream.filter(described(Described.as("key", /* any
>>>> other
>>>> > fancy
>>>> > > > >>>>>>>>> parameters
>>>> > > > >>>>>>>>>> in the future*/), (k, v) -> true));
>>>> > > > >>>>>>>>>> ----------
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> I feel it is not much likely that we'd need to extend
>>>> it
>>>> > > > >> further
>>>> > > > >>>> in
>>>> > > > >>>>>> the
>>>> > > > >>>>>>>>>> future, so just a `String` would be good enough. But
>>>> just
>>>> > > > >>> listing
>>>> > > > >>>>>> all
>>>> > > > >>>>>>>>>> possibilities here.
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> Guozhang
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> On Fri, Jul 6, 2018 at 8:19 AM, John Roesler <
>>>> > > > >> [email protected]
>>>> > > > >>>>
>>>> > > > >>>>>>> wrote:
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>> Hi Florian,
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> Sorry I'm late to the party, but I missed the message
>>>> > > > >>> originally.
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> Regarding the names, it's probably a good idea to
>>>> stick to
>>>> > > > >> the
>>>> > > > >>>> same
>>>> > > > >>>>>>>>>>> character set we're currently using: letters,
>>>> numbers, and
>>>> > > > >>>> hyphens.
>>>> > > > >>>>>>> The
>>>> > > > >>>>>>>>>>> names are used in Kafka topics, files and folders, and
>>>> > > > >> RocksDB
>>>> > > > >>>>>>>>> databases,
>>>> > > > >>>>>>>>>>> and we also need them to work with the file systems of
>>>> > > > >> Windows,
>>>> > > > >>>>>> Linux,
>>>> > > > >>>>>>>>>> and
>>>> > > > >>>>>>>>>>> MacOS. My opinion is that with a situation like that,
>>>> it's
>>>> > > > >>> better
>>>> > > > >>>>>> to
>>>> > > > >>>>>>> be
>>>> > > > >>>>>>>>>>> conservative. It might also be a good idea to impose
>>>> an
>>>> > upper
>>>> > > > >>>>>> limit on
>>>> > > > >>>>>>>>>> name
>>>> > > > >>>>>>>>>>> length to avoid running afoul of any of those systems.
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> ---
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> It seems like there's a small debate between 1)
>>>> adding a
>>>> > new
>>>> > > > >>>>>> method to
>>>> > > > >>>>>>>>>>> KStream (and maybe KTable) to modify its name after
>>>> the
>>>> > fact,
>>>> > > > >>> or
>>>> > > > >>>> 2)
>>>> > > > >>>>>>>>>>> piggy-backing on the config objects where they exist
>>>> and
>>>> > > > >> adding
>>>> > > > >>>> one
>>>> > > > >>>>>>>>> where
>>>> > > > >>>>>>>>>>> they don't. To me, #2 is the better alternative even
>>>> though
>>>> > > > >> it
>>>> > > > >>>>>>> produces
>>>> > > > >>>>>>>>>>> more overloads and may be a bit awkward in places.
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> The reason is simply that #1 is a high-level
>>>> departure from
>>>> > > > >> the
>>>> > > > >>>>>>>>>>> graph-building paradigm we're using in the DSL.
>>>> Consider:
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> Graph.node1(config).node2(config)
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> vs
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> Graph.node1().config().node2().config()
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> We could have done either, but we picked the former. I
>>>> > think
>>>> > > > >>> it's
>>>> > > > >>>>>>>>>> probably
>>>> > > > >>>>>>>>>>> a good goal to try and stick to it so that developers
>>>> can
>>>> > > > >>> develop
>>>> > > > >>>>>> and
>>>> > > > >>>>>>>>>> rely
>>>> > > > >>>>>>>>>>> on their instincts for how the DSL will behave.
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> I do want to present one alternative to adding new
>>>> config
>>>> > > > >>>> objects:
>>>> > > > >>>>>> we
>>>> > > > >>>>>>>>> can
>>>> > > > >>>>>>>>>>> just add a "name()" method to all our "action"
>>>> interfaces.
>>>> > > > >> For
>>>> > > > >>>>>>> example,
>>>> > > > >>>>>>>>>>> I'll demonstrate how we can add a "name" to Predicate
>>>> and
>>>> > > > >> then
>>>> > > > >>>> use
>>>> > > > >>>>>> it
>>>> > > > >>>>>>>>> to
>>>> > > > >>>>>>>>>>> name a "KStream#filter" DSL operator:
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> public interface Predicate<K, V> {
>>>> > > > >>>>>>>>>>>     // existing method
>>>> > > > >>>>>>>>>>>     boolean test(final K key, final V value);
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>>     // new default method adds the ability to name the
>>>> > > > >>> predicate
>>>> > > > >>>>>>>>>>>     default String name() {
>>>> > > > >>>>>>>>>>>         return null;
>>>> > > > >>>>>>>>>>>     }
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>>     // new static factory method adds the ability to
>>>> wrap
>>>> > > > >>> lambda
>>>> > > > >>>>>>>>>> predicates
>>>> > > > >>>>>>>>>>> with a named predicate
>>>> > > > >>>>>>>>>>>     static <K, V> Predicate<K, V> named(final String
>>>> name,
>>>> > > > >>> final
>>>> > > > >>>>>>>>>>> Predicate<K, V> predicate) {
>>>> > > > >>>>>>>>>>>         return new Predicate<K, V>() {
>>>> > > > >>>>>>>>>>>             @Override
>>>> > > > >>>>>>>>>>>             public boolean test(final K key, final V
>>>> > value) {
>>>> > > > >>>>>>>>>>>                 return predicate.test(key, value);
>>>> > > > >>>>>>>>>>>             }
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>>             @Override
>>>> > > > >>>>>>>>>>>             public String name() {
>>>> > > > >>>>>>>>>>>                 return name;
>>>> > > > >>>>>>>>>>>             }
>>>> > > > >>>>>>>>>>>         };
>>>> > > > >>>>>>>>>>>     }
>>>> > > > >>>>>>>>>>> }
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> Then, here's how it would look to use it:
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> // Anonymous predicates continue to work just fine
>>>> > > > >>>>>>>>>>> stream.filter((k, v) -> true);
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> // Devs can swap in a Predicate that implements the
>>>> name()
>>>> > > > >>>> method.
>>>> > > > >>>>>>>>>>> stream.filter(new Predicate<Object, Object>() {
>>>> > > > >>>>>>>>>>>     @Override
>>>> > > > >>>>>>>>>>>     public boolean test(final Object key, final Object
>>>> > > > >> value) {
>>>> > > > >>>>>>>>>>>         return true;
>>>> > > > >>>>>>>>>>>     }
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>>     @Override
>>>> > > > >>>>>>>>>>>     public String name() {
>>>> > > > >>>>>>>>>>>         return "hey";
>>>> > > > >>>>>>>>>>>     }
>>>> > > > >>>>>>>>>>> });
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> // Or they can wrap their existing lambda using the
>>>> static
>>>> > > > >>>> factory
>>>> > > > >>>>>>>>> method
>>>> > > > >>>>>>>>>>> stream.filter(named("key", (k, v) -> true));
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> Just a thought.
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> Overall, I think it's really valuable to be able to
>>>> name
>>>> > the
>>>> > > > >>>>>>>>> processors,
>>>> > > > >>>>>>>>>>> for all the reasons you mentioned in the KIP. So
>>>> thank you
>>>> > > > >> for
>>>> > > > >>>>>>>>>> introducing
>>>> > > > >>>>>>>>>>> this!
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> Thanks,
>>>> > > > >>>>>>>>>>> -John
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> On Thu, Jul 5, 2018 at 4:53 PM Florian Hussonnois <
>>>> > > > >>>>>>>>> [email protected]
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> wrote:
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>>> Hi, thank you very much for all you suggestions. I've
>>>> > > > >> started
>>>> > > > >>> to
>>>> > > > >>>>>>>>> update
>>>> > > > >>>>>>>>>>> the
>>>> > > > >>>>>>>>>>>> KIP (
>>>> > > > >>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>> > > > >>>>>>>>>>>
>>>> > > > >> 307%3A+Allow+to+define+custom+processor+names+with+KStreams+DSL
>>>> > > > >>>>>>>>>>>> ).
>>>> > > > >>>>>>>>>>>> Also, I propose to rename the Processed class into
>>>> > > > >> Described -
>>>> > > > >>>>>> this
>>>> > > > >>>>>>>>>> will
>>>> > > > >>>>>>>>>>> be
>>>> > > > >>>>>>>>>>>> more meaningful (but this is just a detail).
>>>> > > > >>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>> I'm OK to not enforcing uppercase for specific names
>>>> but
>>>> > > > >>> should
>>>> > > > >>>> we
>>>> > > > >>>>>>>>>> allow
>>>> > > > >>>>>>>>>>>> arbitrary names with whitespaces for example ?
>>>> Currently,
>>>> > I
>>>> > > > >>>> can't
>>>> > > > >>>>>>>>> tell
>>>> > > > >>>>>>>>>> if
>>>> > > > >>>>>>>>>>>> this can lead to some side effects ?
>>>> > > > >>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>> Le lun. 11 juin 2018 à 01:31, Matthias J. Sax <
>>>> > > > >>>>>> [email protected]
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> a
>>>> > > > >>>>>>>>>>>> écrit :
>>>> > > > >>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>> Just catching up on this thread.
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>> I like the general idea. Couple of comments:
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>  - I think that adding `Processed` (or maybe a
>>>> different
>>>> > > > >>> name?)
>>>> > > > >>>>>> is
>>>> > > > >>>>>>>>> a
>>>> > > > >>>>>>>>>>>>> valid proposal for stateless operators that only
>>>> have a
>>>> > > > >>> single
>>>> > > > >>>>>>>>>> overload
>>>> > > > >>>>>>>>>>>>> atm. It would align with the overall API design.
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>  - for all methods with multiple existing
>>>> overloads, we
>>>> > can
>>>> > > > >>>>>>>>> consider
>>>> > > > >>>>>>>>>> to
>>>> > > > >>>>>>>>>>>>> extend `Consumed`, `Produced`, `Materialized` etc
>>>> to take
>>>> > > > >> an
>>>> > > > >>>>>>>>>> additional
>>>> > > > >>>>>>>>>>>>> processor name (not sure atm how elegant this is; we
>>>> > would
>>>> > > > >>> need
>>>> > > > >>>>>> to
>>>> > > > >>>>>>>>>>>>> "play" with the API a little bit; the advantage
>>>> would be,
>>>> > > > >>> that
>>>> > > > >>>> we
>>>> > > > >>>>>>>>> do
>>>> > > > >>>>>>>>>>> not
>>>> > > > >>>>>>>>>>>>> add more overloads what seems to be key for this
>>>> KIP)
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>  - operators return void: while I agree that the
>>>> "name
>>>> > > > >> first"
>>>> > > > >>>>>>>>>> chaining
>>>> > > > >>>>>>>>>>>>> idea is not very intuitive, it might still work, if
>>>> we
>>>> > name
>>>> > > > >>> the
>>>> > > > >>>>>>>>>> method
>>>> > > > >>>>>>>>>>>>> correctly (again, we would need to "play" with the
>>>> API a
>>>> > > > >>> little
>>>> > > > >>>>>> bit
>>>> > > > >>>>>>>>>> to
>>>> > > > >>>>>>>>>>>> see)
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>  - for DSL operators that are translated to multiple
>>>> > nodes:
>>>> > > > >>> it
>>>> > > > >>>>>>>>> might
>>>> > > > >>>>>>>>>>>>> make sense to use the specified operator name as
>>>> prefix
>>>> > and
>>>> > > > >>> add
>>>> > > > >>>>>>>>>>>>> reasonable suffixes. For example, a join translates
>>>> into
>>>> > 5
>>>> > > > >>>>>>>>> operators
>>>> > > > >>>>>>>>>>>>> that could be name "name-left-store-processor",
>>>> > > > >>>>>>>>>>>>> "name-left-join-processor",
>>>> "name-right-store-processor",
>>>> > > > >>>>>>>>>>>>> "name-right-join-processor", and
>>>> > > > >> "name-join-merge-processor"
>>>> > > > >>>> (or
>>>> > > > >>>>>>>>>>>>> similar). Maybe just using numbers might also work.
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>  - I think, we should strip the number suffixes if
>>>> a user
>>>> > > > >>>>>> provides
>>>> > > > >>>>>>>>>>> names
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>  - enforcing upper case seems to be tricky: for
>>>> example,
>>>> > we
>>>> > > > >>> do
>>>> > > > >>>>>> not
>>>> > > > >>>>>>>>>>>>> enforce upper case for store names and we cannot
>>>> easily
>>>> > > > >>> change
>>>> > > > >>>> it
>>>> > > > >>>>>>>>> as
>>>> > > > >>>>>>>>>> it
>>>> > > > >>>>>>>>>>>>> would break compatibility -- thus, for consistency
>>>> > reasons
>>>> > > > >> we
>>>> > > > >>>>>> might
>>>> > > > >>>>>>>>>> not
>>>> > > > >>>>>>>>>>>>> want to do this
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>  - for better understand of the impact of the KIP,
>>>> it
>>>> > would
>>>> > > > >>> be
>>>> > > > >>>>>>>>> quite
>>>> > > > >>>>>>>>>>>>> helpful if you would list all method names that are
>>>> > > > >> affected
>>>> > > > >>> in
>>>> > > > >>>>>> the
>>>> > > > >>>>>>>>>> KIP
>>>> > > > >>>>>>>>>>>>> (ie, list all newly added overloads)
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>> -Matthias
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>> On 5/31/18 6:40 PM, Guozhang Wang wrote:
>>>> > > > >>>>>>>>>>>>>> Hi Florian,
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>> Re 1: I think changing the KStreamImpl /
>>>> KTableImpl to
>>>> > > > >> allow
>>>> > > > >>>>>>>>>>> modifying
>>>> > > > >>>>>>>>>>>>> the
>>>> > > > >>>>>>>>>>>>>> processor name after the operator is fine as long
>>>> as we
>>>> > do
>>>> > > > >>> the
>>>> > > > >>>>>>>>>> check
>>>> > > > >>>>>>>>>>>>> again
>>>> > > > >>>>>>>>>>>>>> when modifying that. In fact, we are having some
>>>> > topology
>>>> > > > >>>>>>>>>>> optimization
>>>> > > > >>>>>>>>>>>>>> going on which may modify processor names in the
>>>> final
>>>> > > > >>>> topology
>>>> > > > >>>>>>>>>>>> anyways (
>>>> > > > >>>>>>>>>>>>>> https://github.com/apache/kafka/pull/4983).
>>>> > Semantically
>>>> > > > >> I
>>>> > > > >>>>>> think
>>>> > > > >>>>>>>>>> it
>>>> > > > >>>>>>>>>>> is
>>>> > > > >>>>>>>>>>>>>> easier to understand to developers than "deciding
>>>> the
>>>> > > > >>>> processor
>>>> > > > >>>>>>>>>> name
>>>> > > > >>>>>>>>>>>> for
>>>> > > > >>>>>>>>>>>>>> the next operator".
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>> Re 2: Yeah I'm thinking that for operators that
>>>> > translates
>>>> > > > >>> to
>>>> > > > >>>>>>>>>>> multiple
>>>> > > > >>>>>>>>>>>>>> processor names, we can still use the provided
>>>> "hint" to
>>>> > > > >>> name
>>>> > > > >>>>>> the
>>>> > > > >>>>>>>>>>>>> processor
>>>> > > > >>>>>>>>>>>>>> names, e.g. for Joins we can name them as
>>>> > `join-foo-this`
>>>> > > > >>> and
>>>> > > > >>>>>>>>>>>>>> `join-foo-that` etc if user calls `as("foo")`.
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>> Re 3: The motivation I had about removing the
>>>> suffix is
>>>> > > > >> that
>>>> > > > >>>> it
>>>> > > > >>>>>>>>> has
>>>> > > > >>>>>>>>>>>> huge
>>>> > > > >>>>>>>>>>>>>> restrictions on topology compatibilities: consider
>>>> if
>>>> > user
>>>> > > > >>>> code
>>>> > > > >>>>>>>>>>> added a
>>>> > > > >>>>>>>>>>>>> new
>>>> > > > >>>>>>>>>>>>>> operator, or library does some optimization to
>>>> remove
>>>> > some
>>>> > > > >>>>>>>>>> operators,
>>>> > > > >>>>>>>>>>>> the
>>>> > > > >>>>>>>>>>>>>> suffix indexing may be changed for a large amount
>>>> of the
>>>> > > > >>>>>>>>> processor
>>>> > > > >>>>>>>>>>>> names:
>>>> > > > >>>>>>>>>>>>>> this will in turn change the internal state store
>>>> names,
>>>> > > > >> as
>>>> > > > >>>> well
>>>> > > > >>>>>>>>> as
>>>> > > > >>>>>>>>>>>>>> internal topic names as well, making the new
>>>> application
>>>> > > > >>>>>> topology
>>>> > > > >>>>>>>>>> to
>>>> > > > >>>>>>>>>>> be
>>>> > > > >>>>>>>>>>>>>> incompatible with the ones. One rationale I had
>>>> about
>>>> > this
>>>> > > > >>> KIP
>>>> > > > >>>>>> is
>>>> > > > >>>>>>>>>>> that
>>>> > > > >>>>>>>>>>>>>> aligned this effort, moving forward we can allow
>>>> users
>>>> > to
>>>> > > > >>>>>>>>> customize
>>>> > > > >>>>>>>>>>>>>> internal names so that they can still be reused
>>>> even
>>>> > with
>>>> > > > >>>>>>>>> topology
>>>> > > > >>>>>>>>>>>>> changes
>>>> > > > >>>>>>>>>>>>>> (e.g. KIP-230), so I think removing the suffix
>>>> index
>>>> > would
>>>> > > > >>> be
>>>> > > > >>>>>>>>> more
>>>> > > > >>>>>>>>>>>>>> applicable in the long run.
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>> Guozhang
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>> On Thu, May 31, 2018 at 3:08 PM, Florian
>>>> Hussonnois <
>>>> > > > >>>>>>>>>>>>> [email protected]>
>>>> > > > >>>>>>>>>>>>>> wrote:
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> Hi ,
>>>> > > > >>>>>>>>>>>>>>> Thank you very much for your feedback.
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> 1/
>>>> > > > >>>>>>>>>>>>>>> I agree that overloading most of the methods with
>>>> a
>>>> > > > >>> Processed
>>>> > > > >>>>>> is
>>>> > > > >>>>>>>>>> not
>>>> > > > >>>>>>>>>>>>> ideal.
>>>> > > > >>>>>>>>>>>>>>> I've started modifying the KStream API and I got
>>>> to the
>>>> > > > >>> same
>>>> > > > >>>>>>>>>>>> conclusion.
>>>> > > > >>>>>>>>>>>>>>> Also ading a new method directly to KStreamImpl
>>>> and
>>>> > > > >>>> KTableImpl
>>>> > > > >>>>>>>>>>> classes
>>>> > > > >>>>>>>>>>>>>>> seems to be a better option.
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> However a processor name cannot be redefined after
>>>> > > > >> calling
>>>> > > > >>> an
>>>> > > > >>>>>>>>>>> operator
>>>> > > > >>>>>>>>>>>>> (or
>>>> > > > >>>>>>>>>>>>>>> maybe I miss something in the code).
>>>> > > > >>>>>>>>>>>>>>> From my understanding, this will only set the
>>>> KStream
>>>> > > > >> name
>>>> > > > >>>>>>>>>> property
>>>> > > > >>>>>>>>>>>> not
>>>> > > > >>>>>>>>>>>>> the
>>>> > > > >>>>>>>>>>>>>>> processor name previsouly added to the topology
>>>> > builder -
>>>> > > > >>>>>>>>> leading
>>>> > > > >>>>>>>>>> to
>>>> > > > >>>>>>>>>>>>>>> InvalidTopology exception.
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> So the new method should actually defines the
>>>> name of
>>>> > the
>>>> > > > >>>> next
>>>> > > > >>>>>>>>>>>>> processor :
>>>> > > > >>>>>>>>>>>>>>> Below is an example :
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> *stream.as <http://stream.as
>>>> > > > >>>>>>>>>>> (Processed.name("MAPPE_TO_UPPERCASE")*
>>>> > > > >>>>>>>>>>>>>>> *          .map( (k, v) -> KeyValue.pair(k,
>>>> > > > >>>> v.toUpperCase()))*
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> I think this approach could solve the cases for
>>>> methods
>>>> > > > >>>>>>>>> returning
>>>> > > > >>>>>>>>>>>> void ?
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> Regarding this new method we have two possible
>>>> > > > >>>> implementations
>>>> > > > >>>>>> :
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>    1. Adding a method like : withName(String
>>>> > > > >> processorName)
>>>> > > > >>>>>>>>>>>>>>>    2. or adding a method accepting an Processed
>>>> object
>>>> > :
>>>> > > > >>>>>>>>>>>> as(Processed).
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> I think solution 2. is preferable as the Processed
>>>> > class
>>>> > > > >>>> could
>>>> > > > >>>>>>>>> be
>>>> > > > >>>>>>>>>>>>> enriched
>>>> > > > >>>>>>>>>>>>>>> further (in futur).
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> 2/
>>>> > > > >>>>>>>>>>>>>>> As Guozhang said some operators add internal
>>>> > processors.
>>>> > > > >>>>>>>>>>>>>>> For example the branch() method create one
>>>> > KStreamBranch
>>>> > > > >>>>>>>>> processor
>>>> > > > >>>>>>>>>>> to
>>>> > > > >>>>>>>>>>>>> route
>>>> > > > >>>>>>>>>>>>>>> records and one KStreamPassThrough processor for
>>>> each
>>>> > > > >>> branch.
>>>> > > > >>>>>>>>>>>>>>> In that situation only the parent processor can be
>>>> > named.
>>>> > > > >>> For
>>>> > > > >>>>>>>>>>> children
>>>> > > > >>>>>>>>>>>>>>> processors we could keep the current behaviour
>>>> that
>>>> > add a
>>>> > > > >>>>>> suffix
>>>> > > > >>>>>>>>>>> (i.e
>>>> > > > >>>>>>>>>>>>>>> KSTREAM-BRANCHCHILD-)
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> This also the case for the join() method that
>>>> result to
>>>> > > > >>>> adding
>>>> > > > >>>>>>>>>>>> multiple
>>>> > > > >>>>>>>>>>>>>>> processors to the topology (windowing, left/right
>>>> joins
>>>> > > > >>> and a
>>>> > > > >>>>>>>>>> merge
>>>> > > > >>>>>>>>>>>>>>> processor).
>>>> > > > >>>>>>>>>>>>>>> I think, like for the branch method users could
>>>> only
>>>> > > > >>> define a
>>>> > > > >>>>>>>>>>>> processor
>>>> > > > >>>>>>>>>>>>>>> name prefix.
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> 3/
>>>> > > > >>>>>>>>>>>>>>> I think we should  still added a suffix like
>>>> > > > >> "-0000000000"
>>>> > > > >>> to
>>>> > > > >>>>>>>>>>>> processor
>>>> > > > >>>>>>>>>>>>>>> name and enforce uppercases as this will keep some
>>>> > > > >>>> consistency
>>>> > > > >>>>>>>>>> with
>>>> > > > >>>>>>>>>>>> the
>>>> > > > >>>>>>>>>>>>>>> ones generated by the API.
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> 4/
>>>> > > > >>>>>>>>>>>>>>> Yes, the KTable interface should be modified like
>>>> > KStream
>>>> > > > >>> to
>>>> > > > >>>>>>>>> allow
>>>> > > > >>>>>>>>>>>>> custom
>>>> > > > >>>>>>>>>>>>>>> processor names definition.
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> Thanks,
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> Le jeu. 31 mai 2018 à 19:18, Damian Guy <
>>>> > > > >>>> [email protected]>
>>>> > > > >>>>>>>>> a
>>>> > > > >>>>>>>>>>>> écrit
>>>> > > > >>>>>>>>>>>>> :
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>> Hi Florian,
>>>> > > > >>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>> Thanks for the KIP. What about KTable and other
>>>> DSL
>>>> > > > >>>>>> interfaces?
>>>> > > > >>>>>>>>>>> Will
>>>> > > > >>>>>>>>>>>>> they
>>>> > > > >>>>>>>>>>>>>>>> not want to be able to do the same thing?
>>>> > > > >>>>>>>>>>>>>>>> It would be good to see a complete set of the
>>>> public
>>>> > API
>>>> > > > >>>>>>>>> changes.
>>>> > > > >>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>> Cheers,
>>>> > > > >>>>>>>>>>>>>>>> Damian
>>>> > > > >>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>> On Wed, 30 May 2018 at 19:45 Guozhang Wang <
>>>> > > > >>>>>> [email protected]
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>>> wrote:
>>>> > > > >>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>> Hello Florian,
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>> Thanks for the KIP. I have some meta feedbacks
>>>> on the
>>>> > > > >>>>>>>>> proposal:
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>> 1. You mentioned that this `Processed` object
>>>> will be
>>>> > > > >>> added
>>>> > > > >>>>>>>>> to a
>>>> > > > >>>>>>>>>>> new
>>>> > > > >>>>>>>>>>>>>>>>> overloaded variant of all the stateless
>>>> operators,
>>>> > what
>>>> > > > >>>> about
>>>> > > > >>>>>>>>>> the
>>>> > > > >>>>>>>>>>>>>>>> stateful
>>>> > > > >>>>>>>>>>>>>>>>> operators? Would like to hear your opinions if
>>>> you
>>>> > have
>>>> > > > >>>>>>>>> thought
>>>> > > > >>>>>>>>>>>> about
>>>> > > > >>>>>>>>>>>>>>>> that:
>>>> > > > >>>>>>>>>>>>>>>>> note for stateful operators they will usually be
>>>> > mapped
>>>> > > > >>> to
>>>> > > > >>>>>>>>>>> multiple
>>>> > > > >>>>>>>>>>>>>>>>> processor node names, so we probably need to
>>>> come up
>>>> > > > >> with
>>>> > > > >>>>>> some
>>>> > > > >>>>>>>>>>> ways
>>>> > > > >>>>>>>>>>>> to
>>>> > > > >>>>>>>>>>>>>>>>> define all their names.
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>> 2. I share the same concern with Bill as for
>>>> adding
>>>> > > > >> lots
>>>> > > > >>> of
>>>> > > > >>>>>>>>> new
>>>> > > > >>>>>>>>>>>>>>> overload
>>>> > > > >>>>>>>>>>>>>>>>> functions into the stateless operators, as we
>>>> have
>>>> > just
>>>> > > > >>>> spent
>>>> > > > >>>>>>>>>>> quite
>>>> > > > >>>>>>>>>>>>>>> some
>>>> > > > >>>>>>>>>>>>>>>>> effort in trimming them since 1.0.0 release. If
>>>> the
>>>> > > > >> goal
>>>> > > > >>> is
>>>> > > > >>>>>> to
>>>> > > > >>>>>>>>>>> just
>>>> > > > >>>>>>>>>>>>>>>> provide
>>>> > > > >>>>>>>>>>>>>>>>> some "hints" on the generated processor node
>>>> names,
>>>> > not
>>>> > > > >>>>>>>>> strictly
>>>> > > > >>>>>>>>>>>>>>>> enforcing
>>>> > > > >>>>>>>>>>>>>>>>> the exact names that to be generated, then how
>>>> about
>>>> > we
>>>> > > > >>>> just
>>>> > > > >>>>>>>>>> add a
>>>> > > > >>>>>>>>>>>> new
>>>> > > > >>>>>>>>>>>>>>>>> function to `KStream` and `KTable` classes like:
>>>> > > > >>>>>>>>>> "as(Processed)",
>>>> > > > >>>>>>>>>>>> with
>>>> > > > >>>>>>>>>>>>>>>> the
>>>> > > > >>>>>>>>>>>>>>>>> semantics as "the latest operators that
>>>> generate this
>>>> > > > >>>> KStream
>>>> > > > >>>>>>>>> /
>>>> > > > >>>>>>>>>>>> KTable
>>>> > > > >>>>>>>>>>>>>>>> will
>>>> > > > >>>>>>>>>>>>>>>>> be named accordingly to this hint".
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>> The only caveat, is that for all operators like
>>>> > > > >>>> `KStream#to`
>>>> > > > >>>>>>>>> and
>>>> > > > >>>>>>>>>>>>>>>>> `KStream#print` that returns void, this
>>>> alternative
>>>> > > > >> would
>>>> > > > >>>> not
>>>> > > > >>>>>>>>>>> work.
>>>> > > > >>>>>>>>>>>>> But
>>>> > > > >>>>>>>>>>>>>>>> for
>>>> > > > >>>>>>>>>>>>>>>>> the current operators:
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>> a. KStream#print,
>>>> > > > >>>>>>>>>>>>>>>>> b. KStream#foreach,
>>>> > > > >>>>>>>>>>>>>>>>> c. KStream#to,
>>>> > > > >>>>>>>>>>>>>>>>> d. KStream#process
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>> I personally felt that except `KStream#process`
>>>> users
>>>> > > > >>> would
>>>> > > > >>>>>>>>> not
>>>> > > > >>>>>>>>>>>>> usually
>>>> > > > >>>>>>>>>>>>>>>>> bother to override their names, and for
>>>> > > > >> `KStream#process`
>>>> > > > >>>> we
>>>> > > > >>>>>>>>>> could
>>>> > > > >>>>>>>>>>>> add
>>>> > > > >>>>>>>>>>>>>>> an
>>>> > > > >>>>>>>>>>>>>>>>> overload variant with the additional Processed
>>>> > object.
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>> 3. In your example, the processor names are
>>>> still
>>>> > added
>>>> > > > >>>> with
>>>> > > > >>>>>> a
>>>> > > > >>>>>>>>>>>> suffix
>>>> > > > >>>>>>>>>>>>>>>> like
>>>> > > > >>>>>>>>>>>>>>>>> "
>>>> > > > >>>>>>>>>>>>>>>>> -0000000000", is this intentional? If yes, why
>>>> (I
>>>> > > > >> thought
>>>> > > > >>>>>> with
>>>> > > > >>>>>>>>>>> user
>>>> > > > >>>>>>>>>>>>>>>>> specified processor name hints we will not add
>>>> suffix
>>>> > > > >> to
>>>> > > > >>>>>>>>>>> distinguish
>>>> > > > >>>>>>>>>>>>>>>>> different nodes of the same type any more)?
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>> Guozhang
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>> On Tue, May 29, 2018 at 6:47 AM, Bill Bejeck <
>>>> > > > >>>>>>>>> [email protected]
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> wrote:
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>> Hi Florian,
>>>> > > > >>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>> Thanks for the KIP.  I think being able to add
>>>> more
>>>> > > > >>>> context
>>>> > > > >>>>>>>>> to
>>>> > > > >>>>>>>>>>> the
>>>> > > > >>>>>>>>>>>>>>>>>> processor names would be useful.
>>>> > > > >>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>> I like the idea of adding a
>>>> "withProcessorName" to
>>>> > > > >>>> Produced,
>>>> > > > >>>>>>>>>>>> Consumed
>>>> > > > >>>>>>>>>>>>>>>> and
>>>> > > > >>>>>>>>>>>>>>>>>> Joined.
>>>> > > > >>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>> But instead of adding the "Processed"
>>>> parameter to a
>>>> > > > >>> large
>>>> > > > >>>>>>>>>>>> percentage
>>>> > > > >>>>>>>>>>>>>>>> of
>>>> > > > >>>>>>>>>>>>>>>>>> the methods, which would result in overloaded
>>>> > methods
>>>> > > > >>>> (which
>>>> > > > >>>>>>>>> we
>>>> > > > >>>>>>>>>>>>>>> removed
>>>> > > > >>>>>>>>>>>>>>>>>> quite a bit with KIP-182) what do you think of
>>>> > adding
>>>> > > > >> a
>>>> > > > >>>>>>>>> method
>>>> > > > >>>>>>>>>>>>>>>>>> to the AbstractStream class "withName(String
>>>> > > > >>>>>> processorName)"?
>>>> > > > >>>>>>>>>> BTW
>>>> > > > >>>>>>>>>>>> I"m
>>>> > > > >>>>>>>>>>>>>>>> not
>>>> > > > >>>>>>>>>>>>>>>>>> married to the method name, it's the best I
>>>> can do
>>>> > off
>>>> > > > >>> the
>>>> > > > >>>>>>>>> top
>>>> > > > >>>>>>>>>> of
>>>> > > > >>>>>>>>>>>> my
>>>> > > > >>>>>>>>>>>>>>>>> head.
>>>> > > > >>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>> For the methods that return void, we'd have to
>>>> add a
>>>> > > > >>>>>>>>> parameter,
>>>> > > > >>>>>>>>>>> but
>>>> > > > >>>>>>>>>>>>>>>> that
>>>> > > > >>>>>>>>>>>>>>>>>> would at least cut down on the number of
>>>> overloaded
>>>> > > > >>>> methods
>>>> > > > >>>>>>>>> in
>>>> > > > >>>>>>>>>>> the
>>>> > > > >>>>>>>>>>>>>>> API.
>>>> > > > >>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>> Just my 2 cents.
>>>> > > > >>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>> Thanks,
>>>> > > > >>>>>>>>>>>>>>>>>> Bill
>>>> > > > >>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>> On Sun, May 27, 2018 at 4:13 PM, Florian
>>>> Hussonnois
>>>> > <
>>>> > > > >>>>>>>>>>>>>>>>> [email protected]
>>>> > > > >>>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>> wrote:
>>>> > > > >>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>>> Hi,
>>>> > > > >>>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>>> I would like to start a new discussion on
>>>> following
>>>> > > > >>> KIP :
>>>> > > > >>>>>>>>>>>>>>>>>>>
>>>> > > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>> > > > >>>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >> 307%3A+Allow+to+define+custom+processor+names+with+KStreams+DSL
>>>> > > > >>>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>>> This is still a draft.
>>>> > > > >>>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>>> Looking forward for your feedback.
>>>> > > > >>>>>>>>>>>>>>>>>>> --
>>>> > > > >>>>>>>>>>>>>>>>>>> Florian HUSSONNOIS
>>>> > > > >>>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>> --
>>>> > > > >>>>>>>>>>>>>>>>> -- Guozhang
>>>> > > > >>>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>> --
>>>> > > > >>>>>>>>>>>>>>> Florian HUSSONNOIS
>>>> > > > >>>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>> --
>>>> > > > >>>>>>>>>>>> Florian HUSSONNOIS
>>>> > > > >>>>>>>>>>>>
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> --
>>>> > > > >>>>>>>>>> -- Guozhang
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> --
>>>> > > > >>>>>>>>> Florian HUSSONNOIS
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>
>>>> > > > >>>>>>>
>>>> > > > >>>>>>
>>>> > > > >>>>>> --
>>>> > > > >>>>>> Florian HUSSONNOIS
>>>> > > > >>>>>>
>>>> > > > >>>>>
>>>> > > > >>>>>
>>>> > > > >>>>> --
>>>> > > > >>>>> -- Guozhang
>>>> > > > >>>>>
>>>> > > > >>>>
>>>> > > > >>>>
>>>> > > > >>>> --
>>>> > > > >>>> -- Guozhang
>>>> > > > >>>>
>>>> > > > >>>
>>>> > > > >>
>>>> > > > >>
>>>> > > > >> --
>>>> > > > >> -- Guozhang
>>>> > > > >>
>>>> > > > >
>>>> > > > >
>>>> > > >
>>>> > > >
>>>> > >
>>>> >
>>>> >
>>>> > --
>>>> > Florian HUSSONNOIS
>>>> >
>>>>
>>>
>>>
>>> --
>>> -- Guozhang
>>>
>>
>>
>> --
>> -- Guozhang
>>
>
>
> --
> -- Guozhang
>


-- 
Florian HUSSONNOIS

Re: [DISCUSS] KIP-307: Allow to define custom processor names with KStreams DSL

Reply via email to