Re: [DISCUSS] KIP-307: Allow to define custom processor names with KStreams DSL

Florian Hussonnois Sun, 13 Jan 2019 04:23:53 -0800

Hi Matthias,

The reason for overloading the methods with Materialized parameter is
regarding the semantic of this class.
The Materialized class allow to name a queryable store. if a name is set
then it will be used both to name the state-store and  the changelog-topic.
If no name is given, then the provided Named will be used.
This allow to name the operation without having a queriable store.


So if my analysis is correct, we will end up with :

                                  Generated  | Named   | Joined /  Grouped
|  Materialized
-------------------------------------------------------------------------------------------------
Node                     |               X       |     X       |    X
                 |
-------------------------------------------------------------------------------------------------
Repartition Topic   |               X       |              |    X
             |
-------------------------------------------------------------------------------------------------
Queryable Store    |                        |              |
              |     X
-------------------------------------------------------------------------------------------------
State store             |               X      |     X      |     X
               |     X
-------------------------------------------------------------------------------------------------
Changelog Topic    |              X       |      X    |      X
         |     X
-------------------------------------------------------------------------------------------------

Le dim. 13 janv. 2019 à 03:23, Matthias J. Sax <matth...@confluent.io> a
écrit :

> Just catching up on this KIP again.
>
> One nit. The KIP says:
>
> > In addition, the generated names have a few disadvantages to guarantee
> topology compatibilities. In fact, adding a new operator, using a
> third-library doing some optimization to remove some operators or upgrading
> to a new KafkaStreams version with internal API changes may changed suffix
> indexing for a large amount of the processor names. This will in turn
> change the internal state store names, as well as internal topic names as
> well.
> >
>
> This is not true any longer (I guess it was true, when the KIP was
> initially proposed), because all stores/internal-topics can be named
> since 2.1 release. I would suggest to remove the paragraph.
>
> Overall, I like the Named/NamedOperation design.
>
> What is unclear to me thought is, why we need new overloads for methods
> that accept `Materialized`. To be more precise, I think it make sense to
> add an overload that only takes `Named`, but not one that takes both
> `Named` and `Materialized`. For example:
>
> KGroupedStream#count() // exists
> KGroupedStream#count(Materialized) // exits
> KGroupedStream#count(Named) // added (makes sense to me)
> KGroupedStream#count(Named, Materialized) // added -- why?
>
> I would prefer to use `Materialized` to name the processor for this
> case, too. Can you elaborate on the motivation?
>
>
> -Matthias
>
> On 1/11/19 3:39 PM, Florian Hussonnois wrote:
> > Hi Guozhang,
> >
> > I have updated the PR as well as the KIP. I should add more unit tests to
> > covers all new methods.
> >
> > However, I still have one test in failure. The reason is that using
> > Joined.name() in both potential repartition topic and processor nodes
> leads
> > to topology-incompatible.
> > How should we deal with that ?
> >
> > Thanks,
> >
> > Le jeu. 10 janv. 2019 à 01:21, Guozhang Wang <wangg...@gmail.com> a
> écrit :
> >
> >> Hello Florian,
> >>
> >> Just checking if have read about my previous email and if you feel happy
> >> about it. We have the 2.2 KIP freeze deadline at 24th this month, while
> the
> >> PR itself is getting quite close. So it'll be great if we can get the
> >> agreement on it and get it into 2.2.0 release.
> >>
> >>
> >> Guozhang
> >>
> >>
> >> On Mon, Dec 17, 2018 at 2:39 PM Guozhang Wang <wangg...@gmail.com>
> wrote:
> >>
> >>> Hi Florian / John,
> >>>
> >>> Just wanted to throw a couple minor thoughts on the current proposal:
> >>>
> >>> 1) Regarding the interface / function name, I'd propose we call the
> >>> interface `NamedOperation` which would be implemented by Produced /
> >>> Consumed / Printed / Joined / Grouped / Suppressed (note I
> intentionally
> >>> exclude Materialized here since its semantics is quite), and have the
> >>> default class that implements `NamedOperation` as `Named`, which would
> be
> >>> used in our adding overload functions. The main reason is to have
> >>> consistency in naming.
> >>>
> >>> 2) As a minor tweak, I think it's better to use Joined.name() in both
> its
> >>> possibly generate repartition topic, as well as the map processor used
> for
> >>> group-by (currently this name is only used for the repartition topic).
> >>>
> >>>
> >>> Florian: if you think this proposal makes sense, please feel free to go
> >>> ahead and update the PR; after we made a first pass on it and feels
> >>> confident about it, we can go ahead with the VOTING process. About the
> >>> implementation of 2) above, this may be out of your implementation
> scope,
> >>> so feel free to leave it out side your PR while Bill who originally
> worked
> >>> on the Grouped KIP can make a follow-up PR for it.
> >>>
> >>> Guozhang
> >>>
> >>> On Fri, Dec 14, 2018 at 9:43 PM Guozhang Wang <wangg...@gmail.com>
> wrote:
> >>>
> >>>> Hello Florian,
> >>>>
> >>>> Really appreciate you for your patience.
> >>>>
> >>>> I know that we've discussed about the approach to adding overloaded
> >>>> functions and rejected it early on. But looking deeper into the
> current PR
> >>>> I realized that this approach has a danger of great API confusions to
> users
> >>>> (I tried to explain my thoughts in the PR, but it was not very clear)
> ---
> >>>> the basic idea is that, today we already have a few existing control
> >>>> classes including Grouped, Joined, Suppressed that allow users to
> specify
> >>>> serdes etc, while also a "name" which can then be used to define the
> >>>> processor name / internal topic names in the topology (the static
> function
> >>>> names are not consistent, which I think we should fix as well). And
> Named
> >>>> interface, by extending the lambda function interfaces like
> ValueJoiner /
> >>>> Predicate etc opens the door for another way to specify the names
> again.
> >>>>
> >>>> So in order to achieve consistency, we are left with generally two
> >>>> options:
> >>>>
> >>>> 1) only allow users to specify names via the lambda interfaces that
> >>>> extends Named interface. This means we'd better remove the naming
> mechanism
> >>>> from the existing control objects to keep consistency.
> >>>>
> >>>> 2) only allow users to specify names via control classes, and we
> >>>> introduce a new class (Named) for those which do not have one yet ---
> this
> >>>> leads to the overloaded functions.
> >>>>
> >>>> I did a quick count on the num.of overloaded functions, and summing
> from
> >>>> KTable (8) / KStream (15) / KGroupedStream (6) / KGroupedTable (6) /
> >>>> TimeWindowedKStream (6) / SessionWindowedKStream (6) we got about 47
> >>>> overloaded functions (our guess was pretty close!) -- note this is
> based on
> >>>> John's proposal that we can let existing Grouped / Joined to extend
> Named
> >>>> and hence we only need overloaded functions with a default
> NamedOperation
> >>>> for those operators that do not have a control classes already.
> >>>>
> >>>> Thinking about this approach I feel it is not too bad compared with
> >>>> either 1) above, which would require us to deprecate lot of public
> >>>> functions around name(), or having a mixed mechanism for naming, which
> >>>> could lead to very confusing behavior to users. Additionally, for most
> >>>> users who would only want to specify the names for those stateful
> >>>> operations which have internal topics / state stores and hence are
> more
> >>>> keen to upgrade compatibility, those added overloads would be
> not-often
> >>>> used functions for them anyways. And by letting existing control
> classes to
> >>>> extend Named, we can have a unified method name for static
> constructor as
> >>>> well.
> >>>>
> >>>>
> >>>>
> >>>> Guozhang
> >>>>
> >>>>
> >>>> On Fri, Dec 14, 2018 at 10:24 AM John Roesler <j...@confluent.io>
> wrote:
> >>>>
> >>>>> Hi Florian,
> >>>>>
> >>>>> Sorry about the run-around of rejecting the original proposal,
> >>>>> only to return to it later on. Hopefully, it's more encouraging
> >>>>> than frustrating that we're coming around to your initial way of
> >>>>> thinking.
> >>>>>
> >>>>> Thanks!
> >>>>> -John
> >>>>>
> >>>>> On Thu, Dec 13, 2018 at 4:28 PM Florian Hussonnois <
> >>>>> fhussonn...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> Thanks again. I agree with your propositions.
> >>>>>> Also IMHO, overloading all methods (filter, map) to accept a new
> >>>>> control
> >>>>>> object seems to provide a more natural development experience for
> >>>>> users.
> >>>>>>
> >>>>>> Actually, this was the first proposition for this KIP, but we have
> >>>>> rejected
> >>>>>> it because this solution led to adding a lot of new methods.
> >>>>>> As you mentioned it, the API has evolve since the creation of this
> >>>>> KIP -
> >>>>>> some existing control objects already allow to customize internal
> >>>>> names. We
> >>>>>> should so keep on that strategy.
> >>>>>>
> >>>>>> If everyone is OK with that, I will update the KIP and the PR
> >>>>> accordingly;
> >>>>>>
> >>>>>> Thanks.
> >>>>>>
> >>>>>> Le jeu. 13 déc. 2018 à 18:08, John Roesler <j...@confluent.io> a
> >>>>> écrit :
> >>>>>>
> >>>>>>> Hi again, all,
> >>>>>>>
> >>>>>>> Matthias, I agree with you.
> >>>>>>>
> >>>>>>> Florian, thanks for your response.
> >>>>>>>
> >>>>>>> I think your proposal is the best way to address the ask for hiding
> >>>>> the
> >>>>>>> name() getter. But I'd like to question that ask and instead
> >>>>> propose that
> >>>>>>> we just make the name() getter part of the public API.
> >>>>>>>
> >>>>>>> The desire to "hide" the getters causes a lot of complexity in our
> >>>>> code
> >>>>>>> base, and it will become completely impractical with the mixin
> >>>>> strategy
> >>>>>> of
> >>>>>>> Named.
> >>>>>>>
> >>>>>>> If we were to switch strategies back to mixing Named in to the
> >>>>> control
> >>>>>>> objects rather than the functions, then the path forward becomes
> >>>>> quite
> >>>>>>> clear.
> >>>>>>>
> >>>>>>> On the other hand, it seems harmless for anyone who wants to be
> >>>>> able to
> >>>>>>> query the name from a control object after setting it, so my vote
> >>>>> would
> >>>>>> be
> >>>>>>> simply to keep the Named interface as:
> >>>>>>>
> >>>>>>> public interface Named<T extends Named<T>> {
> >>>>>>>   String name();
> >>>>>>>   T withName(String name);
> >>>>>>> }
> >>>>>>>
> >>>>>>> Under this proposal, we only mix Named in to the control objects,
> >>>>> which
> >>>>>>> means we have no need of default implementations anymore (because
> >>>>> we can
> >>>>>>> update all the control objects concurrently with adding this
> >>>>> interface to
> >>>>>>> them).
> >>>>>>>
> >>>>>>> This does hinge on switching over to a control-object-only
> strategy,
> >>>>>> which
> >>>>>>> introduces the need to add about 50 new control object classes,
> >>>>> which
> >>>>>> would
> >>>>>>> only serve to implement Named. As a middle ground, maybe we could
> >>>>> just
> >>>>>> add
> >>>>>>> one generic control object class, like:
> >>>>>>>
> >>>>>>> public class NamedOperation implements Named<NamedOperation> {
> >>>>>>>   private final String name;
> >>>>>>>   private NamedOperation(final String name) { this.name = name; }
> >>>>>>>   public static NamedOperation name(final String name) {
> >>>>>>>     return new NamedOperation(name);
> >>>>>>>   }
> >>>>>>>   public String name() { return name; }
> >>>>>>>   public NamedOperation withName(final String name) {
> >>>>>>>     return new NamedOperation(name);
> >>>>>>>   }
> >>>>>>> }
> >>>>>>>
> >>>>>>> And then, we'd add overloads for all the methods that don't have
> >>>>> control
> >>>>>>> objects already (for example, filter() ):
> >>>>>>>
> >>>>>>> // existing
> >>>>>>> KStream<K, V> filter(Predicate<? super K, ? super V> predicate);
> >>>>>>>
> >>>>>>> // new
> >>>>>>> KStream<K, V> filter(Predicate<? super K, ? super V> predicate,
> >>>>>>> NamedOperation named);
> >>>>>>>
> >>>>>>> Additionally, in regard to Matthias's point about existing control
> >>>>>> objects
> >>>>>>> with naming semantics, they would extend Named (but not
> >>>>> NamedOperation)
> >>>>>> for
> >>>>>>> uniformity.
> >>>>>>>
> >>>>>>> You provided a good approach to hide the getter with your
> >>>>> SettableName
> >>>>>>> class; I think what you proposed is the only way we could hide the
> >>>>> name.
> >>>>>>> In the end, though, it's a lot of complexity added (control object
> >>>>> class
> >>>>>>> hierarchy, inheritance, mutable state, internal casting) for
> >>>>> something of
> >>>>>>> dubious value: to be able to hide the name from someone *after they
> >>>>>>> themselves have set it*.
> >>>>>>>
> >>>>>>> Although it'll be a pain, perhaps Matthias's suggestion to
> >>>>> enumerate all
> >>>>>>> the API methods is the best way to be sure we all agree on what's
> >>>>> going
> >>>>>> to
> >>>>>>> happen.
> >>>>>>>
> >>>>>>> Thanks again for wrangling with this issue,
> >>>>>>> -John
> >>>>>>>
> >>>>>>> On Thu, Dec 13, 2018 at 9:03 AM Matthias J. Sax <
> >>>>> matth...@confluent.io>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Just catching up on this discussion.
> >>>>>>>>
> >>>>>>>> My overall personal take is, that I am not a big fan of the
> >>>>> interface
> >>>>>>>> `Named` that is used as a factory. I would rather prefer to add a
> >>>>>>>> control object parameter to all methods that don't have one yet.
> >>>>> This
> >>>>>>>> KIP was started a while ago, and we added new naming capabilities
> >>>>> in
> >>>>>> the
> >>>>>>>> meantime. Guozhang's example in the PR comment about naming in
> >>>>>>>> stream-stream join shows, that we might end up in a confusion
> >>>>> situation
> >>>>>>>> for users if we use `Named`. Also, in 2.1, user can already name
> >>>>> as
> >>>>>>>> repartition-/changelog-topics and stores. Thus, KIP-307 boils
> >>>>> down to
> >>>>>>>> provide non-functional naming?
> >>>>>>>>
> >>>>>>>> Hence, for all methods that allow to specify names already, I
> >>>>> don't see
> >>>>>>>> any reason to change them, but use the existing API to also name
> >>>>> the
> >>>>>>>> processor(s) instead of allowing uses to specify a new name.
> >>>>>>>>
> >>>>>>>> About the inconsistency in method naming. I agree, that `as` is
> >>>>> very
> >>>>>>>> generic and maybe not the best choice.
> >>>>>>>>
> >>>>>>>> I think it might be helpful, to have a table overview in the KIP,
> >>>>> that
> >>>>>>>> list all existing static/non-static methods that allow to specify
> >>>>> a
> >>>>>>>> name, plus a columns with the new suggested naming for those
> >>>>> methods?
> >>>>>>>>
> >>>>>>>> Thoughts?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> -Matthias
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 12/12/18 12:45 AM, Florian Hussonnois wrote:
> >>>>>>>>> Thank you very much for your feedbacks.
> >>>>>>>>>
> >>>>>>>>> Currently, there is still lot of discussions regarding the Named
> >>>>>>>> interface.
> >>>>>>>>> On the one hand we should provided consistency over the stream
> >>>>> API
> >>>>>> and
> >>>>>>> on
> >>>>>>>>> the other hand we should not break the semantic as John point
> >>>>> it up.
> >>>>>>>>>
> >>>>>>>>> Guozhang, I'm sorry, but I'm little bit confused, maybe I missed
> >>>>>>>> something.
> >>>>>>>>> In your comment you have suggested that :
> >>>>>>>>> * Produced/Consumed/Suppressed should extends Named
> >>>>>>>>> * Named should have a private-package method to get the
> >>>>> specified
> >>>>>>>> processor
> >>>>>>>>> name internally (processorName())
> >>>>>>>>> * Finally we should end up with something like :  Named -> XXX
> >>>>> ->
> >>>>>>>>> XXXInternal or Named -> Produced -> ProducedInternal
> >>>>>>>>>
> >>>>>>>>> The objective behind that is to :
> >>>>>>>>> * consolidate the internal method processorName()
> >>>>>>>>> * consolidate the method withName that exists now existing into
> >>>>>>> Produced,
> >>>>>>>>> Consumed and Suppressed.
> >>>>>>>>>
> >>>>>>>>> But, Named is an interface so we can't define a private-package
> >>>>>> method
> >>>>>>> on
> >>>>>>>>> it. Also, for example Produced and ProducedInternal are not in
> >>>>> the
> >>>>>> same
> >>>>>>>>> package so having a private-package method doesn't really help.
> >>>>>>>>> In addition, if we add the withName method into Named interface
> >>>>> this
> >>>>>>> can
> >>>>>>>>> become confusing for developers because action interfaces
> >>>>>> (ValueMapper,
> >>>>>>>>> Reducer, etc) extend it.
> >>>>>>>>> The interface would look like :
> >>>>>>>>>
> >>>>>>>>> public interface Named<T extends Named<T>> {
> >>>>>>>>>     default String name() {
> >>>>>>>>>         return null;
> >>>>>>>>>     }
> >>>>>>>>>     default Named<T> withName(final String name) {
> >>>>>>>>>         return null;
> >>>>>>>>>     }
> >>>>>>>>> ...
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> So maybe instead of adding another method to Named we could
> >>>>> create a
> >>>>>>> new
> >>>>>>>>> package-private class that could be extended by
> >>>>>>>>> Produced/Consumed/Joined/Suppressed. For exemple,
> >>>>>>>>> class SettableName<T extends SettableName<T>> implements Named {
> >>>>>>>>>
> >>>>>>>>>     protected String processorName;
> >>>>>>>>>
> >>>>>>>>>     SettableName(final SettableName settable) {
> >>>>>>>>>         this(Objects.requireNonNull(settable, "settable can't be
> >>>>>>>>> null").name());
> >>>>>>>>>     }
> >>>>>>>>>
> >>>>>>>>>     SettableName(final String processorName) {
> >>>>>>>>>         this.processorName = processorName;
> >>>>>>>>>     }
> >>>>>>>>>
> >>>>>>>>>     @Override
> >>>>>>>>>     public String name() {
> >>>>>>>>>         return processorName;
> >>>>>>>>>     }
> >>>>>>>>>     public T withName(final String processorName) {
> >>>>>>>>>         this.processorName = processorName;
> >>>>>>>>>         return (T)this;
> >>>>>>>>>     }
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> In that way, we will get : public class Produced implements
> >>>>>>>>> SettableName<Produced> { ...
> >>>>>>>>>
> >>>>>>>>> WDYT?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Le mar. 11 déc. 2018 à 02:46, Guozhang Wang <wangg...@gmail.com>
> >>>>> a
> >>>>>>>> écrit :
> >>>>>>>>>
> >>>>>>>>>> I had one meta comment on the PR:
> >>>>>>>>>>
> >>>>> https://github.com/apache/kafka/pull/5909#discussion_r240447153
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Dec 10, 2018 at 5:22 PM John Roesler <
> >>>>> j...@confluent.io>
> >>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Florian,
> >>>>>>>>>>>
> >>>>>>>>>>> I hope it's ok if I ask a few questions at this late stage...
> >>>>>>>>>>>
> >>>>>>>>>>> Comment 1 ======
> >>>>>>>>>>>
> >>>>>>>>>>> It seems like the proposal is to add a new "Named" interface
> >>>>> that
> >>>>>> is
> >>>>>>>>>>> intended to be mixed in with the existing API objects at
> >>>>> various
> >>>>>>>> points.
> >>>>>>>>>>>
> >>>>>>>>>>> Just to preface some of my comments, it looks like your KIP
> >>>>> was
> >>>>>>> created
> >>>>>>>>>>> quite a while ago, so the API may have changed somewhat since
> >>>>> you
> >>>>>>>>>> started.
> >>>>>>>>>>>
> >>>>>>>>>>> As I see the API, there are a few different kinds of DSL
> >>>>> method
> >>>>>>>>>> arguments:
> >>>>>>>>>>> * functions: things like Initializer, Aggregator, ValueJoiner,
> >>>>>>>>>>> ForEachAction... All of these are essentially Streams-flavored
> >>>>>>> Function
> >>>>>>>>>>> interfaces with different arities, type bounds, and semantics.
> >>>>>>>>>>> * config objects: things like Produced, Consumed, Joined,
> >>>>>> Grouped...
> >>>>>>>>>> These
> >>>>>>>>>>> are containers for configurations, where the target of the
> >>>>>>>> configuration
> >>>>>>>>>> is
> >>>>>>>>>>> the operation itself
> >>>>>>>>>>> * raw configurations: things like a raw topic-name string and
> >>>>>>>>>> Materialized:
> >>>>>>>>>>> These are configurations for operations that have no config
> >>>>> object,
> >>>>>>> and
> >>>>>>>>>> for
> >>>>>>>>>>> various reasons, we didn't make one. The distinguishing
> >>>>> feature is
> >>>>>>> that
> >>>>>>>>>> the
> >>>>>>>>>>> target of the configuration is not the operation itself, but
> >>>>> some
> >>>>>>>> aspect
> >>>>>>>>>> of
> >>>>>>>>>>> it. For example, in Materialized, we are not setting the
> >>>>> caching
> >>>>>>>> behavior
> >>>>>>>>>>> of, for example, an aggregation; we're setting the caching
> >>>>> behavior
> >>>>>>> of
> >>>>>>>> a
> >>>>>>>>>>> materialized state store attached to the aggregation.
> >>>>>>>>>>>
> >>>>>>>>>>> It seems like choosing to mix the Named interface in with the
> >>>>>>> functions
> >>>>>>>>>> has
> >>>>>>>>>>> a couple of unfortunate side-effects:
> >>>>>>>>>>> * Aggregator is not the only function passed to any of the
> >>>>> relevant
> >>>>>>>>>>> aggregate methods, so it seems a little arbitrary to pick that
> >>>>>>> function
> >>>>>>>>>>> over Initializer or Merger.
> >>>>>>>>>>> * As you noted, branch() takes an array of Predicate, so we
> >>>>> just
> >>>>>>> ignore
> >>>>>>>>>> the
> >>>>>>>>>>> provided name(s), even though Predicate names are used
> >>>>> elsewhere.
> >>>>>>>>>>> * Not all things that we want to name have function arguments,
> >>>>>>> notably
> >>>>>>>>>>> source and sink, so we'd switch paradigms and use the config
> >>>>> object
> >>>>>>>>>>> instead.
> >>>>>>>>>>> * Adding an extra method to the function interfaces means that
> >>>>>> those
> >>>>>>>> are
> >>>>>>>>>> no
> >>>>>>>>>>> longer SAM interfaces. You proposed to add a default
> >>>>>> implementation,
> >>>>>>> so
> >>>>>>>>>> we
> >>>>>>>>>>> could still pass a lambda if we don't want to set the name,
> >>>>> but if
> >>>>>> we
> >>>>>>>>>> *do*
> >>>>>>>>>>> want to set the name, we can no longer use lambdas.
> >>>>>>>>>>>
> >>>>>>>>>>> I think the obvious other choice would be to mix Named in
> >>>>> with the
> >>>>>>>> config
> >>>>>>>>>>> objects instead, but this has one main downside of its own...
> >>>>>>>>>>> * not every operator we wish to name has a config object. I
> >>>>> don't
> >>>>>>> know
> >>>>>>>> if
> >>>>>>>>>>> everyone involved is comfortable with adding a config object
> >>>>> to
> >>>>>> every
> >>>>>>>>>>> operator that's missing one.
> >>>>>>>>>>>
> >>>>>>>>>>> Personally, I favor moving toward a more consistent state
> >>>>> that's
> >>>>>>>> forward
> >>>>>>>>>>> compatible with any further changes we wish to make. I
> >>>>> *think* that
> >>>>>>>>>> giving
> >>>>>>>>>>> every operator two forms (one with no config and one with a
> >>>>> config
> >>>>>>>>>> object)
> >>>>>>>>>>> would be such an API.
> >>>>>>>>>>>
> >>>>>>>>>>> Comment 2 =========
> >>>>>>>>>>>
> >>>>>>>>>>> Finally, just a minor comment: the static method in Named
> >>>>> wouldn't
> >>>>>>> work
> >>>>>>>>>>> properly as defined. Assuming that we mix Named in with
> >>>>> Produced,
> >>>>>> for
> >>>>>>>>>>> example, we'd need to be able to use it like:
> >>>>>>>>>>>>  kStream.to("out", Produced.with("myOut"))
> >>>>>>>>>>> This doesn't work because with() returns a Named, but we need
> >>>>> a
> >>>>>>>> Produced.
> >>>>>>>>>>>
> >>>>>>>>>>> We can pull off a builder method in the interface, but not a
> >>>>> static
> >>>>>>>>>> method.
> >>>>>>>>>>> To define a builder method in the interface that returns an
> >>>>>> instance
> >>>>>>> of
> >>>>>>>>>> the
> >>>>>>>>>>> concrete subtype, you have to use the "curiously recurring
> >>>>> generic"
> >>>>>>>>>>> pattern.
> >>>>>>>>>>>
> >>>>>>>>>>> It would look like:
> >>>>>>>>>>>
> >>>>>>>>>>> public interface Named<N extends Named<N>> {
> >>>>>>>>>>>   String name();
> >>>>>>>>>>>   N withName(String name);
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> You can see where the name of the pattern comes from ;)
> >>>>>>>>>>> An implementation would then look like:
> >>>>>>>>>>>
> >>>>>>>>>>> public class Produced implements Named<Produced> {
> >>>>>>>>>>>   String name() { return name; }
> >>>>>>>>>>>   Produced withName(final String name) { this.name = name;
> >>>>> return
> >>>>>>>> this;
> >>>>>>>>>> }
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> Note that the generic parameter gets filled in properly in the
> >>>>>>>>>> implementing
> >>>>>>>>>>> class, so that you get the right return type out.
> >>>>>>>>>>>
> >>>>>>>>>>> It doesn't work at all with a static factory method at the
> >>>>>> interface
> >>>>>>>>>> level,
> >>>>>>>>>>> so it would be up to Produced to define a static factory if it
> >>>>>> wants
> >>>>>>> to
> >>>>>>>>>>> present one.
> >>>>>>>>>>>
> >>>>>>>>>>> ======
> >>>>>>>>>>>
> >>>>>>>>>>> Those are my two feedbacks!
> >>>>>>>>>>>
> >>>>>>>>>>> I hope you find this helpful, rather than frustrating. I'm
> >>>>> sorry I
> >>>>>>>> didn't
> >>>>>>>>>>> get a chance to comment sooner.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for the KIP, I think it will be much nicer to be able
> >>>>> to
> >>>>>> name
> >>>>>>>> the
> >>>>>>>>>>> processor nodes.
> >>>>>>>>>>>
> >>>>>>>>>>> -John
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Nov 27, 2018 at 6:34 PM Guozhang Wang <
> >>>>> wangg...@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Florian,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I've made a pass over the PR. There are some comments that
> >>>>> are
> >>>>>>> related
> >>>>>>>>>> to
> >>>>>>>>>>>> the function names which may be affecting the KIP wiki page,
> >>>>> but
> >>>>>>>>>> overall
> >>>>>>>>>>> I
> >>>>>>>>>>>> think it looks good already.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Guozhang
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, Nov 16, 2018 at 4:21 PM Guozhang Wang <
> >>>>> wangg...@gmail.com
> >>>>>>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks Florian! I will take a look at the PR.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, Nov 12, 2018 at 2:44 PM Florian Hussonnois <
> >>>>>>>>>>>> fhussonn...@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi Matthias,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Sorry I was absent for a while. I have started a new PR
> >>>>> for this
> >>>>>>>>>> KIP.
> >>>>>>>>>>> It
> >>>>>>>>>>>>>> is
> >>>>>>>>>>>>>> still in progress for now. I'm working on it.
> >>>>>>>>>>>>>> https://github.com/apache/kafka/pull/5909
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Le ven. 19 oct. 2018 à 20:13, Matthias J. Sax <
> >>>>>>>>>> matth...@confluent.io>
> >>>>>>>>>>> a
> >>>>>>>>>>>>>> écrit :
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> What is the status of this KIP?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 7/19/18 5:17 PM, Guozhang Wang wrote:
> >>>>>>>>>>>>>>>> Hello Florian,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Sorry for being late... Found myself keep apologizing
> >>>>> for late
> >>>>>>>>>>>> replies
> >>>>>>>>>>>>>>>> these days. But I do want to push this KIP's progress
> >>>>> forward
> >>>>>>>>>> as I
> >>>>>>>>>>>>>> see it
> >>>>>>>>>>>>>>>> very important and helpful feature for extensibility.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> About the exceptions, I've gone through them and
> >>>>> hopefully it
> >>>>>> is
> >>>>>>>>>>> an
> >>>>>>>>>>>>>>>> exhaustive list:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> 1. KTable#toStream()
> >>>>>>>>>>>>>>>> 2. KStream#merge(KStream)
> >>>>>>>>>>>>>>>> 3. KStream#process() / transform() / transformValues()
> >>>>>>>>>>>>>>>> 4. KGroupedTable / KGroupedStream#count()
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Here's my reasoning:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> * It is okay not letting users to override the name for
> >>>>> 1/2,
> >>>>>>>>>> since
> >>>>>>>>>>>>>> they
> >>>>>>>>>>>>>>> are
> >>>>>>>>>>>>>>>> too trivial to be useful for debugging, plus their
> >>>>> processor
> >>>>>>>>>> names
> >>>>>>>>>>>>>> would
> >>>>>>>>>>>>>>>> not determine any related topic / store names.
> >>>>>>>>>>>>>>>> * For 3, I'd vote for adding overloaded functions with
> >>>>> Named.
> >>>>>>>>>>>>>>>> * For 4, if users really want to name the processor she
> >>>>> can
> >>>>>> call
> >>>>>>>>>>>>>>>> aggregate() instead, so I think it is okay to skip this
> >>>>> case.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Guozhang
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Fri, Jul 6, 2018 at 3:06 PM, Florian Hussonnois <
> >>>>>>>>>>>>>>> fhussonn...@gmail.com>
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> The option #3 seems to be a good alternative and I find
> >>>>> the
> >>>>>> API
> >>>>>>>>>>>> more
> >>>>>>>>>>>>>>>>> elegant (thanks John).
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> But, we still have the need to overload some methods
> >>>>> either
> >>>>>>>>>>> because
> >>>>>>>>>>>>>>> they do
> >>>>>>>>>>>>>>>>> not accept an action instance or because they are
> >>>>> translated
> >>>>>> to
> >>>>>>>>>>>>>> multiple
> >>>>>>>>>>>>>>>>> processors.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> For example, this is the case for methods branch() and
> >>>>>> merge().
> >>>>>>>>>>> We
> >>>>>>>>>>>>>> could
> >>>>>>>>>>>>>>>>> introduce a new interface Named (or maybe a different
> >>>>> name ?)
> >>>>>>>>>>> with
> >>>>>>>>>>>> a
> >>>>>>>>>>>>>>> method
> >>>>>>>>>>>>>>>>> name(). All action interfaces could extend this one to
> >>>>>>>>>> implement
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>>> option
> >>>>>>>>>>>>>>>>> 3).
> >>>>>>>>>>>>>>>>> This would result by having the following overloads  :
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Stream<K, V> merge(final Named name, final KStream<K, V>
> >>>>>>>>>> stream);
> >>>>>>>>>>>>>>>>> KStream<K, V>[] branch(final Named name, final
> >>>>> Predicate<?
> >>>>>>>>>> super
> >>>>>>>>>>>> K, ?
> >>>>>>>>>>>>>>> super
> >>>>>>>>>>>>>>>>> V>... predicates)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> N.B : The list above is  not exhaustive
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> ---------
> >>>>>>>>>>>>>>>>> user's code will become :
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>         KStream<String, Integer> stream =
> >>>>>>>>>> builder.stream("test");
> >>>>>>>>>>>>>>>>>         KStream<String, Integer>[] branches =
> >>>>>>>>>>>>>>>>> stream.branch(Named.with("BRANCH-STREAM-ON-VALUE"),
> >>>>>>>>>>>>>>>>>                 Predicate.named("STREAM-PAIR-VALUE",
> >>>>> (k, v)
> >>>>>> ->
> >>>>>>>>>> v
> >>>>>>>>>>> %
> >>>>>>>>>>>> 2
> >>>>>>>>>>>>>> ==
> >>>>>>>>>>>>>>>>> 0),
> >>>>>>>>>>>>>>>>>                 Predicate.named("STREAM-IMPAIR-VALUE",
> >>>>> (k, v)
> >>>>>>>>>> ->
> >>>>>>>>>>> v
> >>>>>>>>>>>> %
> >>>>>>>>>>>>>> 2
> >>>>>>>>>>>>>>> !=
> >>>>>>>>>>>>>>>>> 0));
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>         branches[0].to("pair");
> >>>>>>>>>>>>>>>>>         branches[1].to("impair");
> >>>>>>>>>>>>>>>>> ---------
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> This is a mix of the options 3) and 1)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Le ven. 6 juil. 2018 à 22:58, Guozhang Wang <
> >>>>>>>>>> wangg...@gmail.com>
> >>>>>>>>>>> a
> >>>>>>>>>>>>>>> écrit :
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hi folks, just to summarize the options we have so far:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 1) Add a new "as" for KTable / KStream, plus adding new
> >>>>>> fields
> >>>>>>>>>>> for
> >>>>>>>>>>>>>>>>>> operators-returns-void control objects (the current
> >>>>> wiki's
> >>>>>>>>>>>>>> proposal).
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Pros: no more overloads.
> >>>>>>>>>>>>>>>>>> Cons: a bit departing with the current high-level API
> >>>>> design
> >>>>>>>>>> of
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>>> DSL,
> >>>>>>>>>>>>>>>>>> plus, the inconsistency between operators-returns-void
> >>>>> and
> >>>>>>>>>>>>>>>>>> operators-not-return-voids.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 2) Add overloaded functions for all operators, that
> >>>>> accepts
> >>>>>> a
> >>>>>>>>>>> new
> >>>>>>>>>>>>>>> control
> >>>>>>>>>>>>>>>>>> object "Described".
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Pros: consistent with current APIs.
> >>>>>>>>>>>>>>>>>> Cons: lots of overloaded functions to add.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 3) Add another default function in the interface
> >>>>> (thank you
> >>>>>>>>>> J8!)
> >>>>>>>>>>>> as
> >>>>>>>>>>>>>>> John
> >>>>>>>>>>>>>>>>>> proposed.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Pros: no overloaded functions, no "Described".
> >>>>>>>>>>>>>>>>>> Cons: do we lose lambda functions really (seems not if
> >>>>> we
> >>>>>>>>>>> provide
> >>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>> "named"
> >>>>>>>>>>>>>>>>>> for each func)? Plus "Described" may be more extensible
> >>>>>> than a
> >>>>>>>>>>>>>> single
> >>>>>>>>>>>>>>>>>> `String`.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> My principle of considering which one is better depends
> >>>>>>>>>>> primarily
> >>>>>>>>>>>> on
> >>>>>>>>>>>>>>> "how
> >>>>>>>>>>>>>>>>>> to make advanced users easily use the additional API,
> >>>>> while
> >>>>>>>>>>>> keeping
> >>>>>>>>>>>>>> it
> >>>>>>>>>>>>>>>>>> hidden from normal users who do not care at all". For
> >>>>> that
> >>>>>>>>>>>> purpose I
> >>>>>>>>>>>>>>>>> think
> >>>>>>>>>>>>>>>>>> 3) > 1) > 2).
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> One caveat though, is that changing the interface
> >>>>> would not
> >>>>>> be
> >>>>>>>>>>>>>>>>>> binary-compatible though source-compatible, right? I.e.
> >>>>>> users
> >>>>>>>>>>> need
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>> recompile their code though no changes needed.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Another note: for 3), if we really want to keep
> >>>>>> extensibility
> >>>>>>>>>> of
> >>>>>>>>>>>>>>>>> Described
> >>>>>>>>>>>>>>>>>> we could do sth. like:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> ---------
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> public interface Predicate<K, V> {
> >>>>>>>>>>>>>>>>>>     // existing method
> >>>>>>>>>>>>>>>>>>     boolean test(final K key, final V value);
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>     // new default method adds the ability to name the
> >>>>>>>>>> predicate
> >>>>>>>>>>>>>>>>>>     default Described described() {
> >>>>>>>>>>>>>>>>>>         return new Described(null);
> >>>>>>>>>>>>>>>>>>     }
> >>>>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> ----------
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> where user's code becomes:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> stream.filter(named("key", (k, v) -> true));   // note
> >>>>>> `named`
> >>>>>>>>>>> now
> >>>>>>>>>>>>>> just
> >>>>>>>>>>>>>>>>>> sets a Described("key") in "described()".
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> stream.filter(described(Described.as("key", /* any
> >>>>> other
> >>>>>> fancy
> >>>>>>>>>>>>>>>>> parameters
> >>>>>>>>>>>>>>>>>> in the future*/), (k, v) -> true));
> >>>>>>>>>>>>>>>>>> ----------
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I feel it is not much likely that we'd need to extend
> >>>>> it
> >>>>>>>>>> further
> >>>>>>>>>>>> in
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>> future, so just a `String` would be good enough. But
> >>>>> just
> >>>>>>>>>>> listing
> >>>>>>>>>>>>>> all
> >>>>>>>>>>>>>>>>>> possibilities here.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Guozhang
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Fri, Jul 6, 2018 at 8:19 AM, John Roesler <
> >>>>>>>>>> j...@confluent.io
> >>>>>>>>>>>>
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Hi Florian,
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Sorry I'm late to the party, but I missed the message
> >>>>>>>>>>> originally.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Regarding the names, it's probably a good idea to
> >>>>> stick to
> >>>>>>>>>> the
> >>>>>>>>>>>> same
> >>>>>>>>>>>>>>>>>>> character set we're currently using: letters,
> >>>>> numbers, and
> >>>>>>>>>>>> hyphens.
> >>>>>>>>>>>>>>> The
> >>>>>>>>>>>>>>>>>>> names are used in Kafka topics, files and folders, and
> >>>>>>>>>> RocksDB
> >>>>>>>>>>>>>>>>> databases,
> >>>>>>>>>>>>>>>>>>> and we also need them to work with the file systems of
> >>>>>>>>>> Windows,
> >>>>>>>>>>>>>> Linux,
> >>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>> MacOS. My opinion is that with a situation like that,
> >>>>> it's
> >>>>>>>>>>> better
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>> conservative. It might also be a good idea to impose
> >>>>> an
> >>>>>> upper
> >>>>>>>>>>>>>> limit on
> >>>>>>>>>>>>>>>>>> name
> >>>>>>>>>>>>>>>>>>> length to avoid running afoul of any of those systems.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> ---
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> It seems like there's a small debate between 1)
> >>>>> adding a
> >>>>>> new
> >>>>>>>>>>>>>> method to
> >>>>>>>>>>>>>>>>>>> KStream (and maybe KTable) to modify its name after
> >>>>> the
> >>>>>> fact,
> >>>>>>>>>>> or
> >>>>>>>>>>>> 2)
> >>>>>>>>>>>>>>>>>>> piggy-backing on the config objects where they exist
> >>>>> and
> >>>>>>>>>> adding
> >>>>>>>>>>>> one
> >>>>>>>>>>>>>>>>> where
> >>>>>>>>>>>>>>>>>>> they don't. To me, #2 is the better alternative even
> >>>>> though
> >>>>>>>>>> it
> >>>>>>>>>>>>>>> produces
> >>>>>>>>>>>>>>>>>>> more overloads and may be a bit awkward in places.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> The reason is simply that #1 is a high-level
> >>>>> departure from
> >>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>> graph-building paradigm we're using in the DSL.
> >>>>> Consider:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Graph.node1(config).node2(config)
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> vs
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Graph.node1().config().node2().config()
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> We could have done either, but we picked the former. I
> >>>>>> think
> >>>>>>>>>>> it's
> >>>>>>>>>>>>>>>>>> probably
> >>>>>>>>>>>>>>>>>>> a good goal to try and stick to it so that developers
> >>>>> can
> >>>>>>>>>>> develop
> >>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>> rely
> >>>>>>>>>>>>>>>>>>> on their instincts for how the DSL will behave.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I do want to present one alternative to adding new
> >>>>> config
> >>>>>>>>>>>> objects:
> >>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>> can
> >>>>>>>>>>>>>>>>>>> just add a "name()" method to all our "action"
> >>>>> interfaces.
> >>>>>>>>>> For
> >>>>>>>>>>>>>>> example,
> >>>>>>>>>>>>>>>>>>> I'll demonstrate how we can add a "name" to Predicate
> >>>>> and
> >>>>>>>>>> then
> >>>>>>>>>>>> use
> >>>>>>>>>>>>>> it
> >>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>> name a "KStream#filter" DSL operator:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> public interface Predicate<K, V> {
> >>>>>>>>>>>>>>>>>>>     // existing method
> >>>>>>>>>>>>>>>>>>>     boolean test(final K key, final V value);
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>     // new default method adds the ability to name the
> >>>>>>>>>>> predicate
> >>>>>>>>>>>>>>>>>>>     default String name() {
> >>>>>>>>>>>>>>>>>>>         return null;
> >>>>>>>>>>>>>>>>>>>     }
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>     // new static factory method adds the ability to
> >>>>> wrap
> >>>>>>>>>>> lambda
> >>>>>>>>>>>>>>>>>> predicates
> >>>>>>>>>>>>>>>>>>> with a named predicate
> >>>>>>>>>>>>>>>>>>>     static <K, V> Predicate<K, V> named(final String
> >>>>> name,
> >>>>>>>>>>> final
> >>>>>>>>>>>>>>>>>>> Predicate<K, V> predicate) {
> >>>>>>>>>>>>>>>>>>>         return new Predicate<K, V>() {
> >>>>>>>>>>>>>>>>>>>             @Override
> >>>>>>>>>>>>>>>>>>>             public boolean test(final K key, final V
> >>>>>> value) {
> >>>>>>>>>>>>>>>>>>>                 return predicate.test(key, value);
> >>>>>>>>>>>>>>>>>>>             }
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>             @Override
> >>>>>>>>>>>>>>>>>>>             public String name() {
> >>>>>>>>>>>>>>>>>>>                 return name;
> >>>>>>>>>>>>>>>>>>>             }
> >>>>>>>>>>>>>>>>>>>         };
> >>>>>>>>>>>>>>>>>>>     }
> >>>>>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Then, here's how it would look to use it:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> // Anonymous predicates continue to work just fine
> >>>>>>>>>>>>>>>>>>> stream.filter((k, v) -> true);
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> // Devs can swap in a Predicate that implements the
> >>>>> name()
> >>>>>>>>>>>> method.
> >>>>>>>>>>>>>>>>>>> stream.filter(new Predicate<Object, Object>() {
> >>>>>>>>>>>>>>>>>>>     @Override
> >>>>>>>>>>>>>>>>>>>     public boolean test(final Object key, final Object
> >>>>>>>>>> value) {
> >>>>>>>>>>>>>>>>>>>         return true;
> >>>>>>>>>>>>>>>>>>>     }
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>     @Override
> >>>>>>>>>>>>>>>>>>>     public String name() {
> >>>>>>>>>>>>>>>>>>>         return "hey";
> >>>>>>>>>>>>>>>>>>>     }
> >>>>>>>>>>>>>>>>>>> });
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> // Or they can wrap their existing lambda using the
> >>>>> static
> >>>>>>>>>>>> factory
> >>>>>>>>>>>>>>>>> method
> >>>>>>>>>>>>>>>>>>> stream.filter(named("key", (k, v) -> true));
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Just a thought.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Overall, I think it's really valuable to be able to
> >>>>> name
> >>>>>> the
> >>>>>>>>>>>>>>>>> processors,
> >>>>>>>>>>>>>>>>>>> for all the reasons you mentioned in the KIP. So
> >>>>> thank you
> >>>>>>>>>> for
> >>>>>>>>>>>>>>>>>> introducing
> >>>>>>>>>>>>>>>>>>> this!
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>>> -John
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Thu, Jul 5, 2018 at 4:53 PM Florian Hussonnois <
> >>>>>>>>>>>>>>>>> fhussonn...@gmail.com
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Hi, thank you very much for all you suggestions. I've
> >>>>>>>>>> started
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> update
> >>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>> KIP (
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>> 307%3A+Allow+to+define+custom+processor+names+with+KStreams+DSL
> >>>>>>>>>>>>>>>>>>>> ).
> >>>>>>>>>>>>>>>>>>>> Also, I propose to rename the Processed class into
> >>>>>>>>>> Described -
> >>>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>>>> will
> >>>>>>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>>> more meaningful (but this is just a detail).
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I'm OK to not enforcing uppercase for specific names
> >>>>> but
> >>>>>>>>>>> should
> >>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>> allow
> >>>>>>>>>>>>>>>>>>>> arbitrary names with whitespaces for example ?
> >>>>> Currently,
> >>>>>> I
> >>>>>>>>>>>> can't
> >>>>>>>>>>>>>>>>> tell
> >>>>>>>>>>>>>>>>>> if
> >>>>>>>>>>>>>>>>>>>> this can lead to some side effects ?
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Le lun. 11 juin 2018 à 01:31, Matthias J. Sax <
> >>>>>>>>>>>>>> matth...@confluent.io
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>>>> écrit :
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Just catching up on this thread.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I like the general idea. Couple of comments:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>  - I think that adding `Processed` (or maybe a
> >>>>> different
> >>>>>>>>>>> name?)
> >>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>>>>> valid proposal for stateless operators that only
> >>>>> have a
> >>>>>>>>>>> single
> >>>>>>>>>>>>>>>>>> overload
> >>>>>>>>>>>>>>>>>>>>> atm. It would align with the overall API design.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>  - for all methods with multiple existing
> >>>>> overloads, we
> >>>>>> can
> >>>>>>>>>>>>>>>>> consider
> >>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>> extend `Consumed`, `Produced`, `Materialized` etc
> >>>>> to take
> >>>>>>>>>> an
> >>>>>>>>>>>>>>>>>> additional
> >>>>>>>>>>>>>>>>>>>>> processor name (not sure atm how elegant this is; we
> >>>>>> would
> >>>>>>>>>>> need
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>> "play" with the API a little bit; the advantage
> >>>>> would be,
> >>>>>>>>>>> that
> >>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>> do
> >>>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>> add more overloads what seems to be key for this
> >>>>> KIP)
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>  - operators return void: while I agree that the
> >>>>> "name
> >>>>>>>>>> first"
> >>>>>>>>>>>>>>>>>> chaining
> >>>>>>>>>>>>>>>>>>>>> idea is not very intuitive, it might still work, if
> >>>>> we
> >>>>>> name
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>> method
> >>>>>>>>>>>>>>>>>>>>> correctly (again, we would need to "play" with the
> >>>>> API a
> >>>>>>>>>>> little
> >>>>>>>>>>>>>> bit
> >>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>> see)
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>  - for DSL operators that are translated to multiple
> >>>>>> nodes:
> >>>>>>>>>>> it
> >>>>>>>>>>>>>>>>> might
> >>>>>>>>>>>>>>>>>>>>> make sense to use the specified operator name as
> >>>>> prefix
> >>>>>> and
> >>>>>>>>>>> add
> >>>>>>>>>>>>>>>>>>>>> reasonable suffixes. For example, a join translates
> >>>>> into
> >>>>>> 5
> >>>>>>>>>>>>>>>>> operators
> >>>>>>>>>>>>>>>>>>>>> that could be name "name-left-store-processor",
> >>>>>>>>>>>>>>>>>>>>> "name-left-join-processor",
> >>>>> "name-right-store-processor",
> >>>>>>>>>>>>>>>>>>>>> "name-right-join-processor", and
> >>>>>>>>>> "name-join-merge-processor"
> >>>>>>>>>>>> (or
> >>>>>>>>>>>>>>>>>>>>> similar). Maybe just using numbers might also work.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>  - I think, we should strip the number suffixes if
> >>>>> a user
> >>>>>>>>>>>>>> provides
> >>>>>>>>>>>>>>>>>>> names
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>  - enforcing upper case seems to be tricky: for
> >>>>> example,
> >>>>>> we
> >>>>>>>>>>> do
> >>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>> enforce upper case for store names and we cannot
> >>>>> easily
> >>>>>>>>>>> change
> >>>>>>>>>>>> it
> >>>>>>>>>>>>>>>>> as
> >>>>>>>>>>>>>>>>>> it
> >>>>>>>>>>>>>>>>>>>>> would break compatibility -- thus, for consistency
> >>>>>> reasons
> >>>>>>>>>> we
> >>>>>>>>>>>>>> might
> >>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>> want to do this
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>  - for better understand of the impact of the KIP,
> >>>>> it
> >>>>>> would
> >>>>>>>>>>> be
> >>>>>>>>>>>>>>>>> quite
> >>>>>>>>>>>>>>>>>>>>> helpful if you would list all method names that are
> >>>>>>>>>> affected
> >>>>>>>>>>> in
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>> KIP
> >>>>>>>>>>>>>>>>>>>>> (ie, list all newly added overloads)
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On 5/31/18 6:40 PM, Guozhang Wang wrote:
> >>>>>>>>>>>>>>>>>>>>>> Hi Florian,
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Re 1: I think changing the KStreamImpl /
> >>>>> KTableImpl to
> >>>>>>>>>> allow
> >>>>>>>>>>>>>>>>>>> modifying
> >>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>> processor name after the operator is fine as long
> >>>>> as we
> >>>>>> do
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>> check
> >>>>>>>>>>>>>>>>>>>>> again
> >>>>>>>>>>>>>>>>>>>>>> when modifying that. In fact, we are having some
> >>>>>> topology
> >>>>>>>>>>>>>>>>>>> optimization
> >>>>>>>>>>>>>>>>>>>>>> going on which may modify processor names in the
> >>>>> final
> >>>>>>>>>>>> topology
> >>>>>>>>>>>>>>>>>>>> anyways (
> >>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/4983).
> >>>>>> Semantically
> >>>>>>>>>> I
> >>>>>>>>>>>>>> think
> >>>>>>>>>>>>>>>>>> it
> >>>>>>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>>>>> easier to understand to developers than "deciding
> >>>>> the
> >>>>>>>>>>>> processor
> >>>>>>>>>>>>>>>>>> name
> >>>>>>>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>>>>>>> the next operator".
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Re 2: Yeah I'm thinking that for operators that
> >>>>>> translates
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>> multiple
> >>>>>>>>>>>>>>>>>>>>>> processor names, we can still use the provided
> >>>>> "hint" to
> >>>>>>>>>>> name
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>> processor
> >>>>>>>>>>>>>>>>>>>>>> names, e.g. for Joins we can name them as
> >>>>>> `join-foo-this`
> >>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>> `join-foo-that` etc if user calls `as("foo")`.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Re 3: The motivation I had about removing the
> >>>>> suffix is
> >>>>>>>>>> that
> >>>>>>>>>>>> it
> >>>>>>>>>>>>>>>>> has
> >>>>>>>>>>>>>>>>>>>> huge
> >>>>>>>>>>>>>>>>>>>>>> restrictions on topology compatibilities: consider
> >>>>> if
> >>>>>> user
> >>>>>>>>>>>> code
> >>>>>>>>>>>>>>>>>>> added a
> >>>>>>>>>>>>>>>>>>>>> new
> >>>>>>>>>>>>>>>>>>>>>> operator, or library does some optimization to
> >>>>> remove
> >>>>>> some
> >>>>>>>>>>>>>>>>>> operators,
> >>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>> suffix indexing may be changed for a large amount
> >>>>> of the
> >>>>>>>>>>>>>>>>> processor
> >>>>>>>>>>>>>>>>>>>> names:
> >>>>>>>>>>>>>>>>>>>>>> this will in turn change the internal state store
> >>>>> names,
> >>>>>>>>>> as
> >>>>>>>>>>>> well
> >>>>>>>>>>>>>>>>> as
> >>>>>>>>>>>>>>>>>>>>>> internal topic names as well, making the new
> >>>>> application
> >>>>>>>>>>>>>> topology
> >>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>>>>> incompatible with the ones. One rationale I had
> >>>>> about
> >>>>>> this
> >>>>>>>>>>> KIP
> >>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>>>>>>>>> aligned this effort, moving forward we can allow
> >>>>> users
> >>>>>> to
> >>>>>>>>>>>>>>>>> customize
> >>>>>>>>>>>>>>>>>>>>>> internal names so that they can still be reused
> >>>>> even
> >>>>>> with
> >>>>>>>>>>>>>>>>> topology
> >>>>>>>>>>>>>>>>>>>>> changes
> >>>>>>>>>>>>>>>>>>>>>> (e.g. KIP-230), so I think removing the suffix
> >>>>> index
> >>>>>> would
> >>>>>>>>>>> be
> >>>>>>>>>>>>>>>>> more
> >>>>>>>>>>>>>>>>>>>>>> applicable in the long run.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Guozhang
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Thu, May 31, 2018 at 3:08 PM, Florian
> >>>>> Hussonnois <
> >>>>>>>>>>>>>>>>>>>>> fhussonn...@gmail.com>
> >>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Hi ,
> >>>>>>>>>>>>>>>>>>>>>>> Thank you very much for your feedback.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> 1/
> >>>>>>>>>>>>>>>>>>>>>>> I agree that overloading most of the methods with
> >>>>> a
> >>>>>>>>>>> Processed
> >>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>> ideal.
> >>>>>>>>>>>>>>>>>>>>>>> I've started modifying the KStream API and I got
> >>>>> to the
> >>>>>>>>>>> same
> >>>>>>>>>>>>>>>>>>>> conclusion.
> >>>>>>>>>>>>>>>>>>>>>>> Also ading a new method directly to KStreamImpl
> >>>>> and
> >>>>>>>>>>>> KTableImpl
> >>>>>>>>>>>>>>>>>>> classes
> >>>>>>>>>>>>>>>>>>>>>>> seems to be a better option.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> However a processor name cannot be redefined after
> >>>>>>>>>> calling
> >>>>>>>>>>> an
> >>>>>>>>>>>>>>>>>>> operator
> >>>>>>>>>>>>>>>>>>>>> (or
> >>>>>>>>>>>>>>>>>>>>>>> maybe I miss something in the code).
> >>>>>>>>>>>>>>>>>>>>>>> From my understanding, this will only set the
> >>>>> KStream
> >>>>>>>>>> name
> >>>>>>>>>>>>>>>>>> property
> >>>>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>> processor name previsouly added to the topology
> >>>>>> builder -
> >>>>>>>>>>>>>>>>> leading
> >>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>>> InvalidTopology exception.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> So the new method should actually defines the
> >>>>> name of
> >>>>>> the
> >>>>>>>>>>>> next
> >>>>>>>>>>>>>>>>>>>>> processor :
> >>>>>>>>>>>>>>>>>>>>>>> Below is an example :
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> *stream.as <http://stream.as
> >>>>>>>>>>>>>>>>>>> (Processed.name("MAPPE_TO_UPPERCASE")*
> >>>>>>>>>>>>>>>>>>>>>>> *          .map( (k, v) -> KeyValue.pair(k,
> >>>>>>>>>>>> v.toUpperCase()))*
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> I think this approach could solve the cases for
> >>>>> methods
> >>>>>>>>>>>>>>>>> returning
> >>>>>>>>>>>>>>>>>>>> void ?
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Regarding this new method we have two possible
> >>>>>>>>>>>> implementations
> >>>>>>>>>>>>>> :
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>    1. Adding a method like : withName(String
> >>>>>>>>>> processorName)
> >>>>>>>>>>>>>>>>>>>>>>>    2. or adding a method accepting an Processed
> >>>>> object
> >>>>>> :
> >>>>>>>>>>>>>>>>>>>> as(Processed).
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> I think solution 2. is preferable as the Processed
> >>>>>> class
> >>>>>>>>>>>> could
> >>>>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>>>> enriched
> >>>>>>>>>>>>>>>>>>>>>>> further (in futur).
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> 2/
> >>>>>>>>>>>>>>>>>>>>>>> As Guozhang said some operators add internal
> >>>>>> processors.
> >>>>>>>>>>>>>>>>>>>>>>> For example the branch() method create one
> >>>>>> KStreamBranch
> >>>>>>>>>>>>>>>>> processor
> >>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>> route
> >>>>>>>>>>>>>>>>>>>>>>> records and one KStreamPassThrough processor for
> >>>>> each
> >>>>>>>>>>> branch.
> >>>>>>>>>>>>>>>>>>>>>>> In that situation only the parent processor can be
> >>>>>> named.
> >>>>>>>>>>> For
> >>>>>>>>>>>>>>>>>>> children
> >>>>>>>>>>>>>>>>>>>>>>> processors we could keep the current behaviour
> >>>>> that
> >>>>>> add a
> >>>>>>>>>>>>>> suffix
> >>>>>>>>>>>>>>>>>>> (i.e
> >>>>>>>>>>>>>>>>>>>>>>> KSTREAM-BRANCHCHILD-)
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> This also the case for the join() method that
> >>>>> result to
> >>>>>>>>>>>> adding
> >>>>>>>>>>>>>>>>>>>> multiple
> >>>>>>>>>>>>>>>>>>>>>>> processors to the topology (windowing, left/right
> >>>>> joins
> >>>>>>>>>>> and a
> >>>>>>>>>>>>>>>>>> merge
> >>>>>>>>>>>>>>>>>>>>>>> processor).
> >>>>>>>>>>>>>>>>>>>>>>> I think, like for the branch method users could
> >>>>> only
> >>>>>>>>>>> define a
> >>>>>>>>>>>>>>>>>>>> processor
> >>>>>>>>>>>>>>>>>>>>>>> name prefix.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> 3/
> >>>>>>>>>>>>>>>>>>>>>>> I think we should  still added a suffix like
> >>>>>>>>>> "-0000000000"
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>> processor
> >>>>>>>>>>>>>>>>>>>>>>> name and enforce uppercases as this will keep some
> >>>>>>>>>>>> consistency
> >>>>>>>>>>>>>>>>>> with
> >>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>> ones generated by the API.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> 4/
> >>>>>>>>>>>>>>>>>>>>>>> Yes, the KTable interface should be modified like
> >>>>>> KStream
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> allow
> >>>>>>>>>>>>>>>>>>>>> custom
> >>>>>>>>>>>>>>>>>>>>>>> processor names definition.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Le jeu. 31 mai 2018 à 19:18, Damian Guy <
> >>>>>>>>>>>> damian....@gmail.com>
> >>>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>>>> écrit
> >>>>>>>>>>>>>>>>>>>>> :
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Hi Florian,
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP. What about KTable and other
> >>>>> DSL
> >>>>>>>>>>>>>> interfaces?
> >>>>>>>>>>>>>>>>>>> Will
> >>>>>>>>>>>>>>>>>>>>> they
> >>>>>>>>>>>>>>>>>>>>>>>> not want to be able to do the same thing?
> >>>>>>>>>>>>>>>>>>>>>>>> It would be good to see a complete set of the
> >>>>> public
> >>>>>> API
> >>>>>>>>>>>>>>>>> changes.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>>>>>>>> Damian
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 May 2018 at 19:45 Guozhang Wang <
> >>>>>>>>>>>>>> wangg...@gmail.com
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Hello Florian,
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP. I have some meta feedbacks
> >>>>> on the
> >>>>>>>>>>>>>>>>> proposal:
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> 1. You mentioned that this `Processed` object
> >>>>> will be
> >>>>>>>>>>> added
> >>>>>>>>>>>>>>>>> to a
> >>>>>>>>>>>>>>>>>>> new
> >>>>>>>>>>>>>>>>>>>>>>>>> overloaded variant of all the stateless
> >>>>> operators,
> >>>>>> what
> >>>>>>>>>>>> about
> >>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>> stateful
> >>>>>>>>>>>>>>>>>>>>>>>>> operators? Would like to hear your opinions if
> >>>>> you
> >>>>>> have
> >>>>>>>>>>>>>>>>> thought
> >>>>>>>>>>>>>>>>>>>> about
> >>>>>>>>>>>>>>>>>>>>>>>> that:
> >>>>>>>>>>>>>>>>>>>>>>>>> note for stateful operators they will usually be
> >>>>>> mapped
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>> multiple
> >>>>>>>>>>>>>>>>>>>>>>>>> processor node names, so we probably need to
> >>>>> come up
> >>>>>>>>>> with
> >>>>>>>>>>>>>> some
> >>>>>>>>>>>>>>>>>>> ways
> >>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>>>>> define all their names.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> 2. I share the same concern with Bill as for
> >>>>> adding
> >>>>>>>>>> lots
> >>>>>>>>>>> of
> >>>>>>>>>>>>>>>>> new
> >>>>>>>>>>>>>>>>>>>>>>> overload
> >>>>>>>>>>>>>>>>>>>>>>>>> functions into the stateless operators, as we
> >>>>> have
> >>>>>> just
> >>>>>>>>>>>> spent
> >>>>>>>>>>>>>>>>>>> quite
> >>>>>>>>>>>>>>>>>>>>>>> some
> >>>>>>>>>>>>>>>>>>>>>>>>> effort in trimming them since 1.0.0 release. If
> >>>>> the
> >>>>>>>>>> goal
> >>>>>>>>>>> is
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>> just
> >>>>>>>>>>>>>>>>>>>>>>>> provide
> >>>>>>>>>>>>>>>>>>>>>>>>> some "hints" on the generated processor node
> >>>>> names,
> >>>>>> not
> >>>>>>>>>>>>>>>>> strictly
> >>>>>>>>>>>>>>>>>>>>>>>> enforcing
> >>>>>>>>>>>>>>>>>>>>>>>>> the exact names that to be generated, then how
> >>>>> about
> >>>>>> we
> >>>>>>>>>>>> just
> >>>>>>>>>>>>>>>>>> add a
> >>>>>>>>>>>>>>>>>>>> new
> >>>>>>>>>>>>>>>>>>>>>>>>> function to `KStream` and `KTable` classes like:
> >>>>>>>>>>>>>>>>>> "as(Processed)",
> >>>>>>>>>>>>>>>>>>>> with
> >>>>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>> semantics as "the latest operators that
> >>>>> generate this
> >>>>>>>>>>>> KStream
> >>>>>>>>>>>>>>>>> /
> >>>>>>>>>>>>>>>>>>>> KTable
> >>>>>>>>>>>>>>>>>>>>>>>> will
> >>>>>>>>>>>>>>>>>>>>>>>>> be named accordingly to this hint".
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> The only caveat, is that for all operators like
> >>>>>>>>>>>> `KStream#to`
> >>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>>>>> `KStream#print` that returns void, this
> >>>>> alternative
> >>>>>>>>>> would
> >>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>> work.
> >>>>>>>>>>>>>>>>>>>>> But
> >>>>>>>>>>>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>>>>>>>>>> the current operators:
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> a. KStream#print,
> >>>>>>>>>>>>>>>>>>>>>>>>> b. KStream#foreach,
> >>>>>>>>>>>>>>>>>>>>>>>>> c. KStream#to,
> >>>>>>>>>>>>>>>>>>>>>>>>> d. KStream#process
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> I personally felt that except `KStream#process`
> >>>>> users
> >>>>>>>>>>> would
> >>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>> usually
> >>>>>>>>>>>>>>>>>>>>>>>>> bother to override their names, and for
> >>>>>>>>>> `KStream#process`
> >>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>> could
> >>>>>>>>>>>>>>>>>>>> add
> >>>>>>>>>>>>>>>>>>>>>>> an
> >>>>>>>>>>>>>>>>>>>>>>>>> overload variant with the additional Processed
> >>>>>> object.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> 3. In your example, the processor names are
> >>>>> still
> >>>>>> added
> >>>>>>>>>>>> with
> >>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>>>> suffix
> >>>>>>>>>>>>>>>>>>>>>>>> like
> >>>>>>>>>>>>>>>>>>>>>>>>> "
> >>>>>>>>>>>>>>>>>>>>>>>>> -0000000000", is this intentional? If yes, why
> >>>>> (I
> >>>>>>>>>> thought
> >>>>>>>>>>>>>> with
> >>>>>>>>>>>>>>>>>>> user
> >>>>>>>>>>>>>>>>>>>>>>>>> specified processor name hints we will not add
> >>>>> suffix
> >>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>> distinguish
> >>>>>>>>>>>>>>>>>>>>>>>>> different nodes of the same type any more)?
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Guozhang
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> On Tue, May 29, 2018 at 6:47 AM, Bill Bejeck <
> >>>>>>>>>>>>>>>>> bbej...@gmail.com
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Hi Florian,
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP.  I think being able to add
> >>>>> more
> >>>>>>>>>>>> context
> >>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>>> processor names would be useful.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> I like the idea of adding a
> >>>>> "withProcessorName" to
> >>>>>>>>>>>> Produced,
> >>>>>>>>>>>>>>>>>>>> Consumed
> >>>>>>>>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>>>>>> Joined.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> But instead of adding the "Processed"
> >>>>> parameter to a
> >>>>>>>>>>> large
> >>>>>>>>>>>>>>>>>>>> percentage
> >>>>>>>>>>>>>>>>>>>>>>>> of
> >>>>>>>>>>>>>>>>>>>>>>>>>> the methods, which would result in overloaded
> >>>>>> methods
> >>>>>>>>>>>> (which
> >>>>>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>>>>>>> removed
> >>>>>>>>>>>>>>>>>>>>>>>>>> quite a bit with KIP-182) what do you think of
> >>>>>> adding
> >>>>>>>>>> a
> >>>>>>>>>>>>>>>>> method
> >>>>>>>>>>>>>>>>>>>>>>>>>> to the AbstractStream class "withName(String
> >>>>>>>>>>>>>> processorName)"?
> >>>>>>>>>>>>>>>>>> BTW
> >>>>>>>>>>>>>>>>>>>> I"m
> >>>>>>>>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>>>>>>> married to the method name, it's the best I
> >>>>> can do
> >>>>>> off
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> top
> >>>>>>>>>>>>>>>>>> of
> >>>>>>>>>>>>>>>>>>>> my
> >>>>>>>>>>>>>>>>>>>>>>>>> head.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> For the methods that return void, we'd have to
> >>>>> add a
> >>>>>>>>>>>>>>>>> parameter,
> >>>>>>>>>>>>>>>>>>> but
> >>>>>>>>>>>>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>>>>>>>>>>>>> would at least cut down on the number of
> >>>>> overloaded
> >>>>>>>>>>>> methods
> >>>>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>> API.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Just my 2 cents.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>>>>>>>>>> Bill
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, May 27, 2018 at 4:13 PM, Florian
> >>>>> Hussonnois
> >>>>>> <
> >>>>>>>>>>>>>>>>>>>>>>>>> fhussonn...@gmail.com
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to start a new discussion on
> >>>>> following
> >>>>>>>>>>> KIP :
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>> 307%3A+Allow+to+define+custom+processor+names+with+KStreams+DSL
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> This is still a draft.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Looking forward for your feedback.
> >>>>>>>>>>>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Florian HUSSONNOIS
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>>>>>>>> -- Guozhang
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>>>>>> Florian HUSSONNOIS
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>>> Florian HUSSONNOIS
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>> -- Guozhang
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>> Florian HUSSONNOIS
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> Florian HUSSONNOIS
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> -- Guozhang
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> --
> >>>>>>>>>>>> -- Guozhang
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> -- Guozhang
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Florian HUSSONNOIS
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> -- Guozhang
> >>>>
> >>>
> >>>
> >>> --
> >>> -- Guozhang
> >>>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
> >
>
>

-- 
Florian HUSSONNOIS

Re: [DISCUSS] KIP-307: Allow to define custom processor names with KStreams DSL

Reply via email to