Re: [DISCUSS] KIP-307: Allow to define custom processor names with KStreams DSL

Guozhang Wang Wed, 30 May 2018 11:45:37 -0700

Hello Florian,

Thanks for the KIP. I have some meta feedback on the proposal:

1. You mentioned that this `Processed` object will be added to a new
overloaded variant of all the stateless operators. What about the stateful
operators? I would like to hear your opinion if you have thought about
that. Note that a stateful operator is usually mapped to multiple processor
nodes, so we probably need to come up with some way to define all of their
names.

2. I share Bill's concern about adding lots of new overloaded functions to
the stateless operators, as we have just spent quite some effort trimming
them since the 1.0.0 release. If the goal is just to provide some "hints"
for the generated processor node names, not to strictly enforce the exact
names to be generated, then how about we just add a new function to the
`KStream` and `KTable` classes, like "as(Processed)", with the semantics
"the latest operator that generated this KStream / KTable will be named
according to this hint".

The only caveat is that for operators like `KStream#to` and
`KStream#print` that return void, this alternative would not work. But for
the current void-returning operators:

a. KStream#print,
b. KStream#foreach,
c. KStream#to,
d. KStream#process

I personally feel that, except for `KStream#process`, users would not
usually bother to override their names, and for `KStream#process` we could
add an overloaded variant with the additional `Processed` object.
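
A minimal sketch of the "as(Processed)" semantics proposed in point 2,
written against hypothetical stand-in classes rather than the real Kafka
Streams API. `NamingHintSketch`, its toy `Topology` and `Stream`, and the
`Processed.with(...)` factory are all assumptions made for illustration, as
is the generated-name format (a type prefix plus a zero-padded counter):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT the Kafka Streams classes.
final class NamingHintSketch {

    /** Carries a processor name hint (stand-in for the KIP's `Processed`). */
    static final class Processed {
        final String name;
        private Processed(String name) { this.name = name; }
        static Processed with(String name) { return new Processed(name); }
    }

    /** Records processor node names in the order the operators were added. */
    static final class Topology {
        final List<String> nodeNames = new ArrayList<>();
        private int index = 0;

        // Assumed default naming: a type prefix plus a zero-padded counter.
        String addGeneratedNode(String prefix) {
            String name = prefix + String.format("%010d", index++);
            nodeNames.add(name);
            return name;
        }

        void rename(String oldName, String newName) {
            nodeNames.set(nodeNames.indexOf(oldName), newName);
        }
    }

    /** Toy stream that only remembers which node produced it. */
    static final class Stream {
        private final Topology topology;
        private final String lastNode;

        Stream(Topology topology, String lastNode) {
            this.topology = topology;
            this.lastNode = lastNode;
        }

        Stream filter() {
            return new Stream(topology, topology.addGeneratedNode("KSTREAM-FILTER-"));
        }

        Stream mapValues() {
            return new Stream(topology, topology.addGeneratedNode("KSTREAM-MAPVALUES-"));
        }

        // Proposed alternative: name the latest operator that generated this stream.
        Stream as(Processed hint) {
            topology.rename(lastNode, hint.name);
            return new Stream(topology, hint.name);
        }

        // Void-returning operators cannot be followed by as(...), hence an overload.
        void process(Processed hint) {
            topology.nodeNames.add(hint.name);
        }
    }

    public static void main(String[] args) {
        Topology topology = new Topology();
        Stream source = new Stream(topology, topology.addGeneratedNode("KSTREAM-SOURCE-"));
        source.filter().as(Processed.with("drop-empty-values"))
              .mapValues()
              .process(Processed.with("audit-processor"));
        System.out.println(topology.nodeNames);
    }
}
```

In this sketch only the operators followed by `as(...)` or given a
`Processed` argument get a user-chosen name; every other operator keeps its
generated name, so no overloads are needed on the stateless operators.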

3. In your example, the processor names still get a suffix like
"-0000000000" appended. Is this intentional? If yes, why? (I thought that
with user-specified processor name hints we would no longer add a suffix to
distinguish different nodes of the same type.)
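
To make the two behaviours in point 3 concrete, here is a small sketch
contrasting them. The class `ProcessorNameSketch`, both method names, and
the exact "hint-plus-dash-plus-counter" format are assumptions for
illustration only, not what the KIP or Kafka Streams actually implements:

```java
// Hypothetical helper for illustration; not part of Kafka Streams.
final class ProcessorNameSketch {

    // Zero-padded, ten-digit counter, matching the "-0000000000" style suffix.
    static String suffix(int index) {
        return String.format("%010d", index);
    }

    // Behaviour questioned above: the user hint still gets an index suffix.
    static String suffixedName(String defaultPrefix, String hint, int index) {
        return hint == null
            ? defaultPrefix + suffix(index)
            : hint + "-" + suffix(index);
    }

    // Alternative: a user hint is used verbatim; only generated names get a suffix.
    static String verbatimName(String defaultPrefix, String hint, int index) {
        return hint == null
            ? defaultPrefix + suffix(index)
            : hint;
    }

    public static void main(String[] args) {
        System.out.println(suffixedName("KSTREAM-FILTER-", "my-filter", 1));
        System.out.println(verbatimName("KSTREAM-FILTER-", "my-filter", 1));
        System.out.println(verbatimName("KSTREAM-FILTER-", null, 1));
    }
}
```

With the verbatim variant, keeping names unique becomes the user's
responsibility, since the counter no longer distinguishes two nodes that
were given the same hint.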

Guozhang

On Tue, May 29, 2018 at 6:47 AM, Bill Bejeck <bbej...@gmail.com> wrote:

> Hi Florian,
>
> Thanks for the KIP.  I think being able to add more context to the
> processor names would be useful.
>
> I like the idea of adding a "withProcessorName" to Produced, Consumed and
> Joined.
>
> But instead of adding the "Processed" parameter to a large percentage of
> the methods, which would result in overloaded methods (which we removed
> quite a bit with KIP-182) what do you think of adding a method
> to the AbstractStream class "withName(String processorName)"? BTW I'm not
> married to the method name, it's the best I can do off the top of my head.
>
> For the methods that return void, we'd have to add a parameter, but that
> would at least cut down on the number of overloaded methods in the API.
>
> Just my 2 cents.
>
> Thanks,
> Bill
>
> On Sun, May 27, 2018 at 4:13 PM, Florian Hussonnois <fhussonn...@gmail.com> wrote:
>
> > Hi,
> >
> > I would like to start a new discussion on the following KIP:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-307%3A+Allow+to+define+custom+processor+names+with+KStreams+DSL
> >
> > This is still a draft.
> >
> > Looking forward to your feedback.
> > --
> > Florian HUSSONNOIS
> >
>

-- 
-- Guozhang
