Thanks, Guozhang.

> Compared with reference checks and runtime exceptions for those who
> mistakenly change the key, I think that enforcing everyone to `setValue`
> may incur more costs..

This is a fair point. I agree that this may incur in more costs than key
checking.

Will hold for more feedback, but if we agree I will update the KIP during
the week.

Cheers,
Jorge.


On Sun, 27 Feb 2022 at 00:50, Guozhang Wang <wangg...@gmail.com> wrote:

> Hello folks,
>
> Regarding the outstanding question, I'm actually a bit leaning towards the
> second option since that `withKey()` itself always creates a new Record
> object. This has a few implications:
>
> * That we would have to discard the previous Record object to be GC'ed with
> the new object --- note in practice, processing value does not mean you'd
> have to replace the whole value with `withValue`, but maybe you just need
> to manipulate some fields of the value object if it is a JSon / etc.
> * It may become an obstacle for further runtime optimizations e.g. skip
> serdes and interpret processing as direct byte manipulations.
>
> Compared with reference checks and runtime exceptions for those who
> mistakenly change the key, I think that enforcing everyone to `setValue`
> may incur more costs..
>
> Guozhang
>
> On Fri, Feb 25, 2022 at 12:54 PM Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > Hi all,
> >
> > Appreciate very much all the great feedback received so far.
> >
> > > After applying that interface change, I don't see any syntax
> > errors in our tests (which use those methods), and the
> > StreamBuilderTest still passes for me.
> >
> > This is awesome John, thank you for your efforts here.
> >
> > > Jorge, do you mind clarifying these points in the Compatibility section
> > of your KIP?
> >
> > +1. I have clarified the impact of changing the return type in the KIP.
> >
> > > I think the other outstanding question for you is whether
> > > the output key type for processValues should be K or Void.
> > >
> > > One thing I realized belatedly was that if we do set it to
> > > Void, then users will actually have to override the key when
> > > forwarding, like `record.withKey(null)`, whereas if we keep
> > > it is K, all users have to do is not touch the key at all.
> >
> > This is a tricky one.
> > On one hand, with Void type for key output, we force the users to cast to
> > Void and change the key to null,
> > though this can be documented on the API, so the users are aware of the
> > peculiarity of forwarding within `processValues`.
> > On the other hand, keeping the key type as output doesn't _require_ to do
> > any change of keys,
> > but this could lead to key-checking runtime exceptions.
> >
> > I slightly inclined myself for the first option and change the type to
> > `Void`.
> > This will impose a bit of pain on the users to gain some type-safety and
> > avoid runtime exceptions.
> > We can justify this requirement as a way to prove that the key hasn't
> > changed.
> >
> > Btw, thanks for this idea Matthias!
> >
> >
> > On Fri, 25 Feb 2022 at 17:10, John Roesler <vvcep...@apache.org> wrote:
> >
> > > Oh, one more thing Jorge,
> > >
> > > I think the other outstanding question for you is whether
> > > the output key type for processValues should be K or Void. I
> > > get the impression that all of us don't feel too strongly
> > > about it, so I think the ball is in your court to consider
> > > everyone's points and make a call (with justification).
> > >
> > > One thing I realized belatedly was that if we do set it to
> > > Void, then users will actually have to override the key when
> > > forwarding, like `record.withKey(null)`, whereas if we keep
> > > it as K, all users have to do is not touch the key at all.
> > >
> > > Thanks,
> > > -John
> > >
> > > On Fri, 2022-02-25 at 11:07 -0600, John Roesler wrote:
> > > > Hello all,
> > > >
> > > > I'll chime in again in the interest of trying to do a better
> > > > job of keeping KIPs moving forward...
> > > >
> > > > Matthias raised some very good questions about whether the
> > > > change is really source compatible. I just checked out the
> > > > code and make the interface change that Jorge specified in
> > > > the KIP:
> > > >
> > > > > Modified methods:
> > > > >
> > > > > KStream<KOut,VOut> KStream#process(ProcessorSupplier<K, V,
> > > > KOut, VOut> processorSupplier, String... stateStoreNames)
> > > > > KStream<KOut,VOut> KStream#process(ProcessorSupplier<K, V,
> > > > KOut, VOut> processorSupplier, Named named, String...
> > > > stateStoreNames)
> > > >
> > > > After applying that interface change, I don't see any syntax
> > > > errors in our tests (which use those methods), and the
> > > > StreamBuilderTest still passes for me.
> > > >
> > > > The reason is that the existing API already takes a
> > > > ProcessorSupplier<K, V, Void, Void> and is currently a
> > > > `void` return.
> > > >
> > > > After this interface change, all existing usages will just
> > > > bind Void to KOut and Void to VOut. In other words, KOut,
> > > > which is short for `KOut extends Object` is an upper bound
> > > > on Void, so all existing processor suppliers are still valid
> > > > arguments.
> > > >
> > > > Because the current methods are void returns, no existing
> > > > code could be assigning the result to any variable, so
> > > > moving from a void return to a typed return is always
> > > > compatible.
> > > >
> > > > Jorge, do you mind clarifying these points in the
> > > > Compatibility section of your KIP?
> > > >
> > > > Thanks,
> > > > -John
> > > >
> > > >
> > > > On Wed, 2022-02-23 at 15:07 -0800, Matthias J. Sax wrote:
> > > > > For this KIP, I also see the value. I was just trying to make a
> step
> > > > > back and ask if it's a good short term solution. If we believe it
> is,
> > > I
> > > > > am fine with it.
> > > > >
> > > > > (I am more worried about the header's KIP...)
> > > > >
> > > > > Btw: I am still wondering if we can change existing `process()` as
> > > > > proposed in the KIP? It the propose change source compatible? (It's
> > > for
> > > > > sure not binary compatible, but this seems fine -- I don't think we
> > > > > guarantee binary compatibility).
> > > > >
> > > > > Btw: would be good to clarify what is changes for process() --
> should
> > > be
> > > > > return type change from `void` to `KStream<KOut, VOut>` as well as
> > > > > change of `ProcessorSupplier` generic types (output types change
> from
> > > > > `Void` to `KOut` and `VOut`?
> > > > >
> > > > >
> > > > >
> > > > > -Matthias
> > > > >
> > > > > On 2/23/22 11:32 AM, Guozhang Wang wrote:
> > > > > > Hi folks,
> > > > > >
> > > > > > I agree with John that this KIP by itself could be a good
> > > improvement, and
> > > > > > I feel it aligns well with the eventual DSL 2.0 proposal so we do
> > > not need
> > > > > > to hold it until later.
> > > > > >
> > > > > > Regarding the last point (i.e. whether we should do enforcement
> > with
> > > a new
> > > > > > interface), here's my 2c: in the past we introduced public
> > > > > > `ValueTransfomer/etc` for two purposes, 1) to enforce the key is
> > not
> > > > > > modifiable, 2) to indicate inside the library's topology builder
> > > itself
> > > > > > that since the key is not modified, the direct downstream does
> not
> > > need to
> > > > > > inject a repartition stage. I think we are more or less on the
> same
> > > page
> > > > > > that for purpose 1), doing runtime check could be sufficient; as
> > for
> > > the
> > > > > > purpose of 2), as for this KIP itself I think it is similar to
> what
> > > we have
> > > > > > (i.e. just base on the function name "processValue" itself) and
> > > hence are
> > > > > > not sacrificed either. I do not know if
> > > > > > `KStream#processValue(ProcessorSupplier<K, V, Void, VOut>
> > > > > > processorSupplier)` will work, or work better, maybe Jorge could
> do
> > > some
> > > > > > digging and get back to us.
> > > > > >
> > > > > >
> > > > > > On Fri, Feb 18, 2022 at 8:24 AM John Roesler <
> vvcep...@apache.org>
> > > wrote:
> > > > > >
> > > > > > > Hello all,
> > > > > > >
> > > > > > > While I sympathize with Matthias’s desire to wipe the slate
> clean
> > > and
> > > > > > > redesign the dsl with full knowledge of everything we’ve
> learned
> > > in the
> > > > > > > past few years, that would also be a pretty intense project on
> > its
> > > own. It
> > > > > > > seems better to leave that project for someone who is motivated
> > to
> > > take it
> > > > > > > on.
> > > > > > >
> > > > > > > Reading between the lines, it seems like Jorge’s motivation is
> > > more along
> > > > > > > the lines of removing a few specific pain points. I appreciate
> > > Matthias
> > > > > > > extending the offer, but if Jorge doesn’t want to redesign the
> > dsl
> > > right
> > > > > > > now, we’re better off just accepting the work he’s willing to
> do.
> > > > > > >
> > > > > > > Specifically, this KIP is quite a nice improvement. Looking at
> > the
> > > KStream
> > > > > > > interface, roughly half of it is devoted to various flavors of
> > > “transform”,
> > > > > > > which makes it really hard on users to figure out which they
> are
> > > supposed
> > > > > > > to use for what purpose. This kip let us drop all that
> complexity
> > > in favor
> > > > > > > of just two methods, thanks to the fact that we now have the
> > > ability for
> > > > > > > processors to specify their forwarding type.
> > > > > > >
> > > > > > > By the way, I really like Matthias’s suggestion to set the KOut
> > > generic
> > > > > > > bound to Void for processValues. Then, instead of doing an
> > > equality check
> > > > > > > on the key during forward, you’d just set the key back to the
> one
> > > saved
> > > > > > > before processing (with setRecordKey). This is both more
> > efficient
> > > (because
> > > > > > > we don’t have the equality check) and more foolproof for users
> > > (because
> > > > > > > it’s enforced by the compiler instead of the runtime).
> > > > > > >
> > > > > > > Thanks, all!
> > > > > > > -John
> > > > > > >
> > > > > > > On Fri, Feb 18, 2022, at 00:43, Jorge Esteban Quilcate Otoya
> > wrote:
> > > > > > > > On Fri, 18 Feb 2022 at 02:16, Matthias J. Sax <
> > mj...@apache.org>
> > > wrote:
> > > > > > > >
> > > > > > > > > > It probably deserves its own thread to start discussing
> > > ideas.
> > > > > > > > >
> > > > > > > > > Yes. My question was: if we think it's time to do a DSL
> 2.0,
> > > should we
> > > > > > > > > drop this KIP and just fix via DSL 2.0 instead?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > Good question. Would love to hear what others think about
> this.
> > > > > > > >
> > > > > > > > I've stated my position about this here:
> > > > > > > >
> > > > > > > > > For this KIP specifically, I think about it as a
> continuation
> > > from
> > > > > > > > KIP-478. Therefore, it could make sense to have it as part of
> > > the current
> > > > > > > > version of the DSL.
> > > > > > > >
> > > > > > > > I'd even add that if this KIP is adopted, I would not be that
> > > > > > > disappointed
> > > > > > > > if KIP-634 is dropped in favor of a DSL v2.0 as the access to
> > > headers
> > > > > > > > provided by KIP-478's via Record API is much better than
> > previous
> > > > > > > > `.context().headers()`.
> > > > > > > >
> > > > > > > > But happy to reconsider if there is an agreement to focus
> > efforts
> > > > > > > towards a
> > > > > > > > DSL 2.0.
> > > > > > > >
> > > > > > > >
> > > > > > > > > > You're right. I'm not proposing the method signature.
> > > > > > > > >
> > > > > > > > > What signature do you propose? I don't see an update on the
> > > KIP.
> > > > > > > > >
> > > > > > > > > My bad. I have clarified this in the KIP's public
> interfaces
> > > now:
> > > > > > > >
> > > > > > > > ```
> > > > > > > >
> > > > > > > > New methods:
> > > > > > > >
> > > > > > > >     - KStream<K,VOut>
> > KStream#processValues(ProcessorSupplier<K,
> > > V, K,
> > > > > > > VOut>
> > > > > > > >     processorSupplier, String... stateStoreNames)
> > > > > > > >     - KStream<K,VOut>
> > KStream#processValues(ProcessorSupplier<K,
> > > V, K,
> > > > > > > VOut>
> > > > > > > >     processorSupplier, Named named, String...
> stateStoreNames)
> > > > > > > >
> > > > > > > > Modified methods:
> > > > > > > >
> > > > > > > >     - KStream<KOut,VOut> KStream#process(ProcessorSupplier<K,
> > V,
> > > KOut,
> > > > > > > VOut>
> > > > > > > >     processorSupplier, String... stateStoreNames)
> > > > > > > >     - KStream<KOut,VOut> KStream#process(ProcessorSupplier<K,
> > V,
> > > KOut,
> > > > > > > VOut>
> > > > > > > >     processorSupplier, Named named, String...
> stateStoreNames)
> > > > > > > >
> > > > > > > > ```
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > Not sure if I understand how this would look like. Do you
> > > mean
> > > > > > > checking
> > > > > > > > > it
> > > > > > > > > > on the Record itself or somewhere else?
> > > > > > > > >
> > > > > > > > > @Guozhang: I am not worried about the runtime overhead. I
> am
> > > worries
> > > > > > > > > about user experience. It's not clear from the method
> > > signature, that
> > > > > > > > > you are not allowed to change the key, what seems to be bad
> > > API desig.
> > > > > > > > > Even if I understand the desire to keep the API surface
> ares
> > > small -- I
> > > > > > > > > would rather have a compile time enforcement than a runtime
> > > check.
> > > > > > > > >
> > > > > > > > > For example, we have `map()` and `mapValues()` and
> > > `mapValues()` returns
> > > > > > > > > a `Value V` (enforces that that key is not changes) instead
> > of
> > > a
> > > > > > > > > `KeyValue<KIn,VOut>` and we use a runtime check to check
> that
> > > the key is
> > > > > > > > > not changed.
> > > > > > > > >
> > > > > > > > > Naively, could we enforce something similar by setting the
> > > output key
> > > > > > > > > type as `Void`.
> > > > > > > > >
> > > > > > > > >     KStream#processValue(ProcessorSupplier<K, V, Void,
> VOut>
> > > > > > > > > processorSupplier)
> > > > > > > > >
> > > > > > > > > Not sure if this would work or not?
> > > > > > > > >
> > > > > > > > > Or it might be worth to add a new interface,
> > > `ValueProcessorSupplier`
> > > > > > > > > that ensures that the key is not modified?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > This is an important discussion, even more so with a DSL
> v2.0.
> > > > > > > >
> > > > > > > > At the moment, the DSL just flags whether partitioning is
> > > required based
> > > > > > > on
> > > > > > > > the DSL operation. As mentioned, `mapValues()` enforces only
> > the
> > > value
> > > > > > > has
> > > > > > > > changed through the DSL, though the only _guarantee_ we have
> is
> > > that
> > > > > > > Kafka
> > > > > > > > Streams "owns" the implementation, and we can flag this
> > properly.
> > > > > > > >
> > > > > > > > With a hypothetical v2.0 based on Record API, this will be
> > > harder to
> > > > > > > > enforce with the current APIs. e.g. with `mapValues(Record<K,
> > V>
> > > > > > > record)`,
> > > > > > > > nothing would stop users from using
> > > > > > > `record.withKey("needs_partitioning")`.
> > > > > > > >
> > > > > > > > The approach defined on this KIP is similar to what we have
> at
> > > the moment
> > > > > > > > on `ValueTransformer*` where it validates at runtime that the
> > > users are
> > > > > > > not
> > > > > > > > calling `forward` with `ForwardingDisabledProcessorContext`.
> > > > > > > > `ValueProcessorSupplier` is not meant to be a public API.
> Only
> > > to be used
> > > > > > > > internally on `processValues` implementation.
> > > > > > > >
> > > > > > > > At first, `KStream#processValue(ProcessorSupplier<K, V, Void,
> > > VOut>
> > > > > > > > processorSupplier)` won't work as it will require the
> > `Processor`
> > > > > > > > implementation to actually change the key. Will take a deeper
> > > look to
> > > > > > > > validate if this could solve this issue.
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > -Matthias
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On 2/17/22 10:56 AM, Guozhang Wang wrote:
> > > > > > > > > > Regarding the last question Matthias had, I wonder if
> it's
> > > similar to
> > > > > > > my
> > > > > > > > > > first email's point 2) above? I think the rationale is
> > that,
> > > since
> > > > > > > > > > reference checks are relatively very cheap, it is
> > worthwhile
> > > to pay
> > > > > > > this
> > > > > > > > > > extra runtime checks and in return to have a single
> > > consolidated
> > > > > > > > > > ProcessorSupplier programming interface (i.e. we would
> > > eventually
> > > > > > > > > > deprecate ValueTransformerWithKeySupplier).
> > > > > > > > > >
> > > > > > > > > > On Wed, Feb 16, 2022 at 10:57 AM Jorge Esteban Quilcate
> > > Otoya <
> > > > > > > > > > quilcate.jo...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Thank you Matthias, this is great feedback.
> > > > > > > > > > >
> > > > > > > > > > > Adding my comments below.
> > > > > > > > > > >
> > > > > > > > > > > On Wed, 16 Feb 2022 at 00:42, Matthias J. Sax <
> > > mj...@apache.org>
> > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thanks for the KIP.
> > > > > > > > > > > >
> > > > > > > > > > > > In alignment to my reply to KIP-634, I am wondering
> if
> > > we are
> > > > > > > heading
> > > > > > > > > > > > into the right direction, or if we should consider to
> > > re-design the
> > > > > > > DSL
> > > > > > > > > > > > from scratch?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > I'm very excited about the idea of a DLS v2.0. It
> > probably
> > > deserves
> > > > > > > its
> > > > > > > > > own
> > > > > > > > > > > thread to start discussing ideas.
> > > > > > > > > > >
> > > > > > > > > > > For this KIP specifically, I think about it as a
> > > continuation from
> > > > > > > > > KIP-478.
> > > > > > > > > > > Therefore, it could make sense to have it as part of
> the
> > > current
> > > > > > > > > version of
> > > > > > > > > > > the DSL.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Even if we don't do a DSL 2.0 right now, I have some
> > > concerns about
> > > > > > > > > this
> > > > > > > > > > > > KIP:
> > > > > > > > > > > >
> > > > > > > > > > > > (1) I am not sure if the propose changed is backward
> > > compatible? We
> > > > > > > > > > > > currently have:
> > > > > > > > > > > >
> > > > > > > > > > > >      void KStream#process(ProcessorSupplier,
> String...)
> > > > > > > > > > > >
> > > > > > > > > > > > The newly proposed method:
> > > > > > > > > > > >
> > > > > > > > > > > >      KStream KStream#process(ProcessorSupplier)
> > > > > > > > > > > >
> > > > > > > > > > > > seems to be an incompatible change?
> > > > > > > > > > > >
> > > > > > > > > > > > The KIP states:
> > > > > > > > > > > >
> > > > > > > > > > > > > Modified method KStream#process should be
> compatible
> > > with previous
> > > > > > > > > > > > version, that at the moment is fixed to a Void return
> > > type.
> > > > > > > > > > > >
> > > > > > > > > > > > Why is it backward compatible? Having both old and
> new
> > > #process()
> > > > > > > seems
> > > > > > > > > > > > not to be compatible to me? Or are you proposing to
> > > _change_ the
> > > > > > > method
> > > > > > > > > > > > signature (if yes, the `String...` parameter to add a
> > > state store
> > > > > > > seems
> > > > > > > > > > > > to be missing)? For this case, it seems that existing
> > > programs
> > > > > > > would at
> > > > > > > > > > > > least need to be recompiled -- it would only be a
> > source
> > > compatible
> > > > > > > > > > > > change, but not a binary compatible change?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > You're right. I'm not proposing the method signature.
> > > > > > > > > > > Totally agree about compatibility issue. I was only
> > > considering
> > > > > > > source
> > > > > > > > > > > compatibility and was ignorant that changing from void
> to
> > > a specific
> > > > > > > > > type
> > > > > > > > > > > would break binary compatibility.
> > > > > > > > > > > I will update the KIP to reflect this:
> > > > > > > > > > >
> > > > > > > > > > > > Modifications to method KStream#process are source
> > > compatible with
> > > > > > > > > > > previous version, though not binary compatible.
> Therefore
> > > will
> > > > > > > require
> > > > > > > > > > > users to recompile their applications with the latest
> > > version.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > I am also wondering if/how this change related to
> > > KIP-401:
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=97553756
> > > > > > > > > > > >
> > > > > > > > > > > >    From a high level it might not conflict, but I
> > wanted
> > > to double
> > > > > > > > > check?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > Wasn't aware of this KIP, thanks for sharing! I don't
> > > think there is
> > > > > > > > > > > conflict between KIPs, as far as I understand.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > For `KStream#processValues()`, my main concern is the
> > > added runtime
> > > > > > > > > > > > check if the key was modified or not -- it seems to
> > > provide bad user
> > > > > > > > > > > > experience -- enforcing that the key is not modified
> on
> > > an API
> > > > > > > level,
> > > > > > > > > > > > would seem to be much better.
> > > > > > > > > > > >
> > > > > > > > > > > > Last, what is the purpose of `setRecordKey()` and
> > > > > > > `clearRecordKey()`? I
> > > > > > > > > > > > am not sure if I understand their purpose?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > Both methods set/clear the context (current key) to be
> > > used when
> > > > > > > > > checking
> > > > > > > > > > > keys on forward(record) implementation.
> > > > > > > > > > >
> > > > > > > > > > > > enforcing that the key is not modified on an API
> level,
> > > would seem
> > > > > > > to
> > > > > > > > > be
> > > > > > > > > > > much better.
> > > > > > > > > > >
> > > > > > > > > > > Not sure if I understand how this would look like. Do
> you
> > > mean
> > > > > > > checking
> > > > > > > > > it
> > > > > > > > > > > on the Record itself or somewhere else?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > -Matthias
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On 2/15/22 11:53 AM, John Roesler wrote:
> > > > > > > > > > > > > My apologies, this feedback was intended for
> KIP-634.
> > > > > > > > > > > > > -John
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Feb 15, 2022, at 13:15, John Roesler wrote:
> > > > > > > > > > > > > > Thanks for the update, Jorge,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I've just looked over the KIP again. Just one
> more
> > > small
> > > > > > > > > > > > > > concern:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 5) We can't just change the type of
> > Record#headers()
> > > to a
> > > > > > > > > > > > > > new fully qualified type. That would be a source-
> > > > > > > > > > > > > > incompatible breaking change for users.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Out options are:
> > > > > > > > > > > > > > * Deprecate the existing method and create a new
> > one
> > > with
> > > > > > > > > > > > > > the new type
> > > > > > > > > > > > > > * If the existing Headers is "not great but ok",
> > > then maybe
> > > > > > > > > > > > > > we leave it alone.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > -John
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, 2022-02-14 at 13:58 -0600, Paul Whalen
> > wrote:
> > > > > > > > > > > > > > > No specific comments, but I just wanted to
> > mention
> > > I like the
> > > > > > > > > > > > direction of
> > > > > > > > > > > > > > > the KIP.  My team is a big user of "transform"
> > > methods because of
> > > > > > > > > the
> > > > > > > > > > > > > > > ability to chain them, and I have always found
> > the
> > > terminology
> > > > > > > > > > > > challenging
> > > > > > > > > > > > > > > to explain alongside "process".  It felt like
> one
> > > concept with
> > > > > > > two
> > > > > > > > > > > > names.
> > > > > > > > > > > > > > > So moving towards a single API that is powerful
> > > enough to handle
> > > > > > > > > both
> > > > > > > > > > > > use
> > > > > > > > > > > > > > > cases seems absolutely correct to me.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Paul
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Feb 14, 2022 at 1:12 PM Jorge Esteban
> > > Quilcate Otoya <
> > > > > > > > > > > > > > > quilcate.jo...@gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Got it. Thanks John, this make sense.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I've updated the KIP to include the
> deprecation
> > > of:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >       - KStream#transform
> > > > > > > > > > > > > > > >       - KStream#transformValues
> > > > > > > > > > > > > > > >       - KStream#flatTransform
> > > > > > > > > > > > > > > >       - KStream#flatTransformValues
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, 11 Feb 2022 at 15:16, John Roesler <
> > > vvcep...@apache.org
> > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks, Jorge!
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I think it’ll be better to keep this KIP
> > > focused on KStream
> > > > > > > > > methods
> > > > > > > > > > > > only.
> > > > > > > > > > > > > > > > > I suspect that the KTable methods may be
> more
> > > complicated than
> > > > > > > > > just
> > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > proposed replacement, but it’ll also be
> > easier
> > > to consider that
> > > > > > > > > > > > question
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > isolation.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The nice thing about just deprecating the
> > > KStream methods and
> > > > > > > not
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > Transform* interfaces is that you can keep
> > > your proposal just
> > > > > > > > > > > scoped
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > KStream and not have any consequences for
> the
> > > rest of the DSL.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks again,
> > > > > > > > > > > > > > > > > John
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Fri, Feb 11, 2022, at 06:43, Jorge
> Esteban
> > > Quilcate Otoya
> > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > Thanks, John.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 4) I agree that we shouldn't deprecate
> > the
> > > Transformer*
> > > > > > > > > > > > > > > > > > classes, but do you think we should
> > > deprecate the
> > > > > > > > > > > > > > > > > > KStream#transform* methods? I'm curious
> if
> > > there's any
> > > > > > > > > > > > > > > > > > remaining reason to have those methods,
> or
> > > if your KIP
> > > > > > > > > > > > > > > > > > completely obviates them.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Good catch.
> > > > > > > > > > > > > > > > > > I considered that deprecating
> > `Transformer*`
> > > and `transform*`
> > > > > > > > > > > would
> > > > > > > > > > > > go
> > > > > > > > > > > > > > > > > hand
> > > > > > > > > > > > > > > > > > in hand — maybe it happened similarly
> with
> > > old `Processor` and
> > > > > > > > > > > > > > > > `process`?
> > > > > > > > > > > > > > > > > > Though deprecating only `transform*`
> > > operations could be a
> > > > > > > better
> > > > > > > > > > > > > > > > signal
> > > > > > > > > > > > > > > > > > for users than non deprecating anything
> at
> > > all and pave the
> > > > > > > way
> > > > > > > > > to
> > > > > > > > > > > > it's
> > > > > > > > > > > > > > > > > > deprecation.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Should this deprecation also consider
> > > including
> > > > > > > > > > > > > > > > `KTable#transformValues`?
> > > > > > > > > > > > > > > > > > The approach proposed on the KIP:
> > > > > > > > > > > > > > > > > >
> > > `ktable.toStream().processValues().toTable()` seems fair to
> > > > > > > me,
> > > > > > > > > > > > though
> > > > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > > will have to test it further.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I'm happy to update the KIP if there's
> some
> > > consensus around
> > > > > > > > > this.
> > > > > > > > > > > > > > > > > > Will add the deprecation notes these days
> > > and wait for any
> > > > > > > > > > > > additional
> > > > > > > > > > > > > > > > > > feedback on this topic before wrapping up
> > > the KIP.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Fri, 11 Feb 2022 at 04:03, John
> Roesler
> > <
> > > > > > > vvcep...@apache.org>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks for the update, Jorge!
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I just read over the KIP again, and I'm
> > in
> > > support. One more
> > > > > > > > > > > > > > > > > > > question came up for me, though:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 4) I agree that we shouldn't deprecate
> > the
> > > Transformer*
> > > > > > > > > > > > > > > > > > > classes, but do you think we should
> > > deprecate the
> > > > > > > > > > > > > > > > > > > KStream#transform* methods? I'm curious
> > if
> > > there's any
> > > > > > > > > > > > > > > > > > > remaining reason to have those methods,
> > or
> > > if your KIP
> > > > > > > > > > > > > > > > > > > completely obviates them.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > -John
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Thu, 2022-02-10 at 21:32 +0000,
> Jorge
> > > Esteban Quilcate
> > > > > > > > > > > > > > > > > > > Otoya wrote:
> > > > > > > > > > > > > > > > > > > > Thank you both for your feedback!
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I have added the following note on
> > > punctuation:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > > > > > > > NOTE: The key validation can be
> defined
> > > when processing the
> > > > > > > > > > > > message.
> > > > > > > > > > > > > > > > > > > > Though, with punctuations it won't be
> > > possible to define the
> > > > > > > > > key
> > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > validation before forwarding,
> therefore
> > > it won't be
> > > > > > > possible to
> > > > > > > > > > > > > > > > > forward
> > > > > > > > > > > > > > > > > > > > from punctuation.
> > > > > > > > > > > > > > > > > > > > This is similar behavior to how
> > > `ValueTransformer`s behave
> > > > > > > at
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > moment.
> > > > > > > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Also make it explicit also that we
> are
> > > going to apply
> > > > > > > > > > > referencial
> > > > > > > > > > > > > > > > > > > equality
> > > > > > > > > > > > > > > > > > > > for key validation.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I hope this is covering all your
> > > feedback, let me know if
> > > > > > > I'm
> > > > > > > > > > > > > > > > missing
> > > > > > > > > > > > > > > > > > > > anything.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > > > > > Jorge.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Wed, 9 Feb 2022 at 22:19, Guozhang
> > > Wang <
> > > > > > > wangg...@gmail.com
> > > > > > > > > >
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > I'm +1 on John's point 3) for
> > > punctuations.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > And I think if people are on the
> same
> > > page that a reference
> > > > > > > > > > > > > > > > equality
> > > > > > > > > > > > > > > > > > > check
> > > > > > > > > > > > > > > > > > > > > per record is not a huge overhead,
> I
> > > think doing that
> > > > > > > > > > > enforcement
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > better
> > > > > > > > > > > > > > > > > > > > > than documentations and hand-wavy
> > > undefined behaviors.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Guozhang
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Wed, Feb 9, 2022 at 11:27 AM
> John
> > > Roesler <
> > > > > > > > > > > > vvcep...@apache.org
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thanks for the KIP Jorge,
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > I'm in support of your proposal.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 1)
> > > > > > > > > > > > > > > > > > > > > > I do agree with Guozhang's point
> > > (1). I think the cleanest
> > > > > > > > > > > > > > > > > > > > > > approach. I think it's cleaner
> and
> > > better to keep the
> > > > > > > > > > > > > > > > > > > > > > enforcement internal to the
> > > framework than to introduce a
> > > > > > > > > > > > > > > > > > > > > > public API or context wrapper for
> > > processors to use
> > > > > > > > > > > > > > > > > > > > > > explicitly.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 2) I tend to agree with you on
> this
> > > one; I think the
> > > > > > > > > > > > > > > > > > > > > > equality check ought to be fast
> > > enough in practice.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 3) I think this is implicit, but
> > > should be explicit in the
> > > > > > > > > > > > > > > > > > > > > > KIP: For the `processValues` API,
> > > because the framework
> > > > > > > sets
> > > > > > > > > > > > > > > > > > > > > > the key on the context before
> > > calling `process` and then
> > > > > > > > > > > > > > > > > > > > > > unsets it afterwards, there will
> > > always be no key set
> > > > > > > during
> > > > > > > > > > > > > > > > > > > > > > task puctuation. Therefore, while
> > > processors may still
> > > > > > > > > > > > > > > > > > > > > > register punctuators, they will
> not
> > > be able to forward
> > > > > > > > > > > > > > > > > > > > > > anything from them.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > This is functionally equivalent
> to
> > > the existing
> > > > > > > > > > > > > > > > > > > > > > transformers, by the way, that
> are
> > > also forbidden to
> > > > > > > forward
> > > > > > > > > > > > > > > > > > > > > > anything during punctuation.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > For what it's worth, I think this
> > is
> > > the best tradeoff.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > The only alternative I see is not
> > to
> > > place any restriction
> > > > > > > > > > > > > > > > > > > > > > on forwarded keys at all and just
> > > document that if users
> > > > > > > > > > > > > > > > > > > > > > don't maintain proper
> partitioning,
> > > they'll get undefined
> > > > > > > > > > > > > > > > > > > > > > behavior. That might be more
> > > powerful, but it's also a
> > > > > > > > > > > > > > > > > > > > > > usability problem.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > -John
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On Wed, 2022-02-09 at 11:34
> +0000,
> > > Jorge Esteban Quilcate
> > > > > > > > > > > > > > > > > > > > > > Otoya wrote:
> > > > > > > > > > > > > > > > > > > > > > > Thanks Guozhang.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Does `ValueProcessorContext`
> > > have to be a public API? It
> > > > > > > > > > > > > > > > seems
> > > > > > > > > > > > > > > > > > > to me
> > > > > > > > > > > > > > > > > > > > > > > that this can be completely
> > > abstracted away from user
> > > > > > > > > > > > > > > > interfaces
> > > > > > > > > > > > > > > > > > > as an
> > > > > > > > > > > > > > > > > > > > > > > internal class
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Totally agree. No intention to
> > add
> > > these as public APIs.
> > > > > > > > > Will
> > > > > > > > > > > > > > > > > > > update
> > > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > > KIP to reflect this.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > in the past the rationale for
> > > enforcing it at the
> > > > > > > > > > > > > > > > > > > > > > > interface layer rather than do
> > > runtime checks is that it
> > > > > > > is
> > > > > > > > > > > > > > > > more
> > > > > > > > > > > > > > > > > > > > > > efficient.
> > > > > > > > > > > > > > > > > > > > > > > > I'm not sure how much
> overhead
> > > it may incur to check if
> > > > > > > the
> > > > > > > > > > > > > > > > > key
> > > > > > > > > > > > > > > > > > > did
> > > > > > > > > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > > > > > > > > change: if it is just a
> reference
> > > equality check maybe
> > > > > > > it's
> > > > > > > > > > > > > > > > > okay.
> > > > > > > > > > > > > > > > > > > > > What's
> > > > > > > > > > > > > > > > > > > > > > > your take on this?
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Agree, reference equality
> should
> > > cover this validation
> > > > > > > and
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > overhead
> > > > > > > > > > > > > > > > > > > > > > > impact should not be
> meaningful.
> > > > > > > > > > > > > > > > > > > > > > > Will update the KIP to reflect
> > > this as well.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > On Tue, 8 Feb 2022 at 19:05,
> > > Guozhang Wang <
> > > > > > > > > > > > > > > > wangg...@gmail.com>
> > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Hello Jorge,
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Thanks for bringing this
> KIP! I
> > > think this is a nice
> > > > > > > idea
> > > > > > > > > to
> > > > > > > > > > > > > > > > > > > consider
> > > > > > > > > > > > > > > > > > > > > > using
> > > > > > > > > > > > > > > > > > > > > > > > a single overloaded function
> > > name for #process, just a
> > > > > > > > > > > > > > > > couple
> > > > > > > > > > > > > > > > > > > quick
> > > > > > > > > > > > > > > > > > > > > > > > questions after reading the
> > > proposal:
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > 1) Does
> `ValueProcessorContext`
> > > have to be a public
> > > > > > > API? It
> > > > > > > > > > > > > > > > > > > seems to
> > > > > > > > > > > > > > > > > > > > > me
> > > > > > > > > > > > > > > > > > > > > > > > that this can be completely
> > > abstracted away from user
> > > > > > > > > > > > > > > > > interfaces
> > > > > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > > > > > > internal class, and we call
> the
> > > `setKey` before calling
> > > > > > > > > > > > > > > > > > > > > > user-instantiated
> > > > > > > > > > > > > > > > > > > > > > > > `process` function, and then
> in
> > > its overridden
> > > > > > > `forward` it
> > > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > just
> > > > > > > > > > > > > > > > > > > > > > check
> > > > > > > > > > > > > > > > > > > > > > > > if the key changes or not.
> > > > > > > > > > > > > > > > > > > > > > > > 2) Related to 1) above, in
> the
> > > past the rationale for
> > > > > > > > > > > > > > > > > enforcing
> > > > > > > > > > > > > > > > > > > it at
> > > > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > > > interface layer rather than
> do
> > > runtime checks is that
> > > > > > > it is
> > > > > > > > > > > > > > > > > more
> > > > > > > > > > > > > > > > > > > > > > efficient.
> > > > > > > > > > > > > > > > > > > > > > > > I'm not sure how much
> overhead
> > > it may incur to check if
> > > > > > > the
> > > > > > > > > > > > > > > > > key
> > > > > > > > > > > > > > > > > > > did
> > > > > > > > > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > > > > > > > > > change: if it is just a
> > > reference equality check maybe
> > > > > > > it's
> > > > > > > > > > > > > > > > > okay.
> > > > > > > > > > > > > > > > > > > > > > What's
> > > > > > > > > > > > > > > > > > > > > > > > your take on this?
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Guozhang
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > On Tue, Feb 8, 2022 at 5:17
> AM
> > > Jorge Esteban Quilcate
> > > > > > > Otoya
> > > > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > > > > quilcate.jo...@gmail.com>
> > wrote:
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Hi Dev team,
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > I'd like to start a new
> > > discussion thread on Kafka
> > > > > > > Streams
> > > > > > > > > > > > > > > > > > > KIP-820:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-820%3A+Extend+KStream+process+with+new+Processor+API
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > This KIP is aimed to extend
> > > the current
> > > > > > > `KStream#process`
> > > > > > > > > > > > > > > > > API
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > return
> > > > > > > > > > > > > > > > > > > > > > > > > output values that could be
> > > chained across the
> > > > > > > topology,
> > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > > well as
> > > > > > > > > > > > > > > > > > > > > > > > > introducing a new
> > > `KStream#processValues` to use
> > > > > > > processor
> > > > > > > > > > > > > > > > > > > while
> > > > > > > > > > > > > > > > > > > > > > > > validating
> > > > > > > > > > > > > > > > > > > > > > > > > keys haven't change and
> > > repartition is not required.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your
> > > feedback.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > > > > > > > > > > Jorge.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > > > > > > > -- Guozhang
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > > > > -- Guozhang
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > >
> > >
> > >
> >
>
>
> --
> -- Guozhang
>

Reply via email to