My apologies, this feedback was intended for KIP-634. -John
On Tue, Feb 15, 2022, at 13:15, John Roesler wrote: > Thanks for the update, Jorge, > > I've just looked over the KIP again. Just one more small > concern: > > 5) We can't just change the type of Record#headers() to a > new fully qualified type. That would be a source- > incompatible breaking change for users. > > Out options are: > * Deprecate the existing method and create a new one with > the new type > * If the existing Headers is "not great but ok", then maybe > we leave it alone. > > Thanks, > -John > > On Mon, 2022-02-14 at 13:58 -0600, Paul Whalen wrote: >> No specific comments, but I just wanted to mention I like the direction of >> the KIP. My team is a big user of "transform" methods because of the >> ability to chain them, and I have always found the terminology challenging >> to explain alongside "process". It felt like one concept with two names. >> So moving towards a single API that is powerful enough to handle both use >> cases seems absolutely correct to me. >> >> Paul >> >> On Mon, Feb 14, 2022 at 1:12 PM Jorge Esteban Quilcate Otoya < >> quilcate.jo...@gmail.com> wrote: >> >> > Got it. Thanks John, this make sense. >> > >> > I've updated the KIP to include the deprecation of: >> > >> > - KStream#transform >> > - KStream#transformValues >> > - KStream#flatTransform >> > - KStream#flatTransformValues >> > >> > >> > >> > On Fri, 11 Feb 2022 at 15:16, John Roesler <vvcep...@apache.org> wrote: >> > >> > > Thanks, Jorge! >> > > >> > > I think it’ll be better to keep this KIP focused on KStream methods only. >> > > I suspect that the KTable methods may be more complicated than just that >> > > proposed replacement, but it’ll also be easier to consider that question >> > in >> > > isolation. >> > > >> > > The nice thing about just deprecating the KStream methods and not the >> > > Transform* interfaces is that you can keep your proposal just scoped to >> > > KStream and not have any consequences for the rest of the DSL. >> > > >> > > Thanks again, >> > > John >> > > >> > > On Fri, Feb 11, 2022, at 06:43, Jorge Esteban Quilcate Otoya wrote: >> > > > Thanks, John. >> > > > >> > > > > 4) I agree that we shouldn't deprecate the Transformer* >> > > > classes, but do you think we should deprecate the >> > > > KStream#transform* methods? I'm curious if there's any >> > > > remaining reason to have those methods, or if your KIP >> > > > completely obviates them. >> > > > >> > > > Good catch. >> > > > I considered that deprecating `Transformer*` and `transform*` would go >> > > hand >> > > > in hand — maybe it happened similarly with old `Processor` and >> > `process`? >> > > > Though deprecating only `transform*` operations could be a better >> > signal >> > > > for users than non deprecating anything at all and pave the way to it's >> > > > deprecation. >> > > > >> > > > Should this deprecation also consider including >> > `KTable#transformValues`? >> > > > The approach proposed on the KIP: >> > > > `ktable.toStream().processValues().toTable()` seems fair to me, though >> > I >> > > > will have to test it further. >> > > > >> > > > I'm happy to update the KIP if there's some consensus around this. >> > > > Will add the deprecation notes these days and wait for any additional >> > > > feedback on this topic before wrapping up the KIP. >> > > > >> > > > >> > > > On Fri, 11 Feb 2022 at 04:03, John Roesler <vvcep...@apache.org> >> > wrote: >> > > > >> > > > > Thanks for the update, Jorge! >> > > > > >> > > > > I just read over the KIP again, and I'm in support. One more >> > > > > question came up for me, though: >> > > > > >> > > > > 4) I agree that we shouldn't deprecate the Transformer* >> > > > > classes, but do you think we should deprecate the >> > > > > KStream#transform* methods? I'm curious if there's any >> > > > > remaining reason to have those methods, or if your KIP >> > > > > completely obviates them. >> > > > > >> > > > > Thanks, >> > > > > -John >> > > > > >> > > > > On Thu, 2022-02-10 at 21:32 +0000, Jorge Esteban Quilcate >> > > > > Otoya wrote: >> > > > > > Thank you both for your feedback! >> > > > > > >> > > > > > I have added the following note on punctuation: >> > > > > > >> > > > > > ``` >> > > > > > NOTE: The key validation can be defined when processing the >> > > > > > message. >> > > > > > Though, with punctuations it won't be possible to define the key >> > > > > > for >> > > > > > validation before forwarding, therefore it won't be possible to >> > > forward >> > > > > > from punctuation. >> > > > > > This is similar behavior to how `ValueTransformer`s behave at the >> > > moment. >> > > > > > ``` >> > > > > > >> > > > > > Also make it explicit also that we are going to apply referencial >> > > > > equality >> > > > > > for key validation. >> > > > > > >> > > > > > I hope this is covering all your feedback, let me know if I'm >> > missing >> > > > > > anything. >> > > > > > >> > > > > > Cheers, >> > > > > > Jorge. >> > > > > > >> > > > > > On Wed, 9 Feb 2022 at 22:19, Guozhang Wang <wangg...@gmail.com> >> > > wrote: >> > > > > > >> > > > > > > I'm +1 on John's point 3) for punctuations. >> > > > > > > >> > > > > > > And I think if people are on the same page that a reference >> > equality >> > > > > check >> > > > > > > per record is not a huge overhead, I think doing that enforcement >> > is >> > > > > better >> > > > > > > than documentations and hand-wavy undefined behaviors. >> > > > > > > >> > > > > > > >> > > > > > > Guozhang >> > > > > > > >> > > > > > > On Wed, Feb 9, 2022 at 11:27 AM John Roesler <vvcep...@apache.org >> > > >> > > > > wrote: >> > > > > > > >> > > > > > > > Thanks for the KIP Jorge, >> > > > > > > > >> > > > > > > > I'm in support of your proposal. >> > > > > > > > >> > > > > > > > 1) >> > > > > > > > I do agree with Guozhang's point (1). I think the cleanest >> > > > > > > > approach. I think it's cleaner and better to keep the >> > > > > > > > enforcement internal to the framework than to introduce a >> > > > > > > > public API or context wrapper for processors to use >> > > > > > > > explicitly. >> > > > > > > > >> > > > > > > > 2) I tend to agree with you on this one; I think the >> > > > > > > > equality check ought to be fast enough in practice. >> > > > > > > > >> > > > > > > > 3) I think this is implicit, but should be explicit in the >> > > > > > > > KIP: For the `processValues` API, because the framework sets >> > > > > > > > the key on the context before calling `process` and then >> > > > > > > > unsets it afterwards, there will always be no key set during >> > > > > > > > task puctuation. Therefore, while processors may still >> > > > > > > > register punctuators, they will not be able to forward >> > > > > > > > anything from them. >> > > > > > > > >> > > > > > > > This is functionally equivalent to the existing >> > > > > > > > transformers, by the way, that are also forbidden to forward >> > > > > > > > anything during punctuation. >> > > > > > > > >> > > > > > > > For what it's worth, I think this is the best tradeoff. >> > > > > > > > >> > > > > > > > The only alternative I see is not to place any restriction >> > > > > > > > on forwarded keys at all and just document that if users >> > > > > > > > don't maintain proper partitioning, they'll get undefined >> > > > > > > > behavior. That might be more powerful, but it's also a >> > > > > > > > usability problem. >> > > > > > > > >> > > > > > > > Thanks, >> > > > > > > > -John >> > > > > > > > >> > > > > > > > On Wed, 2022-02-09 at 11:34 +0000, Jorge Esteban Quilcate >> > > > > > > > Otoya wrote: >> > > > > > > > > Thanks Guozhang. >> > > > > > > > > >> > > > > > > > > > Does `ValueProcessorContext` have to be a public API? It >> > seems >> > > > > to me >> > > > > > > > > that this can be completely abstracted away from user >> > interfaces >> > > > > as an >> > > > > > > > > internal class >> > > > > > > > > >> > > > > > > > > Totally agree. No intention to add these as public APIs. Will >> > > > > update >> > > > > > > the >> > > > > > > > > KIP to reflect this. >> > > > > > > > > >> > > > > > > > > > in the past the rationale for enforcing it at the >> > > > > > > > > interface layer rather than do runtime checks is that it is >> > more >> > > > > > > > efficient. >> > > > > > > > > > I'm not sure how much overhead it may incur to check if the >> > > key >> > > > > did >> > > > > > > not >> > > > > > > > > change: if it is just a reference equality check maybe it's >> > > okay. >> > > > > > > What's >> > > > > > > > > your take on this? >> > > > > > > > > >> > > > > > > > > Agree, reference equality should cover this validation and >> > > > > > > > > the >> > > > > overhead >> > > > > > > > > impact should not be meaningful. >> > > > > > > > > Will update the KIP to reflect this as well. >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > On Tue, 8 Feb 2022 at 19:05, Guozhang Wang < >> > wangg...@gmail.com> >> > > > > wrote: >> > > > > > > > > >> > > > > > > > > > Hello Jorge, >> > > > > > > > > > >> > > > > > > > > > Thanks for bringing this KIP! I think this is a nice idea >> > > > > > > > > > to >> > > > > consider >> > > > > > > > using >> > > > > > > > > > a single overloaded function name for #process, just a >> > couple >> > > > > quick >> > > > > > > > > > questions after reading the proposal: >> > > > > > > > > > >> > > > > > > > > > 1) Does `ValueProcessorContext` have to be a public API? It >> > > > > seems to >> > > > > > > me >> > > > > > > > > > that this can be completely abstracted away from user >> > > interfaces >> > > > > as >> > > > > > > an >> > > > > > > > > > internal class, and we call the `setKey` before calling >> > > > > > > > user-instantiated >> > > > > > > > > > `process` function, and then in its overridden `forward` it >> > > can >> > > > > just >> > > > > > > > check >> > > > > > > > > > if the key changes or not. >> > > > > > > > > > 2) Related to 1) above, in the past the rationale for >> > > enforcing >> > > > > it at >> > > > > > > > the >> > > > > > > > > > interface layer rather than do runtime checks is that it is >> > > more >> > > > > > > > efficient. >> > > > > > > > > > I'm not sure how much overhead it may incur to check if the >> > > key >> > > > > did >> > > > > > > not >> > > > > > > > > > change: if it is just a reference equality check maybe it's >> > > okay. >> > > > > > > > What's >> > > > > > > > > > your take on this? >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > Guozhang >> > > > > > > > > > >> > > > > > > > > > On Tue, Feb 8, 2022 at 5:17 AM Jorge Esteban Quilcate Otoya >> > < >> > > > > > > > > > quilcate.jo...@gmail.com> wrote: >> > > > > > > > > > >> > > > > > > > > > > Hi Dev team, >> > > > > > > > > > > >> > > > > > > > > > > I'd like to start a new discussion thread on Kafka >> > > > > > > > > > > Streams >> > > > > KIP-820: >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > >> > > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-820%3A+Extend+KStream+process+with+new+Processor+API >> > > > > > > > > > > >> > > > > > > > > > > This KIP is aimed to extend the current `KStream#process` >> > > API >> > > > > to >> > > > > > > > return >> > > > > > > > > > > output values that could be chained across the topology, >> > as >> > > > > well as >> > > > > > > > > > > introducing a new `KStream#processValues` to use >> > > > > > > > > > > processor >> > > > > while >> > > > > > > > > > validating >> > > > > > > > > > > keys haven't change and repartition is not required. >> > > > > > > > > > > >> > > > > > > > > > > Looking forward to your feedback. >> > > > > > > > > > > >> > > > > > > > > > > Regards, >> > > > > > > > > > > Jorge. >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > -- >> > > > > > > > > > -- Guozhang >> > > > > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > > -- >> > > > > > > -- Guozhang >> > > > > > > >> > > > > >> > > > > >> > > >> >