Thanks for the KIP Jorge, I'm in support of your proposal.
1) I do agree with Guozhang's point (1). I think the cleanest approach. I think it's cleaner and better to keep the enforcement internal to the framework than to introduce a public API or context wrapper for processors to use explicitly. 2) I tend to agree with you on this one; I think the equality check ought to be fast enough in practice. 3) I think this is implicit, but should be explicit in the KIP: For the `processValues` API, because the framework sets the key on the context before calling `process` and then unsets it afterwards, there will always be no key set during task puctuation. Therefore, while processors may still register punctuators, they will not be able to forward anything from them. This is functionally equivalent to the existing transformers, by the way, that are also forbidden to forward anything during punctuation. For what it's worth, I think this is the best tradeoff. The only alternative I see is not to place any restriction on forwarded keys at all and just document that if users don't maintain proper partitioning, they'll get undefined behavior. That might be more powerful, but it's also a usability problem. Thanks, -John On Wed, 2022-02-09 at 11:34 +0000, Jorge Esteban Quilcate Otoya wrote: > Thanks Guozhang. > > > Does `ValueProcessorContext` have to be a public API? It seems to me > that this can be completely abstracted away from user interfaces as an > internal class > > Totally agree. No intention to add these as public APIs. Will update the > KIP to reflect this. > > > in the past the rationale for enforcing it at the > interface layer rather than do runtime checks is that it is more efficient. > > I'm not sure how much overhead it may incur to check if the key did not > change: if it is just a reference equality check maybe it's okay. What's > your take on this? > > Agree, reference equality should cover this validation and the overhead > impact should not be meaningful. > Will update the KIP to reflect this as well. > > > On Tue, 8 Feb 2022 at 19:05, Guozhang Wang <wangg...@gmail.com> wrote: > > > Hello Jorge, > > > > Thanks for bringing this KIP! I think this is a nice idea to consider using > > a single overloaded function name for #process, just a couple quick > > questions after reading the proposal: > > > > 1) Does `ValueProcessorContext` have to be a public API? It seems to me > > that this can be completely abstracted away from user interfaces as an > > internal class, and we call the `setKey` before calling user-instantiated > > `process` function, and then in its overridden `forward` it can just check > > if the key changes or not. > > 2) Related to 1) above, in the past the rationale for enforcing it at the > > interface layer rather than do runtime checks is that it is more efficient. > > I'm not sure how much overhead it may incur to check if the key did not > > change: if it is just a reference equality check maybe it's okay. What's > > your take on this? > > > > > > Guozhang > > > > On Tue, Feb 8, 2022 at 5:17 AM Jorge Esteban Quilcate Otoya < > > quilcate.jo...@gmail.com> wrote: > > > > > Hi Dev team, > > > > > > I'd like to start a new discussion thread on Kafka Streams KIP-820: > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-820%3A+Extend+KStream+process+with+new+Processor+API > > > > > > This KIP is aimed to extend the current `KStream#process` API to return > > > output values that could be chained across the topology, as well as > > > introducing a new `KStream#processValues` to use processor while > > validating > > > keys haven't change and repartition is not required. > > > > > > Looking forward to your feedback. > > > > > > Regards, > > > Jorge. > > > > > > > > > -- > > -- Guozhang > >