[ 
https://issues.apache.org/jira/browse/KAFKA-8396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844095#comment-16844095
 ] 

Guozhang Wang commented on KAFKA-8396:
--------------------------------------

I like the idea.

About the further cleanup (TBD): note that one difference between the return 
value and context.forward is that the former is strongly typed, while the 
latter is not, so technically users can pass anything via that call, and a 
mismatch would only surface as a runtime exception.
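
To make the typing difference concrete, here is a minimal sketch. The 
interfaces below are hypothetical, simplified stand-ins written for this 
example (they are not the real Kafka Streams types), and IntegerSink is an 
assumed downstream stage that expects Integer values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for illustration only; these are NOT the
// real Kafka Streams interfaces.
final class ForwardTypingSketch {

    /** Return-value style: the compiler checks the result against VR. */
    interface ValueTransformer<V, VR> {
        VR transform(V value);
    }

    /** Forward style: K and V are inferred per call, so any argument compiles. */
    interface Context {
        <K, V> void forward(K key, V value);
    }

    /** A downstream stage (assumed for this sketch) that expects Integer values. */
    static final class IntegerSink implements Context {
        final List<Integer> received = new ArrayList<>();

        @Override
        public <K, V> void forward(K key, V value) {
            // This cast is only checked when the call actually happens.
            received.add((Integer) value);
        }
    }

    /** Typed path: returning a String here would be a compile-time error. */
    static int typedLength(String input) {
        ValueTransformer<String, Integer> length = String::length;
        return length.transform(input);
    }

    /** Untyped path: forwarding a String compiles, then fails at run time. */
    static boolean forwardingWrongTypeFailsAtRuntime() {
        IntegerSink sink = new IntegerSink();
        try {
            sink.forward("key", "not an integer");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

In the typed path the mistake cannot be written at all, while in the forward 
path it is only detected when the record reaches the downstream stage.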

> Clean up Transformer API
> ------------------------
>
>                 Key: KAFKA-8396
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8396
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: John Roesler
>            Priority: Major
>              Labels: needs-kip, user-experience
>
> Currently, KStream operators transformValues and flatTransformValues disable 
> context forwarding, and force operators to just return the new values.
> The reason is that we wanted to prevent the key from changing, since the 
> whole point of a `xValues` transformation is that we _do not_ change the key, 
> and hence don't need to repartition.
> However, the chosen mechanism has some drawbacks: The Transform concept is 
> basically a way to plug in a custom Processor within the Streams DSL, but 
> these restrictions make it more like a MapValues with access to the context. 
> For example, even though you can still schedule punctuations, there's no way 
> to forward values as a result of them. So, as a user, it's hard to build a 
> mental model of how to use a TransformValues (because it's not quite a 
> Transformer and not quite a Mapper).
> Also, logically, a Transformer can call forward as much as it wants, so a 
> Transformer and a FlatTransformer are effectively the same thing. Then, we 
> also have TransformValues and FlatTransformValues, which are two more 
> versions of the same thing, just to implement the key restrictions. 
> Internally, some of these can send downstream by returning OR forwarding, and 
> others can only return. It's a lot for users to keep in mind.
> We can clean up this API significantly by just allowing all transformers to 
> call `forward`. In the `Values` case, we can wrap the ProcessorContext in one 
> that checks the key is `equal` to the one that got passed in (i.e., saves a 
> reference and enforces equality with that reference in any call to 
> `forward`). Then, we can actually deprecate the `*ValueTransformer*` 
> interfaces and remove the restriction about calling forward.
> We can consider a further cleanup (TBD) to deprecate the existing Transformer 
> interface entirely, and replace it with one with a `void` return type. Then, 
> the Transform and FlatTransform cases collapse together, and we just need 
> Transform and TransformValues.
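
The key-checking wrapper proposed in the description could look roughly like 
the sketch below. All names here (Context, RecordingContext, 
KeyCheckingContext, setCurrentKey) are hypothetical stand-ins rather than 
Kafka Streams classes. The sketch also assumes that "equal" means 
Object.equals and that a violation raises IllegalStateException; the 
description does not settle either point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-ins for illustration only; NOT the real Kafka Streams API.
final class KeyCheckingSketch {

    interface Context {
        <K, V> void forward(K key, V value);
    }

    /** Stand-in for the real downstream context; it just records what it gets. */
    static final class RecordingContext implements Context {
        final List<String> forwarded = new ArrayList<>();

        @Override
        public <K, V> void forward(K key, V value) {
            forwarded.add(key + "=" + value);
        }
    }

    /** Wrapper that only lets records through if the key is unchanged. */
    static final class KeyCheckingContext implements Context {
        private final Context delegate;
        private Object currentKey;

        KeyCheckingContext(Context delegate) {
            this.delegate = delegate;
        }

        /** Called before each input record is handed to the transformer. */
        void setCurrentKey(Object key) {
            this.currentKey = key;
        }

        @Override
        public <K, V> void forward(K key, V value) {
            // Assumption: equality via equals(); a stricter variant could use ==.
            if (!Objects.equals(currentKey, key)) {
                throw new IllegalStateException(
                    "Key must not change: expected " + currentKey + " but got " + key);
            }
            delegate.forward(key, value);
        }
    }
}
```

With such a wrapper, a `Values` transformer could call forward any number of 
times, including from a punctuation, provided the wrapper still knows which 
key is current at that point.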



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
