One more question: Should we add any of - InitizialierWithKey - ReducerWithKey - ValueTransformerWithKey
To get consistent/complete API, it might be a good idea. Any thoughts? -Matthias On 5/24/17 3:47 PM, Matthias J. Sax wrote: > Jeyhun, > > I was just wondering if you did look into the key-deep-copy idea we > discussed. I am curious to see what the impact might be. > > > -Matthias > > On 5/20/17 2:03 AM, Jeyhun Karimov wrote: >> Hi, >> >> Thanks for your comments. I rethink about including rich functions into >> this KIP. >> I think once we include rich functions in this KIP and then fix >> ProcessorContext in another KIP and incorporate with existing rich >> functions, the code will not be backwards compatible. >> >> I see Damian's and your point more clearly now. >> >> Conclusion: we include only withKey interfaces in this KIP (I updated the >> KIP), I will try also initiate another KIP for rich functions. >> >> Cheers, >> Jeyhun >> >> On Fri, May 19, 2017 at 10:50 PM Matthias J. Sax <matth...@confluent.io> >> wrote: >> >>> With the current KIP, using ValueMapper and ValueMapperWithKey >>> interfaces, RichFunction seems to be an independent add-on. To fix the >>> original issue to allow key access, RichFunctions are not required IMHO. >>> >>> I initially put the RichFunction idea on the table, because I was hoping >>> to get a uniform API. And I think, is was good to consider them -- >>> however, the discussion showed that they are not necessary for key >>> access. Thus, it seems to be better to handle RichFunctions in an own >>> KIP. The ProcessorContext/RecordContext issues seems to be a main >>> challenge for this. And introducing RichFunctions with parameter-less >>> init() method, seem not to help too much. We would get an "intermediate" >>> API that we want to change anyway later on... >>> >>> As you put already much effort into RichFunction, feel free do push this >>> further and start a new KIP (we can do this even in parallel) -- we >>> don't want to slow you down :) But it make the discussion and code >>> review easier, if we separate both IMHO. >>> >>> >>> -Matthias >>> >>> >>> On 5/19/17 2:25 AM, Jeyhun Karimov wrote: >>>> Hi Damian, >>>> >>>> Thanks for your comments. I think providing to users *interface* rather >>>> than *abstract class* should be preferred (Matthias also raised this >>> issue >>>> ), anyway I changed the corresponding parts of KIP. >>>> >>>> Regarding with passing additional contextual information, I think it is a >>>> tradeoff, >>>> 1) first, we fix the context parameter for *init() *method in another PR >>>> and solve Rich functions afterwards >>>> 2) first, we fix the requested issues on jira ([1-3]) with providing >>>> (not-complete) Rich functions and integrate the context parameters to >>> this >>>> afterwards (like improvement) >>>> >>>> To me, the second approach seems more incremental. However you are right, >>>> the names might confuse the users. >>>> >>>> >>>> >>>> [1] https://issues.apache.org/jira/browse/KAFKA-4218 >>>> [2] https://issues.apache.org/jira/browse/KAFKA-4726 >>>> [3] https://issues.apache.org/jira/browse/KAFKA-3745 >>>> >>>> >>>> Cheers, >>>> Jeyhun >>>> >>>> >>>> On Fri, May 19, 2017 at 10:42 AM Damian Guy <damian....@gmail.com> >>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I see you've removed the `ProcessorContext` from the RichFunction which >>> is >>>>> good, but why is it a `RichFunction`? I'd have expected it to pass some >>>>> additional contextual information, i.e., the `RecordContext` that >>> contains >>>>> just the topic, partition, timestamp, offset. I'm ok with it not >>> passing >>>>> this contextual information, but is the naming incorrect? I'm not sure, >>>>> tbh. I'm wondering if we should drop `RichFunctions` until we can do it >>>>> properly with the correct context? >>>>> >>>>> Also, i don't like the abstract classes: RichValueMapper, >>> RichValueJoiner, >>>>> RichInitializer etc. Why can't they not just be interfaces? Generally we >>>>> should provide users with Intefaces to code against, not classes. >>>>> >>>>> Thanks, >>>>> Damian >>>>> >>>>> On Fri, 19 May 2017 at 00:50 Jeyhun Karimov <je.kari...@gmail.com> >>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> Thanks. I initiated the PR as well, to get a better overview. >>>>>> >>>>>> The only reason that I used abstract class instead of interface for >>> Rich >>>>>> functions is that in future if we will have some AbstractRichFunction >>>>>> abstract classes, >>>>>> we can easily extend: >>>>>> >>>>>> public abstract class RichValueMapper<K, V, VR> implements >>>>>> ValueMapperWithKey<K, V, VR>, RichFunction *extends >>>>> AbstractRichFunction*{ >>>>>> } >>>>>> With interfaces we are only limited to interfaces for inheritance. >>>>>> >>>>>> However, I think we can easily change it (from abstract class -> >>>>> interface) >>>>>> if you think interface is a better fit. >>>>>> >>>>>> >>>>>> Cheers, >>>>>> Jeyhun >>>>>> >>>>>> >>>>>> On Fri, May 19, 2017 at 1:00 AM Matthias J. Sax <matth...@confluent.io >>>> >>>>>> wrote: >>>>>> >>>>>>> Thanks for the update and explanations. The KIP is quite improved now >>>>> -- >>>>>>> great job! >>>>>>> >>>>>>> One more question: Why are RichValueMapper etc abstract classes and >>> not >>>>>>> interfaces? >>>>>>> >>>>>>> >>>>>>> Overall, I like the KIP a lot! >>>>>>> >>>>>>> >>>>>>> -Matthias >>>>>>> >>>>>>> >>>>>>> On 5/16/17 7:03 AM, Jeyhun Karimov wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> Thanks for your comments. >>>>>>>> >>>>>>>> I think supporting Lambdas for `withKey` and `AbstractRichFunction` >>>>>>>>> don't go together, as Lambdas are only supported for interfaces >>>>> AFAIK. >>>>>>>> >>>>>>>> >>>>>>>> Maybe I misunderstood your comment. >>>>>>>> *withKey* and and *withOnlyValue* are interfaces. So they don't have >>>>>>> direct >>>>>>>> relation with *AbstractRichFunction*. >>>>>>>> *withKey* and and *withOnlyValue* interfaces have only one method , >>>>> so >>>>>>> we >>>>>>>> can use lambdas. >>>>>>>> Where does the *AbstractRichFunction* comes to play? Inside Rich >>>>>>> functions. >>>>>>>> And we use Rich functions in 2 places: >>>>>>>> >>>>>>>> 1. User doesn't use rich functions. Just regular *withKey* and and >>>>>>>> *withOnlyValue* interfaces(both support lambdas) . We get those >>>>>>> interfaces >>>>>>>> and wrap into Rich function while building the topology, and send to >>>>>>>> Processor. >>>>>>>> 2. User does use rich functions (Rich functions implement *withKey* >>>>>>>> interface). As a result no lamdas here by definition. In this case, >>>>>> while >>>>>>>> building the topology we do a type check if the object type is >>>>>>>> *withKey* or *RichFunction*. >>>>>>>> >>>>>>>> So *AbstractRichFunction* is just syntactic sugar for Rich functions >>>>>> and >>>>>>>> does not affect using lambdas. >>>>>>>> >>>>>>>> Thus, if we want to support Lambdas for `withKey`, we need to have a >>>>>>>>> interface approach like this >>>>>>>>> - RichFunction -> only adding init() and close() >>>>>>>>> - ValueMapper >>>>>>>>> - ValueMapperWithKey >>>>>>>>> - RichValueMapper extends ValueMapperWithKey, RichFunction >>>>>>>> >>>>>>>> >>>>>>>> As I said above, currently we support lambdas for *withKey* >>>>> interfaces >>>>>> as >>>>>>>> well. However, I agree with your idea and I will remove the >>>>>>>> AbstractRichFunction from the design. >>>>>>>> >>>>>>>> As an alternative, we could argue, that it is sufficient to support >>>>>>>>> Lambdas for the "plain" API only, but not for any "extended API". >>>>> For >>>>>>>>> this, RichFunction could add key+init+close and AbstractRichFunction >>>>>>>>> would allow to only care about getting the key. >>>>>>>>> Not sure, which one is better. I don't like the idea of more >>>>>> overloaded >>>>>>>>> methods to get Lambdas for `withKey` interfaces too much because we >>>>>> have >>>>>>>>> already so many overlaods. On the other hand, I do see value in >>>>>>>>> supporting Lambdas for `withKey`. >>>>>>>> >>>>>>>> >>>>>>>> Just to clarify, with current design we have only one extra >>>>> overloaded >>>>>>>> method per *withOnlyValue* interface: which is *withKey* version of >>>>>>>> particular interface. >>>>>>>> We don't need extra overload for Rich function as Rich function >>>>>>> implements >>>>>>>> *withKey* interface as a result they have same type. We differentiate >>>>>>> them >>>>>>>> while building the topology. >>>>>>>> We supported lambdas for *withKey* APIs because of the comment: >>>>>>>> >>>>>>>> @Jeyhun: I did not put any thought into this, but can we have a >>>>> design >>>>>>>>> that allows for both? Also, with regard to lambdas, it might make >>>>>> sense >>>>>>>>> to allow for both `V -> newV` and `(K, V) -> newV` ? >>>>>>>> >>>>>>>> >>>>>>>> However, I don't think that this complicates the overall design >>>>>>>> significantly. >>>>>>>> >>>>>>>> >>>>>>>> Depending on what we want to support, it might make sense to >>>>>>>>> include/exclude RichFunctions from this KIP -- and thus, this also >>>>>>>>> determines if we should have a "ProcessorContext KIP" before driving >>>>>>>>> this KIP further. >>>>>>>> >>>>>>>> >>>>>>>> Based on our discussion I think we should keep Rich functions as I >>>>>> don't >>>>>>>> think that they bring extra layer of overhead to library. >>>>>>>> >>>>>>>> Any comments are appreciated. >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Jeyhun >>>>>>>> >>>>>>>> >>>>>>>> On Tue, May 16, 2017 at 12:10 AM Matthias J. Sax < >>>>>> matth...@confluent.io> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Jeyhun, >>>>>>>>> >>>>>>>>> thanks for the update. >>>>>>>>> >>>>>>>>> I think supporting Lambdas for `withKey` and `AbstractRichFunction` >>>>>>>>> don't go together, as Lambdas are only supported for interfaces >>>>> AFAIK. >>>>>>>>> >>>>>>>>> Thus, if we want to support Lambdas for `withKey`, we need to have a >>>>>>>>> interface approach like this >>>>>>>>> >>>>>>>>> - RichFunction -> only adding init() and close() >>>>>>>>> >>>>>>>>> - ValueMapper >>>>>>>>> - ValueMapperWithKey >>>>>>>>> >>>>>>>>> - RichValueMapper extends ValueMapperWithKey, RichFunction >>>>>>>>> >>>>>>>>> For this approach, AbstractRichFunction does not make sense anymore, >>>>>> as >>>>>>>>> the only purpose of `RichFunction` is to allow the implementation of >>>>>>>>> init() and close() -- if you don't want those, you would implement a >>>>>>>>> different interface (ie, ValueMapperWithKey) >>>>>>>>> >>>>>>>>> As an alternative, we could argue, that it is sufficient to support >>>>>>>>> Lambdas for the "plain" API only, but not for any "extended API". >>>>> For >>>>>>>>> this, RichFunction could add key+init+close and AbstractRichFunction >>>>>>>>> would allow to only care about getting the key. >>>>>>>>> >>>>>>>>> Not sure, which one is better. I don't like the idea of more >>>>>> overloaded >>>>>>>>> methods to get Lambdas for `withKey` interfaces too much because we >>>>>> have >>>>>>>>> already so many overlaods. On the other hand, I do see value in >>>>>>>>> supporting Lambdas for `withKey`. >>>>>>>>> >>>>>>>>> Depending on what we want to support, it might make sense to >>>>>>>>> include/exclude RichFunctions from this KIP -- and thus, this also >>>>>>>>> determines if we should have a "ProcessorContext KIP" before driving >>>>>>>>> this KIP further. >>>>>>>>> >>>>>>>>> Thoughts? >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -Matthias >>>>>>>>> >>>>>>>>> >>>>>>>>> On 5/15/17 11:01 AM, Jeyhun Karimov wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> Sorry for super late response. Thanks for your comments. >>>>>>>>>> >>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a little bit? I >>>>>> cannot >>>>>>>>>>> follow the explanation in the KIP to see what the problem is. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> - From [1] says "A functional interface is an interface that has >>>>> just >>>>>>> one >>>>>>>>>> abstract method, and thus represents a single function contract". >>>>>>>>>> So basically once we extend some interface from another (in our >>>>> case, >>>>>>>>>> ValueMapperWithKey from ValueMapper) we cannot use lambdas in the >>>>>>>>> extended >>>>>>>>>> interface. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Further comments: >>>>>>>>>>> - The KIP get a little hard to read -- can you maybe reformat the >>>>>>> wiki >>>>>>>>>>> page a little bit? I think using `CodeBlock` would help. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> - I will work on the KIP. >>>>>>>>>> >>>>>>>>>> - What about KStream-KTable joins? You don't have overlaods added >>>>>> for >>>>>>>>>>> them. Why? (Even if I still hope that we don't need to add any new >>>>>>>>>>> overloads) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> - Actually there are more than one Processor and public APIs to be >>>>>>>>>> changed (KStream-KTable >>>>>>>>>> joins is one case). However all of them has similar structure: we >>>>>>>>> overload >>>>>>>>>> the *method* with *methodWithKey*, >>>>>>>>>> wrap it into the Rich function, send to processor and inside the >>>>>>>>> processor >>>>>>>>>> call *init* and *close* methods of the Rich function. >>>>>>>>>> As I wrote in KIP, I wanted to demonstrate the overall idea with >>>>> only >>>>>>>>>> *ValueMapper* as the same can be applied to all changes. >>>>>>>>>> Anyway I will update the KIP. >>>>>>>>>> >>>>>>>>>> - Why do we need `AbstractRichFunction`? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Instead of overriding the *init(ProcessorContext p)* and* close()* >>>>>>>>> methods >>>>>>>>>> in every Rich function with empty body like: >>>>>>>>>> >>>>>>>>>> @Override >>>>>>>>>> void init(ProcessorContext context) {} >>>>>>>>>> >>>>>>>>>> @Override >>>>>>>>>> void close () {} >>>>>>>>>> >>>>>>>>>> I thought that we can override them once in *AbstractRichFunction* >>>>>> and >>>>>>>>>> extent new Rich functions from *AbstractRichFunction*. >>>>>>>>>> Basically this can eliminate code copy-paste and ease the >>>>>> maintenance. >>>>>>>>>> >>>>>>>>>> - What about interfaces Initializer, ForeachAction, Merger, >>>>>> Predicate, >>>>>>>>>>> Reducer? I don't want to say we should/need to add to all, but we >>>>>>> should >>>>>>>>>>> discuss all of them and add where it does make sense (e.g., >>>>>>>>>>> RichForachAction does make sense IMHO) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Definitely agree. As I said, the same technique applies to all this >>>>>>>>>> interfaces and I didn't want to explode the KIP, just wanted to >>>>> give >>>>>>> the >>>>>>>>>> overall intuition. >>>>>>>>>> However, I will update the KIP as I said. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- >>>>>>> `RichValueXX` >>>>>>>>>>> in general -- but why can't we do all this with interfaces only? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Sure we can. However the main intuition is we should not force >>>>> users >>>>>> to >>>>>>>>>> implement *init(ProcessorContext)* and *close()* functions every >>>>> time >>>>>>>>> they >>>>>>>>>> use Rich functions. >>>>>>>>>> If one needs, she can override the respective methods. However, I >>>>> am >>>>>>> open >>>>>>>>>> for discussion. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I'd rather not see the use of `ProcessorContext` spread any >>>>> further >>>>>>> than >>>>>>>>>>> it currently is. So maybe we need another KIP that is done before >>>>>>> this? >>>>>>>>>>> Otherwise i think the scope of this KIP is becoming too large. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> That is good point. I wanted to make *init(ProcessorContext)* >>>>> method >>>>>>>>>> persistent among the library (which use ProcessorContext as an >>>>>> input), >>>>>>>>>> therefore I put *ProcessorContext* as an input. >>>>>>>>>> So the important question is that (as @dguy and @mjsax mentioned) >>>>>>> whether >>>>>>>>>> continue this KIP without providing users an access to >>>>>>> *ProcessorContext* >>>>>>>>>> (change *init (ProcessorContext)* to * init()* ) or >>>>>>>>>> initiate another KIP before this. >>>>>>>>>> >>>>>>>>>> [1] >>>>>>>>>> http://cr.openjdk.java.net/~mr/se/8/java-se-8-pfd-spec/java- >>>>>>>>> se-8-jls-pfd-diffs.pdf >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> Jeyhun >>>>>>>>>> >>>>>>>>>> On Mon, May 15, 2017 at 7:15 PM, Damian Guy <damian....@gmail.com> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> I'd rather not see the use of `ProcessorContext` spread any >>>>> further >>>>>>>>> than >>>>>>>>>>> it currently is. So maybe we need another KIP that is done before >>>>>>> this? >>>>>>>>>>> Otherwise i think the scope of this KIP is becoming too large. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Mon, 15 May 2017 at 18:06 Matthias J. Sax < >>>>> matth...@confluent.io >>>>>>> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> I agree that that `ProcessorContext` interface is too broad in >>>>>>> general >>>>>>>>>>>> -- this is even true for transform/process, and it's also >>>>> reflected >>>>>>> in >>>>>>>>>>>> the API improvement list we want to do. >>>>>>>>>>>> >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/ >>>>>>>>>>> Kafka+Streams+Discussions >>>>>>>>>>>> >>>>>>>>>>>> So I am wondering, if you question the `RichFunction` approach in >>>>>>>>>>>> general? Or if you suggest to either extend the scope of this KIP >>>>>> to >>>>>>>>>>>> include this---or maybe better, do another KIP for it and delay >>>>>> this >>>>>>>>> KIP >>>>>>>>>>>> until the other one is done? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -Matthias >>>>>>>>>>>> >>>>>>>>>>>> On 5/15/17 2:35 AM, Damian Guy wrote: >>>>>>>>>>>>> Thanks for the KIP. >>>>>>>>>>>>> >>>>>>>>>>>>> I'm not convinced on the `RichFunction` approach. Do we really >>>>>> want >>>>>>> to >>>>>>>>>>>> give >>>>>>>>>>>>> every DSL method access to the `ProcessorContext` ? It has a >>>>> bunch >>>>>>> of >>>>>>>>>>>>> methods on it that seem in-appropriate for some of the DSL >>>>>> methods, >>>>>>>>>>> i.e, >>>>>>>>>>>>> `register`, `getStateStore`, `forward`, `schedule` etc. It is >>>>> far >>>>>>> too >>>>>>>>>>>>> broad. I think it would be better to have a narrower interface >>>>>> like >>>>>>>>> the >>>>>>>>>>>>> `RecordContext` - remembering it is easier to add >>>>>>> methods/interfaces >>>>>>>>>>>> later >>>>>>>>>>>>> than to remove them >>>>>>>>>>>>> >>>>>>>>>>>>> On Sat, 13 May 2017 at 22:26 Matthias J. Sax < >>>>>> matth...@confluent.io >>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Jeyhun, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a little bit? >>>>> I >>>>>>>>>>> cannot >>>>>>>>>>>>>> follow the explanation in the KIP to see what the problem is. >>>>>>>>>>>>>> >>>>>>>>>>>>>> For updating the KIP title I don't know -- guess it's up to >>>>> you. >>>>>>>>>>> Maybe a >>>>>>>>>>>>>> committer can comment on this? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Further comments: >>>>>>>>>>>>>> >>>>>>>>>>>>>> - The KIP get a little hard to read -- can you maybe reformat >>>>>> the >>>>>>>>>>> wiki >>>>>>>>>>>>>> page a little bit? I think using `CodeBlock` would help. >>>>>>>>>>>>>> >>>>>>>>>>>>>> - What about KStream-KTable joins? You don't have overlaods >>>>>> added >>>>>>>>> for >>>>>>>>>>>>>> them. Why? (Even if I still hope that we don't need to add any >>>>>> new >>>>>>>>>>>>>> overloads) >>>>>>>>>>>>>> >>>>>>>>>>>>>> - Why do we need `AbstractRichFunction`? >>>>>>>>>>>>>> >>>>>>>>>>>>>> - What about interfaces Initializer, ForeachAction, Merger, >>>>>>>>>>> Predicate, >>>>>>>>>>>>>> Reducer? I don't want to say we should/need to add to all, but >>>>> we >>>>>>>>>>> should >>>>>>>>>>>>>> discuss all of them and add where it does make sense (e.g., >>>>>>>>>>>>>> RichForachAction does make sense IMHO) >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- >>>>>>>>>>> `RichValueXX` >>>>>>>>>>>>>> in general -- but why can't we do all this with interfaces >>>>> only? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 5/11/17 7:00 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks for your comments. I think we cannot extend the two >>>>>>>>> interfaces >>>>>>>>>>>> if >>>>>>>>>>>>>> we >>>>>>>>>>>>>>> want to keep lambdas. I updated the KIP [1]. Maybe I should >>>>>> change >>>>>>>>>>> the >>>>>>>>>>>>>>> title, because now we are not limiting the KIP to only >>>>>>> ValueMapper, >>>>>>>>>>>>>>> ValueTransformer and ValueJoiner. >>>>>>>>>>>>>>> Please feel free to comment. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >>>>>>>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ >>>>>>>>>>> ValueMapper%2C+and+ValueJoiner >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, May 9, 2017 at 7:36 PM Matthias J. Sax < >>>>>>>>>>> matth...@confluent.io> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> If `ValueMapperWithKey` extends `ValueMapper` we don't need >>>>> the >>>>>>> new >>>>>>>>>>>>>>>> overlaod. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> And yes, we need to do one check -- but this happens when >>>>>>> building >>>>>>>>>>> the >>>>>>>>>>>>>>>> topology. At runtime (I mean after KafkaStream#start() we >>>>> don't >>>>>>>>> need >>>>>>>>>>>> any >>>>>>>>>>>>>>>> check as we will always use `ValueMapperWithKey`) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 5/9/17 2:55 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks for feedback. >>>>>>>>>>>>>>>>> Then we need to overload method >>>>>>>>>>>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapper<? super V, ? >>>>>> extends >>>>>>>>>>> VR> >>>>>>>>>>>>>>>>> mapper); >>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapperWithKey<? super >>>>> V, >>>>>> ? >>>>>>>>>>>> extends >>>>>>>>>>>>>>>> VR> >>>>>>>>>>>>>>>>> mapper); >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> and in runtime (inside processor) we still have to check it >>>>> is >>>>>>>>>>>>>>>> ValueMapper >>>>>>>>>>>>>>>>> or ValueMapperWithKey before wrapping it into the rich >>>>>> function. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Please correct me if I am wrong. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Tue, May 9, 2017 at 10:56 AM Michal Borowiecki < >>>>>>>>>>>>>>>>> michal.borowie...@openbet.com> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> +1 :) >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 08/05/17 23:52, Matthias J. Sax wrote: >>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I was reading the updated KIP and I am wondering, if we >>>>>> should >>>>>>>>> do >>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>> design a little different. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Instead of distinguishing between a RichFunction and >>>>>>>>>>>> non-RichFunction >>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>> runtime level, we would use RichFunctions all the time. >>>>>> Thus, >>>>>>> on >>>>>>>>>>>> the >>>>>>>>>>>>>>>> DSL >>>>>>>>>>>>>>>>>>> entry level, if a user provides a non-RichFunction, we >>>>> wrap >>>>>> it >>>>>>>>>>> by a >>>>>>>>>>>>>>>>>>> RichFunction that is fully implemented by Streams. This >>>>>>>>>>>> RichFunction >>>>>>>>>>>>>>>>>>> would just forward the call omitting the key: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Just to sketch the idea (incomplete code snippet): >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> public StreamsRichValueMapper implements >>>>> RichValueMapper() >>>>>> { >>>>>>>>>>>>>>>>>>>> private ValueMapper userProvidedMapper; // set by >>>>>>>>> constructor >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> public VR apply(final K key, final V1 value1, final V2 >>>>>>>>>>> value2) >>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>>> return userProvidedMapper(value1, value2); >>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> From a performance point of view, I am not sure if the >>>>>>>>>>>>>>>>>>> "if(isRichFunction)" including casts etc would have more >>>>>>>>> overhead >>>>>>>>>>>>>> than >>>>>>>>>>>>>>>>>>> this approach (we would do more nested method call for >>>>>>>>>>>>>> non-RichFunction >>>>>>>>>>>>>>>>>>> which should be more common than RichFunctions). >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> This approach should not effect lambdas (or do I miss >>>>>>>>> something?) >>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>> might be cleaner, as we could have one more top level >>>>>>> interface >>>>>>>>>>>>>>>>>>> `RichFunction` with methods `init()` and `close()` and >>>>> also >>>>>>>>>>>>>> interfaces >>>>>>>>>>>>>>>>>>> for `RichValueMapper` etc. (thus, no abstract classes >>>>>>> required). >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Any thoughts? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 5/6/17 5:29 PM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks for comments. I extended PR and KIP to include >>>>> rich >>>>>>>>>>>>>> functions. >>>>>>>>>>>>>>>> I >>>>>>>>>>>>>>>>>>>> will still have to evaluate the cost of deep copying of >>>>>> keys. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 8:02 PM Mathieu Fenniak < >>>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> >>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Hey Matthias, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> My opinion would be that documenting the immutability of >>>>>> the >>>>>>>>>>> key >>>>>>>>>>>> is >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>> best approach available. Better than requiring the key >>>>> to >>>>>>> be >>>>>>>>>>>>>>>>>> serializable >>>>>>>>>>>>>>>>>>>>> (as with Jeyhun's last pass at the PR), no performance >>>>>> risk. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> It'd be different if Java had immutable type constraints >>>>>> of >>>>>>>>>>> some >>>>>>>>>>>>>>>> kind. >>>>>>>>>>>>>>>>>> :-) >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Mathieu >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 11:31 AM, Matthias J. Sax < >>>>>>>>>>>>>>>>>> matth...@confluent.io> >>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Agreed about RichFunction. If we follow this path, it >>>>>>> should >>>>>>>>>>>> cover >>>>>>>>>>>>>>>>>>>>>> all(?) DSL interfaces. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> About guarding the key -- I am still not sure what to >>>>> do >>>>>>>>> about >>>>>>>>>>>>>> it... >>>>>>>>>>>>>>>>>>>>>> Maybe it might be enough to document it (and name the >>>>> key >>>>>>>>>>>>>> parameter >>>>>>>>>>>>>>>>>> like >>>>>>>>>>>>>>>>>>>>>> `readOnlyKey` to make is very clear). Ultimately, I >>>>> would >>>>>>>>>>> prefer >>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>> guard against any modification, but I have no good idea >>>>>> how >>>>>>>>> to >>>>>>>>>>>> do >>>>>>>>>>>>>>>>>> this. >>>>>>>>>>>>>>>>>>>>>> Not sure what others think about the risk of corrupted >>>>>>>>>>>>>> partitioning >>>>>>>>>>>>>>>>>>>>>> (what would be a user error and we could say, well, bad >>>>>>> luck, >>>>>>>>>>>> you >>>>>>>>>>>>>>>> got >>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>> bug in your code, that's not our fault), vs deep copy >>>>>> with >>>>>>> a >>>>>>>>>>>>>>>> potential >>>>>>>>>>>>>>>>>>>>>> performance hit (that we can't quantity atm without any >>>>>>>>>>>>>> performance >>>>>>>>>>>>>>>>>>>>> test). >>>>>>>>>>>>>>>>>>>>>> We do have a performance system test. Maybe it's worth >>>>>> for >>>>>>>>> you >>>>>>>>>>>> to >>>>>>>>>>>>>>>>>> apply >>>>>>>>>>>>>>>>>>>>>> the deep copy strategy and run the test. It's very >>>>> basic >>>>>>>>>>>>>> performance >>>>>>>>>>>>>>>>>>>>>> test only, but might give some insight. If you want to >>>>> do >>>>>>>>>>> this, >>>>>>>>>>>>>> look >>>>>>>>>>>>>>>>>>>>>> into folder "tests" for general test setup, and into >>>>>>>>>>>>>>>>>>>>>> "tests/kafaktests/benchmarks/streams" to find find the >>>>>> perf >>>>>>>>>>>> test. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 5/5/17 8:55 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>>>>>> Hi Matthias, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I think extending KIP to include RichFunctions totally >>>>>>>>> makes >>>>>>>>>>>>>>>> sense. >>>>>>>>>>>>>>>>>>>>> So, >>>>>>>>>>>>>>>>>>>>>>> we don't want to guard the keys because it is >>>>> costly. >>>>>>>>>>>>>>>>>>>>>>> If we introduce RichFunctions I think it should not be >>>>>>>>>>> limited >>>>>>>>>>>>>> only >>>>>>>>>>>>>>>>>>>>> the 3 >>>>>>>>>>>>>>>>>>>>>>> interfaces proposed in KIP but to wide range of >>>>>>> interfaces. >>>>>>>>>>>>>>>>>>>>>>> Please correct me if I am wrong. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 12:04 AM Matthias J. Sax < >>>>>>>>>>>>>>>>>> matth...@confluent.io >>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> One follow up. There was this email on the user list: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>> http://search-hadoop.com/m/Kafka/uyzND17KhCaBzPSZ1?subj= >>>>>>>>>>>>>>>>>>>>>> >>>>>> Shouldn+t+the+initializer+of+a+stream+aggregate+accept+the+ >>>>>>>>>>> key+ >>>>>>>>>>>>>>>>>>>>>>>> It might make sense so include Initializer, Adder, >>>>> and >>>>>>>>>>>>>> Substractor >>>>>>>>>>>>>>>>>>>>>>>> inferface, too. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> And we should double check if there are other >>>>> interface >>>>>>> we >>>>>>>>>>>> might >>>>>>>>>>>>>>>>>> miss >>>>>>>>>>>>>>>>>>>>>> atm. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 5/4/17 1:31 PM, Matthias J. Sax wrote: >>>>>>>>>>>>>>>>>>>>>>>>> Thanks for updating the KIP. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Deep copying the key will work for sure, but I am >>>>>>> actually >>>>>>>>>>> a >>>>>>>>>>>>>>>> little >>>>>>>>>>>>>>>>>>>>> bit >>>>>>>>>>>>>>>>>>>>>>>>> worried about performance impact... We might want to >>>>>> do >>>>>>>>>>> some >>>>>>>>>>>>>> test >>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>> quantify this impact. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Btw: this remind me about the idea of `RichFunction` >>>>>>>>>>>> interface >>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>> would allow users to access record metadata (like >>>>>>>>>>> timestamp, >>>>>>>>>>>>>>>>>> offset, >>>>>>>>>>>>>>>>>>>>>>>>> partition etc) within DSL. This would be a similar >>>>>>>>> concept. >>>>>>>>>>>>>>>> Thus, I >>>>>>>>>>>>>>>>>>>>> am >>>>>>>>>>>>>>>>>>>>>>>>> wondering, if it would make sense to enlarge the >>>>> scope >>>>>>> of >>>>>>>>>>>> this >>>>>>>>>>>>>>>> KIP >>>>>>>>>>>>>>>>>> by >>>>>>>>>>>>>>>>>>>>>>>>> that? WDYT? >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On 5/3/17 2:08 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> Hi Mathieu, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for feedback. I followed similar approach >>>>> and >>>>>>>>>>> updated >>>>>>>>>>>>>> PR >>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>> KIP >>>>>>>>>>>>>>>>>>>>>>>>>> accordingly. I tried to guard the key in Processors >>>>>>>>>>> sending >>>>>>>>>>>> a >>>>>>>>>>>>>>>> copy >>>>>>>>>>>>>>>>>>>>> of >>>>>>>>>>>>>>>>>>>>>> an >>>>>>>>>>>>>>>>>>>>>>>>>> actual key. >>>>>>>>>>>>>>>>>>>>>>>>>> Because I am doing deep copy of an object, I think >>>>>>> memory >>>>>>>>>>>> can >>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>> bottleneck >>>>>>>>>>>>>>>>>>>>>>>>>> in some use-cases. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, May 2, 2017 at 5:10 PM Mathieu Fenniak < >>>>>>>>>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> >>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun, >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> This approach would change ValueMapper (...etc) to >>>>>> be >>>>>>>>>>>>>> classes, >>>>>>>>>>>>>>>>>>>>> rather >>>>>>>>>>>>>>>>>>>>>>>> than >>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, which is also a backwards incompatible >>>>>>>>>>> change. >>>>>>>>>>>>>> An >>>>>>>>>>>>>>>>>>>>>>>> alternative >>>>>>>>>>>>>>>>>>>>>>>>>>> approach that would be backwards compatible would >>>>> be >>>>>>> to >>>>>>>>>>>>>> define >>>>>>>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, and provide overrides where those >>>>>>> interfaces >>>>>>>>>>>> are >>>>>>>>>>>>>>>>>> used. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Unfortunately, making the key parameter as "final" >>>>>>>>>>> doesn't >>>>>>>>>>>>>>>> change >>>>>>>>>>>>>>>>>>>>>> much >>>>>>>>>>>>>>>>>>>>>>>>>>> about guarding against key change. It only >>>>> prevents >>>>>>> the >>>>>>>>>>>>>>>>>> parameter >>>>>>>>>>>>>>>>>>>>>>>> variable >>>>>>>>>>>>>>>>>>>>>>>>>>> from being reassigned. If the key type is a >>>>> mutable >>>>>>>>>>> object >>>>>>>>>>>>>>>> (eg. >>>>>>>>>>>>>>>>>>>>>>>> byte[]), >>>>>>>>>>>>>>>>>>>>>>>>>>> it can still be mutated. (eg. key[0] = 0). But >>>>> I'm >>>>>>> not >>>>>>>>>>>>>> really >>>>>>>>>>>>>>>>>> sure >>>>>>>>>>>>>>>>>>>>>>>> there's >>>>>>>>>>>>>>>>>>>>>>>>>>> much that can be done about that. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 5:39 PM, Jeyhun Karimov < >>>>>>>>>>>>>>>>>>>>> je.kari...@gmail.com >>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for comments. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> The concerns makes sense. Although we can guard >>>>> for >>>>>>>>>>>>>> immutable >>>>>>>>>>>>>>>>>> keys >>>>>>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>>>>>>>>> current implementation (with few changes), I >>>>> didn't >>>>>>>>>>>> consider >>>>>>>>>>>>>>>>>>>>>> backward >>>>>>>>>>>>>>>>>>>>>>>>>>>> compatibility. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> In this case 2 solutions come to my mind. In both >>>>>>>>> cases, >>>>>>>>>>>>>> user >>>>>>>>>>>>>>>>>>>>>> accesses >>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>> key in Object type, as passing extra type >>>>> parameter >>>>>>>>> will >>>>>>>>>>>>>> break >>>>>>>>>>>>>>>>>>>>>>>>>>>> backwards-compatibility. So user has to cast to >>>>>>> actual >>>>>>>>>>>> key >>>>>>>>>>>>>>>>>> type. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. Firstly, We can overload apply method with 2 >>>>>>>>> argument >>>>>>>>>>>>>> (key >>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>>> value) >>>>>>>>>>>>>>>>>>>>>>>>>>>> and force key to be *final*. By doing this, I >>>>>> think >>>>>>> we >>>>>>>>>>>> can >>>>>>>>>>>>>>>>>>>>> address >>>>>>>>>>>>>>>>>>>>>>>> both >>>>>>>>>>>>>>>>>>>>>>>>>>>> backward-compatibility and guarding against key >>>>>>> change. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Secondly, we can create class KeyAccess like: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> public class KeyAccess { >>>>>>>>>>>>>>>>>>>>>>>>>>>> Object key; >>>>>>>>>>>>>>>>>>>>>>>>>>>> public void beforeApply(final Object key) { >>>>>>>>>>>>>>>>>>>>>>>>>>>> this.key = key; >>>>>>>>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>>>>>>>> public Object getKey() { >>>>>>>>>>>>>>>>>>>>>>>>>>>> return key; >>>>>>>>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> We can extend *ValueMapper, ValueJoiner* and >>>>>>>>>>>>>>>> *ValueTransformer* >>>>>>>>>>>>>>>>>>>>> from >>>>>>>>>>>>>>>>>>>>>>>>>>>> *KeyAccess*. Inside processor (for example >>>>>>>>>>>>>>>>>>>>>> *KTableMapValuesProcessor*) >>>>>>>>>>>>>>>>>>>>>>>>>>>> before calling *mapper.apply(value)* we can set >>>>> the >>>>>>>>>>> *key* >>>>>>>>>>>> by >>>>>>>>>>>>>>>>>>>>>>>>>>>> *mapper.beforeApply(key)*. As a result, user can >>>>>> use >>>>>>>>>>>>>>>> *getKey()* >>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>> access >>>>>>>>>>>>>>>>>>>>>>>>>>>> the key inside *apply(value)* method. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 7:24 PM Matthias J. Sax < >>>>>>>>>>>>>>>>>>>>>> matth...@confluent.io >>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun, >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> thanks a lot for the KIP! >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think there are two issues we need to address: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> (1) The KIP does not consider backward >>>>>>> compatibility. >>>>>>>>>>>> Users >>>>>>>>>>>>>>>> did >>>>>>>>>>>>>>>>>>>>>>>>>>> complain >>>>>>>>>>>>>>>>>>>>>>>>>>>>> about this in past releases already, and as the >>>>>> user >>>>>>>>>>> base >>>>>>>>>>>>>>>>>> grows, >>>>>>>>>>>>>>>>>>>>> we >>>>>>>>>>>>>>>>>>>>>>>>>>>>> should not break backward compatibility in >>>>> future >>>>>>>>>>>> releases >>>>>>>>>>>>>>>>>>>>> anymore. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thus, we should think of a better way to allow >>>>> key >>>>>>>>>>>> access. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu's comment goes into the same direction >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile >>>>>> failures >>>>>>>>>>> that >>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>> need >>>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-) >>>>>>>>>>>>>>>>>>>>>>>>>>>>> (2) Another concern is, that there is no guard >>>>> to >>>>>>>>>>> prevent >>>>>>>>>>>>>>>> user >>>>>>>>>>>>>>>>>>>>> code >>>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>>>> modify the key. This might corrupt partitioning >>>>> if >>>>>>>>>>> users >>>>>>>>>>>> do >>>>>>>>>>>>>>>>>> alter >>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>> key (accidentally -- or users are just not aware >>>>>>> that >>>>>>>>>>>> they >>>>>>>>>>>>>>>> are >>>>>>>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>>>>>>>>>>>>>>> allowed to modify the provided key object) and >>>>>> thus >>>>>>>>>>> break >>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>> application. (This was the original motivation >>>>> to >>>>>>> not >>>>>>>>>>>>>> provide >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>> key >>>>>>>>>>>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>>>>>>>>>> the first place -- it's guards against >>>>>>> modification.) >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 5/1/17 6:31 AM, Mathieu Fenniak wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I just want to add my voice that, I too, have >>>>>>> wished >>>>>>>>>>> for >>>>>>>>>>>>>>>>>> access >>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> record key during a mapValues or similar >>>>>> operation. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile >>>>> failures >>>>>>>>> that >>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>> need >>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-) But >>>>>> at >>>>>>>>>>> least >>>>>>>>>>>>>> it >>>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pretty clear and easy change. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 6:55 AM, Jeyhun Karimov >>>>> < >>>>>>>>>>>>>>>>>>>>>>>> je.kari...@gmail.com >>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Dear community, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I want to share KIP-149 [1] based on issues >>>>>>>>>>> KAFKA-4218 >>>>>>>>>>>>>> [2], >>>>>>>>>>>>>>>>>>>>>>>>>>> KAFKA-4726 >>>>>>>>>>>>>>>>>>>>>>>>>>>>> [3], >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> KAFKA-3745 [4]. The related PR can be found at >>>>>>> [5]. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to get your comments. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/ >>>>>>>>>>> confluence/display/KAFKA/KIP- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ValueMapper%2C+and+ValueJoiner >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [2] https://issues.apache.org/jira >>>>>>>>> /browse/KAFKA-4218 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [3] https://issues.apache.org/jira >>>>>>>>> /browse/KAFKA-4726 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [4] https://issues.apache.org/jira >>>>>>>>> /browse/KAFKA-3745 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [5] https://github.com/apache/kafka/pull/2946 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>> -Cheers >>>>>> >>>>>> Jeyhun >>>>>> >>>>> >>> >>> -- >> -Cheers >> >> Jeyhun >> >
signature.asc
Description: OpenPGP digital signature