Jeyhun, I was just wondering if you did look into the key-deep-copy idea we discussed. I am curious to see what the impact might be.
-Matthias On 5/20/17 2:03 AM, Jeyhun Karimov wrote: > Hi, > > Thanks for your comments. I rethink about including rich functions into > this KIP. > I think once we include rich functions in this KIP and then fix > ProcessorContext in another KIP and incorporate with existing rich > functions, the code will not be backwards compatible. > > I see Damian's and your point more clearly now. > > Conclusion: we include only withKey interfaces in this KIP (I updated the > KIP), I will try also initiate another KIP for rich functions. > > Cheers, > Jeyhun > > On Fri, May 19, 2017 at 10:50 PM Matthias J. Sax <matth...@confluent.io> > wrote: > >> With the current KIP, using ValueMapper and ValueMapperWithKey >> interfaces, RichFunction seems to be an independent add-on. To fix the >> original issue to allow key access, RichFunctions are not required IMHO. >> >> I initially put the RichFunction idea on the table, because I was hoping >> to get a uniform API. And I think, is was good to consider them -- >> however, the discussion showed that they are not necessary for key >> access. Thus, it seems to be better to handle RichFunctions in an own >> KIP. The ProcessorContext/RecordContext issues seems to be a main >> challenge for this. And introducing RichFunctions with parameter-less >> init() method, seem not to help too much. We would get an "intermediate" >> API that we want to change anyway later on... >> >> As you put already much effort into RichFunction, feel free do push this >> further and start a new KIP (we can do this even in parallel) -- we >> don't want to slow you down :) But it make the discussion and code >> review easier, if we separate both IMHO. >> >> >> -Matthias >> >> >> On 5/19/17 2:25 AM, Jeyhun Karimov wrote: >>> Hi Damian, >>> >>> Thanks for your comments. I think providing to users *interface* rather >>> than *abstract class* should be preferred (Matthias also raised this >> issue >>> ), anyway I changed the corresponding parts of KIP. >>> >>> Regarding with passing additional contextual information, I think it is a >>> tradeoff, >>> 1) first, we fix the context parameter for *init() *method in another PR >>> and solve Rich functions afterwards >>> 2) first, we fix the requested issues on jira ([1-3]) with providing >>> (not-complete) Rich functions and integrate the context parameters to >> this >>> afterwards (like improvement) >>> >>> To me, the second approach seems more incremental. However you are right, >>> the names might confuse the users. >>> >>> >>> >>> [1] https://issues.apache.org/jira/browse/KAFKA-4218 >>> [2] https://issues.apache.org/jira/browse/KAFKA-4726 >>> [3] https://issues.apache.org/jira/browse/KAFKA-3745 >>> >>> >>> Cheers, >>> Jeyhun >>> >>> >>> On Fri, May 19, 2017 at 10:42 AM Damian Guy <damian....@gmail.com> >> wrote: >>> >>>> Hi, >>>> >>>> I see you've removed the `ProcessorContext` from the RichFunction which >> is >>>> good, but why is it a `RichFunction`? I'd have expected it to pass some >>>> additional contextual information, i.e., the `RecordContext` that >> contains >>>> just the topic, partition, timestamp, offset. I'm ok with it not >> passing >>>> this contextual information, but is the naming incorrect? I'm not sure, >>>> tbh. I'm wondering if we should drop `RichFunctions` until we can do it >>>> properly with the correct context? >>>> >>>> Also, i don't like the abstract classes: RichValueMapper, >> RichValueJoiner, >>>> RichInitializer etc. Why can't they not just be interfaces? Generally we >>>> should provide users with Intefaces to code against, not classes. >>>> >>>> Thanks, >>>> Damian >>>> >>>> On Fri, 19 May 2017 at 00:50 Jeyhun Karimov <je.kari...@gmail.com> >> wrote: >>>> >>>>> Hi, >>>>> >>>>> Thanks. I initiated the PR as well, to get a better overview. >>>>> >>>>> The only reason that I used abstract class instead of interface for >> Rich >>>>> functions is that in future if we will have some AbstractRichFunction >>>>> abstract classes, >>>>> we can easily extend: >>>>> >>>>> public abstract class RichValueMapper<K, V, VR> implements >>>>> ValueMapperWithKey<K, V, VR>, RichFunction *extends >>>> AbstractRichFunction*{ >>>>> } >>>>> With interfaces we are only limited to interfaces for inheritance. >>>>> >>>>> However, I think we can easily change it (from abstract class -> >>>> interface) >>>>> if you think interface is a better fit. >>>>> >>>>> >>>>> Cheers, >>>>> Jeyhun >>>>> >>>>> >>>>> On Fri, May 19, 2017 at 1:00 AM Matthias J. Sax <matth...@confluent.io >>> >>>>> wrote: >>>>> >>>>>> Thanks for the update and explanations. The KIP is quite improved now >>>> -- >>>>>> great job! >>>>>> >>>>>> One more question: Why are RichValueMapper etc abstract classes and >> not >>>>>> interfaces? >>>>>> >>>>>> >>>>>> Overall, I like the KIP a lot! >>>>>> >>>>>> >>>>>> -Matthias >>>>>> >>>>>> >>>>>> On 5/16/17 7:03 AM, Jeyhun Karimov wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Thanks for your comments. >>>>>>> >>>>>>> I think supporting Lambdas for `withKey` and `AbstractRichFunction` >>>>>>>> don't go together, as Lambdas are only supported for interfaces >>>> AFAIK. >>>>>>> >>>>>>> >>>>>>> Maybe I misunderstood your comment. >>>>>>> *withKey* and and *withOnlyValue* are interfaces. So they don't have >>>>>> direct >>>>>>> relation with *AbstractRichFunction*. >>>>>>> *withKey* and and *withOnlyValue* interfaces have only one method , >>>> so >>>>>> we >>>>>>> can use lambdas. >>>>>>> Where does the *AbstractRichFunction* comes to play? Inside Rich >>>>>> functions. >>>>>>> And we use Rich functions in 2 places: >>>>>>> >>>>>>> 1. User doesn't use rich functions. Just regular *withKey* and and >>>>>>> *withOnlyValue* interfaces(both support lambdas) . We get those >>>>>> interfaces >>>>>>> and wrap into Rich function while building the topology, and send to >>>>>>> Processor. >>>>>>> 2. User does use rich functions (Rich functions implement *withKey* >>>>>>> interface). As a result no lamdas here by definition. In this case, >>>>> while >>>>>>> building the topology we do a type check if the object type is >>>>>>> *withKey* or *RichFunction*. >>>>>>> >>>>>>> So *AbstractRichFunction* is just syntactic sugar for Rich functions >>>>> and >>>>>>> does not affect using lambdas. >>>>>>> >>>>>>> Thus, if we want to support Lambdas for `withKey`, we need to have a >>>>>>>> interface approach like this >>>>>>>> - RichFunction -> only adding init() and close() >>>>>>>> - ValueMapper >>>>>>>> - ValueMapperWithKey >>>>>>>> - RichValueMapper extends ValueMapperWithKey, RichFunction >>>>>>> >>>>>>> >>>>>>> As I said above, currently we support lambdas for *withKey* >>>> interfaces >>>>> as >>>>>>> well. However, I agree with your idea and I will remove the >>>>>>> AbstractRichFunction from the design. >>>>>>> >>>>>>> As an alternative, we could argue, that it is sufficient to support >>>>>>>> Lambdas for the "plain" API only, but not for any "extended API". >>>> For >>>>>>>> this, RichFunction could add key+init+close and AbstractRichFunction >>>>>>>> would allow to only care about getting the key. >>>>>>>> Not sure, which one is better. I don't like the idea of more >>>>> overloaded >>>>>>>> methods to get Lambdas for `withKey` interfaces too much because we >>>>> have >>>>>>>> already so many overlaods. On the other hand, I do see value in >>>>>>>> supporting Lambdas for `withKey`. >>>>>>> >>>>>>> >>>>>>> Just to clarify, with current design we have only one extra >>>> overloaded >>>>>>> method per *withOnlyValue* interface: which is *withKey* version of >>>>>>> particular interface. >>>>>>> We don't need extra overload for Rich function as Rich function >>>>>> implements >>>>>>> *withKey* interface as a result they have same type. We differentiate >>>>>> them >>>>>>> while building the topology. >>>>>>> We supported lambdas for *withKey* APIs because of the comment: >>>>>>> >>>>>>> @Jeyhun: I did not put any thought into this, but can we have a >>>> design >>>>>>>> that allows for both? Also, with regard to lambdas, it might make >>>>> sense >>>>>>>> to allow for both `V -> newV` and `(K, V) -> newV` ? >>>>>>> >>>>>>> >>>>>>> However, I don't think that this complicates the overall design >>>>>>> significantly. >>>>>>> >>>>>>> >>>>>>> Depending on what we want to support, it might make sense to >>>>>>>> include/exclude RichFunctions from this KIP -- and thus, this also >>>>>>>> determines if we should have a "ProcessorContext KIP" before driving >>>>>>>> this KIP further. >>>>>>> >>>>>>> >>>>>>> Based on our discussion I think we should keep Rich functions as I >>>>> don't >>>>>>> think that they bring extra layer of overhead to library. >>>>>>> >>>>>>> Any comments are appreciated. >>>>>>> >>>>>>> Cheers, >>>>>>> Jeyhun >>>>>>> >>>>>>> >>>>>>> On Tue, May 16, 2017 at 12:10 AM Matthias J. Sax < >>>>> matth...@confluent.io> >>>>>>> wrote: >>>>>>> >>>>>>>> Jeyhun, >>>>>>>> >>>>>>>> thanks for the update. >>>>>>>> >>>>>>>> I think supporting Lambdas for `withKey` and `AbstractRichFunction` >>>>>>>> don't go together, as Lambdas are only supported for interfaces >>>> AFAIK. >>>>>>>> >>>>>>>> Thus, if we want to support Lambdas for `withKey`, we need to have a >>>>>>>> interface approach like this >>>>>>>> >>>>>>>> - RichFunction -> only adding init() and close() >>>>>>>> >>>>>>>> - ValueMapper >>>>>>>> - ValueMapperWithKey >>>>>>>> >>>>>>>> - RichValueMapper extends ValueMapperWithKey, RichFunction >>>>>>>> >>>>>>>> For this approach, AbstractRichFunction does not make sense anymore, >>>>> as >>>>>>>> the only purpose of `RichFunction` is to allow the implementation of >>>>>>>> init() and close() -- if you don't want those, you would implement a >>>>>>>> different interface (ie, ValueMapperWithKey) >>>>>>>> >>>>>>>> As an alternative, we could argue, that it is sufficient to support >>>>>>>> Lambdas for the "plain" API only, but not for any "extended API". >>>> For >>>>>>>> this, RichFunction could add key+init+close and AbstractRichFunction >>>>>>>> would allow to only care about getting the key. >>>>>>>> >>>>>>>> Not sure, which one is better. I don't like the idea of more >>>>> overloaded >>>>>>>> methods to get Lambdas for `withKey` interfaces too much because we >>>>> have >>>>>>>> already so many overlaods. On the other hand, I do see value in >>>>>>>> supporting Lambdas for `withKey`. >>>>>>>> >>>>>>>> Depending on what we want to support, it might make sense to >>>>>>>> include/exclude RichFunctions from this KIP -- and thus, this also >>>>>>>> determines if we should have a "ProcessorContext KIP" before driving >>>>>>>> this KIP further. >>>>>>>> >>>>>>>> Thoughts? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -Matthias >>>>>>>> >>>>>>>> >>>>>>>> On 5/15/17 11:01 AM, Jeyhun Karimov wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Sorry for super late response. Thanks for your comments. >>>>>>>>> >>>>>>>>> I am not an expert on Lambdas. Can you elaborate a little bit? I >>>>> cannot >>>>>>>>>> follow the explanation in the KIP to see what the problem is. >>>>>>>>> >>>>>>>>> >>>>>>>>> - From [1] says "A functional interface is an interface that has >>>> just >>>>>> one >>>>>>>>> abstract method, and thus represents a single function contract". >>>>>>>>> So basically once we extend some interface from another (in our >>>> case, >>>>>>>>> ValueMapperWithKey from ValueMapper) we cannot use lambdas in the >>>>>>>> extended >>>>>>>>> interface. >>>>>>>>> >>>>>>>>> >>>>>>>>> Further comments: >>>>>>>>>> - The KIP get a little hard to read -- can you maybe reformat the >>>>>> wiki >>>>>>>>>> page a little bit? I think using `CodeBlock` would help. >>>>>>>>> >>>>>>>>> >>>>>>>>> - I will work on the KIP. >>>>>>>>> >>>>>>>>> - What about KStream-KTable joins? You don't have overlaods added >>>>> for >>>>>>>>>> them. Why? (Even if I still hope that we don't need to add any new >>>>>>>>>> overloads) >>>>>>>>> >>>>>>>>> >>>>>>>>> - Actually there are more than one Processor and public APIs to be >>>>>>>>> changed (KStream-KTable >>>>>>>>> joins is one case). However all of them has similar structure: we >>>>>>>> overload >>>>>>>>> the *method* with *methodWithKey*, >>>>>>>>> wrap it into the Rich function, send to processor and inside the >>>>>>>> processor >>>>>>>>> call *init* and *close* methods of the Rich function. >>>>>>>>> As I wrote in KIP, I wanted to demonstrate the overall idea with >>>> only >>>>>>>>> *ValueMapper* as the same can be applied to all changes. >>>>>>>>> Anyway I will update the KIP. >>>>>>>>> >>>>>>>>> - Why do we need `AbstractRichFunction`? >>>>>>>>> >>>>>>>>> >>>>>>>>> Instead of overriding the *init(ProcessorContext p)* and* close()* >>>>>>>> methods >>>>>>>>> in every Rich function with empty body like: >>>>>>>>> >>>>>>>>> @Override >>>>>>>>> void init(ProcessorContext context) {} >>>>>>>>> >>>>>>>>> @Override >>>>>>>>> void close () {} >>>>>>>>> >>>>>>>>> I thought that we can override them once in *AbstractRichFunction* >>>>> and >>>>>>>>> extent new Rich functions from *AbstractRichFunction*. >>>>>>>>> Basically this can eliminate code copy-paste and ease the >>>>> maintenance. >>>>>>>>> >>>>>>>>> - What about interfaces Initializer, ForeachAction, Merger, >>>>> Predicate, >>>>>>>>>> Reducer? I don't want to say we should/need to add to all, but we >>>>>> should >>>>>>>>>> discuss all of them and add where it does make sense (e.g., >>>>>>>>>> RichForachAction does make sense IMHO) >>>>>>>>> >>>>>>>>> >>>>>>>>> Definitely agree. As I said, the same technique applies to all this >>>>>>>>> interfaces and I didn't want to explode the KIP, just wanted to >>>> give >>>>>> the >>>>>>>>> overall intuition. >>>>>>>>> However, I will update the KIP as I said. >>>>>>>>> >>>>>>>>> >>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- >>>>>> `RichValueXX` >>>>>>>>>> in general -- but why can't we do all this with interfaces only? >>>>>>>>> >>>>>>>>> >>>>>>>>> Sure we can. However the main intuition is we should not force >>>> users >>>>> to >>>>>>>>> implement *init(ProcessorContext)* and *close()* functions every >>>> time >>>>>>>> they >>>>>>>>> use Rich functions. >>>>>>>>> If one needs, she can override the respective methods. However, I >>>> am >>>>>> open >>>>>>>>> for discussion. >>>>>>>>> >>>>>>>>> >>>>>>>>> I'd rather not see the use of `ProcessorContext` spread any >>>> further >>>>>> than >>>>>>>>>> it currently is. So maybe we need another KIP that is done before >>>>>> this? >>>>>>>>>> Otherwise i think the scope of this KIP is becoming too large. >>>>>>>>> >>>>>>>>> >>>>>>>>> That is good point. I wanted to make *init(ProcessorContext)* >>>> method >>>>>>>>> persistent among the library (which use ProcessorContext as an >>>>> input), >>>>>>>>> therefore I put *ProcessorContext* as an input. >>>>>>>>> So the important question is that (as @dguy and @mjsax mentioned) >>>>>> whether >>>>>>>>> continue this KIP without providing users an access to >>>>>> *ProcessorContext* >>>>>>>>> (change *init (ProcessorContext)* to * init()* ) or >>>>>>>>> initiate another KIP before this. >>>>>>>>> >>>>>>>>> [1] >>>>>>>>> http://cr.openjdk.java.net/~mr/se/8/java-se-8-pfd-spec/java- >>>>>>>> se-8-jls-pfd-diffs.pdf >>>>>>>>> >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> Jeyhun >>>>>>>>> >>>>>>>>> On Mon, May 15, 2017 at 7:15 PM, Damian Guy <damian....@gmail.com> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> I'd rather not see the use of `ProcessorContext` spread any >>>> further >>>>>>>> than >>>>>>>>>> it currently is. So maybe we need another KIP that is done before >>>>>> this? >>>>>>>>>> Otherwise i think the scope of this KIP is becoming too large. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Mon, 15 May 2017 at 18:06 Matthias J. Sax < >>>> matth...@confluent.io >>>>>> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> I agree that that `ProcessorContext` interface is too broad in >>>>>> general >>>>>>>>>>> -- this is even true for transform/process, and it's also >>>> reflected >>>>>> in >>>>>>>>>>> the API improvement list we want to do. >>>>>>>>>>> >>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/ >>>>>>>>>> Kafka+Streams+Discussions >>>>>>>>>>> >>>>>>>>>>> So I am wondering, if you question the `RichFunction` approach in >>>>>>>>>>> general? Or if you suggest to either extend the scope of this KIP >>>>> to >>>>>>>>>>> include this---or maybe better, do another KIP for it and delay >>>>> this >>>>>>>> KIP >>>>>>>>>>> until the other one is done? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -Matthias >>>>>>>>>>> >>>>>>>>>>> On 5/15/17 2:35 AM, Damian Guy wrote: >>>>>>>>>>>> Thanks for the KIP. >>>>>>>>>>>> >>>>>>>>>>>> I'm not convinced on the `RichFunction` approach. Do we really >>>>> want >>>>>> to >>>>>>>>>>> give >>>>>>>>>>>> every DSL method access to the `ProcessorContext` ? It has a >>>> bunch >>>>>> of >>>>>>>>>>>> methods on it that seem in-appropriate for some of the DSL >>>>> methods, >>>>>>>>>> i.e, >>>>>>>>>>>> `register`, `getStateStore`, `forward`, `schedule` etc. It is >>>> far >>>>>> too >>>>>>>>>>>> broad. I think it would be better to have a narrower interface >>>>> like >>>>>>>> the >>>>>>>>>>>> `RecordContext` - remembering it is easier to add >>>>>> methods/interfaces >>>>>>>>>>> later >>>>>>>>>>>> than to remove them >>>>>>>>>>>> >>>>>>>>>>>> On Sat, 13 May 2017 at 22:26 Matthias J. Sax < >>>>> matth...@confluent.io >>>>>>> >>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Jeyhun, >>>>>>>>>>>>> >>>>>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a little bit? >>>> I >>>>>>>>>> cannot >>>>>>>>>>>>> follow the explanation in the KIP to see what the problem is. >>>>>>>>>>>>> >>>>>>>>>>>>> For updating the KIP title I don't know -- guess it's up to >>>> you. >>>>>>>>>> Maybe a >>>>>>>>>>>>> committer can comment on this? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Further comments: >>>>>>>>>>>>> >>>>>>>>>>>>> - The KIP get a little hard to read -- can you maybe reformat >>>>> the >>>>>>>>>> wiki >>>>>>>>>>>>> page a little bit? I think using `CodeBlock` would help. >>>>>>>>>>>>> >>>>>>>>>>>>> - What about KStream-KTable joins? You don't have overlaods >>>>> added >>>>>>>> for >>>>>>>>>>>>> them. Why? (Even if I still hope that we don't need to add any >>>>> new >>>>>>>>>>>>> overloads) >>>>>>>>>>>>> >>>>>>>>>>>>> - Why do we need `AbstractRichFunction`? >>>>>>>>>>>>> >>>>>>>>>>>>> - What about interfaces Initializer, ForeachAction, Merger, >>>>>>>>>> Predicate, >>>>>>>>>>>>> Reducer? I don't want to say we should/need to add to all, but >>>> we >>>>>>>>>> should >>>>>>>>>>>>> discuss all of them and add where it does make sense (e.g., >>>>>>>>>>>>> RichForachAction does make sense IMHO) >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- >>>>>>>>>> `RichValueXX` >>>>>>>>>>>>> in general -- but why can't we do all this with interfaces >>>> only? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -Matthias >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 5/11/17 7:00 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for your comments. I think we cannot extend the two >>>>>>>> interfaces >>>>>>>>>>> if >>>>>>>>>>>>> we >>>>>>>>>>>>>> want to keep lambdas. I updated the KIP [1]. Maybe I should >>>>> change >>>>>>>>>> the >>>>>>>>>>>>>> title, because now we are not limiting the KIP to only >>>>>> ValueMapper, >>>>>>>>>>>>>> ValueTransformer and ValueJoiner. >>>>>>>>>>>>>> Please feel free to comment. >>>>>>>>>>>>>> >>>>>>>>>>>>>> [1] >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >>>>>>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ >>>>>>>>>> ValueMapper%2C+and+ValueJoiner >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, May 9, 2017 at 7:36 PM Matthias J. Sax < >>>>>>>>>> matth...@confluent.io> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> If `ValueMapperWithKey` extends `ValueMapper` we don't need >>>> the >>>>>> new >>>>>>>>>>>>>>> overlaod. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> And yes, we need to do one check -- but this happens when >>>>>> building >>>>>>>>>> the >>>>>>>>>>>>>>> topology. At runtime (I mean after KafkaStream#start() we >>>> don't >>>>>>>> need >>>>>>>>>>> any >>>>>>>>>>>>>>> check as we will always use `ValueMapperWithKey`) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 5/9/17 2:55 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks for feedback. >>>>>>>>>>>>>>>> Then we need to overload method >>>>>>>>>>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapper<? super V, ? >>>>> extends >>>>>>>>>> VR> >>>>>>>>>>>>>>>> mapper); >>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapperWithKey<? super >>>> V, >>>>> ? >>>>>>>>>>> extends >>>>>>>>>>>>>>> VR> >>>>>>>>>>>>>>>> mapper); >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> and in runtime (inside processor) we still have to check it >>>> is >>>>>>>>>>>>>>> ValueMapper >>>>>>>>>>>>>>>> or ValueMapperWithKey before wrapping it into the rich >>>>> function. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please correct me if I am wrong. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Tue, May 9, 2017 at 10:56 AM Michal Borowiecki < >>>>>>>>>>>>>>>> michal.borowie...@openbet.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> +1 :) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 08/05/17 23:52, Matthias J. Sax wrote: >>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I was reading the updated KIP and I am wondering, if we >>>>> should >>>>>>>> do >>>>>>>>>>> the >>>>>>>>>>>>>>>>>> design a little different. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Instead of distinguishing between a RichFunction and >>>>>>>>>>> non-RichFunction >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>> runtime level, we would use RichFunctions all the time. >>>>> Thus, >>>>>> on >>>>>>>>>>> the >>>>>>>>>>>>>>> DSL >>>>>>>>>>>>>>>>>> entry level, if a user provides a non-RichFunction, we >>>> wrap >>>>> it >>>>>>>>>> by a >>>>>>>>>>>>>>>>>> RichFunction that is fully implemented by Streams. This >>>>>>>>>>> RichFunction >>>>>>>>>>>>>>>>>> would just forward the call omitting the key: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Just to sketch the idea (incomplete code snippet): >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> public StreamsRichValueMapper implements >>>> RichValueMapper() >>>>> { >>>>>>>>>>>>>>>>>>> private ValueMapper userProvidedMapper; // set by >>>>>>>> constructor >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> public VR apply(final K key, final V1 value1, final V2 >>>>>>>>>> value2) >>>>>>>>>>> { >>>>>>>>>>>>>>>>>>> return userProvidedMapper(value1, value2); >>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> From a performance point of view, I am not sure if the >>>>>>>>>>>>>>>>>> "if(isRichFunction)" including casts etc would have more >>>>>>>> overhead >>>>>>>>>>>>> than >>>>>>>>>>>>>>>>>> this approach (we would do more nested method call for >>>>>>>>>>>>> non-RichFunction >>>>>>>>>>>>>>>>>> which should be more common than RichFunctions). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> This approach should not effect lambdas (or do I miss >>>>>>>> something?) >>>>>>>>>>> and >>>>>>>>>>>>>>>>>> might be cleaner, as we could have one more top level >>>>>> interface >>>>>>>>>>>>>>>>>> `RichFunction` with methods `init()` and `close()` and >>>> also >>>>>>>>>>>>> interfaces >>>>>>>>>>>>>>>>>> for `RichValueMapper` etc. (thus, no abstract classes >>>>>> required). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Any thoughts? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 5/6/17 5:29 PM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks for comments. I extended PR and KIP to include >>>> rich >>>>>>>>>>>>> functions. >>>>>>>>>>>>>>> I >>>>>>>>>>>>>>>>>>> will still have to evaluate the cost of deep copying of >>>>> keys. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 8:02 PM Mathieu Fenniak < >>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hey Matthias, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> My opinion would be that documenting the immutability of >>>>> the >>>>>>>>>> key >>>>>>>>>>> is >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>> best approach available. Better than requiring the key >>>> to >>>>>> be >>>>>>>>>>>>>>>>> serializable >>>>>>>>>>>>>>>>>>>> (as with Jeyhun's last pass at the PR), no performance >>>>> risk. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> It'd be different if Java had immutable type constraints >>>>> of >>>>>>>>>> some >>>>>>>>>>>>>>> kind. >>>>>>>>>>>>>>>>> :-) >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Mathieu >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 11:31 AM, Matthias J. Sax < >>>>>>>>>>>>>>>>> matth...@confluent.io> >>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Agreed about RichFunction. If we follow this path, it >>>>>> should >>>>>>>>>>> cover >>>>>>>>>>>>>>>>>>>>> all(?) DSL interfaces. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> About guarding the key -- I am still not sure what to >>>> do >>>>>>>> about >>>>>>>>>>>>> it... >>>>>>>>>>>>>>>>>>>>> Maybe it might be enough to document it (and name the >>>> key >>>>>>>>>>>>> parameter >>>>>>>>>>>>>>>>> like >>>>>>>>>>>>>>>>>>>>> `readOnlyKey` to make is very clear). Ultimately, I >>>> would >>>>>>>>>> prefer >>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>> guard against any modification, but I have no good idea >>>>> how >>>>>>>> to >>>>>>>>>>> do >>>>>>>>>>>>>>>>> this. >>>>>>>>>>>>>>>>>>>>> Not sure what others think about the risk of corrupted >>>>>>>>>>>>> partitioning >>>>>>>>>>>>>>>>>>>>> (what would be a user error and we could say, well, bad >>>>>> luck, >>>>>>>>>>> you >>>>>>>>>>>>>>> got >>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>> bug in your code, that's not our fault), vs deep copy >>>>> with >>>>>> a >>>>>>>>>>>>>>> potential >>>>>>>>>>>>>>>>>>>>> performance hit (that we can't quantity atm without any >>>>>>>>>>>>> performance >>>>>>>>>>>>>>>>>>>> test). >>>>>>>>>>>>>>>>>>>>> We do have a performance system test. Maybe it's worth >>>>> for >>>>>>>> you >>>>>>>>>>> to >>>>>>>>>>>>>>>>> apply >>>>>>>>>>>>>>>>>>>>> the deep copy strategy and run the test. It's very >>>> basic >>>>>>>>>>>>> performance >>>>>>>>>>>>>>>>>>>>> test only, but might give some insight. If you want to >>>> do >>>>>>>>>> this, >>>>>>>>>>>>> look >>>>>>>>>>>>>>>>>>>>> into folder "tests" for general test setup, and into >>>>>>>>>>>>>>>>>>>>> "tests/kafaktests/benchmarks/streams" to find find the >>>>> perf >>>>>>>>>>> test. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 5/5/17 8:55 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>>>>> Hi Matthias, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I think extending KIP to include RichFunctions totally >>>>>>>> makes >>>>>>>>>>>>>>> sense. >>>>>>>>>>>>>>>>>>>> So, >>>>>>>>>>>>>>>>>>>>>> we don't want to guard the keys because it is >>>> costly. >>>>>>>>>>>>>>>>>>>>>> If we introduce RichFunctions I think it should not be >>>>>>>>>> limited >>>>>>>>>>>>> only >>>>>>>>>>>>>>>>>>>> the 3 >>>>>>>>>>>>>>>>>>>>>> interfaces proposed in KIP but to wide range of >>>>>> interfaces. >>>>>>>>>>>>>>>>>>>>>> Please correct me if I am wrong. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 12:04 AM Matthias J. Sax < >>>>>>>>>>>>>>>>> matth...@confluent.io >>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> One follow up. There was this email on the user list: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>> http://search-hadoop.com/m/Kafka/uyzND17KhCaBzPSZ1?subj= >>>>>>>>>>>>>>>>>>>>> >>>>> Shouldn+t+the+initializer+of+a+stream+aggregate+accept+the+ >>>>>>>>>> key+ >>>>>>>>>>>>>>>>>>>>>>> It might make sense so include Initializer, Adder, >>>> and >>>>>>>>>>>>> Substractor >>>>>>>>>>>>>>>>>>>>>>> inferface, too. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> And we should double check if there are other >>>> interface >>>>>> we >>>>>>>>>>> might >>>>>>>>>>>>>>>>> miss >>>>>>>>>>>>>>>>>>>>> atm. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 5/4/17 1:31 PM, Matthias J. Sax wrote: >>>>>>>>>>>>>>>>>>>>>>>> Thanks for updating the KIP. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Deep copying the key will work for sure, but I am >>>>>> actually >>>>>>>>>> a >>>>>>>>>>>>>>> little >>>>>>>>>>>>>>>>>>>> bit >>>>>>>>>>>>>>>>>>>>>>>> worried about performance impact... We might want to >>>>> do >>>>>>>>>> some >>>>>>>>>>>>> test >>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>> quantify this impact. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Btw: this remind me about the idea of `RichFunction` >>>>>>>>>>> interface >>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>> would allow users to access record metadata (like >>>>>>>>>> timestamp, >>>>>>>>>>>>>>>>> offset, >>>>>>>>>>>>>>>>>>>>>>>> partition etc) within DSL. This would be a similar >>>>>>>> concept. >>>>>>>>>>>>>>> Thus, I >>>>>>>>>>>>>>>>>>>> am >>>>>>>>>>>>>>>>>>>>>>>> wondering, if it would make sense to enlarge the >>>> scope >>>>>> of >>>>>>>>>>> this >>>>>>>>>>>>>>> KIP >>>>>>>>>>>>>>>>> by >>>>>>>>>>>>>>>>>>>>>>>> that? WDYT? >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 5/3/17 2:08 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>>>>>>>> Hi Mathieu, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Thanks for feedback. I followed similar approach >>>> and >>>>>>>>>> updated >>>>>>>>>>>>> PR >>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>> KIP >>>>>>>>>>>>>>>>>>>>>>>>> accordingly. I tried to guard the key in Processors >>>>>>>>>> sending >>>>>>>>>>> a >>>>>>>>>>>>>>> copy >>>>>>>>>>>>>>>>>>>> of >>>>>>>>>>>>>>>>>>>>> an >>>>>>>>>>>>>>>>>>>>>>>>> actual key. >>>>>>>>>>>>>>>>>>>>>>>>> Because I am doing deep copy of an object, I think >>>>>> memory >>>>>>>>>>> can >>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>> bottleneck >>>>>>>>>>>>>>>>>>>>>>>>> in some use-cases. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Tue, May 2, 2017 at 5:10 PM Mathieu Fenniak < >>>>>>>>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> >>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> This approach would change ValueMapper (...etc) to >>>>> be >>>>>>>>>>>>> classes, >>>>>>>>>>>>>>>>>>>> rather >>>>>>>>>>>>>>>>>>>>>>> than >>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, which is also a backwards incompatible >>>>>>>>>> change. >>>>>>>>>>>>> An >>>>>>>>>>>>>>>>>>>>>>> alternative >>>>>>>>>>>>>>>>>>>>>>>>>> approach that would be backwards compatible would >>>> be >>>>>> to >>>>>>>>>>>>> define >>>>>>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, and provide overrides where those >>>>>> interfaces >>>>>>>>>>> are >>>>>>>>>>>>>>>>> used. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Unfortunately, making the key parameter as "final" >>>>>>>>>> doesn't >>>>>>>>>>>>>>> change >>>>>>>>>>>>>>>>>>>>> much >>>>>>>>>>>>>>>>>>>>>>>>>> about guarding against key change. It only >>>> prevents >>>>>> the >>>>>>>>>>>>>>>>> parameter >>>>>>>>>>>>>>>>>>>>>>> variable >>>>>>>>>>>>>>>>>>>>>>>>>> from being reassigned. If the key type is a >>>> mutable >>>>>>>>>> object >>>>>>>>>>>>>>> (eg. >>>>>>>>>>>>>>>>>>>>>>> byte[]), >>>>>>>>>>>>>>>>>>>>>>>>>> it can still be mutated. (eg. key[0] = 0). But >>>> I'm >>>>>> not >>>>>>>>>>>>> really >>>>>>>>>>>>>>>>> sure >>>>>>>>>>>>>>>>>>>>>>> there's >>>>>>>>>>>>>>>>>>>>>>>>>> much that can be done about that. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 5:39 PM, Jeyhun Karimov < >>>>>>>>>>>>>>>>>>>> je.kari...@gmail.com >>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for comments. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> The concerns makes sense. Although we can guard >>>> for >>>>>>>>>>>>> immutable >>>>>>>>>>>>>>>>> keys >>>>>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>>>>>>>> current implementation (with few changes), I >>>> didn't >>>>>>>>>>> consider >>>>>>>>>>>>>>>>>>>>> backward >>>>>>>>>>>>>>>>>>>>>>>>>>> compatibility. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> In this case 2 solutions come to my mind. In both >>>>>>>> cases, >>>>>>>>>>>>> user >>>>>>>>>>>>>>>>>>>>> accesses >>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>> key in Object type, as passing extra type >>>> parameter >>>>>>>> will >>>>>>>>>>>>> break >>>>>>>>>>>>>>>>>>>>>>>>>>> backwards-compatibility. So user has to cast to >>>>>> actual >>>>>>>>>>> key >>>>>>>>>>>>>>>>> type. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> 1. Firstly, We can overload apply method with 2 >>>>>>>> argument >>>>>>>>>>>>> (key >>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>> value) >>>>>>>>>>>>>>>>>>>>>>>>>>> and force key to be *final*. By doing this, I >>>>> think >>>>>> we >>>>>>>>>>> can >>>>>>>>>>>>>>>>>>>> address >>>>>>>>>>>>>>>>>>>>>>> both >>>>>>>>>>>>>>>>>>>>>>>>>>> backward-compatibility and guarding against key >>>>>> change. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Secondly, we can create class KeyAccess like: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> public class KeyAccess { >>>>>>>>>>>>>>>>>>>>>>>>>>> Object key; >>>>>>>>>>>>>>>>>>>>>>>>>>> public void beforeApply(final Object key) { >>>>>>>>>>>>>>>>>>>>>>>>>>> this.key = key; >>>>>>>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>>>>>>> public Object getKey() { >>>>>>>>>>>>>>>>>>>>>>>>>>> return key; >>>>>>>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> We can extend *ValueMapper, ValueJoiner* and >>>>>>>>>>>>>>> *ValueTransformer* >>>>>>>>>>>>>>>>>>>> from >>>>>>>>>>>>>>>>>>>>>>>>>>> *KeyAccess*. Inside processor (for example >>>>>>>>>>>>>>>>>>>>> *KTableMapValuesProcessor*) >>>>>>>>>>>>>>>>>>>>>>>>>>> before calling *mapper.apply(value)* we can set >>>> the >>>>>>>>>> *key* >>>>>>>>>>> by >>>>>>>>>>>>>>>>>>>>>>>>>>> *mapper.beforeApply(key)*. As a result, user can >>>>> use >>>>>>>>>>>>>>> *getKey()* >>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>> access >>>>>>>>>>>>>>>>>>>>>>>>>>> the key inside *apply(value)* method. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 7:24 PM Matthias J. Sax < >>>>>>>>>>>>>>>>>>>>> matth...@confluent.io >>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun, >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> thanks a lot for the KIP! >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I think there are two issues we need to address: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> (1) The KIP does not consider backward >>>>>> compatibility. >>>>>>>>>>> Users >>>>>>>>>>>>>>> did >>>>>>>>>>>>>>>>>>>>>>>>>> complain >>>>>>>>>>>>>>>>>>>>>>>>>>>> about this in past releases already, and as the >>>>> user >>>>>>>>>> base >>>>>>>>>>>>>>>>> grows, >>>>>>>>>>>>>>>>>>>> we >>>>>>>>>>>>>>>>>>>>>>>>>>>> should not break backward compatibility in >>>> future >>>>>>>>>>> releases >>>>>>>>>>>>>>>>>>>> anymore. >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thus, we should think of a better way to allow >>>> key >>>>>>>>>>> access. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu's comment goes into the same direction >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile >>>>> failures >>>>>>>>>> that >>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>> need >>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-) >>>>>>>>>>>>>>>>>>>>>>>>>>>> (2) Another concern is, that there is no guard >>>> to >>>>>>>>>> prevent >>>>>>>>>>>>>>> user >>>>>>>>>>>>>>>>>>>> code >>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>>> modify the key. This might corrupt partitioning >>>> if >>>>>>>>>> users >>>>>>>>>>> do >>>>>>>>>>>>>>>>> alter >>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>> key (accidentally -- or users are just not aware >>>>>> that >>>>>>>>>>> they >>>>>>>>>>>>>>> are >>>>>>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>>>>>>>>>>>>>> allowed to modify the provided key object) and >>>>> thus >>>>>>>>>> break >>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>> application. (This was the original motivation >>>> to >>>>>> not >>>>>>>>>>>>> provide >>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>> key >>>>>>>>>>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>>>>>>>>> the first place -- it's guards against >>>>>> modification.) >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On 5/1/17 6:31 AM, Mathieu Fenniak wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun, >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> I just want to add my voice that, I too, have >>>>>> wished >>>>>>>>>> for >>>>>>>>>>>>>>>>> access >>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>> record key during a mapValues or similar >>>>> operation. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile >>>> failures >>>>>>>> that >>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>> need >>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-) But >>>>> at >>>>>>>>>> least >>>>>>>>>>>>> it >>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>>>>>>> pretty clear and easy change. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 6:55 AM, Jeyhun Karimov >>>> < >>>>>>>>>>>>>>>>>>>>>>> je.kari...@gmail.com >>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Dear community, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I want to share KIP-149 [1] based on issues >>>>>>>>>> KAFKA-4218 >>>>>>>>>>>>> [2], >>>>>>>>>>>>>>>>>>>>>>>>>> KAFKA-4726 >>>>>>>>>>>>>>>>>>>>>>>>>>>> [3], >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> KAFKA-3745 [4]. The related PR can be found at >>>>>> [5]. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to get your comments. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/ >>>>>>>>>> confluence/display/KAFKA/KIP- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ValueMapper%2C+and+ValueJoiner >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [2] https://issues.apache.org/jira >>>>>>>> /browse/KAFKA-4218 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [3] https://issues.apache.org/jira >>>>>>>> /browse/KAFKA-4726 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [4] https://issues.apache.org/jira >>>>>>>> /browse/KAFKA-3745 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [5] https://github.com/apache/kafka/pull/2946 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>> >>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>> -Cheers >>>>> >>>>> Jeyhun >>>>> >>>> >> >> -- > -Cheers > > Jeyhun >
signature.asc
Description: OpenPGP digital signature