Hi Matthias, Thanks for your comments.
I tested the deep copy approach. It has significant overhead. Especially for "light" and stateless operators it slows down the topology significantly (> 20% ). I think "warning" users about not-changing the key is better warning them about possible performance loss. About the interfaces, additionally I considered adding InitializerWithKey, AggregatorWithKey and ValueTransformerWithKey. I think they are included in PR but not in KIP. I will also include them in KIP, sorry my bad. Including ReducerWithKey definitely makes sense. Thanks. One thing I want to mention is that, maybe we should deprecate methods with argument type ValueTransformerSupplier (KStream.transformValues(...)) and and as a whole the ValueTransformerSupplier interface. We can use ValueTransformer/ValueTransformerWithKey type instead without additional supplier layer. Cheers, Jeyhun On Thu, May 25, 2017 at 1:07 AM Matthias J. Sax <matth...@confluent.io> wrote: > One more question: > > Should we add any of > - InitizialierWithKey > - ReducerWithKey > - ValueTransformerWithKey > > To get consistent/complete API, it might be a good idea. Any thoughts? > > > -Matthias > > > On 5/24/17 3:47 PM, Matthias J. Sax wrote: > > Jeyhun, > > > > I was just wondering if you did look into the key-deep-copy idea we > > discussed. I am curious to see what the impact might be. > > > > > > -Matthias > > > > On 5/20/17 2:03 AM, Jeyhun Karimov wrote: > >> Hi, > >> > >> Thanks for your comments. I rethink about including rich functions into > >> this KIP. > >> I think once we include rich functions in this KIP and then fix > >> ProcessorContext in another KIP and incorporate with existing rich > >> functions, the code will not be backwards compatible. > >> > >> I see Damian's and your point more clearly now. > >> > >> Conclusion: we include only withKey interfaces in this KIP (I updated > the > >> KIP), I will try also initiate another KIP for rich functions. > >> > >> Cheers, > >> Jeyhun > >> > >> On Fri, May 19, 2017 at 10:50 PM Matthias J. Sax <matth...@confluent.io > > > >> wrote: > >> > >>> With the current KIP, using ValueMapper and ValueMapperWithKey > >>> interfaces, RichFunction seems to be an independent add-on. To fix the > >>> original issue to allow key access, RichFunctions are not required > IMHO. > >>> > >>> I initially put the RichFunction idea on the table, because I was > hoping > >>> to get a uniform API. And I think, is was good to consider them -- > >>> however, the discussion showed that they are not necessary for key > >>> access. Thus, it seems to be better to handle RichFunctions in an own > >>> KIP. The ProcessorContext/RecordContext issues seems to be a main > >>> challenge for this. And introducing RichFunctions with parameter-less > >>> init() method, seem not to help too much. We would get an > "intermediate" > >>> API that we want to change anyway later on... > >>> > >>> As you put already much effort into RichFunction, feel free do push > this > >>> further and start a new KIP (we can do this even in parallel) -- we > >>> don't want to slow you down :) But it make the discussion and code > >>> review easier, if we separate both IMHO. > >>> > >>> > >>> -Matthias > >>> > >>> > >>> On 5/19/17 2:25 AM, Jeyhun Karimov wrote: > >>>> Hi Damian, > >>>> > >>>> Thanks for your comments. I think providing to users *interface* > rather > >>>> than *abstract class* should be preferred (Matthias also raised this > >>> issue > >>>> ), anyway I changed the corresponding parts of KIP. > >>>> > >>>> Regarding with passing additional contextual information, I think it > is a > >>>> tradeoff, > >>>> 1) first, we fix the context parameter for *init() *method in another > PR > >>>> and solve Rich functions afterwards > >>>> 2) first, we fix the requested issues on jira ([1-3]) with providing > >>>> (not-complete) Rich functions and integrate the context parameters to > >>> this > >>>> afterwards (like improvement) > >>>> > >>>> To me, the second approach seems more incremental. However you are > right, > >>>> the names might confuse the users. > >>>> > >>>> > >>>> > >>>> [1] https://issues.apache.org/jira/browse/KAFKA-4218 > >>>> [2] https://issues.apache.org/jira/browse/KAFKA-4726 > >>>> [3] https://issues.apache.org/jira/browse/KAFKA-3745 > >>>> > >>>> > >>>> Cheers, > >>>> Jeyhun > >>>> > >>>> > >>>> On Fri, May 19, 2017 at 10:42 AM Damian Guy <damian....@gmail.com> > >>> wrote: > >>>> > >>>>> Hi, > >>>>> > >>>>> I see you've removed the `ProcessorContext` from the RichFunction > which > >>> is > >>>>> good, but why is it a `RichFunction`? I'd have expected it to pass > some > >>>>> additional contextual information, i.e., the `RecordContext` that > >>> contains > >>>>> just the topic, partition, timestamp, offset. I'm ok with it not > >>> passing > >>>>> this contextual information, but is the naming incorrect? I'm not > sure, > >>>>> tbh. I'm wondering if we should drop `RichFunctions` until we can do > it > >>>>> properly with the correct context? > >>>>> > >>>>> Also, i don't like the abstract classes: RichValueMapper, > >>> RichValueJoiner, > >>>>> RichInitializer etc. Why can't they not just be interfaces? > Generally we > >>>>> should provide users with Intefaces to code against, not classes. > >>>>> > >>>>> Thanks, > >>>>> Damian > >>>>> > >>>>> On Fri, 19 May 2017 at 00:50 Jeyhun Karimov <je.kari...@gmail.com> > >>> wrote: > >>>>> > >>>>>> Hi, > >>>>>> > >>>>>> Thanks. I initiated the PR as well, to get a better overview. > >>>>>> > >>>>>> The only reason that I used abstract class instead of interface for > >>> Rich > >>>>>> functions is that in future if we will have some > AbstractRichFunction > >>>>>> abstract classes, > >>>>>> we can easily extend: > >>>>>> > >>>>>> public abstract class RichValueMapper<K, V, VR> implements > >>>>>> ValueMapperWithKey<K, V, VR>, RichFunction *extends > >>>>> AbstractRichFunction*{ > >>>>>> } > >>>>>> With interfaces we are only limited to interfaces for inheritance. > >>>>>> > >>>>>> However, I think we can easily change it (from abstract class -> > >>>>> interface) > >>>>>> if you think interface is a better fit. > >>>>>> > >>>>>> > >>>>>> Cheers, > >>>>>> Jeyhun > >>>>>> > >>>>>> > >>>>>> On Fri, May 19, 2017 at 1:00 AM Matthias J. Sax < > matth...@confluent.io > >>>> > >>>>>> wrote: > >>>>>> > >>>>>>> Thanks for the update and explanations. The KIP is quite improved > now > >>>>> -- > >>>>>>> great job! > >>>>>>> > >>>>>>> One more question: Why are RichValueMapper etc abstract classes and > >>> not > >>>>>>> interfaces? > >>>>>>> > >>>>>>> > >>>>>>> Overall, I like the KIP a lot! > >>>>>>> > >>>>>>> > >>>>>>> -Matthias > >>>>>>> > >>>>>>> > >>>>>>> On 5/16/17 7:03 AM, Jeyhun Karimov wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> Thanks for your comments. > >>>>>>>> > >>>>>>>> I think supporting Lambdas for `withKey` and > `AbstractRichFunction` > >>>>>>>>> don't go together, as Lambdas are only supported for interfaces > >>>>> AFAIK. > >>>>>>>> > >>>>>>>> > >>>>>>>> Maybe I misunderstood your comment. > >>>>>>>> *withKey* and and *withOnlyValue* are interfaces. So they don't > have > >>>>>>> direct > >>>>>>>> relation with *AbstractRichFunction*. > >>>>>>>> *withKey* and and *withOnlyValue* interfaces have only one > method , > >>>>> so > >>>>>>> we > >>>>>>>> can use lambdas. > >>>>>>>> Where does the *AbstractRichFunction* comes to play? Inside Rich > >>>>>>> functions. > >>>>>>>> And we use Rich functions in 2 places: > >>>>>>>> > >>>>>>>> 1. User doesn't use rich functions. Just regular *withKey* and and > >>>>>>>> *withOnlyValue* interfaces(both support lambdas) . We get those > >>>>>>> interfaces > >>>>>>>> and wrap into Rich function while building the topology, and send > to > >>>>>>>> Processor. > >>>>>>>> 2. User does use rich functions (Rich functions implement > *withKey* > >>>>>>>> interface). As a result no lamdas here by definition. In this > case, > >>>>>> while > >>>>>>>> building the topology we do a type check if the object type is > >>>>>>>> *withKey* or *RichFunction*. > >>>>>>>> > >>>>>>>> So *AbstractRichFunction* is just syntactic sugar for Rich > functions > >>>>>> and > >>>>>>>> does not affect using lambdas. > >>>>>>>> > >>>>>>>> Thus, if we want to support Lambdas for `withKey`, we need to > have a > >>>>>>>>> interface approach like this > >>>>>>>>> - RichFunction -> only adding init() and close() > >>>>>>>>> - ValueMapper > >>>>>>>>> - ValueMapperWithKey > >>>>>>>>> - RichValueMapper extends ValueMapperWithKey, RichFunction > >>>>>>>> > >>>>>>>> > >>>>>>>> As I said above, currently we support lambdas for *withKey* > >>>>> interfaces > >>>>>> as > >>>>>>>> well. However, I agree with your idea and I will remove the > >>>>>>>> AbstractRichFunction from the design. > >>>>>>>> > >>>>>>>> As an alternative, we could argue, that it is sufficient to > support > >>>>>>>>> Lambdas for the "plain" API only, but not for any "extended API". > >>>>> For > >>>>>>>>> this, RichFunction could add key+init+close and > AbstractRichFunction > >>>>>>>>> would allow to only care about getting the key. > >>>>>>>>> Not sure, which one is better. I don't like the idea of more > >>>>>> overloaded > >>>>>>>>> methods to get Lambdas for `withKey` interfaces too much because > we > >>>>>> have > >>>>>>>>> already so many overlaods. On the other hand, I do see value in > >>>>>>>>> supporting Lambdas for `withKey`. > >>>>>>>> > >>>>>>>> > >>>>>>>> Just to clarify, with current design we have only one extra > >>>>> overloaded > >>>>>>>> method per *withOnlyValue* interface: which is *withKey* version > of > >>>>>>>> particular interface. > >>>>>>>> We don't need extra overload for Rich function as Rich function > >>>>>>> implements > >>>>>>>> *withKey* interface as a result they have same type. We > differentiate > >>>>>>> them > >>>>>>>> while building the topology. > >>>>>>>> We supported lambdas for *withKey* APIs because of the comment: > >>>>>>>> > >>>>>>>> @Jeyhun: I did not put any thought into this, but can we have a > >>>>> design > >>>>>>>>> that allows for both? Also, with regard to lambdas, it might make > >>>>>> sense > >>>>>>>>> to allow for both `V -> newV` and `(K, V) -> newV` ? > >>>>>>>> > >>>>>>>> > >>>>>>>> However, I don't think that this complicates the overall design > >>>>>>>> significantly. > >>>>>>>> > >>>>>>>> > >>>>>>>> Depending on what we want to support, it might make sense to > >>>>>>>>> include/exclude RichFunctions from this KIP -- and thus, this > also > >>>>>>>>> determines if we should have a "ProcessorContext KIP" before > driving > >>>>>>>>> this KIP further. > >>>>>>>> > >>>>>>>> > >>>>>>>> Based on our discussion I think we should keep Rich functions as I > >>>>>> don't > >>>>>>>> think that they bring extra layer of overhead to library. > >>>>>>>> > >>>>>>>> Any comments are appreciated. > >>>>>>>> > >>>>>>>> Cheers, > >>>>>>>> Jeyhun > >>>>>>>> > >>>>>>>> > >>>>>>>> On Tue, May 16, 2017 at 12:10 AM Matthias J. Sax < > >>>>>> matth...@confluent.io> > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>>> Jeyhun, > >>>>>>>>> > >>>>>>>>> thanks for the update. > >>>>>>>>> > >>>>>>>>> I think supporting Lambdas for `withKey` and > `AbstractRichFunction` > >>>>>>>>> don't go together, as Lambdas are only supported for interfaces > >>>>> AFAIK. > >>>>>>>>> > >>>>>>>>> Thus, if we want to support Lambdas for `withKey`, we need to > have a > >>>>>>>>> interface approach like this > >>>>>>>>> > >>>>>>>>> - RichFunction -> only adding init() and close() > >>>>>>>>> > >>>>>>>>> - ValueMapper > >>>>>>>>> - ValueMapperWithKey > >>>>>>>>> > >>>>>>>>> - RichValueMapper extends ValueMapperWithKey, RichFunction > >>>>>>>>> > >>>>>>>>> For this approach, AbstractRichFunction does not make sense > anymore, > >>>>>> as > >>>>>>>>> the only purpose of `RichFunction` is to allow the > implementation of > >>>>>>>>> init() and close() -- if you don't want those, you would > implement a > >>>>>>>>> different interface (ie, ValueMapperWithKey) > >>>>>>>>> > >>>>>>>>> As an alternative, we could argue, that it is sufficient to > support > >>>>>>>>> Lambdas for the "plain" API only, but not for any "extended API". > >>>>> For > >>>>>>>>> this, RichFunction could add key+init+close and > AbstractRichFunction > >>>>>>>>> would allow to only care about getting the key. > >>>>>>>>> > >>>>>>>>> Not sure, which one is better. I don't like the idea of more > >>>>>> overloaded > >>>>>>>>> methods to get Lambdas for `withKey` interfaces too much because > we > >>>>>> have > >>>>>>>>> already so many overlaods. On the other hand, I do see value in > >>>>>>>>> supporting Lambdas for `withKey`. > >>>>>>>>> > >>>>>>>>> Depending on what we want to support, it might make sense to > >>>>>>>>> include/exclude RichFunctions from this KIP -- and thus, this > also > >>>>>>>>> determines if we should have a "ProcessorContext KIP" before > driving > >>>>>>>>> this KIP further. > >>>>>>>>> > >>>>>>>>> Thoughts? > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -Matthias > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 5/15/17 11:01 AM, Jeyhun Karimov wrote: > >>>>>>>>>> Hi, > >>>>>>>>>> > >>>>>>>>>> Sorry for super late response. Thanks for your comments. > >>>>>>>>>> > >>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a little bit? I > >>>>>> cannot > >>>>>>>>>>> follow the explanation in the KIP to see what the problem is. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> - From [1] says "A functional interface is an interface that has > >>>>> just > >>>>>>> one > >>>>>>>>>> abstract method, and thus represents a single function > contract". > >>>>>>>>>> So basically once we extend some interface from another (in our > >>>>> case, > >>>>>>>>>> ValueMapperWithKey from ValueMapper) we cannot use lambdas in > the > >>>>>>>>> extended > >>>>>>>>>> interface. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Further comments: > >>>>>>>>>>> - The KIP get a little hard to read -- can you maybe reformat > the > >>>>>>> wiki > >>>>>>>>>>> page a little bit? I think using `CodeBlock` would help. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> - I will work on the KIP. > >>>>>>>>>> > >>>>>>>>>> - What about KStream-KTable joins? You don't have overlaods > added > >>>>>> for > >>>>>>>>>>> them. Why? (Even if I still hope that we don't need to add any > new > >>>>>>>>>>> overloads) > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> - Actually there are more than one Processor and public APIs to > be > >>>>>>>>>> changed (KStream-KTable > >>>>>>>>>> joins is one case). However all of them has similar structure: > we > >>>>>>>>> overload > >>>>>>>>>> the *method* with *methodWithKey*, > >>>>>>>>>> wrap it into the Rich function, send to processor and inside the > >>>>>>>>> processor > >>>>>>>>>> call *init* and *close* methods of the Rich function. > >>>>>>>>>> As I wrote in KIP, I wanted to demonstrate the overall idea with > >>>>> only > >>>>>>>>>> *ValueMapper* as the same can be applied to all changes. > >>>>>>>>>> Anyway I will update the KIP. > >>>>>>>>>> > >>>>>>>>>> - Why do we need `AbstractRichFunction`? > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Instead of overriding the *init(ProcessorContext p)* and* > close()* > >>>>>>>>> methods > >>>>>>>>>> in every Rich function with empty body like: > >>>>>>>>>> > >>>>>>>>>> @Override > >>>>>>>>>> void init(ProcessorContext context) {} > >>>>>>>>>> > >>>>>>>>>> @Override > >>>>>>>>>> void close () {} > >>>>>>>>>> > >>>>>>>>>> I thought that we can override them once in > *AbstractRichFunction* > >>>>>> and > >>>>>>>>>> extent new Rich functions from *AbstractRichFunction*. > >>>>>>>>>> Basically this can eliminate code copy-paste and ease the > >>>>>> maintenance. > >>>>>>>>>> > >>>>>>>>>> - What about interfaces Initializer, ForeachAction, Merger, > >>>>>> Predicate, > >>>>>>>>>>> Reducer? I don't want to say we should/need to add to all, but > we > >>>>>>> should > >>>>>>>>>>> discuss all of them and add where it does make sense (e.g., > >>>>>>>>>>> RichForachAction does make sense IMHO) > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Definitely agree. As I said, the same technique applies to all > this > >>>>>>>>>> interfaces and I didn't want to explode the KIP, just wanted to > >>>>> give > >>>>>>> the > >>>>>>>>>> overall intuition. > >>>>>>>>>> However, I will update the KIP as I said. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- > >>>>>>> `RichValueXX` > >>>>>>>>>>> in general -- but why can't we do all this with interfaces > only? > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Sure we can. However the main intuition is we should not force > >>>>> users > >>>>>> to > >>>>>>>>>> implement *init(ProcessorContext)* and *close()* functions every > >>>>> time > >>>>>>>>> they > >>>>>>>>>> use Rich functions. > >>>>>>>>>> If one needs, she can override the respective methods. However, > I > >>>>> am > >>>>>>> open > >>>>>>>>>> for discussion. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> I'd rather not see the use of `ProcessorContext` spread any > >>>>> further > >>>>>>> than > >>>>>>>>>>> it currently is. So maybe we need another KIP that is done > before > >>>>>>> this? > >>>>>>>>>>> Otherwise i think the scope of this KIP is becoming too large. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> That is good point. I wanted to make *init(ProcessorContext)* > >>>>> method > >>>>>>>>>> persistent among the library (which use ProcessorContext as an > >>>>>> input), > >>>>>>>>>> therefore I put *ProcessorContext* as an input. > >>>>>>>>>> So the important question is that (as @dguy and @mjsax > mentioned) > >>>>>>> whether > >>>>>>>>>> continue this KIP without providing users an access to > >>>>>>> *ProcessorContext* > >>>>>>>>>> (change *init (ProcessorContext)* to * init()* ) or > >>>>>>>>>> initiate another KIP before this. > >>>>>>>>>> > >>>>>>>>>> [1] > >>>>>>>>>> http://cr.openjdk.java.net/~mr/se/8/java-se-8-pfd-spec/java- > >>>>>>>>> se-8-jls-pfd-diffs.pdf > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Cheers, > >>>>>>>>>> Jeyhun > >>>>>>>>>> > >>>>>>>>>> On Mon, May 15, 2017 at 7:15 PM, Damian Guy < > damian....@gmail.com> > >>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> I'd rather not see the use of `ProcessorContext` spread any > >>>>> further > >>>>>>>>> than > >>>>>>>>>>> it currently is. So maybe we need another KIP that is done > before > >>>>>>> this? > >>>>>>>>>>> Otherwise i think the scope of this KIP is becoming too large. > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On Mon, 15 May 2017 at 18:06 Matthias J. Sax < > >>>>> matth...@confluent.io > >>>>>>> > >>>>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> I agree that that `ProcessorContext` interface is too broad in > >>>>>>> general > >>>>>>>>>>>> -- this is even true for transform/process, and it's also > >>>>> reflected > >>>>>>> in > >>>>>>>>>>>> the API improvement list we want to do. > >>>>>>>>>>>> > >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/ > >>>>>>>>>>> Kafka+Streams+Discussions > >>>>>>>>>>>> > >>>>>>>>>>>> So I am wondering, if you question the `RichFunction` > approach in > >>>>>>>>>>>> general? Or if you suggest to either extend the scope of this > KIP > >>>>>> to > >>>>>>>>>>>> include this---or maybe better, do another KIP for it and > delay > >>>>>> this > >>>>>>>>> KIP > >>>>>>>>>>>> until the other one is done? > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> -Matthias > >>>>>>>>>>>> > >>>>>>>>>>>> On 5/15/17 2:35 AM, Damian Guy wrote: > >>>>>>>>>>>>> Thanks for the KIP. > >>>>>>>>>>>>> > >>>>>>>>>>>>> I'm not convinced on the `RichFunction` approach. Do we > really > >>>>>> want > >>>>>>> to > >>>>>>>>>>>> give > >>>>>>>>>>>>> every DSL method access to the `ProcessorContext` ? It has a > >>>>> bunch > >>>>>>> of > >>>>>>>>>>>>> methods on it that seem in-appropriate for some of the DSL > >>>>>> methods, > >>>>>>>>>>> i.e, > >>>>>>>>>>>>> `register`, `getStateStore`, `forward`, `schedule` etc. It is > >>>>> far > >>>>>>> too > >>>>>>>>>>>>> broad. I think it would be better to have a narrower > interface > >>>>>> like > >>>>>>>>> the > >>>>>>>>>>>>> `RecordContext` - remembering it is easier to add > >>>>>>> methods/interfaces > >>>>>>>>>>>> later > >>>>>>>>>>>>> than to remove them > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Sat, 13 May 2017 at 22:26 Matthias J. Sax < > >>>>>> matth...@confluent.io > >>>>>>>> > >>>>>>>>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> Jeyhun, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a little > bit? > >>>>> I > >>>>>>>>>>> cannot > >>>>>>>>>>>>>> follow the explanation in the KIP to see what the problem > is. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> For updating the KIP title I don't know -- guess it's up to > >>>>> you. > >>>>>>>>>>> Maybe a > >>>>>>>>>>>>>> committer can comment on this? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Further comments: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> - The KIP get a little hard to read -- can you maybe > reformat > >>>>>> the > >>>>>>>>>>> wiki > >>>>>>>>>>>>>> page a little bit? I think using `CodeBlock` would help. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> - What about KStream-KTable joins? You don't have overlaods > >>>>>> added > >>>>>>>>> for > >>>>>>>>>>>>>> them. Why? (Even if I still hope that we don't need to add > any > >>>>>> new > >>>>>>>>>>>>>> overloads) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> - Why do we need `AbstractRichFunction`? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> - What about interfaces Initializer, ForeachAction, Merger, > >>>>>>>>>>> Predicate, > >>>>>>>>>>>>>> Reducer? I don't want to say we should/need to add to all, > but > >>>>> we > >>>>>>>>>>> should > >>>>>>>>>>>>>> discuss all of them and add where it does make sense (e.g., > >>>>>>>>>>>>>> RichForachAction does make sense IMHO) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- > >>>>>>>>>>> `RichValueXX` > >>>>>>>>>>>>>> in general -- but why can't we do all this with interfaces > >>>>> only? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On 5/11/17 7:00 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Thanks for your comments. I think we cannot extend the two > >>>>>>>>> interfaces > >>>>>>>>>>>> if > >>>>>>>>>>>>>> we > >>>>>>>>>>>>>>> want to keep lambdas. I updated the KIP [1]. Maybe I should > >>>>>> change > >>>>>>>>>>> the > >>>>>>>>>>>>>>> title, because now we are not limiting the KIP to only > >>>>>>> ValueMapper, > >>>>>>>>>>>>>>> ValueTransformer and ValueJoiner. > >>>>>>>>>>>>>>> Please feel free to comment. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> [1] > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >>>>>>>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ > >>>>>>>>>>> ValueMapper%2C+and+ValueJoiner > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On Tue, May 9, 2017 at 7:36 PM Matthias J. Sax < > >>>>>>>>>>> matth...@confluent.io> > >>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> If `ValueMapperWithKey` extends `ValueMapper` we don't > need > >>>>> the > >>>>>>> new > >>>>>>>>>>>>>>>> overlaod. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> And yes, we need to do one check -- but this happens when > >>>>>>> building > >>>>>>>>>>> the > >>>>>>>>>>>>>>>> topology. At runtime (I mean after KafkaStream#start() we > >>>>> don't > >>>>>>>>> need > >>>>>>>>>>>> any > >>>>>>>>>>>>>>>> check as we will always use `ValueMapperWithKey`) > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> On 5/9/17 2:55 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Thanks for feedback. > >>>>>>>>>>>>>>>>> Then we need to overload method > >>>>>>>>>>>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapper<? super V, ? > >>>>>> extends > >>>>>>>>>>> VR> > >>>>>>>>>>>>>>>>> mapper); > >>>>>>>>>>>>>>>>> with > >>>>>>>>>>>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapperWithKey<? > super > >>>>> V, > >>>>>> ? > >>>>>>>>>>>> extends > >>>>>>>>>>>>>>>> VR> > >>>>>>>>>>>>>>>>> mapper); > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> and in runtime (inside processor) we still have to check > it > >>>>> is > >>>>>>>>>>>>>>>> ValueMapper > >>>>>>>>>>>>>>>>> or ValueMapperWithKey before wrapping it into the rich > >>>>>> function. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Please correct me if I am wrong. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On Tue, May 9, 2017 at 10:56 AM Michal Borowiecki < > >>>>>>>>>>>>>>>>> michal.borowie...@openbet.com> wrote: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> +1 :) > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On 08/05/17 23:52, Matthias J. Sax wrote: > >>>>>>>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> I was reading the updated KIP and I am wondering, if we > >>>>>> should > >>>>>>>>> do > >>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>> design a little different. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Instead of distinguishing between a RichFunction and > >>>>>>>>>>>> non-RichFunction > >>>>>>>>>>>>>>>> at > >>>>>>>>>>>>>>>>>>> runtime level, we would use RichFunctions all the time. > >>>>>> Thus, > >>>>>>> on > >>>>>>>>>>>> the > >>>>>>>>>>>>>>>> DSL > >>>>>>>>>>>>>>>>>>> entry level, if a user provides a non-RichFunction, we > >>>>> wrap > >>>>>> it > >>>>>>>>>>> by a > >>>>>>>>>>>>>>>>>>> RichFunction that is fully implemented by Streams. This > >>>>>>>>>>>> RichFunction > >>>>>>>>>>>>>>>>>>> would just forward the call omitting the key: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Just to sketch the idea (incomplete code snippet): > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> public StreamsRichValueMapper implements > >>>>> RichValueMapper() > >>>>>> { > >>>>>>>>>>>>>>>>>>>> private ValueMapper userProvidedMapper; // set by > >>>>>>>>> constructor > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> public VR apply(final K key, final V1 value1, > final V2 > >>>>>>>>>>> value2) > >>>>>>>>>>>> { > >>>>>>>>>>>>>>>>>>>> return userProvidedMapper(value1, value2); > >>>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> From a performance point of view, I am not sure if the > >>>>>>>>>>>>>>>>>>> "if(isRichFunction)" including casts etc would have > more > >>>>>>>>> overhead > >>>>>>>>>>>>>> than > >>>>>>>>>>>>>>>>>>> this approach (we would do more nested method call for > >>>>>>>>>>>>>> non-RichFunction > >>>>>>>>>>>>>>>>>>> which should be more common than RichFunctions). > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> This approach should not effect lambdas (or do I miss > >>>>>>>>> something?) > >>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>> might be cleaner, as we could have one more top level > >>>>>>> interface > >>>>>>>>>>>>>>>>>>> `RichFunction` with methods `init()` and `close()` and > >>>>> also > >>>>>>>>>>>>>> interfaces > >>>>>>>>>>>>>>>>>>> for `RichValueMapper` etc. (thus, no abstract classes > >>>>>>> required). > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Any thoughts? > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> On 5/6/17 5:29 PM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Thanks for comments. I extended PR and KIP to include > >>>>> rich > >>>>>>>>>>>>>> functions. > >>>>>>>>>>>>>>>> I > >>>>>>>>>>>>>>>>>>>> will still have to evaluate the cost of deep copying > of > >>>>>> keys. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 8:02 PM Mathieu Fenniak < > >>>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> > >>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Hey Matthias, > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> My opinion would be that documenting the > immutability of > >>>>>> the > >>>>>>>>>>> key > >>>>>>>>>>>> is > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>> best approach available. Better than requiring the > key > >>>>> to > >>>>>>> be > >>>>>>>>>>>>>>>>>> serializable > >>>>>>>>>>>>>>>>>>>>> (as with Jeyhun's last pass at the PR), no > performance > >>>>>> risk. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> It'd be different if Java had immutable type > constraints > >>>>>> of > >>>>>>>>>>> some > >>>>>>>>>>>>>>>> kind. > >>>>>>>>>>>>>>>>>> :-) > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Mathieu > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 11:31 AM, Matthias J. Sax < > >>>>>>>>>>>>>>>>>> matth...@confluent.io> > >>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Agreed about RichFunction. If we follow this path, > it > >>>>>>> should > >>>>>>>>>>>> cover > >>>>>>>>>>>>>>>>>>>>>> all(?) DSL interfaces. > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> About guarding the key -- I am still not sure what > to > >>>>> do > >>>>>>>>> about > >>>>>>>>>>>>>> it... > >>>>>>>>>>>>>>>>>>>>>> Maybe it might be enough to document it (and name > the > >>>>> key > >>>>>>>>>>>>>> parameter > >>>>>>>>>>>>>>>>>> like > >>>>>>>>>>>>>>>>>>>>>> `readOnlyKey` to make is very clear). Ultimately, I > >>>>> would > >>>>>>>>>>> prefer > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>> guard against any modification, but I have no good > idea > >>>>>> how > >>>>>>>>> to > >>>>>>>>>>>> do > >>>>>>>>>>>>>>>>>> this. > >>>>>>>>>>>>>>>>>>>>>> Not sure what others think about the risk of > corrupted > >>>>>>>>>>>>>> partitioning > >>>>>>>>>>>>>>>>>>>>>> (what would be a user error and we could say, well, > bad > >>>>>>> luck, > >>>>>>>>>>>> you > >>>>>>>>>>>>>>>> got > >>>>>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>>>> bug in your code, that's not our fault), vs deep > copy > >>>>>> with > >>>>>>> a > >>>>>>>>>>>>>>>> potential > >>>>>>>>>>>>>>>>>>>>>> performance hit (that we can't quantity atm without > any > >>>>>>>>>>>>>> performance > >>>>>>>>>>>>>>>>>>>>> test). > >>>>>>>>>>>>>>>>>>>>>> We do have a performance system test. Maybe it's > worth > >>>>>> for > >>>>>>>>> you > >>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>> apply > >>>>>>>>>>>>>>>>>>>>>> the deep copy strategy and run the test. It's very > >>>>> basic > >>>>>>>>>>>>>> performance > >>>>>>>>>>>>>>>>>>>>>> test only, but might give some insight. If you want > to > >>>>> do > >>>>>>>>>>> this, > >>>>>>>>>>>>>> look > >>>>>>>>>>>>>>>>>>>>>> into folder "tests" for general test setup, and into > >>>>>>>>>>>>>>>>>>>>>> "tests/kafaktests/benchmarks/streams" to find find > the > >>>>>> perf > >>>>>>>>>>>> test. > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> On 5/5/17 8:55 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>>>>>>>>>>> Hi Matthias, > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> I think extending KIP to include RichFunctions > totally > >>>>>>>>> makes > >>>>>>>>>>>>>>>> sense. > >>>>>>>>>>>>>>>>>>>>> So, > >>>>>>>>>>>>>>>>>>>>>>> we don't want to guard the keys because it is > >>>>> costly. > >>>>>>>>>>>>>>>>>>>>>>> If we introduce RichFunctions I think it should > not be > >>>>>>>>>>> limited > >>>>>>>>>>>>>> only > >>>>>>>>>>>>>>>>>>>>> the 3 > >>>>>>>>>>>>>>>>>>>>>>> interfaces proposed in KIP but to wide range of > >>>>>>> interfaces. > >>>>>>>>>>>>>>>>>>>>>>> Please correct me if I am wrong. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 12:04 AM Matthias J. Sax < > >>>>>>>>>>>>>>>>>> matth...@confluent.io > >>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> One follow up. There was this email on the user > list: > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>> http://search-hadoop.com/m/Kafka/uyzND17KhCaBzPSZ1?subj= > >>>>>>>>>>>>>>>>>>>>>> > >>>>>> Shouldn+t+the+initializer+of+a+stream+aggregate+accept+the+ > >>>>>>>>>>> key+ > >>>>>>>>>>>>>>>>>>>>>>>> It might make sense so include Initializer, Adder, > >>>>> and > >>>>>>>>>>>>>> Substractor > >>>>>>>>>>>>>>>>>>>>>>>> inferface, too. > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> And we should double check if there are other > >>>>> interface > >>>>>>> we > >>>>>>>>>>>> might > >>>>>>>>>>>>>>>>>> miss > >>>>>>>>>>>>>>>>>>>>>> atm. > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> On 5/4/17 1:31 PM, Matthias J. Sax wrote: > >>>>>>>>>>>>>>>>>>>>>>>>> Thanks for updating the KIP. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Deep copying the key will work for sure, but I am > >>>>>>> actually > >>>>>>>>>>> a > >>>>>>>>>>>>>>>> little > >>>>>>>>>>>>>>>>>>>>> bit > >>>>>>>>>>>>>>>>>>>>>>>>> worried about performance impact... We might > want to > >>>>>> do > >>>>>>>>>>> some > >>>>>>>>>>>>>> test > >>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>> quantify this impact. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Btw: this remind me about the idea of > `RichFunction` > >>>>>>>>>>>> interface > >>>>>>>>>>>>>>>> that > >>>>>>>>>>>>>>>>>>>>>>>>> would allow users to access record metadata (like > >>>>>>>>>>> timestamp, > >>>>>>>>>>>>>>>>>> offset, > >>>>>>>>>>>>>>>>>>>>>>>>> partition etc) within DSL. This would be a > similar > >>>>>>>>> concept. > >>>>>>>>>>>>>>>> Thus, I > >>>>>>>>>>>>>>>>>>>>> am > >>>>>>>>>>>>>>>>>>>>>>>>> wondering, if it would make sense to enlarge the > >>>>> scope > >>>>>>> of > >>>>>>>>>>>> this > >>>>>>>>>>>>>>>> KIP > >>>>>>>>>>>>>>>>>> by > >>>>>>>>>>>>>>>>>>>>>>>>> that? WDYT? > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> On 5/3/17 2:08 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>> Hi Mathieu, > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for feedback. I followed similar approach > >>>>> and > >>>>>>>>>>> updated > >>>>>>>>>>>>>> PR > >>>>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>> KIP > >>>>>>>>>>>>>>>>>>>>>>>>>> accordingly. I tried to guard the key in > Processors > >>>>>>>>>>> sending > >>>>>>>>>>>> a > >>>>>>>>>>>>>>>> copy > >>>>>>>>>>>>>>>>>>>>> of > >>>>>>>>>>>>>>>>>>>>>> an > >>>>>>>>>>>>>>>>>>>>>>>>>> actual key. > >>>>>>>>>>>>>>>>>>>>>>>>>> Because I am doing deep copy of an object, I > think > >>>>>>> memory > >>>>>>>>>>>> can > >>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>>>> bottleneck > >>>>>>>>>>>>>>>>>>>>>>>>>> in some use-cases. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, May 2, 2017 at 5:10 PM Mathieu Fenniak < > >>>>>>>>>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> > >>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun, > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> This approach would change ValueMapper > (...etc) to > >>>>>> be > >>>>>>>>>>>>>> classes, > >>>>>>>>>>>>>>>>>>>>> rather > >>>>>>>>>>>>>>>>>>>>>>>> than > >>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, which is also a backwards > incompatible > >>>>>>>>>>> change. > >>>>>>>>>>>>>> An > >>>>>>>>>>>>>>>>>>>>>>>> alternative > >>>>>>>>>>>>>>>>>>>>>>>>>>> approach that would be backwards compatible > would > >>>>> be > >>>>>>> to > >>>>>>>>>>>>>> define > >>>>>>>>>>>>>>>>>> new > >>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, and provide overrides where those > >>>>>>> interfaces > >>>>>>>>>>>> are > >>>>>>>>>>>>>>>>>> used. > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Unfortunately, making the key parameter as > "final" > >>>>>>>>>>> doesn't > >>>>>>>>>>>>>>>> change > >>>>>>>>>>>>>>>>>>>>>> much > >>>>>>>>>>>>>>>>>>>>>>>>>>> about guarding against key change. It only > >>>>> prevents > >>>>>>> the > >>>>>>>>>>>>>>>>>> parameter > >>>>>>>>>>>>>>>>>>>>>>>> variable > >>>>>>>>>>>>>>>>>>>>>>>>>>> from being reassigned. If the key type is a > >>>>> mutable > >>>>>>>>>>> object > >>>>>>>>>>>>>>>> (eg. > >>>>>>>>>>>>>>>>>>>>>>>> byte[]), > >>>>>>>>>>>>>>>>>>>>>>>>>>> it can still be mutated. (eg. key[0] = 0). But > >>>>> I'm > >>>>>>> not > >>>>>>>>>>>>>> really > >>>>>>>>>>>>>>>>>> sure > >>>>>>>>>>>>>>>>>>>>>>>> there's > >>>>>>>>>>>>>>>>>>>>>>>>>>> much that can be done about that. > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 5:39 PM, Jeyhun Karimov > < > >>>>>>>>>>>>>>>>>>>>> je.kari...@gmail.com > >>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for comments. > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> The concerns makes sense. Although we can > guard > >>>>> for > >>>>>>>>>>>>>> immutable > >>>>>>>>>>>>>>>>>> keys > >>>>>>>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>>>>>>>>>>> current implementation (with few changes), I > >>>>> didn't > >>>>>>>>>>>> consider > >>>>>>>>>>>>>>>>>>>>>> backward > >>>>>>>>>>>>>>>>>>>>>>>>>>>> compatibility. > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> In this case 2 solutions come to my mind. In > both > >>>>>>>>> cases, > >>>>>>>>>>>>>> user > >>>>>>>>>>>>>>>>>>>>>> accesses > >>>>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>> key in Object type, as passing extra type > >>>>> parameter > >>>>>>>>> will > >>>>>>>>>>>>>> break > >>>>>>>>>>>>>>>>>>>>>>>>>>>> backwards-compatibility. So user has to cast > to > >>>>>>> actual > >>>>>>>>>>>> key > >>>>>>>>>>>>>>>>>> type. > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. Firstly, We can overload apply method with > 2 > >>>>>>>>> argument > >>>>>>>>>>>>>> (key > >>>>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>>>> value) > >>>>>>>>>>>>>>>>>>>>>>>>>>>> and force key to be *final*. By doing this, I > >>>>>> think > >>>>>>> we > >>>>>>>>>>>> can > >>>>>>>>>>>>>>>>>>>>> address > >>>>>>>>>>>>>>>>>>>>>>>> both > >>>>>>>>>>>>>>>>>>>>>>>>>>>> backward-compatibility and guarding against > key > >>>>>>> change. > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Secondly, we can create class KeyAccess > like: > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> public class KeyAccess { > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Object key; > >>>>>>>>>>>>>>>>>>>>>>>>>>>> public void beforeApply(final Object > key) { > >>>>>>>>>>>>>>>>>>>>>>>>>>>> this.key = key; > >>>>>>>>>>>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>>>>>>>>>>> public Object getKey() { > >>>>>>>>>>>>>>>>>>>>>>>>>>>> return key; > >>>>>>>>>>>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> We can extend *ValueMapper, ValueJoiner* and > >>>>>>>>>>>>>>>> *ValueTransformer* > >>>>>>>>>>>>>>>>>>>>> from > >>>>>>>>>>>>>>>>>>>>>>>>>>>> *KeyAccess*. Inside processor (for example > >>>>>>>>>>>>>>>>>>>>>> *KTableMapValuesProcessor*) > >>>>>>>>>>>>>>>>>>>>>>>>>>>> before calling *mapper.apply(value)* we can > set > >>>>> the > >>>>>>>>>>> *key* > >>>>>>>>>>>> by > >>>>>>>>>>>>>>>>>>>>>>>>>>>> *mapper.beforeApply(key)*. As a result, user > can > >>>>>> use > >>>>>>>>>>>>>>>> *getKey()* > >>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>> access > >>>>>>>>>>>>>>>>>>>>>>>>>>>> the key inside *apply(value)* method. > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 7:24 PM Matthias J. > Sax < > >>>>>>>>>>>>>>>>>>>>>> matth...@confluent.io > >>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> thanks a lot for the KIP! > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think there are two issues we need to > address: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> (1) The KIP does not consider backward > >>>>>>> compatibility. > >>>>>>>>>>>> Users > >>>>>>>>>>>>>>>> did > >>>>>>>>>>>>>>>>>>>>>>>>>>> complain > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> about this in past releases already, and as > the > >>>>>> user > >>>>>>>>>>> base > >>>>>>>>>>>>>>>>>> grows, > >>>>>>>>>>>>>>>>>>>>> we > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> should not break backward compatibility in > >>>>> future > >>>>>>>>>>>> releases > >>>>>>>>>>>>>>>>>>>>> anymore. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thus, we should think of a better way to > allow > >>>>> key > >>>>>>>>>>>> access. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu's comment goes into the same > direction > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile > >>>>>> failures > >>>>>>>>>>> that > >>>>>>>>>>>>>>>> would > >>>>>>>>>>>>>>>>>>>>> need > >>>>>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-) > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> (2) Another concern is, that there is no > guard > >>>>> to > >>>>>>>>>>> prevent > >>>>>>>>>>>>>>>> user > >>>>>>>>>>>>>>>>>>>>> code > >>>>>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> modify the key. This might corrupt > partitioning > >>>>> if > >>>>>>>>>>> users > >>>>>>>>>>>> do > >>>>>>>>>>>>>>>>>> alter > >>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> key (accidentally -- or users are just not > aware > >>>>>>> that > >>>>>>>>>>>> they > >>>>>>>>>>>>>>>> are > >>>>>>>>>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> allowed to modify the provided key object) > and > >>>>>> thus > >>>>>>>>>>> break > >>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> application. (This was the original > motivation > >>>>> to > >>>>>>> not > >>>>>>>>>>>>>> provide > >>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>> key > >>>>>>>>>>>>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> the first place -- it's guards against > >>>>>>> modification.) > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 5/1/17 6:31 AM, Mathieu Fenniak wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I just want to add my voice that, I too, > have > >>>>>>> wished > >>>>>>>>>>> for > >>>>>>>>>>>>>>>>>> access > >>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> record key during a mapValues or similar > >>>>>> operation. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile > >>>>> failures > >>>>>>>>> that > >>>>>>>>>>>>>> would > >>>>>>>>>>>>>>>>>>>>> need > >>>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-) > But > >>>>>> at > >>>>>>>>>>> least > >>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>>>>>>> would > >>>>>>>>>>>>>>>>>>>>>>>> all > >>>>>>>>>>>>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pretty clear and easy change. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 6:55 AM, Jeyhun > Karimov > >>>>> < > >>>>>>>>>>>>>>>>>>>>>>>> je.kari...@gmail.com > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Dear community, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I want to share KIP-149 [1] based on issues > >>>>>>>>>>> KAFKA-4218 > >>>>>>>>>>>>>> [2], > >>>>>>>>>>>>>>>>>>>>>>>>>>> KAFKA-4726 > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> [3], > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> KAFKA-3745 [4]. The related PR can be > found at > >>>>>>> [5]. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to get your comments. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/ > >>>>>>>>>>> confluence/display/KAFKA/KIP- > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ValueMapper%2C+and+ValueJoiner > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [2] https://issues.apache.org/jira > >>>>>>>>> /browse/KAFKA-4218 > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [3] https://issues.apache.org/jira > >>>>>>>>> /browse/KAFKA-4726 > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [4] https://issues.apache.org/jira > >>>>>>>>> /browse/KAFKA-3745 > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [5] > https://github.com/apache/kafka/pull/2946 > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Cheers > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>>>>>>>>>>> -Cheers > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>>>>>> -Cheers > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> -- -Cheers Jeyhun