Hi, Thanks for your comments.
I think supporting Lambdas for `withKey` and `AbstractRichFunction` > don't go together, as Lambdas are only supported for interfaces AFAIK. Maybe I misunderstood your comment. *withKey* and and *withOnlyValue* are interfaces. So they don't have direct relation with *AbstractRichFunction*. *withKey* and and *withOnlyValue* interfaces have only one method , so we can use lambdas. Where does the *AbstractRichFunction* comes to play? Inside Rich functions. And we use Rich functions in 2 places: 1. User doesn't use rich functions. Just regular *withKey* and and *withOnlyValue* interfaces(both support lambdas) . We get those interfaces and wrap into Rich function while building the topology, and send to Processor. 2. User does use rich functions (Rich functions implement *withKey* interface). As a result no lamdas here by definition. In this case, while building the topology we do a type check if the object type is *withKey* or *RichFunction*. So *AbstractRichFunction* is just syntactic sugar for Rich functions and does not affect using lambdas. Thus, if we want to support Lambdas for `withKey`, we need to have a > interface approach like this > - RichFunction -> only adding init() and close() > - ValueMapper > - ValueMapperWithKey > - RichValueMapper extends ValueMapperWithKey, RichFunction As I said above, currently we support lambdas for *withKey* interfaces as well. However, I agree with your idea and I will remove the AbstractRichFunction from the design. As an alternative, we could argue, that it is sufficient to support > Lambdas for the "plain" API only, but not for any "extended API". For > this, RichFunction could add key+init+close and AbstractRichFunction > would allow to only care about getting the key. > Not sure, which one is better. I don't like the idea of more overloaded > methods to get Lambdas for `withKey` interfaces too much because we have > already so many overlaods. On the other hand, I do see value in > supporting Lambdas for `withKey`. Just to clarify, with current design we have only one extra overloaded method per *withOnlyValue* interface: which is *withKey* version of particular interface. We don't need extra overload for Rich function as Rich function implements *withKey* interface as a result they have same type. We differentiate them while building the topology. We supported lambdas for *withKey* APIs because of the comment: @Jeyhun: I did not put any thought into this, but can we have a design > that allows for both? Also, with regard to lambdas, it might make sense > to allow for both `V -> newV` and `(K, V) -> newV` ? However, I don't think that this complicates the overall design significantly. Depending on what we want to support, it might make sense to > include/exclude RichFunctions from this KIP -- and thus, this also > determines if we should have a "ProcessorContext KIP" before driving > this KIP further. Based on our discussion I think we should keep Rich functions as I don't think that they bring extra layer of overhead to library. Any comments are appreciated. Cheers, Jeyhun On Tue, May 16, 2017 at 12:10 AM Matthias J. Sax <matth...@confluent.io> wrote: > Jeyhun, > > thanks for the update. > > I think supporting Lambdas for `withKey` and `AbstractRichFunction` > don't go together, as Lambdas are only supported for interfaces AFAIK. > > Thus, if we want to support Lambdas for `withKey`, we need to have a > interface approach like this > > - RichFunction -> only adding init() and close() > > - ValueMapper > - ValueMapperWithKey > > - RichValueMapper extends ValueMapperWithKey, RichFunction > > For this approach, AbstractRichFunction does not make sense anymore, as > the only purpose of `RichFunction` is to allow the implementation of > init() and close() -- if you don't want those, you would implement a > different interface (ie, ValueMapperWithKey) > > As an alternative, we could argue, that it is sufficient to support > Lambdas for the "plain" API only, but not for any "extended API". For > this, RichFunction could add key+init+close and AbstractRichFunction > would allow to only care about getting the key. > > Not sure, which one is better. I don't like the idea of more overloaded > methods to get Lambdas for `withKey` interfaces too much because we have > already so many overlaods. On the other hand, I do see value in > supporting Lambdas for `withKey`. > > Depending on what we want to support, it might make sense to > include/exclude RichFunctions from this KIP -- and thus, this also > determines if we should have a "ProcessorContext KIP" before driving > this KIP further. > > Thoughts? > > > > > -Matthias > > > On 5/15/17 11:01 AM, Jeyhun Karimov wrote: > > Hi, > > > > Sorry for super late response. Thanks for your comments. > > > > I am not an expert on Lambdas. Can you elaborate a little bit? I cannot > >> follow the explanation in the KIP to see what the problem is. > > > > > > - From [1] says "A functional interface is an interface that has just one > > abstract method, and thus represents a single function contract". > > So basically once we extend some interface from another (in our case, > > ValueMapperWithKey from ValueMapper) we cannot use lambdas in the > extended > > interface. > > > > > > Further comments: > >> - The KIP get a little hard to read -- can you maybe reformat the wiki > >> page a little bit? I think using `CodeBlock` would help. > > > > > > - I will work on the KIP. > > > > - What about KStream-KTable joins? You don't have overlaods added for > >> them. Why? (Even if I still hope that we don't need to add any new > >> overloads) > > > > > > - Actually there are more than one Processor and public APIs to be > > changed (KStream-KTable > > joins is one case). However all of them has similar structure: we > overload > > the *method* with *methodWithKey*, > > wrap it into the Rich function, send to processor and inside the > processor > > call *init* and *close* methods of the Rich function. > > As I wrote in KIP, I wanted to demonstrate the overall idea with only > > *ValueMapper* as the same can be applied to all changes. > > Anyway I will update the KIP. > > > > - Why do we need `AbstractRichFunction`? > > > > > > Instead of overriding the *init(ProcessorContext p)* and* close()* > methods > > in every Rich function with empty body like: > > > > @Override > > void init(ProcessorContext context) {} > > > > @Override > > void close () {} > > > > I thought that we can override them once in *AbstractRichFunction* and > > extent new Rich functions from *AbstractRichFunction*. > > Basically this can eliminate code copy-paste and ease the maintenance. > > > > - What about interfaces Initializer, ForeachAction, Merger, Predicate, > >> Reducer? I don't want to say we should/need to add to all, but we should > >> discuss all of them and add where it does make sense (e.g., > >> RichForachAction does make sense IMHO) > > > > > > Definitely agree. As I said, the same technique applies to all this > > interfaces and I didn't want to explode the KIP, just wanted to give the > > overall intuition. > > However, I will update the KIP as I said. > > > > > > Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- `RichValueXX` > >> in general -- but why can't we do all this with interfaces only? > > > > > > Sure we can. However the main intuition is we should not force users to > > implement *init(ProcessorContext)* and *close()* functions every time > they > > use Rich functions. > > If one needs, she can override the respective methods. However, I am open > > for discussion. > > > > > > I'd rather not see the use of `ProcessorContext` spread any further than > >> it currently is. So maybe we need another KIP that is done before this? > >> Otherwise i think the scope of this KIP is becoming too large. > > > > > > That is good point. I wanted to make *init(ProcessorContext)* method > > persistent among the library (which use ProcessorContext as an input), > > therefore I put *ProcessorContext* as an input. > > So the important question is that (as @dguy and @mjsax mentioned) whether > > continue this KIP without providing users an access to *ProcessorContext* > > (change *init (ProcessorContext)* to * init()* ) or > > initiate another KIP before this. > > > > [1] > > http://cr.openjdk.java.net/~mr/se/8/java-se-8-pfd-spec/java- > se-8-jls-pfd-diffs.pdf > > > > > > Cheers, > > Jeyhun > > > > On Mon, May 15, 2017 at 7:15 PM, Damian Guy <damian....@gmail.com> > wrote: > > > >> I'd rather not see the use of `ProcessorContext` spread any further > than > >> it currently is. So maybe we need another KIP that is done before this? > >> Otherwise i think the scope of this KIP is becoming too large. > >> > >> > >> On Mon, 15 May 2017 at 18:06 Matthias J. Sax <matth...@confluent.io> > >> wrote: > >> > >>> I agree that that `ProcessorContext` interface is too broad in general > >>> -- this is even true for transform/process, and it's also reflected in > >>> the API improvement list we want to do. > >>> > >>> https://cwiki.apache.org/confluence/display/KAFKA/ > >> Kafka+Streams+Discussions > >>> > >>> So I am wondering, if you question the `RichFunction` approach in > >>> general? Or if you suggest to either extend the scope of this KIP to > >>> include this---or maybe better, do another KIP for it and delay this > KIP > >>> until the other one is done? > >>> > >>> > >>> -Matthias > >>> > >>> On 5/15/17 2:35 AM, Damian Guy wrote: > >>>> Thanks for the KIP. > >>>> > >>>> I'm not convinced on the `RichFunction` approach. Do we really want to > >>> give > >>>> every DSL method access to the `ProcessorContext` ? It has a bunch of > >>>> methods on it that seem in-appropriate for some of the DSL methods, > >> i.e, > >>>> `register`, `getStateStore`, `forward`, `schedule` etc. It is far too > >>>> broad. I think it would be better to have a narrower interface like > the > >>>> `RecordContext` - remembering it is easier to add methods/interfaces > >>> later > >>>> than to remove them > >>>> > >>>> On Sat, 13 May 2017 at 22:26 Matthias J. Sax <matth...@confluent.io> > >>> wrote: > >>>> > >>>>> Jeyhun, > >>>>> > >>>>> I am not an expert on Lambdas. Can you elaborate a little bit? I > >> cannot > >>>>> follow the explanation in the KIP to see what the problem is. > >>>>> > >>>>> For updating the KIP title I don't know -- guess it's up to you. > >> Maybe a > >>>>> committer can comment on this? > >>>>> > >>>>> > >>>>> Further comments: > >>>>> > >>>>> - The KIP get a little hard to read -- can you maybe reformat the > >> wiki > >>>>> page a little bit? I think using `CodeBlock` would help. > >>>>> > >>>>> - What about KStream-KTable joins? You don't have overlaods added > for > >>>>> them. Why? (Even if I still hope that we don't need to add any new > >>>>> overloads) > >>>>> > >>>>> - Why do we need `AbstractRichFunction`? > >>>>> > >>>>> - What about interfaces Initializer, ForeachAction, Merger, > >> Predicate, > >>>>> Reducer? I don't want to say we should/need to add to all, but we > >> should > >>>>> discuss all of them and add where it does make sense (e.g., > >>>>> RichForachAction does make sense IMHO) > >>>>> > >>>>> > >>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- > >> `RichValueXX` > >>>>> in general -- but why can't we do all this with interfaces only? > >>>>> > >>>>> > >>>>> > >>>>> -Matthias > >>>>> > >>>>> > >>>>> > >>>>> On 5/11/17 7:00 AM, Jeyhun Karimov wrote: > >>>>>> Hi, > >>>>>> > >>>>>> Thanks for your comments. I think we cannot extend the two > interfaces > >>> if > >>>>> we > >>>>>> want to keep lambdas. I updated the KIP [1]. Maybe I should change > >> the > >>>>>> title, because now we are not limiting the KIP to only ValueMapper, > >>>>>> ValueTransformer and ValueJoiner. > >>>>>> Please feel free to comment. > >>>>>> > >>>>>> [1] > >>>>>> > >>>>> > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ > >> ValueMapper%2C+and+ValueJoiner > >>>>>> > >>>>>> > >>>>>> Cheers, > >>>>>> Jeyhun > >>>>>> > >>>>>> On Tue, May 9, 2017 at 7:36 PM Matthias J. Sax < > >> matth...@confluent.io> > >>>>>> wrote: > >>>>>> > >>>>>>> If `ValueMapperWithKey` extends `ValueMapper` we don't need the new > >>>>>>> overlaod. > >>>>>>> > >>>>>>> And yes, we need to do one check -- but this happens when building > >> the > >>>>>>> topology. At runtime (I mean after KafkaStream#start() we don't > need > >>> any > >>>>>>> check as we will always use `ValueMapperWithKey`) > >>>>>>> > >>>>>>> > >>>>>>> -Matthias > >>>>>>> > >>>>>>> > >>>>>>> On 5/9/17 2:55 AM, Jeyhun Karimov wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> Thanks for feedback. > >>>>>>>> Then we need to overload method > >>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapper<? super V, ? extends > >> VR> > >>>>>>>> mapper); > >>>>>>>> with > >>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapperWithKey<? super V, ? > >>> extends > >>>>>>> VR> > >>>>>>>> mapper); > >>>>>>>> > >>>>>>>> and in runtime (inside processor) we still have to check it is > >>>>>>> ValueMapper > >>>>>>>> or ValueMapperWithKey before wrapping it into the rich function. > >>>>>>>> > >>>>>>>> > >>>>>>>> Please correct me if I am wrong. > >>>>>>>> > >>>>>>>> Cheers, > >>>>>>>> Jeyhun > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> On Tue, May 9, 2017 at 10:56 AM Michal Borowiecki < > >>>>>>>> michal.borowie...@openbet.com> wrote: > >>>>>>>> > >>>>>>>>> +1 :) > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 08/05/17 23:52, Matthias J. Sax wrote: > >>>>>>>>>> Hi, > >>>>>>>>>> > >>>>>>>>>> I was reading the updated KIP and I am wondering, if we should > do > >>> the > >>>>>>>>>> design a little different. > >>>>>>>>>> > >>>>>>>>>> Instead of distinguishing between a RichFunction and > >>> non-RichFunction > >>>>>>> at > >>>>>>>>>> runtime level, we would use RichFunctions all the time. Thus, on > >>> the > >>>>>>> DSL > >>>>>>>>>> entry level, if a user provides a non-RichFunction, we wrap it > >> by a > >>>>>>>>>> RichFunction that is fully implemented by Streams. This > >>> RichFunction > >>>>>>>>>> would just forward the call omitting the key: > >>>>>>>>>> > >>>>>>>>>> Just to sketch the idea (incomplete code snippet): > >>>>>>>>>> > >>>>>>>>>>> public StreamsRichValueMapper implements RichValueMapper() { > >>>>>>>>>>> private ValueMapper userProvidedMapper; // set by > constructor > >>>>>>>>>>> > >>>>>>>>>>> public VR apply(final K key, final V1 value1, final V2 > >> value2) > >>> { > >>>>>>>>>>> return userProvidedMapper(value1, value2); > >>>>>>>>>>> } > >>>>>>>>>>> } > >>>>>>>>>> > >>>>>>>>>> From a performance point of view, I am not sure if the > >>>>>>>>>> "if(isRichFunction)" including casts etc would have more > overhead > >>>>> than > >>>>>>>>>> this approach (we would do more nested method call for > >>>>> non-RichFunction > >>>>>>>>>> which should be more common than RichFunctions). > >>>>>>>>>> > >>>>>>>>>> This approach should not effect lambdas (or do I miss > something?) > >>> and > >>>>>>>>>> might be cleaner, as we could have one more top level interface > >>>>>>>>>> `RichFunction` with methods `init()` and `close()` and also > >>>>> interfaces > >>>>>>>>>> for `RichValueMapper` etc. (thus, no abstract classes required). > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Any thoughts? > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> -Matthias > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On 5/6/17 5:29 PM, Jeyhun Karimov wrote: > >>>>>>>>>>> Hi, > >>>>>>>>>>> > >>>>>>>>>>> Thanks for comments. I extended PR and KIP to include rich > >>>>> functions. > >>>>>>> I > >>>>>>>>>>> will still have to evaluate the cost of deep copying of keys. > >>>>>>>>>>> > >>>>>>>>>>> Cheers, > >>>>>>>>>>> Jeyhun > >>>>>>>>>>> > >>>>>>>>>>> On Fri, May 5, 2017 at 8:02 PM Mathieu Fenniak < > >>>>>>>>> mathieu.fenn...@replicon.com> > >>>>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> Hey Matthias, > >>>>>>>>>>>> > >>>>>>>>>>>> My opinion would be that documenting the immutability of the > >> key > >>> is > >>>>>>> the > >>>>>>>>>>>> best approach available. Better than requiring the key to be > >>>>>>>>> serializable > >>>>>>>>>>>> (as with Jeyhun's last pass at the PR), no performance risk. > >>>>>>>>>>>> > >>>>>>>>>>>> It'd be different if Java had immutable type constraints of > >> some > >>>>>>> kind. > >>>>>>>>> :-) > >>>>>>>>>>>> > >>>>>>>>>>>> Mathieu > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On Fri, May 5, 2017 at 11:31 AM, Matthias J. Sax < > >>>>>>>>> matth...@confluent.io> > >>>>>>>>>>>> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>>> Agreed about RichFunction. If we follow this path, it should > >>> cover > >>>>>>>>>>>>> all(?) DSL interfaces. > >>>>>>>>>>>>> > >>>>>>>>>>>>> About guarding the key -- I am still not sure what to do > about > >>>>> it... > >>>>>>>>>>>>> Maybe it might be enough to document it (and name the key > >>>>> parameter > >>>>>>>>> like > >>>>>>>>>>>>> `readOnlyKey` to make is very clear). Ultimately, I would > >> prefer > >>>>> to > >>>>>>>>>>>>> guard against any modification, but I have no good idea how > to > >>> do > >>>>>>>>> this. > >>>>>>>>>>>>> Not sure what others think about the risk of corrupted > >>>>> partitioning > >>>>>>>>>>>>> (what would be a user error and we could say, well, bad luck, > >>> you > >>>>>>> got > >>>>>>>>> a > >>>>>>>>>>>>> bug in your code, that's not our fault), vs deep copy with a > >>>>>>> potential > >>>>>>>>>>>>> performance hit (that we can't quantity atm without any > >>>>> performance > >>>>>>>>>>>> test). > >>>>>>>>>>>>> We do have a performance system test. Maybe it's worth for > you > >>> to > >>>>>>>>> apply > >>>>>>>>>>>>> the deep copy strategy and run the test. It's very basic > >>>>> performance > >>>>>>>>>>>>> test only, but might give some insight. If you want to do > >> this, > >>>>> look > >>>>>>>>>>>>> into folder "tests" for general test setup, and into > >>>>>>>>>>>>> "tests/kafaktests/benchmarks/streams" to find find the perf > >>> test. > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>> > >>>>>>>>>>>>> On 5/5/17 8:55 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>> Hi Matthias, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I think extending KIP to include RichFunctions totally > makes > >>>>>>> sense. > >>>>>>>>>>>> So, > >>>>>>>>>>>>>> we don't want to guard the keys because it is costly. > >>>>>>>>>>>>>> If we introduce RichFunctions I think it should not be > >> limited > >>>>> only > >>>>>>>>>>>> the 3 > >>>>>>>>>>>>>> interfaces proposed in KIP but to wide range of interfaces. > >>>>>>>>>>>>>> Please correct me if I am wrong. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Fri, May 5, 2017 at 12:04 AM Matthias J. Sax < > >>>>>>>>> matth...@confluent.io > >>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> One follow up. There was this email on the user list: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> http://search-hadoop.com/m/Kafka/uyzND17KhCaBzPSZ1?subj= > >>>>>>>>>>>>> Shouldn+t+the+initializer+of+a+stream+aggregate+accept+the+ > >> key+ > >>>>>>>>>>>>>>> It might make sense so include Initializer, Adder, and > >>>>> Substractor > >>>>>>>>>>>>>>> inferface, too. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> And we should double check if there are other interface we > >>> might > >>>>>>>>> miss > >>>>>>>>>>>>> atm. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On 5/4/17 1:31 PM, Matthias J. Sax wrote: > >>>>>>>>>>>>>>>> Thanks for updating the KIP. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Deep copying the key will work for sure, but I am actually > >> a > >>>>>>> little > >>>>>>>>>>>> bit > >>>>>>>>>>>>>>>> worried about performance impact... We might want to do > >> some > >>>>> test > >>>>>>>>> to > >>>>>>>>>>>>>>>> quantify this impact. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Btw: this remind me about the idea of `RichFunction` > >>> interface > >>>>>>> that > >>>>>>>>>>>>>>>> would allow users to access record metadata (like > >> timestamp, > >>>>>>>>> offset, > >>>>>>>>>>>>>>>> partition etc) within DSL. This would be a similar > concept. > >>>>>>> Thus, I > >>>>>>>>>>>> am > >>>>>>>>>>>>>>>> wondering, if it would make sense to enlarge the scope of > >>> this > >>>>>>> KIP > >>>>>>>>> by > >>>>>>>>>>>>>>>> that? WDYT? > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> On 5/3/17 2:08 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>>>>> Hi Mathieu, > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Thanks for feedback. I followed similar approach and > >> updated > >>>>> PR > >>>>>>>>> and > >>>>>>>>>>>>> KIP > >>>>>>>>>>>>>>>>> accordingly. I tried to guard the key in Processors > >> sending > >>> a > >>>>>>> copy > >>>>>>>>>>>> of > >>>>>>>>>>>>> an > >>>>>>>>>>>>>>>>> actual key. > >>>>>>>>>>>>>>>>> Because I am doing deep copy of an object, I think memory > >>> can > >>>>> be > >>>>>>>>>>>>>>> bottleneck > >>>>>>>>>>>>>>>>> in some use-cases. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On Tue, May 2, 2017 at 5:10 PM Mathieu Fenniak < > >>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> > >>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Hi Jeyhun, > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> This approach would change ValueMapper (...etc) to be > >>>>> classes, > >>>>>>>>>>>> rather > >>>>>>>>>>>>>>> than > >>>>>>>>>>>>>>>>>> interfaces, which is also a backwards incompatible > >> change. > >>>>> An > >>>>>>>>>>>>>>> alternative > >>>>>>>>>>>>>>>>>> approach that would be backwards compatible would be to > >>>>> define > >>>>>>>>> new > >>>>>>>>>>>>>>>>>> interfaces, and provide overrides where those interfaces > >>> are > >>>>>>>>> used. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Unfortunately, making the key parameter as "final" > >> doesn't > >>>>>>> change > >>>>>>>>>>>>> much > >>>>>>>>>>>>>>>>>> about guarding against key change. It only prevents the > >>>>>>>>> parameter > >>>>>>>>>>>>>>> variable > >>>>>>>>>>>>>>>>>> from being reassigned. If the key type is a mutable > >> object > >>>>>>> (eg. > >>>>>>>>>>>>>>> byte[]), > >>>>>>>>>>>>>>>>>> it can still be mutated. (eg. key[0] = 0). But I'm not > >>>>> really > >>>>>>>>> sure > >>>>>>>>>>>>>>> there's > >>>>>>>>>>>>>>>>>> much that can be done about that. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Mathieu > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 5:39 PM, Jeyhun Karimov < > >>>>>>>>>>>> je.kari...@gmail.com > >>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Thanks for comments. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> The concerns makes sense. Although we can guard for > >>>>> immutable > >>>>>>>>> keys > >>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>> current implementation (with few changes), I didn't > >>> consider > >>>>>>>>>>>>> backward > >>>>>>>>>>>>>>>>>>> compatibility. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> In this case 2 solutions come to my mind. In both > cases, > >>>>> user > >>>>>>>>>>>>> accesses > >>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>> key in Object type, as passing extra type parameter > will > >>>>> break > >>>>>>>>>>>>>>>>>>> backwards-compatibility. So user has to cast to actual > >>> key > >>>>>>>>> type. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> 1. Firstly, We can overload apply method with 2 > argument > >>>>> (key > >>>>>>>>> and > >>>>>>>>>>>>>>> value) > >>>>>>>>>>>>>>>>>>> and force key to be *final*. By doing this, I think we > >>> can > >>>>>>>>>>>> address > >>>>>>>>>>>>>>> both > >>>>>>>>>>>>>>>>>>> backward-compatibility and guarding against key change. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> 2. Secondly, we can create class KeyAccess like: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> public class KeyAccess { > >>>>>>>>>>>>>>>>>>> Object key; > >>>>>>>>>>>>>>>>>>> public void beforeApply(final Object key) { > >>>>>>>>>>>>>>>>>>> this.key = key; > >>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>> public Object getKey() { > >>>>>>>>>>>>>>>>>>> return key; > >>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> We can extend *ValueMapper, ValueJoiner* and > >>>>>>> *ValueTransformer* > >>>>>>>>>>>> from > >>>>>>>>>>>>>>>>>>> *KeyAccess*. Inside processor (for example > >>>>>>>>>>>>> *KTableMapValuesProcessor*) > >>>>>>>>>>>>>>>>>>> before calling *mapper.apply(value)* we can set the > >> *key* > >>> by > >>>>>>>>>>>>>>>>>>> *mapper.beforeApply(key)*. As a result, user can use > >>>>>>> *getKey()* > >>>>>>>>> to > >>>>>>>>>>>>>>> access > >>>>>>>>>>>>>>>>>>> the key inside *apply(value)* method. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 7:24 PM Matthias J. Sax < > >>>>>>>>>>>>> matth...@confluent.io > >>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Jeyhun, > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> thanks a lot for the KIP! > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> I think there are two issues we need to address: > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> (1) The KIP does not consider backward compatibility. > >>> Users > >>>>>>> did > >>>>>>>>>>>>>>>>>> complain > >>>>>>>>>>>>>>>>>>>> about this in past releases already, and as the user > >> base > >>>>>>>>> grows, > >>>>>>>>>>>> we > >>>>>>>>>>>>>>>>>>>> should not break backward compatibility in future > >>> releases > >>>>>>>>>>>> anymore. > >>>>>>>>>>>>>>>>>>>> Thus, we should think of a better way to allow key > >>> access. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Mathieu's comment goes into the same direction > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile failures > >> that > >>>>>>> would > >>>>>>>>>>>> need > >>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-) > >>>>>>>>>>>>>>>>>>>> (2) Another concern is, that there is no guard to > >> prevent > >>>>>>> user > >>>>>>>>>>>> code > >>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>> modify the key. This might corrupt partitioning if > >> users > >>> do > >>>>>>>>> alter > >>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>> key (accidentally -- or users are just not aware that > >>> they > >>>>>>> are > >>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>> allowed to modify the provided key object) and thus > >> break > >>>>> the > >>>>>>>>>>>>>>>>>>>> application. (This was the original motivation to not > >>>>> provide > >>>>>>>>> the > >>>>>>>>>>>>> key > >>>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>>> the first place -- it's guards against modification.) > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> On 5/1/17 6:31 AM, Mathieu Fenniak wrote: > >>>>>>>>>>>>>>>>>>>>> Hi Jeyhun, > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> I just want to add my voice that, I too, have wished > >> for > >>>>>>>>> access > >>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>> record key during a mapValues or similar operation. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile failures > that > >>>>> would > >>>>>>>>>>>> need > >>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-) But at > >> least > >>>>> it > >>>>>>>>>>>> would > >>>>>>>>>>>>>>> all > >>>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>>> pretty clear and easy change. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Mathieu > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 6:55 AM, Jeyhun Karimov < > >>>>>>>>>>>>>>> je.kari...@gmail.com > >>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>> Dear community, > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> I want to share KIP-149 [1] based on issues > >> KAFKA-4218 > >>>>> [2], > >>>>>>>>>>>>>>>>>> KAFKA-4726 > >>>>>>>>>>>>>>>>>>>> [3], > >>>>>>>>>>>>>>>>>>>>>> KAFKA-3745 [4]. The related PR can be found at [5]. > >>>>>>>>>>>>>>>>>>>>>> I would like to get your comments. > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> [1] > >>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/ > >> confluence/display/KAFKA/KIP- > >>>>>>>>>>>>>>>>>>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ > >>>>>>>>>>>>>>>>>>>>>> ValueMapper%2C+and+ValueJoiner > >>>>>>>>>>>>>>>>>>>>>> [2] https://issues.apache.org/jira > /browse/KAFKA-4218 > >>>>>>>>>>>>>>>>>>>>>> [3] https://issues.apache.org/jira > /browse/KAFKA-4726 > >>>>>>>>>>>>>>>>>>>>>> [4] https://issues.apache.org/jira > /browse/KAFKA-3745 > >>>>>>>>>>>>>>>>>>>>>> [5] https://github.com/apache/kafka/pull/2946 > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>>>>> -Cheers > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>> -Cheers > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>> -Cheers > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>> > >>>>>>>>> -- > >>>>>>>> -Cheers > >>>>>>>> > >>>>>>>> Jeyhun > >>>>>>>> > >>>>>>> > >>>>>>> -- > >>>>>> -Cheers > >>>>>> > >>>>>> Jeyhun > >>>>>> > >>>>> > >>>>> > >>>> > >>> > >>> > >> > > > >