Hi, Thanks. I initiated the PR as well, to get a better overview.
The only reason that I used abstract class instead of interface for Rich functions is that in future if we will have some AbstractRichFunction abstract classes, we can easily extend: public abstract class RichValueMapper<K, V, VR> implements ValueMapperWithKey<K, V, VR>, RichFunction *extends AbstractRichFunction*{ } With interfaces we are only limited to interfaces for inheritance. However, I think we can easily change it (from abstract class -> interface) if you think interface is a better fit. Cheers, Jeyhun On Fri, May 19, 2017 at 1:00 AM Matthias J. Sax <matth...@confluent.io> wrote: > Thanks for the update and explanations. The KIP is quite improved now -- > great job! > > One more question: Why are RichValueMapper etc abstract classes and not > interfaces? > > > Overall, I like the KIP a lot! > > > -Matthias > > > On 5/16/17 7:03 AM, Jeyhun Karimov wrote: > > Hi, > > > > Thanks for your comments. > > > > I think supporting Lambdas for `withKey` and `AbstractRichFunction` > >> don't go together, as Lambdas are only supported for interfaces AFAIK. > > > > > > Maybe I misunderstood your comment. > > *withKey* and and *withOnlyValue* are interfaces. So they don't have > direct > > relation with *AbstractRichFunction*. > > *withKey* and and *withOnlyValue* interfaces have only one method , so > we > > can use lambdas. > > Where does the *AbstractRichFunction* comes to play? Inside Rich > functions. > > And we use Rich functions in 2 places: > > > > 1. User doesn't use rich functions. Just regular *withKey* and and > > *withOnlyValue* interfaces(both support lambdas) . We get those > interfaces > > and wrap into Rich function while building the topology, and send to > > Processor. > > 2. User does use rich functions (Rich functions implement *withKey* > > interface). As a result no lamdas here by definition. In this case, while > > building the topology we do a type check if the object type is > > *withKey* or *RichFunction*. > > > > So *AbstractRichFunction* is just syntactic sugar for Rich functions and > > does not affect using lambdas. > > > > Thus, if we want to support Lambdas for `withKey`, we need to have a > >> interface approach like this > >> - RichFunction -> only adding init() and close() > >> - ValueMapper > >> - ValueMapperWithKey > >> - RichValueMapper extends ValueMapperWithKey, RichFunction > > > > > > As I said above, currently we support lambdas for *withKey* interfaces as > > well. However, I agree with your idea and I will remove the > > AbstractRichFunction from the design. > > > > As an alternative, we could argue, that it is sufficient to support > >> Lambdas for the "plain" API only, but not for any "extended API". For > >> this, RichFunction could add key+init+close and AbstractRichFunction > >> would allow to only care about getting the key. > >> Not sure, which one is better. I don't like the idea of more overloaded > >> methods to get Lambdas for `withKey` interfaces too much because we have > >> already so many overlaods. On the other hand, I do see value in > >> supporting Lambdas for `withKey`. > > > > > > Just to clarify, with current design we have only one extra overloaded > > method per *withOnlyValue* interface: which is *withKey* version of > > particular interface. > > We don't need extra overload for Rich function as Rich function > implements > > *withKey* interface as a result they have same type. We differentiate > them > > while building the topology. > > We supported lambdas for *withKey* APIs because of the comment: > > > > @Jeyhun: I did not put any thought into this, but can we have a design > >> that allows for both? Also, with regard to lambdas, it might make sense > >> to allow for both `V -> newV` and `(K, V) -> newV` ? > > > > > > However, I don't think that this complicates the overall design > > significantly. > > > > > > Depending on what we want to support, it might make sense to > >> include/exclude RichFunctions from this KIP -- and thus, this also > >> determines if we should have a "ProcessorContext KIP" before driving > >> this KIP further. > > > > > > Based on our discussion I think we should keep Rich functions as I don't > > think that they bring extra layer of overhead to library. > > > > Any comments are appreciated. > > > > Cheers, > > Jeyhun > > > > > > On Tue, May 16, 2017 at 12:10 AM Matthias J. Sax <matth...@confluent.io> > > wrote: > > > >> Jeyhun, > >> > >> thanks for the update. > >> > >> I think supporting Lambdas for `withKey` and `AbstractRichFunction` > >> don't go together, as Lambdas are only supported for interfaces AFAIK. > >> > >> Thus, if we want to support Lambdas for `withKey`, we need to have a > >> interface approach like this > >> > >> - RichFunction -> only adding init() and close() > >> > >> - ValueMapper > >> - ValueMapperWithKey > >> > >> - RichValueMapper extends ValueMapperWithKey, RichFunction > >> > >> For this approach, AbstractRichFunction does not make sense anymore, as > >> the only purpose of `RichFunction` is to allow the implementation of > >> init() and close() -- if you don't want those, you would implement a > >> different interface (ie, ValueMapperWithKey) > >> > >> As an alternative, we could argue, that it is sufficient to support > >> Lambdas for the "plain" API only, but not for any "extended API". For > >> this, RichFunction could add key+init+close and AbstractRichFunction > >> would allow to only care about getting the key. > >> > >> Not sure, which one is better. I don't like the idea of more overloaded > >> methods to get Lambdas for `withKey` interfaces too much because we have > >> already so many overlaods. On the other hand, I do see value in > >> supporting Lambdas for `withKey`. > >> > >> Depending on what we want to support, it might make sense to > >> include/exclude RichFunctions from this KIP -- and thus, this also > >> determines if we should have a "ProcessorContext KIP" before driving > >> this KIP further. > >> > >> Thoughts? > >> > >> > >> > >> > >> -Matthias > >> > >> > >> On 5/15/17 11:01 AM, Jeyhun Karimov wrote: > >>> Hi, > >>> > >>> Sorry for super late response. Thanks for your comments. > >>> > >>> I am not an expert on Lambdas. Can you elaborate a little bit? I cannot > >>>> follow the explanation in the KIP to see what the problem is. > >>> > >>> > >>> - From [1] says "A functional interface is an interface that has just > one > >>> abstract method, and thus represents a single function contract". > >>> So basically once we extend some interface from another (in our case, > >>> ValueMapperWithKey from ValueMapper) we cannot use lambdas in the > >> extended > >>> interface. > >>> > >>> > >>> Further comments: > >>>> - The KIP get a little hard to read -- can you maybe reformat the > wiki > >>>> page a little bit? I think using `CodeBlock` would help. > >>> > >>> > >>> - I will work on the KIP. > >>> > >>> - What about KStream-KTable joins? You don't have overlaods added for > >>>> them. Why? (Even if I still hope that we don't need to add any new > >>>> overloads) > >>> > >>> > >>> - Actually there are more than one Processor and public APIs to be > >>> changed (KStream-KTable > >>> joins is one case). However all of them has similar structure: we > >> overload > >>> the *method* with *methodWithKey*, > >>> wrap it into the Rich function, send to processor and inside the > >> processor > >>> call *init* and *close* methods of the Rich function. > >>> As I wrote in KIP, I wanted to demonstrate the overall idea with only > >>> *ValueMapper* as the same can be applied to all changes. > >>> Anyway I will update the KIP. > >>> > >>> - Why do we need `AbstractRichFunction`? > >>> > >>> > >>> Instead of overriding the *init(ProcessorContext p)* and* close()* > >> methods > >>> in every Rich function with empty body like: > >>> > >>> @Override > >>> void init(ProcessorContext context) {} > >>> > >>> @Override > >>> void close () {} > >>> > >>> I thought that we can override them once in *AbstractRichFunction* and > >>> extent new Rich functions from *AbstractRichFunction*. > >>> Basically this can eliminate code copy-paste and ease the maintenance. > >>> > >>> - What about interfaces Initializer, ForeachAction, Merger, Predicate, > >>>> Reducer? I don't want to say we should/need to add to all, but we > should > >>>> discuss all of them and add where it does make sense (e.g., > >>>> RichForachAction does make sense IMHO) > >>> > >>> > >>> Definitely agree. As I said, the same technique applies to all this > >>> interfaces and I didn't want to explode the KIP, just wanted to give > the > >>> overall intuition. > >>> However, I will update the KIP as I said. > >>> > >>> > >>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- > `RichValueXX` > >>>> in general -- but why can't we do all this with interfaces only? > >>> > >>> > >>> Sure we can. However the main intuition is we should not force users to > >>> implement *init(ProcessorContext)* and *close()* functions every time > >> they > >>> use Rich functions. > >>> If one needs, she can override the respective methods. However, I am > open > >>> for discussion. > >>> > >>> > >>> I'd rather not see the use of `ProcessorContext` spread any further > than > >>>> it currently is. So maybe we need another KIP that is done before > this? > >>>> Otherwise i think the scope of this KIP is becoming too large. > >>> > >>> > >>> That is good point. I wanted to make *init(ProcessorContext)* method > >>> persistent among the library (which use ProcessorContext as an input), > >>> therefore I put *ProcessorContext* as an input. > >>> So the important question is that (as @dguy and @mjsax mentioned) > whether > >>> continue this KIP without providing users an access to > *ProcessorContext* > >>> (change *init (ProcessorContext)* to * init()* ) or > >>> initiate another KIP before this. > >>> > >>> [1] > >>> http://cr.openjdk.java.net/~mr/se/8/java-se-8-pfd-spec/java- > >> se-8-jls-pfd-diffs.pdf > >>> > >>> > >>> Cheers, > >>> Jeyhun > >>> > >>> On Mon, May 15, 2017 at 7:15 PM, Damian Guy <damian....@gmail.com> > >> wrote: > >>> > >>>> I'd rather not see the use of `ProcessorContext` spread any further > >> than > >>>> it currently is. So maybe we need another KIP that is done before > this? > >>>> Otherwise i think the scope of this KIP is becoming too large. > >>>> > >>>> > >>>> On Mon, 15 May 2017 at 18:06 Matthias J. Sax <matth...@confluent.io> > >>>> wrote: > >>>> > >>>>> I agree that that `ProcessorContext` interface is too broad in > general > >>>>> -- this is even true for transform/process, and it's also reflected > in > >>>>> the API improvement list we want to do. > >>>>> > >>>>> https://cwiki.apache.org/confluence/display/KAFKA/ > >>>> Kafka+Streams+Discussions > >>>>> > >>>>> So I am wondering, if you question the `RichFunction` approach in > >>>>> general? Or if you suggest to either extend the scope of this KIP to > >>>>> include this---or maybe better, do another KIP for it and delay this > >> KIP > >>>>> until the other one is done? > >>>>> > >>>>> > >>>>> -Matthias > >>>>> > >>>>> On 5/15/17 2:35 AM, Damian Guy wrote: > >>>>>> Thanks for the KIP. > >>>>>> > >>>>>> I'm not convinced on the `RichFunction` approach. Do we really want > to > >>>>> give > >>>>>> every DSL method access to the `ProcessorContext` ? It has a bunch > of > >>>>>> methods on it that seem in-appropriate for some of the DSL methods, > >>>> i.e, > >>>>>> `register`, `getStateStore`, `forward`, `schedule` etc. It is far > too > >>>>>> broad. I think it would be better to have a narrower interface like > >> the > >>>>>> `RecordContext` - remembering it is easier to add > methods/interfaces > >>>>> later > >>>>>> than to remove them > >>>>>> > >>>>>> On Sat, 13 May 2017 at 22:26 Matthias J. Sax <matth...@confluent.io > > > >>>>> wrote: > >>>>>> > >>>>>>> Jeyhun, > >>>>>>> > >>>>>>> I am not an expert on Lambdas. Can you elaborate a little bit? I > >>>> cannot > >>>>>>> follow the explanation in the KIP to see what the problem is. > >>>>>>> > >>>>>>> For updating the KIP title I don't know -- guess it's up to you. > >>>> Maybe a > >>>>>>> committer can comment on this? > >>>>>>> > >>>>>>> > >>>>>>> Further comments: > >>>>>>> > >>>>>>> - The KIP get a little hard to read -- can you maybe reformat the > >>>> wiki > >>>>>>> page a little bit? I think using `CodeBlock` would help. > >>>>>>> > >>>>>>> - What about KStream-KTable joins? You don't have overlaods added > >> for > >>>>>>> them. Why? (Even if I still hope that we don't need to add any new > >>>>>>> overloads) > >>>>>>> > >>>>>>> - Why do we need `AbstractRichFunction`? > >>>>>>> > >>>>>>> - What about interfaces Initializer, ForeachAction, Merger, > >>>> Predicate, > >>>>>>> Reducer? I don't want to say we should/need to add to all, but we > >>>> should > >>>>>>> discuss all of them and add where it does make sense (e.g., > >>>>>>> RichForachAction does make sense IMHO) > >>>>>>> > >>>>>>> > >>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- > >>>> `RichValueXX` > >>>>>>> in general -- but why can't we do all this with interfaces only? > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> -Matthias > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> On 5/11/17 7:00 AM, Jeyhun Karimov wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> Thanks for your comments. I think we cannot extend the two > >> interfaces > >>>>> if > >>>>>>> we > >>>>>>>> want to keep lambdas. I updated the KIP [1]. Maybe I should change > >>>> the > >>>>>>>> title, because now we are not limiting the KIP to only > ValueMapper, > >>>>>>>> ValueTransformer and ValueJoiner. > >>>>>>>> Please feel free to comment. > >>>>>>>> > >>>>>>>> [1] > >>>>>>>> > >>>>>>> > >>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ > >>>> ValueMapper%2C+and+ValueJoiner > >>>>>>>> > >>>>>>>> > >>>>>>>> Cheers, > >>>>>>>> Jeyhun > >>>>>>>> > >>>>>>>> On Tue, May 9, 2017 at 7:36 PM Matthias J. Sax < > >>>> matth...@confluent.io> > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>>> If `ValueMapperWithKey` extends `ValueMapper` we don't need the > new > >>>>>>>>> overlaod. > >>>>>>>>> > >>>>>>>>> And yes, we need to do one check -- but this happens when > building > >>>> the > >>>>>>>>> topology. At runtime (I mean after KafkaStream#start() we don't > >> need > >>>>> any > >>>>>>>>> check as we will always use `ValueMapperWithKey`) > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -Matthias > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 5/9/17 2:55 AM, Jeyhun Karimov wrote: > >>>>>>>>>> Hi, > >>>>>>>>>> > >>>>>>>>>> Thanks for feedback. > >>>>>>>>>> Then we need to overload method > >>>>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapper<? super V, ? extends > >>>> VR> > >>>>>>>>>> mapper); > >>>>>>>>>> with > >>>>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapperWithKey<? super V, ? > >>>>> extends > >>>>>>>>> VR> > >>>>>>>>>> mapper); > >>>>>>>>>> > >>>>>>>>>> and in runtime (inside processor) we still have to check it is > >>>>>>>>> ValueMapper > >>>>>>>>>> or ValueMapperWithKey before wrapping it into the rich function. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Please correct me if I am wrong. > >>>>>>>>>> > >>>>>>>>>> Cheers, > >>>>>>>>>> Jeyhun > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Tue, May 9, 2017 at 10:56 AM Michal Borowiecki < > >>>>>>>>>> michal.borowie...@openbet.com> wrote: > >>>>>>>>>> > >>>>>>>>>>> +1 :) > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On 08/05/17 23:52, Matthias J. Sax wrote: > >>>>>>>>>>>> Hi, > >>>>>>>>>>>> > >>>>>>>>>>>> I was reading the updated KIP and I am wondering, if we should > >> do > >>>>> the > >>>>>>>>>>>> design a little different. > >>>>>>>>>>>> > >>>>>>>>>>>> Instead of distinguishing between a RichFunction and > >>>>> non-RichFunction > >>>>>>>>> at > >>>>>>>>>>>> runtime level, we would use RichFunctions all the time. Thus, > on > >>>>> the > >>>>>>>>> DSL > >>>>>>>>>>>> entry level, if a user provides a non-RichFunction, we wrap it > >>>> by a > >>>>>>>>>>>> RichFunction that is fully implemented by Streams. This > >>>>> RichFunction > >>>>>>>>>>>> would just forward the call omitting the key: > >>>>>>>>>>>> > >>>>>>>>>>>> Just to sketch the idea (incomplete code snippet): > >>>>>>>>>>>> > >>>>>>>>>>>>> public StreamsRichValueMapper implements RichValueMapper() { > >>>>>>>>>>>>> private ValueMapper userProvidedMapper; // set by > >> constructor > >>>>>>>>>>>>> > >>>>>>>>>>>>> public VR apply(final K key, final V1 value1, final V2 > >>>> value2) > >>>>> { > >>>>>>>>>>>>> return userProvidedMapper(value1, value2); > >>>>>>>>>>>>> } > >>>>>>>>>>>>> } > >>>>>>>>>>>> > >>>>>>>>>>>> From a performance point of view, I am not sure if the > >>>>>>>>>>>> "if(isRichFunction)" including casts etc would have more > >> overhead > >>>>>>> than > >>>>>>>>>>>> this approach (we would do more nested method call for > >>>>>>> non-RichFunction > >>>>>>>>>>>> which should be more common than RichFunctions). > >>>>>>>>>>>> > >>>>>>>>>>>> This approach should not effect lambdas (or do I miss > >> something?) > >>>>> and > >>>>>>>>>>>> might be cleaner, as we could have one more top level > interface > >>>>>>>>>>>> `RichFunction` with methods `init()` and `close()` and also > >>>>>>> interfaces > >>>>>>>>>>>> for `RichValueMapper` etc. (thus, no abstract classes > required). > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Any thoughts? > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> -Matthias > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On 5/6/17 5:29 PM, Jeyhun Karimov wrote: > >>>>>>>>>>>>> Hi, > >>>>>>>>>>>>> > >>>>>>>>>>>>> Thanks for comments. I extended PR and KIP to include rich > >>>>>>> functions. > >>>>>>>>> I > >>>>>>>>>>>>> will still have to evaluate the cost of deep copying of keys. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Fri, May 5, 2017 at 8:02 PM Mathieu Fenniak < > >>>>>>>>>>> mathieu.fenn...@replicon.com> > >>>>>>>>>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> Hey Matthias, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> My opinion would be that documenting the immutability of the > >>>> key > >>>>> is > >>>>>>>>> the > >>>>>>>>>>>>>> best approach available. Better than requiring the key to > be > >>>>>>>>>>> serializable > >>>>>>>>>>>>>> (as with Jeyhun's last pass at the PR), no performance risk. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> It'd be different if Java had immutable type constraints of > >>>> some > >>>>>>>>> kind. > >>>>>>>>>>> :-) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Mathieu > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Fri, May 5, 2017 at 11:31 AM, Matthias J. Sax < > >>>>>>>>>>> matth...@confluent.io> > >>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Agreed about RichFunction. If we follow this path, it > should > >>>>> cover > >>>>>>>>>>>>>>> all(?) DSL interfaces. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> About guarding the key -- I am still not sure what to do > >> about > >>>>>>> it... > >>>>>>>>>>>>>>> Maybe it might be enough to document it (and name the key > >>>>>>> parameter > >>>>>>>>>>> like > >>>>>>>>>>>>>>> `readOnlyKey` to make is very clear). Ultimately, I would > >>>> prefer > >>>>>>> to > >>>>>>>>>>>>>>> guard against any modification, but I have no good idea how > >> to > >>>>> do > >>>>>>>>>>> this. > >>>>>>>>>>>>>>> Not sure what others think about the risk of corrupted > >>>>>>> partitioning > >>>>>>>>>>>>>>> (what would be a user error and we could say, well, bad > luck, > >>>>> you > >>>>>>>>> got > >>>>>>>>>>> a > >>>>>>>>>>>>>>> bug in your code, that's not our fault), vs deep copy with > a > >>>>>>>>> potential > >>>>>>>>>>>>>>> performance hit (that we can't quantity atm without any > >>>>>>> performance > >>>>>>>>>>>>>> test). > >>>>>>>>>>>>>>> We do have a performance system test. Maybe it's worth for > >> you > >>>>> to > >>>>>>>>>>> apply > >>>>>>>>>>>>>>> the deep copy strategy and run the test. It's very basic > >>>>>>> performance > >>>>>>>>>>>>>>> test only, but might give some insight. If you want to do > >>>> this, > >>>>>>> look > >>>>>>>>>>>>>>> into folder "tests" for general test setup, and into > >>>>>>>>>>>>>>> "tests/kafaktests/benchmarks/streams" to find find the perf > >>>>> test. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On 5/5/17 8:55 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>>>> Hi Matthias, > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> I think extending KIP to include RichFunctions totally > >> makes > >>>>>>>>> sense. > >>>>>>>>>>>>>> So, > >>>>>>>>>>>>>>>> we don't want to guard the keys because it is costly. > >>>>>>>>>>>>>>>> If we introduce RichFunctions I think it should not be > >>>> limited > >>>>>>> only > >>>>>>>>>>>>>> the 3 > >>>>>>>>>>>>>>>> interfaces proposed in KIP but to wide range of > interfaces. > >>>>>>>>>>>>>>>> Please correct me if I am wrong. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 12:04 AM Matthias J. Sax < > >>>>>>>>>>> matth...@confluent.io > >>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> One follow up. There was this email on the user list: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> http://search-hadoop.com/m/Kafka/uyzND17KhCaBzPSZ1?subj= > >>>>>>>>>>>>>>> Shouldn+t+the+initializer+of+a+stream+aggregate+accept+the+ > >>>> key+ > >>>>>>>>>>>>>>>>> It might make sense so include Initializer, Adder, and > >>>>>>> Substractor > >>>>>>>>>>>>>>>>> inferface, too. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> And we should double check if there are other interface > we > >>>>> might > >>>>>>>>>>> miss > >>>>>>>>>>>>>>> atm. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On 5/4/17 1:31 PM, Matthias J. Sax wrote: > >>>>>>>>>>>>>>>>>> Thanks for updating the KIP. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Deep copying the key will work for sure, but I am > actually > >>>> a > >>>>>>>>> little > >>>>>>>>>>>>>> bit > >>>>>>>>>>>>>>>>>> worried about performance impact... We might want to do > >>>> some > >>>>>>> test > >>>>>>>>>>> to > >>>>>>>>>>>>>>>>>> quantify this impact. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Btw: this remind me about the idea of `RichFunction` > >>>>> interface > >>>>>>>>> that > >>>>>>>>>>>>>>>>>> would allow users to access record metadata (like > >>>> timestamp, > >>>>>>>>>>> offset, > >>>>>>>>>>>>>>>>>> partition etc) within DSL. This would be a similar > >> concept. > >>>>>>>>> Thus, I > >>>>>>>>>>>>>> am > >>>>>>>>>>>>>>>>>> wondering, if it would make sense to enlarge the scope > of > >>>>> this > >>>>>>>>> KIP > >>>>>>>>>>> by > >>>>>>>>>>>>>>>>>> that? WDYT? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On 5/3/17 2:08 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>>>>>>> Hi Mathieu, > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Thanks for feedback. I followed similar approach and > >>>> updated > >>>>>>> PR > >>>>>>>>>>> and > >>>>>>>>>>>>>>> KIP > >>>>>>>>>>>>>>>>>>> accordingly. I tried to guard the key in Processors > >>>> sending > >>>>> a > >>>>>>>>> copy > >>>>>>>>>>>>>> of > >>>>>>>>>>>>>>> an > >>>>>>>>>>>>>>>>>>> actual key. > >>>>>>>>>>>>>>>>>>> Because I am doing deep copy of an object, I think > memory > >>>>> can > >>>>>>> be > >>>>>>>>>>>>>>>>> bottleneck > >>>>>>>>>>>>>>>>>>> in some use-cases. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> On Tue, May 2, 2017 at 5:10 PM Mathieu Fenniak < > >>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> > >>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Hi Jeyhun, > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> This approach would change ValueMapper (...etc) to be > >>>>>>> classes, > >>>>>>>>>>>>>> rather > >>>>>>>>>>>>>>>>> than > >>>>>>>>>>>>>>>>>>>> interfaces, which is also a backwards incompatible > >>>> change. > >>>>>>> An > >>>>>>>>>>>>>>>>> alternative > >>>>>>>>>>>>>>>>>>>> approach that would be backwards compatible would be > to > >>>>>>> define > >>>>>>>>>>> new > >>>>>>>>>>>>>>>>>>>> interfaces, and provide overrides where those > interfaces > >>>>> are > >>>>>>>>>>> used. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Unfortunately, making the key parameter as "final" > >>>> doesn't > >>>>>>>>> change > >>>>>>>>>>>>>>> much > >>>>>>>>>>>>>>>>>>>> about guarding against key change. It only prevents > the > >>>>>>>>>>> parameter > >>>>>>>>>>>>>>>>> variable > >>>>>>>>>>>>>>>>>>>> from being reassigned. If the key type is a mutable > >>>> object > >>>>>>>>> (eg. > >>>>>>>>>>>>>>>>> byte[]), > >>>>>>>>>>>>>>>>>>>> it can still be mutated. (eg. key[0] = 0). But I'm > not > >>>>>>> really > >>>>>>>>>>> sure > >>>>>>>>>>>>>>>>> there's > >>>>>>>>>>>>>>>>>>>> much that can be done about that. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Mathieu > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 5:39 PM, Jeyhun Karimov < > >>>>>>>>>>>>>> je.kari...@gmail.com > >>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Thanks for comments. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> The concerns makes sense. Although we can guard for > >>>>>>> immutable > >>>>>>>>>>> keys > >>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>>>> current implementation (with few changes), I didn't > >>>>> consider > >>>>>>>>>>>>>>> backward > >>>>>>>>>>>>>>>>>>>>> compatibility. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> In this case 2 solutions come to my mind. In both > >> cases, > >>>>>>> user > >>>>>>>>>>>>>>> accesses > >>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>> key in Object type, as passing extra type parameter > >> will > >>>>>>> break > >>>>>>>>>>>>>>>>>>>>> backwards-compatibility. So user has to cast to > actual > >>>>> key > >>>>>>>>>>> type. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> 1. Firstly, We can overload apply method with 2 > >> argument > >>>>>>> (key > >>>>>>>>>>> and > >>>>>>>>>>>>>>>>> value) > >>>>>>>>>>>>>>>>>>>>> and force key to be *final*. By doing this, I think > we > >>>>> can > >>>>>>>>>>>>>> address > >>>>>>>>>>>>>>>>> both > >>>>>>>>>>>>>>>>>>>>> backward-compatibility and guarding against key > change. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> 2. Secondly, we can create class KeyAccess like: > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> public class KeyAccess { > >>>>>>>>>>>>>>>>>>>>> Object key; > >>>>>>>>>>>>>>>>>>>>> public void beforeApply(final Object key) { > >>>>>>>>>>>>>>>>>>>>> this.key = key; > >>>>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>>>> public Object getKey() { > >>>>>>>>>>>>>>>>>>>>> return key; > >>>>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> We can extend *ValueMapper, ValueJoiner* and > >>>>>>>>> *ValueTransformer* > >>>>>>>>>>>>>> from > >>>>>>>>>>>>>>>>>>>>> *KeyAccess*. Inside processor (for example > >>>>>>>>>>>>>>> *KTableMapValuesProcessor*) > >>>>>>>>>>>>>>>>>>>>> before calling *mapper.apply(value)* we can set the > >>>> *key* > >>>>> by > >>>>>>>>>>>>>>>>>>>>> *mapper.beforeApply(key)*. As a result, user can use > >>>>>>>>> *getKey()* > >>>>>>>>>>> to > >>>>>>>>>>>>>>>>> access > >>>>>>>>>>>>>>>>>>>>> the key inside *apply(value)* method. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 7:24 PM Matthias J. Sax < > >>>>>>>>>>>>>>> matth...@confluent.io > >>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Jeyhun, > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> thanks a lot for the KIP! > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> I think there are two issues we need to address: > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> (1) The KIP does not consider backward > compatibility. > >>>>> Users > >>>>>>>>> did > >>>>>>>>>>>>>>>>>>>> complain > >>>>>>>>>>>>>>>>>>>>>> about this in past releases already, and as the user > >>>> base > >>>>>>>>>>> grows, > >>>>>>>>>>>>>> we > >>>>>>>>>>>>>>>>>>>>>> should not break backward compatibility in future > >>>>> releases > >>>>>>>>>>>>>> anymore. > >>>>>>>>>>>>>>>>>>>>>> Thus, we should think of a better way to allow key > >>>>> access. > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Mathieu's comment goes into the same direction > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile failures > >>>> that > >>>>>>>>> would > >>>>>>>>>>>>>> need > >>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-) > >>>>>>>>>>>>>>>>>>>>>> (2) Another concern is, that there is no guard to > >>>> prevent > >>>>>>>>> user > >>>>>>>>>>>>>> code > >>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>> modify the key. This might corrupt partitioning if > >>>> users > >>>>> do > >>>>>>>>>>> alter > >>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>> key (accidentally -- or users are just not aware > that > >>>>> they > >>>>>>>>> are > >>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>>>> allowed to modify the provided key object) and thus > >>>> break > >>>>>>> the > >>>>>>>>>>>>>>>>>>>>>> application. (This was the original motivation to > not > >>>>>>> provide > >>>>>>>>>>> the > >>>>>>>>>>>>>>> key > >>>>>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>>>>> the first place -- it's guards against > modification.) > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> On 5/1/17 6:31 AM, Mathieu Fenniak wrote: > >>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun, > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> I just want to add my voice that, I too, have > wished > >>>> for > >>>>>>>>>>> access > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>> record key during a mapValues or similar operation. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile failures > >> that > >>>>>>> would > >>>>>>>>>>>>>> need > >>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>>> fixed from this change is unfortunate. :-) But at > >>>> least > >>>>>>> it > >>>>>>>>>>>>>> would > >>>>>>>>>>>>>>>>> all > >>>>>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>>>>> pretty clear and easy change. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Mathieu > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 6:55 AM, Jeyhun Karimov < > >>>>>>>>>>>>>>>>> je.kari...@gmail.com > >>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>> Dear community, > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> I want to share KIP-149 [1] based on issues > >>>> KAFKA-4218 > >>>>>>> [2], > >>>>>>>>>>>>>>>>>>>> KAFKA-4726 > >>>>>>>>>>>>>>>>>>>>>> [3], > >>>>>>>>>>>>>>>>>>>>>>>> KAFKA-3745 [4]. The related PR can be found at > [5]. > >>>>>>>>>>>>>>>>>>>>>>>> I would like to get your comments. > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> [1] > >>>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/ > >>>> confluence/display/KAFKA/KIP- > >>>>>>>>>>>>>>>>>>>>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ > >>>>>>>>>>>>>>>>>>>>>>>> ValueMapper%2C+and+ValueJoiner > >>>>>>>>>>>>>>>>>>>>>>>> [2] https://issues.apache.org/jira > >> /browse/KAFKA-4218 > >>>>>>>>>>>>>>>>>>>>>>>> [3] https://issues.apache.org/jira > >> /browse/KAFKA-4726 > >>>>>>>>>>>>>>>>>>>>>>>> [4] https://issues.apache.org/jira > >> /browse/KAFKA-3745 > >>>>>>>>>>>>>>>>>>>>>>>> [5] https://github.com/apache/kafka/pull/2946 > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>>>>>>> -Cheers > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>>>> -Cheers > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>> -Cheers > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> -- > >>>>>>>>>> -Cheers > >>>>>>>>>> > >>>>>>>>>> Jeyhun > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> -- > >>>>>>>> -Cheers > >>>>>>>> > >>>>>>>> Jeyhun > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>> > >>>>> > >>>>> > >>>> > >>> > >> > >> > > > > -- -Cheers Jeyhun