Thanks for clarification Matthias, now everything is clear. On Sun, May 28, 2017 at 6:21 PM Matthias J. Sax <matth...@confluent.io> wrote:
> I don't think we can drop ValueTransformerSupplier. If you don't have an > supplier, you only get a single instance of your function. But for a > stateful transformation, we need multiple instances (one for each task) > of ValueTransformer. > > We don't need suppliers for functions like "ValueMapper" etc because > those are stateless and thus, we can reuse a single instance over > multiple tasks -- but we can't do this for ValueTransformer (and similar). > > Btw: This reminds me about KIP-159: with regard to the RichFunction we > might need a supplier pattern, too. (I'll comment on the other thread, > too.) > > > -Matthias > > On 5/28/17 5:45 AM, Jeyhun Karimov wrote: > > Hi, > > > > I updated KIP. > > Just to avoid misunderstanding, I meant deprecating > ValueTransformerSupplier > > and I am ok with ValueTransformer. > > So instead of using ValueTransformerSupplier can't we directly use > > ValueTransformer > > or ValueTransformerWithKey? > > > > Btw, in current design all features of ValueTransformer will be available > > in ValueTransformerWithKey interface. > > > > Cheers, > > Jeyhun > > > > On Sun, May 28, 2017 at 6:15 AM Matthias J. Sax <matth...@confluent.io> > > wrote: > > > >> Thanks Jeyhun. > >> > >> About ValueTransformer: I don't think we can remove it. Note, > >> ValueTransformer allows to attach a state and also allows to register > >> punctuations. Both those features will not be available via withKey() > >> interfaces. > >> > >> -Matthias > >> > >> > >> On 5/27/17 1:25 PM, Jeyhun Karimov wrote: > >>> Hi Matthias, > >>> > >>> Thanks for your comments. > >>> > >>> I tested the deep copy approach. It has significant overhead. > Especially > >>> for "light" and stateless operators it slows down the topology > >>> significantly (> 20% ). I think "warning" users about not-changing the > >> key > >>> is better warning them about possible performance loss. > >>> > >>> About the interfaces, additionally I considered adding > >> InitializerWithKey, > >>> AggregatorWithKey and ValueTransformerWithKey. I think they are > included > >> in > >>> PR but not in KIP. I will also include them in KIP, sorry my bad. > >>> Including ReducerWithKey definitely makes sense. Thanks. > >>> > >>> One thing I want to mention is that, maybe we should deprecate methods > >> with > >>> argument type ValueTransformerSupplier (KStream.transformValues(...)) > and > >>> and as a whole the ValueTransformerSupplier interface. > >>> We can use ValueTransformer/ValueTransformerWithKey type instead > without > >>> additional supplier layer. > >>> > >>> > >>> Cheers, > >>> Jeyhun > >>> > >>> > >>> On Thu, May 25, 2017 at 1:07 AM Matthias J. Sax <matth...@confluent.io > > > >>> wrote: > >>> > >>>> One more question: > >>>> > >>>> Should we add any of > >>>> - InitizialierWithKey > >>>> - ReducerWithKey > >>>> - ValueTransformerWithKey > >>>> > >>>> To get consistent/complete API, it might be a good idea. Any thoughts? > >>>> > >>>> > >>>> -Matthias > >>>> > >>>> > >>>> On 5/24/17 3:47 PM, Matthias J. Sax wrote: > >>>>> Jeyhun, > >>>>> > >>>>> I was just wondering if you did look into the key-deep-copy idea we > >>>>> discussed. I am curious to see what the impact might be. > >>>>> > >>>>> > >>>>> -Matthias > >>>>> > >>>>> On 5/20/17 2:03 AM, Jeyhun Karimov wrote: > >>>>>> Hi, > >>>>>> > >>>>>> Thanks for your comments. I rethink about including rich functions > >> into > >>>>>> this KIP. > >>>>>> I think once we include rich functions in this KIP and then fix > >>>>>> ProcessorContext in another KIP and incorporate with existing rich > >>>>>> functions, the code will not be backwards compatible. > >>>>>> > >>>>>> I see Damian's and your point more clearly now. > >>>>>> > >>>>>> Conclusion: we include only withKey interfaces in this KIP (I > updated > >>>> the > >>>>>> KIP), I will try also initiate another KIP for rich functions. > >>>>>> > >>>>>> Cheers, > >>>>>> Jeyhun > >>>>>> > >>>>>> On Fri, May 19, 2017 at 10:50 PM Matthias J. Sax < > >> matth...@confluent.io > >>>>> > >>>>>> wrote: > >>>>>> > >>>>>>> With the current KIP, using ValueMapper and ValueMapperWithKey > >>>>>>> interfaces, RichFunction seems to be an independent add-on. To fix > >> the > >>>>>>> original issue to allow key access, RichFunctions are not required > >>>> IMHO. > >>>>>>> > >>>>>>> I initially put the RichFunction idea on the table, because I was > >>>> hoping > >>>>>>> to get a uniform API. And I think, is was good to consider them -- > >>>>>>> however, the discussion showed that they are not necessary for key > >>>>>>> access. Thus, it seems to be better to handle RichFunctions in an > own > >>>>>>> KIP. The ProcessorContext/RecordContext issues seems to be a main > >>>>>>> challenge for this. And introducing RichFunctions with > parameter-less > >>>>>>> init() method, seem not to help too much. We would get an > >>>> "intermediate" > >>>>>>> API that we want to change anyway later on... > >>>>>>> > >>>>>>> As you put already much effort into RichFunction, feel free do push > >>>> this > >>>>>>> further and start a new KIP (we can do this even in parallel) -- we > >>>>>>> don't want to slow you down :) But it make the discussion and code > >>>>>>> review easier, if we separate both IMHO. > >>>>>>> > >>>>>>> > >>>>>>> -Matthias > >>>>>>> > >>>>>>> > >>>>>>> On 5/19/17 2:25 AM, Jeyhun Karimov wrote: > >>>>>>>> Hi Damian, > >>>>>>>> > >>>>>>>> Thanks for your comments. I think providing to users *interface* > >>>> rather > >>>>>>>> than *abstract class* should be preferred (Matthias also raised > this > >>>>>>> issue > >>>>>>>> ), anyway I changed the corresponding parts of KIP. > >>>>>>>> > >>>>>>>> Regarding with passing additional contextual information, I think > it > >>>> is a > >>>>>>>> tradeoff, > >>>>>>>> 1) first, we fix the context parameter for *init() *method in > >> another > >>>> PR > >>>>>>>> and solve Rich functions afterwards > >>>>>>>> 2) first, we fix the requested issues on jira ([1-3]) with > providing > >>>>>>>> (not-complete) Rich functions and integrate the context parameters > >> to > >>>>>>> this > >>>>>>>> afterwards (like improvement) > >>>>>>>> > >>>>>>>> To me, the second approach seems more incremental. However you are > >>>> right, > >>>>>>>> the names might confuse the users. > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> [1] https://issues.apache.org/jira/browse/KAFKA-4218 > >>>>>>>> [2] https://issues.apache.org/jira/browse/KAFKA-4726 > >>>>>>>> [3] https://issues.apache.org/jira/browse/KAFKA-3745 > >>>>>>>> > >>>>>>>> > >>>>>>>> Cheers, > >>>>>>>> Jeyhun > >>>>>>>> > >>>>>>>> > >>>>>>>> On Fri, May 19, 2017 at 10:42 AM Damian Guy <damian....@gmail.com > > > >>>>>>> wrote: > >>>>>>>> > >>>>>>>>> Hi, > >>>>>>>>> > >>>>>>>>> I see you've removed the `ProcessorContext` from the RichFunction > >>>> which > >>>>>>> is > >>>>>>>>> good, but why is it a `RichFunction`? I'd have expected it to > pass > >>>> some > >>>>>>>>> additional contextual information, i.e., the `RecordContext` that > >>>>>>> contains > >>>>>>>>> just the topic, partition, timestamp, offset. I'm ok with it not > >>>>>>> passing > >>>>>>>>> this contextual information, but is the naming incorrect? I'm not > >>>> sure, > >>>>>>>>> tbh. I'm wondering if we should drop `RichFunctions` until we can > >> do > >>>> it > >>>>>>>>> properly with the correct context? > >>>>>>>>> > >>>>>>>>> Also, i don't like the abstract classes: RichValueMapper, > >>>>>>> RichValueJoiner, > >>>>>>>>> RichInitializer etc. Why can't they not just be interfaces? > >>>> Generally we > >>>>>>>>> should provide users with Intefaces to code against, not classes. > >>>>>>>>> > >>>>>>>>> Thanks, > >>>>>>>>> Damian > >>>>>>>>> > >>>>>>>>> On Fri, 19 May 2017 at 00:50 Jeyhun Karimov < > je.kari...@gmail.com> > >>>>>>> wrote: > >>>>>>>>> > >>>>>>>>>> Hi, > >>>>>>>>>> > >>>>>>>>>> Thanks. I initiated the PR as well, to get a better overview. > >>>>>>>>>> > >>>>>>>>>> The only reason that I used abstract class instead of interface > >> for > >>>>>>> Rich > >>>>>>>>>> functions is that in future if we will have some > >>>> AbstractRichFunction > >>>>>>>>>> abstract classes, > >>>>>>>>>> we can easily extend: > >>>>>>>>>> > >>>>>>>>>> public abstract class RichValueMapper<K, V, VR> implements > >>>>>>>>>> ValueMapperWithKey<K, V, VR>, RichFunction *extends > >>>>>>>>> AbstractRichFunction*{ > >>>>>>>>>> } > >>>>>>>>>> With interfaces we are only limited to interfaces for > >> inheritance. > >>>>>>>>>> > >>>>>>>>>> However, I think we can easily change it (from abstract class -> > >>>>>>>>> interface) > >>>>>>>>>> if you think interface is a better fit. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Cheers, > >>>>>>>>>> Jeyhun > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Fri, May 19, 2017 at 1:00 AM Matthias J. Sax < > >>>> matth...@confluent.io > >>>>>>>> > >>>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> Thanks for the update and explanations. The KIP is quite > improved > >>>> now > >>>>>>>>> -- > >>>>>>>>>>> great job! > >>>>>>>>>>> > >>>>>>>>>>> One more question: Why are RichValueMapper etc abstract classes > >> and > >>>>>>> not > >>>>>>>>>>> interfaces? > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Overall, I like the KIP a lot! > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> -Matthias > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On 5/16/17 7:03 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>> Hi, > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks for your comments. > >>>>>>>>>>>> > >>>>>>>>>>>> I think supporting Lambdas for `withKey` and > >>>> `AbstractRichFunction` > >>>>>>>>>>>>> don't go together, as Lambdas are only supported for > interfaces > >>>>>>>>> AFAIK. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Maybe I misunderstood your comment. > >>>>>>>>>>>> *withKey* and and *withOnlyValue* are interfaces. So they > don't > >>>> have > >>>>>>>>>>> direct > >>>>>>>>>>>> relation with *AbstractRichFunction*. > >>>>>>>>>>>> *withKey* and and *withOnlyValue* interfaces have only one > >>>> method , > >>>>>>>>> so > >>>>>>>>>>> we > >>>>>>>>>>>> can use lambdas. > >>>>>>>>>>>> Where does the *AbstractRichFunction* comes to play? Inside > Rich > >>>>>>>>>>> functions. > >>>>>>>>>>>> And we use Rich functions in 2 places: > >>>>>>>>>>>> > >>>>>>>>>>>> 1. User doesn't use rich functions. Just regular *withKey* and > >> and > >>>>>>>>>>>> *withOnlyValue* interfaces(both support lambdas) . We get > those > >>>>>>>>>>> interfaces > >>>>>>>>>>>> and wrap into Rich function while building the topology, and > >> send > >>>> to > >>>>>>>>>>>> Processor. > >>>>>>>>>>>> 2. User does use rich functions (Rich functions implement > >>>> *withKey* > >>>>>>>>>>>> interface). As a result no lamdas here by definition. In this > >>>> case, > >>>>>>>>>> while > >>>>>>>>>>>> building the topology we do a type check if the object type is > >>>>>>>>>>>> *withKey* or *RichFunction*. > >>>>>>>>>>>> > >>>>>>>>>>>> So *AbstractRichFunction* is just syntactic sugar for Rich > >>>> functions > >>>>>>>>>> and > >>>>>>>>>>>> does not affect using lambdas. > >>>>>>>>>>>> > >>>>>>>>>>>> Thus, if we want to support Lambdas for `withKey`, we need to > >>>> have a > >>>>>>>>>>>>> interface approach like this > >>>>>>>>>>>>> - RichFunction -> only adding init() and close() > >>>>>>>>>>>>> - ValueMapper > >>>>>>>>>>>>> - ValueMapperWithKey > >>>>>>>>>>>>> - RichValueMapper extends ValueMapperWithKey, RichFunction > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> As I said above, currently we support lambdas for *withKey* > >>>>>>>>> interfaces > >>>>>>>>>> as > >>>>>>>>>>>> well. However, I agree with your idea and I will remove the > >>>>>>>>>>>> AbstractRichFunction from the design. > >>>>>>>>>>>> > >>>>>>>>>>>> As an alternative, we could argue, that it is sufficient to > >>>> support > >>>>>>>>>>>>> Lambdas for the "plain" API only, but not for any "extended > >> API". > >>>>>>>>> For > >>>>>>>>>>>>> this, RichFunction could add key+init+close and > >>>> AbstractRichFunction > >>>>>>>>>>>>> would allow to only care about getting the key. > >>>>>>>>>>>>> Not sure, which one is better. I don't like the idea of more > >>>>>>>>>> overloaded > >>>>>>>>>>>>> methods to get Lambdas for `withKey` interfaces too much > >> because > >>>> we > >>>>>>>>>> have > >>>>>>>>>>>>> already so many overlaods. On the other hand, I do see value > in > >>>>>>>>>>>>> supporting Lambdas for `withKey`. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Just to clarify, with current design we have only one extra > >>>>>>>>> overloaded > >>>>>>>>>>>> method per *withOnlyValue* interface: which is *withKey* > >> version > >>>> of > >>>>>>>>>>>> particular interface. > >>>>>>>>>>>> We don't need extra overload for Rich function as Rich > function > >>>>>>>>>>> implements > >>>>>>>>>>>> *withKey* interface as a result they have same type. We > >>>> differentiate > >>>>>>>>>>> them > >>>>>>>>>>>> while building the topology. > >>>>>>>>>>>> We supported lambdas for *withKey* APIs because of the > comment: > >>>>>>>>>>>> > >>>>>>>>>>>> @Jeyhun: I did not put any thought into this, but can we have > a > >>>>>>>>> design > >>>>>>>>>>>>> that allows for both? Also, with regard to lambdas, it might > >> make > >>>>>>>>>> sense > >>>>>>>>>>>>> to allow for both `V -> newV` and `(K, V) -> newV` ? > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> However, I don't think that this complicates the overall > design > >>>>>>>>>>>> significantly. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Depending on what we want to support, it might make sense to > >>>>>>>>>>>>> include/exclude RichFunctions from this KIP -- and thus, this > >>>> also > >>>>>>>>>>>>> determines if we should have a "ProcessorContext KIP" before > >>>> driving > >>>>>>>>>>>>> this KIP further. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Based on our discussion I think we should keep Rich functions > >> as I > >>>>>>>>>> don't > >>>>>>>>>>>> think that they bring extra layer of overhead to library. > >>>>>>>>>>>> > >>>>>>>>>>>> Any comments are appreciated. > >>>>>>>>>>>> > >>>>>>>>>>>> Cheers, > >>>>>>>>>>>> Jeyhun > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On Tue, May 16, 2017 at 12:10 AM Matthias J. Sax < > >>>>>>>>>> matth...@confluent.io> > >>>>>>>>>>>> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>>> Jeyhun, > >>>>>>>>>>>>> > >>>>>>>>>>>>> thanks for the update. > >>>>>>>>>>>>> > >>>>>>>>>>>>> I think supporting Lambdas for `withKey` and > >>>> `AbstractRichFunction` > >>>>>>>>>>>>> don't go together, as Lambdas are only supported for > interfaces > >>>>>>>>> AFAIK. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Thus, if we want to support Lambdas for `withKey`, we need to > >>>> have a > >>>>>>>>>>>>> interface approach like this > >>>>>>>>>>>>> > >>>>>>>>>>>>> - RichFunction -> only adding init() and close() > >>>>>>>>>>>>> > >>>>>>>>>>>>> - ValueMapper > >>>>>>>>>>>>> - ValueMapperWithKey > >>>>>>>>>>>>> > >>>>>>>>>>>>> - RichValueMapper extends ValueMapperWithKey, RichFunction > >>>>>>>>>>>>> > >>>>>>>>>>>>> For this approach, AbstractRichFunction does not make sense > >>>> anymore, > >>>>>>>>>> as > >>>>>>>>>>>>> the only purpose of `RichFunction` is to allow the > >>>> implementation of > >>>>>>>>>>>>> init() and close() -- if you don't want those, you would > >>>> implement a > >>>>>>>>>>>>> different interface (ie, ValueMapperWithKey) > >>>>>>>>>>>>> > >>>>>>>>>>>>> As an alternative, we could argue, that it is sufficient to > >>>> support > >>>>>>>>>>>>> Lambdas for the "plain" API only, but not for any "extended > >> API". > >>>>>>>>> For > >>>>>>>>>>>>> this, RichFunction could add key+init+close and > >>>> AbstractRichFunction > >>>>>>>>>>>>> would allow to only care about getting the key. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Not sure, which one is better. I don't like the idea of more > >>>>>>>>>> overloaded > >>>>>>>>>>>>> methods to get Lambdas for `withKey` interfaces too much > >> because > >>>> we > >>>>>>>>>> have > >>>>>>>>>>>>> already so many overlaods. On the other hand, I do see value > in > >>>>>>>>>>>>> supporting Lambdas for `withKey`. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Depending on what we want to support, it might make sense to > >>>>>>>>>>>>> include/exclude RichFunctions from this KIP -- and thus, this > >>>> also > >>>>>>>>>>>>> determines if we should have a "ProcessorContext KIP" before > >>>> driving > >>>>>>>>>>>>> this KIP further. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Thoughts? > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> On 5/15/17 11:01 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Sorry for super late response. Thanks for your comments. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a little > >> bit? I > >>>>>>>>>> cannot > >>>>>>>>>>>>>>> follow the explanation in the KIP to see what the problem > is. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> - From [1] says "A functional interface is an interface that > >> has > >>>>>>>>> just > >>>>>>>>>>> one > >>>>>>>>>>>>>> abstract method, and thus represents a single function > >>>> contract". > >>>>>>>>>>>>>> So basically once we extend some interface from another (in > >> our > >>>>>>>>> case, > >>>>>>>>>>>>>> ValueMapperWithKey from ValueMapper) we cannot use lambdas > in > >>>> the > >>>>>>>>>>>>> extended > >>>>>>>>>>>>>> interface. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Further comments: > >>>>>>>>>>>>>>> - The KIP get a little hard to read -- can you maybe > >> reformat > >>>> the > >>>>>>>>>>> wiki > >>>>>>>>>>>>>>> page a little bit? I think using `CodeBlock` would help. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> - I will work on the KIP. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> - What about KStream-KTable joins? You don't have overlaods > >>>> added > >>>>>>>>>> for > >>>>>>>>>>>>>>> them. Why? (Even if I still hope that we don't need to add > >> any > >>>> new > >>>>>>>>>>>>>>> overloads) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> - Actually there are more than one Processor and public APIs > >> to > >>>> be > >>>>>>>>>>>>>> changed (KStream-KTable > >>>>>>>>>>>>>> joins is one case). However all of them has similar > structure: > >>>> we > >>>>>>>>>>>>> overload > >>>>>>>>>>>>>> the *method* with *methodWithKey*, > >>>>>>>>>>>>>> wrap it into the Rich function, send to processor and inside > >> the > >>>>>>>>>>>>> processor > >>>>>>>>>>>>>> call *init* and *close* methods of the Rich function. > >>>>>>>>>>>>>> As I wrote in KIP, I wanted to demonstrate the overall idea > >> with > >>>>>>>>> only > >>>>>>>>>>>>>> *ValueMapper* as the same can be applied to all changes. > >>>>>>>>>>>>>> Anyway I will update the KIP. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> - Why do we need `AbstractRichFunction`? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Instead of overriding the *init(ProcessorContext p)* and* > >>>> close()* > >>>>>>>>>>>>> methods > >>>>>>>>>>>>>> in every Rich function with empty body like: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> @Override > >>>>>>>>>>>>>> void init(ProcessorContext context) {} > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> @Override > >>>>>>>>>>>>>> void close () {} > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I thought that we can override them once in > >>>> *AbstractRichFunction* > >>>>>>>>>> and > >>>>>>>>>>>>>> extent new Rich functions from *AbstractRichFunction*. > >>>>>>>>>>>>>> Basically this can eliminate code copy-paste and ease the > >>>>>>>>>> maintenance. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> - What about interfaces Initializer, ForeachAction, Merger, > >>>>>>>>>> Predicate, > >>>>>>>>>>>>>>> Reducer? I don't want to say we should/need to add to all, > >> but > >>>> we > >>>>>>>>>>> should > >>>>>>>>>>>>>>> discuss all of them and add where it does make sense (e.g., > >>>>>>>>>>>>>>> RichForachAction does make sense IMHO) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Definitely agree. As I said, the same technique applies to > all > >>>> this > >>>>>>>>>>>>>> interfaces and I didn't want to explode the KIP, just wanted > >> to > >>>>>>>>> give > >>>>>>>>>>> the > >>>>>>>>>>>>>> overall intuition. > >>>>>>>>>>>>>> However, I will update the KIP as I said. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- > >>>>>>>>>>> `RichValueXX` > >>>>>>>>>>>>>>> in general -- but why can't we do all this with interfaces > >>>> only? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Sure we can. However the main intuition is we should not > force > >>>>>>>>> users > >>>>>>>>>> to > >>>>>>>>>>>>>> implement *init(ProcessorContext)* and *close()* functions > >> every > >>>>>>>>> time > >>>>>>>>>>>>> they > >>>>>>>>>>>>>> use Rich functions. > >>>>>>>>>>>>>> If one needs, she can override the respective methods. > >> However, > >>>> I > >>>>>>>>> am > >>>>>>>>>>> open > >>>>>>>>>>>>>> for discussion. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I'd rather not see the use of `ProcessorContext` spread any > >>>>>>>>> further > >>>>>>>>>>> than > >>>>>>>>>>>>>>> it currently is. So maybe we need another KIP that is done > >>>> before > >>>>>>>>>>> this? > >>>>>>>>>>>>>>> Otherwise i think the scope of this KIP is becoming too > >> large. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> That is good point. I wanted to make > *init(ProcessorContext)* > >>>>>>>>> method > >>>>>>>>>>>>>> persistent among the library (which use ProcessorContext as > an > >>>>>>>>>> input), > >>>>>>>>>>>>>> therefore I put *ProcessorContext* as an input. > >>>>>>>>>>>>>> So the important question is that (as @dguy and @mjsax > >>>> mentioned) > >>>>>>>>>>> whether > >>>>>>>>>>>>>> continue this KIP without providing users an access to > >>>>>>>>>>> *ProcessorContext* > >>>>>>>>>>>>>> (change *init (ProcessorContext)* to * init()* ) or > >>>>>>>>>>>>>> initiate another KIP before this. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> [1] > >>>>>>>>>>>>>> > http://cr.openjdk.java.net/~mr/se/8/java-se-8-pfd-spec/java- > >>>>>>>>>>>>> se-8-jls-pfd-diffs.pdf > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Mon, May 15, 2017 at 7:15 PM, Damian Guy < > >>>> damian....@gmail.com> > >>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I'd rather not see the use of `ProcessorContext` spread > any > >>>>>>>>> further > >>>>>>>>>>>>> than > >>>>>>>>>>>>>>> it currently is. So maybe we need another KIP that is done > >>>> before > >>>>>>>>>>> this? > >>>>>>>>>>>>>>> Otherwise i think the scope of this KIP is becoming too > >> large. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On Mon, 15 May 2017 at 18:06 Matthias J. Sax < > >>>>>>>>> matth...@confluent.io > >>>>>>>>>>> > >>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> I agree that that `ProcessorContext` interface is too > broad > >> in > >>>>>>>>>>> general > >>>>>>>>>>>>>>>> -- this is even true for transform/process, and it's also > >>>>>>>>> reflected > >>>>>>>>>>> in > >>>>>>>>>>>>>>>> the API improvement list we want to do. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/ > >>>>>>>>>>>>>>> Kafka+Streams+Discussions > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> So I am wondering, if you question the `RichFunction` > >>>> approach in > >>>>>>>>>>>>>>>> general? Or if you suggest to either extend the scope of > >> this > >>>> KIP > >>>>>>>>>> to > >>>>>>>>>>>>>>>> include this---or maybe better, do another KIP for it and > >>>> delay > >>>>>>>>>> this > >>>>>>>>>>>>> KIP > >>>>>>>>>>>>>>>> until the other one is done? > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> On 5/15/17 2:35 AM, Damian Guy wrote: > >>>>>>>>>>>>>>>>> Thanks for the KIP. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> I'm not convinced on the `RichFunction` approach. Do we > >>>> really > >>>>>>>>>> want > >>>>>>>>>>> to > >>>>>>>>>>>>>>>> give > >>>>>>>>>>>>>>>>> every DSL method access to the `ProcessorContext` ? It > has > >> a > >>>>>>>>> bunch > >>>>>>>>>>> of > >>>>>>>>>>>>>>>>> methods on it that seem in-appropriate for some of the > DSL > >>>>>>>>>> methods, > >>>>>>>>>>>>>>> i.e, > >>>>>>>>>>>>>>>>> `register`, `getStateStore`, `forward`, `schedule` etc. > It > >> is > >>>>>>>>> far > >>>>>>>>>>> too > >>>>>>>>>>>>>>>>> broad. I think it would be better to have a narrower > >>>> interface > >>>>>>>>>> like > >>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>> `RecordContext` - remembering it is easier to add > >>>>>>>>>>> methods/interfaces > >>>>>>>>>>>>>>>> later > >>>>>>>>>>>>>>>>> than to remove them > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On Sat, 13 May 2017 at 22:26 Matthias J. Sax < > >>>>>>>>>> matth...@confluent.io > >>>>>>>>>>>> > >>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Jeyhun, > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a > little > >>>> bit? > >>>>>>>>> I > >>>>>>>>>>>>>>> cannot > >>>>>>>>>>>>>>>>>> follow the explanation in the KIP to see what the > problem > >>>> is. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> For updating the KIP title I don't know -- guess it's up > >> to > >>>>>>>>> you. > >>>>>>>>>>>>>>> Maybe a > >>>>>>>>>>>>>>>>>> committer can comment on this? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Further comments: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> - The KIP get a little hard to read -- can you maybe > >>>> reformat > >>>>>>>>>> the > >>>>>>>>>>>>>>> wiki > >>>>>>>>>>>>>>>>>> page a little bit? I think using `CodeBlock` would help. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> - What about KStream-KTable joins? You don't have > >> overlaods > >>>>>>>>>> added > >>>>>>>>>>>>> for > >>>>>>>>>>>>>>>>>> them. Why? (Even if I still hope that we don't need to > add > >>>> any > >>>>>>>>>> new > >>>>>>>>>>>>>>>>>> overloads) > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> - Why do we need `AbstractRichFunction`? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> - What about interfaces Initializer, ForeachAction, > >> Merger, > >>>>>>>>>>>>>>> Predicate, > >>>>>>>>>>>>>>>>>> Reducer? I don't want to say we should/need to add to > all, > >>>> but > >>>>>>>>> we > >>>>>>>>>>>>>>> should > >>>>>>>>>>>>>>>>>> discuss all of them and add where it does make sense > >> (e.g., > >>>>>>>>>>>>>>>>>> RichForachAction does make sense IMHO) > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` > -- > >>>>>>>>>>>>>>> `RichValueXX` > >>>>>>>>>>>>>>>>>> in general -- but why can't we do all this with > interfaces > >>>>>>>>> only? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On 5/11/17 7:00 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Thanks for your comments. I think we cannot extend the > >> two > >>>>>>>>>>>>> interfaces > >>>>>>>>>>>>>>>> if > >>>>>>>>>>>>>>>>>> we > >>>>>>>>>>>>>>>>>>> want to keep lambdas. I updated the KIP [1]. Maybe I > >> should > >>>>>>>>>> change > >>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>> title, because now we are not limiting the KIP to only > >>>>>>>>>>> ValueMapper, > >>>>>>>>>>>>>>>>>>> ValueTransformer and ValueJoiner. > >>>>>>>>>>>>>>>>>>> Please feel free to comment. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> [1] > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >>>>>>>>>>>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ > >>>>>>>>>>>>>>> ValueMapper%2C+and+ValueJoiner > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> On Tue, May 9, 2017 at 7:36 PM Matthias J. Sax < > >>>>>>>>>>>>>>> matth...@confluent.io> > >>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> If `ValueMapperWithKey` extends `ValueMapper` we don't > >>>> need > >>>>>>>>> the > >>>>>>>>>>> new > >>>>>>>>>>>>>>>>>>>> overlaod. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> And yes, we need to do one check -- but this happens > >> when > >>>>>>>>>>> building > >>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>> topology. At runtime (I mean after KafkaStream#start() > >> we > >>>>>>>>> don't > >>>>>>>>>>>>> need > >>>>>>>>>>>>>>>> any > >>>>>>>>>>>>>>>>>>>> check as we will always use `ValueMapperWithKey`) > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> On 5/9/17 2:55 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Thanks for feedback. > >>>>>>>>>>>>>>>>>>>>> Then we need to overload method > >>>>>>>>>>>>>>>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapper<? super > V, > >> ? > >>>>>>>>>> extends > >>>>>>>>>>>>>>> VR> > >>>>>>>>>>>>>>>>>>>>> mapper); > >>>>>>>>>>>>>>>>>>>>> with > >>>>>>>>>>>>>>>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapperWithKey<? > >>>> super > >>>>>>>>> V, > >>>>>>>>>> ? > >>>>>>>>>>>>>>>> extends > >>>>>>>>>>>>>>>>>>>> VR> > >>>>>>>>>>>>>>>>>>>>> mapper); > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> and in runtime (inside processor) we still have to > >> check > >>>> it > >>>>>>>>> is > >>>>>>>>>>>>>>>>>>>> ValueMapper > >>>>>>>>>>>>>>>>>>>>> or ValueMapperWithKey before wrapping it into the > rich > >>>>>>>>>> function. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Please correct me if I am wrong. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> On Tue, May 9, 2017 at 10:56 AM Michal Borowiecki < > >>>>>>>>>>>>>>>>>>>>> michal.borowie...@openbet.com> wrote: > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> +1 :) > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> On 08/05/17 23:52, Matthias J. Sax wrote: > >>>>>>>>>>>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> I was reading the updated KIP and I am wondering, > if > >> we > >>>>>>>>>> should > >>>>>>>>>>>>> do > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>> design a little different. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Instead of distinguishing between a RichFunction > and > >>>>>>>>>>>>>>>> non-RichFunction > >>>>>>>>>>>>>>>>>>>> at > >>>>>>>>>>>>>>>>>>>>>>> runtime level, we would use RichFunctions all the > >> time. > >>>>>>>>>> Thus, > >>>>>>>>>>> on > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>> DSL > >>>>>>>>>>>>>>>>>>>>>>> entry level, if a user provides a non-RichFunction, > >> we > >>>>>>>>> wrap > >>>>>>>>>> it > >>>>>>>>>>>>>>> by a > >>>>>>>>>>>>>>>>>>>>>>> RichFunction that is fully implemented by Streams. > >> This > >>>>>>>>>>>>>>>> RichFunction > >>>>>>>>>>>>>>>>>>>>>>> would just forward the call omitting the key: > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Just to sketch the idea (incomplete code snippet): > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> public StreamsRichValueMapper implements > >>>>>>>>> RichValueMapper() > >>>>>>>>>> { > >>>>>>>>>>>>>>>>>>>>>>>> private ValueMapper userProvidedMapper; // set > by > >>>>>>>>>>>>> constructor > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> public VR apply(final K key, final V1 value1, > >>>> final V2 > >>>>>>>>>>>>>>> value2) > >>>>>>>>>>>>>>>> { > >>>>>>>>>>>>>>>>>>>>>>>> return userProvidedMapper(value1, value2); > >>>>>>>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> From a performance point of view, I am not sure if > >> the > >>>>>>>>>>>>>>>>>>>>>>> "if(isRichFunction)" including casts etc would have > >>>> more > >>>>>>>>>>>>> overhead > >>>>>>>>>>>>>>>>>> than > >>>>>>>>>>>>>>>>>>>>>>> this approach (we would do more nested method call > >> for > >>>>>>>>>>>>>>>>>> non-RichFunction > >>>>>>>>>>>>>>>>>>>>>>> which should be more common than RichFunctions). > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> This approach should not effect lambdas (or do I > miss > >>>>>>>>>>>>> something?) > >>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>>> might be cleaner, as we could have one more top > level > >>>>>>>>>>> interface > >>>>>>>>>>>>>>>>>>>>>>> `RichFunction` with methods `init()` and `close()` > >> and > >>>>>>>>> also > >>>>>>>>>>>>>>>>>> interfaces > >>>>>>>>>>>>>>>>>>>>>>> for `RichValueMapper` etc. (thus, no abstract > classes > >>>>>>>>>>> required). > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Any thoughts? > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> On 5/6/17 5:29 PM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Thanks for comments. I extended PR and KIP to > >> include > >>>>>>>>> rich > >>>>>>>>>>>>>>>>>> functions. > >>>>>>>>>>>>>>>>>>>> I > >>>>>>>>>>>>>>>>>>>>>>>> will still have to evaluate the cost of deep > copying > >>>> of > >>>>>>>>>> keys. > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 8:02 PM Mathieu Fenniak < > >>>>>>>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> > >>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Hey Matthias, > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> My opinion would be that documenting the > >>>> immutability of > >>>>>>>>>> the > >>>>>>>>>>>>>>> key > >>>>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> best approach available. Better than requiring > the > >>>> key > >>>>>>>>> to > >>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>> serializable > >>>>>>>>>>>>>>>>>>>>>>>>> (as with Jeyhun's last pass at the PR), no > >>>> performance > >>>>>>>>>> risk. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> It'd be different if Java had immutable type > >>>> constraints > >>>>>>>>>> of > >>>>>>>>>>>>>>> some > >>>>>>>>>>>>>>>>>>>> kind. > >>>>>>>>>>>>>>>>>>>>>> :-) > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Mathieu > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 11:31 AM, Matthias J. Sax > < > >>>>>>>>>>>>>>>>>>>>>> matth...@confluent.io> > >>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Agreed about RichFunction. If we follow this > path, > >>>> it > >>>>>>>>>>> should > >>>>>>>>>>>>>>>> cover > >>>>>>>>>>>>>>>>>>>>>>>>>> all(?) DSL interfaces. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> About guarding the key -- I am still not sure > what > >>>> to > >>>>>>>>> do > >>>>>>>>>>>>> about > >>>>>>>>>>>>>>>>>> it... > >>>>>>>>>>>>>>>>>>>>>>>>>> Maybe it might be enough to document it (and > name > >>>> the > >>>>>>>>> key > >>>>>>>>>>>>>>>>>> parameter > >>>>>>>>>>>>>>>>>>>>>> like > >>>>>>>>>>>>>>>>>>>>>>>>>> `readOnlyKey` to make is very clear). > Ultimately, > >> I > >>>>>>>>> would > >>>>>>>>>>>>>>> prefer > >>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>> guard against any modification, but I have no > good > >>>> idea > >>>>>>>>>> how > >>>>>>>>>>>>> to > >>>>>>>>>>>>>>>> do > >>>>>>>>>>>>>>>>>>>>>> this. > >>>>>>>>>>>>>>>>>>>>>>>>>> Not sure what others think about the risk of > >>>> corrupted > >>>>>>>>>>>>>>>>>> partitioning > >>>>>>>>>>>>>>>>>>>>>>>>>> (what would be a user error and we could say, > >> well, > >>>> bad > >>>>>>>>>>> luck, > >>>>>>>>>>>>>>>> you > >>>>>>>>>>>>>>>>>>>> got > >>>>>>>>>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>>>>>>>> bug in your code, that's not our fault), vs deep > >>>> copy > >>>>>>>>>> with > >>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>> potential > >>>>>>>>>>>>>>>>>>>>>>>>>> performance hit (that we can't quantity atm > >> without > >>>> any > >>>>>>>>>>>>>>>>>> performance > >>>>>>>>>>>>>>>>>>>>>>>>> test). > >>>>>>>>>>>>>>>>>>>>>>>>>> We do have a performance system test. Maybe it's > >>>> worth > >>>>>>>>>> for > >>>>>>>>>>>>> you > >>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>> apply > >>>>>>>>>>>>>>>>>>>>>>>>>> the deep copy strategy and run the test. It's > very > >>>>>>>>> basic > >>>>>>>>>>>>>>>>>> performance > >>>>>>>>>>>>>>>>>>>>>>>>>> test only, but might give some insight. If you > >> want > >>>> to > >>>>>>>>> do > >>>>>>>>>>>>>>> this, > >>>>>>>>>>>>>>>>>> look > >>>>>>>>>>>>>>>>>>>>>>>>>> into folder "tests" for general test setup, and > >> into > >>>>>>>>>>>>>>>>>>>>>>>>>> "tests/kafaktests/benchmarks/streams" to find > find > >>>> the > >>>>>>>>>> perf > >>>>>>>>>>>>>>>> test. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> On 5/5/17 8:55 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Matthias, > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> I think extending KIP to include RichFunctions > >>>> totally > >>>>>>>>>>>>> makes > >>>>>>>>>>>>>>>>>>>> sense. > >>>>>>>>>>>>>>>>>>>>>>>>> So, > >>>>>>>>>>>>>>>>>>>>>>>>>>> we don't want to guard the keys because it is > >>>>>>>>> costly. > >>>>>>>>>>>>>>>>>>>>>>>>>>> If we introduce RichFunctions I think it should > >>>> not be > >>>>>>>>>>>>>>> limited > >>>>>>>>>>>>>>>>>> only > >>>>>>>>>>>>>>>>>>>>>>>>> the 3 > >>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces proposed in KIP but to wide range of > >>>>>>>>>>> interfaces. > >>>>>>>>>>>>>>>>>>>>>>>>>>> Please correct me if I am wrong. > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 12:04 AM Matthias J. > Sax < > >>>>>>>>>>>>>>>>>>>>>> matth...@confluent.io > >>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> One follow up. There was this email on the > user > >>>> list: > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>> http://search-hadoop.com/m/Kafka/uyzND17KhCaBzPSZ1?subj= > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>> Shouldn+t+the+initializer+of+a+stream+aggregate+accept+the+ > >>>>>>>>>>>>>>> key+ > >>>>>>>>>>>>>>>>>>>>>>>>>>>> It might make sense so include Initializer, > >> Adder, > >>>>>>>>> and > >>>>>>>>>>>>>>>>>> Substractor > >>>>>>>>>>>>>>>>>>>>>>>>>>>> inferface, too. > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> And we should double check if there are other > >>>>>>>>> interface > >>>>>>>>>>> we > >>>>>>>>>>>>>>>> might > >>>>>>>>>>>>>>>>>>>>>> miss > >>>>>>>>>>>>>>>>>>>>>>>>>> atm. > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> On 5/4/17 1:31 PM, Matthias J. Sax wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for updating the KIP. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Deep copying the key will work for sure, but > I > >> am > >>>>>>>>>>> actually > >>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>> little > >>>>>>>>>>>>>>>>>>>>>>>>> bit > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> worried about performance impact... We might > >>>> want to > >>>>>>>>>> do > >>>>>>>>>>>>>>> some > >>>>>>>>>>>>>>>>>> test > >>>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> quantify this impact. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Btw: this remind me about the idea of > >>>> `RichFunction` > >>>>>>>>>>>>>>>> interface > >>>>>>>>>>>>>>>>>>>> that > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> would allow users to access record metadata > >> (like > >>>>>>>>>>>>>>> timestamp, > >>>>>>>>>>>>>>>>>>>>>> offset, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> partition etc) within DSL. This would be a > >>>> similar > >>>>>>>>>>>>> concept. > >>>>>>>>>>>>>>>>>>>> Thus, I > >>>>>>>>>>>>>>>>>>>>>>>>> am > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> wondering, if it would make sense to enlarge > >> the > >>>>>>>>> scope > >>>>>>>>>>> of > >>>>>>>>>>>>>>>> this > >>>>>>>>>>>>>>>>>>>> KIP > >>>>>>>>>>>>>>>>>>>>>> by > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> that? WDYT? > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 5/3/17 2:08 AM, Jeyhun Karimov wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Mathieu, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for feedback. I followed similar > >> approach > >>>>>>>>> and > >>>>>>>>>>>>>>> updated > >>>>>>>>>>>>>>>>>> PR > >>>>>>>>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>>>>>> KIP > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> accordingly. I tried to guard the key in > >>>> Processors > >>>>>>>>>>>>>>> sending > >>>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>> copy > >>>>>>>>>>>>>>>>>>>>>>>>> of > >>>>>>>>>>>>>>>>>>>>>>>>>> an > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> actual key. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Because I am doing deep copy of an object, I > >>>> think > >>>>>>>>>>> memory > >>>>>>>>>>>>>>>> can > >>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>>>>>>>> bottleneck > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> in some use-cases. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, May 2, 2017 at 5:10 PM Mathieu > >> Fenniak < > >>>>>>>>>>>>>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This approach would change ValueMapper > >>>> (...etc) to > >>>>>>>>>> be > >>>>>>>>>>>>>>>>>> classes, > >>>>>>>>>>>>>>>>>>>>>>>>> rather > >>>>>>>>>>>>>>>>>>>>>>>>>>>> than > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, which is also a backwards > >>>> incompatible > >>>>>>>>>>>>>>> change. > >>>>>>>>>>>>>>>>>> An > >>>>>>>>>>>>>>>>>>>>>>>>>>>> alternative > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> approach that would be backwards compatible > >>>> would > >>>>>>>>> be > >>>>>>>>>>> to > >>>>>>>>>>>>>>>>>> define > >>>>>>>>>>>>>>>>>>>>>> new > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, and provide overrides where > those > >>>>>>>>>>> interfaces > >>>>>>>>>>>>>>>> are > >>>>>>>>>>>>>>>>>>>>>> used. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Unfortunately, making the key parameter as > >>>> "final" > >>>>>>>>>>>>>>> doesn't > >>>>>>>>>>>>>>>>>>>> change > >>>>>>>>>>>>>>>>>>>>>>>>>> much > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about guarding against key change. It only > >>>>>>>>> prevents > >>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>> parameter > >>>>>>>>>>>>>>>>>>>>>>>>>>>> variable > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> from being reassigned. If the key type is > a > >>>>>>>>> mutable > >>>>>>>>>>>>>>> object > >>>>>>>>>>>>>>>>>>>> (eg. > >>>>>>>>>>>>>>>>>>>>>>>>>>>> byte[]), > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it can still be mutated. (eg. key[0] = 0). > >> But > >>>>>>>>> I'm > >>>>>>>>>>> not > >>>>>>>>>>>>>>>>>> really > >>>>>>>>>>>>>>>>>>>>>> sure > >>>>>>>>>>>>>>>>>>>>>>>>>>>> there's > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> much that can be done about that. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 5:39 PM, Jeyhun > >> Karimov > >>>> < > >>>>>>>>>>>>>>>>>>>>>>>>> je.kari...@gmail.com > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for comments. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The concerns makes sense. Although we can > >>>> guard > >>>>>>>>> for > >>>>>>>>>>>>>>>>>> immutable > >>>>>>>>>>>>>>>>>>>>>> keys > >>>>>>>>>>>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> current implementation (with few > changes), I > >>>>>>>>> didn't > >>>>>>>>>>>>>>>> consider > >>>>>>>>>>>>>>>>>>>>>>>>>> backward > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> compatibility. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this case 2 solutions come to my mind. > In > >>>> both > >>>>>>>>>>>>> cases, > >>>>>>>>>>>>>>>>>> user > >>>>>>>>>>>>>>>>>>>>>>>>>> accesses > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> key in Object type, as passing extra type > >>>>>>>>> parameter > >>>>>>>>>>>>> will > >>>>>>>>>>>>>>>>>> break > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> backwards-compatibility. So user has to > >> cast > >>>> to > >>>>>>>>>>> actual > >>>>>>>>>>>>>>>> key > >>>>>>>>>>>>>>>>>>>>>> type. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> -- -Cheers Jeyhun