We started with this proposal but it does not allow for Lambdas (in case you want key access). Do you think preserving Lambdas is not important enough for this case -- I agree that there is some price to be paid to keep Lambdas.
About API consistency: I can't remember a concrete user request to have key access in all operators. But I am pretty sure, if we only add it for some, this request will pop up quickly. IMHO, it's better to do it all in one shot. (But I am not strict about it if most people think we should limit it to what was requested by users.) -Matthias On 6/4/17 6:23 PM, Guozhang Wang wrote: > With KIP-159, the added "valueMapperWithKey" and "valueTransformerWithKey" > along seem to be just adding a few syntax-sugars? E.g. we can simply > implement: > > mapValues(ValueMapperWithKey mapperWithKey) > > as > > map((k, v) -> (k, mapperWithKey(k, v)) > > ---------------------- > > I'm not sure how many users would really need such syntax sugars, and even > they do, we can support them easily as the above implementations; > > Similarly for "ReducerWithKey", it can be implemented as `Aggregator<K, V, > V>`, and people who needs key access can just use `aggregate` directly. > > The function which I think is really of valuable is `ValueJoinerWithKey`, > and that seems to be the original request from users while others are > coming from "API consistency" concerns. Personally I'd prefer only keep the > last one (`InitializerWithKey` might also have some value, but I have not > seen people widely requesting it in their DSL usage yet; if there is a > common request we can keep it in this KIP as well). WDYT? > > Guozhang > > > On Sun, May 28, 2017 at 10:16 AM, Jeyhun Karimov <je.kari...@gmail.com> > wrote: > >> Thanks for clarification Matthias, now everything is clear. >> >> On Sun, May 28, 2017 at 6:21 PM Matthias J. Sax <matth...@confluent.io> >> wrote: >> >>> I don't think we can drop ValueTransformerSupplier. If you don't have an >>> supplier, you only get a single instance of your function. But for a >>> stateful transformation, we need multiple instances (one for each task) >>> of ValueTransformer. >>> >>> We don't need suppliers for functions like "ValueMapper" etc because >>> those are stateless and thus, we can reuse a single instance over >>> multiple tasks -- but we can't do this for ValueTransformer (and >> similar). >>> >>> Btw: This reminds me about KIP-159: with regard to the RichFunction we >>> might need a supplier pattern, too. (I'll comment on the other thread, >>> too.) >>> >>> >>> -Matthias >>> >>> On 5/28/17 5:45 AM, Jeyhun Karimov wrote: >>>> Hi, >>>> >>>> I updated KIP. >>>> Just to avoid misunderstanding, I meant deprecating >>> ValueTransformerSupplier >>>> and I am ok with ValueTransformer. >>>> So instead of using ValueTransformerSupplier can't we directly use >>>> ValueTransformer >>>> or ValueTransformerWithKey? >>>> >>>> Btw, in current design all features of ValueTransformer will be >> available >>>> in ValueTransformerWithKey interface. >>>> >>>> Cheers, >>>> Jeyhun >>>> >>>> On Sun, May 28, 2017 at 6:15 AM Matthias J. Sax <matth...@confluent.io >>> >>>> wrote: >>>> >>>>> Thanks Jeyhun. >>>>> >>>>> About ValueTransformer: I don't think we can remove it. Note, >>>>> ValueTransformer allows to attach a state and also allows to register >>>>> punctuations. Both those features will not be available via withKey() >>>>> interfaces. >>>>> >>>>> -Matthias >>>>> >>>>> >>>>> On 5/27/17 1:25 PM, Jeyhun Karimov wrote: >>>>>> Hi Matthias, >>>>>> >>>>>> Thanks for your comments. >>>>>> >>>>>> I tested the deep copy approach. It has significant overhead. >>> Especially >>>>>> for "light" and stateless operators it slows down the topology >>>>>> significantly (> 20% ). I think "warning" users about not-changing >> the >>>>> key >>>>>> is better warning them about possible performance loss. >>>>>> >>>>>> About the interfaces, additionally I considered adding >>>>> InitializerWithKey, >>>>>> AggregatorWithKey and ValueTransformerWithKey. I think they are >>> included >>>>> in >>>>>> PR but not in KIP. I will also include them in KIP, sorry my bad. >>>>>> Including ReducerWithKey definitely makes sense. Thanks. >>>>>> >>>>>> One thing I want to mention is that, maybe we should deprecate >> methods >>>>> with >>>>>> argument type ValueTransformerSupplier (KStream.transformValues(...)) >>> and >>>>>> and as a whole the ValueTransformerSupplier interface. >>>>>> We can use ValueTransformer/ValueTransformerWithKey type instead >>> without >>>>>> additional supplier layer. >>>>>> >>>>>> >>>>>> Cheers, >>>>>> Jeyhun >>>>>> >>>>>> >>>>>> On Thu, May 25, 2017 at 1:07 AM Matthias J. Sax < >> matth...@confluent.io >>>> >>>>>> wrote: >>>>>> >>>>>>> One more question: >>>>>>> >>>>>>> Should we add any of >>>>>>> - InitizialierWithKey >>>>>>> - ReducerWithKey >>>>>>> - ValueTransformerWithKey >>>>>>> >>>>>>> To get consistent/complete API, it might be a good idea. Any >> thoughts? >>>>>>> >>>>>>> >>>>>>> -Matthias >>>>>>> >>>>>>> >>>>>>> On 5/24/17 3:47 PM, Matthias J. Sax wrote: >>>>>>>> Jeyhun, >>>>>>>> >>>>>>>> I was just wondering if you did look into the key-deep-copy idea we >>>>>>>> discussed. I am curious to see what the impact might be. >>>>>>>> >>>>>>>> >>>>>>>> -Matthias >>>>>>>> >>>>>>>> On 5/20/17 2:03 AM, Jeyhun Karimov wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Thanks for your comments. I rethink about including rich functions >>>>> into >>>>>>>>> this KIP. >>>>>>>>> I think once we include rich functions in this KIP and then fix >>>>>>>>> ProcessorContext in another KIP and incorporate with existing rich >>>>>>>>> functions, the code will not be backwards compatible. >>>>>>>>> >>>>>>>>> I see Damian's and your point more clearly now. >>>>>>>>> >>>>>>>>> Conclusion: we include only withKey interfaces in this KIP (I >>> updated >>>>>>> the >>>>>>>>> KIP), I will try also initiate another KIP for rich functions. >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> Jeyhun >>>>>>>>> >>>>>>>>> On Fri, May 19, 2017 at 10:50 PM Matthias J. Sax < >>>>> matth...@confluent.io >>>>>>>> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> With the current KIP, using ValueMapper and ValueMapperWithKey >>>>>>>>>> interfaces, RichFunction seems to be an independent add-on. To >> fix >>>>> the >>>>>>>>>> original issue to allow key access, RichFunctions are not >> required >>>>>>> IMHO. >>>>>>>>>> >>>>>>>>>> I initially put the RichFunction idea on the table, because I was >>>>>>> hoping >>>>>>>>>> to get a uniform API. And I think, is was good to consider them >> -- >>>>>>>>>> however, the discussion showed that they are not necessary for >> key >>>>>>>>>> access. Thus, it seems to be better to handle RichFunctions in an >>> own >>>>>>>>>> KIP. The ProcessorContext/RecordContext issues seems to be a main >>>>>>>>>> challenge for this. And introducing RichFunctions with >>> parameter-less >>>>>>>>>> init() method, seem not to help too much. We would get an >>>>>>> "intermediate" >>>>>>>>>> API that we want to change anyway later on... >>>>>>>>>> >>>>>>>>>> As you put already much effort into RichFunction, feel free do >> push >>>>>>> this >>>>>>>>>> further and start a new KIP (we can do this even in parallel) -- >> we >>>>>>>>>> don't want to slow you down :) But it make the discussion and >> code >>>>>>>>>> review easier, if we separate both IMHO. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -Matthias >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 5/19/17 2:25 AM, Jeyhun Karimov wrote: >>>>>>>>>>> Hi Damian, >>>>>>>>>>> >>>>>>>>>>> Thanks for your comments. I think providing to users *interface* >>>>>>> rather >>>>>>>>>>> than *abstract class* should be preferred (Matthias also raised >>> this >>>>>>>>>> issue >>>>>>>>>>> ), anyway I changed the corresponding parts of KIP. >>>>>>>>>>> >>>>>>>>>>> Regarding with passing additional contextual information, I >> think >>> it >>>>>>> is a >>>>>>>>>>> tradeoff, >>>>>>>>>>> 1) first, we fix the context parameter for *init() *method in >>>>> another >>>>>>> PR >>>>>>>>>>> and solve Rich functions afterwards >>>>>>>>>>> 2) first, we fix the requested issues on jira ([1-3]) with >>> providing >>>>>>>>>>> (not-complete) Rich functions and integrate the context >> parameters >>>>> to >>>>>>>>>> this >>>>>>>>>>> afterwards (like improvement) >>>>>>>>>>> >>>>>>>>>>> To me, the second approach seems more incremental. However you >> are >>>>>>> right, >>>>>>>>>>> the names might confuse the users. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> [1] https://issues.apache.org/jira/browse/KAFKA-4218 >>>>>>>>>>> [2] https://issues.apache.org/jira/browse/KAFKA-4726 >>>>>>>>>>> [3] https://issues.apache.org/jira/browse/KAFKA-3745 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Cheers, >>>>>>>>>>> Jeyhun >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Fri, May 19, 2017 at 10:42 AM Damian Guy < >> damian....@gmail.com >>>> >>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> I see you've removed the `ProcessorContext` from the >> RichFunction >>>>>>> which >>>>>>>>>> is >>>>>>>>>>>> good, but why is it a `RichFunction`? I'd have expected it to >>> pass >>>>>>> some >>>>>>>>>>>> additional contextual information, i.e., the `RecordContext` >> that >>>>>>>>>> contains >>>>>>>>>>>> just the topic, partition, timestamp, offset. I'm ok with it >> not >>>>>>>>>> passing >>>>>>>>>>>> this contextual information, but is the naming incorrect? I'm >> not >>>>>>> sure, >>>>>>>>>>>> tbh. I'm wondering if we should drop `RichFunctions` until we >> can >>>>> do >>>>>>> it >>>>>>>>>>>> properly with the correct context? >>>>>>>>>>>> >>>>>>>>>>>> Also, i don't like the abstract classes: RichValueMapper, >>>>>>>>>> RichValueJoiner, >>>>>>>>>>>> RichInitializer etc. Why can't they not just be interfaces? >>>>>>> Generally we >>>>>>>>>>>> should provide users with Intefaces to code against, not >> classes. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Damian >>>>>>>>>>>> >>>>>>>>>>>> On Fri, 19 May 2017 at 00:50 Jeyhun Karimov < >>> je.kari...@gmail.com> >>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks. I initiated the PR as well, to get a better overview. >>>>>>>>>>>>> >>>>>>>>>>>>> The only reason that I used abstract class instead of >> interface >>>>> for >>>>>>>>>> Rich >>>>>>>>>>>>> functions is that in future if we will have some >>>>>>> AbstractRichFunction >>>>>>>>>>>>> abstract classes, >>>>>>>>>>>>> we can easily extend: >>>>>>>>>>>>> >>>>>>>>>>>>> public abstract class RichValueMapper<K, V, VR> implements >>>>>>>>>>>>> ValueMapperWithKey<K, V, VR>, RichFunction *extends >>>>>>>>>>>> AbstractRichFunction*{ >>>>>>>>>>>>> } >>>>>>>>>>>>> With interfaces we are only limited to interfaces for >>>>> inheritance. >>>>>>>>>>>>> >>>>>>>>>>>>> However, I think we can easily change it (from abstract class >> -> >>>>>>>>>>>> interface) >>>>>>>>>>>>> if you think interface is a better fit. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Cheers, >>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, May 19, 2017 at 1:00 AM Matthias J. Sax < >>>>>>> matth...@confluent.io >>>>>>>>>>> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for the update and explanations. The KIP is quite >>> improved >>>>>>> now >>>>>>>>>>>> -- >>>>>>>>>>>>>> great job! >>>>>>>>>>>>>> >>>>>>>>>>>>>> One more question: Why are RichValueMapper etc abstract >> classes >>>>> and >>>>>>>>>> not >>>>>>>>>>>>>> interfaces? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Overall, I like the KIP a lot! >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 5/16/17 7:03 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks for your comments. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I think supporting Lambdas for `withKey` and >>>>>>> `AbstractRichFunction` >>>>>>>>>>>>>>>> don't go together, as Lambdas are only supported for >>> interfaces >>>>>>>>>>>> AFAIK. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Maybe I misunderstood your comment. >>>>>>>>>>>>>>> *withKey* and and *withOnlyValue* are interfaces. So they >>> don't >>>>>>> have >>>>>>>>>>>>>> direct >>>>>>>>>>>>>>> relation with *AbstractRichFunction*. >>>>>>>>>>>>>>> *withKey* and and *withOnlyValue* interfaces have only one >>>>>>> method , >>>>>>>>>>>> so >>>>>>>>>>>>>> we >>>>>>>>>>>>>>> can use lambdas. >>>>>>>>>>>>>>> Where does the *AbstractRichFunction* comes to play? Inside >>> Rich >>>>>>>>>>>>>> functions. >>>>>>>>>>>>>>> And we use Rich functions in 2 places: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1. User doesn't use rich functions. Just regular *withKey* >> and >>>>> and >>>>>>>>>>>>>>> *withOnlyValue* interfaces(both support lambdas) . We get >>> those >>>>>>>>>>>>>> interfaces >>>>>>>>>>>>>>> and wrap into Rich function while building the topology, and >>>>> send >>>>>>> to >>>>>>>>>>>>>>> Processor. >>>>>>>>>>>>>>> 2. User does use rich functions (Rich functions implement >>>>>>> *withKey* >>>>>>>>>>>>>>> interface). As a result no lamdas here by definition. In >> this >>>>>>> case, >>>>>>>>>>>>> while >>>>>>>>>>>>>>> building the topology we do a type check if the object type >> is >>>>>>>>>>>>>>> *withKey* or *RichFunction*. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> So *AbstractRichFunction* is just syntactic sugar for Rich >>>>>>> functions >>>>>>>>>>>>> and >>>>>>>>>>>>>>> does not affect using lambdas. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thus, if we want to support Lambdas for `withKey`, we need >> to >>>>>>> have a >>>>>>>>>>>>>>>> interface approach like this >>>>>>>>>>>>>>>> - RichFunction -> only adding init() and close() >>>>>>>>>>>>>>>> - ValueMapper >>>>>>>>>>>>>>>> - ValueMapperWithKey >>>>>>>>>>>>>>>> - RichValueMapper extends ValueMapperWithKey, >> RichFunction >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> As I said above, currently we support lambdas for *withKey* >>>>>>>>>>>> interfaces >>>>>>>>>>>>> as >>>>>>>>>>>>>>> well. However, I agree with your idea and I will remove the >>>>>>>>>>>>>>> AbstractRichFunction from the design. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> As an alternative, we could argue, that it is sufficient to >>>>>>> support >>>>>>>>>>>>>>>> Lambdas for the "plain" API only, but not for any "extended >>>>> API". >>>>>>>>>>>> For >>>>>>>>>>>>>>>> this, RichFunction could add key+init+close and >>>>>>> AbstractRichFunction >>>>>>>>>>>>>>>> would allow to only care about getting the key. >>>>>>>>>>>>>>>> Not sure, which one is better. I don't like the idea of >> more >>>>>>>>>>>>> overloaded >>>>>>>>>>>>>>>> methods to get Lambdas for `withKey` interfaces too much >>>>> because >>>>>>> we >>>>>>>>>>>>> have >>>>>>>>>>>>>>>> already so many overlaods. On the other hand, I do see >> value >>> in >>>>>>>>>>>>>>>> supporting Lambdas for `withKey`. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Just to clarify, with current design we have only one extra >>>>>>>>>>>> overloaded >>>>>>>>>>>>>>> method per *withOnlyValue* interface: which is *withKey* >>>>> version >>>>>>> of >>>>>>>>>>>>>>> particular interface. >>>>>>>>>>>>>>> We don't need extra overload for Rich function as Rich >>> function >>>>>>>>>>>>>> implements >>>>>>>>>>>>>>> *withKey* interface as a result they have same type. We >>>>>>> differentiate >>>>>>>>>>>>>> them >>>>>>>>>>>>>>> while building the topology. >>>>>>>>>>>>>>> We supported lambdas for *withKey* APIs because of the >>> comment: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> @Jeyhun: I did not put any thought into this, but can we >> have >>> a >>>>>>>>>>>> design >>>>>>>>>>>>>>>> that allows for both? Also, with regard to lambdas, it >> might >>>>> make >>>>>>>>>>>>> sense >>>>>>>>>>>>>>>> to allow for both `V -> newV` and `(K, V) -> newV` ? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> However, I don't think that this complicates the overall >>> design >>>>>>>>>>>>>>> significantly. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Depending on what we want to support, it might make sense to >>>>>>>>>>>>>>>> include/exclude RichFunctions from this KIP -- and thus, >> this >>>>>>> also >>>>>>>>>>>>>>>> determines if we should have a "ProcessorContext KIP" >> before >>>>>>> driving >>>>>>>>>>>>>>>> this KIP further. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Based on our discussion I think we should keep Rich >> functions >>>>> as I >>>>>>>>>>>>> don't >>>>>>>>>>>>>>> think that they bring extra layer of overhead to library. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Any comments are appreciated. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, May 16, 2017 at 12:10 AM Matthias J. Sax < >>>>>>>>>>>>> matth...@confluent.io> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Jeyhun, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> thanks for the update. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I think supporting Lambdas for `withKey` and >>>>>>> `AbstractRichFunction` >>>>>>>>>>>>>>>> don't go together, as Lambdas are only supported for >>> interfaces >>>>>>>>>>>> AFAIK. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thus, if we want to support Lambdas for `withKey`, we need >> to >>>>>>> have a >>>>>>>>>>>>>>>> interface approach like this >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - RichFunction -> only adding init() and close() >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - ValueMapper >>>>>>>>>>>>>>>> - ValueMapperWithKey >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - RichValueMapper extends ValueMapperWithKey, >> RichFunction >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> For this approach, AbstractRichFunction does not make sense >>>>>>> anymore, >>>>>>>>>>>>> as >>>>>>>>>>>>>>>> the only purpose of `RichFunction` is to allow the >>>>>>> implementation of >>>>>>>>>>>>>>>> init() and close() -- if you don't want those, you would >>>>>>> implement a >>>>>>>>>>>>>>>> different interface (ie, ValueMapperWithKey) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> As an alternative, we could argue, that it is sufficient to >>>>>>> support >>>>>>>>>>>>>>>> Lambdas for the "plain" API only, but not for any "extended >>>>> API". >>>>>>>>>>>> For >>>>>>>>>>>>>>>> this, RichFunction could add key+init+close and >>>>>>> AbstractRichFunction >>>>>>>>>>>>>>>> would allow to only care about getting the key. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Not sure, which one is better. I don't like the idea of >> more >>>>>>>>>>>>> overloaded >>>>>>>>>>>>>>>> methods to get Lambdas for `withKey` interfaces too much >>>>> because >>>>>>> we >>>>>>>>>>>>> have >>>>>>>>>>>>>>>> already so many overlaods. On the other hand, I do see >> value >>> in >>>>>>>>>>>>>>>> supporting Lambdas for `withKey`. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Depending on what we want to support, it might make sense >> to >>>>>>>>>>>>>>>> include/exclude RichFunctions from this KIP -- and thus, >> this >>>>>>> also >>>>>>>>>>>>>>>> determines if we should have a "ProcessorContext KIP" >> before >>>>>>> driving >>>>>>>>>>>>>>>> this KIP further. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thoughts? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 5/15/17 11:01 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Sorry for super late response. Thanks for your comments. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a little >>>>> bit? I >>>>>>>>>>>>> cannot >>>>>>>>>>>>>>>>>> follow the explanation in the KIP to see what the problem >>> is. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - From [1] says "A functional interface is an interface >> that >>>>> has >>>>>>>>>>>> just >>>>>>>>>>>>>> one >>>>>>>>>>>>>>>>> abstract method, and thus represents a single function >>>>>>> contract". >>>>>>>>>>>>>>>>> So basically once we extend some interface from another >> (in >>>>> our >>>>>>>>>>>> case, >>>>>>>>>>>>>>>>> ValueMapperWithKey from ValueMapper) we cannot use lambdas >>> in >>>>>>> the >>>>>>>>>>>>>>>> extended >>>>>>>>>>>>>>>>> interface. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Further comments: >>>>>>>>>>>>>>>>>> - The KIP get a little hard to read -- can you maybe >>>>> reformat >>>>>>> the >>>>>>>>>>>>>> wiki >>>>>>>>>>>>>>>>>> page a little bit? I think using `CodeBlock` would help. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - I will work on the KIP. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - What about KStream-KTable joins? You don't have >> overlaods >>>>>>> added >>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>> them. Why? (Even if I still hope that we don't need to >> add >>>>> any >>>>>>> new >>>>>>>>>>>>>>>>>> overloads) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - Actually there are more than one Processor and public >> APIs >>>>> to >>>>>>> be >>>>>>>>>>>>>>>>> changed (KStream-KTable >>>>>>>>>>>>>>>>> joins is one case). However all of them has similar >>> structure: >>>>>>> we >>>>>>>>>>>>>>>> overload >>>>>>>>>>>>>>>>> the *method* with *methodWithKey*, >>>>>>>>>>>>>>>>> wrap it into the Rich function, send to processor and >> inside >>>>> the >>>>>>>>>>>>>>>> processor >>>>>>>>>>>>>>>>> call *init* and *close* methods of the Rich function. >>>>>>>>>>>>>>>>> As I wrote in KIP, I wanted to demonstrate the overall >> idea >>>>> with >>>>>>>>>>>> only >>>>>>>>>>>>>>>>> *ValueMapper* as the same can be applied to all changes. >>>>>>>>>>>>>>>>> Anyway I will update the KIP. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - Why do we need `AbstractRichFunction`? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Instead of overriding the *init(ProcessorContext p)* and* >>>>>>> close()* >>>>>>>>>>>>>>>> methods >>>>>>>>>>>>>>>>> in every Rich function with empty body like: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> @Override >>>>>>>>>>>>>>>>> void init(ProcessorContext context) {} >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> @Override >>>>>>>>>>>>>>>>> void close () {} >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I thought that we can override them once in >>>>>>> *AbstractRichFunction* >>>>>>>>>>>>> and >>>>>>>>>>>>>>>>> extent new Rich functions from *AbstractRichFunction*. >>>>>>>>>>>>>>>>> Basically this can eliminate code copy-paste and ease the >>>>>>>>>>>>> maintenance. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - What about interfaces Initializer, ForeachAction, >> Merger, >>>>>>>>>>>>> Predicate, >>>>>>>>>>>>>>>>>> Reducer? I don't want to say we should/need to add to >> all, >>>>> but >>>>>>> we >>>>>>>>>>>>>> should >>>>>>>>>>>>>>>>>> discuss all of them and add where it does make sense >> (e.g., >>>>>>>>>>>>>>>>>> RichForachAction does make sense IMHO) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Definitely agree. As I said, the same technique applies to >>> all >>>>>>> this >>>>>>>>>>>>>>>>> interfaces and I didn't want to explode the KIP, just >> wanted >>>>> to >>>>>>>>>>>> give >>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>> overall intuition. >>>>>>>>>>>>>>>>> However, I will update the KIP as I said. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- >>>>>>>>>>>>>> `RichValueXX` >>>>>>>>>>>>>>>>>> in general -- but why can't we do all this with >> interfaces >>>>>>> only? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Sure we can. However the main intuition is we should not >>> force >>>>>>>>>>>> users >>>>>>>>>>>>> to >>>>>>>>>>>>>>>>> implement *init(ProcessorContext)* and *close()* functions >>>>> every >>>>>>>>>>>> time >>>>>>>>>>>>>>>> they >>>>>>>>>>>>>>>>> use Rich functions. >>>>>>>>>>>>>>>>> If one needs, she can override the respective methods. >>>>> However, >>>>>>> I >>>>>>>>>>>> am >>>>>>>>>>>>>> open >>>>>>>>>>>>>>>>> for discussion. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I'd rather not see the use of `ProcessorContext` spread >> any >>>>>>>>>>>> further >>>>>>>>>>>>>> than >>>>>>>>>>>>>>>>>> it currently is. So maybe we need another KIP that is >> done >>>>>>> before >>>>>>>>>>>>>> this? >>>>>>>>>>>>>>>>>> Otherwise i think the scope of this KIP is becoming too >>>>> large. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> That is good point. I wanted to make >>> *init(ProcessorContext)* >>>>>>>>>>>> method >>>>>>>>>>>>>>>>> persistent among the library (which use ProcessorContext >> as >>> an >>>>>>>>>>>>> input), >>>>>>>>>>>>>>>>> therefore I put *ProcessorContext* as an input. >>>>>>>>>>>>>>>>> So the important question is that (as @dguy and @mjsax >>>>>>> mentioned) >>>>>>>>>>>>>> whether >>>>>>>>>>>>>>>>> continue this KIP without providing users an access to >>>>>>>>>>>>>> *ProcessorContext* >>>>>>>>>>>>>>>>> (change *init (ProcessorContext)* to * init()* ) or >>>>>>>>>>>>>>>>> initiate another KIP before this. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>> >>> http://cr.openjdk.java.net/~mr/se/8/java-se-8-pfd-spec/java- >>>>>>>>>>>>>>>> se-8-jls-pfd-diffs.pdf >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Mon, May 15, 2017 at 7:15 PM, Damian Guy < >>>>>>> damian....@gmail.com> >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I'd rather not see the use of `ProcessorContext` spread >>> any >>>>>>>>>>>> further >>>>>>>>>>>>>>>> than >>>>>>>>>>>>>>>>>> it currently is. So maybe we need another KIP that is >> done >>>>>>> before >>>>>>>>>>>>>> this? >>>>>>>>>>>>>>>>>> Otherwise i think the scope of this KIP is becoming too >>>>> large. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Mon, 15 May 2017 at 18:06 Matthias J. Sax < >>>>>>>>>>>> matth...@confluent.io >>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I agree that that `ProcessorContext` interface is too >>> broad >>>>> in >>>>>>>>>>>>>> general >>>>>>>>>>>>>>>>>>> -- this is even true for transform/process, and it's >> also >>>>>>>>>>>> reflected >>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>> the API improvement list we want to do. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/ >>>>>>>>>>>>>>>>>> Kafka+Streams+Discussions >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> So I am wondering, if you question the `RichFunction` >>>>>>> approach in >>>>>>>>>>>>>>>>>>> general? Or if you suggest to either extend the scope of >>>>> this >>>>>>> KIP >>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>> include this---or maybe better, do another KIP for it >> and >>>>>>> delay >>>>>>>>>>>>> this >>>>>>>>>>>>>>>> KIP >>>>>>>>>>>>>>>>>>> until the other one is done? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 5/15/17 2:35 AM, Damian Guy wrote: >>>>>>>>>>>>>>>>>>>> Thanks for the KIP. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I'm not convinced on the `RichFunction` approach. Do we >>>>>>> really >>>>>>>>>>>>> want >>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>> give >>>>>>>>>>>>>>>>>>>> every DSL method access to the `ProcessorContext` ? It >>> has >>>>> a >>>>>>>>>>>> bunch >>>>>>>>>>>>>> of >>>>>>>>>>>>>>>>>>>> methods on it that seem in-appropriate for some of the >>> DSL >>>>>>>>>>>>> methods, >>>>>>>>>>>>>>>>>> i.e, >>>>>>>>>>>>>>>>>>>> `register`, `getStateStore`, `forward`, `schedule` etc. >>> It >>>>> is >>>>>>>>>>>> far >>>>>>>>>>>>>> too >>>>>>>>>>>>>>>>>>>> broad. I think it would be better to have a narrower >>>>>>> interface >>>>>>>>>>>>> like >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>> `RecordContext` - remembering it is easier to add >>>>>>>>>>>>>> methods/interfaces >>>>>>>>>>>>>>>>>>> later >>>>>>>>>>>>>>>>>>>> than to remove them >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Sat, 13 May 2017 at 22:26 Matthias J. Sax < >>>>>>>>>>>>> matth...@confluent.io >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Jeyhun, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a >>> little >>>>>>> bit? >>>>>>>>>>>> I >>>>>>>>>>>>>>>>>> cannot >>>>>>>>>>>>>>>>>>>>> follow the explanation in the KIP to see what the >>> problem >>>>>>> is. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> For updating the KIP title I don't know -- guess it's >> up >>>>> to >>>>>>>>>>>> you. >>>>>>>>>>>>>>>>>> Maybe a >>>>>>>>>>>>>>>>>>>>> committer can comment on this? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Further comments: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> - The KIP get a little hard to read -- can you maybe >>>>>>> reformat >>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> wiki >>>>>>>>>>>>>>>>>>>>> page a little bit? I think using `CodeBlock` would >> help. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> - What about KStream-KTable joins? You don't have >>>>> overlaods >>>>>>>>>>>>> added >>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>>> them. Why? (Even if I still hope that we don't need to >>> add >>>>>>> any >>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>>>>> overloads) >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> - Why do we need `AbstractRichFunction`? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> - What about interfaces Initializer, ForeachAction, >>>>> Merger, >>>>>>>>>>>>>>>>>> Predicate, >>>>>>>>>>>>>>>>>>>>> Reducer? I don't want to say we should/need to add to >>> all, >>>>>>> but >>>>>>>>>>>> we >>>>>>>>>>>>>>>>>> should >>>>>>>>>>>>>>>>>>>>> discuss all of them and add where it does make sense >>>>> (e.g., >>>>>>>>>>>>>>>>>>>>> RichForachAction does make sense IMHO) >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- >> `ValueXXWithKey` >>> -- >>>>>>>>>>>>>>>>>> `RichValueXX` >>>>>>>>>>>>>>>>>>>>> in general -- but why can't we do all this with >>> interfaces >>>>>>>>>>>> only? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 5/11/17 7:00 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks for your comments. I think we cannot extend >> the >>>>> two >>>>>>>>>>>>>>>> interfaces >>>>>>>>>>>>>>>>>>> if >>>>>>>>>>>>>>>>>>>>> we >>>>>>>>>>>>>>>>>>>>>> want to keep lambdas. I updated the KIP [1]. Maybe I >>>>> should >>>>>>>>>>>>> change >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>> title, because now we are not limiting the KIP to >> only >>>>>>>>>>>>>> ValueMapper, >>>>>>>>>>>>>>>>>>>>>> ValueTransformer and ValueJoiner. >>>>>>>>>>>>>>>>>>>>>> Please feel free to comment. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >>>>>>>>>>>>>>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ >>>>>>>>>>>>>>>>>> ValueMapper%2C+and+ValueJoiner >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Tue, May 9, 2017 at 7:36 PM Matthias J. Sax < >>>>>>>>>>>>>>>>>> matth...@confluent.io> >>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> If `ValueMapperWithKey` extends `ValueMapper` we >> don't >>>>>>> need >>>>>>>>>>>> the >>>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>>>>>>> overlaod. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> And yes, we need to do one check -- but this happens >>>>> when >>>>>>>>>>>>>> building >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> topology. At runtime (I mean after >> KafkaStream#start() >>>>> we >>>>>>>>>>>> don't >>>>>>>>>>>>>>>> need >>>>>>>>>>>>>>>>>>> any >>>>>>>>>>>>>>>>>>>>>>> check as we will always use `ValueMapperWithKey`) >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 5/9/17 2:55 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Thanks for feedback. >>>>>>>>>>>>>>>>>>>>>>>> Then we need to overload method >>>>>>>>>>>>>>>>>>>>>>>> <VR> KStream<K, VR> mapValues(ValueMapper<? super >>> V, >>>>> ? >>>>>>>>>>>>> extends >>>>>>>>>>>>>>>>>> VR> >>>>>>>>>>>>>>>>>>>>>>>> mapper); >>>>>>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>>>>>> <VR> KStream<K, VR> >> mapValues(ValueMapperWithKey<? >>>>>>> super >>>>>>>>>>>> V, >>>>>>>>>>>>> ? >>>>>>>>>>>>>>>>>>> extends >>>>>>>>>>>>>>>>>>>>>>> VR> >>>>>>>>>>>>>>>>>>>>>>>> mapper); >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> and in runtime (inside processor) we still have to >>>>> check >>>>>>> it >>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>>>> ValueMapper >>>>>>>>>>>>>>>>>>>>>>>> or ValueMapperWithKey before wrapping it into the >>> rich >>>>>>>>>>>>> function. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Please correct me if I am wrong. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Tue, May 9, 2017 at 10:56 AM Michal Borowiecki < >>>>>>>>>>>>>>>>>>>>>>>> michal.borowie...@openbet.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> +1 :) >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On 08/05/17 23:52, Matthias J. Sax wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> I was reading the updated KIP and I am wondering, >>> if >>>>> we >>>>>>>>>>>>> should >>>>>>>>>>>>>>>> do >>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>> design a little different. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Instead of distinguishing between a RichFunction >>> and >>>>>>>>>>>>>>>>>>> non-RichFunction >>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>>>>> runtime level, we would use RichFunctions all the >>>>> time. >>>>>>>>>>>>> Thus, >>>>>>>>>>>>>> on >>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> DSL >>>>>>>>>>>>>>>>>>>>>>>>>> entry level, if a user provides a >> non-RichFunction, >>>>> we >>>>>>>>>>>> wrap >>>>>>>>>>>>> it >>>>>>>>>>>>>>>>>> by a >>>>>>>>>>>>>>>>>>>>>>>>>> RichFunction that is fully implemented by >> Streams. >>>>> This >>>>>>>>>>>>>>>>>>> RichFunction >>>>>>>>>>>>>>>>>>>>>>>>>> would just forward the call omitting the key: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Just to sketch the idea (incomplete code >> snippet): >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> public StreamsRichValueMapper implements >>>>>>>>>>>> RichValueMapper() >>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>>>>>>>>>> private ValueMapper userProvidedMapper; // >> set >>> by >>>>>>>>>>>>>>>> constructor >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> public VR apply(final K key, final V1 value1, >>>>>>> final V2 >>>>>>>>>>>>>>>>>> value2) >>>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>>>>>>>>>> return userProvidedMapper(value1, >> value2); >>>>>>>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> From a performance point of view, I am not sure >> if >>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>> "if(isRichFunction)" including casts etc would >> have >>>>>>> more >>>>>>>>>>>>>>>> overhead >>>>>>>>>>>>>>>>>>>>> than >>>>>>>>>>>>>>>>>>>>>>>>>> this approach (we would do more nested method >> call >>>>> for >>>>>>>>>>>>>>>>>>>>> non-RichFunction >>>>>>>>>>>>>>>>>>>>>>>>>> which should be more common than RichFunctions). >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> This approach should not effect lambdas (or do I >>> miss >>>>>>>>>>>>>>>> something?) >>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>>>>> might be cleaner, as we could have one more top >>> level >>>>>>>>>>>>>> interface >>>>>>>>>>>>>>>>>>>>>>>>>> `RichFunction` with methods `init()` and >> `close()` >>>>> and >>>>>>>>>>>> also >>>>>>>>>>>>>>>>>>>>> interfaces >>>>>>>>>>>>>>>>>>>>>>>>>> for `RichValueMapper` etc. (thus, no abstract >>> classes >>>>>>>>>>>>>> required). >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Any thoughts? >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On 5/6/17 5:29 PM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for comments. I extended PR and KIP to >>>>> include >>>>>>>>>>>> rich >>>>>>>>>>>>>>>>>>>>> functions. >>>>>>>>>>>>>>>>>>>>>>> I >>>>>>>>>>>>>>>>>>>>>>>>>>> will still have to evaluate the cost of deep >>> copying >>>>>>> of >>>>>>>>>>>>> keys. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 8:02 PM Mathieu Fenniak < >>>>>>>>>>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> >>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Hey Matthias, >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> My opinion would be that documenting the >>>>>>> immutability of >>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> key >>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>> best approach available. Better than requiring >>> the >>>>>>> key >>>>>>>>>>>> to >>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>> serializable >>>>>>>>>>>>>>>>>>>>>>>>>>>> (as with Jeyhun's last pass at the PR), no >>>>>>> performance >>>>>>>>>>>>> risk. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> It'd be different if Java had immutable type >>>>>>> constraints >>>>>>>>>>>>> of >>>>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>>>>>>> kind. >>>>>>>>>>>>>>>>>>>>>>>>> :-) >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 11:31 AM, Matthias J. >> Sax >>> < >>>>>>>>>>>>>>>>>>>>>>>>> matth...@confluent.io> >>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Agreed about RichFunction. If we follow this >>> path, >>>>>>> it >>>>>>>>>>>>>> should >>>>>>>>>>>>>>>>>>> cover >>>>>>>>>>>>>>>>>>>>>>>>>>>>> all(?) DSL interfaces. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> About guarding the key -- I am still not sure >>> what >>>>>>> to >>>>>>>>>>>> do >>>>>>>>>>>>>>>> about >>>>>>>>>>>>>>>>>>>>> it... >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Maybe it might be enough to document it (and >>> name >>>>>>> the >>>>>>>>>>>> key >>>>>>>>>>>>>>>>>>>>> parameter >>>>>>>>>>>>>>>>>>>>>>>>> like >>>>>>>>>>>>>>>>>>>>>>>>>>>>> `readOnlyKey` to make is very clear). >>> Ultimately, >>>>> I >>>>>>>>>>>> would >>>>>>>>>>>>>>>>>> prefer >>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>>>> guard against any modification, but I have no >>> good >>>>>>> idea >>>>>>>>>>>>> how >>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>> do >>>>>>>>>>>>>>>>>>>>>>>>> this. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Not sure what others think about the risk of >>>>>>> corrupted >>>>>>>>>>>>>>>>>>>>> partitioning >>>>>>>>>>>>>>>>>>>>>>>>>>>>> (what would be a user error and we could say, >>>>> well, >>>>>>> bad >>>>>>>>>>>>>> luck, >>>>>>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>>>>>>> got >>>>>>>>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>>>>>>> bug in your code, that's not our fault), vs >> deep >>>>>>> copy >>>>>>>>>>>>> with >>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>> potential >>>>>>>>>>>>>>>>>>>>>>>>>>>>> performance hit (that we can't quantity atm >>>>> without >>>>>>> any >>>>>>>>>>>>>>>>>>>>> performance >>>>>>>>>>>>>>>>>>>>>>>>>>>> test). >>>>>>>>>>>>>>>>>>>>>>>>>>>>> We do have a performance system test. Maybe >> it's >>>>>>> worth >>>>>>>>>>>>> for >>>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>> apply >>>>>>>>>>>>>>>>>>>>>>>>>>>>> the deep copy strategy and run the test. It's >>> very >>>>>>>>>>>> basic >>>>>>>>>>>>>>>>>>>>> performance >>>>>>>>>>>>>>>>>>>>>>>>>>>>> test only, but might give some insight. If you >>>>> want >>>>>>> to >>>>>>>>>>>> do >>>>>>>>>>>>>>>>>> this, >>>>>>>>>>>>>>>>>>>>> look >>>>>>>>>>>>>>>>>>>>>>>>>>>>> into folder "tests" for general test setup, >> and >>>>> into >>>>>>>>>>>>>>>>>>>>>>>>>>>>> "tests/kafaktests/benchmarks/streams" to find >>> find >>>>>>> the >>>>>>>>>>>>> perf >>>>>>>>>>>>>>>>>>> test. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 5/5/17 8:55 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Matthias, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think extending KIP to include >> RichFunctions >>>>>>> totally >>>>>>>>>>>>>>>> makes >>>>>>>>>>>>>>>>>>>>>>> sense. >>>>>>>>>>>>>>>>>>>>>>>>>>>> So, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> we don't want to guard the keys because it >> is >>>>>>>>>>>> costly. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we introduce RichFunctions I think it >> should >>>>>>> not be >>>>>>>>>>>>>>>>>> limited >>>>>>>>>>>>>>>>>>>>> only >>>>>>>>>>>>>>>>>>>>>>>>>>>> the 3 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces proposed in KIP but to wide range >> of >>>>>>>>>>>>>> interfaces. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please correct me if I am wrong. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 12:04 AM Matthias J. >>> Sax < >>>>>>>>>>>>>>>>>>>>>>>>> matth...@confluent.io >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> One follow up. There was this email on the >>> user >>>>>>> list: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> http://search-hadoop.com/m/Kafka/uyzND17KhCaBzPSZ1?subj= >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> Shouldn+t+the+initializer+of+a+stream+aggregate+accept+the+ >>>>>>>>>>>>>>>>>> key+ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It might make sense so include Initializer, >>>>> Adder, >>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>> Substractor >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> inferface, too. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> And we should double check if there are >> other >>>>>>>>>>>> interface >>>>>>>>>>>>>> we >>>>>>>>>>>>>>>>>>> might >>>>>>>>>>>>>>>>>>>>>>>>> miss >>>>>>>>>>>>>>>>>>>>>>>>>>>>> atm. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 5/4/17 1:31 PM, Matthias J. Sax wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for updating the KIP. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Deep copying the key will work for sure, >> but >>> I >>>>> am >>>>>>>>>>>>>> actually >>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>> little >>>>>>>>>>>>>>>>>>>>>>>>>>>> bit >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> worried about performance impact... We >> might >>>>>>> want to >>>>>>>>>>>>> do >>>>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>>>>> test >>>>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> quantify this impact. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Btw: this remind me about the idea of >>>>>>> `RichFunction` >>>>>>>>>>>>>>>>>>> interface >>>>>>>>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> would allow users to access record metadata >>>>> (like >>>>>>>>>>>>>>>>>> timestamp, >>>>>>>>>>>>>>>>>>>>>>>>> offset, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> partition etc) within DSL. This would be a >>>>>>> similar >>>>>>>>>>>>>>>> concept. >>>>>>>>>>>>>>>>>>>>>>> Thus, I >>>>>>>>>>>>>>>>>>>>>>>>>>>> am >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wondering, if it would make sense to >> enlarge >>>>> the >>>>>>>>>>>> scope >>>>>>>>>>>>>> of >>>>>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>>>>>>> KIP >>>>>>>>>>>>>>>>>>>>>>>>> by >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that? WDYT? >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 5/3/17 2:08 AM, Jeyhun Karimov wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Mathieu, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for feedback. I followed similar >>>>> approach >>>>>>>>>>>> and >>>>>>>>>>>>>>>>>> updated >>>>>>>>>>>>>>>>>>>>> PR >>>>>>>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>>>>>>>> KIP >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> accordingly. I tried to guard the key in >>>>>>> Processors >>>>>>>>>>>>>>>>>> sending >>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>> copy >>>>>>>>>>>>>>>>>>>>>>>>>>>> of >>>>>>>>>>>>>>>>>>>>>>>>>>>>> an >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> actual key. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Because I am doing deep copy of an >> object, I >>>>>>> think >>>>>>>>>>>>>> memory >>>>>>>>>>>>>>>>>>> can >>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> bottleneck >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> in some use-cases. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, May 2, 2017 at 5:10 PM Mathieu >>>>> Fenniak < >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> mathieu.fenn...@replicon.com> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This approach would change ValueMapper >>>>>>> (...etc) to >>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>> classes, >>>>>>>>>>>>>>>>>>>>>>>>>>>> rather >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> than >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, which is also a backwards >>>>>>> incompatible >>>>>>>>>>>>>>>>>> change. >>>>>>>>>>>>>>>>>>>>> An >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> alternative >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> approach that would be backwards >> compatible >>>>>>> would >>>>>>>>>>>> be >>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>> define >>>>>>>>>>>>>>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> interfaces, and provide overrides where >>> those >>>>>>>>>>>>>> interfaces >>>>>>>>>>>>>>>>>>> are >>>>>>>>>>>>>>>>>>>>>>>>> used. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Unfortunately, making the key parameter >> as >>>>>>> "final" >>>>>>>>>>>>>>>>>> doesn't >>>>>>>>>>>>>>>>>>>>>>> change >>>>>>>>>>>>>>>>>>>>>>>>>>>>> much >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about guarding against key change. It >> only >>>>>>>>>>>> prevents >>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>> parameter >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> variable >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> from being reassigned. If the key type >> is >>> a >>>>>>>>>>>> mutable >>>>>>>>>>>>>>>>>> object >>>>>>>>>>>>>>>>>>>>>>> (eg. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> byte[]), >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it can still be mutated. (eg. key[0] = >> 0). >>>>> But >>>>>>>>>>>> I'm >>>>>>>>>>>>>> not >>>>>>>>>>>>>>>>>>>>> really >>>>>>>>>>>>>>>>>>>>>>>>> sure >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> there's >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> much that can be done about that. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 5:39 PM, Jeyhun >>>>> Karimov >>>>>>> < >>>>>>>>>>>>>>>>>>>>>>>>>>>> je.kari...@gmail.com >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for comments. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The concerns makes sense. Although we >> can >>>>>>> guard >>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>>> immutable >>>>>>>>>>>>>>>>>>>>>>>>> keys >>>>>>>>>>>>>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> current implementation (with few >>> changes), I >>>>>>>>>>>> didn't >>>>>>>>>>>>>>>>>>> consider >>>>>>>>>>>>>>>>>>>>>>>>>>>>> backward >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> compatibility. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this case 2 solutions come to my >> mind. >>> In >>>>>>> both >>>>>>>>>>>>>>>> cases, >>>>>>>>>>>>>>>>>>>>> user >>>>>>>>>>>>>>>>>>>>>>>>>>>>> accesses >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> key in Object type, as passing extra >> type >>>>>>>>>>>> parameter >>>>>>>>>>>>>>>> will >>>>>>>>>>>>>>>>>>>>> break >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> backwards-compatibility. So user has >> to >>>>> cast >>>>>>> to >>>>>>>>>>>>>> actual >>>>>>>>>>>>>>>>>>> key >>>>>>>>>>>>>>>>>>>>>>>>> type. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >> >> -- >> -Cheers >> >> Jeyhun >> > > >
signature.asc
Description: OpenPGP digital signature