Michal, thanks a lot for this comment. I did not consider lambdas when proposing RichFunctions. I thinks it very important to preserve the ability to use lambdas!
@Jeyhun: I did not put any thought into this, but can we have a design that allows for both? Also, with regard to lambdas, it might make sense to allow for both `V -> newV` and `(K, V) -> newV` ? -Matthias On 5/7/17 5:34 AM, Michal Borowiecki wrote: > Hi Jeyhun, > > Thanks for your quick reply. > > Indeed, I understand the existing ValueMapper/Joiner etc. have to remain > as-is for backwards compatibility. > > I was just expressing my surprise that their proposed richer equivalents > weren't functional interfaces too. > > Thanks, > > MichaĆ > > > On 07/05/17 12:32, Jeyhun Karimov wrote: >> Hi Michal, >> >> Thanks for your comments. We proposed the similar solution to yours in >> KIP (please look at rejected alternatives). However, after the >> discussion in mailing list I extended it to rich functions. Maybe we >> should keep them both: simple and rich versions. >> >> Cheers, >> Jeyhun >> >> On Sun, May 7, 2017 at 11:46 AM Michal Borowiecki >> <michal.borowie...@openbet.com <mailto:michal.borowie...@openbet.com>> >> wrote: >> >> Do I understanding correctly, that with the proposed pattern one >> could not pass a lambda expression and access the context from >> within it? >> >> TBH, I was hoping that for simple things this would be possible: >> >> myStream.map( (key, value, ctx) -> new KeyValue<>(ctx.partition(), >> value) ) >> >> or (more to the original point of this KIP): >> >> myStream.mapValues( (key, value, ctx) -> new MyValueWrapper(value, >> ctx.partition(), ctx.offset()) ) >> >> but it looks like that would require subclassing RichFunction? >> That's a bit of an inconvenience IMO. >> >> Cheers, >> >> Michal >> >> >> On 07/05/17 01:29, Jeyhun Karimov wrote: >>> Hi, >>> >>> Thanks for comments. I extended PR and KIP to include rich functions. I >>> will still have to evaluate the cost of deep copying of keys. >>> >>> Cheers, >>> Jeyhun >>> >>> On Fri, May 5, 2017 at 8:02 PM Mathieu Fenniak >>> <mathieu.fenn...@replicon.com> <mailto:mathieu.fenn...@replicon.com> >>> wrote: >>> >>>> Hey Matthias, >>>> >>>> My opinion would be that documenting the immutability of the key is the >>>> best approach available. Better than requiring the key to be >>>> serializable >>>> (as with Jeyhun's last pass at the PR), no performance risk. >>>> >>>> It'd be different if Java had immutable type constraints of some kind. >>>> :-) >>>> >>>> Mathieu >>>> >>>> >>>> On Fri, May 5, 2017 at 11:31 AM, Matthias J. Sax >>>> <matth...@confluent.io> <mailto:matth...@confluent.io> >>>> wrote: >>>> >>>>> Agreed about RichFunction. If we follow this path, it should cover >>>>> all(?) DSL interfaces. >>>>> >>>>> About guarding the key -- I am still not sure what to do about it... >>>>> Maybe it might be enough to document it (and name the key parameter >>>>> like >>>>> `readOnlyKey` to make is very clear). Ultimately, I would prefer to >>>>> guard against any modification, but I have no good idea how to do >>>>> this. >>>>> Not sure what others think about the risk of corrupted partitioning >>>>> (what would be a user error and we could say, well, bad luck, you got >>>>> a >>>>> bug in your code, that's not our fault), vs deep copy with a potential >>>>> performance hit (that we can't quantity atm without any performance >>>> test). >>>>> We do have a performance system test. Maybe it's worth for you to >>>>> apply >>>>> the deep copy strategy and run the test. It's very basic performance >>>>> test only, but might give some insight. If you want to do this, look >>>>> into folder "tests" for general test setup, and into >>>>> "tests/kafaktests/benchmarks/streams" to find find the perf test. >>>>> >>>>> >>>>> -Matthias >>>>> >>>>> On 5/5/17 8:55 AM, Jeyhun Karimov wrote: >>>>>> Hi Matthias, >>>>>> >>>>>> I think extending KIP to include RichFunctions totally makes sense. >>>> So, >>>>>> we don't want to guard the keys because it is costly. >>>>>> If we introduce RichFunctions I think it should not be limited only >>>> the 3 >>>>>> interfaces proposed in KIP but to wide range of interfaces. >>>>>> Please correct me if I am wrong. >>>>>> >>>>>> Cheers, >>>>>> Jeyhun >>>>>> >>>>>> On Fri, May 5, 2017 at 12:04 AM Matthias J. Sax >>>>>> <matth...@confluent.io <mailto:matth...@confluent.io> >>>>>> wrote: >>>>>> >>>>>>> One follow up. There was this email on the user list: >>>>>>> >>>>>>> >>>>>>> http://search-hadoop.com/m/Kafka/uyzND17KhCaBzPSZ1?subj= >>>>> Shouldn+t+the+initializer+of+a+stream+aggregate+accept+the+key+ >>>>>>> It might make sense so include Initializer, Adder, and Substractor >>>>>>> inferface, too. >>>>>>> >>>>>>> And we should double check if there are other interface we might >>>>>>> miss >>>>> atm. >>>>>>> -Matthias >>>>>>> >>>>>>> >>>>>>> On 5/4/17 1:31 PM, Matthias J. Sax wrote: >>>>>>>> Thanks for updating the KIP. >>>>>>>> >>>>>>>> Deep copying the key will work for sure, but I am actually a little >>>> bit >>>>>>>> worried about performance impact... We might want to do some test >>>>>>>> to >>>>>>>> quantify this impact. >>>>>>>> >>>>>>>> >>>>>>>> Btw: this remind me about the idea of `RichFunction` interface that >>>>>>>> would allow users to access record metadata (like timestamp, >>>>>>>> offset, >>>>>>>> partition etc) within DSL. This would be a similar concept. Thus, I >>>> am >>>>>>>> wondering, if it would make sense to enlarge the scope of this KIP >>>>>>>> by >>>>>>>> that? WDYT? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -Matthias >>>>>>>> >>>>>>>> >>>>>>>> On 5/3/17 2:08 AM, Jeyhun Karimov wrote: >>>>>>>>> Hi Mathieu, >>>>>>>>> >>>>>>>>> Thanks for feedback. I followed similar approach and updated PR >>>>>>>>> and >>>>> KIP >>>>>>>>> accordingly. I tried to guard the key in Processors sending a copy >>>> of >>>>> an >>>>>>>>> actual key. >>>>>>>>> Because I am doing deep copy of an object, I think memory can be >>>>>>> bottleneck >>>>>>>>> in some use-cases. >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> Jeyhun >>>>>>>>> >>>>>>>>> On Tue, May 2, 2017 at 5:10 PM Mathieu Fenniak < >>>>>>> mathieu.fenn...@replicon.com >>>>>>> <mailto:mathieu.fenn...@replicon.com>> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi Jeyhun, >>>>>>>>>> >>>>>>>>>> This approach would change ValueMapper (...etc) to be classes, >>>> rather >>>>>>> than >>>>>>>>>> interfaces, which is also a backwards incompatible change. An >>>>>>> alternative >>>>>>>>>> approach that would be backwards compatible would be to define >>>>>>>>>> new >>>>>>>>>> interfaces, and provide overrides where those interfaces are >>>>>>>>>> used. >>>>>>>>>> >>>>>>>>>> Unfortunately, making the key parameter as "final" doesn't change >>>>> much >>>>>>>>>> about guarding against key change. It only prevents the >>>>>>>>>> parameter >>>>>>> variable >>>>>>>>>> from being reassigned. If the key type is a mutable object (eg. >>>>>>> byte[]), >>>>>>>>>> it can still be mutated. (eg. key[0] = 0). But I'm not really >>>>>>>>>> sure >>>>>>> there's >>>>>>>>>> much that can be done about that. >>>>>>>>>> >>>>>>>>>> Mathieu >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Mon, May 1, 2017 at 5:39 PM, Jeyhun Karimov < >>>> je.kari...@gmail.com <mailto:je.kari...@gmail.com> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Thanks for comments. >>>>>>>>>>> >>>>>>>>>>> The concerns makes sense. Although we can guard for immutable >>>>>>>>>>> keys >>>>> in >>>>>>>>>>> current implementation (with few changes), I didn't consider >>>>> backward >>>>>>>>>>> compatibility. >>>>>>>>>>> >>>>>>>>>>> In this case 2 solutions come to my mind. In both cases, user >>>>> accesses >>>>>>>>>> the >>>>>>>>>>> key in Object type, as passing extra type parameter will break >>>>>>>>>>> backwards-compatibility. So user has to cast to actual key >>>>>>>>>>> type. >>>>>>>>>>> >>>>>>>>>>> 1. Firstly, We can overload apply method with 2 argument (key >>>>>>>>>>> and >>>>>>> value) >>>>>>>>>>> and force key to be *final*. By doing this, I think we can >>>> address >>>>>>> both >>>>>>>>>>> backward-compatibility and guarding against key change. >>>>>>>>>>> >>>>>>>>>>> 2. Secondly, we can create class KeyAccess like: >>>>>>>>>>> >>>>>>>>>>> public class KeyAccess { >>>>>>>>>>> Object key; >>>>>>>>>>> public void beforeApply(final Object key) { >>>>>>>>>>> this.key = key; >>>>>>>>>>> } >>>>>>>>>>> public Object getKey() { >>>>>>>>>>> return key; >>>>>>>>>>> } >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> We can extend *ValueMapper, ValueJoiner* and *ValueTransformer* >>>> from >>>>>>>>>>> *KeyAccess*. Inside processor (for example >>>>> *KTableMapValuesProcessor*) >>>>>>>>>>> before calling *mapper.apply(value)* we can set the *key* by >>>>>>>>>>> *mapper.beforeApply(key)*. As a result, user can use *getKey()* >>>>>>>>>>> to >>>>>>> access >>>>>>>>>>> the key inside *apply(value)* method. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Cheers, >>>>>>>>>>> Jeyhun >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Mon, May 1, 2017 at 7:24 PM Matthias J. Sax < >>>>> matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Jeyhun, >>>>>>>>>>>> >>>>>>>>>>>> thanks a lot for the KIP! >>>>>>>>>>>> >>>>>>>>>>>> I think there are two issues we need to address: >>>>>>>>>>>> >>>>>>>>>>>> (1) The KIP does not consider backward compatibility. Users did >>>>>>>>>> complain >>>>>>>>>>>> about this in past releases already, and as the user base >>>>>>>>>>>> grows, >>>> we >>>>>>>>>>>> should not break backward compatibility in future releases >>>> anymore. >>>>>>>>>>>> Thus, we should think of a better way to allow key access. >>>>>>>>>>>> >>>>>>>>>>>> Mathieu's comment goes into the same direction >>>>>>>>>>>> >>>>>>>>>>>>>> On the other hand, the number of compile failures that would >>>> need >>>>>>> to >>>>>>>>>>> be >>>>>>>>>>>>>> fixed from this change is unfortunate. :-) >>>>>>>>>>>> (2) Another concern is, that there is no guard to prevent user >>>> code >>>>>>> to >>>>>>>>>>>> modify the key. This might corrupt partitioning if users do >>>>>>>>>>>> alter >>>>> the >>>>>>>>>>>> key (accidentally -- or users are just not aware that they are >>>> not >>>>>>>>>>>> allowed to modify the provided key object) and thus break the >>>>>>>>>>>> application. (This was the original motivation to not provide >>>>>>>>>>>> the >>>>> key >>>>>>>>>> in >>>>>>>>>>>> the first place -- it's guards against modification.) >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -Matthias >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 5/1/17 6:31 AM, Mathieu Fenniak wrote: >>>>>>>>>>>>> Hi Jeyhun, >>>>>>>>>>>>> >>>>>>>>>>>>> I just want to add my voice that, I too, have wished for >>>>>>>>>>>>> access >>>> to >>>>>>>>>> the >>>>>>>>>>>>> record key during a mapValues or similar operation. >>>>>>>>>>>>> >>>>>>>>>>>>> On the other hand, the number of compile failures that would >>>> need >>>>> to >>>>>>>>>> be >>>>>>>>>>>>> fixed from this change is unfortunate. :-) But at least it >>>> would >>>>>>> all >>>>>>>>>>> be >>>>>>>>>>>> a >>>>>>>>>>>>> pretty clear and easy change. >>>>>>>>>>>>> >>>>>>>>>>>>> Mathieu >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, May 1, 2017 at 6:55 AM, Jeyhun Karimov < >>>>>>> je.kari...@gmail.com <mailto:je.kari...@gmail.com> >>>>>>>>>>>> wrote: >>>>>>>>>>>>>> Dear community, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I want to share KIP-149 [1] based on issues KAFKA-4218 [2], >>>>>>>>>> KAFKA-4726 >>>>>>>>>>>> [3], >>>>>>>>>>>>>> KAFKA-3745 [4]. The related PR can be found at [5]. >>>>>>>>>>>>>> I would like to get your comments. >>>>>>>>>>>>>> >>>>>>>>>>>>>> [1] >>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >>>>>>>>>>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ >>>>>>>>>>>>>> ValueMapper%2C+and+ValueJoiner >>>>>>>>>>>>>> [2] https://issues.apache.org/jira/browse/KAFKA-4218 >>>>>>>>>>>>>> [3] https://issues.apache.org/jira/browse/KAFKA-4726 >>>>>>>>>>>>>> [4] https://issues.apache.org/jira/browse/KAFKA-3745 >>>>>>>>>>>>>> [5] https://github.com/apache/kafka/pull/2946 >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> -Cheers >>>>>>>>>>>>>> >>>>>>>>>>>>>> Jeyhun >>>>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>> -Cheers >>>>>>>>>>> >>>>>>>>>>> Jeyhun >>>>>>>>>>> >>>>>>> -- >>>>>> -Cheers >>>>>> >>>>>> Jeyhun >>>>>> >> >> -- >> <http://www.openbet.com/> Michal Borowiecki >> Senior Software Engineer L4 >> T: +44 208 742 1600 <tel:+44%2020%208742%201600> >> >> >> +44 203 249 8448 <tel:+44%2020%203249%208448> >> >> >> >> E: michal.borowie...@openbet.com >> <mailto:michal.borowie...@openbet.com> >> W: www.openbet.com <http://www.openbet.com/> >> >> >> OpenBet Ltd >> >> Chiswick Park Building 9 >> >> 566 Chiswick High Rd >> >> London >> >> W4 5XT >> >> UK >> >> >> <https://www.openbet.com/email_promo> >> >> This message is confidential and intended only for the addressee. >> If you have received this message in error, please immediately >> notify the postmas...@openbet.com <mailto:postmas...@openbet.com> >> and delete it from your system as well as any copies. The content >> of e-mails as well as traffic data may be monitored by OpenBet for >> employment and security purposes. To protect the environment >> please do not print this e-mail unless necessary. OpenBet Ltd. >> Registered Office: Chiswick Park Building 9, 566 Chiswick High >> Road, London, W4 5XT, United Kingdom. A company registered in >> England and Wales. Registered no. 3134634. VAT no. GB927523612 >> >> -- >> -Cheers >> >> Jeyhun > > -- > Signature > <http://www.openbet.com/> Michal Borowiecki > Senior Software Engineer L4 > T: +44 208 742 1600 > > > +44 203 249 8448 > > > > E: michal.borowie...@openbet.com > W: www.openbet.com <http://www.openbet.com/> > > > OpenBet Ltd > > Chiswick Park Building 9 > > 566 Chiswick High Rd > > London > > W4 5XT > > UK > > > <https://www.openbet.com/email_promo> > > This message is confidential and intended only for the addressee. If you > have received this message in error, please immediately notify the > postmas...@openbet.com <mailto:postmas...@openbet.com> and delete it > from your system as well as any copies. The content of e-mails as well > as traffic data may be monitored by OpenBet for employment and security > purposes. To protect the environment please do not print this e-mail > unless necessary. OpenBet Ltd. Registered Office: Chiswick Park Building > 9, 566 Chiswick High Road, London, W4 5XT, United Kingdom. A company > registered in England and Wales. Registered no. 3134634. VAT no. > GB927523612 >
signature.asc
Description: OpenPGP digital signature