Re: [DISCUSS]: KIP-149: Enabling key access in ValueTransformer, ValueMapper, and ValueJoiner

Jeyhun Karimov Sat, 20 May 2017 02:03:54 -0700

Hi,

Thanks for your comments. I have reconsidered including rich functions in
this KIP.
I think that if we include rich functions in this KIP, then fix
ProcessorContext in another KIP and incorporate it into the existing rich
functions, the code will not be backwards compatible.


I see Damian's and your point more clearly now.

Conclusion: we include only the withKey interfaces in this KIP (I have
updated the KIP), and I will also try to initiate another KIP for rich
functions.
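
To make the withKey-only scope concrete, here is a minimal sketch of how it
could look. It is only an illustration: the nested interface declarations,
the readOnlyKey parameter name and the withKey helper are assumptions made
for this example, not the final KIP API. The key-aware interface is kept
separate from the value-only one so that both stay single-method (and
lambdas keep working), and a value-only mapper is wrapped once so that only
the key-aware type has to be handled afterwards.

```java
import java.util.Objects;

final class WithKeySketch {

    /** Value-only mapper, mirroring the shape of the existing ValueMapper. */
    @FunctionalInterface
    interface ValueMapper<V, VR> {
        VR apply(V value);
    }

    /** Hypothetical key-aware variant; it deliberately does NOT extend ValueMapper. */
    @FunctionalInterface
    interface ValueMapperWithKey<K, V, VR> {
        VR apply(K readOnlyKey, V value);
    }

    /** Wraps a value-only mapper so that callers only deal with the key-aware type. */
    static <K, V, VR> ValueMapperWithKey<K, V, VR> withKey(
            final ValueMapper<? super V, ? extends VR> mapper) {
        Objects.requireNonNull(mapper, "mapper");
        // Forward the call and simply ignore the key.
        return (readOnlyKey, value) -> mapper.apply(value);
    }

    public static void main(String[] args) {
        // Both shapes can be written as lambdas.
        ValueMapperWithKey<String, Integer, String> keyed =
                (key, value) -> key + "=" + value;
        ValueMapperWithKey<String, Integer, Integer> wrapped =
                withKey(value -> value * 2);

        System.out.println(keyed.apply("a", 1));          // prints a=1
        System.out.println(wrapped.apply("ignored", 21)); // prints 42
    }
}
```

With this shape the wrapping happens once, when the user-supplied function
is handed over, so no type check is needed per record.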

Cheers,
Jeyhun

On Fri, May 19, 2017 at 10:50 PM Matthias J. Sax <[email protected]>
wrote:

> With the current KIP, using ValueMapper and ValueMapperWithKey
> interfaces, RichFunction seems to be an independent add-on. To fix the
> original issue to allow key access, RichFunctions are not required IMHO.
>
> I initially put the RichFunction idea on the table, because I was hoping
> to get a uniform API. And I think it was good to consider them --
> however, the discussion showed that they are not necessary for key
> access. Thus, it seems better to handle RichFunctions in their own
> KIP. The ProcessorContext/RecordContext issue seems to be the main
> challenge for this. And introducing RichFunctions with a parameter-less
> init() method does not seem to help much. We would get an "intermediate"
> API that we want to change anyway later on...
>
> As you already put much effort into RichFunction, feel free to push this
> further and start a new KIP (we can do this even in parallel) -- we
> don't want to slow you down :) But it makes the discussion and code
> review easier if we separate both, IMHO.
>
>
> -Matthias
>
>
> On 5/19/17 2:25 AM, Jeyhun Karimov wrote:
> > Hi Damian,
> >
> > Thanks for your comments. I think providing users with an *interface*
> > rather than an *abstract class* should be preferred (Matthias also raised
> > this issue); anyway, I changed the corresponding parts of the KIP.
> >
> > Regarding passing additional contextual information, I think it is a
> > tradeoff between two orderings:
> > 1) first, we fix the context parameter for the *init()* method in another
> > PR and solve Rich functions afterwards
> > 2) first, we fix the requested issues on jira ([1-3]) by providing
> > (not-complete) Rich functions and integrate the context parameters into
> > them afterwards (as an improvement)
> >
> > To me, the second approach seems more incremental. However you are right,
> > the names might confuse the users.
> >
> >
> >
> > [1] https://issues.apache.org/jira/browse/KAFKA-4218
> > [2] https://issues.apache.org/jira/browse/KAFKA-4726
> > [3] https://issues.apache.org/jira/browse/KAFKA-3745
> >
> >
> > Cheers,
> > Jeyhun
> >
> >
> > On Fri, May 19, 2017 at 10:42 AM Damian Guy <[email protected]>
> wrote:
> >
> >> Hi,
> >>
> >> I see you've removed the `ProcessorContext` from the RichFunction, which
> >> is good, but why is it a `RichFunction`? I'd have expected it to pass some
> >> additional contextual information, i.e., the `RecordContext` that contains
> >> just the topic, partition, timestamp, offset.  I'm ok with it not passing
> >> this contextual information, but is the naming incorrect? I'm not sure,
> >> tbh. I'm wondering if we should drop `RichFunctions` until we can do it
> >> properly with the correct context?
> >>
> >> Also, I don't like the abstract classes: RichValueMapper, RichValueJoiner,
> >> RichInitializer etc. Why can't they just be interfaces? Generally we
> >> should provide users with interfaces to code against, not classes.
> >>
> >> Thanks,
> >> Damian
> >>
> >> On Fri, 19 May 2017 at 00:50 Jeyhun Karimov <[email protected]>
> wrote:
> >>
> >>> Hi,
> >>>
> >>> Thanks. I initiated the PR as well, to get a better overview.
> >>>
> >>> The only reason that I used an abstract class instead of an interface for
> >>> Rich functions is that if, in the future, we have some
> >>> AbstractRichFunction abstract classes, we can easily extend them:
> >>>
> >>> public abstract class RichValueMapper<K, V, VR> *extends
> >>> AbstractRichFunction* implements ValueMapperWithKey<K, V, VR>,
> >>> RichFunction {
> >>> }
> >>>
> >>> With interfaces, we are limited to interfaces only for inheritance.
> >>>
> >>> However, I think we can easily change it (from abstract class ->
> >>> interface) if you think an interface is a better fit.
> >>>
> >>>
> >>> Cheers,
> >>> Jeyhun
> >>>
> >>>
> >>> On Fri, May 19, 2017 at 1:00 AM Matthias J. Sax <[email protected]
> >
> >>> wrote:
> >>>
> >>>> Thanks for the update and explanations. The KIP is quite improved now --
> >>>> great job!
> >>>>
> >>>> One more question: Why are RichValueMapper etc. abstract classes and not
> >>>> interfaces?
> >>>>
> >>>>
> >>>> Overall, I like the KIP a lot!
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>>
> >>>> On 5/16/17 7:03 AM, Jeyhun Karimov wrote:
> >>>>> Hi,
> >>>>>
> >>>>> Thanks for your comments.
> >>>>>
> >>>>>> I think supporting Lambdas for `withKey` and `AbstractRichFunction`
> >>>>>> don't go together, as Lambdas are only supported for interfaces AFAIK.
> >>>>>
> >>>>>
> >>>>> Maybe I misunderstood your comment.
> >>>>> *withKey* and *withOnlyValue* are interfaces, so they have no direct
> >>>>> relation to *AbstractRichFunction*.
> >>>>> The *withKey* and *withOnlyValue* interfaces have only one method, so we
> >>>>> can use lambdas.
> >>>>> Where does *AbstractRichFunction* come into play? Inside Rich functions.
> >>>>> And we use Rich functions in 2 cases:
> >>>>>
> >>>>> 1. The user doesn't use rich functions, just the regular *withKey* and
> >>>>> *withOnlyValue* interfaces (both support lambdas). We take those
> >>>>> interfaces, wrap them into a Rich function while building the topology,
> >>>>> and send it to the Processor.
> >>>>> 2. The user does use rich functions (Rich functions implement the
> >>>>> *withKey* interface). As a result, no lambdas here by definition. In this
> >>>>> case, while building the topology we do a type check on whether the
> >>>>> object type is *withKey* or *RichFunction*.
> >>>>>
> >>>>> So *AbstractRichFunction* is just syntactic sugar for Rich functions and
> >>>>> does not affect using lambdas.
> >>>>>
> >>>>>> Thus, if we want to support Lambdas for `withKey`, we need to have an
> >>>>>> interface approach like this
> >>>>>>   - RichFunction -> only adding init() and close()
> >>>>>>   - ValueMapper
> >>>>>>   - ValueMapperWithKey
> >>>>>>   - RichValueMapper extends ValueMapperWithKey, RichFunction
> >>>>>
> >>>>>
> >>>>> As I said above, currently we support lambdas for *withKey* interfaces as
> >>>>> well.  However, I agree with your idea and I will remove the
> >>>>> AbstractRichFunction from the design.
> >>>>>
> >>>>>> As an alternative, we could argue that it is sufficient to support
> >>>>>> Lambdas for the "plain" API only, but not for any "extended API". For
> >>>>>> this, RichFunction could add key+init+close and AbstractRichFunction
> >>>>>> would allow to only care about getting the key.
> >>>>>> Not sure which one is better. I don't like the idea of more overloaded
> >>>>>> methods to get Lambdas for `withKey` interfaces too much because we have
> >>>>>> already so many overloads. On the other hand, I do see value in
> >>>>>> supporting Lambdas for `withKey`.
> >>>>>
> >>>>>
> >>>>> Just to clarify, with the current design we have only one extra overloaded
> >>>>> method per *withOnlyValue* interface, which is the *withKey* version of
> >>>>> that particular interface.
> >>>>> We don't need an extra overload for Rich functions, because a Rich function
> >>>>> implements the *withKey* interface and therefore has the same type. We
> >>>>> differentiate them while building the topology.
> >>>>> We supported lambdas for *withKey* APIs because of the comment:
> >>>>>
> >>>>>> @Jeyhun: I did not put any thought into this, but can we have a design
> >>>>>> that allows for both? Also, with regard to lambdas, it might make sense
> >>>>>> to allow for both `V -> newV` and `(K, V) -> newV` ?
> >>>>>
> >>>>>
> >>>>> However, I don't think that this complicates the overall design
> >>>>> significantly.
> >>>>>
> >>>>>
> >>>>>> Depending on what we want to support, it might make sense to
> >>>>>> include/exclude RichFunctions from this KIP -- and thus, this also
> >>>>>> determines if we should have a "ProcessorContext KIP" before driving
> >>>>>> this KIP further.
> >>>>>
> >>>>>
> >>>>> Based on our discussion, I think we should keep Rich functions, as I don't
> >>>>> think that they add an extra layer of overhead to the library.
> >>>>>
> >>>>> Any comments are appreciated.
> >>>>>
> >>>>> Cheers,
> >>>>> Jeyhun
> >>>>>
> >>>>>
> >>>>> On Tue, May 16, 2017 at 12:10 AM Matthias J. Sax <
> >>> [email protected]>
> >>>>> wrote:
> >>>>>
> >>>>>> Jeyhun,
> >>>>>>
> >>>>>> thanks for the update.
> >>>>>>
> >>>>>> I think supporting Lambdas for `withKey` and `AbstractRichFunction`
> >>>>>> don't go together, as Lambdas are only supported for interfaces AFAIK.
> >>>>>>
> >>>>>> Thus, if we want to support Lambdas for `withKey`, we need to have an
> >>>>>> interface approach like this
> >>>>>>
> >>>>>>   - RichFunction -> only adding init() and close()
> >>>>>>
> >>>>>>   - ValueMapper
> >>>>>>   - ValueMapperWithKey
> >>>>>>
> >>>>>>   - RichValueMapper extends ValueMapperWithKey, RichFunction
> >>>>>>
> >>>>>> For this approach, AbstractRichFunction does not make sense anymore, as
> >>>>>> the only purpose of `RichFunction` is to allow the implementation of
> >>>>>> init() and close() -- if you don't want those, you would implement a
> >>>>>> different interface (i.e., ValueMapperWithKey)
> >>>>>>
> >>>>>> As an alternative, we could argue that it is sufficient to support
> >>>>>> Lambdas for the "plain" API only, but not for any "extended API". For
> >>>>>> this, RichFunction could add key+init+close and AbstractRichFunction
> >>>>>> would allow to only care about getting the key.
> >>>>>>
> >>>>>> Not sure which one is better. I don't like the idea of more overloaded
> >>>>>> methods to get Lambdas for `withKey` interfaces too much because we have
> >>>>>> already so many overloads. On the other hand, I do see value in
> >>>>>> supporting Lambdas for `withKey`.
> >>>>>>
> >>>>>> Depending on what we want to support, it might make sense to
> >>>>>> include/exclude RichFunctions from this KIP -- and thus, this also
> >>>>>> determines if we should have a "ProcessorContext KIP" before driving
> >>>>>> this KIP further.
> >>>>>>
> >>>>>> Thoughts?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> -Matthias
> >>>>>>
> >>>>>>
> >>>>>> On 5/15/17 11:01 AM, Jeyhun Karimov wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> Sorry for super late response. Thanks for your comments.
> >>>>>>>
> >>>>>>>> I am not an expert on Lambdas. Can you elaborate a little bit? I cannot
> >>>>>>>> follow the explanation in the KIP to see what the problem is.
> >>>>>>>
> >>>>>>>
> >>>>>>> - [1] says "A functional interface is an interface that has just one
> >>>>>>> abstract method, and thus represents a single function contract".
> >>>>>>> So basically, once we extend one interface from another (in our case,
> >>>>>>> ValueMapperWithKey from ValueMapper), the extended interface has two
> >>>>>>> abstract methods and we cannot use lambdas with it.
> >>>>>>>
> >>>>>>>
> >>>>>>> Further comments:
> >>>>>>>>  - The KIP gets a little hard to read -- can you maybe reformat the wiki
> >>>>>>>> page a little bit? I think using `CodeBlock` would help.
> >>>>>>>
> >>>>>>>
> >>>>>>> - I will work on the KIP.
> >>>>>>>
> >>>>>>>>  - What about KStream-KTable joins? You don't have overloads added for
> >>>>>>>> them. Why? (Even if I still hope that we don't need to add any new
> >>>>>>>> overloads)
> >>>>>>>
> >>>>>>>
> >>>>>>> - Actually, there is more than one Processor and public API to be
> >>>>>>> changed (KStream-KTable joins is one case). However, all of them have a
> >>>>>>> similar structure: we overload the *method* with *methodWithKey*,
> >>>>>>> wrap it into the Rich function, send it to the processor, and inside the
> >>>>>>> processor call the *init* and *close* methods of the Rich function.
> >>>>>>> As I wrote in the KIP, I wanted to demonstrate the overall idea with only
> >>>>>>> *ValueMapper*, as the same can be applied to all changes.
> >>>>>>> Anyway, I will update the KIP.
> >>>>>>>
> >>>>>>>  - Why do we need `AbstractRichFunction`?
> >>>>>>>
> >>>>>>>
> >>>>>>> Instead of overriding the *init(ProcessorContext p)* and *close()*
> >>>>>>> methods in every Rich function with an empty body like:
> >>>>>>>
> >>>>>>> @Override
> >>>>>>> public void init(ProcessorContext context) {}
> >>>>>>>
> >>>>>>> @Override
> >>>>>>> public void close() {}
> >>>>>>>
> >>>>>>> I thought that we could override them once in *AbstractRichFunction* and
> >>>>>>> extend new Rich functions from *AbstractRichFunction*.
> >>>>>>> Basically, this can eliminate code copy-paste and ease the maintenance.
> >>>>>>>
> >>>>>>>>  - What about interfaces Initializer, ForeachAction, Merger, Predicate,
> >>>>>>>> Reducer? I don't want to say we should/need to add to all, but we should
> >>>>>>>> discuss all of them and add where it does make sense (e.g.,
> >>>>>>>> RichForeachAction does make sense IMHO)
> >>>>>>>
> >>>>>>>
> >>>>>>> Definitely agree. As I said, the same technique applies to all these
> >>>>>>> interfaces and I didn't want to explode the KIP, just wanted to give the
> >>>>>>> overall intuition.
> >>>>>>> However, I will update the KIP as I said.
> >>>>>>>
> >>>>>>>
> >>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- `RichValueXX`
> >>>>>>>> in general -- but why can't we do all this with interfaces only?
> >>>>>>>
> >>>>>>>
> >>>>>>> Sure we can. However, the main intuition is that we should not force users
> >>>>>>> to implement the *init(ProcessorContext)* and *close()* functions every
> >>>>>>> time they use Rich functions.
> >>>>>>> If one needs them, she can override the respective methods. However, I am
> >>>>>>> open for discussion.
> >>>>>>>
> >>>>>>>
> >>>>>>>> I'd rather not see the use of  `ProcessorContext` spread any further than
> >>>>>>>> it currently is. So maybe we need another KIP that is done before this?
> >>>>>>>> Otherwise I think the scope of this KIP is becoming too large.
> >>>>>>>
> >>>>>>>
> >>>>>>> That is a good point. I wanted to make the *init(ProcessorContext)* method
> >>>>>>> consistent across the library (which uses ProcessorContext as an input),
> >>>>>>> therefore I put *ProcessorContext* as an input.
> >>>>>>> So the important question (as @dguy and @mjsax mentioned) is whether to
> >>>>>>> continue this KIP without providing users access to *ProcessorContext*
> >>>>>>> (change *init(ProcessorContext)* to *init()*) or to
> >>>>>>> initiate another KIP before this.
> >>>>>>>
> >>>>>>> [1]
> >>>>>>> http://cr.openjdk.java.net/~mr/se/8/java-se-8-pfd-spec/java-se-8-jls-pfd-diffs.pdf
> >>>>>>>
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Jeyhun
> >>>>>>>
> >>>>>>> On Mon, May 15, 2017 at 7:15 PM, Damian Guy <[email protected]>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>>> I'd rather not see the use of  `ProcessorContext` spread any further than
> >>>>>>>> it currently is. So maybe we need another KIP that is done before this?
> >>>>>>>> Otherwise I think the scope of this KIP is becoming too large.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Mon, 15 May 2017 at 18:06 Matthias J. Sax <
> >> [email protected]
> >>>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> I agree that the `ProcessorContext` interface is too broad in general
> >>>>>>>>> -- this is even true for transform/process, and it's also reflected in
> >>>>>>>>> the API improvement list we want to do.
> >>>>>>>>>
> >>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Streams+Discussions
> >>>>>>>>>
> >>>>>>>>> So I am wondering if you question the `RichFunction` approach in
> >>>>>>>>> general? Or if you suggest to either extend the scope of this KIP to
> >>>>>>>>> include this---or maybe better, do another KIP for it and delay this KIP
> >>>>>>>>> until the other one is done?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> -Matthias
> >>>>>>>>>
> >>>>>>>>> On 5/15/17 2:35 AM, Damian Guy wrote:
> >>>>>>>>>> Thanks for the KIP.
> >>>>>>>>>>
> >>>>>>>>>> I'm not convinced on the `RichFunction` approach. Do we really want to
> >>>>>>>>>> give every DSL method access to the `ProcessorContext`? It has a bunch of
> >>>>>>>>>> methods on it that seem inappropriate for some of the DSL methods,
> >>>>>>>>>> i.e., `register`, `getStateStore`, `forward`, `schedule` etc. It is far
> >>>>>>>>>> too broad. I think it would be better to have a narrower interface like
> >>>>>>>>>> the `RecordContext` - remembering it is easier to add methods/interfaces
> >>>>>>>>>> later than to remove them.
> >>>>>>>>>>
> >>>>>>>>>> On Sat, 13 May 2017 at 22:26 Matthias J. Sax <
> >>> [email protected]
> >>>>>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Jeyhun,
> >>>>>>>>>>>
> >>>>>>>>>>> I am not an expert on Lambdas. Can you elaborate a little bit? I cannot
> >>>>>>>>>>> follow the explanation in the KIP to see what the problem is.
> >>>>>>>>>>>
> >>>>>>>>>>> For updating the KIP title I don't know -- guess it's up to you.
> >>>>>>>>>>> Maybe a committer can comment on this?
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Further comments:
> >>>>>>>>>>>
> >>>>>>>>>>>  - The KIP gets a little hard to read -- can you maybe reformat the wiki
> >>>>>>>>>>> page a little bit? I think using `CodeBlock` would help.
> >>>>>>>>>>>
> >>>>>>>>>>>  - What about KStream-KTable joins? You don't have overloads added for
> >>>>>>>>>>> them. Why? (Even if I still hope that we don't need to add any new
> >>>>>>>>>>> overloads)
> >>>>>>>>>>>
> >>>>>>>>>>>  - Why do we need `AbstractRichFunction`?
> >>>>>>>>>>>
> >>>>>>>>>>>  - What about interfaces Initializer, ForeachAction, Merger, Predicate,
> >>>>>>>>>>> Reducer? I don't want to say we should/need to add to all, but we should
> >>>>>>>>>>> discuss all of them and add where it does make sense (e.g.,
> >>>>>>>>>>> RichForeachAction does make sense IMHO)
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- `RichValueXX`
> >>>>>>>>>>> in general -- but why can't we do all this with interfaces only?
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> -Matthias
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 5/11/17 7:00 AM, Jeyhun Karimov wrote:
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks for your comments. I think we cannot extend the two interfaces if
> >>>>>>>>>>>> we want to keep lambdas. I updated the KIP [1]. Maybe I should change the
> >>>>>>>>>>>> title, because now we are not limiting the KIP to only ValueMapper,
> >>>>>>>>>>>> ValueTransformer and ValueJoiner.
> >>>>>>>>>>>> Please feel free to comment.
> >>>>>>>>>>>>
> >>>>>>>>>>>> [1]
> >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Cheers,
> >>>>>>>>>>>> Jeyhun
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, May 9, 2017 at 7:36 PM Matthias J. Sax <
> >>>>>>>> [email protected]>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> If `ValueMapperWithKey` extends `ValueMapper` we don't need the new
> >>>>>>>>>>>>> overload.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> And yes, we need to do one check -- but this happens when building the
> >>>>>>>>>>>>> topology. At runtime (I mean after KafkaStreams#start()) we don't need
> >>>>>>>>>>>>> any check, as we will always use `ValueMapperWithKey`.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 5/9/17 2:55 AM, Jeyhun Karimov wrote:
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks for feedback.
> >>>>>>>>>>>>>> Then we need to overload the method
> >>>>>>>>>>>>>>   <VR> KStream<K, VR> mapValues(ValueMapper<? super V, ? extends VR> mapper);
> >>>>>>>>>>>>>> with
> >>>>>>>>>>>>>>   <VR> KStream<K, VR> mapValues(ValueMapperWithKey<? super V, ? extends VR> mapper);
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> and at runtime (inside the processor) we still have to check whether it
> >>>>>>>>>>>>>> is a ValueMapper or a ValueMapperWithKey before wrapping it into the
> >>>>>>>>>>>>>> rich function.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Please correct me if I am wrong.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>> Jeyhun
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, May 9, 2017 at 10:56 AM Michal Borowiecki <
> >>>>>>>>>>>>>> [email protected]> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> +1 :)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 08/05/17 23:52, Matthias J. Sax wrote:
> >>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I was reading the updated KIP and I am wondering if we should do the
> >>>>>>>>>>>>>>>> design a little differently.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Instead of distinguishing between a RichFunction and non-RichFunction
> >>>>>>>>>>>>>>>> at runtime level, we would use RichFunctions all the time. Thus, on
> >>>>>>>>>>>>>>>> the DSL entry level, if a user provides a non-RichFunction, we wrap it
> >>>>>>>>>>>>>>>> by a RichFunction that is fully implemented by Streams. This
> >>>>>>>>>>>>>>>> RichFunction would just forward the call omitting the key:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Just to sketch the idea (incomplete code snippet):
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> public class StreamsRichValueMapper<K, V, VR>
> >>>>>>>>>>>>>>>>>         implements RichValueMapper<K, V, VR> {
> >>>>>>>>>>>>>>>>>    private ValueMapper<V, VR> userProvidedMapper; // set by constructor
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>    // init() and close() omitted in this sketch
> >>>>>>>>>>>>>>>>>    public VR apply(final K key, final V value) {
> >>>>>>>>>>>>>>>>>        // forward the call, omitting the key
> >>>>>>>>>>>>>>>>>        return userProvidedMapper.apply(value);
> >>>>>>>>>>>>>>>>>    }
> >>>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>  From a performance point of view, I am not sure if the
> >>>>>>>>>>>>>>>> "if(isRichFunction)" including casts etc would have more overhead than
> >>>>>>>>>>>>>>>> this approach (we would do more nested method calls for
> >>>>>>>>>>>>>>>> non-RichFunctions, which should be more common than RichFunctions).
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> This approach should not affect lambdas (or am I missing something?)
> >>>>>>>>>>>>>>>> and might be cleaner, as we could have one more top level interface
> >>>>>>>>>>>>>>>> `RichFunction` with methods `init()` and `close()` and also interfaces
> >>>>>>>>>>>>>>>> for `RichValueMapper` etc. (thus, no abstract classes required).
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Any thoughts?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On 5/6/17 5:29 PM, Jeyhun Karimov wrote:
> >>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks for comments. I extended the PR and KIP to include rich
> >>>>>>>>>>>>>>>>> functions. I will still have to evaluate the cost of deep copying of
> >>>>>>>>>>>>>>>>> keys.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>> Jeyhun
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 8:02 PM Mathieu Fenniak <
> >>>>>>>>>>>>>>> [email protected]>
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hey Matthias,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> My opinion would be that documenting the immutability of the key is
> >>>>>>>>>>>>>>>>>> the best approach available.  Better than requiring the key to be
> >>>>>>>>>>>>>>>>>> serializable (as with Jeyhun's last pass at the PR), no performance
> >>>>>>>>>>>>>>>>>> risk.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> It'd be different if Java had immutable type constraints of some
> >>>>>>>>>>>>>>>>>> kind. :-)
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Mathieu
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 11:31 AM, Matthias J. Sax <
> >>>>>>>>>>>>>>> [email protected]>
> >>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Agreed about RichFunction. If we follow this path, it should cover
> >>>>>>>>>>>>>>>>>>> all(?) DSL interfaces.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> About guarding the key -- I am still not sure what to do about it...
> >>>>>>>>>>>>>>>>>>> Maybe it might be enough to document it (and name the key parameter
> >>>>>>>>>>>>>>>>>>> like `readOnlyKey` to make it very clear). Ultimately, I would prefer
> >>>>>>>>>>>>>>>>>>> to guard against any modification, but I have no good idea how to do
> >>>>>>>>>>>>>>>>>>> this. Not sure what others think about the risk of corrupted
> >>>>>>>>>>>>>>>>>>> partitioning (which would be a user error and we could say, well, bad
> >>>>>>>>>>>>>>>>>>> luck, you got a bug in your code, that's not our fault), vs deep copy
> >>>>>>>>>>>>>>>>>>> with a potential performance hit (that we can't quantify atm without
> >>>>>>>>>>>>>>>>>>> any performance test).
> >>>>>>>>>>>>>>>>>>> We do have a performance system test. Maybe it's worth it for you to
> >>>>>>>>>>>>>>>>>>> apply the deep copy strategy and run the test. It's a very basic
> >>>>>>>>>>>>>>>>>>> performance test only, but might give some insight. If you want to do
> >>>>>>>>>>>>>>>>>>> this, look into folder "tests" for general test setup, and into
> >>>>>>>>>>>>>>>>>>> "tests/kafkatest/benchmarks/streams" to find the perf test.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On 5/5/17 8:55 AM, Jeyhun Karimov wrote:
> >>>>>>>>>>>>>>>>>>>> Hi Matthias,
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I think extending the KIP to include RichFunctions totally makes
> >>>>>>>>>>>>>>>>>>>> sense. So, we don't want to guard the keys because it is costly.
> >>>>>>>>>>>>>>>>>>>> If we introduce RichFunctions, I think they should not be limited to
> >>>>>>>>>>>>>>>>>>>> only the 3 interfaces proposed in the KIP but cover a wide range of
> >>>>>>>>>>>>>>>>>>>> interfaces.
> >>>>>>>>>>>>>>>>>>>> Please correct me if I am wrong.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>>>> Jeyhun
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Fri, May 5, 2017 at 12:04 AM Matthias J. Sax <
> >>>>>>>>>>>>>>> [email protected]
> >>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> One follow up. There was this email on the user list:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> http://search-hadoop.com/m/Kafka/uyzND17KhCaBzPSZ1?subj=Shouldn+t+the+initializer+of+a+stream+aggregate+accept+the+key+
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> It might make sense to include the Initializer, Adder, and Subtractor
> >>>>>>>>>>>>>>>>>>>>> interfaces, too.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> And we should double check if there are other interfaces we might
> >>>>>>>>>>>>>>>>>>>>> miss atm.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On 5/4/17 1:31 PM, Matthias J. Sax wrote:
> >>>>>>>>>>>>>>>>>>>>>> Thanks for updating the KIP.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Deep copying the key will work for sure, but I am actually a little
> >>>>>>>>>>>>>>>>>>>>>> bit worried about performance impact... We might want to do some
> >>>>>>>>>>>>>>>>>>>>>> test to quantify this impact.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Btw: this reminds me of the idea of a `RichFunction` interface that
> >>>>>>>>>>>>>>>>>>>>>> would allow users to access record metadata (like timestamp, offset,
> >>>>>>>>>>>>>>>>>>>>>> partition etc) within DSL. This would be a similar concept. Thus, I
> >>>>>>>>>>>>>>>>>>>>>> am wondering if it would make sense to enlarge the scope of this KIP
> >>>>>>>>>>>>>>>>>>>>>> by that? WDYT?
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On 5/3/17 2:08 AM, Jeyhun Karimov wrote:
> >>>>>>>>>>>>>>>>>>>>>>> Hi Mathieu,
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Thanks for feedback. I followed a similar approach and updated the PR
> >>>>>>>>>>>>>>>>>>>>>>> and KIP accordingly. I tried to guard the key in Processors by sending
> >>>>>>>>>>>>>>>>>>>>>>> a copy of the actual key.
> >>>>>>>>>>>>>>>>>>>>>>> Because I am doing a deep copy of an object, I think memory can be a
> >>>>>>>>>>>>>>>>>>>>>>> bottleneck in some use-cases.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>>>>>>> Jeyhun
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> On Tue, May 2, 2017 at 5:10 PM Mathieu Fenniak <
> >>>>>>>>>>>>>>>>>>>>> [email protected]>
> >>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun,
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> This approach would change ValueMapper (...etc) to
> >>> be
> >>>>>>>>>>> classes,
> >>>>>>>>>>>>>>>>>> rather
> >>>>>>>>>>>>>>>>>>>>> than
> >>>>>>>>>>>>>>>>>>>>>>>> interfaces, which is also a backwards incompatible
> >>>>>>>> change.
> >>>>>>>>>>> An
> >>>>>>>>>>>>>>>>>>>>> alternative
> >>>>>>>>>>>>>>>>>>>>>>>> approach that would be backwards compatible would
> >> be
> >>>> to
> >>>>>>>>>>> define
> >>>>>>>>>>>>>>> new
> >>>>>>>>>>>>>>>>>>>>>>>> interfaces, and provide overrides where those
> >>>> interfaces
> >>>>>>>>> are
> >>>>>>>>>>>>>>> used.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Unfortunately, making the key parameter "final" doesn't
> >>>>>>>>>>>>>>>>>>>>>>>> change much about guarding against key changes. It only
> >>>>>>>>>>>>>>>>>>>>>>>> prevents the parameter variable from being reassigned. If
> >>>>>>>>>>>>>>>>>>>>>>>> the key type is a mutable object (e.g. byte[]), it can
> >>>>>>>>>>>>>>>>>>>>>>>> still be mutated (e.g. key[0] = 0). But I'm not really
> >>>>>>>>>>>>>>>>>>>>>>>> sure there's much that can be done about that.
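
To make Mathieu's point concrete, here is a minimal Java sketch. The class and method names are hypothetical and not part of Kafka Streams. The first method shows that a final parameter can still be mutated through its reference. The second shows the defensive-copy guard mentioned elsewhere in this thread, which costs one extra allocation per call.

```java
import java.util.Arrays;

final class KeyGuardDemo {
    // `final` only stops the parameter from being pointed at another object.
    static void mutateKey(final byte[] key) {
        // key = new byte[0];  // would not compile: cannot assign a final parameter
        key[0] = 0;            // compiles: the array contents are still writable
    }

    // Defensive copy: the callee only ever sees a clone of the key.
    static void mutateCopyOfKey(final byte[] key) {
        mutateKey(Arrays.copyOf(key, key.length));
    }

    public static void main(String[] args) {
        byte[] key = {42, 7};

        mutateCopyOfKey(key);
        System.out.println(Arrays.toString(key)); // [42, 7]: only the copy changed

        mutateKey(key);
        System.out.println(Arrays.toString(key)); // [0, 7]: mutated despite `final`
    }
}
```

A shallow array copy is enough for byte[]. For arbitrary key types a deep copy would be needed, which is where the memory concern comes from.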
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Mathieu
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 5:39 PM, Jeyhun Karimov
> >>>>>>>>>>>>>>>>>>>>>>>> <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Thanks for comments.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> The concerns make sense. Although we can guard for
> >>>>>>>>>>>>>>>>>>>>>>>>> immutable keys in the current implementation (with a few
> >>>>>>>>>>>>>>>>>>>>>>>>> changes), I didn't consider backward compatibility.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> In this case, two solutions come to my mind. In both
> >>>>>>>>>>>>>>>>>>>>>>>>> cases, the user accesses the key as Object type, as
> >>>>>>>>>>>>>>>>>>>>>>>>> passing an extra type parameter will break backwards
> >>>>>>>>>>>>>>>>>>>>>>>>> compatibility. So the user has to cast to the actual key
> >>>>>>>>>>>>>>>>>>>>>>>>> type.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> 1. Firstly, we can overload the apply method with 2
> >>>>>>>>>>>>>>>>>>>>>>>>> arguments (key and value) and force the key to be
> >>>>>>>>>>>>>>>>>>>>>>>>> *final*. By doing this, I think we can address both
> >>>>>>>>>>>>>>>>>>>>>>>>> backward compatibility and guarding against key change.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> 2. Secondly, we can create class KeyAccess like:
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> public class KeyAccess {
> >>>>>>>>>>>>>>>>>>>>>>>>>      Object key;
> >>>>>>>>>>>>>>>>>>>>>>>>>      public void beforeApply(final Object key) {
> >>>>>>>>>>>>>>>>>>>>>>>>>          this.key = key;
> >>>>>>>>>>>>>>>>>>>>>>>>>      }
> >>>>>>>>>>>>>>>>>>>>>>>>>      public Object getKey() {
> >>>>>>>>>>>>>>>>>>>>>>>>>          return key;
> >>>>>>>>>>>>>>>>>>>>>>>>>      }
> >>>>>>>>>>>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> We can extend *ValueMapper, ValueJoiner* and
> >>>>>>>>>>>>>>>>>>>>>>>>> *ValueTransformer* from *KeyAccess*. Inside the processor
> >>>>>>>>>>>>>>>>>>>>>>>>> (for example *KTableMapValuesProcessor*), before calling
> >>>>>>>>>>>>>>>>>>>>>>>>> *mapper.apply(value)* we can set the *key* by
> >>>>>>>>>>>>>>>>>>>>>>>>> *mapper.beforeApply(key)*. As a result, the user can use
> >>>>>>>>>>>>>>>>>>>>>>>>> *getKey()* to access the key inside the *apply(value)*
> >>>>>>>>>>>>>>>>>>>>>>>>> method.
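
The second option can be sketched end to end as below. KeyAccess follows the class quoted in the message. Everything else is an assumption made for illustration: the abstract ValueMapper is a guess at what "extending from KeyAccess" would look like, and ToyMapValuesProcessor is a hypothetical stand-in for a processor such as KTableMapValuesProcessor.

```java
// KeyAccess as proposed in the message (the field is made private here).
class KeyAccess {
    private Object key;

    public void beforeApply(final Object key) {
        this.key = key;
    }

    public Object getKey() {
        return key;
    }
}

// Assumed shape: under this proposal ValueMapper becomes a class, not an interface.
abstract class ValueMapper<V, VR> extends KeyAccess {
    public abstract VR apply(V value);
}

// Hypothetical stand-in for a processor; not the real Kafka Streams class.
final class ToyMapValuesProcessor<K, V, VR> {
    private final ValueMapper<V, VR> mapper;

    ToyMapValuesProcessor(ValueMapper<V, VR> mapper) {
        this.mapper = mapper;
    }

    VR process(K key, V value) {
        mapper.beforeApply(key);    // the processor sets the key first
        return mapper.apply(value); // user code may now call getKey()
    }
}

final class KeyAccessDemo {
    public static void main(String[] args) {
        ValueMapper<Integer, String> mapper = new ValueMapper<Integer, String>() {
            @Override
            public String apply(Integer value) {
                String key = (String) getKey(); // cast needed: key is exposed as Object
                return key + ":" + value;
            }
        };
        ToyMapValuesProcessor<String, Integer, String> processor =
                new ToyMapValuesProcessor<>(mapper);
        System.out.println(processor.process("user-1", 5)); // user-1:5
    }
}
```

Note that once ValueMapper is an abstract class it can no longer be written as a lambda, and the key is held as mutable state on the mapper between calls.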
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 7:24 PM Matthias J. Sax
> >>>>>>>>>>>>>>>>>>>>>>>>> <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun,
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> thanks a lot for the KIP!
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> I think there are two issues we need to address:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> (1) The KIP does not consider backward compatibility.
> >>>>>>>>>>>>>>>>>>>>>>>>>> Users did complain about this in past releases already,
> >>>>>>>>>>>>>>>>>>>>>>>>>> and as the user base grows, we should not break backward
> >>>>>>>>>>>>>>>>>>>>>>>>>> compatibility in future releases anymore. Thus, we
> >>>>>>>>>>>>>>>>>>>>>>>>>> should think of a better way to allow key access.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu's comment goes in the same direction:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile failures that
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> would need to be fixed from this change is
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> unfortunate. :-)
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> (2) Another concern is that there is no guard to prevent
> >>>>>>>>>>>>>>>>>>>>>>>>>> user code from modifying the key. This might corrupt
> >>>>>>>>>>>>>>>>>>>>>>>>>> partitioning if users do alter the key (accidentally --
> >>>>>>>>>>>>>>>>>>>>>>>>>> or users are just not aware that they are not allowed to
> >>>>>>>>>>>>>>>>>>>>>>>>>> modify the provided key object) and thus break the
> >>>>>>>>>>>>>>>>>>>>>>>>>> application. (This was the original motivation to not
> >>>>>>>>>>>>>>>>>>>>>>>>>> provide the key in the first place -- it guards against
> >>>>>>>>>>>>>>>>>>>>>>>>>> modification.)
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> On 5/1/17 6:31 AM, Mathieu Fenniak wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jeyhun,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> I just want to add my voice that I, too, have wished for
> >>>>>>>>>>>>>>>>>>>>>>>>>>> access to the record key during a mapValues or similar
> >>>>>>>>>>>>>>>>>>>>>>>>>>> operation.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> On the other hand, the number of compile failures that
> >>>>>>>>>>>>>>>>>>>>>>>>>>> would need to be fixed from this change is unfortunate.
> >>>>>>>>>>>>>>>>>>>>>>>>>>> :-)  But at least it would all be a pretty clear and easy
> >>>>>>>>>>>>>>>>>>>>>>>>>>> change.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Mathieu
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, May 1, 2017 at 6:55 AM, Jeyhun Karimov
> >>>>>>>>>>>>>>>>>>>>>>>>>>> <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Dear community,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I want to share KIP-149 [1] based on issues KAFKA-4218
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> [2], KAFKA-4726 [3], KAFKA-3745 [4]. The related PR can
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> be found at [5].
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to get your comments.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> [2] https://issues.apache.org/jira/browse/KAFKA-4218
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> [3] https://issues.apache.org/jira/browse/KAFKA-4726
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> [4] https://issues.apache.org/jira/browse/KAFKA-3745
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> [5] https://github.com/apache/kafka/pull/2946
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> -Cheers
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>>>>>>>> -Cheers
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Jeyhun
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>>> -Cheers
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Jeyhun
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> -Cheers
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Jeyhun
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>> -Cheers
> >>>>>>>>>>>>
> >>>>>>>>>>>> Jeyhun
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>> --
> >>> -Cheers
> >>>
> >>> Jeyhun
> >>>
> >>
>
> --
-Cheers

Jeyhun
