Thanks for your comments, Jeyhun. I agree about the disadvantages. The punctuation part is the only thing I don't buy. IMHO, RichFunctions should not be allowed to register and use punctuations. If you need punctuation, you should use #transform() or similar. Note that we plan to provide `RecordContext` and not `ProcessorContext`, so it's not even possible to register punctuations.
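
A rough sketch of how a metadata-only context might look. All interface, class, and method names below (`RecordContext` accessors, `RichValueMapper`, `SimpleRecordContext`, `TAG_WITH_POSITION`) are illustrative assumptions and not the KIP's final API. The point is only that a context exposing nothing but record metadata leaves no way to schedule a punctuation, while the function keeps a single abstract method:

```java
public class RichFunctionSketch {

    /** Hypothetical read-only view of record metadata. It deliberately has no
     *  schedule/punctuate hook, so a function cannot register punctuations. */
    public interface RecordContext {
        String topic();
        int partition();
        long offset();
        long timestamp();
    }

    /** Hypothetical rich mapper: one abstract method, so lambdas still work. */
    @FunctionalInterface
    public interface RichValueMapper<V, VR> {
        VR apply(V value, RecordContext context);
    }

    /** Trivial immutable context, standing in for what the runtime would supply per record. */
    public static final class SimpleRecordContext implements RecordContext {
        private final String topic;
        private final int partition;
        private final long offset;
        private final long timestamp;

        public SimpleRecordContext(String topic, int partition, long offset, long timestamp) {
            this.topic = topic;
            this.partition = partition;
            this.offset = offset;
            this.timestamp = timestamp;
        }

        @Override public String topic() { return topic; }
        @Override public int partition() { return partition; }
        @Override public long offset() { return offset; }
        @Override public long timestamp() { return timestamp; }
    }

    /** Stateless mapper that tags each value with its topic, partition, and offset. */
    public static final RichValueMapper<String, String> TAG_WITH_POSITION =
        (value, ctx) -> value + "@" + ctx.topic() + "-" + ctx.partition() + ":" + ctx.offset();

    public static void main(String[] args) {
        RecordContext ctx = new SimpleRecordContext("orders", 0, 42L, 1495900800000L);
        System.out.println(TAG_WITH_POSITION.apply("v1", ctx)); // prints "v1@orders-0:42"
    }
}
```

Because the mapper in this sketch holds no state of its own, nothing in it needs `init()` or `close()`.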
One more thought: if you go with `init()` and `close()`, we basically allow users to keep in-memory state in a function. Thus, we cannot share a single instance of RichValueMapper (etc.) across multiple tasks, and we would need a supplier pattern similar to #transform(). This would "break the flow" of the API, as (Rich)ValueMapperSupplier would not inherit from ValueMapper, and thus we would need many new overloads for the KStream/KTable classes.

The overall goal of RichFunction (from my understanding) was to provide record metadata (like offset, timestamp, etc.) to the user. And we still have #transform(), which provides the init and close functionality. So if we introduce those with RichFunction, we are quite close to what #transform() provides, and thus it feels as if we duplicate functionality.

For this reason, it seems better to go with the `#valueMapper(ValueMapper mapper, RecordContext context)` approach.

WDYT?

-Matthias

On 5/27/17 11:00 AM, Jeyhun Karimov wrote:
> Hi,
>
> Thanks for your comments. I will refer the overall approach as rich
> functions until we find a better name.
>
> I think there are some pros and cons of the approach you described.
>
> Pros is that it is simple, has clear boundaries, avoids misunderstanding of
> term "function".
> So you propose sth like:
> KStream.valueMapper (ValueMapper vm, RecordContext rc)
> or
> having rich functions with only a single init(RecordContext rc) method.
>
> Cons is that:
> - This will bring another set of overloads (if we use RecordContext as a
> separate parameter). We should consider that the rich functions will be for
> all main interfaces.
> - I don't think that we need lambdas in rich functions. It is by
> definition "rich" so, no single method in interface -> as a result no
> lambdas.
> - I disagree that rich functions should only contain init() method. This
> depends on each interface. For example, for specific interfaces we can add
> methods (like punctuate()) to their rich functions.
>
>
> Cheers,
> Jeyhun
>
>
>
> On Thu, May 25, 2017 at 1:02 AM Matthias J. Sax <matth...@confluent.io>
> wrote:
>
>> I confess, the term is borrowed from Flink :)
>>
>> Personally, I never thought about it, but I tend to agree with Michal. I
>> also want to clarify, that the main purpose is the ability to access
>> record metadata. Thus, it might even be sufficient to only have "init".
>>
>> An alternative would of course be, to pass in the RecordContext as
>> method parameter. This would allow us to drop "init()". This might even
>> allow to use Lambdas and we could keep the name RichFunction as we
>> preserve the nature of being a function.
>>
>>
>> -Matthias
>>
>> On 5/24/17 12:13 PM, Jeyhun Karimov wrote:
>>> Hi Michal,
>>>
>>> Thanks for your comments. I see your point and I agree with it. However,
>>> I don't have a better idea for naming. I checked MR source code. There
>>> it is used JobConfigurable and Closable, two different interfaces. Maybe
>>> we can rename RichFunction as Configurable?
>>>
>>>
>>> Cheers,
>>> Jeyhun
>>>
>>> On Tue, May 23, 2017 at 2:58 PM Michal Borowiecki
>>> <michal.borowie...@openbet.com> wrote:
>>>
>>> Hi Jeyhun,
>>>
>>> I understand your argument about "Rich" in RichFunctions. Perhaps
>>> I'm just being too puritan here, but let me ask this anyway:
>>>
>>> What is it that makes something a function? To me a function is
>>> something that takes zero or more arguments and possibly returns a
>>> value and while it may have side-effects (as opposed to "pure
>>> functions" which can't), it doesn't have any life-cycle of its own.
>>> This is what, in my mind, distinguishes the concept of a "function"
>>> from that of more vaguely defined concepts.
>>>
>>> So if we add a life-cycle to a function, in that understanding, it
>>> doesn't become a rich function but instead stops being a function
>>> altogether.
>>>
>>> You could say it's "just semantics" but to me precise use of
>>> language in the given context is an important foundation for good
>>> engineering. And in the context of programming "function" has a
>>> precise meaning. Of course we can say that in the context of Kafka
>>> Streams "function" has a different, looser meaning but I'd argue
>>> that won't do anyone any good.
>>>
>>> On the other hand other frameworks such as Flink use this
>>> terminology, so it could be that consistency is the reason. I'm
>>> guessing that's why the name was proposed in the first place. My
>>> point is simply that it's a poor choice of wording and Kafka Streams
>>> don't have to follow that to the letter.
>>>
>>> Cheers,
>>>
>>> Michal
>>>
>>>
>>> On 23/05/17 13:26, Jeyhun Karimov wrote:
>>>> Hi Michal,
>>>>
>>>> Thanks for your comments.
>>>>
>>>>
>>>>     To me at least it feels strange that something is called a
>>>>     function yet doesn't follow the functional interface
>>>>     definition of having just one abstract method. I suppose init
>>>>     and close could be made default methods with empty bodies once
>>>>     Java 7 support is dropped to mitigate that concern. Still, I
>>>>     feel some resistance to consider something that requires
>>>>     initialisation and closing (which implies holding state) as
>>>>     being a function. Sounds more like the Processor/Transformer
>>>>     kind of thing semantically, rather than a function.
>>>>
>>>>
>>>> - If we called the interface name only Function your assumptions
>>>> will hold. However, the keyword Rich by definition implies that we
>>>> have a function (as you described, with one abstract method and
>>>> etc) but it is rich. So, there are multiple methods in it.
>>>> Ideally it should be:
>>>>
>>>> public interface RichFunction extends Function {   // this is the Function that you described
>>>>     void close();
>>>>     void init(Some params);
>>>>     ...
>>>> }
>>>>
>>>>
>>>>     The KIP says there are multiple use-cases for this but doesn't
>>>>     enumerate any - I think some examples would be useful,
>>>>     otherwise that section sounds a little bit vague.
>>>>
>>>>
>>>> I thought it is obvious by definition but I will update it. Thanks.
>>>>
>>>>
>>>>     IMHO, it's the access to the RecordContext is where the added
>>>>     value lies but maybe I'm just lacking in imagination, so I'm
>>>>     asking all this to better understand the rationale for init()
>>>>     and close().
>>>>
>>>>
>>>> Maybe I should add some examples. Thanks.
>>>>
>>>>
>>>> Cheers,
>>>> Jeyhun
>>>>
>>>> On Mon, May 22, 2017 at 11:02 AM, Michal Borowiecki
>>>> <michal.borowie...@openbet.com> wrote:
>>>>
>>>> Hi Jeyhun,
>>>>
>>>> I'd like to understand better the premise of RichFunctions and
>>>> why `init(Some params)`, `close()` are said to be needed.
>>>>
>>>> To me at least it feels strange that something is called a
>>>> function yet doesn't follow the functional interface
>>>> definition of having just one abstract method. I suppose init
>>>> and close could be made default methods with empty bodies once
>>>> Java 7 support is dropped to mitigate that concern. Still, I
>>>> feel some resistance to consider something that requires
>>>> initialisation and closing (which implies holding state) as
>>>> being a function. Sounds more like the Processor/Transformer
>>>> kind of thing semantically, rather than a function.
>>>>
>>>> The KIP says there are multiple use-cases for this but doesn't
>>>> enumerate any - I think some examples would be useful,
>>>> otherwise that section sounds a little bit vague.
>>>>
>>>> IMHO, it's the access to the RecordContext is where the added
>>>> value lies but maybe I'm just lacking in imagination, so I'm
>>>> asking all this to better understand the rationale for init()
>>>> and close().
>>>>
>>>> Thanks,
>>>> Michał
>>>>
>>>> On 20/05/17 17:05, Jeyhun Karimov wrote:
>>>>> Dear community,
>>>>>
>>>>> As we discussed in KIP-149 [DISCUSS] thread [1], I would like to initiate
>>>>> KIP for rich functions (interfaces) [2].
>>>>> I would like to get your comments.
>>>>>
>>>>>
>>>>> [1]
>>>>> http://search-hadoop.com/m/Kafka/uyzND1PMjdk2CslH12?subj=Re+DISCUSS+KIP+149+Enabling+key+access+in+ValueTransformer+ValueMapper+and+ValueJoiner
>>>>> [2]
>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-159%3A+Introducing+Rich+functions+to+Streams
>>>>>
>>>>>
>>>>> Cheers,
>>>>> Jeyhun
>>>> --
>>>> Michal Borowiecki
>>>> Senior Software Engineer L4
>>>> T: +44 208 742 1600
>>>>    +44 203 249 8448
>>>>
>>>> E: michal.borowie...@openbet.com
>>>> W: www.openbet.com
>>>>
>>>>
>>>> OpenBet Ltd
>>>> Chiswick Park Building 9
>>>> 566 Chiswick High Rd
>>>> London
>>>> W4 5XT
>>>> UK
>>>>
>>>>
>>>> This message is confidential and intended only for the
>>>> addressee. If you have received this message in error, please
>>>> immediately notify the postmas...@openbet.com
>>>> and delete it from your system
>>>> as well as any copies. The content of e-mails as well as
>>>> traffic data may be monitored by OpenBet for employment and
>>>> security purposes. To protect the environment please do not
>>>> print this e-mail unless necessary. OpenBet Ltd. Registered
>>>> Office: Chiswick Park Building 9, 566 Chiswick High Road,
>>>> London, W4 5XT, United Kingdom. A company registered in
>>>> England and Wales. Registered no. 3134634. VAT no. GB927523612
>>>>
>>>
>>> --
>>> -Cheers
>>>
>>> Jeyhun
>>
>> --
> -Cheers
>
> Jeyhun
>