Hi, I found some problems with `FunctionWindows` when I implemented this pip, and I added it to PIP: Implementation[4].
After I submit the first PR, you can refer to it. Thanks, Baodi Shi > On Jun 2, 2022, at 18:4232, 石宝迪 <wudixiaolong...@icloud.com.INVALID> wrote: > >> Ok. I would add in the Compatability change another section with bold or >> capital letters to highlight you're creating a breaking change. It should >> be reflected in the release notes somehow - don't know the process for that. > > Ok, I added to `Incompatible case`. PTAL. > > > Thanks, > Baodi Shi > >> 2022年6月2日 18:0404,Asaf Mesika <asaf.mes...@gmail.com> 写道: >> >>> >>> I tend to fail. Although this breaks the current logic. but the current >>> implementation can be considered is a bug. >> >> Ok. I would add in the Compatability change another section with bold or >> capital letters to highlight you're creating a breaking change. It should >> be reflected in the release notes somehow - don't know the process for that. >> >> On Tue, May 31, 2022 at 7:16 PM 石宝迪 <wudixiaolong...@icloud.com.invalid> >> wrote: >> >>>>> If you fail to start the function, you immediately break people's >>>> functions when they upgrade to this version. How about notifying them >>> once >>>> via logger (WARN)? >>> >>> >>> I tend to fail. Although this breaks the current logic. but the current >>> implementation can be considered is a bug. >>> >>>> It will flood their logs if they used it wrong. Maybe write to log once? >>> >>> >>> Agree, I changed PIP. >>> >>> Thanks, >>> Baodi Shi >>> >>>> 2022年5月31日 23:5720,Asaf Mesika <asaf.mes...@gmail.com> 写道: >>>> >>>> Hi Baodi, >>>> >>>> Regarding >>>> >>>>> >>>>> 1. When the delivery semantic is ATMOST_ONCE, add verify autoAck must >>>>> be true. If the validation fails, let the function fail to start (This >>>>> temporarily resolves the configuration ambiguity). When autoAck is >>>>> subsequently removed, the message will be acked immediately after >>> receiving >>>>> the message. >>>>> >>>>> >>>>> If you fail to start the function, you immediately break people's >>>> functions when they upgrade to this version. How about notifying them >>> once >>>> via logger (WARN)? >>>> >>>> Regarding >>>> >>>>> >>>>> 1. >>>>> >>>>> >>>>> When user call record.ack() in function, just ProcessingGuarantees == >>>>> MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user >>>>> call record.ack() is invalid(print warn log). >>>>> >>>>> It will flood their logs if they used it wrong. Maybe write to log once? >>>> >>>> On Tue, May 31, 2022 at 12:24 PM Baozi <wudixiaolong...@icloud.com >>> .invalid> >>>> wrote: >>>> >>>>> Hi, Asaf. >>>>> >>>>> Thanks review. >>>>> >>>>>>> I'm not entirely sure that is accurate. The Effectively-Once as I >>>>>> understand it is achieved using transactions, thus the consumption of >>>>> that >>>>>> message and production of any messages, as a result, are considered one >>>>>> atomic unit - either message acknowledged and messages produced or >>>>> neither. >>>>> >>>>> >>>>> Not using transactions now, I understand: EFFECTIVELY_ONCE = >>> ATLEAST_ONCE >>>>> + Message Deduplication. >>>>> >>>>> @Neng Lu @Rui Fu Can help make sure? >>>>> >>>>>>> I would issue a WARN when reading configuring the function (thus >>> emitted >>>>>> once) when the user actively configured autoAck=false and warn them >>> that >>>>>> this configuration is deprecated and they should switch to the MANUAL >>>>>> ProcessingGuarantee configuration option. >>>>> >>>>> >>>>> Added to API Change(2) >>>>> >>>>>>> suggest you clarify what existing behavior remains for backward >>>>>> compatibility with the appropriate comment that this is deprecated and >>>>> will >>>>>> be removed. >>>>> >>>>> Yes, I have rewritten it, please see Implementation(1) >>>>> >>>>>> 5. Regarding Test Plan >>>>>> * I would add: Validate the test of autoAck=false still works (you >>>>> haven't >>>>>> broken anything) >>>>>> * I would add: Validate existing ProcessingGuarantee test for >>> AtMostOnce, >>>>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true) >>>>> >>>>> >>>>> Nice, I added to PIP. >>>>> >>>>> >>>>> Thanks, >>>>> Baodi Shi >>>>> >>>>>> 2022年5月30日 22:0011,Asaf Mesika <asaf.mes...@gmail.com> 写道: >>>>>> >>>>>> Thanks for applying the fixes. >>>>>> >>>>>> 1. Regarding >>>>>> >>>>>>> >>>>>>> - EFFECTIVELY_ONCE: The message is acknowledged *after* the function >>>>>>> finished execution. Depends on pulsar deduplication, and provides >>>>>>> end-to-end effectively once processing. >>>>>>> >>>>>>> I'm not entirely sure that is accurate. The Effectively-Once as I >>>>>> understand it is achieved using transactions, thus the consumption of >>>>> that >>>>>> message and production of any messages, as a result, are considered one >>>>>> atomic unit - either message acknowledged and messages produced or >>>>> neither. >>>>>> >>>>>> 2. Regarding >>>>>> >>>>>>> >>>>>>> 1. Indication of autoAck is deprecated, and not use it in the code. >>>>>>> (and also Function.proto) >>>>>>> >>>>>>> * I would issue a WARN when reading configuring the function (thus >>>>> emitted >>>>>> once) when the user actively configured autoAck=false and warn them >>> that >>>>>> this configuration is deprecated and they should switch to the MANUAL >>>>>> ProcessingGuarantee configuration option. >>>>>> >>>>>> 3. Regarding >>>>>> >>>>>>> >>>>>>> 1. When the delivery semantic is ATMOST_ONCE, the message will be >>>>>>> acked immediately after receiving the message, no longer affected by >>>>> the >>>>>>> autoAck configuration. >>>>>>> >>>>>>> I suggest you clarify what existing behavior remains for backward >>>>>> compatibility with the appropriate comment that this is deprecated and >>>>> will >>>>>> be removed. >>>>>> >>>>>> 4. Regarding >>>>>> >>>>>>> >>>>>>> 1. >>>>>>> >>>>>>> When user call record.ack() in function, just ProcessingGuarantees == >>>>>>> MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, >>> user >>>>>>> call record.ack() is invalid(print warn log). >>>>>>> >>>>>>> That might blast WARN messages to the user. Perhaps save the fact that >>>>> you >>>>>> have printed a WARN message once and only print when the variable is >>> not >>>>>> set? >>>>>> >>>>>> 5. Regarding Test Plan >>>>>> * I would add: Validate the test of autoAck=false still works (you >>>>> haven't >>>>>> broken anything) >>>>>> * I would add: Validate existing ProcessingGuarantee test for >>> AtMostOnce, >>>>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true) >>>>>> >>>>>> >>>>>> >>>>>> On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolong...@icloud.com >>>>> .invalid> >>>>>> wrote: >>>>>> >>>>>>> Hi, Mesika. >>>>>>> >>>>>>> Thanks review. >>>>>>> >>>>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of >>>>> NONE. >>>>>>> As >>>>>>>>> you carefully explained, ProcessingGuarantee comes does to the fact >>>>> that >>>>>>>>> the function executor calls acknowledge, in specific timing: >>>>>>> >>>>>>> >>>>>>> Added, Refer to the latest pip. >>>>>>> https://github.com/apache/pulsar/issues/15560 >>>>>>> >>>>>>>>> 3. Removing autoAck from Function Config breaks backward >>>>> compatibility, >>>>>>>>> thus shouldn't this be strongly reflected in the PIP - how does >>> Pulsar >>>>>>>>> release handle breaking change? >>>>>>> >>>>>>> As suggested by @neng, They will be marked as deprecated first and >>>>> clearly >>>>>>> stated in the documentation. Remove it after 2~3 release. >>>>>>> >>>>>>>>> 4. Regarding Implementation (1), isn't the class itself >>>>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I >>>>>>>>> understand how you use that enum value *inside* the class/ >>>>>>> >>>>>>> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest >>>>> PIP >>>>>>> API Changes(3) >>>>>>> >>>>>>> Thanks, >>>>>>> Baodi Shi >>>>>>> >>>>>>>> 2022年5月30日 12:5128,Rui Fu <r...@apache.org> 写道: >>>>>>>> >>>>>>>> Hi Baodi, >>>>>>>> >>>>>>>> Nice work. Put some suggestions below, ptal. >>>>>>>> >>>>>>>> 1. API changes should also contain the changes of `Function.proto`, >>>>>>> including new `ProcessingGuarantees` option and `autoAck`. >>>>>>>> 2. Please be sure the other language runtimes (like Python, Golang) >>> do >>>>>>> support similar `record.ack()` function from the context, if no, it >>>>> might >>>>>>> have some API changes for different runtime we well. >>>>>>>> >>>>>>>> >>>>>>>> Best, >>>>>>>> >>>>>>>> Rui Fu >>>>>>>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <asaf.mes...@gmail.com>,写道: >>>>>>>>> 1. "Added NONE delivery semantics and delete autoAck config." >>>>>>>>> - Added --> add >>>>>>>>> >>>>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of >>>>> NONE. >>>>>>> As >>>>>>>>> you carefully explained, ProcessingGuarantee comes does to the fact >>>>> that >>>>>>>>> the function executor calls acknowledge, in specific timing: >>>>>>>>> - `AT_MOST_ONCE` - When the message is read by the client, it is >>>>>>>>> immediately acknowledged and only then the function is executed, >>> thus >>>>>>>>> guaranteeing it will not run more than once >>>>>>>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function >>>>>>> finished >>>>>>>>> execution, thus it will be run at least once. >>>>>>>>> - `MANUAL` - Signals to the user that it is up to them to >>> acknowledge >>>>>>> the >>>>>>>>> message, inside the function. >>>>>>>>> >>>>>>>>> I think if you couple that change with adding the explanation I >>> wrote >>>>>>>>> above to the documentation it will become crystal clear (hopefully) >>>>>>> what is >>>>>>>>> a Processing Guarantee exactly and what each value signifies. >>>>>>>>> >>>>>>>>> 3. Removing autoAck from Function Config breaks backward >>>>> compatibility, >>>>>>>>> thus shouldn't this be strongly reflected in the PIP - how does >>> Pulsar >>>>>>>>> release handle breaking change? >>>>>>>>> >>>>>>>>> 4. Regarding Implementation (1), isn't the class itself >>>>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I >>>>>>>>> understand how you use that enum value *inside* the class/ >>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <freen...@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Some suggestions: >>>>>>>>>> >>>>>>>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the >>>>>>> code. >>>>>>>>>> And documented clearly it's deprecated for the following 2~3 >>> release. >>>>>>> And >>>>>>>>>> then delete it. >>>>>>>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and >>>>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it >>>>>>> defaults >>>>>>>>>> to ATLEAST_ONCE. >>>>>>>>>> 3. Need to let users know the behavior when they call >>> `record.ack()` >>>>>>> inside >>>>>>>>>> the function implementation. >>>>>>>>>> >>>>>>>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolong...@icloud.com >>>>>>> .invalid> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Pulsar community, >>>>>>>>>>> >>>>>>>>>>> I open a https://github.com/apache/pulsar/issues/15560 for >>> Function >>>>>>> add >>>>>>>>>>> NONE delivery semantics >>>>>>>>>>> >>>>>>>>>>> Let me know what you think. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Baodi Shi >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> ## Motivation >>>>>>>>>>> >>>>>>>>>>> Currently Function supports three delivery semantics, and also >>>>>>> provides >>>>>>>>>>> autoAck to control whether to automatically ack. >>>>>>>>>>> Because autoAck affects the delivery semantics of Function, it can >>>>> be >>>>>>>>>>> confusing for users to understand the relationship between these >>> two >>>>>>>>>>> parameters. >>>>>>>>>>> >>>>>>>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE` >>>>> and >>>>>>>>>>> `autoAck == false`, then the framework will not help the user to >>> ack >>>>>>>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`. >>>>>>>>>>> >>>>>>>>>>> The delivery semantics provided by Function should be clear. When >>>>> the >>>>>>>>>> user >>>>>>>>>>> sets the guarantees, the framework should ensure point-to-point >>>>>>> semantic >>>>>>>>>>> processing and cannot be affected by other parameters. >>>>>>>>>>> >>>>>>>>>>> ## Goal >>>>>>>>>>> >>>>>>>>>>> Added `NONE` delivery semantics and delete `autoAck` config. >>>>>>>>>>> >>>>>>>>>>> The original intention of `autoAck` semantics is that users want >>> to >>>>>>>>>>> control the timing of ack by themselves. When autoAck == false, >>> the >>>>>>>>>>> processing semantics provided by the framework should be invalid. >>>>>>> Then we >>>>>>>>>>> can add `NONE` processing semantics to replace the autoAck == >>> false >>>>>>>>>>> scenario. >>>>>>>>>>> >>>>>>>>>>> When the user configuration `ProcessingGuarantees == NONE`, the >>>>>>> framework >>>>>>>>>>> does not help the user to do any ack operations, and the ack is >>> left >>>>>>> to >>>>>>>>>> the >>>>>>>>>>> user to handle. In other cases, the framework guarantees >>> processing >>>>>>>>>>> semantics. >>>>>>>>>>> >>>>>>>>>>> ## API Changes >>>>>>>>>>> 1. Add `NONE` type to ProcessingGuarantees >>>>>>>>>>> ``` java >>>>>>>>>>> public enum ProcessingGuarantees { >>>>>>>>>>> ATLEAST_ONCE, >>>>>>>>>>> ATMOST_ONCE, >>>>>>>>>>> EFFECTIVELY_ONCE, >>>>>>>>>>> NONE >>>>>>>>>>> } >>>>>>>>>>> ``` >>>>>>>>>>> >>>>>>>>>>> 2. Delete autoAck config in FunctionConfig >>>>>>>>>>> ``` java >>>>>>>>>>> public class FunctionConfig { >>>>>>>>>>> - private Boolean autoAck; >>>>>>>>>>> } >>>>>>>>>>> ``` >>>>>>>>>>> >>>>>>>>>>> ## Implementation >>>>>>>>>>> >>>>>>>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and >>>>>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees >>> != >>>>>>> NONE` >>>>>>>>>>> can be ack. >>>>>>>>>>> >>>>>>>>>>> < >>>>>>>>>>> >>>>>>>>>> >>>>>>> >>>>> >>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276 >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will >>> be >>>>>>> acked >>>>>>>>>>> immediately after receiving the message, no longer affected by the >>>>>>>>>> autoAck >>>>>>>>>>> configuration. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>> >>>>> >>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276 >>>>>>>>>>> >>>>>>>>>>> 3. When user call `record.ack()` in function, just >>>>>>> `ProcessingGuarantees >>>>>>>>>>> == NONE` can be work. >>>>>>>>>>> >>>>>>>>>>> ## Plan test >>>>>>>>>>> The main test and assert is that when ProcessingGuarantees == >>> NONE, >>>>>>> the >>>>>>>>>>> function framework will not do any ack operations for the user. >>>>>>>>>>> >>>>>>>>>>> ## Compatibility >>>>>>>>>>> 1. This change will invalidate the user's setting of autoAck, >>> which >>>>>>>>>> should >>>>>>>>>>> be explained in the documentation and provide parameter >>> verification >>>>>>> to >>>>>>>>>>> remind the user. >>>>>>>>>>> 2. Runtimes of other languages need to maintain consistent >>>>> processing >>>>>>>>>>> logic (python, go). >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Best Regards, >>>>>>>>>> Neng >>>>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> >>> >>> >