>> If you fail to start the function, you immediately break people's > functions when they upgrade to this version. How about notifying them once > via logger (WARN)?
I tend to fail. Although this breaks the current logic. but the current implementation can be considered is a bug. > It will flood their logs if they used it wrong. Maybe write to log once? Agree, I changed PIP. Thanks, Baodi Shi > 2022年5月31日 23:5720,Asaf Mesika <asaf.mes...@gmail.com> 写道: > > Hi Baodi, > > Regarding > >> >> 1. When the delivery semantic is ATMOST_ONCE, add verify autoAck must >> be true. If the validation fails, let the function fail to start (This >> temporarily resolves the configuration ambiguity). When autoAck is >> subsequently removed, the message will be acked immediately after receiving >> the message. >> >> >> If you fail to start the function, you immediately break people's > functions when they upgrade to this version. How about notifying them once > via logger (WARN)? > > Regarding > >> >> 1. >> >> >> When user call record.ack() in function, just ProcessingGuarantees == >> MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user >> call record.ack() is invalid(print warn log). >> >> It will flood their logs if they used it wrong. Maybe write to log once? > > On Tue, May 31, 2022 at 12:24 PM Baozi <wudixiaolong...@icloud.com.invalid> > wrote: > >> Hi, Asaf. >> >> Thanks review. >> >>>> I'm not entirely sure that is accurate. The Effectively-Once as I >>> understand it is achieved using transactions, thus the consumption of >> that >>> message and production of any messages, as a result, are considered one >>> atomic unit - either message acknowledged and messages produced or >> neither. >> >> >> Not using transactions now, I understand: EFFECTIVELY_ONCE = ATLEAST_ONCE >> + Message Deduplication. >> >> @Neng Lu @Rui Fu Can help make sure? >> >>>> I would issue a WARN when reading configuring the function (thus emitted >>> once) when the user actively configured autoAck=false and warn them that >>> this configuration is deprecated and they should switch to the MANUAL >>> ProcessingGuarantee configuration option. >> >> >> Added to API Change(2) >> >>>> suggest you clarify what existing behavior remains for backward >>> compatibility with the appropriate comment that this is deprecated and >> will >>> be removed. >> >> Yes, I have rewritten it, please see Implementation(1) >> >>> 5. Regarding Test Plan >>> * I would add: Validate the test of autoAck=false still works (you >> haven't >>> broken anything) >>> * I would add: Validate existing ProcessingGuarantee test for AtMostOnce, >>> AtLeastOnce, ExactlyOnce still works (when autoAck=true) >> >> >> Nice, I added to PIP. >> >> >> Thanks, >> Baodi Shi >> >>> 2022年5月30日 22:0011,Asaf Mesika <asaf.mes...@gmail.com> 写道: >>> >>> Thanks for applying the fixes. >>> >>> 1. Regarding >>> >>>> >>>> - EFFECTIVELY_ONCE: The message is acknowledged *after* the function >>>> finished execution. Depends on pulsar deduplication, and provides >>>> end-to-end effectively once processing. >>>> >>>> I'm not entirely sure that is accurate. The Effectively-Once as I >>> understand it is achieved using transactions, thus the consumption of >> that >>> message and production of any messages, as a result, are considered one >>> atomic unit - either message acknowledged and messages produced or >> neither. >>> >>> 2. Regarding >>> >>>> >>>> 1. Indication of autoAck is deprecated, and not use it in the code. >>>> (and also Function.proto) >>>> >>>> * I would issue a WARN when reading configuring the function (thus >> emitted >>> once) when the user actively configured autoAck=false and warn them that >>> this configuration is deprecated and they should switch to the MANUAL >>> ProcessingGuarantee configuration option. >>> >>> 3. Regarding >>> >>>> >>>> 1. When the delivery semantic is ATMOST_ONCE, the message will be >>>> acked immediately after receiving the message, no longer affected by >> the >>>> autoAck configuration. >>>> >>>> I suggest you clarify what existing behavior remains for backward >>> compatibility with the appropriate comment that this is deprecated and >> will >>> be removed. >>> >>> 4. Regarding >>> >>>> >>>> 1. >>>> >>>> When user call record.ack() in function, just ProcessingGuarantees == >>>> MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user >>>> call record.ack() is invalid(print warn log). >>>> >>>> That might blast WARN messages to the user. Perhaps save the fact that >> you >>> have printed a WARN message once and only print when the variable is not >>> set? >>> >>> 5. Regarding Test Plan >>> * I would add: Validate the test of autoAck=false still works (you >> haven't >>> broken anything) >>> * I would add: Validate existing ProcessingGuarantee test for AtMostOnce, >>> AtLeastOnce, ExactlyOnce still works (when autoAck=true) >>> >>> >>> >>> On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolong...@icloud.com >> .invalid> >>> wrote: >>> >>>> Hi, Mesika. >>>> >>>> Thanks review. >>>> >>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of >> NONE. >>>> As >>>>>> you carefully explained, ProcessingGuarantee comes does to the fact >> that >>>>>> the function executor calls acknowledge, in specific timing: >>>> >>>> >>>> Added, Refer to the latest pip. >>>> https://github.com/apache/pulsar/issues/15560 >>>> >>>>>> 3. Removing autoAck from Function Config breaks backward >> compatibility, >>>>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar >>>>>> release handle breaking change? >>>> >>>> As suggested by @neng, They will be marked as deprecated first and >> clearly >>>> stated in the documentation. Remove it after 2~3 release. >>>> >>>>>> 4. Regarding Implementation (1), isn't the class itself >>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I >>>>>> understand how you use that enum value *inside* the class/ >>>> >>>> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest >> PIP >>>> API Changes(3) >>>> >>>> Thanks, >>>> Baodi Shi >>>> >>>>> 2022年5月30日 12:5128,Rui Fu <r...@apache.org> 写道: >>>>> >>>>> Hi Baodi, >>>>> >>>>> Nice work. Put some suggestions below, ptal. >>>>> >>>>> 1. API changes should also contain the changes of `Function.proto`, >>>> including new `ProcessingGuarantees` option and `autoAck`. >>>>> 2. Please be sure the other language runtimes (like Python, Golang) do >>>> support similar `record.ack()` function from the context, if no, it >> might >>>> have some API changes for different runtime we well. >>>>> >>>>> >>>>> Best, >>>>> >>>>> Rui Fu >>>>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <asaf.mes...@gmail.com>,写道: >>>>>> 1. "Added NONE delivery semantics and delete autoAck config." >>>>>> - Added --> add >>>>>> >>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of >> NONE. >>>> As >>>>>> you carefully explained, ProcessingGuarantee comes does to the fact >> that >>>>>> the function executor calls acknowledge, in specific timing: >>>>>> - `AT_MOST_ONCE` - When the message is read by the client, it is >>>>>> immediately acknowledged and only then the function is executed, thus >>>>>> guaranteeing it will not run more than once >>>>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function >>>> finished >>>>>> execution, thus it will be run at least once. >>>>>> - `MANUAL` - Signals to the user that it is up to them to acknowledge >>>> the >>>>>> message, inside the function. >>>>>> >>>>>> I think if you couple that change with adding the explanation I wrote >>>>>> above to the documentation it will become crystal clear (hopefully) >>>> what is >>>>>> a Processing Guarantee exactly and what each value signifies. >>>>>> >>>>>> 3. Removing autoAck from Function Config breaks backward >> compatibility, >>>>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar >>>>>> release handle breaking change? >>>>>> >>>>>> 4. Regarding Implementation (1), isn't the class itself >>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I >>>>>> understand how you use that enum value *inside* the class/ >>>>>> >>>>>> >>>>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <freen...@gmail.com> wrote: >>>>>> >>>>>>> Some suggestions: >>>>>>> >>>>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the >>>> code. >>>>>>> And documented clearly it's deprecated for the following 2~3 release. >>>> And >>>>>>> then delete it. >>>>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and >>>>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it >>>> defaults >>>>>>> to ATLEAST_ONCE. >>>>>>> 3. Need to let users know the behavior when they call `record.ack()` >>>> inside >>>>>>> the function implementation. >>>>>>> >>>>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolong...@icloud.com >>>> .invalid> >>>>>>> wrote: >>>>>>> >>>>>>>> Hi Pulsar community, >>>>>>>> >>>>>>>> I open a https://github.com/apache/pulsar/issues/15560 for Function >>>> add >>>>>>>> NONE delivery semantics >>>>>>>> >>>>>>>> Let me know what you think. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Baodi Shi >>>>>>>> >>>>>>>> >>>>>>>> ## Motivation >>>>>>>> >>>>>>>> Currently Function supports three delivery semantics, and also >>>> provides >>>>>>>> autoAck to control whether to automatically ack. >>>>>>>> Because autoAck affects the delivery semantics of Function, it can >> be >>>>>>>> confusing for users to understand the relationship between these two >>>>>>>> parameters. >>>>>>>> >>>>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE` >> and >>>>>>>> `autoAck == false`, then the framework will not help the user to ack >>>>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`. >>>>>>>> >>>>>>>> The delivery semantics provided by Function should be clear. When >> the >>>>>>> user >>>>>>>> sets the guarantees, the framework should ensure point-to-point >>>> semantic >>>>>>>> processing and cannot be affected by other parameters. >>>>>>>> >>>>>>>> ## Goal >>>>>>>> >>>>>>>> Added `NONE` delivery semantics and delete `autoAck` config. >>>>>>>> >>>>>>>> The original intention of `autoAck` semantics is that users want to >>>>>>>> control the timing of ack by themselves. When autoAck == false, the >>>>>>>> processing semantics provided by the framework should be invalid. >>>> Then we >>>>>>>> can add `NONE` processing semantics to replace the autoAck == false >>>>>>>> scenario. >>>>>>>> >>>>>>>> When the user configuration `ProcessingGuarantees == NONE`, the >>>> framework >>>>>>>> does not help the user to do any ack operations, and the ack is left >>>> to >>>>>>> the >>>>>>>> user to handle. In other cases, the framework guarantees processing >>>>>>>> semantics. >>>>>>>> >>>>>>>> ## API Changes >>>>>>>> 1. Add `NONE` type to ProcessingGuarantees >>>>>>>> ``` java >>>>>>>> public enum ProcessingGuarantees { >>>>>>>> ATLEAST_ONCE, >>>>>>>> ATMOST_ONCE, >>>>>>>> EFFECTIVELY_ONCE, >>>>>>>> NONE >>>>>>>> } >>>>>>>> ``` >>>>>>>> >>>>>>>> 2. Delete autoAck config in FunctionConfig >>>>>>>> ``` java >>>>>>>> public class FunctionConfig { >>>>>>>> - private Boolean autoAck; >>>>>>>> } >>>>>>>> ``` >>>>>>>> >>>>>>>> ## Implementation >>>>>>>> >>>>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and >>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != >>>> NONE` >>>>>>>> can be ack. >>>>>>>> >>>>>>>> < >>>>>>>> >>>>>>> >>>> >> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276 >>>>>>>>> >>>>>>>> >>>>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be >>>> acked >>>>>>>> immediately after receiving the message, no longer affected by the >>>>>>> autoAck >>>>>>>> configuration. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>> >> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276 >>>>>>>> >>>>>>>> 3. When user call `record.ack()` in function, just >>>> `ProcessingGuarantees >>>>>>>> == NONE` can be work. >>>>>>>> >>>>>>>> ## Plan test >>>>>>>> The main test and assert is that when ProcessingGuarantees == NONE, >>>> the >>>>>>>> function framework will not do any ack operations for the user. >>>>>>>> >>>>>>>> ## Compatibility >>>>>>>> 1. This change will invalidate the user's setting of autoAck, which >>>>>>> should >>>>>>>> be explained in the documentation and provide parameter verification >>>> to >>>>>>>> remind the user. >>>>>>>> 2. Runtimes of other languages need to maintain consistent >> processing >>>>>>>> logic (python, go). >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Best Regards, >>>>>>> Neng >>>>>>> >>>> >>>> >> >>