Hi, Asaf. Thanks review.
>> I'm not entirely sure that is accurate. The Effectively-Once as I > understand it is achieved using transactions, thus the consumption of that > message and production of any messages, as a result, are considered one > atomic unit - either message acknowledged and messages produced or neither. Not using transactions now, I understand: EFFECTIVELY_ONCE = ATLEAST_ONCE + Message Deduplication. @Neng Lu @Rui Fu Can help make sure? >> I would issue a WARN when reading configuring the function (thus emitted > once) when the user actively configured autoAck=false and warn them that > this configuration is deprecated and they should switch to the MANUAL > ProcessingGuarantee configuration option. Added to API Change(2) >> suggest you clarify what existing behavior remains for backward > compatibility with the appropriate comment that this is deprecated and will > be removed. Yes, I have rewritten it, please see Implementation(1) > 5. Regarding Test Plan > * I would add: Validate the test of autoAck=false still works (you haven't > broken anything) > * I would add: Validate existing ProcessingGuarantee test for AtMostOnce, > AtLeastOnce, ExactlyOnce still works (when autoAck=true) Nice, I added to PIP. Thanks, Baodi Shi > 2022年5月30日 22:0011,Asaf Mesika <asaf.mes...@gmail.com> 写道: > > Thanks for applying the fixes. > > 1. Regarding > >> >> - EFFECTIVELY_ONCE: The message is acknowledged *after* the function >> finished execution. Depends on pulsar deduplication, and provides >> end-to-end effectively once processing. >> >> I'm not entirely sure that is accurate. The Effectively-Once as I > understand it is achieved using transactions, thus the consumption of that > message and production of any messages, as a result, are considered one > atomic unit - either message acknowledged and messages produced or neither. > > 2. Regarding > >> >> 1. Indication of autoAck is deprecated, and not use it in the code. >> (and also Function.proto) >> >> * I would issue a WARN when reading configuring the function (thus emitted > once) when the user actively configured autoAck=false and warn them that > this configuration is deprecated and they should switch to the MANUAL > ProcessingGuarantee configuration option. > > 3. Regarding > >> >> 1. When the delivery semantic is ATMOST_ONCE, the message will be >> acked immediately after receiving the message, no longer affected by the >> autoAck configuration. >> >> I suggest you clarify what existing behavior remains for backward > compatibility with the appropriate comment that this is deprecated and will > be removed. > > 4. Regarding > >> >> 1. >> >> When user call record.ack() in function, just ProcessingGuarantees == >> MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user >> call record.ack() is invalid(print warn log). >> >> That might blast WARN messages to the user. Perhaps save the fact that you > have printed a WARN message once and only print when the variable is not > set? > > 5. Regarding Test Plan > * I would add: Validate the test of autoAck=false still works (you haven't > broken anything) > * I would add: Validate existing ProcessingGuarantee test for AtMostOnce, > AtLeastOnce, ExactlyOnce still works (when autoAck=true) > > > > On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolong...@icloud.com.invalid> > wrote: > >> Hi, Mesika. >> >> Thanks review. >> >>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE. >> As >>>> you carefully explained, ProcessingGuarantee comes does to the fact that >>>> the function executor calls acknowledge, in specific timing: >> >> >> Added, Refer to the latest pip. >> https://github.com/apache/pulsar/issues/15560 >> >>>> 3. Removing autoAck from Function Config breaks backward compatibility, >>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar >>>> release handle breaking change? >> >> As suggested by @neng, They will be marked as deprecated first and clearly >> stated in the documentation. Remove it after 2~3 release. >> >>>> 4. Regarding Implementation (1), isn't the class itself >>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I >>>> understand how you use that enum value *inside* the class/ >> >> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest PIP >> API Changes(3) >> >> Thanks, >> Baodi Shi >> >>> 2022年5月30日 12:5128,Rui Fu <r...@apache.org> 写道: >>> >>> Hi Baodi, >>> >>> Nice work. Put some suggestions below, ptal. >>> >>> 1. API changes should also contain the changes of `Function.proto`, >> including new `ProcessingGuarantees` option and `autoAck`. >>> 2. Please be sure the other language runtimes (like Python, Golang) do >> support similar `record.ack()` function from the context, if no, it might >> have some API changes for different runtime we well. >>> >>> >>> Best, >>> >>> Rui Fu >>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <asaf.mes...@gmail.com>,写道: >>>> 1. "Added NONE delivery semantics and delete autoAck config." >>>> - Added --> add >>>> >>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE. >> As >>>> you carefully explained, ProcessingGuarantee comes does to the fact that >>>> the function executor calls acknowledge, in specific timing: >>>> - `AT_MOST_ONCE` - When the message is read by the client, it is >>>> immediately acknowledged and only then the function is executed, thus >>>> guaranteeing it will not run more than once >>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function >> finished >>>> execution, thus it will be run at least once. >>>> - `MANUAL` - Signals to the user that it is up to them to acknowledge >> the >>>> message, inside the function. >>>> >>>> I think if you couple that change with adding the explanation I wrote >>>> above to the documentation it will become crystal clear (hopefully) >> what is >>>> a Processing Guarantee exactly and what each value signifies. >>>> >>>> 3. Removing autoAck from Function Config breaks backward compatibility, >>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar >>>> release handle breaking change? >>>> >>>> 4. Regarding Implementation (1), isn't the class itself >>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I >>>> understand how you use that enum value *inside* the class/ >>>> >>>> >>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <freen...@gmail.com> wrote: >>>> >>>>> Some suggestions: >>>>> >>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the >> code. >>>>> And documented clearly it's deprecated for the following 2~3 release. >> And >>>>> then delete it. >>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and >>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it >> defaults >>>>> to ATLEAST_ONCE. >>>>> 3. Need to let users know the behavior when they call `record.ack()` >> inside >>>>> the function implementation. >>>>> >>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolong...@icloud.com >> .invalid> >>>>> wrote: >>>>> >>>>>> Hi Pulsar community, >>>>>> >>>>>> I open a https://github.com/apache/pulsar/issues/15560 for Function >> add >>>>>> NONE delivery semantics >>>>>> >>>>>> Let me know what you think. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Baodi Shi >>>>>> >>>>>> >>>>>> ## Motivation >>>>>> >>>>>> Currently Function supports three delivery semantics, and also >> provides >>>>>> autoAck to control whether to automatically ack. >>>>>> Because autoAck affects the delivery semantics of Function, it can be >>>>>> confusing for users to understand the relationship between these two >>>>>> parameters. >>>>>> >>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE` and >>>>>> `autoAck == false`, then the framework will not help the user to ack >>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`. >>>>>> >>>>>> The delivery semantics provided by Function should be clear. When the >>>>> user >>>>>> sets the guarantees, the framework should ensure point-to-point >> semantic >>>>>> processing and cannot be affected by other parameters. >>>>>> >>>>>> ## Goal >>>>>> >>>>>> Added `NONE` delivery semantics and delete `autoAck` config. >>>>>> >>>>>> The original intention of `autoAck` semantics is that users want to >>>>>> control the timing of ack by themselves. When autoAck == false, the >>>>>> processing semantics provided by the framework should be invalid. >> Then we >>>>>> can add `NONE` processing semantics to replace the autoAck == false >>>>>> scenario. >>>>>> >>>>>> When the user configuration `ProcessingGuarantees == NONE`, the >> framework >>>>>> does not help the user to do any ack operations, and the ack is left >> to >>>>> the >>>>>> user to handle. In other cases, the framework guarantees processing >>>>>> semantics. >>>>>> >>>>>> ## API Changes >>>>>> 1. Add `NONE` type to ProcessingGuarantees >>>>>> ``` java >>>>>> public enum ProcessingGuarantees { >>>>>> ATLEAST_ONCE, >>>>>> ATMOST_ONCE, >>>>>> EFFECTIVELY_ONCE, >>>>>> NONE >>>>>> } >>>>>> ``` >>>>>> >>>>>> 2. Delete autoAck config in FunctionConfig >>>>>> ``` java >>>>>> public class FunctionConfig { >>>>>> - private Boolean autoAck; >>>>>> } >>>>>> ``` >>>>>> >>>>>> ## Implementation >>>>>> >>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and >>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != >> NONE` >>>>>> can be ack. >>>>>> >>>>>> < >>>>>> >>>>> >> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276 >>>>>>> >>>>>> >>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be >> acked >>>>>> immediately after receiving the message, no longer affected by the >>>>> autoAck >>>>>> configuration. >>>>>> >>>>>> >>>>>> >>>>> >> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276 >>>>>> >>>>>> 3. When user call `record.ack()` in function, just >> `ProcessingGuarantees >>>>>> == NONE` can be work. >>>>>> >>>>>> ## Plan test >>>>>> The main test and assert is that when ProcessingGuarantees == NONE, >> the >>>>>> function framework will not do any ack operations for the user. >>>>>> >>>>>> ## Compatibility >>>>>> 1. This change will invalidate the user's setting of autoAck, which >>>>> should >>>>>> be explained in the documentation and provide parameter verification >> to >>>>>> remind the user. >>>>>> 2. Runtimes of other languages need to maintain consistent processing >>>>>> logic (python, go). >>>>>> >>>>>> >>>>>> >>>>> >>>>> -- >>>>> Best Regards, >>>>> Neng >>>>> >> >>