Hi, Asaf.

Thanks review. 

>> I'm not entirely sure that is accurate. The Effectively-Once as I
> understand it is achieved using transactions, thus the consumption of that
> message and production of any messages, as a result, are considered one
> atomic unit - either message acknowledged and messages produced or neither.


Not using transactions now, I understand: EFFECTIVELY_ONCE = ATLEAST_ONCE + 
Message Deduplication.

@Neng Lu @Rui Fu Can help make sure?

>> I would issue a WARN when reading configuring the function (thus emitted
> once) when the user actively configured autoAck=false and warn them that
> this configuration is deprecated and they should switch to the MANUAL
> ProcessingGuarantee configuration option.


Added to API Change(2)

>> suggest you clarify what existing behavior remains for backward
> compatibility with the appropriate comment that this is deprecated and will
> be removed.

Yes, I have rewritten it, please see Implementation(1)

> 5. Regarding Test Plan
> * I would add: Validate the test of autoAck=false still works (you haven't
> broken anything)
> * I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
> AtLeastOnce, ExactlyOnce still works (when autoAck=true)


Nice, I added to PIP.


Thanks,
Baodi Shi

> 2022年5月30日 22:0011,Asaf Mesika <asaf.mes...@gmail.com> 写道:
> 
> Thanks for applying the fixes.
> 
> 1. Regarding
> 
>> 
>>   - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
>>   finished execution. Depends on pulsar deduplication, and provides
>>   end-to-end effectively once processing.
>> 
>> I'm not entirely sure that is accurate. The Effectively-Once as I
> understand it is achieved using transactions, thus the consumption of that
> message and production of any messages, as a result, are considered one
> atomic unit - either message acknowledged and messages produced or neither.
> 
> 2. Regarding
> 
>> 
>>   1. Indication of autoAck is deprecated, and not use it in the code.
>>   (and also Function.proto)
>> 
>> * I would issue a WARN when reading configuring the function (thus emitted
> once) when the user actively configured autoAck=false and warn them that
> this configuration is deprecated and they should switch to the MANUAL
> ProcessingGuarantee configuration option.
> 
> 3. Regarding
> 
>> 
>>   1. When the delivery semantic is ATMOST_ONCE, the message will be
>>   acked immediately after receiving the message, no longer affected by the
>>   autoAck configuration.
>> 
>> I suggest you clarify what existing behavior remains for backward
> compatibility with the appropriate comment that this is deprecated and will
> be removed.
> 
> 4. Regarding
> 
>> 
>>   1.
>> 
>>   When user call record.ack() in function, just ProcessingGuarantees ==
>>   MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
>>   call record.ack() is invalid(print warn log).
>> 
>> That might blast WARN messages to the user. Perhaps save the fact that you
> have printed a WARN message once and only print when the variable is not
> set?
> 
> 5. Regarding Test Plan
> * I would add: Validate the test of autoAck=false still works (you haven't
> broken anything)
> * I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
> 
> 
> 
> On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolong...@icloud.com.invalid>
> wrote:
> 
>> Hi, Mesika.
>> 
>> Thanks review.
>> 
>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE.
>> As
>>>> you carefully explained, ProcessingGuarantee comes does to the fact that
>>>> the function executor calls acknowledge, in specific timing:
>> 
>> 
>> Added, Refer to the latest pip.
>> https://github.com/apache/pulsar/issues/15560
>> 
>>>> 3. Removing autoAck from Function Config breaks backward compatibility,
>>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>>>> release handle breaking change?
>> 
>> As suggested by @neng, They will be marked as deprecated first and clearly
>> stated in the documentation. Remove it after 2~3 release.
>> 
>>>> 4. Regarding Implementation (1), isn't the class itself
>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>> understand how you use that enum value *inside* the class/
>> 
>> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest PIP
>> API Changes(3)
>> 
>> Thanks,
>> Baodi Shi
>> 
>>> 2022年5月30日 12:5128,Rui Fu <r...@apache.org> 写道:
>>> 
>>> Hi Baodi,
>>> 
>>> Nice work. Put some suggestions below, ptal.
>>> 
>>> 1. API changes should also contain the changes of `Function.proto`,
>> including new `ProcessingGuarantees` option and `autoAck`.
>>> 2. Please be sure the other language runtimes (like Python, Golang) do
>> support similar `record.ack()` function from the context, if no, it might
>> have some API changes for different runtime we well.
>>> 
>>> 
>>> Best,
>>> 
>>> Rui Fu
>>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <asaf.mes...@gmail.com>,写道:
>>>> 1. "Added NONE delivery semantics and delete autoAck config."
>>>> - Added --> add
>>>> 
>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE.
>> As
>>>> you carefully explained, ProcessingGuarantee comes does to the fact that
>>>> the function executor calls acknowledge, in specific timing:
>>>> - `AT_MOST_ONCE` - When the message is read by the client, it is
>>>> immediately acknowledged and only then the function is executed, thus
>>>> guaranteeing it will not run more than once
>>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
>> finished
>>>> execution, thus it will be run at least once.
>>>> - `MANUAL` - Signals to the user that it is up to them to acknowledge
>> the
>>>> message, inside the function.
>>>> 
>>>> I think if you couple that change with adding the explanation I wrote
>>>> above to the documentation it will become crystal clear (hopefully)
>> what is
>>>> a Processing Guarantee exactly and what each value signifies.
>>>> 
>>>> 3. Removing autoAck from Function Config breaks backward compatibility,
>>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>>>> release handle breaking change?
>>>> 
>>>> 4. Regarding Implementation (1), isn't the class itself
>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>> understand how you use that enum value *inside* the class/
>>>> 
>>>> 
>>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <freen...@gmail.com> wrote:
>>>> 
>>>>> Some suggestions:
>>>>> 
>>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
>> code.
>>>>> And documented clearly it's deprecated for the following 2~3 release.
>> And
>>>>> then delete it.
>>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
>>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
>> defaults
>>>>> to ATLEAST_ONCE.
>>>>> 3. Need to let users know the behavior when they call `record.ack()`
>> inside
>>>>> the function implementation.
>>>>> 
>>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolong...@icloud.com
>> .invalid>
>>>>> wrote:
>>>>> 
>>>>>> Hi Pulsar community,
>>>>>> 
>>>>>> I open a https://github.com/apache/pulsar/issues/15560 for Function
>> add
>>>>>> NONE delivery semantics
>>>>>> 
>>>>>> Let me know what you think.
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Baodi Shi
>>>>>> 
>>>>>> 
>>>>>> ## Motivation
>>>>>> 
>>>>>> Currently Function supports three delivery semantics, and also
>> provides
>>>>>> autoAck to control whether to automatically ack.
>>>>>> Because autoAck affects the delivery semantics of Function, it can be
>>>>>> confusing for users to understand the relationship between these two
>>>>>> parameters.
>>>>>> 
>>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE` and
>>>>>> `autoAck == false`, then the framework will not help the user to ack
>>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
>>>>>> 
>>>>>> The delivery semantics provided by Function should be clear. When the
>>>>> user
>>>>>> sets the guarantees, the framework should ensure point-to-point
>> semantic
>>>>>> processing and cannot be affected by other parameters.
>>>>>> 
>>>>>> ## Goal
>>>>>> 
>>>>>> Added `NONE` delivery semantics and delete `autoAck` config.
>>>>>> 
>>>>>> The original intention of `autoAck` semantics is that users want to
>>>>>> control the timing of ack by themselves. When autoAck == false, the
>>>>>> processing semantics provided by the framework should be invalid.
>> Then we
>>>>>> can add `NONE` processing semantics to replace the autoAck == false
>>>>>> scenario.
>>>>>> 
>>>>>> When the user configuration `ProcessingGuarantees == NONE`, the
>> framework
>>>>>> does not help the user to do any ack operations, and the ack is left
>> to
>>>>> the
>>>>>> user to handle. In other cases, the framework guarantees processing
>>>>>> semantics.
>>>>>> 
>>>>>> ## API Changes
>>>>>> 1. Add `NONE` type to ProcessingGuarantees
>>>>>> ``` java
>>>>>> public enum ProcessingGuarantees {
>>>>>> ATLEAST_ONCE,
>>>>>> ATMOST_ONCE,
>>>>>> EFFECTIVELY_ONCE,
>>>>>> NONE
>>>>>> }
>>>>>> ```
>>>>>> 
>>>>>> 2. Delete autoAck config in FunctionConfig
>>>>>> ``` java
>>>>>> public class FunctionConfig {
>>>>>> - private Boolean autoAck;
>>>>>> }
>>>>>> ```
>>>>>> 
>>>>>> ## Implementation
>>>>>> 
>>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
>>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees !=
>> NONE`
>>>>>> can be ack.
>>>>>> 
>>>>>> <
>>>>>> 
>>>>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
>>>>>>> 
>>>>>> 
>>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be
>> acked
>>>>>> immediately after receiving the message, no longer affected by the
>>>>> autoAck
>>>>>> configuration.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>>>>>> 
>>>>>> 3. When user call `record.ack()` in function, just
>> `ProcessingGuarantees
>>>>>> == NONE` can be work.
>>>>>> 
>>>>>> ## Plan test
>>>>>> The main test and assert is that when ProcessingGuarantees == NONE,
>> the
>>>>>> function framework will not do any ack operations for the user.
>>>>>> 
>>>>>> ## Compatibility
>>>>>> 1. This change will invalidate the user's setting of autoAck, which
>>>>> should
>>>>>> be explained in the documentation and provide parameter verification
>> to
>>>>>> remind the user.
>>>>>> 2. Runtimes of other languages need to maintain consistent processing
>>>>>> logic (python, go).
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> --
>>>>> Best Regards,
>>>>> Neng
>>>>> 
>> 
>> 

Reply via email to