Hi, Mesika.

Thanks review.

>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE. As
>> you carefully explained, ProcessingGuarantee comes does to the fact that
>> the function executor calls acknowledge, in specific timing:


Added, Refer to the latest pip. https://github.com/apache/pulsar/issues/15560

>> 3. Removing autoAck from Function Config breaks backward compatibility,
>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>> release handle breaking change?

As suggested by @neng, They will be marked as deprecated first and clearly 
stated in the documentation. Remove it after 2~3 release.

>> 4. Regarding Implementation (1), isn't the class itself
>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>> understand how you use that enum value *inside* the class/

I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest PIP API 
Changes(3)

Thanks,
Baodi Shi

> 2022年5月30日 12:5128,Rui Fu <r...@apache.org> 写道:
> 
> Hi Baodi,
> 
> Nice work. Put some suggestions below, ptal.
> 
> 1. API changes should also contain the changes of `Function.proto`, including 
> new `ProcessingGuarantees` option and `autoAck`.
> 2. Please be sure the other language runtimes (like Python, Golang) do 
> support similar `record.ack()` function from the context, if no, it might 
> have some API changes for different runtime we well.
> 
> 
> Best,
> 
> Rui Fu
> 在 2022年5月29日 +0800 22:18,Asaf Mesika <asaf.mes...@gmail.com>,写道:
>> 1. "Added NONE delivery semantics and delete autoAck config."
>> - Added --> add
>> 
>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE. As
>> you carefully explained, ProcessingGuarantee comes does to the fact that
>> the function executor calls acknowledge, in specific timing:
>> - `AT_MOST_ONCE` - When the message is read by the client, it is
>> immediately acknowledged and only then the function is executed, thus
>> guaranteeing it will not run more than once
>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function finished
>> execution, thus it will be run at least once.
>> - `MANUAL` - Signals to the user that it is up to them to acknowledge the
>> message, inside the function.
>> 
>> I think if you couple that change with adding the explanation I wrote
>> above to the documentation it will become crystal clear (hopefully) what is
>> a Processing Guarantee exactly and what each value signifies.
>> 
>> 3. Removing autoAck from Function Config breaks backward compatibility,
>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>> release handle breaking change?
>> 
>> 4. Regarding Implementation (1), isn't the class itself
>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>> understand how you use that enum value *inside* the class/
>> 
>> 
>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <freen...@gmail.com> wrote:
>> 
>>> Some suggestions:
>>> 
>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the code.
>>> And documented clearly it's deprecated for the following 2~3 release. And
>>> then delete it.
>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it defaults
>>> to ATLEAST_ONCE.
>>> 3. Need to let users know the behavior when they call `record.ack()` inside
>>> the function implementation.
>>> 
>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolong...@icloud.com.invalid>
>>> wrote:
>>> 
>>>> Hi Pulsar community,
>>>> 
>>>> I open a https://github.com/apache/pulsar/issues/15560 for Function add
>>>> NONE delivery semantics
>>>> 
>>>> Let me know what you think.
>>>> 
>>>> 
>>>> Thanks,
>>>> Baodi Shi
>>>> 
>>>> 
>>>> ## Motivation
>>>> 
>>>> Currently Function supports three delivery semantics, and also provides
>>>> autoAck to control whether to automatically ack.
>>>> Because autoAck affects the delivery semantics of Function, it can be
>>>> confusing for users to understand the relationship between these two
>>>> parameters.
>>>> 
>>>> For example, when the user configures `Guarantees == ATMOST_ONCE` and
>>>> `autoAck == false`, then the framework will not help the user to ack
>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
>>>> 
>>>> The delivery semantics provided by Function should be clear. When the
>>> user
>>>> sets the guarantees, the framework should ensure point-to-point semantic
>>>> processing and cannot be affected by other parameters.
>>>> 
>>>> ## Goal
>>>> 
>>>> Added `NONE` delivery semantics and delete `autoAck` config.
>>>> 
>>>> The original intention of `autoAck` semantics is that users want to
>>>> control the timing of ack by themselves. When autoAck == false, the
>>>> processing semantics provided by the framework should be invalid. Then we
>>>> can add `NONE` processing semantics to replace the autoAck == false
>>>> scenario.
>>>> 
>>>> When the user configuration `ProcessingGuarantees == NONE`, the framework
>>>> does not help the user to do any ack operations, and the ack is left to
>>> the
>>>> user to handle. In other cases, the framework guarantees processing
>>>> semantics.
>>>> 
>>>> ## API Changes
>>>> 1. Add `NONE` type to ProcessingGuarantees
>>>> ``` java
>>>> public enum ProcessingGuarantees {
>>>> ATLEAST_ONCE,
>>>> ATMOST_ONCE,
>>>> EFFECTIVELY_ONCE,
>>>> NONE
>>>> }
>>>> ```
>>>> 
>>>> 2. Delete autoAck config in FunctionConfig
>>>> ``` java
>>>> public class FunctionConfig {
>>>> - private Boolean autoAck;
>>>> }
>>>> ```
>>>> 
>>>> ## Implementation
>>>> 
>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != NONE`
>>>> can be ack.
>>>> 
>>>> <
>>>> 
>>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
>>>>> 
>>>> 
>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be acked
>>>> immediately after receiving the message, no longer affected by the
>>> autoAck
>>>> configuration.
>>>> 
>>>> 
>>>> 
>>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>>>> 
>>>> 3. When user call `record.ack()` in function, just `ProcessingGuarantees
>>>> == NONE` can be work.
>>>> 
>>>> ## Plan test
>>>> The main test and assert is that when ProcessingGuarantees == NONE, the
>>>> function framework will not do any ack operations for the user.
>>>> 
>>>> ## Compatibility
>>>> 1. This change will invalidate the user's setting of autoAck, which
>>> should
>>>> be explained in the documentation and provide parameter verification to
>>>> remind the user.
>>>> 2. Runtimes of other languages need to maintain consistent processing
>>>> logic (python, go).
>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> Best Regards,
>>> Neng
>>> 

Reply via email to