Thanks for applying the fixes.

1. Regarding

>
>    - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
>    finished execution. Depends on pulsar deduplication, and provides
>    end-to-end effectively once processing.
>
> I'm not entirely sure that is accurate. The Effectively-Once as I
understand it is achieved using transactions, thus the consumption of that
message and production of any messages, as a result, are considered one
atomic unit - either message acknowledged and messages produced or neither.

2. Regarding

>
>    1. Indication of autoAck is deprecated, and not use it in the code.
>    (and also Function.proto)
>
> * I would issue a WARN when reading configuring the function (thus emitted
once) when the user actively configured autoAck=false and warn them that
this configuration is deprecated and they should switch to the MANUAL
ProcessingGuarantee configuration option.

3. Regarding

>
>    1. When the delivery semantic is ATMOST_ONCE, the message will be
>    acked immediately after receiving the message, no longer affected by the
>    autoAck configuration.
>
> I suggest you clarify what existing behavior remains for backward
compatibility with the appropriate comment that this is deprecated and will
be removed.

4. Regarding

>
>    1.
>
>    When user call record.ack() in function, just ProcessingGuarantees ==
>    MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
>    call record.ack() is invalid(print warn log).
>
> That might blast WARN messages to the user. Perhaps save the fact that you
have printed a WARN message once and only print when the variable is not
set?

5. Regarding Test Plan
* I would add: Validate the test of autoAck=false still works (you haven't
broken anything)
* I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
AtLeastOnce, ExactlyOnce still works (when autoAck=true)



On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolong...@icloud.com.invalid>
wrote:

> Hi, Mesika.
>
> Thanks review.
>
> >> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE.
> As
> >> you carefully explained, ProcessingGuarantee comes does to the fact that
> >> the function executor calls acknowledge, in specific timing:
>
>
> Added, Refer to the latest pip.
> https://github.com/apache/pulsar/issues/15560
>
> >> 3. Removing autoAck from Function Config breaks backward compatibility,
> >> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
> >> release handle breaking change?
>
> As suggested by @neng, They will be marked as deprecated first and clearly
> stated in the documentation. Remove it after 2~3 release.
>
> >> 4. Regarding Implementation (1), isn't the class itself
> >> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >> understand how you use that enum value *inside* the class/
>
> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest PIP
> API Changes(3)
>
> Thanks,
> Baodi Shi
>
> > 2022年5月30日 12:5128,Rui Fu <r...@apache.org> 写道:
> >
> > Hi Baodi,
> >
> > Nice work. Put some suggestions below, ptal.
> >
> > 1. API changes should also contain the changes of `Function.proto`,
> including new `ProcessingGuarantees` option and `autoAck`.
> > 2. Please be sure the other language runtimes (like Python, Golang) do
> support similar `record.ack()` function from the context, if no, it might
> have some API changes for different runtime we well.
> >
> >
> > Best,
> >
> > Rui Fu
> > 在 2022年5月29日 +0800 22:18,Asaf Mesika <asaf.mes...@gmail.com>,写道:
> >> 1. "Added NONE delivery semantics and delete autoAck config."
> >> - Added --> add
> >>
> >> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE.
> As
> >> you carefully explained, ProcessingGuarantee comes does to the fact that
> >> the function executor calls acknowledge, in specific timing:
> >> - `AT_MOST_ONCE` - When the message is read by the client, it is
> >> immediately acknowledged and only then the function is executed, thus
> >> guaranteeing it will not run more than once
> >> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
> finished
> >> execution, thus it will be run at least once.
> >> - `MANUAL` - Signals to the user that it is up to them to acknowledge
> the
> >> message, inside the function.
> >>
> >> I think if you couple that change with adding the explanation I wrote
> >> above to the documentation it will become crystal clear (hopefully)
> what is
> >> a Processing Guarantee exactly and what each value signifies.
> >>
> >> 3. Removing autoAck from Function Config breaks backward compatibility,
> >> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
> >> release handle breaking change?
> >>
> >> 4. Regarding Implementation (1), isn't the class itself
> >> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >> understand how you use that enum value *inside* the class/
> >>
> >>
> >> On Fri, May 27, 2022 at 1:08 AM Neng Lu <freen...@gmail.com> wrote:
> >>
> >>> Some suggestions:
> >>>
> >>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
> code.
> >>> And documented clearly it's deprecated for the following 2~3 release.
> And
> >>> then delete it.
> >>> 2. For `PulsarSinkAtLeastOnceProcessor` and
> >>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
> defaults
> >>> to ATLEAST_ONCE.
> >>> 3. Need to let users know the behavior when they call `record.ack()`
> inside
> >>> the function implementation.
> >>>
> >>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolong...@icloud.com
> .invalid>
> >>> wrote:
> >>>
> >>>> Hi Pulsar community,
> >>>>
> >>>> I open a https://github.com/apache/pulsar/issues/15560 for Function
> add
> >>>> NONE delivery semantics
> >>>>
> >>>> Let me know what you think.
> >>>>
> >>>>
> >>>> Thanks,
> >>>> Baodi Shi
> >>>>
> >>>>
> >>>> ## Motivation
> >>>>
> >>>> Currently Function supports three delivery semantics, and also
> provides
> >>>> autoAck to control whether to automatically ack.
> >>>> Because autoAck affects the delivery semantics of Function, it can be
> >>>> confusing for users to understand the relationship between these two
> >>>> parameters.
> >>>>
> >>>> For example, when the user configures `Guarantees == ATMOST_ONCE` and
> >>>> `autoAck == false`, then the framework will not help the user to ack
> >>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
> >>>>
> >>>> The delivery semantics provided by Function should be clear. When the
> >>> user
> >>>> sets the guarantees, the framework should ensure point-to-point
> semantic
> >>>> processing and cannot be affected by other parameters.
> >>>>
> >>>> ## Goal
> >>>>
> >>>> Added `NONE` delivery semantics and delete `autoAck` config.
> >>>>
> >>>> The original intention of `autoAck` semantics is that users want to
> >>>> control the timing of ack by themselves. When autoAck == false, the
> >>>> processing semantics provided by the framework should be invalid.
> Then we
> >>>> can add `NONE` processing semantics to replace the autoAck == false
> >>>> scenario.
> >>>>
> >>>> When the user configuration `ProcessingGuarantees == NONE`, the
> framework
> >>>> does not help the user to do any ack operations, and the ack is left
> to
> >>> the
> >>>> user to handle. In other cases, the framework guarantees processing
> >>>> semantics.
> >>>>
> >>>> ## API Changes
> >>>> 1. Add `NONE` type to ProcessingGuarantees
> >>>> ``` java
> >>>> public enum ProcessingGuarantees {
> >>>> ATLEAST_ONCE,
> >>>> ATMOST_ONCE,
> >>>> EFFECTIVELY_ONCE,
> >>>> NONE
> >>>> }
> >>>> ```
> >>>>
> >>>> 2. Delete autoAck config in FunctionConfig
> >>>> ``` java
> >>>> public class FunctionConfig {
> >>>> - private Boolean autoAck;
> >>>> }
> >>>> ```
> >>>>
> >>>> ## Implementation
> >>>>
> >>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
> >>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees !=
> NONE`
> >>>> can be ack.
> >>>>
> >>>> <
> >>>>
> >>>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
> >>>>>
> >>>>
> >>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be
> acked
> >>>> immediately after receiving the message, no longer affected by the
> >>> autoAck
> >>>> configuration.
> >>>>
> >>>>
> >>>>
> >>>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
> >>>>
> >>>> 3. When user call `record.ack()` in function, just
> `ProcessingGuarantees
> >>>> == NONE` can be work.
> >>>>
> >>>> ## Plan test
> >>>> The main test and assert is that when ProcessingGuarantees == NONE,
> the
> >>>> function framework will not do any ack operations for the user.
> >>>>
> >>>> ## Compatibility
> >>>> 1. This change will invalidate the user's setting of autoAck, which
> >>> should
> >>>> be explained in the documentation and provide parameter verification
> to
> >>>> remind the user.
> >>>> 2. Runtimes of other languages need to maintain consistent processing
> >>>> logic (python, go).
> >>>>
> >>>>
> >>>>
> >>>
> >>> --
> >>> Best Regards,
> >>> Neng
> >>>
>
>

Reply via email to