Hi Baodi,

Nice work. Put some suggestions below, ptal.

1. API changes should also contain the changes of `Function.proto`, including 
new `ProcessingGuarantees` option and `autoAck`.
2. Please be sure the other language runtimes (like Python, Golang) do support 
similar `record.ack()` function from the context, if no, it might have some API 
changes for different runtime we well.


Best,

Rui Fu
在 2022年5月29日 +0800 22:18,Asaf Mesika <asaf.mes...@gmail.com>,写道:
> 1. "Added NONE delivery semantics and delete autoAck config."
> - Added --> add
>
> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE. As
> you carefully explained, ProcessingGuarantee comes does to the fact that
> the function executor calls acknowledge, in specific timing:
> - `AT_MOST_ONCE` - When the message is read by the client, it is
> immediately acknowledged and only then the function is executed, thus
> guaranteeing it will not run more than once
> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function finished
> execution, thus it will be run at least once.
> - `MANUAL` - Signals to the user that it is up to them to acknowledge the
> message, inside the function.
>
> I think if you couple that change with adding the explanation I wrote
> above to the documentation it will become crystal clear (hopefully) what is
> a Processing Guarantee exactly and what each value signifies.
>
> 3. Removing autoAck from Function Config breaks backward compatibility,
> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
> release handle breaking change?
>
> 4. Regarding Implementation (1), isn't the class itself
> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> understand how you use that enum value *inside* the class/
>
>
> On Fri, May 27, 2022 at 1:08 AM Neng Lu <freen...@gmail.com> wrote:
>
> > Some suggestions:
> >
> > 1. Instead of deleting the `autoAck`, keep it but not use it in the code.
> > And documented clearly it's deprecated for the following 2~3 release. And
> > then delete it.
> > 2. For `PulsarSinkAtLeastOnceProcessor` and
> > `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it defaults
> > to ATLEAST_ONCE.
> > 3. Need to let users know the behavior when they call `record.ack()` inside
> > the function implementation.
> >
> > On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolong...@icloud.com.invalid>
> > wrote:
> >
> > > Hi Pulsar community,
> > >
> > > I open a https://github.com/apache/pulsar/issues/15560 for Function add
> > > NONE delivery semantics
> > >
> > > Let me know what you think.
> > >
> > >
> > > Thanks,
> > > Baodi Shi
> > >
> > >
> > > ## Motivation
> > >
> > > Currently Function supports three delivery semantics, and also provides
> > > autoAck to control whether to automatically ack.
> > > Because autoAck affects the delivery semantics of Function, it can be
> > > confusing for users to understand the relationship between these two
> > > parameters.
> > >
> > > For example, when the user configures `Guarantees == ATMOST_ONCE` and
> > > `autoAck == false`, then the framework will not help the user to ack
> > > messages, and the processing semantics may become `ATLEAST_ONCE`.
> > >
> > > The delivery semantics provided by Function should be clear. When the
> > user
> > > sets the guarantees, the framework should ensure point-to-point semantic
> > > processing and cannot be affected by other parameters.
> > >
> > > ## Goal
> > >
> > > Added `NONE` delivery semantics and delete `autoAck` config.
> > >
> > > The original intention of `autoAck` semantics is that users want to
> > > control the timing of ack by themselves. When autoAck == false, the
> > > processing semantics provided by the framework should be invalid. Then we
> > > can add `NONE` processing semantics to replace the autoAck == false
> > > scenario.
> > >
> > > When the user configuration `ProcessingGuarantees == NONE`, the framework
> > > does not help the user to do any ack operations, and the ack is left to
> > the
> > > user to handle. In other cases, the framework guarantees processing
> > > semantics.
> > >
> > > ## API Changes
> > > 1. Add `NONE` type to ProcessingGuarantees
> > > ``` java
> > > public enum ProcessingGuarantees {
> > > ATLEAST_ONCE,
> > > ATMOST_ONCE,
> > > EFFECTIVELY_ONCE,
> > > NONE
> > > }
> > > ```
> > >
> > > 2. Delete autoAck config in FunctionConfig
> > > ``` java
> > > public class FunctionConfig {
> > > - private Boolean autoAck;
> > > }
> > > ```
> > >
> > > ## Implementation
> > >
> > > 1. In `PulsarSinkAtLeastOnceProcessor` and
> > > `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != NONE`
> > > can be ack.
> > >
> > > <
> > >
> > https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
> > > >
> > >
> > > 2. When the delivery semantic is `ATMOST_ONCE`, the message will be acked
> > > immediately after receiving the message, no longer affected by the
> > autoAck
> > > configuration.
> > >
> > >
> > >
> > https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
> > >
> > > 3. When user call `record.ack()` in function, just `ProcessingGuarantees
> > > == NONE` can be work.
> > >
> > > ## Plan test
> > > The main test and assert is that when ProcessingGuarantees == NONE, the
> > > function framework will not do any ack operations for the user.
> > >
> > > ## Compatibility
> > > 1. This change will invalidate the user's setting of autoAck, which
> > should
> > > be explained in the documentation and provide parameter verification to
> > > remind the user.
> > > 2. Runtimes of other languages need to maintain consistent processing
> > > logic (python, go).
> > >
> > >
> > >
> >
> > --
> > Best Regards,
> > Neng
> >

Reply via email to