Hi Baodi, Nice work. Put some suggestions below, ptal.
1. API changes should also contain the changes of `Function.proto`, including new `ProcessingGuarantees` option and `autoAck`. 2. Please be sure the other language runtimes (like Python, Golang) do support similar `record.ack()` function from the context, if no, it might have some API changes for different runtime we well. Best, Rui Fu 在 2022年5月29日 +0800 22:18,Asaf Mesika <asaf.mes...@gmail.com>,写道: > 1. "Added NONE delivery semantics and delete autoAck config." > - Added --> add > > 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE. As > you carefully explained, ProcessingGuarantee comes does to the fact that > the function executor calls acknowledge, in specific timing: > - `AT_MOST_ONCE` - When the message is read by the client, it is > immediately acknowledged and only then the function is executed, thus > guaranteeing it will not run more than once > - `AT_LEAST_ONCE` - Message is acknowledged *after* the function finished > execution, thus it will be run at least once. > - `MANUAL` - Signals to the user that it is up to them to acknowledge the > message, inside the function. > > I think if you couple that change with adding the explanation I wrote > above to the documentation it will become crystal clear (hopefully) what is > a Processing Guarantee exactly and what each value signifies. > > 3. Removing autoAck from Function Config breaks backward compatibility, > thus shouldn't this be strongly reflected in the PIP - how does Pulsar > release handle breaking change? > > 4. Regarding Implementation (1), isn't the class itself > PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I > understand how you use that enum value *inside* the class/ > > > On Fri, May 27, 2022 at 1:08 AM Neng Lu <freen...@gmail.com> wrote: > > > Some suggestions: > > > > 1. Instead of deleting the `autoAck`, keep it but not use it in the code. > > And documented clearly it's deprecated for the following 2~3 release. And > > then delete it. > > 2. For `PulsarSinkAtLeastOnceProcessor` and > > `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it defaults > > to ATLEAST_ONCE. > > 3. Need to let users know the behavior when they call `record.ack()` inside > > the function implementation. > > > > On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolong...@icloud.com.invalid> > > wrote: > > > > > Hi Pulsar community, > > > > > > I open a https://github.com/apache/pulsar/issues/15560 for Function add > > > NONE delivery semantics > > > > > > Let me know what you think. > > > > > > > > > Thanks, > > > Baodi Shi > > > > > > > > > ## Motivation > > > > > > Currently Function supports three delivery semantics, and also provides > > > autoAck to control whether to automatically ack. > > > Because autoAck affects the delivery semantics of Function, it can be > > > confusing for users to understand the relationship between these two > > > parameters. > > > > > > For example, when the user configures `Guarantees == ATMOST_ONCE` and > > > `autoAck == false`, then the framework will not help the user to ack > > > messages, and the processing semantics may become `ATLEAST_ONCE`. > > > > > > The delivery semantics provided by Function should be clear. When the > > user > > > sets the guarantees, the framework should ensure point-to-point semantic > > > processing and cannot be affected by other parameters. > > > > > > ## Goal > > > > > > Added `NONE` delivery semantics and delete `autoAck` config. > > > > > > The original intention of `autoAck` semantics is that users want to > > > control the timing of ack by themselves. When autoAck == false, the > > > processing semantics provided by the framework should be invalid. Then we > > > can add `NONE` processing semantics to replace the autoAck == false > > > scenario. > > > > > > When the user configuration `ProcessingGuarantees == NONE`, the framework > > > does not help the user to do any ack operations, and the ack is left to > > the > > > user to handle. In other cases, the framework guarantees processing > > > semantics. > > > > > > ## API Changes > > > 1. Add `NONE` type to ProcessingGuarantees > > > ``` java > > > public enum ProcessingGuarantees { > > > ATLEAST_ONCE, > > > ATMOST_ONCE, > > > EFFECTIVELY_ONCE, > > > NONE > > > } > > > ``` > > > > > > 2. Delete autoAck config in FunctionConfig > > > ``` java > > > public class FunctionConfig { > > > - private Boolean autoAck; > > > } > > > ``` > > > > > > ## Implementation > > > > > > 1. In `PulsarSinkAtLeastOnceProcessor` and > > > `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != NONE` > > > can be ack. > > > > > > < > > > > > https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276 > > > > > > > > > > 2. When the delivery semantic is `ATMOST_ONCE`, the message will be acked > > > immediately after receiving the message, no longer affected by the > > autoAck > > > configuration. > > > > > > > > > > > https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276 > > > > > > 3. When user call `record.ack()` in function, just `ProcessingGuarantees > > > == NONE` can be work. > > > > > > ## Plan test > > > The main test and assert is that when ProcessingGuarantees == NONE, the > > > function framework will not do any ack operations for the user. > > > > > > ## Compatibility > > > 1. This change will invalidate the user's setting of autoAck, which > > should > > > be explained in the documentation and provide parameter verification to > > > remind the user. > > > 2. Runtimes of other languages need to maintain consistent processing > > > logic (python, go). > > > > > > > > > > > > > -- > > Best Regards, > > Neng > >