Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-20 Thread David Anderson
I'm delighted to see the progress on this. This is going to be a major enabler for some important use cases. The proposed simplifications (global config and ordered mode) for V1 make a lot of sense to me. +1 David On Wed, Dec 20, 2023 at 12:31 PM Alan Sheinberg wrote: > Thanks for that feedbac

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-20 Thread Alan Sheinberg
Thanks for that feedback Lincoln, > Only one question with the async `timeout` parameter[1](since I > haven't seen the POC code), current description is: 'The time which can > pass before a restart strategy is triggered', > but in the previous flip-232[2] and flip-234[3], in retry scenario, this > t

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-20 Thread Lincoln Lee
+1 for this useful feature! Hope this reply isn't too late. Agree that we start with global async-scalar configuration and ordered mode first. @Alan Only one question with the async `timeout` parameter[1](since I haven't seen the POC code), current description is: 'The time which can pass before a

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-19 Thread Alan Sheinberg
Thanks for the comments Timo. > Can you remove the necessary parts? Esp.: @Override > public Set<FunctionRequirement> getRequirements() { > return Collections.singleton(FunctionRequirement.ORDERED); > } I removed this section from the FLIP since presumably, there's no use in adding to the p

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-19 Thread Timo Walther
> I would be totally fine with the first version only having ORDERED > mode. For a v2, we could attempt to do the next most conservative > thing Sounds good to me. I also checked AsyncWaitOperator and could not find an access of StreamRecord's timestamp but only watermarks. But as we said, let's

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-18 Thread Alan Sheinberg
Thanks for the helpful comments, Xuyang and Timo. > @Timo, @Alan: IIUC, there seems to be something wrong here. Take kafka as > source and mysql as sink as an example. > Although kafka is an append-only source, one of its fields is used as pk > when writing to mysql. If async udx is executed > in a

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-18 Thread Timo Walther
Hi Xuyang and Alan, thanks for this productive discussion. > Would it make a difference if it were exposed by the explain @Alan: I think this is a great idea. +1 on exposing the sync/async behavior through EXPLAIN. > Is there an easy way to determine if the output of an async function > would

Re:Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-17 Thread Xuyang
Hi, Alan and Timo. Thanks for your reply. >Would it make a difference if it were exposed by the explain >method (the operator having "syncMode" vs not)? @Alan: I think this is a good way to tell the user what mode these async udx are currently in. >A regular SQL user doesn't care whether the funct

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-15 Thread Alan Sheinberg
Thanks for the replies everyone. My responses are inline: About the configs, what do you think using hints as mentioned in [1]. @Aitozi: I think hints could be a good way to do this, similar to lookup joins or the proposal in FLIP-313. One benefit of hints is that they allow for the highest gra

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-15 Thread Timo Walther
1. Override the function `getRequirements` in `AsyncScalarFunction` > If the user overrides `requirements()` to omit the `ORDERED` > requirement, do we allow the operator to return out-of-order results > or should it fall back on `AsyncOutputMode.ALLOW_UNORDERED` type > behavior (where we allow o
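
The ordered versus out-of-order question above can be illustrated with a small sketch. This is not Flink code and not the operator's actual mechanism: `OrderedEmitSketch`, `emitInOrder`, and `demo` are hypothetical names, and the sketch only assumes that in-flight calls are tracked as futures in input order. It shows that waiting on the futures in that order yields results in input order even when a later call finishes first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration of ordered emission; not part of Flink.
class OrderedEmitSketch {

    // Waits on the futures in input order, so the output order matches the
    // input order regardless of which call completed first.
    static <T> List<T> emitInOrder(List<CompletableFuture<T>> inFlight) {
        List<T> out = new ArrayList<>();
        for (CompletableFuture<T> future : inFlight) {
            out.add(future.join());
        }
        return out;
    }

    // Two in-flight calls where the second one completes before the first.
    static List<String> demo() {
        CompletableFuture<String> first = new CompletableFuture<>();
        CompletableFuture<String> second = new CompletableFuture<>();
        List<CompletableFuture<String>> inFlight = List.of(first, second);
        second.complete("b"); // the later record finishes first
        first.complete("a");
        return emitInOrder(inFlight);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

An unordered mode would instead emit each result as soon as its future completes, which is where the concerns raised in this thread about changelog and primary-key semantics come in.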

Re:Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-14 Thread Xuyang
Hi, Alan. Thanks for driving this. Using async to improve throughput has been done on lookup join, and the improvement effect is obvious, so I think it makes sense to support async scalar function. Big +1 for this FLIP. I have some questions below. 1. Override the function `getRequirements`

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-14 Thread Aitozi
Hi Alan, Nice FLIP, I also explored leveraging the async table function[1] to improve throughput before. About the configs, what do you think about using hints as mentioned in [1]. [1]: https://cwiki.apache.org/confluence/display/FLINK/FLIP-313%3A+Add+support+of+User+Defined+AsyncTableFunction

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-14 Thread Alan Sheinberg
Thanks Piotr and Timo for your responses. To address your comments Timo: 1) Configuration Configuration keys like `table.exec.async-scalar.catalog.db.func-name.buffer-capacity` are currently not supported in the configuration stack. The key space > should remain constant. Only a constant key s

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-14 Thread Timo Walther
Hi Alan, thanks for proposing this FLIP. It's a great addition to Flink and has been requested multiple times. It will be in particular interesting for accessing REST endpoints and other remote services. Great that we can generalize and reuse parts of the Python planner rules and code for th

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-11 Thread Piotr Nowojski
+1 to the idea, I don't have any comments. Best, Piotrek On Thu, 7 Dec 2023 at 07:15 Alan Sheinberg wrote: > > > > Nicely written and makes sense. The only feedback I have is around the > > naming of the generalization, e.g. "Specifically, PythonCalcSplitRuleBase > > will be generalized into

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-06 Thread Alan Sheinberg
> > Nicely written and makes sense. The only feedback I have is around the > naming of the generalization, e.g. "Specifically, PythonCalcSplitRuleBase > will be generalized into RemoteCalcSplitRuleBase." This naming seems to > imply/suggest that all Async functions are remote. I wonder if we can

Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-06 Thread Jim Hughes
Hi Alan, Nicely written and makes sense. The only feedback I have is around the naming of the generalization, e.g. "Specifically, PythonCalcSplitRuleBase will be generalized into RemoteCalcSplitRuleBase." This naming seems to imply/suggest that all Async functions are remote. I wonder if we can

[DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-06 Thread Alan Sheinberg
I'd like to start a discussion of FLIP-400: AsyncScalarFunction for asynchronous scalar function support [1] This feature proposes adding a new UDF type AsyncScalarFunction which is invoked just like a normal ScalarFunction, but is implemented with an asynchronous eval method. I had brought this
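
To make the idea of "an asynchronous eval method" concrete, here is a minimal sketch in plain Java. It deliberately does not extend any Flink class: `AsyncUpperCaseSketch` and `callAndWait` are hypothetical names, and the shape of `eval` (a future to complete plus the input argument) is an assumption for illustration rather than the signature settled in the FLIP. The point is only that `eval` returns immediately and delivers its result later by completing the future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in, not the Flink class: it only mirrors the idea of an
// eval method that completes a future instead of returning a value.
class AsyncUpperCaseSketch {

    // Asynchronous eval: returns right away; the result arrives via the future.
    void eval(CompletableFuture<String> result, String input) {
        CompletableFuture
                .supplyAsync(() -> input.toUpperCase()) // stands in for a remote call
                .whenComplete((value, error) -> {
                    if (error != null) {
                        result.completeExceptionally(error);
                    } else {
                        result.complete(value);
                    }
                });
    }

    // Helper for the demo: invoke eval and wait, failing after a timeout.
    static String callAndWait(String input) {
        CompletableFuture<String> result = new CompletableFuture<>();
        new AsyncUpperCaseSketch().eval(result, input);
        return result.orTimeout(5, TimeUnit.SECONDS).join();
    }

    public static void main(String[] args) {
        System.out.println(callAndWait("flink"));
    }
}
```

In a real job the caller would not block on the future. The blocking helper exists only so the sketch can be run standalone.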