Hi Alan,

Nicely written and makes sense. The only feedback I have is around the naming of the generalization, e.g. "Specifically, PythonCalcSplitRuleBase will be generalized into RemoteCalcSplitRuleBase." This naming seems to imply that all async functions are remote. I wonder if we can find another name that doesn't carry that connotation, maybe AsyncCalcSplitRuleBase. (An AsyncCalcSplitRuleBase that handles both Python and async functions seems reasonable.)

Cheers,
Jim

On Wed, Dec 6, 2023 at 5:45 PM Alan Sheinberg <asheinb...@confluent.io.invalid> wrote:

> I'd like to start a discussion of FLIP-400: AsyncScalarFunction for
> asynchronous scalar function support [1]
>
> This feature proposes adding a new UDF type AsyncScalarFunction which is
> invoked just like a normal ScalarFunction, but is implemented with an
> asynchronous eval method. I had brought this up including the motivation
> in a previous discussion thread [2].
>
> The purpose is to achieve high throughput scalar function UDFs while
> allowing that an individual call may have high latency. It allows scaling
> up the parallelism of just these calls without having to increase the
> parallelism of the whole query (which could be rather resource
> inefficient).
>
> In practice, it should enable SQL integration with external services and
> systems, which Flink has limited support for at the moment. It should also
> allow easier integration with existing libraries which use asynchronous
> APIs.
>
> Looking forward to your feedback and suggestions.
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support
>
> [2] https://lists.apache.org/thread/bn153gmcobr41x2nwgodvmltlk810hzs
>
> Thanks,
> Alan