Hey Shengkai,

Thank you for your observations. This proposal is mostly driven by Swapna,
but I can share my thoughts here as well; please find them inline.
Cheers,
Matyas

On Tue, Oct 14, 2025 at 3:02 AM Shengkai Fang <[email protected]> wrote:
>
> Hi, Matyas.
>
> Thanks for the proposal. I have some suggestions about it.
>
> 1. I'm wondering whether we could extend the SQL API to change how
> Python models are loaded. For example, we could allow users to write:
>
> ```
> CREATE MODEL my_pytorch_model
> WITH (
>   'type' = 'pytorch'
> ) LANGUAGE PYTHON;
> ```
>
> In this case, we wouldn't rely on Java SPI to load the Python model
> provider. However, I'm not sure whether Python has a similar mechanism
> to SPI that avoids hardcoding class paths.

This is an interesting idea. However, we are proposing the provider model
because it aligns with Flink's existing Java-based architecture for
discovering plugins: a Java entry point is needed to bootstrap the Python
code, and an SPI factory is the standard way to supply one (see the first
sketch below the quoted links).

> 2. Beam already supports TensorFlow, ONNX, and many built-in models.
> Can we reuse Beam's utilities to build Flink prediction functions[1]?

We can certainly learn from Beam's design, but reusing it directly would
pull in a very heavy dependency and would be hard to integrate cleanly
with Flink's native processing model.

> 3. It would be better if we introduced a PredictRuntimeContext to help
> users download required weight files.

This is a great idea and essential for usability. Just to double-check:
are you proposing an explicit PredictRuntimeContext that predict
functions use to download model files dynamically (second sketch below)?

> 4. In ML, users typically perform inference on batches of data.
> Therefore, per-record evaluation may not be necessary. How about we
> just introduce an API like [2]?

I agree completely. The row-by-row API is just a starting point; we
should prioritize efficient batch inference to get good performance for
real-world models (third sketch below).

> Best,
> Shengkai
>
> [1] https://beam.apache.org/documentation/ml/about-ml/
> [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-491%3A+BundledAggregateFunction+for+batched+aggregation
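To make my point on (1) a bit more concrete, here is a rough sketch of
the kind of Java entry point I have in mind. This is illustrative only:
the option names and the method shapes are assumptions I made up for
this mail, not the proposed API.

```java
import java.util.Map;

// Illustrative placeholder for the provider that launches the Python side.
class GenericPythonModelProvider {
    private final String module;
    private final String function;

    GenericPythonModelProvider(String module, String function) {
        this.module = module;
        this.function = function;
    }
}

public class GenericPythonModelProviderFactory {

    // Identifier that a model's options would reference to select this
    // factory once it is discovered on the classpath via Java SPI
    // (i.e. a META-INF/services registration).
    public String factoryIdentifier() {
        return "generic-python";
    }

    // The Java entry point: resolves the options and hands back the
    // provider that will bootstrap the user's Python predict function.
    public GenericPythonModelProvider createModelProvider(Map<String, String> options) {
        String module = options.get("python.module");     // assumed option name
        String function = options.get("python.function"); // assumed option name
        return new GenericPythonModelProvider(module, function);
    }
}
```

The discovery and filtering happen entirely on the Java side; the Python
code itself never needs a hardcoded class path.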
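For (3), to make sure we mean the same thing, I am picturing something
roughly like the following. The interface name is yours; both methods
are assumptions for illustration.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical shape for the PredictRuntimeContext idea.
public interface PredictRuntimeContext {

    // Resolves a model artifact (e.g. a weights file) to a local path,
    // downloading and caching it on the worker if it is not already there.
    File getModelFile(String uri) throws IOException;

    // Scratch directory that is cleaned up when the predict function closes.
    File getLocalTempDir() throws IOException;
}
```

A predict function could then resolve its weights once in its open()
hook instead of every implementation rolling its own download logic.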
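And for (4), a batch-style hook in the spirit of FLIP-491 could look
roughly like this; again, the name and the signature are assumptions,
not part of the proposal yet.

```java
import java.util.List;

// Hypothetical batch-oriented predict hook.
public interface BatchPredictFunction<IN, OUT> {

    // Called with a bundle of rows so the implementation can issue one
    // vectorized inference call instead of evaluating record by record;
    // returns one prediction per input row, in order.
    List<OUT> predictBatch(List<IN> inputs) throws Exception;
}
```

A per-record predict() could remain as a convenience default that
delegates to a single-element batch.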
> Swapna Marru <[email protected]> wrote on Tue, Oct 14, 2025 at 11:53:
>
> > Thanks Matyas.
> >
> > Hao,
> >
> > The proposal is to provide a generic framework. The interfaces
> > PythonPredictRuntimeProvider / PythonPredictFunction /
> > PredictFunction (in Python) are defined to provide a base for that
> > framework.
> >
> > generic-python is one of the implementations, registered similarly
> > to openai in the original FLIP. It is, however, not a concrete
> > end-to-end implementation. It can be used:
> > 1. As a reference implementation for other complete, end-to-end
> > concrete model provider implementations.
> > 2. Out of the box for simple Python model implementations, to avoid
> > a boilerplate Java provider implementation.
> >
> > I will also open a PR with the current implementation changes, so
> > it's clearer for further discussion.
> >
> > -Thanks,
> > M.Swapna
> >
> > On Mon, Oct 13, 2025 at 5:04 PM Őrhidi Mátyás <[email protected]> wrote:
> >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-552+Support+ML_PREDICT+for+Python+based+model+providers
> > >
> > > On Mon, Oct 13, 2025 at 4:10 PM Őrhidi Mátyás <[email protected]> wrote:
> > >
> > > > Swapna, I can help you to create a FLIP page.
> > > >
> > > > On Mon, Oct 13, 2025 at 3:58 PM Hao Li <[email protected]> wrote:
> > > >
> > > > > Hi Swapna,
> > > > >
> > > > > Thanks for the proposal. Can you put it in a FLIP and start a
> > > > > discussion thread for it?
> > > > >
> > > > > From an initial look, I'm a bit confused about whether this is
> > > > > a concrete implementation for "generic-python" or a generic
> > > > > framework to handle Python predict functions, because
> > > > > everything seems concrete, like
> > > > > `GenericPythonModelProviderFactory` and
> > > > > `GenericPythonModelProvider`, except the final Python predict
> > > > > function.
> > > > >
> > > > > Also, if `GenericPythonModelProviderFactory` is predefined, do
> > > > > you predefine the required and optional options for it? Will
> > > > > it be inflexible if predefined?
> > > > >
> > > > > Thanks,
> > > > > Hao
> > > > >
> > > > > On Mon, Oct 13, 2025 at 10:04 AM Swapna Marru <[email protected]> wrote:
> > > > >
> > > > > > Hi ShengKai,
> > > > > >
> > > > > > Documented the initial proposal here:
> > > > > >
> > > > > > https://docs.google.com/document/d/1YzBxLUPvluaZIvR0S3ktc5Be1FF4bNeTsXB9ILfgyWY/edit?usp=sharing
> > > > > >
> > > > > > Please review and let me know your thoughts.
> > > > > >
> > > > > > -Thanks,
> > > > > > Swapna
> > > > > >
> > > > > > On Tue, Sep 23, 2025 at 10:39 PM Shengkai Fang <[email protected]> wrote:
> > > > > >
> > > > > > > I see your point, and I agree that your proposal is
> > > > > > > feasible. However, there is one limitation to consider:
> > > > > > > the current loading mechanism first discovers all
> > > > > > > available factories on the classpath and then filters
> > > > > > > them based on the user-specified identifiers.
> > > > > > >
> > > > > > > In most practical scenarios, we would likely have only one
> > > > > > > generic factory (e.g., a GenericPythonModelFactory)
> > > > > > > present in the classpath. This means the framework would
> > > > > > > be able to load either PyTorch or TensorFlow models,
> > > > > > > whichever is defined within that single generic
> > > > > > > implementation, but not both simultaneously unless
> > > > > > > additional mechanisms are introduced.
> > > > > > >
> > > > > > > This doesn't block the proposal, but it's something worth
> > > > > > > noting as we design the extensibility model. We may want
> > > > > > > to explore ways to support multiple user-defined providers
> > > > > > > more seamlessly in the future.
> > > > > > >
> > > > > > > Best,
> > > > > > > Shengkai
