Hey Shengkai,

Thank you for your observations. This proposal is mostly driven by
Swapna, but let me also share my thoughts here; please find them
inline.

Cheers,
Matyas

On Tue, Oct 14, 2025 at 3:02 AM Shengkai Fang <[email protected]> wrote:
>
> Hi, Matyas.
>
> Thanks for the proposal. I have some suggestions about it.
>
> 1. I'm wondering whether we could extend the SQL API to change how Python
> models are loaded. For example, we could allow users to write:
>
> ```
> CREATE MODEL my_pytorch_model
> WITH (
>    'type' = 'pytorch'
> ) LANGUAGE PYTHON;
> ```
> In this case, we wouldn't rely on Java SPI to load the Python model
> provider. However, I'm not sure whether Python has a similar mechanism to
> SPI that avoids hardcoding class paths.

This is an interesting idea; however, we are proposing the provider
model because it aligns with Flink's existing Java-based architecture
for discovering plugins via SPI. A Java entry point is still required
to launch the Python code, and the factory/provider pattern is the
standard way to provide one in Flink.
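
For context, here is a rough sketch of what such a Java-side SPI entry
point could look like. The factory and provider names follow what has
already been mentioned in this thread, but the exact interface shape is
my assumption for illustration, not the final API:

```
// Illustrative sketch only; the ModelProviderFactory/ModelProvider
// interfaces and the option-resolution calls are assumed here, not final.
// The class would be discovered through Flink's SPI mechanism, i.e. listed
// in META-INF/services and matched against the model's provider identifier.
public class GenericPythonModelProviderFactory implements ModelProviderFactory {

    @Override
    public String factoryIdentifier() {
        return "generic-python";
    }

    @Override
    public ModelProvider createModelProvider(ModelProviderFactory.Context context) {
        // The Java layer only resolves the model options and hands off to
        // the user's Python predict function; no inference logic lives here.
        return new GenericPythonModelProvider(context.getCatalogModel().getOptions());
    }
}
```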

> 2. Beam already supports TensorFlow, ONNX, and many built-in models. Can we
> reuse Beam's utilities to build Flink prediction functions[1]?

We can certainly learn from Beam's design, but directly reusing it
would add a very heavy dependency and be difficult to integrate
cleanly into Flink's native processing model.

> 3. It would be better if we introduced a PredictRuntimeContext to help
> users download required weight files.

This is actually a great idea and essential for usability. Just to
double-check: are you proposing an explicit PredictRuntimeContext
that the predict function can use to download required model weight
files dynamically at runtime?
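
If so, a minimal sketch of the kind of contract I would imagine
(purely illustrative; the interface and method names are my
assumptions, not a concrete proposal):

```
// Hypothetical shape of a PredictRuntimeContext; names are illustrative.
public interface PredictRuntimeContext {

    /**
     * Resolves a remote model artifact (e.g. a weight file referenced in the
     * CREATE MODEL options), downloads it into a local cache if necessary,
     * and returns the local path the predict function can load from.
     */
    java.nio.file.Path getLocalModelArtifact(String remoteUri) throws java.io.IOException;

    /** Returns the options declared in CREATE MODEL ... WITH (...). */
    java.util.Map<String, String> getModelOptions();
}
```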

>
> 4. In ML, users typically perform inference on batches of data. Therefore,
> per-record evaluation may not be necessary. How about we just introduce an
> API like [2]?

I agree completely. The row-by-row API is just a starting point; we
should prioritize support for efficient batch inference to ensure
good performance for real-world models.
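
For instance, something along the lines of the sketch below, loosely
modeled on the bundled evaluation idea in FLIP-491. The interface name
and signature are assumptions for illustration only:

```
import java.util.List;
import org.apache.flink.table.data.RowData;

// Illustrative sketch of a batch-oriented predict hook; not a final API.
public interface BatchPredictFunction {

    /**
     * Receives a bundle of input rows and returns one prediction per input,
     * so the Python side can run a single vectorized forward pass (e.g. one
     * PyTorch or TensorFlow batch) instead of one call per record.
     */
    List<RowData> predictBatch(List<RowData> inputs) throws Exception;
}
```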
>
> Best,
> Shengkai
>
> [1] https://beam.apache.org/documentation/ml/about-ml/
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-491%3A+BundledAggregateFunction+for+batched+aggregation
>
>
>
>
> Swapna Marru <[email protected]> wrote on Tue, Oct 14, 2025 at 11:53:
>
> > Thanks Matyas.
> >
> > Hao,
> >
> > The proposal is to provide a generic framework.
> > The interfaces PythonPredictRuntimeProvider / PythonPredictFunction /
> > PredictFunction (in Python) are defined to provide a base for that
> > framework.
> >
> > generic-python is one of the implementations, registered similarly to
> > openai in the original FLIP. It is not, however, a concrete end-to-end
> > implementation. It can be used:
> > 1. as a reference implementation for other complete, end-to-end,
> > concrete model provider implementations, or
> > 2. for simple Python model implementations, out of the box, to avoid a
> > boilerplate Java provider implementation.
> >
> > I will also open a PR with the current implementation changes, so it's
> > clearer for further discussion.
> >
> > -Thanks,
> > M.Swapna
> >
> > On Mon, Oct 13, 2025 at 5:04 PM Őrhidi Mátyás <[email protected]>
> > wrote:
> >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-552+Support+ML_PREDICT+for+Python+based+model+providers
> > >
> > > On Mon, Oct 13, 2025 at 4:10 PM Őrhidi Mátyás <[email protected]>
> > > wrote:
> > > >
> > > > Swapna, I can help you to create a FLIP page.
> > > >
> > > > On Mon, Oct 13, 2025 at 3:58 PM Hao Li <[email protected]>
> > wrote:
> > > > >
> > > > > Hi Swapna,
> > > > >
> > > > > Thanks for the proposal. Can you put it in a FLIP and start a
> > > discussion
> > > > > thread for it?
> > > > >
> > > > > From an initial look, I'm a bit confused whether this is a concrete
> > > > > implementation for "generic-python" or a generic framework to handle
> > > > > Python predict functions, because everything seems concrete (like
> > > > > `GenericPythonModelProviderFactory` and `GenericPythonModelProvider`)
> > > > > except the final Python predict function.
> > > > >
> > > > > Also, if `GenericPythonModelProviderFactory` is predefined, do you
> > > > > predefine the required and optional options for it? Will it be
> > > > > inflexible if predefined?
> > > > >
> > > > > Thanks,
> > > > > Hao
> > > > >
> > > > > On Mon, Oct 13, 2025 at 10:04 AM Swapna Marru <
> > > [email protected]>
> > > > > wrote:
> > > > > >
> > > > > > Hi ShengKai,
> > > > > >
> > > > > > Documented the initial proposal here:
> > > > > >
> > > > > > https://docs.google.com/document/d/1YzBxLUPvluaZIvR0S3ktc5Be1FF4bNeTsXB9ILfgyWY/edit?usp=sharing
> > > > > >
> > > > > > Please review and let me know your thoughts.
> > > > > >
> > > > > > -Thanks,
> > > > > > Swapna
> > > > > >
> > > > > > On Tue, Sep 23, 2025 at 10:39 PM Shengkai Fang <[email protected]>
> > > wrote:
> > > > > >
> > > > > > > I see your point, and I agree that your proposal is feasible.
> > > > > > > However, there is one limitation to consider: the current loading
> > > > > > > mechanism first discovers all available factories on the classpath
> > > > > > > and then filters them based on the user-specified identifiers.
> > > > > > >
> > > > > > > In most practical scenarios, we would likely have only one generic
> > > > > > > factory (e.g., a GenericPythonModelFactory) present in the
> > > > > > > classpath. This means the framework would be able to load either
> > > > > > > PyTorch or TensorFlow models (whichever is defined within that
> > > > > > > single generic implementation) but not both simultaneously unless
> > > > > > > additional mechanisms are introduced.
> > > > > > >
> > > > > > > This doesn't block the proposal, but it's something worth noting
> > > > > > > as we design the extensibility model. We may want to explore ways
> > > > > > > to support multiple user-defined providers more seamlessly in the
> > > > > > > future.
> > > > > > >
> > > > > > > Best,
> > > > > > > Shengkai
