Hi Christophe,

It is true that FlinkML only targets batch workloads. Also, there has not
been any development for quite some time.

In March last year, a discussion was started on the dev mailing list about
different machine learning features for stream processing [1].
One result of this discussion was FLIP-23 [2], which will add a model
serving library to Flink, i.e., a library that can load (and update)
machine learning models and evaluate them on a stream.
If you dig through the mailing list thread, you'll find a link to a Google
doc that discusses other possible directions.
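As a side note, and just to sketch the general pattern (this is *not* the
FLIP-23 API, which is still being designed), you can already hand-roll
something similar with the plain DataStream API: broadcast model updates to
all parallel instances and score incoming events with the latest model that
has arrived. The types below (a model as a bare double[] weight vector,
events as feature vectors) are made up purely for illustration:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

public class StreamingModelServingSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        // Feature vectors to score (placeholder source).
        DataStream<double[]> events = env.fromElements(
            new double[]{1.0, 2.0}, new double[]{0.5, -1.0});

        // Model updates, e.g. weights periodically re-trained by a batch job.
        DataStream<double[]> modelUpdates = env.fromElements(
            new double[]{0.1, 0.9});

        // Broadcast model updates so every parallel instance sees them,
        // and evaluate events with the most recent model.
        DataStream<Double> predictions = events
            .connect(modelUpdates.broadcast())
            .flatMap(new CoFlatMapFunction<double[], double[], Double>() {

                // Latest model; plain field, i.e., not checkpointed.
                // A production job would keep this in operator state.
                private double[] weights;

                @Override
                public void flatMap1(double[] features, Collector<Double> out) {
                    if (weights == null) {
                        return; // no model received yet; could also buffer
                    }
                    double score = 0.0;
                    for (int i = 0; i < features.length; i++) {
                        score += weights[i] * features[i];
                    }
                    out.collect(score);
                }

                @Override
                public void flatMap2(double[] newWeights, Collector<Double> out) {
                    weights = newWeights; // swap in the updated model
                }
            });

        predictions.print();
        env.execute("model-serving-sketch");
    }
}

Again, this is only a minimal sketch of the idea; FLIP-23 aims to provide a
proper library for this (model formats, state handling, etc.).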

Best, Fabian

[1]
https://lists.apache.org/thread.html/eeb80481f3723c160bc923d689416a352d6df4aad98fe7424bf33132@%3Cdev.flink.apache.org%3E
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-23+-+Model+Serving

2018-02-05 16:43 GMT+01:00 Christophe Jolif <cjo...@gmail.com>:

> Hi all,
>
> Sorry, this is me again with another question.
>
> Maybe I did not search deep enough, but it seems the FlinkML API is still
> pure batch.
>
> If I read https://cwiki.apache.org/confluence/display/FLINK/FlinkML%3A+Vision+and+Roadmap
> it seems there was the intent to "exploit the streaming nature of Flink,
> and provide functionality designed specifically for data streams", but
> from my external point of view, I don't see much happening here. Is there
> work in progress towards that?
>
> I would personally see two use cases around streaming: first, updating an
> existing model that was built in batch; second, triggering predictions not
> through a batch job but in a streaming job.
>
> Are these things in the works? Or are they maybe already feasible, even
> though the API looks purely batch-oriented?
>
> Thanks,
> --
> Christophe
>
