Hello everyone,

It's been quite a while since I last wrote to the Flink ML, because the need for a stateful stream processing system never actually arose in my current job, until now.
Since the last version I actually tried was Flink 1.9, well before Stateful Functions, I have a few questions about some of the latest features:

1. What are the use cases Stateful Functions were designed for? As far as I understand from the documentation, they are essentially processors that can be separated from (and integrated with) a "main" Flink streaming job, but I fail to grasp how they differ from a REST endpoint implemented with any other framework.

2. How is the storage for these functions configured? I see that the state is accessed via a Context object, so I assume it is set up through the Flink cluster configuration?

To elaborate on my use case: we have some 20 CDC topics on Kafka (1 topic per table). From the data streamed on these topics, we need to compute many features to feed an ML model. Many of these features must be computed by joining multiple topics and/or need the whole history of a field. So I was wondering whether Stateful Functions could be a good approach to this problem, where each feature would be "packaged" in a single stateful function, "triggered" by the arrival of any new message on the topic configured as its ingress.

Basically, I'm wondering whether they fit the use case, or whether we're better off with a custom Flink job.

Thank you for your time,
--
Federico D'Ambrosio
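P.S. To make the "one feature per stateful function" idea concrete, here is a rough sketch of the pattern I have in mind. This is deliberately NOT the actual Stateful Functions API, just a plain-Python simulation of per-key state updated on each incoming CDC event; the feature (a running average) and the event shape are made-up examples:

```python
from collections import defaultdict

class RunningAverageFeature:
    """One 'feature function': keeps per-key state and updates it
    whenever a new CDC message for that key arrives on its ingress.
    (Plain-Python stand-in for state accessed via a Context object.)"""

    def __init__(self):
        # per-key state, analogous to function-scoped persisted state
        self.state = defaultdict(lambda: {"count": 0, "total": 0.0})

    def on_message(self, key, value):
        # invoked once per incoming CDC event for this function's ingress
        s = self.state[key]
        s["count"] += 1
        s["total"] += value
        return s["total"] / s["count"]  # emitted feature value

# Simulated CDC stream: (primary key, numeric field) change events
feature = RunningAverageFeature()
events = [("user-1", 10.0), ("user-2", 4.0), ("user-1", 20.0)]
results = [feature.on_message(k, v) for k, v in events]
# results == [10.0, 4.0, 15.0]  (per-key running averages)
```

The question, essentially, is whether each such feature would map cleanly onto one stateful function with Kafka as its ingress, with the framework handling the per-key state persistence.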