Hi Igal, thank you so much for your response.
As for [2], I was mainly interested in how the state is physically stored.
Looking at the deployment files, I see the following file:

https://github.com/apache/flink-statefun-playground/blob/main/deployments/k8s/04-statefun/01-statefun-runtime.yaml

where the state backend seems to be defined by these keys in flink-conf.yaml:

    state.backend: rocksdb
    state.backend.rocksdb.timer-service.factory: ROCKSDB

As far as I can tell from the docs, the built-in backends are FS, HashMap and
RocksDB, but I can technically implement my own backend by extending this
abstract class
<https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/AbstractStateBackend.java>
(and a related factory), is that correct?

Thank you again,
Federico

On Thu, Feb 24, 2022 at 3:11 PM Igal Shilman <i...@apache.org> wrote:

> Hello,
>
> For (1), I invite you to visit our documentation and the many talks online
> to understand more about the motivation and the value of StateFun. In a
> nutshell, StateFun provides a few building blocks that make building
> distributed stateful applications easier.
>
> For (2), check out our playground repository to see how storage is
> configured. It is completely defined by the SDK and is not configured by
> the Flink cluster configuration.
>
> I think that the use case you are describing is a good fit for StateFun.
> If you check out the latest Flink Forward videos, there were a few that
> described how to use StateFun for exactly that [3].
>
> Good luck!
> Igal
>
> [1] https://nightlies.apache.org/flink/flink-statefun-docs-stable/
> [2] https://github.com/apache/flink-statefun-playground
> [3] https://www.youtube.com/channel/UCY8_lgiZLZErZPF47a2hXMA/videos
>
> On Sun, Feb 20, 2022 at 1:54 PM Federico D'Ambrosio <fedex...@gmail.com>
> wrote:
>
>> Hello everyone,
>>
>> It's been quite a while since I wrote to the Flink mailing list, because
>> in my current job the need for a stateful stream processing system never
>> actually arose, until now.
>>
>> Since the last version I actually tried was Flink 1.9, well before
>> Stateful Functions, I have a few questions about some of the latest
>> features.
>>
>> 1. What use cases were Flink Stateful Functions designed for? As far as I
>> understand from the documentation, they are basically processors that can
>> be separated from a "main" Flink streaming job (and can be integrated
>> with it), but I fail to grasp how they differ from a REST endpoint
>> implemented using any other framework.
>> 2. How is the storage for these functions configured? I see that the
>> storage for the state is accessed via a Context object, so I assume it is
>> configured through the Flink cluster configuration?
>>
>> I would then like to elaborate on my use case: we have some 20 CDC
>> topics (1 topic per table) on Kafka. From the data streamed on these
>> topics, we need to compute many features to be used by an ML model. Many
>> of these features need to be computed by joining multiple topics and/or
>> need the whole history of a field. So, I was wondering if Stateful
>> Functions could be a good approach to this problem, where a feature could
>> be "packaged" in a single stateful function to be "triggered" by the
>> arrival of any new message on the topic configured as its ingress.
>>
>> So, basically, I'm wondering whether they fit the use case, or whether
>> we're better off with a custom Flink job.
>>
>> Thank you for your time,
>> --
>> Federico D'Ambrosio
>>
>

--
Federico D'Ambrosio