Hi Igal, all,

In the meantime we found a way to serve Flink stateful functions behind a frontend. We decided to add another (set of) Flask application(s) that connect to Kafka topics. These Kafka topics then serve as ingress and egress for the statefun cluster.

However, we're wondering how we can scale this cluster. The documentation page provides some nice figures for different setups, but no implementation details. In our case we are using a remote cluster, so we have a Docker instance containing the `python-stateful-function` and, of course, the Flink cluster containing a `master` and a `worker`. If I understood correctly, in a remote setting we can scale both the Flink cluster and the `python-stateful-function`. Scaling the Flink cluster is trivial: I can add more workers/task managers (providing more task slots) simply by scaling the worker instance. However, how can I scale the stateful function while also ensuring that it ends up in the correct Flink job (because we need shared state there)? I tried scaling the Docker instance as well, but that didn't seem to work.
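The frontend arrangement described above (Flask in front, Kafka topics as ingress and egress) can be sketched roughly as follows. This is only an illustration, assuming every ingress message carries a request id that the function copies onto its egress message. `ReplyRouter` and `send_to_ingress` are hypothetical names, and the Kafka producer and consumer are deliberately left out so that the sketch runs with the standard library alone.

```python
import threading
import uuid
from concurrent.futures import Future


class ReplyRouter:
    """Matches egress messages to waiting frontend requests by request id."""

    def __init__(self, send_to_ingress):
        # send_to_ingress is a hypothetical callable (request_id, payload);
        # in a real setup it would wrap a Kafka producer for the ingress topic.
        self._send = send_to_ingress
        self._pending = {}
        self._lock = threading.Lock()

    def submit(self, payload):
        """Register a pending request, then publish it. Returns (id, future)."""
        request_id = uuid.uuid4().hex
        future = Future()
        with self._lock:
            self._pending[request_id] = future
        self._send(request_id, payload)
        return request_id, future

    def on_egress(self, request_id, reply):
        """Called by one egress consumer loop for every message it reads."""
        with self._lock:
            future = self._pending.pop(request_id, None)
        if future is None:
            # Unknown id, e.g. a reply meant for another frontend instance.
            return False
        future.set_result(reply)
        return True


if __name__ == "__main__":
    sent = []
    router = ReplyRouter(lambda rid, payload: sent.append((rid, payload)))
    rid, fut = router.submit({"name": "Wouter"})
    # Simulate the egress consumer thread delivering the matching reply.
    threading.Thread(target=router.on_egress, args=(rid, "Hello Wouter")).start()
    print(fut.result(timeout=5))
```

With something like this, a request handler would call `submit(...)` and wait on the returned future, while a single background consumer per frontend instance reads the egress topic and calls `on_egress(...)`, so individual requests do not each scan the topic themselves.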
Hope you can give me some leads there.

Thanks in advance!

Kind regards,
Wouter

On Thu, 7 May 2020 at 17:17, Wouter Zorgdrager <zorgdrag...@gmail.com> wrote:

> Hi Igal,
>
> Thanks for your quick reply. Getting back to point 2, I was wondering
> whether you could indeed trigger a stateful function directly from Flask
> and also get the reply there, instead of using Kafka in between. We want
> to experiment with running stateful functions behind a front-end (which
> should be able to trigger a function), but we're a bit afraid that using
> Kafka doesn't scale well if, on the frontend side, a user has to consume
> all Kafka messages to find the correct reply/output for a certain
> request/input. Any thoughts?
>
> Thanks in advance,
> Wouter
>
> On Thu, 7 May 2020 at 10:51, Igal Shilman <i...@ververica.com> wrote:
>
>> Hi Wouter!
>>
>> Glad to read that you have been using Flink for quite some time, and
>> are also exploring StateFun!
>>
>> 1) Yes, that is correct, and you can follow the Dockerhub contribution
>> PR at [1].
>>
>> 2) I'm not sure I understand what you mean by triggering from the
>> browser. If you mean triggering the function independently of StateFun
>> for testing / illustration purposes, you would need to write some
>> JavaScript and perform the POST (assuming CORS is enabled).
>> Let me know if you'd like further information on how to do it.
>> Broadly speaking, GET is traditionally used to get data from a resource
>> and POST to send data (the data is the invocation batch in our case).
>>
>> One easier workaround for you would be to expose another endpoint in
>> your Flask application, and call your stateful function directly from
>> there (possibly populating the function argument with values taken from
>> the query params).
>>
>> 3) I would expect a performance loss when going from the embedded SDK
>> to the remote one, simply because the remote function runs in a
>> different process, and a round trip is required. There are different
>> ways of deployment even for remote functions.
>> For example, they can be co-located with the task managers and
>> communicate via the loopback device / Unix domain socket, or they can
>> be deployed behind a load balancer with an auto-scaler, thus reacting
>> to higher request rates / latency increases by spinning up new
>> instances (something that is not yet supported with the embedded API).
>>
>> Good luck,
>> Igal.
>>
>> [1] https://github.com/docker-library/official-images/pull/7749
>>
>> On Wednesday, May 6, 2020, Wouter Zorgdrager <zorgdrag...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I've been using Flink for quite some time now, and for a university
>>> project I'm planning to experiment with statefun. During the
>>> walkthrough I've run into some issues that I hope you can help me
>>> with.
>>>
>>> 1) Is it correct that the Docker image of statefun is not yet
>>> published? I couldn't find it anywhere, but was able to run it by
>>> building the image myself.
>>> 2) The example project using the Python SDK uses Flask to expose a
>>> function via POST. Is there also a way to serve GET requests, so that
>>> you can trigger a stateful function from, for instance, your browser?
>>> 3) Do you expect a lot of performance loss when using the Python SDK
>>> over Java?
>>>
>>> Thanks in advance!
>>>
>>> Regards,
>>> Wouter