Hi Wouter,

Triggering a stateful function from a frontend indeed requires an ingress
between them, so the way you've approached this is also the way we were
thinking of.
As Gordon mentioned, a potential improvement would be an HTTP ingress, which
would allow triggering stateful functions directly from the frontend
servers.
However, this kind of ingress is not implemented yet.

Regarding scaling: Your understanding is correct, you can scale both the
Flink cluster and the remote "python-stateful-function" cluster
independently.
Scaling the Flink cluster, though, requires taking a savepoint, bumping the
job parallelism, and starting the cluster with more workers from the
savepoint taken previously.
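
To make those steps a bit more concrete, here is a rough sketch that builds
the two command lines as argument lists in Python. It assumes a session
cluster driven by the standard `flink` command line client; the job id,
savepoint paths, and jar name are hypothetical placeholders. A StateFun
Docker deployment would most likely pass the savepoint and the parallelism
through its own startup arguments instead, so treat this as an illustration
of the order of operations only.

```python
from typing import List


def stop_with_savepoint_cmd(job_id: str, savepoint_dir: str) -> List[str]:
    """Step 1: stop the job and write a savepoint under savepoint_dir."""
    return ["flink", "stop", "--savepointPath", savepoint_dir, job_id]


def resume_cmd(savepoint_path: str, parallelism: int, jar: str) -> List[str]:
    """Steps 2 and 3: resubmit from the savepoint with a new parallelism."""
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    return ["flink", "run", "-s", savepoint_path, "-p", str(parallelism), jar]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    print(" ".join(stop_with_savepoint_cmd("<job-id>", "/savepoints")))
    print(" ".join(resume_cmd("/savepoints/<savepoint-dir>", 4, "my-job.jar")))
```

The workers you start the cluster with need to provide enough task slots for
the new parallelism.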

Scaling "python-stateful-function" workers can be done transparently to the
Flink cluster, but the exact details are deployment specific.
- For example, the Python workers can be a k8s service.
- Or the Python workers are deployed behind a load balancer.
- Or you add new entries to the DNS record of your Python workers.

I didn't understand "ensuring that it ends up in the correct Flink job". Can
you please clarify?
Flink is the one contacting the remote workers, not the other way around.
So as long as the new instances are visible to Flink, they will be reached
with the same shared state.
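
To illustrate why any worker instance can serve any request, here is a small
simulation in plain Python. It does not use the StateFun SDK: the names
`handle_invocation` and `workers` are made up for the example, and the
round-robin choice merely stands in for whatever load balancing your
deployment uses. The assumption it shows is that the caller (Flink) owns the
state and ships it along with each invocation, so the workers themselves
keep nothing between calls.

```python
from typing import Dict, List, Tuple


def handle_invocation(
    worker: str, state: Dict[str, int], message: str
) -> Tuple[Dict[str, int], str]:
    """Hypothetical worker handler that keeps nothing between calls.

    The state arrives with the request, and the updated state is
    returned with the response.
    """
    seen = state.get("seen", 0) + 1
    return {**state, "seen": seen}, f"{worker} handled {message} (seen={seen})"


# Simulated Flink side: it owns the state and may pick any visible worker.
workers: List[str] = ["worker-a", "worker-b", "worker-c"]
state: Dict[str, int] = {}
replies: List[str] = []

for i, message in enumerate(["m1", "m2", "m3", "m4"]):
    worker = workers[i % len(workers)]  # stand-in for a load balancer
    state, reply = handle_invocation(worker, state, message)
    replies.append(reply)

for reply in replies:
    print(reply)
```

The counter advances on every call regardless of which worker handled it,
which is why adding more workers needs no coordination between them.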

I'd recommend watching [1] and the demo at the end, and [2] for a demo
using stateful functions on AWS Lambda.

[1] https://youtu.be/NF0hXZfUyqE
[2] https://www.youtube.com/watch?v=tuSylBadNSo

It seems like you are on the correct path!
Good luck!
Igal.


On Tue, May 12, 2020 at 11:18 PM Wouter Zorgdrager <zorgdrag...@gmail.com>
wrote:

> Hi Igal, all,
>
> In the meantime we found a way to serve Flink stateful functions in a
> frontend. We decided to add another (set of) Flask application(s) which
> link to Kafka topics. These Kafka topics then serve as ingress and egress
> for the statefun cluster. However, we're wondering how we can scale this
> cluster. On the documentation page some nice figures are provided for
> different setups but no implementation details are given. In our case we
> are using a remote cluster so we have a Docker instance containing the
> `python-stateful-function` and of course the Flink cluster containing a
> `master` and `worker`. If I understood correctly, in a remote setting, we
> can scale both the Flink cluster and the `python-stateful-function`.
> Scaling the Flink cluster is trivial because I can add more
> workers/task-managers (providing more task slots) just by scaling the worker
> instance. However, how can I scale the stateful function while also ensuring
> that it ends up in the correct Flink job (because we need shared state
> there)? I tried scaling the Docker instance as well, but that didn't seem to
> work.
>
> Hope you can give me some leads there.
> Thanks in advance!
>
> Kind regards,
> Wouter
>
> On Thu, May 7, 2020 at 17:17, Wouter Zorgdrager <zorgdrag...@gmail.com>
> wrote:
>
>> Hi Igal,
>>
>> Thanks for your quick reply. Getting back to point 2, I was wondering if
>> you could indeed trigger a stateful function directly from Flask and also
>> get the reply there instead of using Kafka in between. We want to
>> experiment running stateful functions behind a front-end (which should be
>> able to trigger a function), but we're a bit afraid that using Kafka
>> doesn't scale well if on the frontend side a user has to consume all Kafka
>> messages to find the correct reply/output for a certain request/input. Any
>> thoughts?
>>
>> Thanks in advance,
>> Wouter
>>
>> On Thu, May 7, 2020 at 10:51, Igal Shilman <i...@ververica.com> wrote:
>>
>>> Hi Wouter!
>>>
>>> Glad to read that you have been using Flink for quite some time, and are
>>> also exploring StateFun!
>>>
>>> 1) yes it is correct and you can follow the Dockerhub contribution PR at
>>> [1]
>>>
>>> 2) I’m not sure I understand what you mean by triggering from the
>>> browser.
>>> If you mean triggering the function independently of StateFun, for
>>> testing / illustration purposes, you would need to write some JavaScript
>>> and perform the POST (assuming CORS is enabled).
>>> Let me know if you’d like further information on how to do it.
>>> Broadly speaking, GET is traditionally used to get data from a resource
>>> and POST to send data (the data is the invocation batch in our case).
>>>
>>> One easier workaround for you would be to expose another endpoint in
>>> your Flask application, and call your stateful function directly from there
>>> (possibly populating the function argument with values taken from the query
>>> params).
>>>
>>> 3) I would expect a performance loss when going from the embedded SDK to
>>> the remote one, simply because the remote function is in a different
>>> process, and a round trip is required. There are different ways of
>>> deployment even for remote functions.
>>> For example, they can be co-located with the Task managers and
>>> communicate via the loopback device / Unix domain socket, or they can be
>>> deployed behind a load balancer with an auto-scaler, and thus react to
>>> higher request rates / latency increases by spinning up new instances
>>> (something that is not yet supported with the embedded API).
>>>
>>> Good luck,
>>> Igal.
>>>
>>>
>>>
>>>
>>>
>>> [1] https://github.com/docker-library/official-images/pull/7749
>>>
>>>
>>> On Wednesday, May 6, 2020, Wouter Zorgdrager <zorgdrag...@gmail.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I've been using Flink for quite some time now and for a university
>>>> project I'm planning to experiment with statefun. During the walkthrough
>>>> I've run into some issues that I hope you can help me with.
>>>>
>>>> 1) Is it correct that the Docker image of statefun is not yet
>>>> published? I couldn't find it anywhere, but was able to run it by building
>>>> the image myself.
>>>> 2) In the example project using the Python SDK, it uses Flask to expose
>>>> a function using POST. Is there also a way to serve GET requests so that
>>>> you can trigger a stateful function, for instance by using your browser?
>>>> 3) Do you expect a lot of performance loss when using the Python SDK
>>>> over Java?
>>>>
>>>> Thanks in advance!
>>>>
>>>> Regards,
>>>> Wouter
>>>>
>>>
