Re: FLIP-6 and running many "small" jobs

Maciek Próchniak Fri, 04 Nov 2016 12:36:35 -0700

Hi Max,

thanks for the answer.

I still have to wrap my head around it, but I hope we'll manage to work it out - maybe when 1.3.x arrives I'll have access to some nice Mesos cluster... or not... we'll see :)


thanks,

maciek


On 25/10/2016 17:49, Maximilian Michels wrote:

Hi Maciek,

Your use case will be covered by the FLIP-6 "Sessions". Sessions are
similar to how the on-premise Flink or the Yarn session operates
today. We will have a long-running dispatcher, resource manager, and
task managers. We will bring up a job manager for each job but the
overhead for this one node (non HA) is relatively little if you have a
cluster with many nodes. After all, the resource intensive computation
is performed by the task managers. The job manager is only responsible
for coordinating the job execution.

Note that the dispatcher hosts the web UI and is responsible for
taking care of the job submission. The role of the resource manager
changes slightly to span across jobs. Task managers have always been
able to serve multiple jobs. Dispatcher, resource manager and task
managers live across jobs within a session.
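
To make the layout above concrete, here is a toy Python model of a session. It is only a sketch, not Flink code: the `Session` class, its fields, and the job names are made up for illustration. It assumes exactly one job manager per submitted job, and one dispatcher and one resource manager shared by the whole session.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Toy model of a FLIP-6 session (hypothetical, not a Flink API)."""
    task_managers: int
    # Long-running components shared by every job in the session.
    shared: tuple = ("dispatcher", "resource_manager")
    # One job manager per submitted job, keyed by job name.
    job_managers: dict = field(default_factory=dict)

    def submit(self, job_name: str) -> str:
        # Model the dispatcher accepting a job: a dedicated job manager
        # is brought up for it, while everything else is reused.
        jm = f"job_manager[{job_name}]"
        self.job_managers[job_name] = jm
        return jm

    def component_count(self) -> int:
        # Shared components + task managers + one job manager per job.
        return len(self.shared) + self.task_managers + len(self.job_managers)


session = Session(task_managers=10)
for name in ("filter-a", "filter-b", "filter-c"):
    session.submit(name)

print(session.component_count())  # 2 shared + 10 task managers + 3 job managers = 15
```

In this model each extra job adds only one lightweight coordinating component; the task managers that do the actual work are shared.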

In my opinion, you won't have to change your usage pattern once FLIP-6 is
ready, which is targeted for Flink 1.3.0.

-Max


On Thu, Oct 20, 2016 at 10:07 AM, Maciek Próchniak <m...@touk.pl> wrote:

Hi,

we're looking at FLIP-6 and while it looks really great we started to wonder
how it fits in our use case.

We currently have around 20 processes but the idea is to have many more of
them. Many of them are pretty "small" - they don't have large sources, are
stateless, and mainly filter data.

As I understand it, FLIP-6 makes a job an even more heavyweight thing than
today - e.g. each job will have its own jobmanager process etc.

Our concern is that each job will now require more resources - e.g. the
number of threads, memory and so on. We are thinking about a way to make
some jobs share these resources - of course that means they won't be really
isolated from each other.

So far the only idea we see is deploying these small jobs together, as one
job - but this leads to some problems, like how to track which version is
really deployed (we talk about stateless processes so the only problem is
maintaining source kafka offsets)

Unfortunately our jobs can have many different sources and outcomes, so we
don't think doing something similar to King & RBEA would work for us...

Do you have any views/ideas about such a use case? Or is the common view that
we should deploy our stuff to Mesos and let it handle resource allocation? But
still - for some jobs we'd need something like "1/4" slot :)

thanks,

maciek
