Hi Emmanuel!

Flink does not yet include JobManager failover, but it is on our list for
the mid-term future (middle to second half of the year).

At this point, when the JobManager dies, the job is cancelled.

Greetings,
Stephan


On Mon, Mar 16, 2015 at 4:43 PM, Emmanuel <ele...@msn.com> wrote:

> I see...
> Because of the start-cluster script, I was under the impression that the
> jobmanager had to connect to each node at start-up, which would make
> scaling an issue without restarting the jobmanager. It makes sense now.
> Thanks for the clarification.
>
> Side question: what happens if the jobmanager fails? Do the taskmanagers
> keep running their jobs? Should a typical production setup include multiple
> jobmanagers for replication, and if so, how is that configured?
>
>  Emmanuel
>
>
>
> -------- Original message --------
> From: Stephan Ewen <se...@apache.org>
> Date: 03/16/2015 1:09 AM (GMT-08:00)
> To: user@flink.apache.org
> Subject: Re: Scaling a Flink cluster
>
>  Hi Emmanuel!
>
> The slaves file is not needed on every node. It is only used by the
> "start-cluster.sh" script, which makes an ssh call to every host in that
> file to start a taskmanager.
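>
> As a minimal sketch of that mechanism (the hostnames are placeholders):
>
>     # conf/slaves -- one worker host per line, read only by start-cluster.sh
>     worker1.example.com
>     worker2.example.com
>
>     # run on the master: starts a jobmanager locally, then ssh-es to each
>     # host listed in conf/slaves and starts a taskmanager there
>     ./bin/start-cluster.sh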
>
> You can add a taskmanager to an existing Flink cluster by simply calling
> "taskmanager.sh start" on that machine (which should have a flink-conf.yaml
> file). The flink-conf.yaml may actually differ for every taskmanager as
> well, but that is a less common setup...
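>
> For example, to grow a running cluster (the hostname is a placeholder):
>
>     # run on the new machine; its conf/flink-conf.yaml should point at the
>     # running jobmanager, e.g.
>     #   jobmanager.rpc.address: master.example.com
>     ./bin/taskmanager.sh start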
>
> Greetings,
> Stephan
> On 16.03.2015 08:27, "Emmanuel" <ele...@msn.com> wrote:
>
>  Hello,
>
> In my understanding, flink-conf.yaml is the one config file used to
> configure a cluster, and the slaves file lists the slave nodes.
> They must both be present on every node.
>
> I'm trying to understand the best strategy for scaling a Flink cluster,
> since adding a node means adding an entry to the slaves list and
> replicating the file on every node.
>
> Does the cluster need to be restarted to take the new nodes into
> account? It seems like it.
> Having to replicate the file on all nodes is not super convenient, and
> restarting is even more trouble.
> Is there a way to scale a live cluster? If so, how?
> Any link to the relevant info would be helpful.
>
>  Thanks
>
>
