Re: YARN High Availability

Till Rohrmann Wed, 18 Nov 2015 09:02:01 -0800

Hi Gwenhaël,

Do you have access to the YARN logs?
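
In the meantime, one thing that can be checked independently of the logs is
whether the addresses listed in recovery.zookeeper.quorum can be reached from
the machine you deploy from. Below is a rough sketch in Python, assuming only
the quorum string from your config. The helper names parse_quorum and
is_reachable are purely illustrative and not part of Flink or ZooKeeper. It
only tests that a TCP connection can be opened, not that ZooKeeper is healthy,
and an unreachable quorum is just one possible cause, not a diagnosis.

```python
import socket


def parse_quorum(quorum):
    """Split a 'host:port,host:port' string into (host, port) tuples."""
    servers = []
    for entry in quorum.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.rpartition(":")
        if not host:
            raise ValueError(f"expected host:port, got {entry!r}")
        servers.append((host, int(port)))
    return servers


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


if __name__ == "__main__":
    # Value copied from recovery.zookeeper.quorum in the quoted config.
    quorum = "host1:3181,host2:3181,host3:3181"
    for host, port in parse_quorum(quorum):
        status = "reachable" if is_reachable(host, port) else "NOT reachable"
        print(f"{host}:{port} {status}")
```

Running the same script on one of the YARN nodes as well would show whether
the containers can reach the quorum, since that is where the JobManager runs.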


Cheers,
Till

On Wed, Nov 18, 2015 at 5:55 PM, Gwenhael Pasquiers <
gwenhael.pasqui...@ericsson.com> wrote:

> Hello,
>
>
>
> We’re trying to set up high availability using an existing ZooKeeper
> quorum that is already running in our Cloudera cluster.
>
>
>
> So, as per the docs, we’ve changed the maximum number of application
> attempts in YARN’s config as well as in flink-conf.yaml:
>
>
>
> recovery.mode: zookeeper
> recovery.zookeeper.quorum: host1:3181,host2:3181,host3:3181
> state.backend: filesystem
> state.backend.fs.checkpointdir: hdfs:///flink/checkpoints
> recovery.zookeeper.storageDir: hdfs:///flink/recovery/
> yarn.application-attempts: 1000
>
>
>
> Everything is OK as long as recovery.mode is commented out.
>
> As soon as I uncomment recovery.mode, the deployment on YARN gets stuck,
> repeating the following every second:
>
> “Deploying cluster, current state ACCEPTED”.
>
> “Deployment took more than 60 seconds….”
>
>
>
> I have more than enough resources available on my YARN cluster.
>
>
>
> Do you have any idea what could cause this, and/or which logs I should
> look at in order to understand it?
>
>
>
> B.R.
>
>
>
> Gwenhaël PASQUIERS
>
