Hi Maximilian, thank you for the reply. I checked the documentation before running my tests (I'm not expert enough to not read the docs ;)), but it doesn't mention any specific requirement regarding execution retries. I'll check it out, thanks!
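
For anyone finding this thread later, here is a rough sketch of how I understand the retry setting would look as a cluster-wide default in flink-conf.yaml. I am assuming that the `execution-retries.default` and `execution-retries.delay` keys apply to 0.10.2, and the values are arbitrary examples rather than recommendations.

```yaml
# flink-conf.yaml (sketch; the values below are illustrative assumptions)

# Number of times a failed job is retried when the job does not set its own value.
# With the default of 0 the job is not restarted after a failure.
execution-retries.default: 3

# Time to wait between two retries.
execution-retries.delay: 10 s
```

The per-job equivalent is set on the StreamExecutionEnvironment, as mentioned in the quoted reply (via `setNumberOfExecutionRetries`, if I read the API correctly).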

On Mon, Feb 15, 2016 at 12:51 PM, Maximilian Michels <m...@apache.org> wrote:

> Hi Stefano,
>
> The Job should stop temporarily but then be resumed by the new
> JobManager. Have you increased the number of execution retries? AFAIK,
> it is set to 0 by default. This will not re-run the job, even in HA
> mode. You can enable it on the StreamExecutionEnvironment.
>
> Otherwise, you have probably already found the documentation:
>
> https://ci.apache.org/projects/flink/flink-docs-master/setup/jobmanager_high_availability.html#configuration
>
> Cheers,
> Max
>
> On Mon, Feb 15, 2016 at 12:35 PM, Stefano Baghino
> <stefano.bagh...@radicalbit.io> wrote:
> > Hello everyone,
> >
> > last week I ran some tests with Apache ZooKeeper to get a grip on Flink
> > HA features. My tests have gone badly so far and I can't sort out the
> > reason.
> >
> > My latest tests involved Flink 0.10.2, run as a standalone cluster with 3
> > masters and 4 slaves. The 3 masters are also the ZooKeeper (3.4.6)
> > ensemble. I started ZooKeeper on each machine, tested its availability
> > and then started the Flink cluster. Since there's no reliable distributed
> > filesystem on the cluster, I had to use the local file system as the
> > state backend.
> >
> > I then submitted a very simple streaming job that writes the timestamp to
> > a text file on the local file system each second and then went on to kill
> > the process running the job manager to verify that another job manager
> > takes over. However, the job just stopped. I still have to perform some
> > checks on the handover to the new job manager, but before digging deeper
> > I wanted to ask if my expectation of having the job going despite the job
> > manager failure is unreasonable.
> >
> > Thanks in advance.
> >
> > --
> > BR,
> > Stefano Baghino
> >
> > Software Engineer @ Radicalbit

--
BR,
Stefano Baghino

Software Engineer @ Radicalbit