Hi Hao Sun,

When you use job cluster mode, make sure that each job uses its own ZooKeeper path. Ufuk is correct: we made the JobID fixed so that the JobGraph can be found again on failover. In fact, FLINK-10291 should be combined with FLINK-10292 [1].
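
As a rough illustration of that isolation, here is a minimal Python sketch that derives per-application settings. The helper names, the `s3://my-bucket/flink` base path, and the application identifiers are made up for the example; the configuration keys (`high-availability.cluster-id`, `high-availability.storageDir`, `state.checkpoints.dir`) are the standard Flink ones, but please check them against the Flink version you run.

```python
def isolated_flink_conf(app_id: str, s3_base: str = "s3://my-bucket/flink") -> dict:
    """Build per-application overrides so two job clusters never share paths.

    app_id and s3_base are hypothetical values chosen for this sketch.
    """
    if not app_id or "/" in app_id:
        raise ValueError("app_id must be a non-empty string without '/'")
    base = s3_base.rstrip("/")
    return {
        # ZooKeeper znode under high-availability.zookeeper.path.root
        "high-availability.cluster-id": f"/{app_id}",
        # HA metadata (e.g. the JobGraph) kept separate per application
        "high-availability.storageDir": f"{base}/{app_id}/ha",
        # Checkpoints end up under <this dir>/<job id>/chk-N
        "state.checkpoints.dir": f"{base}/{app_id}/checkpoints",
    }


def render_yaml(conf: dict) -> str:
    """Render the overrides as flink-conf.yaml style 'key: value' lines."""
    return "\n".join(f"{key}: {value}" for key, value in sorted(conf.items()))


if __name__ == "__main__":
    print(render_yaml(isolated_flink_conf("application-a")))
```

With distinct identifiers, both the ZooKeeper path and the checkpoint path differ per application even though the job ID is the same all-zero value in every job cluster.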

Till, I hope FLINK-10292 can be reviewed as soon as possible.

Thanks, vino.

[1]: https://issues.apache.org/jira/projects/FLINK/issues/FLINK-10292

On Mon, Nov 5, 2018 at 5:34 AM, Hao Sun <ha...@zendesk.com> wrote:

> Thanks that also works. To avoid same issue with zookeeper, I assume I
> have to do the same trick?
>
> On Sun, Nov 4, 2018, 03:34 Ufuk Celebi <u...@apache.org> wrote:
>
>> Hey Hao Sun,
>>
>> this has been changed recently [1] in order to properly support
>> failover in job cluster mode.
>>
>> A workaround for you would be to add an application identifier to the
>> checkpoint path of each application, resulting in S3 paths like
>> application-XXXX/00...00/chk-64.
>>
>> Is that a feasible solution?
>>
>> As a side note: It was considered to keep the job ID fixed, but make
>> it configurable (e.g. by providing a --job-id argument) which would
>> also help to avoid this situation, but I'm not aware of any concrete
>> plans to move forward with that approach.
>>
>> Best,
>>
>> Ufuk
>>
>> [1] https://issues.apache.org/jira/projects/FLINK/issues/FLINK-10291
>>
>> On Sun, Nov 4, 2018 at 3:39 AM Hao Sun <ha...@zendesk.com> wrote:
>> >
>> > I am wondering if I can customize job_id for job cluster mode.
>> > Currently it is always 00000000000000000000000000000000. I am running
>> > multiple job clusters and sharing s3, it means checkpoints will be
>> > shared by different jobs as well e.g.
>> > 00000000000000000000000000000000/chk-64, how can I avoid this?
>> >
>> > Thanks