Re: Flink-1.6.1 :: HighAvailability :: ZooKeeperRunningJobsRegistry

2018-10-26 Thread Mikhail Pryakhin
Hi Till, thanks for your reply! Here is the issue ticket: https://issues.apache.org/jira/browse/FLINK-10694 Kind Regards, Mike Pryakhin > On 26 Oct 2018, at 18:29, Till Rohrmann wrote: > > Hi Mike, > > thanks for reporting this issue. I thi

Re: Flink-1.6.1 :: HighAvailability :: ZooKeeperRunningJobsRegistry

2018-10-26 Thread Till Rohrmann
Hi Mike, thanks for reporting this issue. I think you're right that Flink leaves some empty nodes in ZooKeeper. It seems that we don't delete the node with all its children in ZooKeeperHaServices#closeAndCleanupAllData. Could you please open a JIRA issue in order to fix it? Thanks a lot! Che
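
The behaviour described in the reply above (leaf nodes removed, empty parent nodes left behind) can be illustrated with a small toy model. This is only a sketch: `ZNodeTree` is a hypothetical in-memory stand-in, not the ZooKeeper or Curator client API and not Flink's code, and the paths used are made up for illustration. The one property it borrows from real ZooKeeper is that a plain delete refuses to remove a node that still has children.

```python
class ZNodeTree:
    """Toy in-memory model of a znode namespace (hypothetical, not a real client)."""

    def __init__(self):
        self.paths = set()

    def create(self, path):
        # Create the node together with any missing parent nodes.
        current = ""
        for part in [p for p in path.split("/") if p]:
            current += "/" + part
            self.paths.add(current)

    def children(self, path):
        # Direct children only: same prefix, no further "/" in the remainder.
        prefix = path.rstrip("/") + "/"
        return [
            p for p in self.paths
            if p != path and p.startswith(prefix) and "/" not in p[len(prefix):]
        ]

    def delete(self, path):
        # Mirrors a plain delete: a node with children cannot be removed.
        if self.children(path):
            raise ValueError(f"{path} is not empty")
        self.paths.discard(path)

    def delete_recursive(self, path):
        # Remove the whole subtree bottom-up, then the node itself.
        for child in self.children(path):
            self.delete_recursive(child)
        self.delete(path)


tree = ZNodeTree()
# Hypothetical layout, chosen only for this example.
tree.create("/flink/app/running_job_registry/job-1")
tree.create("/flink/app/jobgraphs/job-1")

# Cleaning up only the per-job leaves leaves empty parents behind.
tree.delete("/flink/app/running_job_registry/job-1")
tree.delete("/flink/app/jobgraphs/job-1")
print(sorted(tree.paths))
# ['/flink', '/flink/app', '/flink/app/jobgraphs', '/flink/app/running_job_registry']

# Deleting the cluster root with all its children removes the leftovers.
tree.delete_recursive("/flink/app")
print(sorted(tree.paths))  # ['/flink']
```

With a real client the same idea would be expressed through whatever recursive-delete facility the client offers; which call Flink should use is left to the JIRA issue.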

Re: Flink-1.6.1 :: HighAvailability :: ZooKeeperRunningJobsRegistry

2018-10-26 Thread Mikhail Pryakhin
Hi Andrey, Thanks a lot for your reply! > What was the full job life cycle? 1. The job is deployed as a YARN cluster with the following properties set high-availability: zookeeper high-availability.zookeeper.quorum: high-availability.zookeeper.storageDir: hdfs:///

Re: Flink-1.6.1 :: HighAvailability :: ZooKeeperRunningJobsRegistry

2018-10-26 Thread Andrey Zagrebin
Hi Mike, What was the full job life cycle? Did you start it with Flink 1.6.1, or did you cancel a job that was running with 1.6.0? Was there a failover of the Job Master while it was running, before the cancelation? What version of Zookeeper do you use? Flink creates child nodes to create a lock for the job in Zookeeper. Lo

Flink-1.6.1 :: HighAvailability :: ZooKeeperRunningJobsRegistry

2018-10-25 Thread Mikhail Pryakhin
Hi Flink experts! When a streaming job with Zookeeper-HA enabled gets cancelled, not all of the job-related Zookeeper nodes are removed. Is there a reason behind that? I noticed that Zookeeper paths are created with type "Container Node" (an Ephemeral node that can have nested nodes) and fall back to