[ https://issues.apache.org/jira/browse/FLINK-33053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17762965#comment-17762965 ]
Zili Chen commented on FLINK-33053:
-----------------------------------

The log seems to be trimmed. I saw:

2023-09-08 11:09:03,738 DEBUG org.apache.flink.shaded.curator5.org.apache.curator.framework.imps.WatcherRemovalManager [] - Removing watcher for path: /flink/flink-native-test-117/leader/7db5c7316828f598234677e2169e7b0f/connection_info

So the TM has issued a watcher removal request.

> Watcher leak in Zookeeper HA mode
> ---------------------------------
>
>                 Key: FLINK-33053
>                 URL: https://issues.apache.org/jira/browse/FLINK-33053
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.17.0, 1.17.1
>            Reporter: Yangze Guo
>            Priority: Critical
>         Attachments: 26.dump.zip, 26.log
>
>
> We observe a watcher leak in our OLAP stress test when enabling Zookeeper HA
> mode. The TM's watches on the JobMaster leader have not been removed after
> the job finished.
> Here is how we reproduce this issue:
>  - Start a session cluster and enable Zookeeper HA mode.
>  - Continuously and concurrently submit short queries, e.g. WordCount, to the
> cluster.
>  - Run echo -n wchp | nc {zk host} {zk port} to get the current watches.
> We can see a lot of watches on
> /flink/{cluster_name}/leader/{job_id}/connection_info.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
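
The wchp check in the reproduction steps can be automated to see whether the number of watches on connection_info paths keeps growing across job submissions. The following is only a rough sketch, not part of the original report. The function names are hypothetical. It assumes that wchp prints each watched path on its own line followed by indented session IDs, and that the wchp four-letter command is enabled on the ZooKeeper server (newer ZooKeeper releases require it to be whitelisted).

```python
import socket
from collections import Counter


def fetch_wchp(host: str, port: int, timeout: float = 5.0) -> str:
    """Send the 'wchp' four-letter command and return the raw response.

    Equivalent in spirit to: echo -n wchp | nc {zk host} {zk port}
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"wchp")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def count_connection_info_watches(wchp_output: str) -> Counter:
    """Count watching sessions per .../connection_info path.

    Assumed output format: a path line starting in column 0, followed by
    one indented line per session that holds a watch on that path.
    """
    counts = Counter()
    current_path = None
    for line in wchp_output.splitlines():
        if not line.strip():
            continue
        if line[0].isspace():
            # Indented line: a session ID watching the current path.
            if current_path is not None and current_path.endswith("/connection_info"):
                counts[current_path] += 1
        else:
            # Unindented line: start of a new watched path.
            current_path = line.strip()
    return counts
```

Calling count_connection_info_watches(fetch_wchp(host, port)) before and after a batch of short jobs, and comparing the totals, would show whether watches for finished jobs remain registered.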