Hi all. I am facing a weird issue while upgrading from Solr 8.11 to Solr 9. I have everything up and running, passing all kinds of tests (unit and integration) in my current CD process.
I have a cluster of 3 machines on SolrCloud, and it's all good and working. The problem happens when the machines are restarted: one or two servers of the cluster can't connect to ZooKeeper, even though ZooKeeper reports as healthy and stable. If I then restart Solr on the affected server, it reconnects to the cluster and becomes healthy.

I checked the logs and everything seems normal, except on the servers that try to connect to the cluster and fail on start, where I get the error below. I tried delaying the start of Solr a bit just in case, but no luck.

Any help much appreciated.

Sergio

2024-06-20 12:56:42.944 INFO (main) [ ] o.a.s.c.c.ZkStateReader Updated live nodes from ZooKeeper... (0) -> (2)
2024-06-20 12:56:43.003 INFO (main) [ ] o.a.s.c.DistributedClusterStateUpdater Creating DistributedClusterStateUpdater with useDistributedStateUpdate=false. Solr will be using Overseer based cluster state updates.
2024-06-20 12:56:43.056 INFO (main) [ ] o.a.s.c.ZkController Publish node=server03:8983_solr as DOWN
2024-06-20 12:56:43.088 INFO (main) [ ] o.a.s.c.ZkController Register node as live in ZooKeeper:/live_nodes/server03:8983_solr
2024-06-20 12:56:43.111 ERROR (main) [ ] o.a.s.c.ZkController => org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:125)
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:125) ~[?:?]
    at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1778) ~[?:?]
    at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1650) ~[?:?]
    at org.apache.solr.common.cloud.SolrZkClient.lambda$multi$12(SolrZkClient.java:781) ~[?:?]
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:70) ~[?:?]
    at org.apache.solr.common.cloud.SolrZkClient.multi(SolrZkClient.java:781) ~[?:?]
    at org.apache.solr.cloud.ZkController.createEphemeralLiveNode(ZkController.java:1211) ~[?:?]