[ 
https://issues.apache.org/jira/browse/SOLR-18138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Pugh updated SOLR-18138:
-----------------------------
    Description: 
I ran bin/solr start -e cloud and was playing around with adding/removing 
nodes. It all worked until I stopped the very first Solr node I started. I 
eventually figured out that this was the node hosting ZooKeeper! With no 
ZooKeeper, lots of things stop working.

In the admin UI, 
[http://localhost:9004/solr/admin/collections?_=1772130089267&action=LISTALIASES&wt=json]
 fails, for example. Interestingly, 
[http://localhost:9004/solr/admin/collections?_=1772130089267&action=LIST&wt=json]
 works, I guess because it doesn't consult ZooKeeper?

 

Then I tried 

bin/solr zk ls -z 127.0.0.1:9983 /

 

and got 

ERROR - 2026-02-26 13:25:45.030; org.apache.solr.cli.ZkLsTool; Could not 
complete ls operation for reason:  =>org.apache.solr.common.SolrException: 
java.util.concurrent.TimeoutException: Timeout while waiting for Zookeeper 
Client to connect: 15000 ms

at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:213)
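
The 15000 ms wait above could in principle be preceded by a quick TCP probe, so the CLI fails fast when nothing is listening on the ZooKeeper port. A minimal sketch in Java, not the actual CLI code; the class `ZkProbe`, the method `isReachable`, and the 2000 ms timeout are hypothetical names and values chosen for illustration:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Solr: checks whether a TCP port accepts connections.
final class ZkProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, unreachable host, and connect timeout all land here.
            return false;
        }
    }

    public static void main(String[] args) {
        // 127.0.0.1:9983 is the ZooKeeper address used in the report above.
        boolean up = isReachable("127.0.0.1", 9983, 2000);
        System.out.println(up
            ? "ZooKeeper port is open."
            : "Nothing is listening on 127.0.0.1:9983. Is the Solr node hosting ZooKeeper running?");
    }
}
```

An open port only shows that something is listening, not that ZooKeeper is healthy, so a probe like this would complement the existing connect timeout rather than replace it.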

 

Using the Solr endpoint isn't any better:

bin/solr zk ls -s http://localhost:9004 /

 

ERROR: Error from server at 
http://localhost:9004/solr/admin/collections?action=CLUSTERSTATUS&wt=javabin: 
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /live_nodes

 

I think we could provide a better error message to the user in the CLI. And 
maybe LISTALIASES ought to return a better error than "KeeperErrorCode 
= ConnectionLoss for /aliases.json"?
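
One way to read the suggestion above: walk the exception's cause chain and, when a ZooKeeper timeout or connection loss shows up, print a hint instead of the raw error. A minimal sketch in Java that assumes nothing about the real CLI internals; the class `ZkErrorHints`, the method `friendlyMessage`, and the hint wording are all hypothetical, and it matches on class and message text so it runs without Solr or ZooKeeper on the classpath:

```java
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Solr: turns low-level ZooKeeper failures into a hint.
final class ZkErrorHints {

    /** Returns a friendlier hint for ZooKeeper connectivity errors, else the original message. */
    static String friendlyMessage(Throwable error, String target) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            String name = t.getClass().getName();
            String msg = String.valueOf(t.getMessage());
            // The reported errors mention TimeoutException and ConnectionLossException.
            boolean timedOut = t instanceof TimeoutException || msg.contains("TimeoutException");
            boolean connectionLoss = name.contains("ConnectionLoss") || msg.contains("ConnectionLoss");
            if (timedOut || connectionLoss) {
                return "Could not reach ZooKeeper via " + target
                    + ". Is the Solr node that hosts ZooKeeper still running?";
            }
        }
        return String.valueOf(error.getMessage());
    }

    public static void main(String[] args) {
        // Simulates the wrapped timeout from the zk ls report above.
        Exception cause = new TimeoutException(
            "Timeout while waiting for Zookeeper Client to connect: 15000 ms");
        System.out.println(friendlyMessage(new RuntimeException(cause), "127.0.0.1:9983"));
    }
}
```

Matching on text is fragile; a real change would more likely catch the specific exception types where the CLI already handles errors.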

 

 

Restarting the first node brings things back to life!

"bin/solr" start -p 8983 --solr-home "example/cloud/node1/solr" --server-dir 
"/Users/epugh/Documents/projects/solr/solr/packaging/build/dev/server"

 


> Handle in CLI when Solr hosting ZK is not reachable
> ---------------------------------------------------
>
>                 Key: SOLR-18138
>                 URL: https://issues.apache.org/jira/browse/SOLR-18138
>             Project: Solr
>          Issue Type: Sub-task
>          Components: cli
>            Reporter: Eric Pugh
>            Priority: Minor



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
