dsmiley commented on code in PR #2935: URL: https://github.com/apache/solr/pull/2935#discussion_r1925844451
########## solr/CHANGES.txt: ########## @@ -176,6 +176,10 @@ Bug Fixes * SOLR-17629: If SQLHandler failed to open the underlying stream (e.g. Solr returns an error; could be user/syntax problem), it needs to close the stream to cleanup resources but wasn't. (David Smiley) +* SOLR-17519: SolrJ fix to CloudSolrClient and HTTP ClusterState where it can fail to request cluster state + if its current live nodes list is stale. HTTP ClusterState now retains the initial configured list of passed URLs as backup Review Comment: I don't think Solr users know about "HTTP ClusterState"; it's not likely in their vocabulary. On the other hand, wording like "SolrJ CloudSolrClient configured with Solr URLs" is likely to be understood. ########## solr/solrj/src/java/org/apache/solr/client/solrj/impl/BaseHttpClusterStateProvider.java: ########## @@ -216,38 +236,42 @@ private SimpleOrderedMap<?> submitClusterStateRequest( @Override public Set<String> getLiveNodes() { - if (liveNodes == null) { - throw new RuntimeException( - "We don't know of any live_nodes to fetch the" - + " latest live_nodes information from. " - + "If you think your Solr cluster is up and is accessible," - + " you could try re-creating a new CloudSolrClient using working" - + " solrUrl(s) or zkHost(s)."); - } if (TimeUnit.SECONDS.convert((System.nanoTime() - liveNodesTimestamp), TimeUnit.NANOSECONDS) > getCacheTimeout()) { - for (String nodeName : liveNodes) { - String baseUrl = Utils.getBaseUrlForNodeName(nodeName, urlScheme); - try (SolrClient client = getSolrClient(baseUrl)) { - Set<String> liveNodes = fetchLiveNodes(client); - this.liveNodes = (liveNodes); - liveNodesTimestamp = System.nanoTime(); - return liveNodes; - } catch (Exception e) { - log.warn("Attempt to fetch cluster state from {} failed.", baseUrl, e); - } - } + + if (liveNodes.stream() + .anyMatch((node) -> updateLiveNodes(URLUtil.getBaseUrlForNodeName(node, urlScheme)))) + return this.liveNodes; + + log.warn( + "Attempt to fetch cluster state from all known live nodes {} failed. Trying backup nodes", + liveNodes); + + if (configuredNodes.stream().anyMatch((node) -> updateLiveNodes(node.toString()))) Review Comment: Shouldn't we check configuredNodes *first* so as to give the user the possibility of some control, basically addressing the conversation points in JIRA? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org