Hi all,

I've been working on a feature for Apache StormCrawler (Incubating) (https://github.com/apache/incubator-stormcrawler/pull/1488), for which we would like to be able to:

1. Use CloudHttp2SolrClient
   <https://solr.apache.org/docs/9_8_0/solrj/org/apache/solr/client/solrj/impl/CloudHttp2SolrClient.html>
   to communicate with a SolrCloud cluster
2. Send asynchronous query requests, as one can do with
   Http2SolrClient#requestAsync
   <https://solr.apache.org/docs/9_8_0/solrj/org/apache/solr/client/solrj/impl/Http2SolrClient.html#requestAsync(org.apache.solr.client.solrj.SolrRequest,java.lang.String)>
3. Send batched updates, as one can do with ConcurrentUpdateSolrClient
   <https://solr.apache.org/docs/9_8_0/solrj/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrClient.html>

As far as I can tell, neither (2) nor (3) is supported out of the box by CloudHttp2SolrClient, so I tried the following workaround instead:

For asynchronous query requests:

 * Get the wrapped LBHttp2SolrClient of the CloudHttp2SolrClient.
 * Get the active Solr endpoints from the cluster state.
 * Shuffle the endpoints for basic load balancing.
   From the Javadoc of LBHttp2SolrClient#requestAsync:
       Execute an asynchronous request against one or more hosts for a
       given collection. The passed-in Req object includes a List of
       Endpoints. This method always begins with the first Endpoint in
       the list and if unsuccessful tries each in turn until the
       request is successful. Consequently, this method does not
       actually Load Balance. It is up to the caller to shuffle the
       List of Endpoints if Load Balancing is desired.
 * Make the LBHttp2SolrClient#requestAsync call.

   Here
   <https://github.com/apache/incubator-stormcrawler/blob/main/external/solr/src/main/java/org/apache/stormcrawler/solr/SolrConnection.java#L66-L96>
   is how I have implemented the above in code.
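
To illustrate the idea independently of SolrJ, here is a minimal, self-contained sketch of the "shuffle, then try each endpoint in turn" pattern described above. It is not the linked implementation and does not call any Solr API: ShuffledFailover, tryInOrder, request and the call function are hypothetical names, and endpoints are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical helper, not part of SolrJ: stands in for the load-balanced client.
final class ShuffledFailover {

    // Tries the endpoints in list order, moving to the next one when a call fails.
    static <T> CompletableFuture<T> tryInOrder(
            List<String> endpoints, int index,
            Function<String, CompletableFuture<T>> call) {
        if (index >= endpoints.size()) {
            return CompletableFuture.failedFuture(
                    new IllegalStateException("All endpoints failed"));
        }
        return call.apply(endpoints.get(index))
                .<CompletableFuture<T>>handle((result, error) -> error == null
                        ? CompletableFuture.completedFuture(result)
                        : tryInOrder(endpoints, index + 1, call))
                .thenCompose(f -> f);
    }

    // Shuffles a copy of the active endpoints so that the first attempt is spread
    // across nodes, then delegates to the in-order failover above.
    static <T> CompletableFuture<T> request(
            List<String> activeEndpoints,
            Function<String, CompletableFuture<T>> call) {
        List<String> shuffled = new ArrayList<>(activeEndpoints);
        Collections.shuffle(shuffled);
        return tryInOrder(shuffled, 0, call);
    }
}
```

In the real code, the call function would presumably be replaced by the asynchronous request on the wrapped client, and the endpoint list would come from the cluster state.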

For batched updates, however, the only alternative I can think of is to implement the batching manually, but this seems convoluted and probably goes against the design of CloudSolrClient.
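
By manual batching I mean something along the lines of the sketch below. It is only an illustration under my own assumptions: UpdateBatcher and its sender callback are hypothetical names, not SolrJ classes, and a time-based flush and error handling are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of SolrJ: buffers documents and sends them in batches.
final class UpdateBatcher<D> {
    private final int batchSize;
    private final Consumer<List<D>> sender; // receives one full batch per invocation
    private final List<D> buffer = new ArrayList<>();

    UpdateBatcher(int batchSize, Consumer<List<D>> sender) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sender = sender;
    }

    // Buffers one document and flushes as soon as the batch size is reached.
    synchronized void add(D doc) {
        buffer.add(doc);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Sends whatever is buffered; also meant to be called on shutdown.
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<D> batch = new ArrayList<>(buffer);
        buffer.clear();
        sender.accept(batch);
    }
}
```

In practice the sender would presumably pass each batch to a single bulk add call on the cloud client, so that every flush results in one update request instead of one request per document.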

Are there any plans to support asynchronous requests and/or batched updates in CloudHttp2SolrClient in future Solr releases?

Do you have any suggestions regarding the workarounds I described above?

Thanks a lot in advance,

Markos Volikas
