Hi all,
I've been working on a feature for Apache StormCrawler
(Incubating) (https://github.com/apache/incubator-stormcrawler/pull/1488),
where we would like to be able to:
1. Use CloudHttp2SolrClient
<https://solr.apache.org/docs/9_8_0/solrj/org/apache/solr/client/solrj/impl/CloudHttp2SolrClient.html>
to communicate with a Solr Cloud cluster
2. Send asynchronous query requests as one can do with
Http2SolrClient#requestAsync
<https://solr.apache.org/docs/9_8_0/solrj/org/apache/solr/client/solrj/impl/Http2SolrClient.html#requestAsync(org.apache.solr.client.solrj.SolrRequest,java.lang.String)>
3. Send batched updates like one can do with ConcurrentUpdateSolrClient
<https://solr.apache.org/docs/9_8_0/solrj/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrClient.html>
As far as I can tell, CloudHttp2SolrClient supports neither (2) nor (3)
out of the box, so I tried the following alternative instead:
For asynchronous query requests:
* Get the wrapped LBHttp2SolrClient of the CloudHttp2SolrClient.
* Get the active Solr endpoints from the cluster state.
* Shuffle the endpoints for basic load balancing.
  o From LBHttp2SolrClient#requestAsync:
      "Execute an asynchronous request against one or more hosts for a
      given collection. The passed-in Req object includes a List of
      Endpoints. This method always begins with the first Endpoint in
      the list and if unsuccessful tries each in turn until the
      request is successful. Consequently, this method does not
      actually Load Balance. It is up to the caller to shuffle the
      List of Endpoints if Load Balancing is desired."
* Call LBHttp2SolrClient#requestAsync with the shuffled endpoints.
My implementation of the steps above is here:
<https://github.com/apache/incubator-stormcrawler/blob/main/external/solr/src/main/java/org/apache/stormcrawler/solr/SolrConnection.java#L66-L96>
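To make the shuffle-then-failover idea concrete without depending on SolrJ, here is a minimal sketch in plain Java (9 or later). It is not the linked implementation and it calls no SolrJ API: the class ShuffledFailover, its request method, and the send function are hypothetical names, endpoints are plain strings, and the failover loop only imitates the behaviour described in the quoted documentation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

/** Hypothetical helper, not part of SolrJ: shuffle endpoints, then try them in order. */
final class ShuffledFailover {

    /**
     * Copies and shuffles the endpoint list, sends the request to the first
     * endpoint, and falls back to the next one whenever a call fails.
     *
     * @param endpoints base URLs of the live nodes (plain strings in this sketch)
     * @param send      assumed async call, standing in for the real client request
     */
    static <T> CompletableFuture<T> request(
            List<String> endpoints, Function<String, CompletableFuture<T>> send) {
        List<String> shuffled = new ArrayList<>(endpoints); // leave the caller's list untouched
        Collections.shuffle(shuffled);                      // basic load balancing
        return attempt(shuffled, 0, send, null);
    }

    private static <T> CompletableFuture<T> attempt(
            List<String> endpoints, int index,
            Function<String, CompletableFuture<T>> send, Throwable lastError) {
        if (index >= endpoints.size()) {
            // Every endpoint failed (or none was given): report the last error seen.
            return CompletableFuture.failedFuture(
                    lastError != null
                            ? lastError
                            : new IllegalStateException("no endpoints available"));
        }
        CompletableFuture<T> call;
        try {
            call = send.apply(endpoints.get(index));
        } catch (RuntimeException e) {
            call = CompletableFuture.failedFuture(e); // treat a synchronous throw as a failed call
        }
        return call
                .<CompletableFuture<T>>handle((result, error) -> error == null
                        ? CompletableFuture.completedFuture(result)
                        : attempt(endpoints, index + 1, send, error))
                .thenCompose(next -> next); // flatten the nested future
    }
}
```

A caller would pass the list of live node URLs and a function that issues the real request, for example ShuffledFailover.request(urls, url -> sendQuery(url)), where sendQuery is again a placeholder. Because the list is reshuffled on every call, the first endpoint tried varies from request to request.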
For batching updates, however, the only alternative I can think of is
implementing the batching manually (buffering documents and sending
them in bulk), but this seems convoluted and probably goes against the
design of CloudSolrClient.
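For reference, the kind of manual batching I have in mind could look roughly like the sketch below. It is plain Java with no SolrJ calls: UpdateBatcher and all of its members are hypothetical names, the sink stands in for whatever bulk call the client offers, and the current time is passed in explicitly only to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper, not part of SolrJ: buffers items and hands them over in batches. */
final class UpdateBatcher<T> {
    private final int maxBatchSize;
    private final long maxAgeMillis;
    private final Consumer<List<T>> sink;   // assumed to wrap one bulk update call
    private final List<T> buffer = new ArrayList<>();
    private long oldestMillis;              // arrival time of the first buffered item

    UpdateBatcher(int maxBatchSize, long maxAgeMillis, Consumer<List<T>> sink) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be at least 1");
        }
        this.maxBatchSize = maxBatchSize;
        this.maxAgeMillis = maxAgeMillis;
        this.sink = sink;
    }

    /** Buffers one item; flushes when the batch is full or its oldest item is too old. */
    synchronized void add(T item, long nowMillis) {
        if (buffer.isEmpty()) {
            oldestMillis = nowMillis;
        }
        buffer.add(item);
        if (buffer.size() >= maxBatchSize || nowMillis - oldestMillis >= maxAgeMillis) {
            flush();
        }
    }

    /** Sends whatever is buffered as one batch; does nothing when the buffer is empty. */
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = new ArrayList<>(buffer); // copy so the sink owns its list
        buffer.clear();
        sink.accept(batch);
    }
}
```

Note the limits of this sketch: the age check only runs when a new item arrives, so a real version would also need a timer (or a flush on shutdown) to push out a partially filled batch, and a batch is dropped if the sink throws. Handling those cases is part of what makes the manual approach feel convoluted.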
Is there any plan to include asynchronous requests and/or batched
updates in CloudHttp2SolrClient in future Solr releases?
Do you have any suggestions on the alternatives I described above?
Thanks a lot in advance,
Markos Volikas