Hi Markos,

I'll answer the easiest question first. The "requestAsync" method is relatively new to our SolrJ API. I don't know of any concrete plans, but I would expect it to be added to more client implementations over time (and ultimately end up on the SolrClient interface).

Update batching is a different story though. CloudSolrClient and ConcurrentUpdateSolrClient offer two fundamentally different approaches to speeding up update requests. As you know, the "Concurrent" client adds documents to a queue internally and streams them to a single endpoint, using batching where possible. The Cloud client, on the other hand, figures out which shard each document belongs to and routes documents directly to that shard's leader. It may be possible to reconcile those approaches in the future and have the "Cloud" client use both optimizations, but I haven't seen much discussion of that, so IMO it's unlikely that you'll see this in an upcoming release. It'd be a great improvement to have though, so if you have any interest in contributing I'm more than happy to review and do what I can to help move this forward. Let me know!

In terms of the workarounds you suggested, I can't suggest any improvements to your asynchronous-query flow. But on the "update batching" side, you might see most of the benefits that the ConcurrentUpdate client offers if you can find a way to prevent users from calling the single-document update API that SolrClients offer, i.e. `SolrClient.add(SolrInputDocument)`. The simplest way to do that might be to create a trivial CloudHttp2SolrClient subclass that overrides `add(SolrInputDocument)` to throw an UnsupportedOperationException or some other relevant error. That'd nudge other "stormcrawler" devs towards using the batch-update method, which is more similar to what the Concurrent client does under the hood.

Good luck,

Jason

On Sun, Apr 27, 2025 at 9:38 AM Markos Volikas <mvoli...@apache.org> wrote:
>
> Hi all,
>
> I've been working on a feature for Apache StormCrawler
> (Incubating) (https://github.com/apache/incubator-stormcrawler/pull/1488),
> where we would like to be able to
>
> 1. Use CloudHttp2SolrClient
>    <https://solr.apache.org/docs/9_8_0/solrj/org/apache/solr/client/solrj/impl/CloudHttp2SolrClient.html>
>    to communicate with a Solr Cloud cluster
> 2. Send asynchronous query requests as one can do with
>    Http2SolrClient#requestAsync
>    <https://solr.apache.org/docs/9_8_0/solrj/org/apache/solr/client/solrj/impl/Http2SolrClient.html#requestAsync(org.apache.solr.client.solrj.SolrRequest,java.lang.String)>
> 3. Send batched updates like one can do with ConcurrentUpdateSolrClient
>    <https://solr.apache.org/docs/9_8_0/solrj/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrClient.html>
>
> From what I found, neither (2) nor (3) can be done out of the box, so I
> tried the following alternative instead:
>
> For asynchronous query requests:
>
> * Get the wrapped LBHttp2SolrClient of the CloudHttp2SolrClient.
> * Get the active Solr endpoints from the cluster state.
> * Shuffle the endpoints for basic load balancing.
>     o From LBHttp2SolrClient#requestAsync:
>       Execute an asynchronous request against one or more hosts for a
>       given collection. The passed-in Req object includes a List of
>       Endpoints. This method always begins with the first Endpoint in
>       the list and if unsuccessful tries each in turn until the
>       request is successful. Consequently, this method does not
>       actually Load Balance. It is up to the caller to shuffle the
>       List of Endpoints if Load Balancing is desired.
> * Make the LBHttp2SolrClient#requestAsync call
>
> Here
> <https://github.com/apache/incubator-stormcrawler/blob/main/external/solr/src/main/java/org/apache/stormcrawler/solr/SolrConnection.java#L66-L96>
> is how I have implemented the above in code.
>
> For batching updates, however, the only alternative I can think of is
> implementing the batching manually, but this seems convoluted and
> probably against the architecture of the CloudSolrClient.
>
> Is there any plan to include asynchronous requests and/or batched
> updates in CloudHttp2SolrClient in future Solr releases?
>
> Do you have any suggestions on the alternatives I described above?
>
> Thanks a lot in advance,
>
> Markos Volikas
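
For the manual batching discussed in this thread, here is a minimal sketch of what a small buffer in front of the client could look like. It is not part of SolrJ or StormCrawler: `BatchBuffer`, `batchSize`, and `sink` are hypothetical names, and the class is kept generic so that it has no Solr dependency. The idea is only that callers hand over one document at a time, while the client only ever sees whole batches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not a SolrJ class): collects items one at a time and
 * hands them to a sink in batches, so the underlying client only ever
 * receives multi-document requests.
 */
final class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one item and flushes as soon as a full batch has accumulated. */
    synchronized void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is buffered, e.g. on a timer tick or at shutdown. */
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Copy first so the sink never sees a list that is later cleared.
        List<T> batch = new ArrayList<>(pending);
        pending.clear();
        sink.accept(batch);
    }
}
```

In real use, `T` would presumably be `SolrInputDocument` and the sink would wrap a call to the collection-based `SolrClient.add(String collection, Collection<SolrInputDocument> docs)`, converting its checked exceptions as needed. Note that this sketch calls the sink while holding the lock, which keeps it simple but serializes senders; a production version would likely also want a time-based flush.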