I'm not so sure this is as bad as it sounds. When your collection is sharded, no single node knows about the documents in other shards/nodes, so to find the total number, a query will need to go to every node.
Trying to work out something to do a single request to every node,
combine their collection statistics and aggregate them into a single
result sounds very complicated, and likely overkill. Are you needing to
collect this information often? Do you have a lot of collections?

Upayavira

On Fri, Jun 5, 2015, at 06:29 AM, Zheng Lin Edwin Yeo wrote:
> I'm trying to write a SolrJ program in Java to read and consolidate all
> the information into a JSON file, The client will just need to call this
> SolrJ program and read this JSON file to get the details. But the problem
> is we are still querying the Solr once for each collection, just that
> this time it is done in the SolrJ program in a for-loop, while previously
> it's done on the client side. Not sure will this lead to performance
> improvement?
>
> For your suggestion on spawning a bunch of threads, does it mean the same
> thing as I did?
>
> Regards,
> Edwin
>
>
> On 5 June 2015 at 12:03, Erick Erickson <[email protected]> wrote:
>
> > Have you considered spawning a bunch of threads, one per collection
> > and having them all run in parallel?
> >
> > Best,
> > Erick
> >
> > On Thu, Jun 4, 2015 at 4:52 PM, Zheng Lin Edwin Yeo
> > <[email protected]> wrote:
> > > The reason we wanted to do a single call is to improve on the
> > > performance, as our application requires to list the total number of
> > > records in each of the collections, and the number of records that
> > > matches the query each of the collections.
> > >
> > > Currently we are querying each collection one by one to retrieve the
> > > numFound value and display them, but this can slow down the system
> > > significantly when the number of collection grows. So we are thinking
> > > of ways to improve the speed in this area.
> > >
> > > Any other methods which you can suggest that we can do to overcome
> > > this speed problem?
> > >
> > > Regards,
> > > Edwin
> > > On 5 Jun 2015 00:16, "Erick Erickson" <[email protected]> wrote:
> > >
> > >> Not in a single call that I know of. These are really orthogonal
> > >> concepts. Getting the cluster status merely involves reading the
> > >> Zookeeper clusterstate whereas getting the total number of docs for
> > >> each would involve querying each collection, i.e. going to the Solr
> > >> nodes themselves. I'd guess it's unlikely to be combined.
> > >>
> > >> Best,
> > >> Erick
> > >>
> > >> On Thu, Jun 4, 2015 at 7:47 AM, Zheng Lin Edwin Yeo
> > >> <[email protected]> wrote:
> > >> > Hi,
> > >> >
> > >> > Would like to check, are we able to use the Collection API or any
> > >> > other method to list all the collections in the cluster together
> > >> > with the number of records in each of the collections in one
> > >> > output?
> > >> >
> > >> > Currently, I only know of the List Collections
> > >> > /admin/collections?action=LIST. However, this only list the names
> > >> > of the collections that are in the cluster, but not the number of
> > >> > records.
> > >> >
> > >> > Is there a way to show the number of records in each of the
> > >> > collections as well?
> > >> >
> > >> > Regards,
> > >> > Edwin
> > >>
> >
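
For anyone wondering how the one-thread-per-collection suggestion differs from a plain for-loop, here is a minimal sketch. It is not from the thread itself: ParallelCollectionCounts, countAll and countFn are hypothetical names, and countFn is only a stand-in for whatever per-collection lookup you already have (presumably a SolrJ query with rows=0 that returns numFound). Each lookup still has to reach every shard of its collection; the sketch just lets the waits overlap instead of running back to back.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

// Hypothetical helper (not part of SolrJ): runs one count lookup per
// collection on a thread pool instead of one after another.
class ParallelCollectionCounts {

    // countFn is an assumption: it stands in for the real per-collection
    // lookup, e.g. a rows=0 query whose numFound is returned.
    static Map<String, Long> countAll(List<String> collections,
                                      ToLongFunction<String> countFn,
                                      int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit every lookup first so they can run concurrently.
            List<Future<Long>> futures = new ArrayList<>();
            for (String name : collections) {
                Callable<Long> task = () -> countFn.applyAsLong(name);
                futures.add(pool.submit(task));
            }
            // Gather results in the same order the collections were given.
            Map<String, Long> counts = new LinkedHashMap<>();
            for (int i = 0; i < collections.size(); i++) {
                counts.put(collections.get(i), futures.get(i).get());
            }
            return counts;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while counting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A count lookup failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The resulting map could then be written out as the JSON file the client reads. Whether this actually helps depends on how much of the per-collection time is spent waiting on Solr, so it is worth measuring before and after.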
