Thanks Walter. We have 1-to-1 mapping, each physical server hosts a single solr core so that CPU resource per node is sufficient. Because of the constant index size growth and each shard can only hold X million documents, and we don't want shards to reach their maximum capacity, therefore the number of shards can grow into several hundreds. My question is from solr cloud/Zookeeper perspective, is there a hard limit or performance impact when number of shards(and correspondingly number of nodes) reachs a certain threshold? Another option I see is to split the data into multiple separate small solr clouds, but then we have to handle the aggregation of results from different clouds outside of solr, which has challenges like how to compare respective solr scores and handle sorting etc.
Thanks, Wei On Mon, Jan 9, 2023 at 7:39 PM Walter Underwood <wun...@wunderwood.org> wrote: > > On Jan 9, 2023, at 1:29 PM, Wei <weiwan...@gmail.com> wrote: > > > > Is there a practical limit on the number of shards and nodes in a solr > > cloud? We need to scale up the solr cloud and wonder if there is concern > > when increasing to a couple of hundred shards and several thousand nodes > in > > a single cloud. Any suggestions? > > It was challenging to manage with 8 shards and a replication factor of 8. > At that point, we scaled vertically to bigger AWS instances. It scaled > smoothly up to 72 CPU instances. > > wunder > Walter Underwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > >