My two cents:
1) Try to have shards small enough that the entire index fits in memory and can be cached; less disk access = more speed. So the number of shards depends on the memory available on your nodes. If you still need more read throughput, add more replicas.
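That memory rule of thumb can be sketched numerically. This is my own back-of-the-envelope sketch, not an official Solr formula; the function name and the "fit each shard in the OS page cache" framing are assumptions:

```python
import math

def estimate_shards(total_index_gb, nodes, cache_gb_per_node):
    """Smallest shard count such that each shard's index fits in one
    node's free memory (OS page cache), so reads rarely hit disk."""
    if cache_gb_per_node <= 0:
        raise ValueError("cache_gb_per_node must be positive")
    shards = math.ceil(total_index_gb / cache_gb_per_node)
    # Starting with at least one shard per node is a common convention,
    # not a hard rule.
    return max(shards, nodes)

# Example: 200 GB total index, 8 nodes, ~25 GB of cache headroom per node
print(estimate_shards(200, 8, 25))  # -> 8
```

If the result says you need more shards than you have nodes, that is usually a hint to add memory or nodes rather than to over-shard, since each extra shard adds per-query fan-out cost.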
2) At some point the network chatter becomes too much due to the number of shards.
3) There is a limit of ~2 billion documents per shard (if this didn't change in recent versions).

-ufuk

> On 13 Sep 2023, at 11:32, Saksham Gupta <saksham.gu...@indiamart.com.invalid> wrote:
>
> Hi All,
>
> I have been trying to reduce the response time of our Solr Cloud (v8.10, 8 nodes). To achieve this, I have tried increasing the number of shards, which reduces the data size on each shard and thereby the response time.
>
> I have encountered a few questions regarding sharding strategy:
>
> 1. How do we decide the ideal number of shards? Is there a minimum or maximum number of shards that should be used?
>
> 2. What is the minimum size of a shard below which reducing the size further won't have any effect on the response time (as time taken by other factors like data aggregation will compensate for it)?
>
> 3. Is there some maximum limit to the size of data that should be kept in a shard?
>
> As of now we have 8 shards, each on a separate node, with ~25 GB of data (15-16 million docs) on each shard. Please advise on standard approaches to choosing the number of shards and shard size. Thanks in advance.
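On point 3: the ~2 billion figure comes from Lucene, which caps a single index (one shard core) at roughly Integer.MAX_VALUE documents. A quick headroom check, using the numbers from the thread above (the function name is mine, for illustration):

```python
# Approximate Lucene per-core document ceiling (~Integer.MAX_VALUE);
# the exact IndexWriter cap is slightly lower than this.
LUCENE_MAX_DOCS = 2_147_483_647

def shard_headroom(docs_per_shard):
    """Fraction of the per-shard document limit still unused."""
    return 1 - docs_per_shard / LUCENE_MAX_DOCS

# ~16 million docs per shard, as in the setup described above
print(f"{shard_headroom(16_000_000):.1%}")  # -> 99.3%
```

At 15-16 million docs per shard you are nowhere near the hard limit, so the practical ceiling in this setup is memory and query fan-out, not document count.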