Hi all, we are running a SolrCloud cluster of 59 shards [1 replica per shard] spread across 8 nodes. We use implicit routing for indexing and searching data across these shards.
After analyzing the Solr timeouts, we found that about 84% [3097/3693 timeouts on 9th July] came from just one replica, which is larger than the others [the other replicas contain < 5 GB of data each, whereas this one contains 10 GB].

1. Has anyone faced a similar issue, and how did you mitigate it? Is there a way to increase the timeout for a particular replica/node?
2. Has anyone tried to further divide a single shard's data into multiple shards? How can we plan this, given that there is already a logical separation [implicit routing] between the 59 shards, and we would be adding another layer of logic to subdivide the data for one of them?
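
To make question 2 concrete, the extra layer could look roughly like the sketch below. It is only an illustration under assumptions, not a tested design: the shard name `shard_17`, the `_<n>` suffix for sub-shards, the `SPLIT_COUNTS` table, and the choice of a CRC32 hash of the document id are all hypothetical, and the sub-shards themselves would still have to be created in the collection separately.

```python
import zlib

# Hypothetical table: logical shard name -> number of sub-shards.
# Shards that are not listed keep their single, original shard.
SPLIT_COUNTS = {"shard_17": 2}


def target_shard(logical_shard, doc_id, split_counts=SPLIT_COUNTS):
    """Pick the physical shard a document should be indexed into.

    The existing implicit-routing logic is assumed to have already
    chosen `logical_shard`; this only adds the second level.
    """
    n = split_counts.get(logical_shard, 1)
    if n == 1:
        return logical_shard
    # CRC32 is stable across processes, unlike Python's built-in hash().
    bucket = zlib.crc32(doc_id.encode("utf-8")) % n
    return f"{logical_shard}_{bucket}"


def query_shards(logical_shard, split_counts=SPLIT_COUNTS):
    """List every physical shard a search for `logical_shard` must cover."""
    n = split_counts.get(logical_shard, 1)
    if n == 1:
        return [logical_shard]
    return [f"{logical_shard}_{i}" for i in range(n)]


if __name__ == "__main__":
    print(target_shard("shard_03", "doc-1"))  # unsplit shard, name unchanged
    print(target_shard("shard_17", "doc-1"))  # one of the two sub-shards
    print(query_shards("shard_17"))           # both sub-shards
```

The names returned here would replace the shard name wherever the current routing logic passes one at index time, while searches for the oversized shard would need to cover every name from `query_shards`. Changing the sub-shard count later would move documents between buckets, so it would require reindexing that shard's data.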