Hi All,

We are running a SolrCloud cluster of 59 shards [1 replica per shard]
spread across 8 nodes. We use implicit routing for indexing and
searching data across these shards.

On analyzing the Solr timeouts, we found that about 84% [3097 of 3693
timeouts on 9th July] came from just 1 replica, which is larger than the
others [the other replicas each contain < 5 GB of data, whereas this one
contains 10 GB].
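
As a quick check of the share quoted above, the two counts from the 9th July analysis can be divided directly; it works out to just under 84%. The variable names are only illustrative.

```python
# Counts taken from the 9th July timeout analysis described above.
timeouts_on_large_replica = 3097
total_timeouts = 3693

# Fraction of all timeouts attributable to the single large replica.
share = timeouts_on_large_replica / total_timeouts
print(f"{share:.1%} of timeouts hit the large replica")
```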

1. Has anyone faced a similar issue, and if so, how did you mitigate it?
Is there a way to increase the timeout for a particular replica/node?

2. Also, has anyone tried to further divide a shard's data into multiple
shards? How should we plan this, given that there is already a logical
separation [implicit routing] between the 59 shards, and we would be
adding another layer of logic to subdivide the data for 1 of them?
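
To make question 2 concrete, here is a minimal sketch of how an extra shard could be requested for an implicit-routed collection through the Collections API CREATESHARD action. The host, collection name, and shard name below are hypothetical placeholders, and this only builds the request URL. As far as we understand, creating the shard does not move any existing documents; they would have to be reindexed with routing that targets the new shard.

```python
from urllib.parse import urlencode


def create_shard_url(base_url: str, collection: str, shard: str) -> str:
    """Build a Collections API CREATESHARD request URL (does not send it)."""
    params = {
        "action": "CREATESHARD",
        "collection": collection,  # hypothetical collection name
        "shard": shard,            # hypothetical name for the new shard
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder values, not taken from our actual cluster.
url = create_shard_url("http://localhost:8983/solr", "mycollection", "shard59_b")
print(url)
```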
