Re: Unusually High Number of timeouts on 1 Solr Shard

Saksham Gupta Fri, 19 Jul 2024 03:03:57 -0700

Hi Aman,
Yes, I mean the shard itself is 10 GB in size. The index was created from
scratch, so there should be no recovery issue.


Have you tried subdividing a shard further? I was thinking of splitting
this shard's data on a numeric field [for instance, computing id mod 2 and
assigning each document to a sub-shard based on the result]. Is there a
better way to achieve this?
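
A rough sketch of the id mod 2 idea, in Python. The parent shard name
"shard42", the sub-shard naming scheme, and both helper functions are
hypothetical and only illustrate the partitioning logic; they are not part
of Solr's API, and the sketch assumes the id is an integer.

```python
from typing import Dict, Iterable, List


def subshard_for(doc_id: int, parent: str = "shard42", parts: int = 2) -> str:
    """Return the (hypothetical) sub-shard name for a numeric document id."""
    if parts < 1:
        raise ValueError("parts must be >= 1")
    # id mod parts picks the sub-shard; with parts=2 this is even/odd.
    return f"{parent}_{doc_id % parts}"


def partition(doc_ids: Iterable[int], parent: str = "shard42",
              parts: int = 2) -> Dict[str, List[int]]:
    """Group document ids by the sub-shard they would be assigned to."""
    buckets: Dict[str, List[int]] = {f"{parent}_{i}": [] for i in range(parts)}
    for doc_id in doc_ids:
        buckets[subshard_for(doc_id, parent, parts)].append(doc_id)
    return buckets


if __name__ == "__main__":
    print(partition([1, 2, 3, 4]))
```

The same function would have to be applied at both index time and query
time, so that a search is sent to the sub-shard that actually holds the
document.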

On Wed, Jul 17, 2024 at 9:25 PM Aman Tandon <amantandon...@gmail.com> wrote:

> Hi Saksham,
>
> When you say one replica is larger, do you mean that its shard is also
> 10 GB in size? If not, please check whether an old recovery issue has
> left old logs or index files behind.
>
> If this shard really does hold 10 GB, please try dividing its data. I
> hope you can try this in a development environment before applying it to
> production clusters.
>
> Regards,
> Aman
>
> On Wed, 17 Jul 2024, 18:12 Saksham Gupta,
> <saksham.gu...@indiamart.com.invalid> wrote:
>
> > Hi All,
> > Pinging again for assistance. This is a very unusual case, and it is
> > ruining the user experience for a particular type of search [searches
> > mapped to the replica facing timeouts], as these requests are taking
> > more than 3 seconds.
> >
> > On Wed, Jul 17, 2024 at 11:37 AM Saksham Gupta
> > <saksham.gu...@indiamart.com> wrote:
> >
> > > Hi All,
> > >
> > > We are using a SolrCloud cluster of 59 shards [1 replica for each
> > > shard] spread across 8 nodes. We use implicit routing for indexing and
> > > searching data across these shards.
> > >
> > > Upon analyzing the timeouts on Solr, we found that about 84%
> > > [3097/3693 timeouts on 9th July] of them came from just 1 replica,
> > > which is larger than the other replicas [the other replicas contain
> > > < 5 GB of data each, whereas this replica contains 10 GB].
> > >
> > > 1. Has anyone faced a similar issue, and how did you mitigate it? Is
> > > there a way to increase the timeout for a particular replica/node?
> > >
> > > 2. Also, has anyone tried to further divide a shard's data into
> > > multiple shards? How can we plan this, given that there is already a
> > > logical separation [implicit routing] between the 59 shards, and we
> > > would be adding another layer of logic to subdivide the data of 1 of
> > > them?
> > >
> >
>
