Hi Houston,

I forgot to mention that we are using NRT in all replicas. My main concern is 
not normal operation, but rather what happens when problems occur.

Imagine a situation where there are so many indexing/reading operations that 
the Solr cluster starts to lag. In that scenario, we noticed that one NRT 
replica (follower) had issues while updating its index: we found many network 
requests trying to reach the leader, but the network was so overloaded that 
they could not get through. The replica created a temporary index for a shard, 
failed, then created another one, and so on until the disk was full.

This happened in two instances. Luckily, the leader did not crash and we were 
able to recover. Importantly, this happened when all 5 shards had their leader 
on the same node, and we assumed that, with all the followers trying to 
communicate with that one leader, the resulting traffic/load was too much for 
it to respond in a timely manner.

My question was basically this: if we spread the shard leadership across the 
instances, could that alleviate the problem of followers falling behind and 
putting too much load on a single leader?
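
For reference, a minimal sketch of how leadership could be spread out, assuming 
the Collections API actions BALANCESHARDUNIQUE (to assign the preferredLeader 
property to one replica per shard, spread evenly across nodes) and 
REBALANCELEADERS (to make those preferred replicas the actual leaders). The 
host, the collection name and the helper function are placeholders for 
illustration only, and the sketch only builds the request URLs. Whether this 
actually relieves the overload described above is exactly what I am asking.

```python
from urllib.parse import urlencode


def leader_rebalance_urls(base_url, collection, max_at_once=1, max_wait_seconds=60):
    """Build the two Collections API URLs, in the order they would be called.

    base_url and collection are placeholders; nothing is sent over the network.
    """
    endpoint = base_url.rstrip("/") + "/admin/collections"

    # Step 1: mark one replica per shard as preferredLeader, spread across nodes.
    balance = {
        "action": "BALANCESHARDUNIQUE",
        "collection": collection,
        "property": "preferredLeader",
    }

    # Step 2: ask Solr to move leadership to the replicas marked in step 1.
    rebalance = {
        "action": "REBALANCELEADERS",
        "collection": collection,
        "maxAtOnce": max_at_once,
        "maxWaitSeconds": max_wait_seconds,
    }

    return [
        endpoint + "?" + urlencode(balance),
        endpoint + "?" + urlencode(rebalance),
    ]


if __name__ == "__main__":
    # Hypothetical host and collection name.
    for url in leader_rebalance_urls("http://localhost:8983/solr", "mycollection"):
        print(url)
```

Keeping maxAtOnce low in the sketch is a deliberate choice: moving leaders one 
at a time seems safer on a cluster that is already under load.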


________________________________
From: Houston Putman <houstonput...@gmail.com>
Sent: 01 November 2021 18:13
To: users@solr.apache.org <users@solr.apache.org>
Subject: Re: Shard leadership best practice in Solr 8

If you are using NRT replicas, then I don't imagine there is going to be a
huge difference in resource usage between leader replicas and follower
replicas. They are all receiving all documents, and
indexing/committing locally.

If you are using TLOG/PULL replicas, then I would recommend splitting your
TLOG replicas to not live on the same nodes as your PULL replicas. That way
you are able to separate query/ingest traffic and scale up accordingly.
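
As a rough sketch of that separation, assuming the Collections API ADDREPLICA 
action with its type and node parameters: the helper names, node names and 
collection name below are hypothetical, and the code only builds the request 
URLs rather than calling Solr. It puts one replica of every shard on every 
listed node, which may be more than you want in practice.

```python
from urllib.parse import urlencode


def add_replica_url(base_url, collection, shard, node, replica_type):
    """Build an ADDREPLICA URL that pins one replica of a given type to a node."""
    if replica_type not in ("nrt", "tlog", "pull"):
        raise ValueError("unknown replica type: " + replica_type)
    params = {
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,
        "node": node,
        "type": replica_type,
    }
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)


def placement_plan(base_url, collection, shards, tlog_nodes, pull_nodes):
    """Return ADDREPLICA URLs that keep TLOG and PULL replicas on disjoint nodes."""
    overlap = set(tlog_nodes) & set(pull_nodes)
    if overlap:
        raise ValueError("nodes used for both TLOG and PULL: " + ", ".join(sorted(overlap)))
    urls = []
    for shard in shards:
        # Ingest side: TLOG replicas only on the indexing nodes.
        for node in tlog_nodes:
            urls.append(add_replica_url(base_url, collection, shard, node, "tlog"))
        # Query side: PULL replicas only on the query nodes.
        for node in pull_nodes:
            urls.append(add_replica_url(base_url, collection, shard, node, "pull"))
    return urls


if __name__ == "__main__":
    # Hypothetical node names in Solr's host:port_context form.
    plan = placement_plan(
        "http://localhost:8983/solr",
        "mycollection",
        ["shard1", "shard2"],
        tlog_nodes=["node1:8983_solr"],
        pull_nodes=["node2:8983_solr"],
    )
    for url in plan:
        print(url)
```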

- Houston

On Wed, Oct 13, 2021 at 9:43 AM Saur, Alexandre (ELS-AMS) <
a.s...@elsevier.com> wrote:

> Hello,
>
> We have a Solr 8 cluster with 5 nodes and one (big) collection that is
> split into 5 shards.
>
> Given this scenario, what's the best way to optimize heavy indexing:
> splitting shard leadership amongst the nodes, or having just one node be
> the leader of all shards?
>
> Thanks in advance!
>
>
> ________________________________
>
> Elsevier B.V. Registered Office: Radarweg 29, 1043 NX Amsterdam, The
> Netherlands, Registration No. 33158992, Registered in The Netherlands.
>
