Thanks Walter. We have a 1-to-1 mapping: each physical server hosts a single
Solr core, so that the CPU resources per node are sufficient. The index keeps
growing, each shard can only hold X million documents, and we don't want
shards to reach their maximum capacity, so the number of shards can grow
into several hundred. My question is, from a SolrCloud/ZooKeeper
perspective, is there a hard limit or a performance impact when the number
of shards (and correspondingly the number of nodes) reaches a certain
threshold? Another option I see is to split the data into multiple separate,
smaller Solr clouds, but then we would have to aggregate the results from
the different clouds outside of Solr, which brings challenges such as how to
compare the respective Solr scores and how to handle sorting.
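
To make the aggregation concern concrete, here is a minimal Python sketch of
client-side merging, assuming each cloud returns its hits already sorted by
descending score. The function name merge_top_k, the dict layout, and the
sample values are hypothetical and not part of any Solr API. The merge
itself is mechanical; the hard part is that each cloud would compute scores
from its own term statistics, so the raw scores being compared may not be on
the same scale.

```python
import heapq
from itertools import islice


def merge_top_k(per_cloud_results, k, key="score"):
    """Merge hit lists from several clouds and keep the k best.

    Assumption: every input list is already sorted by `key`, descending.
    Caveat: this compares raw values across clouds as-is; for relevance
    scores that is only meaningful if the clouds score on the same scale.
    """
    merged = heapq.merge(
        *per_cloud_results, key=lambda doc: doc[key], reverse=True
    )
    return list(islice(merged, k))


# Hypothetical hits from two separate clouds.
cloud_a = [{"id": "a1", "score": 9.1}, {"id": "a2", "score": 4.0}]
cloud_b = [{"id": "b1", "score": 7.5}, {"id": "b2", "score": 6.2}]

top = merge_top_k([cloud_a, cloud_b], k=3)
print([doc["id"] for doc in top])
```

Sorting on a field value (a timestamp, say) merges cleanly this way because
the sort key means the same thing in every cloud; relevance scores are the
case where this simple merge can rank documents incorrectly.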

Thanks,
Wei

On Mon, Jan 9, 2023 at 7:39 PM Walter Underwood <wun...@wunderwood.org>
wrote:

> > On Jan 9, 2023, at 1:29 PM, Wei <weiwan...@gmail.com> wrote:
> >
> > Is there a practical limit on the number of shards and nodes in a solr
> > cloud? We need to scale up the solr cloud and wonder if there is concern
> > when increasing to a couple of hundred shards and several thousand nodes
> > in a single cloud.  Any suggestions?
>
> It was challenging to manage with 8 shards and a replication factor of 8.
> At that point, we scaled vertically to bigger AWS instances. It scaled
> smoothly up to 72 CPU instances.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
>
