Well, to start, you should just have one shard. One million documents is 
nowhere near enough to justify sharding. So it's really quite easy to balance: 
one shard on one server.
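
As a rough illustration of the single-shard starting point, and of the manual
shard split discussed in the quoted thread below, here is a minimal sketch
that only builds the Collections API request URLs and sends nothing. The base
URL, the collection name "tenant_a", and the replicationFactor value are
assumptions made up for the example. CREATE and SPLITSHARD are Collections
API actions, and "shard1" is the name Solr gives the first shard by default.

```python
from urllib.parse import urlencode

# Assumed base URL of a Solr node; adjust for your deployment.
SOLR_BASE = "http://localhost:8983/solr"


def collections_api_url(action, **params):
    """Build a Collections API request URL for the given action."""
    query = urlencode({"action": action, **params})
    return f"{SOLR_BASE}/admin/collections?{query}"


# Create a per-tenant collection with a single shard.
# "tenant_a" and replicationFactor=2 are illustrative values only.
create_url = collections_api_url(
    "CREATE", name="tenant_a", numShards=1, replicationFactor=2
)

# Later, if that one shard really does outgrow its node, split it in two.
split_url = collections_api_url(
    "SPLITSHARD", collection="tenant_a", shard="shard1"
)

print(create_url)
print(split_url)
```

Issuing an HTTP GET against either URL (with curl, for instance) would run
the operation. Note that a split only divides the one shard you name, so it
will not even out sizes across the whole collection.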

> On Jul 17, 2022, at 1:26 PM, Kaminski, Adi <adi.kamin...@verint.com.invalid> 
> wrote:
> 
> So what would be the recommendation, then, for keeping shards balanced 
> automatically within a specific collection (if a collection is used as a 
> separate abstraction/storage for each customer/tenant to comply with 
> multi-tenancy/security isolation)?
> 
> For example, if we have 10 shards of 100k documents each (1M total) for 
> best ingestion/query performance, then as more documents are added it will 
> make sense to add an 11th shard, and on reaching 1.1M total it will make 
> sense to add a 12th one eventually.
> 
> Is it reasonable to automate the Collections API, splitting shards 
> according to some strategy (largest, oldest, etc.)?
> 
> Aren't there some out-of-the-box capabilities for this in SolrCloud? Or 
> maybe some libraries/operators on top to simplify k8s deployments, not only 
> for queries and automatic pod scaling but also for automating data storage 
> optimization (per volume, date, or any other custom logic)?
> 
> Thanks in advance,
> Adi
> 
> 
> ________________________________
> From: Shawn Heisey <apa...@elyograg.org>
> Sent: Sunday, July 17, 2022 5:44:24 PM
> To: users@solr.apache.org <users@solr.apache.org>
> Subject: Re: Autoscaling
> 
>> On 7/17/22 07:40, Ronen Nussbaum wrote:
>> We are planning to migrate our Solr Cloud clusters to the cloud.
>> Currently it is installed on-prem for each customer.
>> It is already deployed as Docker containers.
>> Instead of estimating in advance what is the number of shards needed, or
>> the number of pods, we'd like to rely on EKS cluster autoscaler, K8S HPA
>> and Solr autoscaling.
>> Our main concern is the deprecation of the autoscaling feature since
>> version 9.0.
> 
> One thing I am not sure you're aware of:  You can't add shards to a
> collection unless it is using the implicit router, which is poorly named
> because what it means is that sharding is 100% user-managed (manual).
> 
> There is the shard splitting capability in the Collections API, but that
> only works on a single shard, not the whole collection. If you wanted to
> adjust from say 6 to 8 shards and still have the shards be approximately
> equal in size, you would need to build a new collection and completely
> reindex.
> 
> There have been a number of issues filed for a rebalance feature, but it
> has not been implemented because implementing it would involve a very
> large amount of work, and making it stable would take even more work.
> And I am not even sure the Lucene API has the capability to do it currently.
> 
> https://issues.apache.org/jira/browse/SOLR-9241
> 
>> What is your recommendation? Should we start with 8.11? Will it be a
>> substitute soon?
> 
> I have no idea whether a substitute will be available.  You can manually
> do everything that the autoscaler would do with the Collections API.
> 
> Thanks,
> Shawn
