On 7/17/22 11:25, Kaminski, Adi wrote:
> For example, if we have 10 shards of 100k documents each (1M total) for best, optimized ingestion/query performance... adding more documents would make it sensible to add an 11th shard, and reaching 1.1M total would eventually justify a 12th.
One million total documents is actually a pretty small index, and as you were told in another reply, in most situations it is not big enough to require sharding unless your hardware has very little CPU, memory, or storage.
> Is it reasonable to use some automation of the Collections API, splitting shards according to some strategy (largest, oldest, etc.)?
In a typical scenario, every shard will be approximately equal in size and will contain documents of any age. If you have a 10-shard index and you split one of the shards, you will have 9 shards of relatively equal size and two shards that are each half the size of the other nine. To correctly redistribute the load, you would need to split ALL the shards, so you would end up with 20 shards, or some other multiple of 10, your starting point.
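As a rough illustration of what such automation would have to do, here is a minimal Python sketch that splits every shard of a collection through the Collections API SPLITSHARD action. The host, the collection name "mycollection", and the use of the requests library are assumptions for the example, not anything from this thread:

    # Sketch only: split every shard of a SolrCloud collection in two.
    # The host and collection name below are hypothetical.
    import requests

    SOLR = "http://localhost:8983/solr"
    COLLECTION = "mycollection"

    # Ask the cluster for the current shard names.
    status = requests.get(f"{SOLR}/admin/collections",
                          params={"action": "CLUSTERSTATUS",
                                  "collection": COLLECTION}).json()
    shards = status["cluster"]["collections"][COLLECTION]["shards"]

    # Split each shard; a 10-shard collection becomes a 20-shard one.
    for shard in shards:
        resp = requests.get(f"{SOLR}/admin/collections",
                            params={"action": "SPLITSHARD",
                                    "collection": COLLECTION,
                                    "shard": shard}).json()
        print(shard, "->", resp.get("success", resp))

SPLITSHARD can take a long time on a big shard (the async parameter exists for that), and the inactive parent shards still have to be deleted afterward, so this is not something to run casually.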
In my last reply, I mentioned the implicit router. This is the router you would need to use if you want to organize your shards by something like date. But then every single document you index must indicate what shard it will end up on -- there is no automatic routing.
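To make that concrete, here is a similar sketch of a date-organized collection using the implicit router. The collection name "logs" and the monthly shard names are made up for illustration:

    # Sketch only: with the implicit router, YOU name the target shard.
    import requests

    SOLR = "http://localhost:8983/solr"

    # Create the collection with explicitly named shards; with
    # router.name=implicit, Solr does no hash-based routing at all.
    requests.get(f"{SOLR}/admin/collections",
                 params={"action": "CREATE", "name": "logs",
                         "router.name": "implicit",
                         "shards": "shard-2022-06,shard-2022-07"})

    # Every update must say where the document goes, here via the
    # _route_ parameter (a router.field on the collection also works).
    doc = {"id": "1", "timestamp": "2022-07-17T11:25:00Z"}
    requests.post(f"{SOLR}/logs/update",
                  params={"_route_": "shard-2022-07", "commit": "true"},
                  json=[doc])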
> Aren't there some out-of-the-box capabilities in the SolrCloud search engine? Or maybe some libraries/operators on top to simplify k8s deployments, not only for queries and automatic pod scaling, but also for automating data storage optimization (per volume, date, or any other custom logic)?
I have no idea what you are asking here.

Thanks,
Shawn