Hi Eric,

Thank you for picking up my question.
Please correct me if I'm wrong about sharding and indexes.
As I understand the flow, when a user puts an object into the cluster, it 
creates one entry in the index pool that holds, let's say, the location of the 
file in the data pool.
There is one index object per bucket, so as the number of objects in the bucket 
grows, the index object grows too.
This is where sharding comes into the picture: with sharding we can split this 
one big index object into smaller chunks. The documentation says we can 
calculate the number of shards based on 100,000 objects per shard, which means 
that a bucket with 100 shards can hold, let's say, 10 million objects.
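
The arithmetic above can be sketched as follows. This is only a rough 
illustration: the function name is hypothetical, and the 100,000 figure is the 
per-shard guideline quoted from the documentation, not a value read from any 
cluster.

```python
import math

# Guideline quoted in this thread: roughly 100,000 objects per index shard.
OBJECTS_PER_SHARD = 100_000


def shards_needed(expected_objects: int,
                  objects_per_shard: int = OBJECTS_PER_SHARD) -> int:
    """Return the smallest shard count that keeps the average number of
    objects per shard at or below the guideline (hypothetical helper)."""
    if expected_objects < 0 or objects_per_shard <= 0:
        raise ValueError("expected_objects must be >= 0 and objects_per_shard > 0")
    # Round up, and always use at least one shard.
    return max(1, math.ceil(expected_objects / objects_per_shard))


if __name__ == "__main__":
    print(shards_needed(10_000_000))   # 10 million objects
    print(shards_needed(11_500_000))   # 11.5 million objects
```

With these assumptions, 10 million objects map to 100 shards and 11.5 million 
objects map to 115 shards.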

Now here is my situation: the bucket has 100 shards, with 100,000 objects per 
shard configured. The bucket has crossed 10 million objects and, to be honest, 
I don't know what is happening at the moment. It is at 11.5 million objects 
with no issue; I just don't understand what is happening.
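
To put numbers on the situation described above, here is a minimal sketch 
that computes the average index entries per shard. It assumes, for 
illustration only, that objects are spread evenly across shards; the function 
names are hypothetical.

```python
def average_objects_per_shard(total_objects: int, num_shards: int) -> float:
    """Average number of index entries per shard, assuming an even spread
    (an assumption made for this sketch)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    return total_objects / num_shards


def fill_ratio(total_objects: int, num_shards: int,
               objects_per_shard: int = 100_000) -> float:
    """Average shard load relative to the per-shard guideline quoted in
    this thread (1.0 means exactly at the guideline)."""
    return average_objects_per_shard(total_objects, num_shards) / objects_per_shard


if __name__ == "__main__":
    # The bucket from this email: 11.5 million objects, 100 shards.
    print(average_objects_per_shard(11_500_000, 100))
    print(fill_ratio(11_500_000, 100))
```

For the bucket in question this gives an average of 115,000 entries per shard, 
about 15% above the 100,000 figure.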

So if we don't know at bucket creation time how many objects are planned for 
the future, it seems better to set the shard count to a high number. And since 
the documentation says 64K is the maximum number of shards per bucket, why not 
set it to that number to avoid any limitation?

And now we have a new cluster with multisite enabled, where dynamic bucket 
resharding is not even possible, so at the moment I don't know what I should 
set as a baseline before putting it into production.

Thank you in advance for your clarification.


-----Original Message-----
From: Eric Ivancich <ivanc...@redhat.com>
Sent: Wednesday, November 25, 2020 5:37 AM
To: Szabo, Istvan (Agoda) <istvan.sz...@agoda.com>
Cc: ceph-users <ceph-users@ceph.io>
Subject: Re: [Suspicious newsletter] [ceph-users] Re: Unable to reshard bucket


Can you clarify, Istvan, what you plan on setting to 64K? If it’s the number of 
shards for a bucket, that would be a mistake.

> On Nov 21, 2020, at 2:09 AM, Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> 
> wrote:
>
> It seems we need to plan this sharding carefully from the beginning. 
> I'm thinking of setting the shard number by default to the maximum, which is 
> 64K, and leaving it as is, so that we never reach the limit unless we reach 
> the maximum number of objects.
>
> It would be interesting to know what the side effects are if I set the 
> shards to 64K by default.
>
> Istvan Szabo
> Senior Infrastructure Engineer

--
J. Eric Ivancich
he / him / his
Red Hat Storage
Ann Arbor, Michigan, USA

