Hello Loïc,

On Thu, May 4, 2017 at 8:30 AM Loic Dachary <l...@dachary.org> wrote:

> Is there a way to calculate the optimum nearfull ratio for a given
> crushmap ?
>

This is a question that I was planning to cover in those calculations I was
working on for python-crush. I've currently shelved the work for a few
weeks but intend to look at it again as time frees up.

Basically, I see this as a five-fold uncertainty problem:
1. CRUSH mappings are pseudo-random and therefore (usually) uneven
2. Object distribution between placement groups has the exact same issue
3. Object size within a given pool can also vary greatly (from bytes to
megabytes)
4. Failures and the subsequent rebalancing are also random
5. Finally, pools can occupy different and overlapping sets of OSDs, and
hold independent sets of objects.

Thanks to your new CRUSH tools, I think #1 and #4 are solved respectively
by the ability to:
- generate a CRUSH map for a precise (and even) distribution of PGs;
- test mappings for every scenario of N failures and find the worst-case
scenario (very expensive calculation, but possible).
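To make the worst-case search concrete, here is a rough sketch of the brute
force loop. It does not use python-crush or real CRUSH at all: toy_place() is
a hypothetical stand-in (plain highest-random-weight hashing), and the
function names and parameters are mine, only meant to show the shape and the
cost of the calculation.

```python
import hashlib
from collections import Counter
from itertools import combinations


def toy_place(pg, osds, replicas):
    """Hypothetical stand-in for CRUSH (NOT the real algorithm):
    rank OSDs by a hash of (pg, osd) and keep the first `replicas`."""
    ranked = sorted(
        osds,
        key=lambda osd: hashlib.sha256(f"{pg}-{osd}".encode()).hexdigest(),
    )
    return ranked[:replicas]


def max_pgs_per_osd(osds, num_pgs, replicas):
    """Number of PG replicas held by the most loaded OSD."""
    counts = Counter()
    for pg in range(num_pgs):
        counts.update(toy_place(pg, osds, replicas))
    return max(counts.values())


def worst_case(osds, num_pgs, replicas, n_failures):
    """Try every set of `n_failures` failed OSDs and return
    (highest load seen on a surviving OSD, failed set causing it)."""
    worst = (0, ())
    for failed in combinations(osds, n_failures):
        surviving = [osd for osd in osds if osd not in failed]
        load = max_pgs_per_osd(surviving, num_pgs, replicas)
        if load > worst[0]:
            worst = (load, failed)
    return worst


osds = list(range(8))
print("baseline max PGs per OSD:", max_pgs_per_osd(osds, 128, 3))
print("worst case after 2 failures:", worst_case(osds, 128, 3, 2))
```

The number of scenarios grows as "number of OSDs choose N", which is why this
gets very expensive on a real cluster.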

Issues #2 and #3 are more tricky. The big picture is that a given amount of
data is placed more evenly the more objects there are, and there should be
a way to use statistics to quantify that. Variance in object size then
brings in more uncertainty, but I think that metric is difficult to
quantify outside of very specific use cases where object sizes are known.
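As a first approximation of the "more objects means more even" effect, assume
every object has the same size and lands in one of the PGs uniformly at
random (both are simplifying assumptions, and the function names below are
only illustrative). The object count per PG is then binomial, and its standard
deviation relative to the mean is sqrt((num_pgs - 1) / num_objects). A small
simulation shows the same trend:

```python
import math
import random


def relative_stddev(num_objects, num_pgs):
    """Standard deviation / mean of the per-PG object count, assuming
    each object picks a PG uniformly at random (binomial model)."""
    return math.sqrt((num_pgs - 1) / num_objects)


def simulated_overfill(num_objects, num_pgs, seed=0):
    """Throw equal-sized objects into PGs at random and return the
    fullest PG's object count divided by the average count."""
    rng = random.Random(seed)
    counts = [0] * num_pgs
    for _ in range(num_objects):
        counts[rng.randrange(num_pgs)] += 1
    return max(counts) / (num_objects / num_pgs)


for n in (100, 10_000, 100_000):
    print(n, round(relative_stddev(n, 64), 3),
          round(simulated_overfill(n, 64), 3))
```

The overfill of the fullest PG shrinks roughly with the square root of the
object count. Variable object sizes (#3) are not modelled here.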

Finally, this might all be made redundant by the new auto-rebalancing
feature that Sage is planning for Luminous. If we can assume even data
placement at all times, then #4 is the only thing we need to worry about.
Performance-based placement would be a very different matter, however. And if
pools have overlapping OSD sets, that could be fairly tricky too.

Maybe some other users here already have some rule of thumb or actual
calculations for that. I was planning to get into the statistical
calculations of data placement, assuming a uniform object size, as the next
step for the paper I am working on. Would there be a need for such tools?

Regards,
-- 
Xavier Villaneau
Storage Software Eng. at Concurrent Computer Corp.
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
