> Yes, I think the Ceph docs should get some minor updates to make the
> difference between PGs and PG replicas (PG * replicationFactor) even
> more explicit.

Please open a tracker and list places you find where this isn’t already
made clear.

> Can we please have 1 command that can dump all config (including from
> config files, the monitor central configuration database, all currently
> running daemons), and nicely point out what's set and overridden where
> and which value is in effect?

Sounds like an opportunity to enter a tracker issue or a PR.
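Much of that exists today, piecemeal. From memory -- verify the exact
forms against your release, and osd.0 below is just an example daemon:

    # everything set in the mons' central config database
    ceph config dump

    # what one running daemon is actually using, defaults included
    ceph config show-with-defaults osd.0

    # via the admin socket on the daemon's host: just the values
    # that differ from the compiled-in defaults
    ceph daemon osd.0 config diff

What's missing is a single view that also folds in every host's
ceph.conf, hence the tracker / PR suggestion.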
> 0 ssd 0.16370 1.00000 168 GiB 3.4 GiB 59 MiB 1.6 GiB 1.7 GiB 164 GiB 2.01 0.03 2 up
> 1 ssd 0.16370 1.00000 168 GiB 3.9 GiB 72 MiB 1.6 GiB 2.3 GiB 164 GiB 2.34 0.03 2 up

>> STDDEV [..] if your SSD OSDs are significantly smaller than the HDDs
>> that can confound the reporting
>
> Yes, indeed the SSD OSDs are 100x smaller than the HDD OSDs.

What model are they, that they’re that small? Are they enterprise-quality?
OSDs that small can present difficulties.

> Potentially useful to know:
>
> * The rep-cluster has been in HEALTH_OK for a long time.
> * The ec-cluster has suffered from `37 OSD(s) experiencing BlueFS
>   spillover` for a long time.

How large are those DB+WAL slices? Please share BlueFS stats:
https://www.ibm.com/docs/en/storage-ceph/7.1.0?topic=bluefs-viewing-ceph-statistics-ceph-osds

> (I have not solved that yet; I suspect that Ceph would simply like
> larger DB/WAL devices on my SSDs for the size / object count I have on
> the HDDs, but if so that is unfixable for me because I use Hetzner
> SX134 servers.) I do not know if the HEALTH_WARN caused by that
> spillover will permanently inhibit the balancer.

I wouldn’t think so, but it may be possible to address the spillovers.

> That said, I occasionally use https://github.com/TheJJ/ceph-balancer,
> which takes into account the actual sizes of objects when balancing.
>
> Another question:
>
> Why do you inquire about the balancer? Does it affect the autoscaler?

It can contribute to suboptimal PG ratios on OSDs.

> Oddly, not listed in
> https://docs.ceph.com/en/squid/rados/configuration/ceph-conf/#commands
> But I think
> https://docs.ceph.com/en/squid/rados/configuration/ceph-conf/#commands
> should list it so that from there one can easily see it's legacy.

I look forward to your PR.

>> My understanding is that the autoscaler won’t jump a pg_num value
>> until the new value is (by default) a factor of 3 high or low
>
> Indeed, but isn't it factor 4x too low already?

One would think.

> rep-cluster: 35 PGs/OSD (= 1024*3/86)

35 > 100/3, so that’s already within the factor-of-3 band around the
default target of 100.

> ec-cluster: 26 PGs/OSD (= 256*6/58)
> [ another such machine, new (added to the cluster 2 days ago), that is
>   currently being rebalanced to ]

I suspect that once backfill completes you’ll see a ratio > 33.

>> ceph config set global target_size_ratio 250
>
> I don't fully understand this suggestion.

Apologies, I meant mon_target_pg_per_osd = 250.
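To be explicit, that would be something like the below -- 250 is
deliberately aggressive versus the default of 100 PG replicas per OSD,
so pick a value you're comfortable with:

    ceph config set global mon_target_pg_per_osd 250

    # then watch what the autoscaler wants to do
    ceph osd pool autoscale-status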
> Also, should I be setting `pg_autoscale_bias` to increase the number
> of PGs that the autoscaler comes up with, by a fixed factor, to adjust
> for my small objects?

In most cases that should be set only for metadata / index pools. Mostly.

> This is suggested by
> https://docs.redhat.com/en/documentation/red_hat_ceph_storage/4/html/storage_strategies_guide/placement_groups_pgs
>
>> This property is particularly used for metadata pools which might be
>> small in size but have large number of objects, so scaling them
>> faster is important for better performance.

That’s a Nautilus-era page, so be careful using docs that old. But yes,
see above.

> Separate:
>
> I read https://docs.ceph.com/en/squid/rados/operations/balancer/#throttling
> I think these docs need improvement:
>
>> There is a separate setting for how uniform the distribution of PGs
>> must be for the module to consider the cluster adequately balanced.
>> At the time of writing (June 2025), this value defaults to `5`
>
> So "there is a setting, and its default value is 5" ... but what's the
> name of the setting?
> Is it `upmap_max_deviation` from 4 paragraphs further down?

Yes.

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io