> On Jan 2, 2025, at 11:18 AM, Nicola Mori <m...@fi.infn.it> wrote:
> 
> Hi Anthony, thanks for your insights. I actually used df -h from the bash 
> shell of a machine mounting the CephFS with the kernel module, and here's the 
> current result:
> 
> wizardfs_rootsquash@b1029256-7bb3-11ec-a8ce-ac1f6b627b45.wizardfs=/  217T   78T  139T  36% /wizard/ceph
> 
> So it seems the fs size is 217 TiB, which is about 66% of the total amount of 
> raw disk space (320 TiB) as I wrote before.
> 
> Then I tried the command you suggested:
> 
> # ceph df
> --- RAW STORAGE ---
> CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
> hdd    320 TiB  216 TiB  104 TiB   104 TiB      32.56
> TOTAL  320 TiB  216 TiB  104 TiB   104 TiB      32.56
> 
> --- POOLS ---
> POOL             ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
> .mgr              1    1  242 MiB       62  726 MiB      0     62 TiB
> wizard_metadata   2   16  1.2 GiB   85.75k  3.5 GiB      0     62 TiB
> wizard_data       3  512   78 TiB   27.03M  104 TiB  36.06    138 TiB
> 
> To find the total size of the data pool I don't understand how to interpret 
> the "MAX AVAIL" column: should it be added to "STORED" or to "USED"?

Do you have a lot of small files?

> In the first case I'd get 216 TiB, which corresponds to what df -h says and 
> thus to 66%; in the second case I'd get 242 TiB, which is very close to 
> 75%... But I guess the first option is the right one.
> 
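Given the numbers above, df is reporting the data pool's STORED (78 TiB) as 
used and MAX AVAIL (138 TiB) as available, so the 217 TiB size is essentially 
STORED + MAX AVAIL (rounding aside) and your first reading is the right one.  
MAX AVAIL already has the replication/EC overhead and the full ratio factored 
in, while USED is the raw space the pool consumes.  If you want to check the 
arithmetic from the JSON output (field names can differ slightly between 
releases), something like this should do:

# ceph df -f json | jq '.pools[] | select(.name == "wizard_data") | .stats.stored + .stats.max_avail'
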
> Then I looked at the weights of my failure domain (host):
> 
> #    ceph osd tree | grep host
> 
> -7          25.51636      host aka
> -3          25.51636      host balin
> -13          29.10950      host bifur
> -17          29.10950      host bofur
> -21          29.10371      host dwalin
> -23          21.83276      host fili
> -25          29.10950      host kili
> -9          25.51636      host ogion
> -19          25.51636      host prestno
> -15          29.10522      host remolo
> -5          25.51636      host rokanan
> -11          27.29063      host romolo
> 
> They seem quite even and roughly in line with the actual total size of each host:
> 
> # ceph orch host ls --detail
> HOST     . . .  HDD
> aka              9/28.3TB
> balin            9/28.3TB
> bifur            9/32.5TB
> bofur            8/32.0TB
> dwalin          16/32.0TB
> fili            12/24.0TB
> kili             8/32.0TB
> ogion            8/28.0TB
> prestno          9/28.3TB
> remolo          16/32.0TB
> rokanan          9/28.5TB
> romolo          16/30.0TB
> 
> so I see no problem here (in fact, making these even is the idea behind the 
> disk upgrade strategy I am pursuing).
> 
> About the OSD outlier: there seems to be no such OSD; the maximum OSD 
> occupancy is 38% and it smoothly decreases down to a minimum of 27% with no 
> jumps.

That’s a very high variance.  If the balancer is working it should be within 
about +/- 1-2%.  Available space in the cluster will be reported as though all 
OSDs are at 38%.
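
If you want to quantify that, ceph balancer status shows whether the balancer 
is enabled and in which mode, and the last line of ceph osd df prints MIN/MAX 
VAR and STDDEV for the whole cluster, e.g.:

# ceph balancer status
# ceph osd df | tail -3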

> 
> About PGs: I have 512 PGs in the data pool and 124 OSDs in total. Maybe the 
> count is too low, but I'm hesitant to increase it since my cluster is very 
> low-spec and I fear running out of memory on the oldest machines.
> 
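Before touching pg_num it may be worth asking the autoscaler what it would 
recommend for that pool, and the per-OSD memory budget can be capped 
explicitly if the old machines are the concern (the value below is just an 
illustration, not a recommendation):

# ceph osd pool autoscale-status
# ceph config set osd osd_memory_target 3221225472
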
> About CRUSH rules: I don't know exactly what to search for, so if you believe 
> it's important then I'd need some advice.
> 
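On the CRUSH side, the two quickest things to look at are which rule each pool 
uses and what that rule actually does:

# ceph osd pool ls detail
# ceph osd crush rule dump

That will also show whether wizard_data is replicated or erasure-coded, which 
is what ultimately determines the usable/raw ratio you computed above.
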
> Thank you again for your precious help,
> 
> Nicola

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
