Hi, I have a Rook-provisioned cluster used for RBDs only, with two pools named replicated-metadata-pool and ec-data-pool. The EC parameters are 6+3. I've been writing data to this cluster for some time and noticed that the reported usage is not what I was expecting.
# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       5.4 PiB     4.3 PiB     1.2 PiB      1.2 PiB         21.77
    TOTAL     5.4 PiB     4.3 PiB     1.2 PiB      1.2 PiB         21.77

POOLS:
    POOL                         ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    replicated-metadata-pool      1     90 KiB          408     38 MiB          0       1.2 PiB
    ec-data-pool                  2     722 TiB     191.64M     1.2 PiB     25.04       2.4 PiB

Since these numbers are rounded a bit too much, I generally use the Prometheus metrics from the mgr, which are as follows:

ceph_pool_stored:                  793,746 G for ec-data-pool and 92,323 for replicated-metadata-pool
ceph_pool_stored_raw:              1,190,865 G for ec-data-pool and 99,213 for replicated-metadata-pool
ceph_cluster_total_used_bytes:     1,329,374 G
ceph_cluster_total_used_raw_bytes: 1,333,013 G
sum(ceph_bluefs_db_used_bytes):    3,638 G

So ceph_pool_stored for the EC pool is a bit higher than the total used space of the formatted RBDs. I think that's because of their sparse nature and deleted blocks that haven't been fstrimmed yet, which is OK. ceph_pool_stored_raw is almost exactly 1.5x ceph_pool_stored, which is what I'd expect given the 6+3 EC parameters.

What I can't account for is the 138,509 G difference between ceph_cluster_total_used_bytes and ceph_pool_stored_raw. This is not static, BTW: checking the same data historically, we consistently see about 1.12x of what we expect, which turns our nominal 1.5x EC overhead into roughly a 1.68x overhead in practice. Does anyone have any ideas why this is the case?

We also have the ceph_cluster_total_used_raw_bytes metric, which I believe should be close to data + metadata; that is why I included sum(ceph_bluefs_db_used_bytes) above. Is that correct?

Best,
--
erdem agaoglu
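P.S. To spell out the arithmetic I'm basing this on, here is a quick Python sketch using the metric values quoted above. The variable names are mine, and I'm treating everything as the same G units the mgr exporter reports:

# Sanity check of the numbers above (all values in G as scraped from the
# mgr Prometheus module; the replicated-metadata-pool values are negligible).
ec_stored        = 793_746      # ceph_pool_stored        (ec-data-pool)
ec_stored_raw    = 1_190_865    # ceph_pool_stored_raw    (ec-data-pool)
cluster_used     = 1_329_374    # ceph_cluster_total_used_bytes
cluster_used_raw = 1_333_013    # ceph_cluster_total_used_raw_bytes
bluefs_db_used   = 3_638        # sum(ceph_bluefs_db_used_bytes)

print(ec_stored_raw / ec_stored)        # ~1.50 -> matches 6+3 EC overhead (9/6)
print(cluster_used - ec_stored_raw)     # 138,509 G -> the unexplained gap
print(cluster_used / ec_stored_raw)     # ~1.12 -> 1.5 * 1.12 = ~1.68x effective
print(cluster_used_raw - cluster_used)  # 3,639 G -> close to bluefs db usage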