Hi,
I've configured an erasure coded pool (3+2) in our Ceph lab environment (ceph 
version 14.2.4), and I'm trying to verify the behaviour of 
bluestore_min_alloc_size.
Our OSDs are HDDs, so by default the min_alloc_size is set to 64KB. 
ceph daemon osd.X config show | grep bluestore_min_alloc_size_hdd
    "bluestore_min_alloc_size_hdd": "65536",
According to the documentation, the unwritten area in each chunk is filled with 
zeroes when it is written to the raw partition, which can lead to space 
amplification when writing small objects.
In other words, a 4KB object stored in my cluster should theoretically use 64KB 
* (k+m) = 64KB * 5 = 320KB of raw space, or, quite simply, 64KB per chunk.
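As a rough sanity check on that arithmetic (assuming k=3, m=2 and the 64KB HDD 
default, so purely illustrative numbers):

# a 4KB object becomes k+m = 5 chunks, and each chunk is padded up to
# one 64KB (65536-byte) allocation unit on its OSD
echo $(( 5 * 65536 ))         # 327680 bytes = 320KB of raw space
echo $(( 5 * 65536 / 4096 ))  # = 80, i.e. 80x the 4KB of logical data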
To test this, I uploaded a 4KB object, and used the ceph-objectstore-tool to 
output the size of the object on one of the OSDs:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-20 --pgid 15.93s2 
b458a7bf-0643-4c04-bccc-f7f8feb0bd20.4889853.3_ceph.txt dump | jq '.stat'
{
  "size": 4096,
  "blksize": 4096,
  "blocks": 1,
  "nlink": 1
}
I was expecting size to be 64KB, but perhaps it doesn't take the zero-filled 
area into account? Note that in this case size = 4K because that is the stripe 
unit size specified in my erasure coding profile.
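For completeness, the k/m values and stripe unit come from the erasure code 
profile, which can be checked with something like the following ("ecprofile32" 
is just a placeholder for whatever the profile is actually called):

ceph osd erasure-code-profile get ecprofile32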
Is there any other way of querying the object to verify that each chunk is 
using 64K, or that the object is using 320KB in total?
Obviously, if I only have one object in the pool, then I can use "rados df", 
but as soon as I add more objects of different sizes, I lose this ability.
rados df

POOL_NAME     USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED
ec32       320 KiB        1       0       5                   0        0         1

RD_OPS   RD  WR_OPS     WR  USED COMPR  UNDER COMPR
     0  0 B       1  4 KiB         0 B          0 B
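I also wondered whether the BlueStore perf counters could help here, since as 
far as I understand they report allocated vs stored bytes, although only 
OSD-wide rather than per object (counter names may vary by release, so treat 
this as a sketch):

ceph daemon osd.20 perf dump | jq '.bluestore | {bluestore_allocated, bluestore_stored}'

Comparing the two values before and after writing a small object should show 
bluestore_allocated growing by a full 64KB per chunk stored on that OSD, 
although that only works cleanly while nothing else is writing.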
Thanks and regards,
James.