Hi all,

I am still fighting with this issue. It may be that something is not properly implemented, and if that is the case, that is fine.

I am still trying to understand the real space occupied by files in a /cephfs filesystem, as reported, for example, by df.

Maybe I did not explain myself clearly. I am not saying that block size has
anything to do with rbytes; I was just making a comparison with what I
would expect from a regular POSIX filesystem. Let me put the question a
different way:

1) I know that, if I have only a single one-character file in an ext4
filesystem created with a 4 KB block size, df will show 4 KB of used space.

2) Now imagine that I have only a single one-character file in my CephFS
filesystem, the layout of the file is object_size=512K, stripe_count=2, and
stripe_unit=256K, and the cluster is configured with 3 replicas. What would
be the used space reported by a df command in this case?

My naive assumption would be that df should show 512 KB x 3 as used space.
Is this correct?
No. The used space reported by df is the sum of the used space of the OSDs'
local stores. A 512K file requires 3x512K of space for its data, and the
OSDs and their local filesystems also need extra space for tracking that data.
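To make that arithmetic concrete, here is a minimal sketch, assuming pure 3x replication and ignoring any per-object OSD or local-filesystem overhead (the numbers below are illustrative, not taken from a real cluster):

```shell
# Hypothetical sketch: raw *data* bytes behind a 512K file with 3 replicas.
# Striping (object_size / stripe_unit / stripe_count) only changes how the
# bytes are spread across RADOS objects, not how many data bytes are stored.
file_bytes=$((512 * 1024))        # 524288
replicas=3
echo $((file_bytes * replicas))   # 1572864 data bytes, before OSD overhead
```

The point being that df can legitimately report more than this, because the OSDs' local stores add their own bookkeeping on top of the raw data bytes.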

Please bear with me through my simple-minded tests:

   0) # umount /cephfs;mount -t ceph X.X.X.X:6789:/ /cephfs -o
   name=admin,secretfile=/etc/ceph/admin.secret

   1) # getfattr -d -m ceph.*
   /cephfs/objectsize4M_stripeunit512K_stripecount8/
   (...)
   ceph.dir.layout="stripe_unit=524288 stripe_count=8
   object_size=4194304 pool=cephfs_dt"
   ceph.dir.rbytes="549755813888"
   (...)


   2) # df -B 1 /cephfs/
   Filesystem                1B-blocks           Used      Available
   Use% Mounted on
   X.X.X.X:6789:/ 95618814967808 11738728628224 83880086339584 13%
   /cephfs


   3) # dd if=/dev/zero
   of=/cephfs/objectsize4M_stripeunit512K_stripecount8/4096bytes.txt
   bs=1 count=4096
   4096+0 records in
   4096+0 records out
   4096 bytes (4.1 kB) copied, 0.0139456 s, 294 kB/s


   4) # ls -lb
   /cephfs/objectsize4M_stripeunit512K_stripecount8/4096bytes.txt
   -rw-r--r-- 1 root root 4096 Aug  7 07:16
   /cephfs/objectsize4M_stripeunit512K_stripecount8/4096bytes.txt
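As a side note, if I read the striping rules correctly, the whole 4096-byte file fits inside the first stripe unit, so it should land in a single RADOS object (a sketch of the offset arithmetic, assuming my understanding of the layout is right):

```shell
# Hypothetical sketch: a 4096-byte file under stripe_unit=524288.
stripe_unit=524288
file_bytes=4096
# Every byte offset 0..4095 maps into stripe unit 0, i.e. one object.
echo $(( (file_bytes - 1) / stripe_unit ))   # 0: the last byte is still in unit 0
```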


   5) # umount /cephfs;mount -t ceph X.X.X.X:6789:/  /cephfs -o
   name=admin,secretfile=/etc/ceph/admin.secret


   6) # getfattr -d -m ceph.*
   /cephfs/objectsize4M_stripeunit512K_stripecount8/
   (...)
   ceph.dir.layout="stripe_unit=524288 stripe_count=8
   object_size=4194304 pool=cephfs_dt"
   ceph.dir.rbytes="549755817984"


   7) # df -B 1 /cephfs/
   Filesystem                1B-blocks           Used      Available
   Use% Mounted on
   192.231.127.8:6789:/ 95618814967808 11738728628224 83880086339584 13%
   /cephfs


Please note that in these simple-minded tests:

a./ rbytes properly reports the change in size (after an unmount/mount):
549755817984 - 549755813888 = 4096

b./ df does not show any change.
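For the record, the rbytes delta matches the dd write exactly:

```shell
rbytes_before=549755813888
rbytes_after=549755817984
echo $((rbytes_after - rbytes_before))   # 4096, the size of the dd write
```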

I could use 'ceph df detail', but it does not give me the granularity I want. Moreover, I also do not fully understand its output:

   # ceph df
   GLOBAL:
        SIZE       AVAIL      RAW USED     %RAW USED
        89051G     78119G       10932G         12.28
   POOLS:
        NAME          ID     USED      %USED     MAX AVAIL     OBJECTS
        cephfs_dt      5     3633G      4.08        25128G     1554050
        cephfs_mt      6     3455k         0        25128G          39
- What determines MAX AVAIL? I am assuming it is roughly GLOBAL AVAIL / number of replicas.
- %USED is computed in reference to what? I ask because it seems to be computed against GLOBAL SIZE, but that is misleading since the pool's MAX AVAIL is much less.
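Checking my two assumptions against the numbers above (a rough sketch; I may well be misreading how these columns are derived):

```shell
# Assumption 1: MAX AVAIL ~ GLOBAL AVAIL / replicas (3x here).
awk 'BEGIN { printf "%.0fG\n", 78119 / 3 }'          # ~26040G vs reported 25128G
# Assumption 2: %USED looks like pool USED / GLOBAL SIZE.
awk 'BEGIN { printf "%.2f\n", 3633 / 89051 * 100 }'  # 4.08, matching the table
```

The first estimate is close but not exact, which is part of what I would like clarified; the second matches the table, which is exactly why the %USED column strikes me as misleading.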

Thanks for the clarifications
Goncalo


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com