We're glad to have been of help.
There is no One Size Fits All solution. For you, it seems that speed is
more important than high availability. For me, it's HA+redundancy.
Ceph has 3 ways to deliver data to remote clients:
1. As a direct ceph mount on the client. From experience, this is a pain
when the clients hibernate.
2. As an internal ganesha NFS server running under ceph
3. As an independent ganesha NFS server using ceph as a backend.
Option 1 would in theory provide the fastest access, since the client
talks directly to the OSDs, but I've heard reports that NFS handles
buffering better. It definitely handles client disconnects better. Also,
with option 1 you don't have to supply a fixed server IP, whereas NFS
can employ keepalive to fail over.
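For illustration only, roughly what options 1 and 2 look like (the mon
address is the one from your df output; the client name, NFS cluster
name and export path are made up, and exact flags vary by Ceph release):

# option 1: mount CephFS directly with the kernel client
mount -t ceph 10.18.31.1:6789:/ /data -o name=myclient,secretfile=/etc/ceph/myclient.secret

# option 2: have cephadm run NFS-Ganesha for you and export the same filesystem
ceph nfs cluster create mynfs "sto-core-hpc01"
ceph nfs export create cephfs --cluster-id mynfs --pseudo-path /data --fsname cephfs
# then on the client:
mount -t nfs <ganesha-host>:/data /mnt/data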
Those are just the options I know of within ceph, and I don't have firm
stats even on those, much less non-ceph solutions.
At least you've learned a few things about ceph, even if it doesn't
presently align with your needs, so there's that.
Best Regards,
Tim
On 3/28/25 14:20, Mihai Ciubancan wrote:
Hello,
Thank you all for your useful advice!
I chose Ceph because I understood that it is faster than NFS. But on the
other hand, since I don't think I will be extending the storage with new
nodes soon, I will probably go back to a software RAID solution with NFS.
So I will temporarily copy the data to another storage system and
reconfigure the machine.
I wish you all a nice weekend,
Mihai
On 2025-03-28 17:14, Peter Linder wrote:
To get everything up to a working state, you will need to set your
failure domain to "osd" instead of "host" in the default rule, and as
has been said before, pool size should be 3 and min_size 2.
With that said, you will eventually need more hosts to get the most
out of ceph.
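A minimal sketch of that change, assuming the pool names seen elsewhere
in this thread (.mgr, cephfs.cephfs.meta, cephfs.cephfs.data):

# create a replicated rule whose failure domain is "osd" rather than "host"
ceph osd crush rule create-replicated replicated_osd default osd

# point each pool at it and set size/min_size
for pool in .mgr cephfs.cephfs.meta cephfs.cephfs.data; do
    ceph osd pool set $pool crush_rule replicated_osd
    ceph osd pool set $pool size 3
    ceph osd pool set $pool min_size 2
done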
On 2025-03-28 at 16:05, mihai.ciubancan wrote:
The problem is that I have only one host, and the host doesn't have
a RAID controller... I have set it as Anthony suggested, but nothing
changed.
Mihai
-------- Original message --------
From: Eugen Block <ebl...@nde.ag>
Date: 3/28/25 3:47 PM (GMT+02:00)
To: Mihai Ciubancan <mihai.ciuban...@eli-np.ro>
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: space size issue

You have the autoscaler enabled, but it is stuck changing the pg_num.
The default replicated_rule (which you are using) requires as many
hosts as your pool size, so in your case 2. (If you value your data,
don't use replicated pools with size 2.) You could make it work with
only one host (as Anthony suggested, with osd_crush_chooseleaf_type 0),
but you don't get any real resiliency. I recommend reconsidering your
setup.

Quoting Mihai Ciubancan <mihai.ciuban...@eli-np.ro>:

> Hi Eugen,
>
> Thanks for your answer. Please find below the output of the command:
>
> ceph osd pool ls detail
> pool 1 '.mgr' replicated size 2 min_size 1 crush_rule 0 object_hash
> rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 323 flags
> hashpspool,nearfull stripe_width 0 pg_num_max 32 pg_num_min 1
> application mgr read_balance_score 12.50
> pool 2 'cephfs.cephfs.meta' replicated size 2 min_size 1 crush_rule
> 0 object_hash rjenkins pg_num 16 pgp_num 1 pgp_num_target 16
> autoscale_mode on last_change 295 lfor 0/0/54 flags
> hashpspool,nearfull stripe_width 0 pg_autoscale_bias 4 pg_num_min 16
> recovery_priority 5 application cephfs read_balance_score 12.50
> pool 3 'cephfs.cephfs.data' replicated size 2 min_size 1 crush_rule
> 0 object_hash rjenkins pg_num 129 pgp_num 1 pg_num_target 512
> pgp_num_target 512 autoscale_mode on last_change 326 lfor 0/0/54
> flags hashpspool,nearfull,bulk stripe_width 0 application cephfs
> read_balance_score 12.50
>
> How can I decrease the number of pg_num?
>
> Best,
> Mihai
>
> On 2025-03-28 13:19, Eugen Block wrote:
>> Do you use size 1 for your data? You know that's bad, right? Please
>> share 'ceph osd pool ls detail'.
>> Also, it's recommended to use a power of 2 for pg_num, so you should
>> decrease the pg_num for the pool cephfs.cephfs.data.
>>
>> Quoting Mihai Ciubancan <mihai.ciuban...@eli-np.ro>:
>>
>>> Hi Anthony,
>>> Thanks for the answer:
>>>
>>> The output of 'ceph osd df' is:
>>>
>>> ceph osd df tree
>>> ID  CLASS  WEIGHT     REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR    PGS  STATUS  TYPE NAME
>>> -1         167.64825         -  168 TiB   12 TiB   12 TiB  100 MiB   27 GiB  156 TiB   7.13   1.00    -          root default
>>> -3         167.64825         -  168 TiB   12 TiB   12 TiB  100 MiB   27 GiB  156 TiB   7.13   1.00    -          host sto-core-hpc01
>>>  0    ssd   13.97069   1.00000   14 TiB   32 MiB  4.1 MiB   12 KiB   28 MiB   14 TiB      0      0    0      up  osd.0
>>>  1    ssd   13.97069   1.00000   14 TiB   12 TiB   12 TiB    6 KiB   26 GiB  2.0 TiB  85.53  12.00  129      up  osd.1
>>>  2    ssd   13.97069   1.00000   14 TiB   32 MiB  4.1 MiB   12 KiB   28 MiB   14 TiB      0      0    0      up  osd.2
>>>  3    ssd   13.97069   1.00000   14 TiB  1.7 GiB  258 MiB  100 MiB  1.3 GiB   14 TiB   0.01   0.00   16      up  osd.3
>>>  4    ssd   13.97069   1.00000   14 TiB   32 MiB  4.1 MiB   12 KiB   28 MiB   14 TiB      0      0    0      up  osd.4
>>>  5    ssd   13.97069   1.00000   14 TiB   32 MiB  4.1 MiB   12 KiB   28 MiB   14 TiB      0      0    0      up  osd.5
>>>  6    ssd   13.97069   1.00000   14 TiB   32 MiB  4.1 MiB   12 KiB   28 MiB   14 TiB      0      0    0      up  osd.6
>>>  7    ssd   13.97069   1.00000   14 TiB   32 MiB  4.1 MiB   12 KiB   28 MiB   14 TiB      0      0    0      up  osd.7
>>>  8    ssd   13.97069   1.00000   14 TiB   68 MiB  4.8 MiB   12 KiB   63 MiB   14 TiB      0      0    1      up  osd.8
>>>  9    ssd   13.97069   1.00000   14 TiB   32 MiB  4.1 MiB   12 KiB   28 MiB   14 TiB      0      0    0      up  osd.9
>>> 10    ssd   13.97069   1.00000   14 TiB   32 MiB  4.1 MiB   12 KiB   28 MiB   14 TiB      0      0    0      up  osd.10
>>> 11    ssd   13.97069   1.00000   14 TiB   68 MiB  4.8 MiB   12 KiB   63 MiB   14 TiB      0      0    1      up  osd.11
>>>                        TOTAL    168 TiB   12 TiB   12 TiB  100 MiB   27 GiB  156 TiB   7.13
>>>
>>> So all the data is on osd.1
>>>
>>> But I have checked the balancer and it seems active:
>>> ceph balancer status
>>> {
>>>     "active": true,
>>>     "last_optimize_duration": "0:00:00.000368",
>>>     "last_optimize_started": "Fri Mar 28 10:55:06 2025",
>>>     "mode": "upmap",
>>>     "no_optimization_needed": false,
>>>     "optimize_result": "Some objects (0.500000) are degraded; try again later",
>>>     "plans": []
>>> }
>>>
>>> But the output of the command 'ceph config dump|grep balancer' gives me nothing.
>>>
>>> Best,
>>> Mihai
>>>
>>> On 2025-03-27 23:06, Anthony D'Atri wrote:
>>>> Look at `ceph osd df`. Is the balancer enabled?
>>>>
>>>>> On Mar 27, 2025, at 8:50 AM, Mihai Ciubancan <mihai.ciuban...@eli-np.ro> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> My name is Mihai, and I have started using Ceph this month for an HPC cluster.
>>>>> When it was launched into production, the available space shown was 80 TB;
>>>>> now it is 16 TB and I didn't do anything, while I have 12 OSDs (SSDs of 14 TB):
>>>>>
>>>>> sudo ceph osd tree
>>>>> ID  CLASS  WEIGHT     TYPE NAME                STATUS  REWEIGHT  PRI-AFF
>>>>> -1         167.64825  root default
>>>>> -3         167.64825      host sto-core-hpc01
>>>>>  0    ssd   13.97069          osd.0                up   1.00000  1.00000
>>>>>  1    ssd   13.97069          osd.1                up   1.00000  1.00000
>>>>>  2    ssd   13.97069          osd.2                up   1.00000  1.00000
>>>>>  3    ssd   13.97069          osd.3                up   1.00000  1.00000
>>>>>  4    ssd   13.97069          osd.4                up   1.00000  1.00000
>>>>>  5    ssd   13.97069          osd.5                up   1.00000  1.00000
>>>>>  6    ssd   13.97069          osd.6                up   1.00000  1.00000
>>>>>  7    ssd   13.97069          osd.7                up   1.00000  1.00000
>>>>>  8    ssd   13.97069          osd.8                up   1.00000  1.00000
>>>>>  9    ssd   13.97069          osd.9                up   1.00000  1.00000
>>>>> 10    ssd   13.97069          osd.10               up   1.00000  1.00000
>>>>> 11    ssd   13.97069          osd.11               up   1.00000  1.00000
>>>>>
>>>>> sudo ceph df detail
>>>>> --- RAW STORAGE ---
>>>>> CLASS     SIZE    AVAIL    USED  RAW USED  %RAW USED
>>>>> ssd    168 TiB  156 TiB  12 TiB    12 TiB       7.12
>>>>> TOTAL  168 TiB  156 TiB  12 TiB    12 TiB       7.12
>>>>>
>>>>> --- POOLS ---
>>>>> POOL                ID  PGS   STORED   (DATA)  (OMAP)  OBJECTS     USED   (DATA)  (OMAP)  %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES  DIRTY  USED COMPR  UNDER COMPR
>>>>> .mgr                 1    1  705 KiB  705 KiB     0 B        2  1.4 MiB  1.4 MiB     0 B      0    8.1 TiB            N/A          N/A    N/A         0 B          0 B
>>>>> cephfs.cephfs.meta   2   16  270 MiB  270 MiB     0 B   85.96k  270 MiB  270 MiB     0 B      0     16 TiB            N/A          N/A    N/A         0 B          0 B
>>>>> cephfs.cephfs.data   3  129   12 TiB   12 TiB     0 B    3.73M   12 TiB   12 TiB     0 B  42.49     16 TiB            N/A          N/A    N/A         0 B          0 B
>>>>>
>>>>> While on the client side I have this:
>>>>>
>>>>> $ df -h
>>>>> 10.18.31.1:6789:/   21T   13T  8.1T  61% /data
>>>>>
>>>>> I don't know where all the space that was there at the beginning has gone.
>>>>> Does anyone have a hint?
>>>>>
>>>>> Best regards,
>>>>> Mihai
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io