The problem is that I have only one host, and the host doesn't have a RAID
controller... I have set it as Anthony suggested, but nothing changed.

Mihai
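If nothing seems to have changed, it may be worth confirming which CRUSH rule
the pools are actually using after the change. A minimal check using only
standard commands (the pool name is taken from the output further down in this
thread):

   ceph osd pool get cephfs.cephfs.data crush_rule   # rule currently assigned to the pool
   ceph osd crush rule dump                          # definition of each rule, incl. chooseleaf type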
-------- Original message --------
From: Eugen Block <[email protected]>
Date: 3/28/25 3:47 PM (GMT+02:00)
To: Mihai Ciubancan <[email protected]>
Cc: [email protected]
Subject: [ceph-users] Re: space size issue

You have the autoscaler enabled, but it is stuck changing the pg_num. The
default replicated_rule (which you are using) requires as many hosts as your
pool size, so in your case 2. (If you value your data, don't use replicated
pools with size 2.) You could make it work with only one host (as Anthony
suggested with osd_crush_chooseleaf_type 0), but you won't have any real
resiliency. I recommend reconsidering your setup.
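As far as I know, osd_crush_chooseleaf_type mainly influences the default rule
generated at cluster bootstrap, so on an existing cluster the usual way to get
Anthony's suggestion to take effect is to create a replicated rule whose
failure domain is the OSD and assign it to the pools. A rough sketch only,
with a made-up rule name and the pool names from the output below:

   ceph osd crush rule create-replicated rep-osd default osd   # chooseleaf over OSDs, not hosts
   ceph osd pool set cephfs.cephfs.data crush_rule rep-osd
   ceph osd pool set cephfs.cephfs.meta crush_rule rep-osd
   ceph osd pool set .mgr crush_rule rep-osd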
<[email protected]>:> Hi Eugen,>> Thanks for your answer. Please find
below the output of the command:>> ceph osd pool ls detail> pool 1 '.mgr'
replicated size 2 min_size 1 crush_rule 0 object_hash > rjenkins pg_num 1
pgp_num 1 autoscale_mode on last_change 323 flags > hashpspool,nearfull
stripe_width 0 pg_num_max 32 pg_num_min 1 > application mgr read_balance_score
12.50> pool 2 'cephfs.cephfs.meta' replicated size 2 min_size 1 crush_rule > 0
object_hash rjenkins pg_num 16 pgp_num 1 pgp_num_target 16 > autoscale_mode on
last_change 295 lfor 0/0/54 flags > hashpspool,nearfull stripe_width 0
pg_autoscale_bias 4 pg_num_min 16 > recovery_priority 5 application cephfs
read_balance_score 12.50> pool 3 'cephfs.cephfs.data' replicated size 2
min_size 1 crush_rule > 0 object_hash rjenkins pg_num 129 pgp_num 1
pg_num_target 512 > pgp_num_target 512 autoscale_mode on last_change 326 lfor
0/0/54 > flags hashpspool,nearfull,bulk stripe_width 0 application cephfs >
read_balance_score 12.50>>> How can I decrese the number of pg_num?>> Best,>
> Mihai
>
> On 2025-03-28 13:19, Eugen Block wrote:
>> Do you use size 1 for your data? You know that's bad, right? Please
>> share 'ceph osd pool ls detail'.
>> Also, it's recommended to use a power of 2 for pg_num, so you should
>> decrease the pg_num for the pool cephfs.cephfs.data.
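Since Nautilus, pg_num can be lowered directly and the mons/mgr adjust
pgp_num gradually behind the scenes. A minimal sketch (128 is just an example
power of two; turning the autoscaler off for the pool first avoids it fighting
the manual change):

   ceph osd pool set cephfs.cephfs.data pg_autoscale_mode off
   ceph osd pool set cephfs.cephfs.data pg_num 128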
<[email protected]>:>>>>> Hi Anthony,>>> Thanks for the answer:>>>>>>
The output of 'ceph osd df' is:>>>>>> ceph osd df tree>>> ID CLASS WEIGHT
REWEIGHT SIZE RAW USE DATA OMAP >>> META AVAIL %USE
VAR PGS STATUS TYPE NAME>>> -1 167.64825 - 168 TiB 12
TiB 12 TiB 100 MiB >>> 27 GiB 156 TiB 7.13 1.00 - root
default>>> -3 167.64825 - 168 TiB 12 TiB 12 TiB 100 MiB
>>> 27 GiB 156 TiB 7.13 1.00 - host >>>
sto-core-hpc01>>> 0 ssd 13.97069 1.00000 14 TiB 32 MiB 4.1 MiB 12
KiB >>> 28 MiB 14 TiB 0 0 0 up osd.0>>> 1
ssd 13.97069 1.00000 14 TiB 12 TiB 12 TiB 6 KiB >>> 26 GiB
2.0 TiB 85.53 12.00 129 up osd.1>>> 2 ssd 13.97069
1.00000 14 TiB 32 MiB 4.1 MiB 12 KiB >>> 28 MiB 14 TiB 0
0 0 up osd.2>>> 3 ssd 13.97069 1.00000 14 TiB 1.7
GiB 258 MiB 100 MiB >>> 1.3 GiB 14 TiB 0.01 0.00 16 up
osd.3>>> 4 ssd 13.97069 1.00000 14 TiB 32 MiB 4.1 MiB 12 KiB
>>> 28 MiB 14 TiB 0 0 0 up osd.4>>> 5 ssd
13.97069 1.00000 14 TiB 32 MiB 4.1 MiB 12 KiB >>> 28 MiB 14 TiB
0 0 0 up osd.5>>> 6 ssd 13.97069 1.00000 14
TiB 32 MiB 4.1 MiB 12 KiB >>> 28 MiB 14 TiB 0 0 0
up osd.6>>> 7 ssd 13.97069 1.00000 14 TiB 32 MiB 4.1 MiB
12 KiB >>> 28 MiB 14 TiB 0 0 0 up osd.7>>> 8
ssd 13.97069 1.00000 14 TiB 68 MiB 4.8 MiB 12 KiB >>> 63 MiB
14 TiB 0 0 1 up osd.8>>> 9 ssd 13.97069
1.00000 14 TiB 32 MiB 4.1 MiB 12 KiB >>> 28 MiB 14 TiB 0
0 0 up osd.9>>> 10 ssd 13.97069 1.00000 14 TiB 32
MiB 4.1 MiB 12 KiB >>> 28 MiB 14 TiB 0 0 0 up
osd.10>>> 11 ssd 13.97069 1.00000 14 TiB 68 MiB 4.8 MiB 12 KiB
>>> 63 MiB 14 TiB 0 0 1 up osd.11>>>
TOTAL 168 TiB 12 TiB 12 TiB 100 MiB >>> 27 GiB 156 TiB
7.13>>>>>> So all the date is on osd.1>>>>>> But I have checked the balancer
and seems active:>>> ceph balancer status>>> {>>> "active": true,>>>
"last_optimize_duration": "0:00:00.000368",>>> "last_optimize_started": "Fri
Mar 28 10:55:06 2025",>>> "mode": "upmap",>>> "no_optimization_needed":
false,>>> "optimize_result": "Some objects (0.500000) are degraded; try
>>> again later",>>> "plans": []>>> }>>>>>> But the output of the commnad
'ceph config dump|grep balancer' >>> gives me nothing.>>>>>> Best,>>>
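An empty 'ceph config dump | grep balancer' is not by itself alarming: config
dump only lists options that were explicitly set, not defaults. The "Some
objects ... are degraded" result above also means the balancer will not move
anything until recovery finishes. A few standard checks:

   ceph balancer status              # effective balancer state, independent of 'config dump'
   ceph osd pool autoscale-status    # what the PG autoscaler is still waiting to do
   ceph -s                           # degraded/misplaced objects currently blocking optimization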
>>> Mihai
>>>
>>> On 2025-03-27 23:06, Anthony D'Atri wrote:
>>>> Look at `ceph osd df`. Is the balancer enabled?
>>>>
>>>> On Mar 27, 2025, at 8:50 AM, Mihai Ciubancan <[email protected]> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> My name is Mihai, and I have started using Ceph this month for an HPC
>>>>> cluster. When it went into production the available space shown was
>>>>> 80TB; now it is 16TB and I didn't do anything, while I have 12 OSDs
>>>>> (14TB SSDs):
>>>>>
>>>>> sudo ceph osd tree
>>>>> ID  CLASS  WEIGHT     TYPE NAME                STATUS  REWEIGHT  PRI-AFF
>>>>> -1         167.64825  root default
>>>>> -3         167.64825      host sto-core-hpc01
>>>>>  0    ssd   13.97069          osd.0                up   1.00000  1.00000
>>>>>  1    ssd   13.97069          osd.1                up   1.00000  1.00000
>>>>>  2    ssd   13.97069          osd.2                up   1.00000  1.00000
>>>>>  3    ssd   13.97069          osd.3                up   1.00000  1.00000
>>>>>  4    ssd   13.97069          osd.4                up   1.00000  1.00000
>>>>>  5    ssd   13.97069          osd.5                up   1.00000  1.00000
>>>>>  6    ssd   13.97069          osd.6                up   1.00000  1.00000
>>>>>  7    ssd   13.97069          osd.7                up   1.00000  1.00000
>>>>>  8    ssd   13.97069          osd.8                up   1.00000  1.00000
>>>>>  9    ssd   13.97069          osd.9                up   1.00000  1.00000
>>>>> 10    ssd   13.97069          osd.10               up   1.00000  1.00000
>>>>> 11    ssd   13.97069          osd.11               up   1.00000  1.00000
>>>>> sudo ceph df detail
>>>>> --- RAW STORAGE ---
>>>>> CLASS    SIZE     AVAIL    USED    RAW USED  %RAW USED
>>>>> ssd      168 TiB  156 TiB  12 TiB    12 TiB       7.12
>>>>> TOTAL    168 TiB  156 TiB  12 TiB    12 TiB       7.12
>>>>>
>>>>> --- POOLS ---
>>>>> POOL                ID  PGS  STORED   (DATA)   (OMAP)  OBJECTS  USED     (DATA)   (OMAP)  %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES  DIRTY  USED COMPR  UNDER COMPR
>>>>> .mgr                 1    1  705 KiB  705 KiB     0 B        2  1.4 MiB  1.4 MiB     0 B      0    8.1 TiB            N/A          N/A    N/A         0 B          0 B
>>>>> cephfs.cephfs.meta   2   16  270 MiB  270 MiB     0 B   85.96k  270 MiB  270 MiB     0 B      0     16 TiB            N/A          N/A    N/A         0 B          0 B
>>>>> cephfs.cephfs.data   3  129   12 TiB   12 TiB     0 B    3.73M   12 TiB   12 TiB     0 B  42.49     16 TiB            N/A          N/A    N/A         0 B          0 B
>>>>>
>>>>> While on the client side I have this:
>>>>>
>>>>> $ df -h
>>>>> 10.18.31.1:6789:/   21T   13T  8.1T  61%  /data
>>>>>
>>>>> I don't know where all the space that was there at the beginning has gone.
>>>>> Does anyone have a hint?
>>>>>
>>>>> Best regards,
>>>>> Mihai
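One note on the shrinking MAX AVAIL figures: 'ceph df' derives a pool's MAX
AVAIL from the projected fullness of the most-loaded OSD the pool maps to, so
a single nearly full OSD (osd.1 at ~85% in the output above) drags the whole
pool's figure down. A couple of standard commands to confirm where the PGs
actually landed:

   ceph osd df                                   # per-OSD utilisation; look for outliers
   ceph osd dump | grep ratio                    # nearfull/backfillfull/full ratios in effect
   ceph pg ls-by-pool cephfs.cephfs.data | head  # UP/ACTING sets for the data pool's PGs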
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]