Hi, totally the same issue here. Latest Octopus, and newly added OSDs with fewer PGs end up fuller than the old ones.
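The "fuller with fewer PGs" symptom can be sanity-checked by dividing an OSD's DATA by its PG count. This is a rough illustrative sketch only; the figures are taken from the `ceph osd df tree` output quoted later in this thread (osd.240 on an old host, osd.324 on a new host):

```shell
# Data held per PG should be roughly uniform across OSDs of one pool.
old_data_gib=714   # osd.240 (old host E10): 714 GiB DATA
old_pgs=125
new_data_gib=1331  # osd.324 (new host E14): 1.3 TiB ~= 1331 GiB DATA
new_pgs=124

old_per_pg=$(( old_data_gib / old_pgs ))
new_per_pg=$(( new_data_gib / new_pgs ))
echo "old OSD: ~${old_per_pg} GiB per PG"
echo "new OSD: ~${new_per_pg} GiB per PG"
# Roughly 2x more data per PG on the new OSD despite a similar PG count --
# consistent with per-OSD space amplification (e.g. Josh's
# bluestore_min_alloc_size question below) rather than plain PG imbalance.
```

If the per-PG figures were similar, the problem would instead point at PG placement, and a balancer would be the right fix.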
What I normally do: let the cluster rebalance until some of the new OSDs hit 75-80%, then let the cluster "rest" for 1-2 days. During that time some background cleanup happens which can free up space. Once there is space again, restart the rebalance, stop at 75%, and let it rest again. It's a terrible, slow workaround, but I couldn't find a better way. If you are already stuck in the reweight loop, use crush reweight to move PGs away from that node, and if at some stage you are able to finish the rebalance, slowly set the crush weights back.

________________________________
From: Joshua Baergen <jbaer...@digitalocean.com>
Sent: Tuesday, July 22, 2025 5:13 AM
To: mhnx <morphinwith...@gmail.com>
Cc: Ceph Users <ceph-users@ceph.io>
Subject: [ceph-users] Re: HELP! Cluster usage increased after adding new nodes/osd's
________________________________

Hello,

Any chance that these OSDs were deployed with different bluestore_min_alloc_size settings?

Josh

On Mon, Jul 7, 2025 at 2:39 PM mhnx <morphinwith...@gmail.com> wrote:
>
> Hello Stefan!
> All of my nodes and clients = Octopus 15.2.14
> I have 1x RBD pool and 2000 RBD volumes of 100 GB each.
>
> This is the upmap-balanced state, without manual reweight:
>
> ID   CLASS  WEIGHT     REWEIGHT  SIZE     RAW USE   DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
>  -1         669.87897         -  671 TiB   381 TiB  376 TiB  170 GiB  5.2 TiB  289 TiB  56.87  1.00    -          root default
> -53         335.36298         -  335 TiB   192 TiB  189 TiB   85 GiB  2.6 TiB  144 TiB  57.15  1.00    -          datacenter E-datacenter
>
> **** OLD-NODE:
> -43         20.95900          -   21 TiB    11 TiB   10 TiB  5.4 GiB  180 GiB   10 TiB  50.66  0.89    -          host E10
> 240  ssd    1.74699     1.00000  1.7 TiB   728 GiB  714 GiB  425 MiB   14 GiB  1.0 TiB  40.70  0.72  125  up      osd.240
> 241  ssd    1.74699     1.00000  1.7 TiB   924 GiB  909 GiB  507 MiB   14 GiB  864 GiB  51.66  0.91  126  up      osd.241
> 242  ssd    1.74699     1.00000  1.7 TiB   913 GiB  898 GiB  513 MiB   15 GiB  876 GiB  51.04  0.90  131  up      osd.242
> 243  ssd    1.74699     1.00000  1.7 TiB   896 GiB  880 GiB  474 MiB   16 GiB  892 GiB  50.12  0.88  132  up      osd.243
> 244  ssd    1.74699     1.00000  1.7 TiB   842 GiB  826 GiB  411 MiB   16 GiB  947 GiB  47.06  0.83  133  up      osd.244
> 245  ssd    1.74699     1.00000  1.7 TiB   912 GiB  896 GiB  416 MiB   15 GiB  876 GiB  51.00  0.90  143  up      osd.245
> 246  ssd    1.74699     1.00000  1.7 TiB   940 GiB  925 GiB  535 MiB   15 GiB  848 GiB  52.58  0.92  143  up      osd.246
> 247  ssd    1.74699     1.00000  1.7 TiB  1008 GiB  993 GiB  436 MiB   15 GiB  781 GiB  56.35  0.99  135  up      osd.247
> 248  ssd    1.74699     1.00000  1.7 TiB   1.0 TiB  1.0 TiB  452 MiB   15 GiB  728 GiB  59.28  1.04  141  up      osd.248
> 249  ssd    1.74699     1.00000  1.7 TiB   826 GiB  812 GiB  375 MiB   14 GiB  962 GiB  46.21  0.81  128  up      osd.249
> 250  ssd    1.74699     1.00000  1.7 TiB   923 GiB  907 GiB  435 MiB   15 GiB  866 GiB  51.60  0.91  136  up      osd.250
> 251  ssd    1.74699     1.00000  1.7 TiB   900 GiB  884 GiB  567 MiB   15 GiB  889 GiB  50.30  0.88  142  up      osd.251
>
> **** NEW-NODE:
> -65         20.96375          -   21 TiB    16 TiB   16 TiB  5.4 GiB  125 GiB  5.1 TiB  75.47  1.33    -          host E14
> 324  ssd    1.74698     1.00000  1.7 TiB   1.4 TiB  1.3 TiB  431 MiB   10 GiB  399 GiB  77.72  1.37  124  up      osd.324
> 325  ssd    1.74698     1.00000  1.7 TiB   1.2 TiB  1.2 TiB  436 MiB  9.6 GiB  579 GiB  67.62  1.19  107  up      osd.325
> 326  ssd    1.74698     1.00000  1.7 TiB   1.3 TiB  1.3 TiB  446 MiB   10 GiB  495 GiB  72.35  1.27  107  up      osd.326
> 327  ssd    1.74698     1.00000  1.7 TiB   1.4 TiB  1.4 TiB  506 MiB   11 GiB  355 GiB  80.14  1.41  126  up      osd.327
> 328  ssd    1.74698     1.00000  1.7 TiB   1.3 TiB  1.3 TiB  432 MiB   10 GiB  477 GiB  73.33  1.29  114  up      osd.328
> 329  ssd    1.74698     1.00000  1.7 TiB   1.4 TiB  1.4 TiB  530 MiB   11 GiB  343 GiB  80.81  1.42  124  up      osd.329
> 330  ssd    1.74698     1.00000  1.7 TiB   1.2 TiB  1.2 TiB  432 MiB   10 GiB  537 GiB  69.99  1.23  113  up      osd.330
> 331  ssd    1.74698     1.00000  1.7 TiB   1.4 TiB  1.4 TiB  473 MiB   11 GiB  353 GiB  80.25  1.41  123  up      osd.331
> 332  ssd    1.74698     1.00000  1.7 TiB   1.4 TiB  1.4 TiB  459 MiB   11 GiB  370 GiB  79.30  1.39  124  up      osd.332
> 333  ssd    1.74698     1.00000  1.7 TiB   1.3 TiB  1.2 TiB  438 MiB   10 GiB  500 GiB  72.05  1.27  111  up      osd.333
> 334  ssd    1.74698     1.00000  1.7 TiB   1.4 TiB  1.4 TiB  433 MiB   11 GiB  393 GiB  78.00  1.37  123  up      osd.334
> 335  ssd    1.74698     1.00000  1.7 TiB   1.3 TiB  1.3 TiB  488 MiB   10 GiB  464 GiB  74.08  1.30  119  up      osd.335
>
> ---------------------
> I can't upgrade to newer versions because I have a personal project that is designed around the current Linux and Ceph versions. An upgrade means a lot of work for me.
>
> Maybe the JJ balancer will do a better job, as you recommended, but I don't want better balance at this moment.
>
> First of all I want to understand why this happened, and what changed between Nautilus and Octopus such that the same OSD deploy method produces near-full new OSDs with a similar PG count.
>
> -Best
>
> Stefan Kooman <ste...@bit.nl> wrote on Mon, 7 Jul 2025 at 22:22:
> >
> > On 7/7/25 18:34, mhnx wrote:
> > > Hello!
> > >
> > > A few years ago I built a "dc-a:12 + dc-b:12 = 24" node Ceph cluster with Nautilus v14.2.16.
> > > A year ago the cluster was upgraded to Octopus and it was running fine.
> > > Recently I added 4+4=8 new nodes with identical hardware and SSD drives.
> > > When I created the OSDs with Octopus, cluster usage increased from 50% to 78%!!
> >
> > What does "ceph osd df tree" give you?
> >
> > > The weird problem is that the new OSDs become nearfull and hold more data even when they have the same or a smaller number of PGs.
> > >
> > > I had to reweight the new OSDs to 0.9 to equalize their space usage.
> > > I increased the PG count from 8192 to 16384 and ran the balancer; it became worse and I have 84% usage now!
> >
> > Remember that Ceph is limited by the fullest OSD in the cluster.
> > Do you have old clients? If not, try to get rid of reweight and start using upmap. It is way more efficient at getting a cluster well balanced. I would recommend using this balancer script:
> > https://github.com/TheJJ/ceph-balancer
> >
> > Maybe first reset all the reweights (first do: ceph osd set nobackfill).
> > Then run this script:
> > https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py
> >
> > And after that run the ceph-balancer script. That should help tremendously if the cluster is imbalanced.
> >
> > > I guess the OSD or PG code changed between Nautilus and Octopus, and that change generates this problem.
> >
> > What version of Octopus are you running?
> >
> > > Can anyone help me with experience or knowledge about this?
> > > What should I do?
> > >
> > > My solution idea:
> > > I'm thinking of destroying and re-creating the old OSDs as a solution, but I would need to re-create 144x 3.8TB SAS SSD OSDs, and that means 4-5 days of maintenance.
> > >
> > > Also, I have 2 OSDs per drive because that was recommended in Nautilus times. How about this? Should I keep that layout, or should I use 1 OSD per 3.8TB SAS SSD? What is the recommendation for Octopus and Quincy?
> >
> > I would recommend upgrading to newer, supported versions; maybe go to Pacific and then Reef.
> > Modern versions of Ceph do not gain from deploying multiple OSDs per drive. What Ceph services are you running (MDS, RGW, RBD)?
> >
> > Gr. Stefan
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
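For reference, Stefan's reset-reweights-then-upmap sequence might be sketched as below. This is a dry-run sketch, not a tested runbook: the script names come from the two repositories linked above, the local paths and OSD IDs are assumptions, and every step should be reviewed against your own cluster before setting DRY_RUN=0.

```shell
# DRY_RUN=1 (the default) only prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

run ceph osd set nobackfill                # pause data movement first
for id in 324 325; do                      # on a real cluster: $(ceph osd ls)
  run ceph osd reweight "$id" 1.0          # reset legacy reweights to 1.0
done
# Freeze the current PG placement as upmap entries, so resetting the
# reweights does not trigger a mass data migration (cernceph/ceph-scripts):
run sh -c './upmap-remapped.py | sh'
run ceph osd unset nobackfill
# Then balance by utilization with the JJ balancer (TheJJ/ceph-balancer):
run sh -c './placementoptimizer.py balance | sh'
```

The point of the upmap-remapped step is ordering: it pins PGs where they currently are, so the subsequent balancer moves data deliberately rather than as a side effect of the reweight reset.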