First time I recall anyone trying this.  Thoughts:

* Manually edit the CRUSH map and bump the retries (the choose_total_tries tunable) from 50 to 100.
* Better yet, give those OSDs a custom device class and change the CRUSH rule 
to use that class together with the default root.  
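A rough sketch of both approaches (the class name "hdd-ups", the rule name "ha-ups", the OSD ids and the pool name are placeholders for your own):

```shell
# Option 1: raise the CRUSH retry budget (choose_total_tries defaults to 50)
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit crushmap.txt: change "tunable choose_total_tries 50" to 100
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new

# Option 2: custom device class on the UPS-backed OSDs
ceph osd crush rm-device-class osd.1 osd.2          # existing class must be cleared first
ceph osd crush set-device-class hdd-ups osd.1 osd.2
ceph osd crush rule create-replicated ha-ups default host hdd-ups
ceph osd pool set <pool> crush_rule ha-ups
```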

Do you also constrain the mons to those systems? 
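To check whether the rule can still find 3 hosts under that row with the current weights, you can replay it offline with crushtool, e.g. (rule id 2, from the rule dump below):

```shell
ceph osd getcrushmap -o crushmap.bin
# simulate the rule: any input that maps to fewer than 3 OSDs is reported as a bad mapping
crushtool -i crushmap.bin --test --rule 2 --num-rep 3 --show-bad-mappings
```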

> On Apr 16, 2025, at 6:41 AM, Michel Jouvin <michel.jou...@ijclab.in2p3.fr> 
> wrote:
> 
> Hi,
> 
> We have a use case where we would like to restrict some pools to a subset of 
> the OSDs located in a particular section of the CRUSH map hierarchy (OSDs 
> backed by UPS/diesel power). For these (replica 3) pools we tried to define a 
> specific CRUSH rule with the root parameter set to a specific row (which 
> contains 3 OSD servers with ~10 OSDs each). At the beginning it worked, but 
> after some time (probably after reweighting the OSDs in this row to reduce 
> the number of PGs from other pools), a few PGs are active+clean+remapped and 
> one is undersized.
> 
> 'ceph osd pg dump|grep remapped' gives an output similar to the following one 
> for each remapped PG:
> 
> 20.1ae       648                   0         0        648 0   374416846            0           0    438      1404 438       active+clean+remapped  2025-04-16T07:19:40.507778+0000 43117'1018433    48443:1131738       [70,58]          70 [70,58,45]              70   43117'1018433 2025-04-16T07:19:40.507443+0000    43117'1018433 2025-04-16T07:19:40.507443+0000              0 15  periodic scrub scheduled @ 2025-04-17T19:18:23.470846+0000 648                0
> 
> We can see that we currently have 3 replicas but that Ceph would like to move 
> to 2... (the undersized PG currently has only 2 replicas, for an unknown 
> reason, probably the same one).
> 
> Is it wrong to do what we did, i.e. to use a row as the CRUSH rule root 
> parameter? If not, where could we find more information about the cause?
> 
> Thanks in advance for any help. Best regards,
> 
> Michel
> 
> --------------------- Crush rule used -----------------
> 
> {
>     "rule_id": 2,
>     "rule_name": "ha-replicated_ruleset",
>     "type": 1,
>     "steps": [
>         {
>             "op": "take",
>             "item": -22,
>             "item_name": "row-01~hdd"
>         },
>         {
>             "op": "chooseleaf_firstn",
>             "num": 0,
>             "type": "host"
>         },
>         {
>             "op": "emit"
>         }
>     ]
> }
> 
> 
> ------------------- Beginning of the CRUSH tree -------------------
> 
> ID   CLASS  WEIGHT     TYPE NAME                         STATUS REWEIGHT  PRI-AFF
>  -1         843.57141  root default
> -19         843.57141      datacenter bat.206
> -21         283.81818          row row-01
> -15          87.32867              host cephdevel-76079
>   1    hdd    7.27739                  osd.1                 up 0.50000  1.00000
>   2    hdd    7.27739                  osd.2                 up 0.50000  1.00000
>  14    hdd    7.27739                  osd.14                up 0.50000  1.00000
>  39    hdd    7.27739                  osd.39                up 0.50000  1.00000
>  40    hdd    7.27739                  osd.40                up 0.50000  1.00000
>  41    hdd    7.27739                  osd.41                up 0.50000  1.00000
>  42    hdd    7.27739                  osd.42                up 0.50000  1.00000
>  43    hdd    7.27739                  osd.43                up 0.50000  1.00000
>  44    hdd    7.27739                  osd.44                up 0.50000  1.00000
>  45    hdd    7.27739                  osd.45                up 0.50000  1.00000
>  46    hdd    7.27739                  osd.46                up 0.50000  1.00000
>  47    hdd    7.27739                  osd.47                up 0.50000  1.00000
>  -3          94.60606              host cephdevel-76154
>  49    hdd    7.27739                  osd.49                up 0.50000  1.00000
>  50    hdd    7.27739                  osd.50                up 0.50000  1.00000
>  51    hdd    7.27739                  osd.51                up 0.50000  1.00000
>  66    hdd    7.27739                  osd.66                up 0.50000  1.00000
>  67    hdd    7.27739                  osd.67                up 0.50000  1.00000
>  68    hdd    7.27739                  osd.68                up 0.50000  1.00000
>  69    hdd    7.27739                  osd.69                up 0.50000  1.00000
>  70    hdd    7.27739                  osd.70                up 0.50000  1.00000
>  71    hdd    7.27739                  osd.71                up 0.50000  1.00000
>  72    hdd    7.27739                  osd.72                up 0.50000  1.00000
>  73    hdd    7.27739                  osd.73                up 0.50000  1.00000
>  74    hdd    7.27739                  osd.74                up 0.50000  1.00000
>  75    hdd    7.27739                  osd.75                up 0.50000  1.00000
>  -4         101.88345              host cephdevel-76204
>  48    hdd    7.27739                  osd.48                up 0.50000  1.00000
>  52    hdd    7.27739                  osd.52                up 0.50000  1.00000
>  53    hdd    7.27739                  osd.53                up 0.50000  1.00000
>  54    hdd    7.27739                  osd.54                up 0.50000  1.00000
>  56    hdd    7.27739                  osd.56                up 0.50000  1.00000
>  57    hdd    7.27739                  osd.57                up 0.50000  1.00000
>  58    hdd    7.27739                  osd.58                up 0.50000  1.00000
>  59    hdd    7.27739                  osd.59                up 0.50000  1.00000
>  60    hdd    7.27739                  osd.60                up 0.50000  1.00000
>  61    hdd    7.27739                  osd.61                up 0.50000  1.00000
>  62    hdd    7.27739                  osd.62                up 0.50000  1.00000
>  63    hdd    7.27739                  osd.63                up 0.50000  1.00000
>  64    hdd    7.27739                  osd.64                up 0.50000  1.00000
>  65    hdd    7.27739                  osd.65                up 0.50000  1.00000
> -23         203.16110          row row-02
> -13          87.32867              host cephdevel-76213
>  27    hdd    7.27739                  osd.27                up 1.00000  1.00000
>  28    hdd    7.27739                  osd.28                up 1.00000  1.00000
>  29    hdd    7.27739                  osd.29                up 1.00000  1.00000
>  30    hdd    7.27739                  osd.30                up 1.00000  1.00000
>  31    hdd    7.27739                  osd.31                up 1.00000  1.00000
>  32    hdd    7.27739                  osd.32                up 1.00000  1.00000
>  33    hdd    7.27739                  osd.33                up 1.00000  1.00000
>  34    hdd    7.27739                  osd.34                up 1.00000  1.00000
>  35    hdd    7.27739                  osd.35                up 1.00000  1.00000
>  36    hdd    7.27739                  osd.36                up 1.00000  1.00000
>  37    hdd    7.27739                  osd.37                up 1.00000  1.00000
>  38    hdd    7.27739                  osd.38                up 1.00000  1.00000
> ......
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io