The OMAPs seem to be a red herring. It is just the `rbd_directory` object,
which is simply a list of RBD image names mapping to their IDs and back.
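
In case anyone wants to verify this on their own cluster, the OMAP of that
object can be inspected directly (assuming the pool is named rbd; adjust as
needed):

$ rados -p rbd listomapkeys rbd_directory | head
$ rados -p rbd listomapvals rbd_directory | head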

I've now offline-compacted the OSDs. The deep scrub of that PG still takes
around 100 minutes, compared to roughly 10 minutes for the other PGs.
I hope this solves the snaptrim problem, because during the long-running
snaptrim the latency of the cluster slowly degrades.
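
For anyone who wants to do the same: offline compaction of a single OSD's
RocksDB can be done roughly like this (the OSD id and data path are just
examples, adjust for your deployment):

$ systemctl stop ceph-osd@156
$ ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-156 compact
$ systemctl start ceph-osd@156

The same can also be triggered online, without stopping the daemon:

$ ceph tell osd.156 compact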

On Mon, May 5, 2025 at 1:50 PM Boris <b...@kervyn.de> wrote:

> So I went through some other clusters we have, and they all show
> basically the same pattern.
>
> One PG with disproportionately large OMAP data. These are all flash-only
> RBD clusters of different sizes, which were set up on different initial
> versions.
> All of them are now running Reef 18.2.6.
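>
> To narrow down which object in such a PG actually carries the OMAP data,
> something like this should work on recent releases (the object name below
> is just the obvious suspect, not confirmed):
>
> $ rados --pgid 1.1c ls
> $ rados -p rbd listomapkeys rbd_directory | wc -l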
>
> $ (ceph pg ls | head -n1 | awk '{print $1, $7, $8, $11, $13, $14 }'; ceph
> pg ls-by-pool rbd | sort -nk7 | tail -n3 | awk '{print $1, $7, $8, $11,
> $13, $14 }')| column -t
> PG     OMAP_BYTES*  OMAP_KEYS*  STATE         VERSION           REPORTED
> 1.1b1  1356         36          active+clean  432394'543016527
>  432394:691327003
> 1.ea   1951         57          active+clean  432394'607403423
>  432394:783494415
> 1.1c   59094        2044        active+clean  432394'528017978
>  432394:726223821
>
> $ (ceph pg ls | head -n1 | awk '{print $1, $7, $8, $11, $13, $14 }'; ceph
> pg ls-by-pool rbd | sort -nk7 | tail -n3 | awk '{print $1, $7, $8, $11,
> $13, $14 }')| column -t
> PG     OMAP_BYTES*  OMAP_KEYS*  STATE         VERSION           REPORTED
> 2.357  1374         29          active+clean  122007'222637136
>  122007:294907884
> 2.291  1517         38          active+clean  122007'196789540
>  122007:271705492
> 2.1c   32396        1118        active+clean  122007'201282577
>  122007:290533282
>
> $ (ceph pg ls | head -n1 | awk '{print $1, $7, $8, $11, $13, $14 }'; ceph
> pg ls-by-pool rbd | sort -nk7 | tail -n3 | awk '{print $1, $7, $8, $11,
> $13, $14 }')| column -t
> PG     OMAP_BYTES*  OMAP_KEYS*  STATE         VERSION          REPORTED
> 1.12e  1423         37          active+clean  86947'526495372
>  86947:597361344
> 1.304  1490         38          active+clean  86947'356338340
>  86947:427439006
> 1.1c   38385        1329        active+clean  86947'449102542
>  86947:525465853
>
> (ceph pg ls | head -n1 | awk '{print $1, $7, $8, $11, $13, $14 }'; ceph pg
> ls-by-pool rbd | sort -nk7 | tail -n3 | awk '{print $1, $7, $8, $11, $13,
> $14 }')| column -t
> PG     OMAP_BYTES*  OMAP_KEYS*  STATE         VERSION        REPORTED
> 2.13f  425          27          active+clean  46780'456499   46828:588064
> 2.1e2  427          27          active+clean  46588'351645   46828:492723
> 2.1c   11718        410         active+clean  46828'1366432  46828:4124268
>
>
> On Mon, May 5, 2025 at 10:00 AM Boris <b...@kervyn.de> wrote:
>
>> The mentioned PG also seems to take ages to deep-scrub. The other PGs
>> seem to take roughly 10 minutes to deep-scrub, while this PG has been
>> scrubbing for over an hour.
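>>
>> For reference, a deep scrub of a single PG can also be triggered manually
>> to time it:
>>
>> root@ceph-rbd-mon1:~# ceph pg deep-scrub 3.c1c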
>>
>> It also has a lot more omap_bytes and omap_keys:
>> root@ceph-rbd-mon1:~# (ceph pg ls | head -n1 | awk '{print $1, $6, $7,
>> $8, $11, $13, $14 }'; ceph pg ls | grep ^3\. | sort -nk8 | tail | awk
>> '{print $1, $6, $7, $8, $11, $13, $14 }')| column -t
>> PG      BYTES        OMAP_BYTES*  OMAP_KEYS*  STATE
>>  VERSION            REPORTED
>> 3.1323  45850562560  2042         40          active+clean
>>   8935596'655909002  8935596:696646107
>> 3.1f8e  46328389632  2408         40          active+clean
>>   8935596'625068094  8935596:699142583
>> 3.b57   43968541474  1143         40          active+clean
>>   8935596'395996789  8935596:548258860
>> 3.532   45439124004  1213         42          active+clean
>>   8935596'535018014  8935596:621720921
>> 3.1837  45275007300  1928         43          active+clean
>>   8935596'369343024  8935596:485257282
>> 3.97d   45998831398  2272         43          active+clean
>>   8935596'341398404  8935596:463788682
>> 3.8f4   45190209204  1405         44          active+clean
>>   8935596'379178010  8935596:509024378
>> 3.1237  45146136576  2097         45          active+clean
>>   8935596'368790387  8935596:876062378
>> 3.101d  45518981120  1296         47          active+clean
>>   8935596'310619462  8935596:489401818
>> 3.c1c   45024581632  263621       9078        active+clean+scrubbing+deep
>>  8935596'371406402  8935596:580117577
>> root@ceph-rbd-mon1:~# (ceph pg ls | head -n1 | awk '{print $1, $6, $7,
>> $8, $11, $13, $14 }'; ceph pg ls | grep ^3\. | sort -nk7 | tail | awk
>> '{print $1, $6, $7, $8, $11, $13, $14 }')| column -t
>> PG      BYTES        OMAP_BYTES*  OMAP_KEYS*  STATE
>>  VERSION            REPORTED
>> 3.1837  45275007300  1928         43          active+clean
>>   8935596'369343048  8935596:485257403
>> 3.28c   46611274402  1930         29          active+clean
>>   8935596'343110593  8935596:438952996
>> 3.aa    44759281456  1950         34          active+clean
>>   8935596'415139727  8935596:516645874
>> 3.1323  45850562560  2042         40          active+clean
>>   8935596'655909072  8935596:696646714
>> 3.1237  45146136576  2097         45          active+clean
>>   8935596'368790391  8935596:876062474
>> 3.a56   45746631280  2144         30          active+clean
>>   8935596'305003417  8935596:431604881
>> 3.97d   45998610214  2272         43          active+clean
>>   8935596'341398488  8935596:463788934
>> 3.9a4   44538237970  2314         39          active+clean
>>   8935596'385476487  8935596:545205675
>> 3.1f8e  46328389632  2408         40          active+clean
>>   8935596'625068112  8935596:699142847
>> 3.c1c   45024581632  263621       9078        active+clean+scrubbing+deep
>>  8935596'371406432  8935596:580119804
>>
>> On Mon, May 5, 2025 at 9:27 AM Boris <b...@kervyn.de> wrote:
>>
>>> Hi,
>>>
>>> we have one cluster where one PG seems to be snaptrimming forever.
>>> It sometimes seems to hinder other PGs on the same primary OSD from
>>> snaptrimming as well.
>>>
>>> We've synced out the OSD, but it still keeps happening.
>>> The problem was gone for a while, but now it is back again with the same
>>> PG.
>>>
>>> And the constant factor is always PG 3.c1c.
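>>>
>>> The snap trim backlog of that PG can be checked with something like this
>>> (look at the SNAPTRIMQ_LEN column in the ceph pg dump output; the exact
>>> columns may differ between versions):
>>>
>>> # ceph pg dump pgs 2>/dev/null | grep -e '^PG_STAT' -e '^3\.c1c '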
>>>
>>> Anyone got an idea what I can do? Is it possible to "sync out and
>>> recreate" a PG, like I can do with an OSD?
>>>
>>> Here is some output from the PG list:
>>>
>>> Cheers
>>>  Boris
>>>
>>> # date; ceph pg ls| grep snaptrim | awk '{print $1, $11, $12, $15}' |
>>> column -t
>>> Tue Apr 29 02:12:21 PM UTC 2025
>>> 3.1b    active+clean+snaptrim_wait  3m  [156,115,48]p156
>>> 3.2f6   active+clean+snaptrim_wait  3m  [156,117,32]p156
>>> 3.65d   active+clean+snaptrim_wait  3m  [156,263,9]p156
>>> 3.721   active+clean+snaptrim_wait  3m  [156,23,190]p156
>>> 3.849   active+clean+snaptrim_wait  3m  [156,135,31]p156
>>> 3.abb   active+clean+snaptrim_wait  3m  [156,70,262]p156
>>> 3.c1c   active+clean+snaptrim_wait  3m  [156,111,11]p156
>>> 3.f83   active+clean+snaptrim_wait  3m  [156,119,97]p156
>>> 3.f8a   active+clean+snaptrim_wait  3m  [156,266,281]p156
>>> 3.103f  active+clean+snaptrim_wait  3m  [156,262,215]p156
>>> 3.11e9  active+clean+snaptrim_wait  3m  [156,135,280]p156
>>> 3.14ea  active+clean+snaptrim_wait  3m  [156,118,103]p156
>>> 3.1688  active+clean+snaptrim_wait  3m  [156,221,98]p156
>>> 3.1848  active+clean+snaptrim_wait  3m  [156,268,68]p156
>>> 3.18ca  active+clean+snaptrim_wait  3m  [156,39,94]p156
>>> 3.1b6e  active+clean+snaptrim_wait  3m  [156,274,31]p156
>>> 3.1e27  active+clean+snaptrim_wait  3m  [156,212,134]p156
>>> 3.1e74  active+clean+snaptrim       3m  [156,96,12]p156
>>> 3.1f11  active+clean+snaptrim_wait  3m  [156,253,120]p156
>>> 3.1f7b  active+clean+snaptrim_wait  3m  [156,190,283]p156
>>>
>>> # date; ceph pg ls| grep snaptrim | awk '{print $1, $11, $12, $15}' |
>>> column -t
>>> Tue Apr 29 02:47:38 PM UTC 2025
>>> 3.1b    active+clean+snaptrim_wait  2m  [156,115,48]p156
>>> 3.2f6   active+clean+snaptrim_wait  2m  [156,117,32]p156
>>> 3.65d   active+clean+snaptrim_wait  2m  [156,263,9]p156
>>> 3.721   active+clean+snaptrim_wait  2m  [156,23,190]p156
>>> 3.849   active+clean+snaptrim_wait  2m  [156,135,31]p156
>>> 3.abb   active+clean+snaptrim_wait  2m  [156,70,262]p156
>>> 3.c1c   active+clean+snaptrim_wait  2m  [156,111,11]p156
>>> 3.f83   active+clean+snaptrim_wait  2m  [156,119,97]p156
>>> 3.f8a   active+clean+snaptrim_wait  2m  [156,266,281]p156
>>> 3.103f  active+clean+snaptrim_wait  2m  [156,262,215]p156
>>> 3.11e9  active+clean+snaptrim_wait  2m  [156,135,280]p156
>>> 3.14ea  active+clean+snaptrim_wait  2m  [156,118,103]p156
>>> 3.1688  active+clean+snaptrim_wait  2m  [156,221,98]p156
>>> 3.1848  active+clean+snaptrim_wait  2m  [156,268,68]p156
>>> 3.18ca  active+clean+snaptrim_wait  2m  [156,39,94]p156
>>> 3.1b6e  active+clean+snaptrim_wait  2m  [156,274,31]p156
>>> 3.1e27  active+clean+snaptrim_wait  2m  [156,212,134]p156
>>> 3.1e74  active+clean+snaptrim       2m  [156,96,12]p156
>>> 3.1f11  active+clean+snaptrim_wait  2m  [156,253,120]p156
>>> 3.1f7b  active+clean+snaptrim_wait  2m  [156,190,283]p156
>>>
>>> # date; ceph pg ls| grep snaptrim | awk '{print $1, $11, $12, $15}' |
>>> column -t
>>> Tue Apr 29 04:24:55 PM UTC 2025
>>> 3.1b    active+clean+snaptrim_wait  55s  [156,115,48]p156
>>> 3.2f6   active+clean+snaptrim_wait  55s  [156,117,32]p156
>>> 3.65d   active+clean+snaptrim_wait  55s  [156,263,9]p156
>>> 3.721   active+clean+snaptrim_wait  55s  [156,23,190]p156
>>> 3.849   active+clean+snaptrim_wait  55s  [156,135,31]p156
>>> 3.abb   active+clean+snaptrim_wait  55s  [156,70,262]p156
>>> 3.c1c   active+clean+snaptrim       34s  [156,111,11]p156
>>> 3.f83   active+clean+snaptrim_wait  55s  [156,119,97]p156
>>> 3.f8a   active+clean+snaptrim_wait  55s  [156,266,281]p156
>>> 3.103f  active+clean+snaptrim_wait  55s  [156,262,215]p156
>>> 3.11e9  active+clean+snaptrim_wait  55s  [156,135,280]p156
>>> 3.14ea  active+clean+snaptrim_wait  55s  [156,118,103]p156
>>> 3.1688  active+clean+snaptrim_wait  55s  [156,221,98]p156
>>> 3.1848  active+clean+snaptrim_wait  55s  [156,268,68]p156
>>> 3.18ca  active+clean+snaptrim_wait  55s  [156,39,94]p156
>>> 3.1b6e  active+clean+snaptrim_wait  55s  [156,274,31]p156
>>> 3.1e27  active+clean+snaptrim_wait  55s  [156,212,134]p156
>>> 3.1e74  active+clean+snaptrim_wait  34s  [156,96,12]p156
>>> 3.1f11  active+clean+snaptrim_wait  55s  [156,253,120]p156
>>> 3.1f7b  active+clean+snaptrim_wait  55s  [156,190,283]p156
>>>
>>> # date; ceph pg ls| grep snaptrim | awk '{print $1, $11, $12, $15}' |
>>> column -t
>>> Mon May  5 06:32:53 AM UTC 2025
>>> 3.c1c  active+clean+snaptrim  41h  [165,111,25]p165
>>>
>>
>>
>>
>
>
>


-- 
The self-help group "UTF-8-Probleme" is meeting, as an exception, in the
large hall this time.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
