The OMAPs seem to be a red herring. They belong to the `rbd_directory` object, which is just a mapping of RBD image names to their IDs and back.
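For reference, a minimal sketch of how one could check this, assuming the pool is simply named "rbd": the rbd_directory object keeps the name/ID mapping as omap entries, and "ceph osd map" shows which PG that single object lands in, which would explain one PG per cluster carrying most of the omap data.

# list the omap keys/values of the rbd_directory object (name <-> ID mapping)
rados -p rbd listomapkeys rbd_directory
rados -p rbd listomapvals rbd_directory

# show which PG (and which OSDs) the rbd_directory object maps to
ceph osd map rbd rbd_directory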
I've now offline compacted the OSDs. The deep scrub of that PG still takes around 100 minutes, compared to roughly 10 minutes for the other PGs. I hope this solves the snaptrim problem, because during the long-running snaptrim the latency of the cluster slowly degrades.
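For the archive, a minimal sketch of what offline compaction of an OSD typically looks like, assuming a package-based (non-containerized) deployment and using OSD 156 purely as an example; unit names and data paths differ for cephadm/containerized setups:

# avoid rebalancing while the OSD is down
ceph osd set noout
# stop the OSD and compact its RocksDB offline
systemctl stop ceph-osd@156
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-156 compact
systemctl start ceph-osd@156
ceph osd unset noout

An online alternative is "ceph tell osd.156 compact", which triggers a compaction without taking the OSD down.

On Mon, 5 May 2025 at 13:50, Boris <b...@kervyn.de> wrote: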
> So I went through some other clusters we have, and they all show basically the same pattern:
>
> one PG with disproportionately large OMAP data. These are all flash-only RBD clusters of different sizes, which were set up on different initial versions. All of them are on Reef 18.2.6.
>
> $ (ceph pg ls | head -n1 | awk '{print $1, $7, $8, $11, $13, $14 }'; ceph pg ls-by-pool rbd | sort -nk7 | tail -n3 | awk '{print $1, $7, $8, $11, $13, $14 }')| column -t
> PG     OMAP_BYTES*  OMAP_KEYS*  STATE         VERSION           REPORTED
> 1.1b1  1356         36          active+clean  432394'543016527  432394:691327003
> 1.ea   1951         57          active+clean  432394'607403423  432394:783494415
> 1.1c   59094        2044        active+clean  432394'528017978  432394:726223821
>
> $ (ceph pg ls | head -n1 | awk '{print $1, $7, $8, $11, $13, $14 }'; ceph pg ls-by-pool rbd | sort -nk7 | tail -n3 | awk '{print $1, $7, $8, $11, $13, $14 }')| column -t
> PG     OMAP_BYTES*  OMAP_KEYS*  STATE         VERSION           REPORTED
> 2.357  1374         29          active+clean  122007'222637136  122007:294907884
> 2.291  1517         38          active+clean  122007'196789540  122007:271705492
> 2.1c   32396        1118        active+clean  122007'201282577  122007:290533282
>
> $ (ceph pg ls | head -n1 | awk '{print $1, $7, $8, $11, $13, $14 }'; ceph pg ls-by-pool rbd | sort -nk7 | tail -n3 | awk '{print $1, $7, $8, $11, $13, $14 }')| column -t
> PG     OMAP_BYTES*  OMAP_KEYS*  STATE         VERSION           REPORTED
> 1.12e  1423         37          active+clean  86947'526495372   86947:597361344
> 1.304  1490         38          active+clean  86947'356338340   86947:427439006
> 1.1c   38385        1329        active+clean  86947'449102542   86947:525465853
>
> (ceph pg ls | head -n1 | awk '{print $1, $7, $8, $11, $13, $14 }'; ceph pg ls-by-pool rbd | sort -nk7 | tail -n3 | awk '{print $1, $7, $8, $11, $13, $14 }')| column -t
> PG     OMAP_BYTES*  OMAP_KEYS*  STATE         VERSION           REPORTED
> 2.13f  425          27          active+clean  46780'456499      46828:588064
> 2.1e2  427          27          active+clean  46588'351645      46828:492723
> 2.1c   11718        410         active+clean  46828'1366432     46828:4124268
>
> On Mon, 5 May 2025 at 10:00, Boris <b...@kervyn.de> wrote:
>
>> The mentioned PG also seems to take ages to deep scrub. The other PGs seem to take roughly 10 minutes; this PG has been scrubbing for over an hour.
>>
>> It also has a lot more omap_bytes and omap_keys:
>>
>> root@ceph-rbd-mon1:~# (ceph pg ls | head -n1 | awk '{print $1, $6, $7, $8, $11, $13, $14 }'; ceph pg ls | grep ^3\. | sort -nk8 | tail | awk '{print $1, $6, $7, $8, $11, $13, $14 }')| column -t
>> PG      BYTES        OMAP_BYTES*  OMAP_KEYS*  STATE                        VERSION            REPORTED
>> 3.1323  45850562560  2042         40          active+clean                 8935596'655909002  8935596:696646107
>> 3.1f8e  46328389632  2408         40          active+clean                 8935596'625068094  8935596:699142583
>> 3.b57   43968541474  1143         40          active+clean                 8935596'395996789  8935596:548258860
>> 3.532   45439124004  1213         42          active+clean                 8935596'535018014  8935596:621720921
>> 3.1837  45275007300  1928         43          active+clean                 8935596'369343024  8935596:485257282
>> 3.97d   45998831398  2272         43          active+clean                 8935596'341398404  8935596:463788682
>> 3.8f4   45190209204  1405         44          active+clean                 8935596'379178010  8935596:509024378
>> 3.1237  45146136576  2097         45          active+clean                 8935596'368790387  8935596:876062378
>> 3.101d  45518981120  1296         47          active+clean                 8935596'310619462  8935596:489401818
>> 3.c1c   45024581632  263621       9078        active+clean+scrubbing+deep  8935596'371406402  8935596:580117577
>>
>> root@ceph-rbd-mon1:~# (ceph pg ls | head -n1 | awk '{print $1, $6, $7, $8, $11, $13, $14 }'; ceph pg ls | grep ^3\. | sort -nk7 | tail | awk '{print $1, $6, $7, $8, $11, $13, $14 }')| column -t
>> PG      BYTES        OMAP_BYTES*  OMAP_KEYS*  STATE                        VERSION            REPORTED
>> 3.1837  45275007300  1928         43          active+clean                 8935596'369343048  8935596:485257403
>> 3.28c   46611274402  1930         29          active+clean                 8935596'343110593  8935596:438952996
>> 3.aa    44759281456  1950         34          active+clean                 8935596'415139727  8935596:516645874
>> 3.1323  45850562560  2042         40          active+clean                 8935596'655909072  8935596:696646714
>> 3.1237  45146136576  2097         45          active+clean                 8935596'368790391  8935596:876062474
>> 3.a56   45746631280  2144         30          active+clean                 8935596'305003417  8935596:431604881
>> 3.97d   45998610214  2272         43          active+clean                 8935596'341398488  8935596:463788934
>> 3.9a4   44538237970  2314         39          active+clean                 8935596'385476487  8935596:545205675
>> 3.1f8e  46328389632  2408         40          active+clean                 8935596'625068112  8935596:699142847
>> 3.c1c   45024581632  263621       9078        active+clean+scrubbing+deep  8935596'371406432  8935596:580119804
>>
>> On Mon, 5 May 2025 at 09:27, Boris <b...@kervyn.de> wrote:
>>
>>> Hi,
>>>
>>> we have one cluster where one PG seems to be snaptrimming forever.
>>> It sometimes seems to hinder other PGs on the same primary OSD from snaptrimming as well.
>>>
>>> We've synced out the OSD, but it still keeps happening.
>>> The problem was gone for a while in between, but now it is back again with the same PG.
>>>
>>> And the constant factor is always PG 3.c1c.
>>>
>>> Anyone got an idea what I can do? Is it possible to "sync out and recreate" a PG, like I can do with an OSD?
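As an editorial aside: one quick way to see how much snapshot-trim work is actually queued on the suspect PG is the snap_trimq information in the PG query output. A minimal sketch; the exact field names can vary slightly between releases:

# show the queued snapshot-trim entries (and their count) for the PG
ceph pg 3.c1c query | grep -i snap_trimq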
>>>
>>> Here is some output from the pg list.
>>>
>>> Cheers
>>> Boris
>>>
>>> # date; ceph pg ls| grep snaptrim | awk '{print $1, $11, $12, $15}' | column -t
>>> Tue Apr 29 02:12:21 PM UTC 2025
>>> 3.1b    active+clean+snaptrim_wait  3m  [156,115,48]p156
>>> 3.2f6   active+clean+snaptrim_wait  3m  [156,117,32]p156
>>> 3.65d   active+clean+snaptrim_wait  3m  [156,263,9]p156
>>> 3.721   active+clean+snaptrim_wait  3m  [156,23,190]p156
>>> 3.849   active+clean+snaptrim_wait  3m  [156,135,31]p156
>>> 3.abb   active+clean+snaptrim_wait  3m  [156,70,262]p156
>>> 3.c1c   active+clean+snaptrim_wait  3m  [156,111,11]p156
>>> 3.f83   active+clean+snaptrim_wait  3m  [156,119,97]p156
>>> 3.f8a   active+clean+snaptrim_wait  3m  [156,266,281]p156
>>> 3.103f  active+clean+snaptrim_wait  3m  [156,262,215]p156
>>> 3.11e9  active+clean+snaptrim_wait  3m  [156,135,280]p156
>>> 3.14ea  active+clean+snaptrim_wait  3m  [156,118,103]p156
>>> 3.1688  active+clean+snaptrim_wait  3m  [156,221,98]p156
>>> 3.1848  active+clean+snaptrim_wait  3m  [156,268,68]p156
>>> 3.18ca  active+clean+snaptrim_wait  3m  [156,39,94]p156
>>> 3.1b6e  active+clean+snaptrim_wait  3m  [156,274,31]p156
>>> 3.1e27  active+clean+snaptrim_wait  3m  [156,212,134]p156
>>> 3.1e74  active+clean+snaptrim       3m  [156,96,12]p156
>>> 3.1f11  active+clean+snaptrim_wait  3m  [156,253,120]p156
>>> 3.1f7b  active+clean+snaptrim_wait  3m  [156,190,283]p156
>>>
>>> # date; ceph pg ls| grep snaptrim | awk '{print $1, $11, $12, $15}' | column -t
>>> Tue Apr 29 02:47:38 PM UTC 2025
>>> 3.1b    active+clean+snaptrim_wait  2m  [156,115,48]p156
>>> 3.2f6   active+clean+snaptrim_wait  2m  [156,117,32]p156
>>> 3.65d   active+clean+snaptrim_wait  2m  [156,263,9]p156
>>> 3.721   active+clean+snaptrim_wait  2m  [156,23,190]p156
>>> 3.849   active+clean+snaptrim_wait  2m  [156,135,31]p156
>>> 3.abb   active+clean+snaptrim_wait  2m  [156,70,262]p156
>>> 3.c1c   active+clean+snaptrim_wait  2m  [156,111,11]p156
>>> 3.f83   active+clean+snaptrim_wait  2m  [156,119,97]p156
>>> 3.f8a   active+clean+snaptrim_wait  2m  [156,266,281]p156
>>> 3.103f  active+clean+snaptrim_wait  2m  [156,262,215]p156
>>> 3.11e9  active+clean+snaptrim_wait  2m  [156,135,280]p156
>>> 3.14ea  active+clean+snaptrim_wait  2m  [156,118,103]p156
>>> 3.1688  active+clean+snaptrim_wait  2m  [156,221,98]p156
>>> 3.1848  active+clean+snaptrim_wait  2m  [156,268,68]p156
>>> 3.18ca  active+clean+snaptrim_wait  2m  [156,39,94]p156
>>> 3.1b6e  active+clean+snaptrim_wait  2m  [156,274,31]p156
>>> 3.1e27  active+clean+snaptrim_wait  2m  [156,212,134]p156
>>> 3.1e74  active+clean+snaptrim       2m  [156,96,12]p156
>>> 3.1f11  active+clean+snaptrim_wait  2m  [156,253,120]p156
>>> 3.1f7b  active+clean+snaptrim_wait  2m  [156,190,283]p156
>>>
>>> # date; ceph pg ls| grep snaptrim | awk '{print $1, $11, $12, $15}' | column -t
>>> Tue Apr 29 04:24:55 PM UTC 2025
>>> 3.1b    active+clean+snaptrim_wait  55s  [156,115,48]p156
>>> 3.2f6   active+clean+snaptrim_wait  55s  [156,117,32]p156
>>> 3.65d   active+clean+snaptrim_wait  55s  [156,263,9]p156
>>> 3.721   active+clean+snaptrim_wait  55s  [156,23,190]p156
>>> 3.849   active+clean+snaptrim_wait  55s  [156,135,31]p156
>>> 3.abb   active+clean+snaptrim_wait  55s  [156,70,262]p156
>>> 3.c1c   active+clean+snaptrim       34s  [156,111,11]p156
>>> 3.f83   active+clean+snaptrim_wait  55s  [156,119,97]p156
>>> 3.f8a   active+clean+snaptrim_wait  55s  [156,266,281]p156
>>> 3.103f  active+clean+snaptrim_wait  55s  [156,262,215]p156
>>> 3.11e9  active+clean+snaptrim_wait  55s  [156,135,280]p156
>>> 3.14ea  active+clean+snaptrim_wait  55s  [156,118,103]p156
>>> 3.1688  active+clean+snaptrim_wait  55s  [156,221,98]p156
>>> 3.1848  active+clean+snaptrim_wait  55s  [156,268,68]p156
>>> 3.18ca  active+clean+snaptrim_wait  55s  [156,39,94]p156
>>> 3.1b6e  active+clean+snaptrim_wait  55s  [156,274,31]p156
>>> 3.1e27  active+clean+snaptrim_wait  55s  [156,212,134]p156
>>> 3.1e74  active+clean+snaptrim_wait  34s  [156,96,12]p156
>>> 3.1f11  active+clean+snaptrim_wait  55s  [156,253,120]p156
>>> 3.1f7b  active+clean+snaptrim_wait  55s  [156,190,283]p156
>>>
>>> # date; ceph pg ls| grep snaptrim | awk '{print $1, $11, $12, $15}' | column -t
>>> Mon May 5 06:32:53 AM UTC 2025
>>> 3.c1c  active+clean+snaptrim  41h  [165,111,25]p165

--
The self-help group "UTF-8 problems" will, as an exception, meet in the large hall this time.