Well, sadly, that setting doesn’t seem to resolve the issue. I set the value in ceph.conf for the OSDs with small WAL/DB devices that keep running into the issue, roughly as sketched below.
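For reference, the stanza looks roughly like this; the plain [osd] form is the one from the earlier message quoted below, and an [osd.<id>] section scopes it to a single OSD (osd.12 is just an example ID):

> [osd]
> bluestore_volume_selection_policy = rocksdb_original
>
> # or, scoped to a single OSD (example ID)
> [osd.12]
> bluestore_volume_selection_policy = rocksdb_original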
The running OSD shows the setting applied, yet osd.12 has hit the same assert again:

> $ ceph tell osd.12 config show | grep bluestore_volume_selection_policy
>     "bluestore_volume_selection_policy": "rocksdb_original",
> $ ceph crash info 2024-01-10T16:39:05.925534Z_f0c57ca3-b7e6-4511-b7ae-5834541d6c67 | egrep "(assert_condition|entity_name)"
>     "assert_condition": "cur >= p.length",
>     "entity_name": "osd.12",
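(In case it is useful to anyone else chasing this, the same runtime check can be looped over the OSDs in question; the OSD IDs below are just examples, and `ceph tell` only answers while the OSD is up:)

> $ for osd in 0 12 ; do echo "== osd.$osd ==" ; ceph tell osd.$osd config show | grep bluestore_volume_selection_policy ; done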
So, I guess that configuration item doesn’t in fact prevent the crash as was purported. Looks like I may need to fast-track moving to quincy…

Reed

> On Jan 8, 2024, at 9:47 AM, Reed Dier <reed.d...@focusvq.com> wrote:
>
> I ended up setting it in ceph.conf which appears to have worked (as far as I can tell).
>
>> [osd]
>> bluestore_volume_selection_policy = rocksdb_original
>
>> $ ceph config show osd.0 | grep bluestore_volume_selection_policy
>> bluestore_volume_selection_policy   rocksdb_original   file   (mon[rocksdb_original])
>
> So far so good…
>
> Reed
>
>> On Jan 8, 2024, at 2:04 AM, Eugen Block <ebl...@nde.ag> wrote:
>>
>> Hi,
>>
>> I just did the same in my lab environment and the config got applied to the daemon after a restart:
>>
>> pacific:~ # ceph tell osd.0 config show | grep bluestore_volume_selection_policy
>>     "bluestore_volume_selection_policy": "rocksdb_original",
>>
>> This is also a (tiny single-node) cluster running 16.2.14. Maybe you have some typo or something while doing the loop? Have you tried to set it for one OSD only and see if it starts with the config set?
>>
>> Zitat von Reed Dier <reed.d...@focusvq.com>:
>>
>>> After ~3 uneventful weeks after upgrading from 15.2.17 to 16.2.14 I’ve started seeing OSD crashes with "cur >= fnode.size" and "cur >= p.length", which seems to be resolved in the next point release for pacific later this month, but until then, I’d love to keep the OSDs from flapping.
>>>
>>>> $ for crash in $(ceph crash ls | grep osd | awk '{print $1}') ; do ceph crash info $crash | egrep "(assert_condition|crash_id)" ; done
>>>>     "assert_condition": "cur >= fnode.size",
>>>>     "crash_id": "2024-01-03T09:07:55.698213Z_348af2d3-d4a7-4c27-9f71-70e6dc7c1af7",
>>>>     "assert_condition": "cur >= p.length",
>>>>     "crash_id": "2024-01-03T14:21:55.794692Z_4557c416-ffca-4165-aa91-d63698d41454",
>>>>     "assert_condition": "cur >= fnode.size",
>>>>     "crash_id": "2024-01-03T22:53:43.010010Z_15dc2b2a-30fb-4355-84b9-2f9560f08ea7",
>>>>     "assert_condition": "cur >= p.length",
>>>>     "crash_id": "2024-01-04T02:34:34.408976Z_2954a2c2-25d2-478e-92ad-d79c42d3ba43",
>>>>     "assert_condition": "cur2 >= p.length",
>>>>     "crash_id": "2024-01-04T21:57:07.100877Z_12f89c2c-4209-4f5a-b243-f0445ba629d2",
>>>>     "assert_condition": "cur >= p.length",
>>>>     "crash_id": "2024-01-05T00:35:08.561753Z_a189d967-ab02-4c61-bf68-1229222fd259",
>>>>     "assert_condition": "cur >= fnode.size",
>>>>     "crash_id": "2024-01-05T04:11:48.625086Z_a598cbaf-2c4f-4824-9939-1271eeba13ea",
>>>>     "assert_condition": "cur >= p.length",
>>>>     "crash_id": "2024-01-05T13:49:34.911210Z_953e38b9-8ae4-4cfe-8f22-d4b7cdf65cea",
>>>>     "assert_condition": "cur >= p.length",
>>>>     "crash_id": "2024-01-05T13:54:25.732770Z_4924b1c0-309c-4471-8c5d-c3aaea49166c",
>>>>     "assert_condition": "cur >= p.length",
>>>>     "crash_id": "2024-01-05T16:35:16.485416Z_0bca3d2a-2451-4275-a049-a65c58c1aff1",
>>>
>>> As noted in https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/
>>>
>>>> You can apparently work around the issue by setting 'bluestore_volume_selection_policy' config parameter to rocksdb_original.
>>>
>>> However, after trying to set that parameter with `ceph config set osd.$osd bluestore_volume_selection_policy rocksdb_original` it doesn’t appear to take effect?
>>>
>>>> $ ceph config show-with-defaults osd.0 | grep bluestore_volume_selection_policy
>>>> bluestore_volume_selection_policy   use_some_extra
>>>
>>>> $ ceph config set osd.0 bluestore_volume_selection_policy rocksdb_original
>>>> $ ceph config show osd.0 | grep bluestore_volume_selection_policy
>>>> bluestore_volume_selection_policy   use_some_extra   default   mon
>>>
>>> This, I assume, should reflect the new setting, however it still shows the default "use_some_extra" value.
>>>
>>> But then this seems to imply that the config is set?
>>>
>>>> $ ceph config dump | grep bluestore_volume_selection_policy
>>>> osd.0   dev   bluestore_volume_selection_policy   rocksdb_original   *
>>>> [snip]
>>>> osd.9   dev   bluestore_volume_selection_policy   rocksdb_original   *
>>>
>>> Does this need to be set in ceph.conf, or is there another setting that also needs to be set?
>>> Even after bouncing the OSD daemon, `ceph config show` still reports "use_some_extra".
>>>
>>> Appreciate any help anyone can offer to point me toward bridging the gap between now and the next point release.
>>>
>>> Thanks,
>>> Reed

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io