Hi, just pinging to check whether this issue has been understood yet.
Cheers,
Dan

On Mon, Apr 12, 2021 at 9:12 PM Jonas Jelten <jel...@in.tum.de> wrote:
>
> Hi Igor!
>
> I have plenty of OSDs to lose, as long as the recovery works well
> afterward, so I can go ahead with it :D
>
> What debug flags should I activate? osd=10, bluefs=20, bluestore=20,
> rocksdb=10, ...? [a command sketch for this is appended below the
> thread]
>
> I'm not sure it's really the transaction size, since the broken
> WriteBatch is dumped, and the command index is out of range (that's the
> WriteBatch tag). I don't see why the transaction size would result in
> such a corruption - my naive reading of the RocksDB sources suggests
> that 14851 repairs shouldn't overflow the 32-bit WriteBatch entry
> counter, but who knows. [the WriteBatch wire format is sketched below
> the thread]
>
> Are RocksDB keys like this normal? If yes, what's the construction
> logic? The pool is called 'dumpsite'. [a best-effort key decoder is
> appended below the thread]
>
> 0x80800000000000000a194027'Rdumpsite!rbd_data.6.28423ad8f48ca1.0000000001b366ff!='0xfffffffffffffffeffffffffffffffff'o'
> 0x80800000000000000a1940f69264756d'psite!rbd_data.6.28423ad8f48ca1.00000000011bdd0c!='0xfffffffffffffffeffffffffffffffff'o'
>
> -- Jonas
>
> On 12/04/2021 16.54, Igor Fedotov wrote:
> > Sorry for being late to the party...
> >
> > I think the root cause is related to the high number of repairs made
> > during the first post-upgrade fsck run.
> >
> > The check (and fix) for zombie spanning blobs was backported to
> > v15.2.9 (here is the PR: https://github.com/ceph/ceph/pull/39256),
> > and I presume it's the one that causes the BlueFS data corruption,
> > due to the huge transaction issued during such a repair.
> >
> > I haven't seen this exact issue (having that many zombie blobs is a
> > rare bug in itself), but we had a somewhat similar issue with
> > upgrading omap names, see https://github.com/ceph/ceph/pull/39377.
> >
> > The resulting huge transaction could cause too big a write to the
> > WAL, which in turn caused data corruption (see
> > https://github.com/ceph/ceph/pull/39701).
> >
> > Although the fix for the latter was merged for 15.2.10, some
> > additional issues with huge transactions might still exist...
> >
> > If someone can afford another OSD loss, it would be interesting to
> > get an OSD log for such a repair with debug-bluefs set to 20...
> >
> > I'm planning to make a fix that caps the transaction size for
> > repairs in the near future anyway, though...
> >
> > Thanks,
> >
> > Igor

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
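
For Jonas's debug-flag question above: one plausible way to capture the
log Igor asked for is to raise the relevant debug levels before the OSD
runs its fsck/repair on startup, or to run the repair offline. This is
a sketch only; the OSD id (osd.7) and paths are placeholders, and the
exact ceph-bluestore-tool flags should be checked against --help on the
installed version.

    # raise debug levels before restarting the OSD that will run the repair
    ceph config set osd.7 debug_osd       10/10
    ceph config set osd.7 debug_bluefs    20/20
    ceph config set osd.7 debug_bluestore 20/20
    ceph config set osd.7 debug_rocksdb   10/10

    # or run the repair offline against the stopped OSD, logging to a file
    systemctl stop ceph-osd@7
    ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-7 \
        --log-file /tmp/osd.7-repair.log --log-level 20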
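
On the "WriteBatch tag" point: a minimal sketch of the WriteBatch wire
format, based on a reading of RocksDB's db/write_batch.cc; the offsets
and tag values are assumptions to be checked against the RocksDB
version that Ceph bundles.

    import struct

    # A serialized RocksDB WriteBatch starts with a 12-byte header:
    #   fixed64 sequence | fixed32 entry count   (both little-endian)
    # followed by one record per entry, each beginning with a 1-byte tag
    # (kTypeDeletion=0x00, kTypeValue=0x01, kTypeMerge=0x02, ...) and
    # varint32-length-prefixed key/value strings.
    def parse_writebatch_header(rep: bytes):
        """Return (sequence, entry_count, body) of a serialized WriteBatch."""
        if len(rep) < 12:
            raise ValueError("WriteBatch shorter than its 12-byte header")
        seq, count = struct.unpack_from("<QI", rep, 0)
        return seq, count, rep[12:]

Read this way, a "tag out of range" error during replay means some
record's 1-byte tag is invalid, i.e. the batch body itself is corrupt;
it supports Jonas's point that 14851 entries is nowhere near the limit
of the 32-bit count field.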
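
And for the key-construction question: a best-effort decoder for the
fixed-width prefix of a BlueStore onode key, based on a reading of
get_object_key() in src/os/bluestore/BlueStore.cc (Octopus). The field
widths, bias constants, and escaping rules are assumptions, not a
definitive spec.

    import struct

    # Assumed onode key layout:
    #   u8   shard id + 0x80 bias (0x7f would encode NO_SHARD = -1)
    #   u64  pool id + 2^63, big-endian (0x800000000000000a -> pool 10)
    #   u32  bit-reversed object hash, big-endian
    #   escaped namespace, '!'-terminated
    #   escaped object name, '!'-terminated, then '=' when the locator
    #        key equals the name
    #   u64  snap id    (0xfffffffffffffffe == CEPH_NOSNAP, i.e. head)
    #   u64  generation (0xffffffffffffffff == no generation)
    #   'o'  onode suffix
    def split_onode_key(key: bytes):
        """Split the fixed-width prefix off a raw BlueStore onode key."""
        shard = key[0] - 0x80
        pool = struct.unpack_from(">Q", key, 1)[0] - 2**63
        rev_hash = struct.unpack_from(">I", key, 9)[0]
        return shard, pool, rev_hash, key[13:]

Decoded under these assumptions, both dumped keys give pool 10 with
'dumpsite' sitting in the namespace slot rather than in the object
name; whether that is expected for these images or is itself evidence
of corruption is exactly the open question.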