Hi Özkan,

in our case, we tried online compaction first, and it resolved the issue
completely. I first tested with a single OSD daemon (i.e. online compaction of
only that single OSD) and checked that the load of that daemon went down
significantly
(that was while snaptrims with a high sleep value were still going on).
Then I went through the remaining OSDs in batches of 10 % of the cluster, and
each batch finished rather fast (a few minutes), so I could actually do it
without any downtime.
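
If it helps, this is roughly how such a batched online compaction could be
scripted. The OSD IDs and the batch size below are just placeholders, not the
exact values we used:

   # compact a single OSD online first to verify the effect (osd.0 as example):
   ceph tell osd.0 compact

   # then compact one batch of OSDs in parallel (IDs 10-19 as an example batch):
   for A in {10..19}; do
      ceph tell osd.$A compact | sed 's/^/osd.'$A': /' &
   done
   wait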

In older threads on this list, snaptrim issues which seemed similar (but were
not clearly related to an upgrade) required heavier operations (either offline
compaction or OSD recreation).
Since online compaction is comparatively "cheap", I'd always try it first. In
my case, each OSD took less than 2-3 minutes, but of course your mileage may
vary.
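
Just in case you do go the offline route instead: we did not need it ourselves,
but as far as I know the usual way is to stop the OSD and compact its RocksDB
with ceph-kvstore-tool, roughly like this (osd.0 and the default data path are
only examples):

   # stop the OSD, compact its RocksDB offline, then start it again:
   systemctl stop ceph-osd@0
   ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 compact
   systemctl start ceph-osd@0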

Cheers,
        Oliver

On 23.08.24 at 17:42, Özkan Göksu wrote:
Hello Oliver.

Thank you so much for the answer!

I was thinking of re-creating the OSDs, but if you are sure compaction is the
solution here, then it's worth a try.
I'm planning to shut down all the VMs, and when the cluster is safe I will try
OSD compaction.
May I ask: did you do online or offline compaction?

I have two sides, so I can shut down one entire rack, do the offline
compaction there, and then do the same on the other side when it's done.
What do you think?

Regards.

Oliver Freyermuth <freyerm...@physik.uni-bonn.de> wrote on Fri, 23 Aug 2024 at 18:06:

    Hi Özkan,

    FWIW, we observed something similar after upgrading from Mimic => Nautilus
=> Octopus and then starting to trim snapshots.

    The size of our cluster was a bit smaller, but the effect was the same: 
When snapshot trimming started, OSDs went into high load and RBD I/O was 
extremely slow.

    We tried to use:
       ceph tell osd.* injectargs '--osd-snap-trim-sleep 10'
    first, which helped, but of course snapshots kept piling up.

    Finally, we performed only RocksDB compactions via:

       for A in {0..5}; do ceph tell osd.$A compact | sed 's/^/'$A': /' & done

    for some batches of OSDs, and their load went down heavily. Once we'd
churned through all OSDs, I/O load was low again, and we could go back to the
default:
       ceph tell osd.* injectargs '--osd-snap-trim-sleep 0'

    After this, the situation stabilized for us. So my guess would be that the
RocksDBs grew too much after the OMAP format conversion and the compaction
shrank them again.

    Maybe that also helps in your case?

    Interestingly, we did not observe this on other clusters (one mainly for 
CephFS, another one with mirrored RBD volumes), which took the same upgrade 
path.

    Cheers,
             Oliver

    On 23.08.24 at 16:46, Özkan Göksu wrote:
     > Hello folks.
     >
     > We have a ceph cluster and we have 2000+ RBD drives on 20 nodes.
     >
     > We upgraded the cluster from 14.2.16 to 15.2.14 and after the upgrade we
     > started to see snap trim issues.
     > Without the "nosnaptrim" flag, the system is not usable right now.
     >
     > I think the problem is because of the omap conversion during the Octopus upgrade.
     >
     > Note that the first time each OSD starts, it will do a format conversion to
     > improve the accounting for “omap” data. This may take a few minutes to as
     > much as a few hours (for an HDD with lots of omap data). You can disable
     > this automatic conversion with:
     >
     > What should I do to solve this problem?
     >
     > Thanks.
     > _______________________________________________
     > ceph-users mailing list -- ceph-users@ceph.io
     > To unsubscribe send an email to ceph-users-le...@ceph.io

--
Oliver Freyermuth
Universität Bonn
Physikalisches Institut, Raum 1.047
Nußallee 12
53115 Bonn
--
Tel.: +49 228 73 2367
Fax:  +49 228 73 7869
--

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io