[ceph-users] snaptrim not making progress

Frank Schilder Mon, 29 Jul 2024 01:24:40 -0700

Hi all,

Our cluster runs the latest Octopus release, and we seem to have a problem with 
snaptrim. On a pool for HDD RBD images I observed today that all PGs are in 
state snaptrim or snaptrim_wait, yet the snaptrim process does not appear to 
make any progress. The OSDs show no CPU activity that would indicate 
snaptrimming (usually they use at least 50% CPU, as shown in top). I also 
don't see anything in the OSD logs.
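
To make the observation above reproducible, here is a minimal sketch that 
tallies PG states for a pool. It assumes the JSON was produced by 
"ceph pg ls-by-pool <pool> --format json" and that each PG entry carries a 
"state" string such as "active+clean+snaptrim_wait". The function names and the 
handling of the two JSON layouts (bare list versus a "pg_stats" wrapper) are 
assumptions for illustration, not something taken from the cluster in question.

```python
import json
from collections import Counter


def load_pg_stats(raw_json):
    """Parse PG listing JSON into a list of per-PG dicts.

    Assumption: the output is either a bare list of PG entries or an
    object that wraps the list under the key "pg_stats".
    """
    data = json.loads(raw_json)
    return data["pg_stats"] if isinstance(data, dict) else data


def count_snaptrim_states(pg_stats):
    """Count PGs that are trimming, waiting to trim, or doing neither."""
    counts = Counter()
    for pg in pg_stats:
        # A PG state is a '+'-joined set of flags, e.g. "active+clean+snaptrim".
        flags = pg["state"].split("+")
        if "snaptrim" in flags:
            counts["snaptrim"] += 1
        elif "snaptrim_wait" in flags:
            counts["snaptrim_wait"] += 1
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    # Hand-written sample data, not real cluster output.
    sample = json.dumps({"pg_stats": [
        {"pgid": "1.0", "state": "active+clean+snaptrim"},
        {"pgid": "1.1", "state": "active+clean+snaptrim_wait"},
        {"pgid": "1.2", "state": "active+clean"},
    ]})
    print(dict(count_snaptrim_states(load_pg_stats(sample))))
```

Running the tally twice a few minutes apart would show whether any PG ever 
leaves snaptrim or snaptrim_wait; unchanged counts would be consistent with 
trimming being stalled, though they do not prove it.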


For our VMs we run a daily snapshot rotation, and snaptrim usually finishes 
within a few minutes. A VM with disks on that pool caused an error due to a 
hanging virsh domfsfreeze command. This is routine, however: we see it every 
now and then without any follow-up issues. I'm now wondering whether we hit a 
race condition for the first time. Is there anything on an RBD image or pool 
that could block snaptrim from starting or progressing?

Thanks for any pointers!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14