On 17-12-14 05:31 PM, David Turner wrote:
I've tracked this in a much more manual way.  I would grab a random subset [..]

This was all on a Hammer cluster.  The change that moved the snap trimming queues into the main OSD thread made our use case unviable on Jewel until later Jewel changes that landed after I left.  It's exciting that this will actually be a reportable value from the cluster.

Sorry that this story doesn't really answer your question, except to say that people aware of this problem likely have a workaround for it.  However, I'm certain that far more clusters are impacted by this than are aware of it, and being able to see it quickly would be a real help when troubleshooting.  Backporting would be nice.  I run a few Jewel clusters that host some VMs, and it would be nice to see how well they handle snap trimming, but they do far less snapshotting, so it's less critical for them.

Thanks for your response; it pretty much confirms what I thought:
- users aware of the issue have their own hacks (a rough sketch of one possible approach is below) that don't need to be efficient or convenient;
- users unaware of the issue are, well, unaware, and at risk of serious service disruption once disk space is all used up.
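For context, a minimal sketch of what such a manual check might look like (purely illustrative, not David's actual procedure): it assumes the `ceph` CLI is in PATH, that `ceph pg <pgid> query` exposes a snap_trimq field, and that the JSON layout matches your release; adjust the field lookups as needed.

#!/usr/bin/env python
# Purely illustrative: sample a few random PGs and print their snap trim queues.
# Assumes the `ceph` CLI is available and that `ceph pg <pgid> query` exposes a
# "snap_trimq" field; the location and format of that field vary by release.

import json
import random
import subprocess

SAMPLE_SIZE = 20

def ceph_json(*args):
    """Run a ceph command with JSON output and parse the result."""
    out = subprocess.check_output(["ceph"] + list(args) + ["--format", "json"])
    return json.loads(out)

# Collect all PG ids; the dump layout differs between releases, so be defensive.
dump = ceph_json("pg", "dump", "pgs_brief")
pg_stats = dump if isinstance(dump, list) else dump.get("pg_stats", [])
pgids = [p["pgid"] for p in pg_stats]

for pgid in random.sample(pgids, min(SAMPLE_SIZE, len(pgids))):
    q = ceph_json("pg", pgid, "query")
    # snap_trimq is usually an interval-set string such as "[]" or "[5~3]".
    trimq = q.get("snap_trimq", q.get("info", {}).get("snap_trimq", "n/a"))
    print("%s snap_trimq=%s" % (pgid, trimq))

Sampling only a subset of PGs keeps the query load negligible while still giving a rough picture of how far behind trimming is, which is presumably why a random subset is enough for this kind of hack.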

Hopefully it'll be convincing enough for devs. ;)

--
Piotr Dałek
piotr.da...@corp.ovh.com
https://www.ovh.com/us/
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
