Hi Stefan, we simply disabled exclusive-lock on all older (pre-Jewel) images. We still allow the default Jewel feature set for newly created images because, as you mention, the issue does not seem to affect them.
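In case it is useful, here is a minimal sketch of how the disable step could be scripted. It only builds the `rbd feature disable` command lines and does not run anything. The function name and the image spec `rbd/vm-disk-1` are made up for illustration, and the ordering assumes the usual dependency chain (fast-diff needs object-map, object-map needs exclusive-lock), so dependents are turned off first. Which images count as pre-Jewel is left to you; nothing here detects that.

```python
# Features in the order they should be disabled: fast-diff depends on
# object-map, and object-map depends on exclusive-lock.
DISABLE_ORDER = ["fast-diff", "object-map", "exclusive-lock"]


def disable_commands(image_spec, enabled_features):
    """Return one `rbd feature disable` argv per enabled feature, dependents first.

    image_spec is "pool/image"; enabled_features is the feature list as
    reported for that image (for example by `rbd info`).
    """
    enabled = set(enabled_features)
    return [
        ["rbd", "feature", "disable", image_spec, feature]
        for feature in DISABLE_ORDER
        if feature in enabled
    ]


if __name__ == "__main__":
    # Hypothetical image and feature list, for illustration only.
    features = ["layering", "exclusive-lock", "object-map", "fast-diff"]
    for argv in disable_commands("rbd/vm-disk-1", features):
        print(" ".join(argv))
```

Review the printed commands before running them by hand or passing them to `subprocess.run`. Features that are not enabled on an image are skipped, so the same call is safe on images that only have some of the three.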

On Thu, May 4, 2017 at 10:19 AM, Stefan Priebe - Profihost AG <s.pri...@profihost.ag> wrote:

> Hello Brian,
>
> this really sounds the same. I don't see this on a cluster with only
> images created AFTER Jewel. And it seems to have started happening after I
> enabled exclusive-lock on all images.
>
> Did you just use feature disable, exclusive-lock,fast-diff,object-map or did
> you also restart all those VMs?
>
> Greets,
> Stefan
>
> On 04.05.2017 at 19:11, Brian Andrus wrote:
> > Sounds familiar... and discussed in "disk timeouts in libvirt/qemu VMs..."
> >
> > We have not had this issue since reverting exclusive-lock, but it was
> > suggested this was not the issue. So far it's held up for us with not a
> > single corrupt filesystem since then.
> >
> > On some images (ones created post-Jewel upgrade) the feature could not
> > be disabled, but these don't seem to be affected. Of course, we never
> > did pinpoint the cause of the timeouts, so it's entirely possible something
> > else was causing it, but no other major changes went into effect.
> >
> > One thing to look for that might confirm the same issue is timeouts in
> > the guest VM. Most OS kernels will report a hung task in conjunction with
> > the hang up/lock/corruption. Wondering if you're seeing that too.
> >
> > On Wed, May 3, 2017 at 10:49 PM, Stefan Priebe - Profihost AG
> > <s.pri...@profihost.ag> wrote:
> >
> > > Hello,
> > >
> > > since we upgraded from Hammer to Jewel 10.2.7 and enabled
> > > exclusive-lock,object-map,fast-diff we've had problems with corrupted VM
> > > filesystems.
> > >
> > > Sometimes the VMs just crash with FS errors and a restart can
> > > solve the problem. Sometimes the whole VM is not even bootable and we
> > > need to import a backup.
> > >
> > > All of them have the same problem that you can't revert to an older
> > > snapshot. The rbd command just hangs at 99% forever.
> > >
> > > Is this a known issue? Anything we can check?
> > >
> > > Greets,
> > > Stefan

--
Brian Andrus | Cloud Systems Engineer | DreamHost
brian.and...@dreamhost.com | www.dreamhost.com
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com