On Thu Feb 12 2015 at 16:23:38, Andrey Korolyov <and...@xdel.ru> wrote:
> On Fri, Feb 6, 2015 at 12:16 PM, Krzysztof Nowicki
> <krzysztof.a.nowi...@gmail.com> wrote:
> >
> > Hi all,
> >
> > I'm running a small Ceph cluster with 4 OSD nodes, which serves as a
> > storage backend for a set of KVM virtual machines. The VMs use RBD for
> > disk storage. On the VM side I'm using virtio-scsi instead of virtio-blk
> > in order to gain DISCARD support.
> >
> > Each OSD node runs on a separate machine, using a 3TB WD Black drive
> > plus a Samsung SSD for the journal. The machines used for the OSD nodes
> > are not equal in spec: three of them are small servers, while one is a
> > desktop PC. The last node is the one causing trouble. During high load
> > caused by remapping, after one of the other nodes went down, I
> > experienced some slow requests. To my surprise, these slow requests
> > caused aborts from the block device on the VM side, which ended up
> > corrupting files.
> >
> > What I wonder is whether such behaviour (aborts) is normal when slow
> > requests pile up. I always thought these requests would be delayed but
> > eventually handled. Are there any tunables that would help me avoid
> > such situations? I would really like to avoid VM outages caused by this
> > kind of corruption.
> >
> > I can attach some logs if needed.
> >
> > Best regards
> > Chris
>
> Hi, this is the inevitable payoff for using the SCSI backend on storage
> that can get slow enough. There were some argonaut/bobtail-era discussions
> on the Ceph ML; those threads may be interesting reading for you. AFAIR
> the SCSI disk will abort after 70s of not receiving an ack for a pending
> operation.

Can this timeout be increased in some way? I've searched around and found
the /sys/block/sdX/device/timeout knob, which in my case is set to 30s.

As for versions, I'm running all Ceph nodes on Gentoo with Ceph 0.80.5. The
VM guest in question is running Ubuntu 12.04 LTS with kernel 3.13. The
guest filesystem is BTRFS, so I'm also wondering whether the corruption may
be a BTRFS bug.
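In case it helps anyone hitting the same aborts, here is a minimal sketch
of bumping that sysfs knob from inside the guest. The device name ("sda")
and the 180s value are placeholder assumptions, not recommendations, and
the change does not survive a reboot (a udev rule or init script would be
needed to make it persistent):

#!/usr/bin/env python3
# Sketch: raise the SCSI command timeout for a guest block device so that
# transient slow requests on the RBD backend are less likely to end in an
# abort. Run as root inside the guest. Device name and timeout value are
# placeholders; adjust for the actual disk and workload.
import sys
from pathlib import Path

def set_scsi_timeout(device: str = "sda", seconds: int = 180) -> None:
    """Write the new timeout (in seconds) to /sys/block/<device>/device/timeout."""
    knob = Path(f"/sys/block/{device}/device/timeout")
    if not knob.exists():
        sys.exit(f"{knob} not found - is {device} really a SCSI disk?")
    knob.write_text(f"{seconds}\n")
    print(f"{device}: timeout is now {knob.read_text().strip()}s")

if __name__ == "__main__":
    set_scsi_timeout()

The same effect can of course be had with a plain write to the sysfs file;
the point is only that the per-disk timeout is what governs when the guest
gives up and sends an abort.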