>>To my surprise however these slow requests caused aborts from the block 
>>device on the VM side, which ended up corrupting files

This is very strange; you shouldn't see corruption.
Do you use writeback caching? If so, have you disabled barriers on your filesystem?

(What is the qemu version? The guest OS? The guest OS kernel?)
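
To answer the barrier question from inside the guest, something like the following could help. It is only a minimal sketch, assuming a Linux guest and assuming that disabled barriers appear in the mount options as `nobarrier` or `barrier=0` (the spellings used by ext3/ext4 and XFS); the function name is hypothetical.

```python
def barrier_disabled_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs whose mount options disable barriers.

    mounts_text is expected to be in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        # Assumption: barriers are off if either spelling is present.
        if set(options.split(",")) & {"nobarrier", "barrier=0"}:
            flagged.append((mount_point, fs_type))
    return flagged
```

Passing it the contents of /proc/mounts read inside the guest lists any filesystems mounted without barriers; an empty list means neither option was found.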



----- Original Message -----
From: "Krzysztof Nowicki" <krzysztof.a.nowi...@gmail.com>
To: "ceph-users" <ceph-users@lists.ceph.com>
Sent: Friday, 6 February 2015 10:16:30
Subject: [ceph-users] OSD slow requests causing disk aborts in KVM

Hi all, 
I'm running a small Ceph cluster with 4 OSD nodes, which serves as a storage 
backend for a set of KVM virtual machines. The VMs use RBD for disk storage. On 
the VM side I'm using virtio-scsi instead of virtio-blk in order to gain 
DISCARD support. 

Each OSD node runs on a separate machine, using a 3TB WD Black drive plus a 
Samsung SSD for the journal. The machines used for OSD nodes are not equal in 
spec. Three of them are small servers, while one is a desktop PC. The last 
node is the one causing trouble. During high load caused by remapping after one 
of the other nodes went down, I experienced some slow requests. To my surprise 
however these slow requests caused aborts from the block device on the VM side, 
which ended up corrupting files. 

What I wonder is whether such behaviour (aborts) is normal when slow requests 
pile up. I always thought that these requests would be delayed but eventually 
handled. Are there any tunables that would help me avoid such situations? I 
would really like to avoid VM outages caused by such corruption issues. 

I can attach some logs if needed. 

Best regards 
Chris 

_______________________________________________ 
ceph-users mailing list 
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
