I've noticed a pretty steep performance degradation when using RBDs with
LIO. I've tried a multitude of configurations to see whether any of them
improve things, and I've only found a few that work (sort of).

Details about the systems being used:

 - All data-path networking is 10GbE; there is some management traffic on
1GbE, but I can confirm it isn't carrying data (perf & bwm-ng show this)
 - Ceph version 0.80.5
 - 20GB RBD (for our test; prod will be much larger, but size doesn't seem
to matter)
 - LIO version 4.1.0, RisingTide
 - Initiator is another Linux system (I've also used ESXi, with no
difference)
 - We have 8 OSD nodes, each with 8x 2TB OSDs (64 OSDs total)
   * 4 nodes are in one rack and 4 in another; the CRUSH map has been
configured to reflect this
   * All OSD nodes are running CentOS 6.5
 - 2 gateway nodes on HP ProLiant blades (I've only been using one for
testing, but the problem exists on both)
   * Both gateway nodes are running CentOS 7

I've tested a multitude of things, mainly to see where the issue lies.

 - The performance of the RBD as a target using LIO (the basic setup is
sketched right after this list)
 - The performance of the RBD itself (no iSCSI or LIO)
 - LIO performance by using a ramdisk as a target (no RBD involved)
 - Setting the RBD up with LVM, then using a logical volume from that as a
target with LIO
 - Setting the RBD up in RAID0 & RAID1 (single disk, using mdadm), then
using that volume as a target with LIO
 - Mounting the RBD as ext4, then using a disk image and fileio as a target
 - Mounting the RBD as ext4, then using a disk image as a loop device and
blockio as a target
 - Setting the RBD up as a loop device, then setting that up as a target
with LIO
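
For reference, the basic RBD-as-target setup is nothing exotic; on the
gateway it's roughly the following, with krbd doing the mapping
(pool/image names, IQN and IP are placeholders, and the exact targetcli
syntax may differ a bit between the RisingTide and -fb versions):

    # map the RBD with the kernel client (check 'rbd showmapped' for the
    # resulting device, e.g. /dev/rbd0)
    rbd map iscsi-pool/test-image

    # export the mapped device as a block (iblock) backstore
    targetcli /backstores/block create name=rbd-test dev=/dev/rbd0

    # create the iSCSI target, LUN, portal and an ACL for the initiator
    targetcli /iscsi create iqn.2014-10.com.example:rbd-test
    targetcli /iscsi/iqn.2014-10.com.example:rbd-test/tpg1/luns \
        create /backstores/block/rbd-test
    targetcli /iscsi/iqn.2014-10.com.example:rbd-test/tpg1/portals \
        create 10.0.0.1
    targetcli /iscsi/iqn.2014-10.com.example:rbd-test/tpg1/acls \
        create iqn.2014-10.com.example:initiator1

The LVM and mdadm variants above are the same thing with the extra layer
created between the 'rbd map' and the backstore create.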

 - Configurations that tested with poor performance (reads ~25-50 MB/s,
writes ~25-50 MB/s)
   * RBD setup as target using LIO
   * RBD -> LVM -> LIO target
   * RBD -> RAID0/1 -> LIO target
 - Configurations that tested with good performance (reads ~700-800 MB/s,
writes ~400-700 MB/s)
   * RBD on local system, no iSCSI
   * Ramdisk (No RBD) -> LIO target
   * RBD -> Mounted ext4 -> disk image -> LIO fileio target
   * RBD -> Mounted ext4 -> disk image -> loop device -> LIO blockio target
   * RBD -> loop device -> LIO target (sketched below)
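
The loop-device variant that performs well is basically the same setup
with a loop device layered in between; something like this, again with
placeholder names:

    # map the RBD as before ('rbd showmapped' gives e.g. /dev/rbd0)
    rbd map iscsi-pool/test-image

    # wrap the mapped device in a loop device; prints e.g. /dev/loop0
    losetup -f --show /dev/rbd0

    # export the loop device instead of the RBD itself; the iSCSI
    # target/LUN/portal/ACL setup is the same as in the direct case
    targetcli /backstores/block create name=rbd-loop dev=/dev/loop0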

I'm just curious if anybody else has experienced these issues, has any
idea what's going on, or has any suggestions for fixing it. I know using
loop devices sounds like a solution, but we hit a brick wall with the fact
that loop devices are single-threaded. The intent is to use this with
VMware ESXi, with the 2 gateways set up as paths to the target block
devices. I'm not opposed to using something somewhat kludgy, provided we
can still use multipath iSCSI within VMware.
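
For the multipath piece, what I have in mind on each gateway is roughly
the sketch below: the same backing RBD exported under the same IQN, with
each gateway publishing only its own data-network IP as the portal (IPs
and IQN are placeholders, and I realize serving one RBD from two LIO
heads has its own caveats):

    # on gateway A (10.0.0.1) and gateway B (10.0.0.2), after rbd map:
    targetcli /backstores/block create name=rbd-test dev=/dev/rbd0
    targetcli /iscsi create iqn.2014-10.com.example:rbd-test
    targetcli /iscsi/iqn.2014-10.com.example:rbd-test/tpg1/luns \
        create /backstores/block/rbd-test

    # each gateway publishes only its own portal IP
    targetcli /iscsi/iqn.2014-10.com.example:rbd-test/tpg1/portals \
        create 10.0.0.1    # 10.0.0.2 on the second gateway

    # note: for ESXi to treat these as two paths to one device, the unit
    # serial / WWN reported by both gateways has to match

ESXi would then see two paths to the same LUN and handle the path policy
(fixed or round robin) on its side.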

Thanks for any help anyone can provide!