I've noticed a pretty steep performance degradation when using RBDs with LIO. I've tried a multitude of configurations to see whether any of them change the performance, and I've only found a few that (sort of) work.
Details about the systems being used:

- All data-path network hardware is 10GbE; there is some management traffic on 1GbE, but I can assure you it isn't being used for data (perf & bwm-ng show this)
- Ceph version 0.80.5
- 20GB RBD (for our tests; prod will be much larger, but the size doesn't seem to matter)
- LIO version 4.1.0, RisingTide
- Initiator is another Linux system (I've used ESXi as well with no difference)
- We have 8 OSD nodes, each with 8x 2TB OSDs, 64 OSDs total
  * 4 nodes are in one rack and 4 in another; the CRUSH maps have been configured with this in mind
  * All OSD nodes are running CentOS 6.5
- 2 gateway nodes on HP ProLiant blades (I've only been using one for testing, but the problem exists on both)
  * All gateway nodes are running CentOS 7

I've tested a multitude of setups, mainly to see where the issue lies:

- The performance of the RBD as a target using LIO
- The performance of the RBD itself (no iSCSI or LIO)
- LIO performance using a ramdisk as a target (no RBD involved)
- Setting the RBD up with LVM, then using a logical volume from it as a target with LIO
- Setting the RBD up in RAID0 & RAID1 (single disk, using mdadm), then using that volume as a target with LIO
- Mounting the RBD as ext4, then using a disk image and fileio as a target
- Mounting the RBD as ext4, then setting a disk image up as a loop device and using blockio as a target
- Setting the RBD up as a loop device, then setting that up as a target with LIO

What tested with bad performance (reads ~25-50MB/s, writes ~25-50MB/s):

* RBD set up as a target using LIO
* RBD -> LVM -> LIO target
* RBD -> RAID0/1 -> LIO target

What tested with good performance (reads ~700-800MB/s, writes ~400-700MB/s):

* RBD on the local system, no iSCSI
* Ramdisk (no RBD) -> LIO target
* RBD -> mounted ext4 -> disk image -> LIO fileio target
* RBD -> mounted ext4 -> disk image -> loop device -> LIO blockio target
* RBD -> loop device -> LIO target

I'm just curious whether anybody else has experienced these issues, has any idea what's going on, or has suggestions for fixing it. I know using loop devices sounds like a solution, but we hit a brick wall with the fact that loop devices are single threaded. The intent is to use this with VMware ESXi, with the 2 gateways set up as paths to the target block devices. I'm not opposed to using something somewhat kludgy, provided we can still use multipath iSCSI within VMware.

For reference, I've appended rough sketches of the main configurations (and a representative benchmark run) at the end of this mail.

Thanks for any help anyone can provide!
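Here's roughly how the plain RBD-backed target (the slow case) is set up. IQNs, pool/image names, and IPs are placeholders, and the exact targetcli syntax may differ slightly between the RisingTide targetcli and targetcli-fb:

    # Map the RBD on the gateway; /dev/rbd0 appears on success
    rbd map rbd/testimg

    # Export it with LIO's block (iblock) backstore via targetcli
    targetcli /backstores/block create name=rbd0 dev=/dev/rbd0
    targetcli /iscsi create iqn.2014-08.com.example:rbdtest
    targetcli /iscsi/iqn.2014-08.com.example:rbdtest/tpg1/luns create /backstores/block/rbd0
    targetcli /iscsi/iqn.2014-08.com.example:rbdtest/tpg1/portals create 10.0.0.10 3260
    targetcli /iscsi/iqn.2014-08.com.example:rbdtest/tpg1/acls create iqn.1994-05.com.redhat:client1
    targetcli saveconfig

This is the configuration that only manages ~25-50MB/s from the initiator.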
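For anyone wanting to reproduce the comparison, an fio run along these lines against the raw mapped RBD is the sort of sequential test that shows the gap (block size and queue depth are just examples, and the write test is destructive):

    # Sequential write directly against the mapped RBD (destroys data!)
    fio --name=seqwrite --filename=/dev/rbd0 --ioengine=libaio --direct=1 \
        --rw=write --bs=4M --iodepth=32 --runtime=60 --time_based

    # Sequential read
    fio --name=seqread --filename=/dev/rbd0 --ioengine=libaio --direct=1 \
        --rw=read --bs=4M --iodepth=32 --runtime=60 --time_based

Running the same job from the initiator against the iSCSI-attached device shows the drop.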
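The ext4 + disk image + fileio variant that performs well looks like this (names and sizes are placeholders; note that fileio goes through the page cache, unlike the block backstore, which may be relevant):

    # Put a filesystem on the RBD and mount it
    mkfs.ext4 /dev/rbd0
    mkdir -p /mnt/rbd
    mount /dev/rbd0 /mnt/rbd

    # fileio backstore on an image file inside the mount
    targetcli /backstores/fileio create name=img0 file_or_dev=/mnt/rbd/disk0.img size=10G
    targetcli /iscsi/iqn.2014-08.com.example:rbdtest/tpg1/luns create /backstores/fileio/img0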
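And for completeness, the other two fast variants: the loop device over the RBD (which is where we hit the single-threaded wall) and the pure-ramdisk control case:

    # Attach the mapped RBD to the first free loop device; prints e.g. /dev/loop0
    losetup -f --show /dev/rbd0
    targetcli /backstores/block create name=loop0 dev=/dev/loop0

    # Ramdisk backstore, no RBD involved (size illustrative)
    targetcli /backstores/ramdisk create name=rd0 size=1G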