
I'm trying to debug why there is such a big difference between POSIX AIO and 
libaio when performing read tests from inside a VM using librbd.

The results I'm getting using FIO are:

POSIX AIO Read:

Type: Random Read - IO Engine: POSIX AIO - Buffered: No - Direct: Yes - Block 
Size: 4KB - Disk Target: /:

Average: 2.54 MB/s
Average: 632 IOPS

Libaio Read:

Type: Random Read - IO Engine: Libaio - Buffered: No - Direct: Yes - Block 
Size: 4KB - Disk Target: /:

Average: 147.88 MB/s
Average: 36967 IOPS

When performing writes the differences aren't so big, because the cluster 
(which is in production right now) is CPU-bound:

POSIX AIO Write:

Type: Random Write - IO Engine: POSIX AIO - Buffered: No - Direct: Yes - Block 
Size: 4KB - Disk Target: /:

Average: 14.87 MB/s
Average: 3713 IOPS

Libaio Write:

Type: Random Write - IO Engine: Libaio - Buffered: No - Direct: Yes - Block 
Size: 4KB - Disk Target: /:

Average: 14.51 MB/s
Average: 3622 IOPS

Even if the write results are CPU-bound, as the machines containing the OSDs 
don't have enough CPU to handle all the IOPS (CPU upgrades are on their way), I 
cannot really understand why I'm seeing so much difference in the read tests.

Some configuration background:

- Cluster and clients are using Hammer 0.94.90
- It's a full SSD cluster running over Samsung Enterprise SATA SSDs, with all 
the typical tweaks (Customized ceph.conf, optimized sysctl, etc...)
- Tried QEMU 2.0 and 2.7 - Similar results
- Tried virtio-blk and virtio-scsi - Similar results

I've been reading about POSIX AIO and libaio, and I can see there are several 
differences in how they work (such as one being implemented in user space and 
the other in the kernel), but I don't really get why Ceph has such problems 
handling POSIX AIO read operations but not write operations, or how to avoid 
them.

Right now I'm trying to identify whether something is wrong with our Ceph 
cluster setup, with Ceph in general, or with QEMU (virtio-scsi and virtio-blk 
both show the same behavior).

If you would like to try to reproduce the issue here are the two command lines 
I'm using:

fio --name=randread-posix --output ./test --runtime 60 --ioengine=posixaio \
    --buffered=0 --direct=1 --rw=randread --bs=4k --size=1024m --iodepth=32
fio --name=randread-libaio --output ./test --runtime 60 --ioengine=libaio \
    --buffered=0 --direct=1 --rw=randread --bs=4k --size=1024m --iodepth=32
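
As a rough sanity check on the read numbers above, Little's law (mean latency = 
requests in flight / IOPS) gives the per-request latency each result would 
imply. This is only a sketch: it assumes fio really keeps all 32 requests in 
flight, which may not hold for the POSIX AIO engine, and the helper name 
implied_latency_ms is made up for illustration (it is not part of fio).

```shell
#!/bin/sh
# Little's law: mean latency = requests in flight / throughput.
# implied_latency_ms is a hypothetical helper name, not a fio feature.
implied_latency_ms() {
    iops="$1"   # measured IOPS from the fio result
    depth="$2"  # ASSUMED number of requests actually in flight
    awk -v iops="$iops" -v depth="$depth" \
        'BEGIN { printf "%.2f\n", depth * 1000 / iops }'
}

# Read results from this post, with --iodepth=32 taken at face value.
echo "posixaio randread: $(implied_latency_ms 632 32) ms per request"
echo "libaio   randread: $(implied_latency_ms 36967 32) ms per request"
```

If the real per-request latency were about the same for both engines, the 
POSIX AIO figure would correspond to fewer than one request in flight on 
average, which might mean the reads are being serialized somewhere instead of 
being issued 32 at a time. That is only a guess from the arithmetic, not 
something I have confirmed.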

If you could shed any light on this I would be really grateful, as right now, 
although I still have some ideas left to try, I don't have much idea about 
why this is happening...

ceph-users mailing list
