Hi, I run a single I/O thread (sequential reads, 4 KB per request, 8 GB in total) on the host OS, and the throughput is around 420 MB/s. However, when I run the same I/O thread in a VM with dedicated hardware (no other VM is created and data-plane is enabled), the throughput drops to around 350 MB/s. The VM's experiment setup is as follows.
In the VM, there are 15 vCPUs (vCPU0 - vCPU14). Each vCPU is pinned to its own dedicated pCPU (vCPU0 is pinned to pCPU0, ..., vCPU14 is pinned to pCPU14). "idle=poll" is added to the VM's kernel command line in GRUB so that the vCPUs never go idle. Inside the VM, all interrupts are pinned to vCPU0 to guarantee that they are handled promptly, and the I/O thread runs on one of the vCPUs other than vCPU0.

In the host OS, there are 16 pCPUs (pCPU0 - pCPU15). pCPU0 - pCPU14 are dedicated to vCPU0 - vCPU14 of the VM, and pCPU15 is dedicated to the QEMU IOThread (data-plane), which handles the I/O read requests from the VM.

Kernel version: 3.16.39
QEMU version: 2.4.1

I don't know why there is a 70 MB/s difference between the host OS and the guest OS in the experiment above. Has anyone had a similar experience? Any comments? Thank you in advance.

BTW, I have checked my hardware configuration several times, and I think the throughput difference is related to QEMU. Maybe I am missing some QEMU configuration.

Best,
Weiwei Jia
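
P.S. For anyone who wants to reproduce a similar measurement, the workload can be approximated by a small Python sketch like the one below. It is only an illustration, not the exact program behind the numbers above. The function names `per_request_us` and `sequential_read` are made up for this example. It uses plain buffered reads, so page-cache effects may differ from a test that bypasses the cache, and it assumes 1 MB = 10^6 bytes and 4 KB = 4096 bytes. The optional `cpu` argument pins the reader to one CPU with `os.sched_setaffinity`, which is Linux-only.

```python
import os
import time


def per_request_us(throughput_mb_s, request_bytes=4096):
    """Average time per request in microseconds (assumes 1 MB = 10**6 bytes)."""
    requests_per_s = throughput_mb_s * 1e6 / request_bytes
    return 1e6 / requests_per_s


def sequential_read(path, block_size=4096, total_bytes=None, cpu=None):
    """Read `path` sequentially in block_size chunks; return (bytes_read, seconds)."""
    if cpu is not None and hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, {cpu})  # pin the calling thread (Linux only)
    fd = os.open(path, os.O_RDONLY)
    done = 0
    start = time.perf_counter()
    try:
        while total_bytes is None or done < total_bytes:
            chunk = os.read(fd, block_size)
            if not chunk:  # end of file
                break
            done += len(chunk)
    finally:
        os.close(fd)
    return done, time.perf_counter() - start


if __name__ == "__main__":
    host, guest = per_request_us(420), per_request_us(350)
    print(f"host {host:.2f} us, guest {guest:.2f} us, "
          f"gap {guest - host:.2f} us per 4 KB read")
```

Under those assumptions, 420 MB/s corresponds to roughly 9.8 us per 4 KB request and 350 MB/s to roughly 11.7 us, so the 70 MB/s gap amounts to about 2 us of extra latency per request in the guest. With a single thread issuing one read at a time, any fixed per-request overhead shows up directly in the throughput.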