* Stefan Hajnoczi <stefa...@gmail.com> [2010-10-01 03:48]:
> On Thu, Sep 30, 2010 at 8:19 PM, Venkateswararao Jujjuri (JV)
> <jv...@linux.vnet.ibm.com> wrote:
> > On 9/30/2010 2:13 AM, Stefan Hajnoczi wrote:
> >>
> >> On Thu, Sep 30, 2010 at 1:50 AM, Venkateswararao Jujjuri (JV)
> >> <jv...@linux.vnet.ibm.com> wrote:
> >>>
> >>> Code: Mainline QEMU (git://git.qemu.org/qemu.git)
> >>> Machine: LS21 blade.
> >>> Disk: Local disk through VirtIO.
> >>> Did not select any cache option. Defaulting to writethrough.
> >>>
> >>> Command tested:
> >>> 3 parallel instances of: dd if=/dev/zero of=/pmnt/my_pw bs=4k count=100000
> >>>
> >>> QEMU with smp=1
> >>> 19.3 MB/s + 19.2 MB/s + 18.6 MB/s = 57.1 MB/s
> >>>
> >>> QEMU with smp=4
> >>> 15.3 MB/s + 14.1 MB/s + 13.6 MB/s = 43.0 MB/s
> >>>
> >>> Is this expected?
> >>
> >> Did you configure with --enable-io-thread?
> >
> > Yes I did.
> >
> >> Also, try using dd oflag=direct to eliminate effects introduced by the
> >> guest page cache and really hit the disk.
> >
> > With oflag=direct, I see no difference, and the throughput is so slow that
> > I would not expect to see any difference.
> > It is 225 KB/s for each thread, either with smp=1 or with smp=4.
>
> If I understand correctly you are getting:
>
> QEMU oflag=direct with smp=1
> 225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
>
> QEMU oflag=direct with smp=4
> 225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
>
> This suggests the degradation for smp=4 is guest kernel page cache or
> buffered I/O related. Perhaps lockholder preemption?
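[Editor's note: for readers who want to reproduce the workload, the 3-way dd test discussed above can be sketched as a short script. The stream file names, the stream count variable, and the smaller count= are illustrative choices, not from the original report, which used of=/pmnt/my_pw bs=4k count=100000 inside the guest.]

```shell
#!/bin/sh
# Sketch of the parallel dd write benchmark from the thread.
# NPROC/BS/COUNT/OUTDIR are hypothetical knobs for illustration.
NPROC=3
BS=4k
COUNT=1000              # original report used count=100000
OUTDIR=$(mktemp -d)

for i in $(seq 1 "$NPROC"); do
    # Each dd instance reports its own throughput on stderr when it
    # finishes; sum the per-stream rates for the aggregate figure.
    # Add oflag=direct to bypass the guest page cache, as suggested
    # in the thread (direct I/O is not supported on all filesystems).
    dd if=/dev/zero of="$OUTDIR/stream$i" bs="$BS" count="$COUNT" &
done
wait

ls -l "$OUTDIR"
```

Running this once with -smp 1 and once with -smp 4, and comparing the summed rates, reproduces the comparison quoted above.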
or just a single spindle maxed out because the blade hard drive doesn't
have write cache enabled (it's disabled by default).

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ry...@us.ibm.com