Hi Kevin,

On Wed, Nov 26, 2014 at 10:46 PM, Kevin Wolf <kw...@redhat.com> wrote:
> This improves the performance of requests because an ACB doesn't need to
> be allocated on the heap any more. It also makes the code nicer and
> smaller.

I am not sure this is a good way to optimize linux-aio:

- for a raw image with some constraints, the coroutine can be avoided,
since io_submit() won't sleep most of the time

- handling one coroutine takes much more time than a malloc, memset
and free of a small buffer, as the following test data shows:

         --   241ns per coroutine
         --   61ns per (malloc, memset, free for 128 bytes)
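
For reference, a per-iteration cost like the malloc/memset/free figure can be
measured with a loop along the following lines. This is only a sketch, not the
exact test that produced the numbers above; bench_alloc_ns() and its
parameters are made-up names, and the result depends heavily on the allocator,
the compiler flags and the machine.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: average cost in nanoseconds of one
 * malloc + memset + free cycle for a buffer of 'size' bytes.
 * Returns -1.0 if an allocation fails.
 */
static double bench_alloc_ns(size_t size, long iters)
{
    struct timespec start, end;
    long i;

    assert(size > 0 && iters > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iters; i++) {
        /* volatile pointer so the compiler cannot drop the malloc/free pair */
        void *volatile buf = malloc(size);

        if (!buf) {
            return -1.0;
        }
        memset(buf, 0, size);
        free(buf);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    return ((end.tv_sec - start.tv_sec) * 1e9 +
            (end.tv_nsec - start.tv_nsec)) / (double)iters;
}
```

Calling bench_alloc_ns(128, 10000000) gives a figure comparable in kind to the
second line above; the coroutine side would need an equivalent loop around
coroutine create/enter/terminate.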

I still think we should figure out a fast path that avoids the coroutine
for linux-aio with a raw image; otherwise it can't scale well for
high-IOPS devices.

Also, we can use a simple buffer pool to avoid the dynamic allocation
easily, can't we?
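
A minimal sketch of what such a pool could look like, assuming fixed-size
request buffers and a single thread touching the pool. BufPool, the
buf_pool_*() functions and both constants are hypothetical names for
illustration, not existing QEMU code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define POOL_BUF_SIZE 128   /* assumed size of one request structure */
#define POOL_CAPACITY 64    /* assumed maximum number of cached buffers */

typedef struct BufPool {
    void *free_bufs[POOL_CAPACITY];   /* stack of recycled buffers */
    size_t nr_free;
} BufPool;

static void buf_pool_init(BufPool *pool)
{
    pool->nr_free = 0;
}

/* Take a zeroed buffer; only falls back to malloc() when the pool is empty. */
static void *buf_pool_get(BufPool *pool)
{
    void *buf;

    if (pool->nr_free > 0) {
        buf = pool->free_bufs[--pool->nr_free];
    } else {
        buf = malloc(POOL_BUF_SIZE);
        if (!buf) {
            return NULL;
        }
    }
    memset(buf, 0, POOL_BUF_SIZE);
    return buf;
}

/* Return a buffer; it is cached for reuse unless the pool is already full. */
static void buf_pool_put(BufPool *pool, void *buf)
{
    assert(buf != NULL);

    if (pool->nr_free < POOL_CAPACITY) {
        pool->free_bufs[pool->nr_free++] = buf;
    } else {
        free(buf);
    }
}

/* Release every cached buffer. */
static void buf_pool_destroy(BufPool *pool)
{
    while (pool->nr_free > 0) {
        free(pool->free_bufs[--pool->nr_free]);
    }
}
```

In the steady state a get/put pair then costs only the memset plus a couple of
array operations. The sketch has no locking, so it would need one pool per
thread (or per AioContext) to be used safely.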

>
> As a side effect, the codepath taken by aio=threads is changed to use
> paio_submit_co(). This doesn't change the performance at this point.
>
> Results of qemu-img bench -t none -c 10000000 [-n] /dev/loop0:
>
>       |      aio=native       |     aio=threads
>       | before   | with patch | before   | with patch
> ------+----------+------------+----------+------------
> run 1 | 29.921s  | 26.932s    | 35.286s  | 35.447s
> run 2 | 29.793s  | 26.252s    | 35.276s  | 35.111s
> run 3 | 30.186s  | 27.114s    | 35.042s  | 34.921s
> run 4 | 30.425s  | 26.600s    | 35.169s  | 34.968s
> run 5 | 30.041s  | 26.263s    | 35.224s  | 35.000s
>
> TODO: Do some more serious benchmarking in VMs with less variance.
> Results of a quick fio run are vaguely positive.

I will run the test with Paolo's fast path approach under a
VM I/O workload.

Thanks,
Ming Lei
