On Tue, Aug 05, 2014 at 06:00:22PM +0800, Ming Lei wrote:
> On Tue, Aug 5, 2014 at 5:48 PM, Kevin Wolf <kw...@redhat.com> wrote:
> > On 05.08.2014 at 05:33, Ming Lei wrote:
> >> Hi,
> >>
> >> These patches bring up below 4 changes:
> >> - introduce object allocation pool and apply it to
> >> virtio-blk dataplane for improving its performance
> >>
> >> - introduce selective coroutine bypass mechanism
> >> for improving performance of virtio-blk dataplane with
> >> raw format image
> >
> > Before applying any bypassing patches, I think we should understand in
> > detail where we are losing performance with coroutines enabled.
>
> From the below profiling data, CPU becomes slow to run instructions
> with coroutine, and CPU dcache miss is increased so it is very
> likely caused by switching stack frequently.
>
> http://marc.info/?l=qemu-devel&m=140679721126306&w=2
>
> http://pastebin.com/ae0vnQ6V

I have been wondering how to prove that the root cause is the ucontext
coroutine mechanism (stack switching).

Here is an idea:

Hack your "bypass" code path to run the request inside a coroutine.
That way you can compare "bypass without coroutine" against "bypass
with coroutine".

Right now I think there are doubts because the bypass code path is
indeed a different (and not 100% correct) code path. So this approach
might prove that the coroutines are adding the overhead and not
something that you bypassed.

Stefan
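
The comparison could be sketched roughly as follows. This is a standalone illustration under stated assumptions, not QEMU code: handle_request(), run_direct(), run_in_coroutine() and now_ns() are hypothetical names, the request handler is a trivial counter standing in for the real bypass path, and the coroutine is built on POSIX ucontext directly rather than on QEMU's coroutine API.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <stdint.h>
#include <stdlib.h>
#include <time.h>
#include <ucontext.h>

#define CO_STACK_SIZE (64 * 1024)

/* Hypothetical stand-in for the per-request work on the bypass path.
 * volatile keeps the compiler from folding the loops away. */
static volatile uint64_t completed;

static void handle_request(void)
{
    completed++;
}

/* Case 1, "bypass without coroutine": call the handler directly. */
static void run_direct(long n)
{
    for (long i = 0; i < n; i++) {
        handle_request();
    }
}

static ucontext_t main_ctx, co_ctx;
static long co_requests;

/* Coroutine body: handle one request, then yield back to the caller. */
static void co_entry(void)
{
    for (long i = 0; i < co_requests; i++) {
        handle_request();
        swapcontext(&co_ctx, &main_ctx);
    }
}

/* Case 2, "bypass with coroutine": same handler, but each request costs
 * two stack switches (enter and yield). Returns 0 on success, -1 on error. */
static int run_in_coroutine(long n)
{
    void *stack = malloc(CO_STACK_SIZE);
    if (!stack) {
        return -1;
    }
    if (getcontext(&co_ctx) < 0) {
        free(stack);
        return -1;
    }
    co_ctx.uc_stack.ss_sp = stack;
    co_ctx.uc_stack.ss_size = CO_STACK_SIZE;
    co_ctx.uc_link = &main_ctx;
    co_requests = n;
    makecontext(&co_ctx, co_entry, 0);

    for (long i = 0; i < n; i++) {
        swapcontext(&main_ctx, &co_ctx);   /* enter coroutine for one request */
    }
    /* The coroutine is left suspended at its last yield and never resumed. */
    free(stack);
    return 0;
}

/* Monotonic timestamp in nanoseconds, for timing each case. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
```

Taking now_ns() before and after run_direct(n) and run_in_coroutine(n) with the same n gives the two numbers to compare. Since the handler is identical in both cases, any difference should come from the stack switching alone, though a real test would need the actual request path to say anything about dcache misses.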