On Wed, 09/28 11:14, Roman Penyaev wrote:
> On Wed, Sep 28, 2016 at 5:01 AM, Fam Zheng <f...@redhat.com> wrote:
> > On Tue, 09/27 19:55, Roman Penyaev wrote:
> >> > The bug is 100% deterministic. Just boot up a guest with -drive
> >> > format=qcow2,aio=native.
> >>
> >> It turns out to be that everything is broken. I started all my
> >> tests with format=raw,aio=native and immediately got coroutine
> >> recursive. That is completely weird.
> >>
> >> So, what I did is the following:
> >>
> >> 1. Took latest master (nothing works)
> >> 2. Did interactive rebase to 12c8720
> >>    12c8720 2016-06-28 | Merge remote-tracking branch
> >>    'remotes/stefanha/tags/block-pull-request' into staging [Peter
> >>    Maydell]
> >>
> >>    this merge request includes all your patches related to
> >>    virtio-blk and MQ support.
> >>
> >> 3. Applied 0ed93d84edab. Everything works fine.
> >
> > Have you tried qcow2 at this point? raw crashes with 1a62d0accdf85 doesn't
> > mean qcow2 is fine without it.
> >
>
> That's true. qcow2 IO path is different, and presence of the
> patch 1a62d0accdf85 does not affect - coroutine still enters
> recursively.
>
> But for me it is quite surprising that IO fragmentation (what
> was done in 1a62d0accdf85) rises the misbehavior on raw IO path.

Maybe the mystery with this change is that your particular I/O pattern on the
raw image changed thereafter, from ioq = 1 to ioq > 1 (from linux-aio.c's
PoV, due to fragmentation), so multiple coroutines are created for one big
request, which triggers the crash.

Fam

>
> But of course originally issue was introduced by me. Stefan,
> thanks for a fix.
>
> --
> Roman
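
To illustrate the fragmentation point, here is a rough sketch in Python (not QEMU code). It only shows how one large request turns into several sub-requests once a maximum transfer size applies, which is what would raise the number of in-flight requests from 1 to more than 1. The function name split_request and the 1 MiB / 4 MiB sizes are made up for the example.

```python
def split_request(offset, length, max_transfer):
    """Split one request into (offset, length) fragments of at most max_transfer bytes.

    Hypothetical helper for illustration only; it does not mirror any QEMU function.
    """
    if max_transfer <= 0:
        raise ValueError("max_transfer must be positive")
    fragments = []
    end = offset + length
    while offset < end:
        chunk = min(max_transfer, end - offset)
        fragments.append((offset, chunk))
        offset += chunk
    return fragments


MIB = 1024 * 1024

# Limit at least as large as the request: it stays a single request.
unfragmented = split_request(0, 4 * MIB, 4 * MIB)

# Smaller limit: the same request becomes several sub-requests, each of which
# would be submitted separately (one coroutine each, in the scenario above).
fragmented = split_request(0, 4 * MIB, 1 * MIB)

print(len(unfragmented), len(fragmented))
```

With a limit at least as large as the request the list has one entry; with a smaller limit it has several, and that is the case where more than one request is queued at a time.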