On Sun, Feb 14, 2016 at 02:22:10PM +0800, Fam Zheng wrote: > On Tue, 02/09 13:47, Stefan Hajnoczi wrote: > > On Mon, Feb 08, 2016 at 03:17:23PM +0000, Dr. David Alan Gilbert wrote: > > > Does this make sense to everyone else, or does anyone have any better > > > suggestions? > > > > As a concrete example, any monitor command that calls bdrv_drain_all() > > can hang forever with the QEMU global mutex held if I/O requests are > > stuck (e.g. NFS mount is unreachable). > > > > bdrv_aio_cancel() can also hang but is mostly exposed to device > > emulation, not the monitor. > > > > One solution for these block layer functions is to add a timeout > > argument and let them return an error. This way the monitor and device > > emulation do not hang forever. > > Yes, there are a few places in block layer invoking aio_poll() in a loop > waiting for certain events, and a disconnected network link could make QEMU > hang. In these cases a timeout is a huge improvement. Maybe we can mark the > BDS as "hanging" (-EIO is returned for all further requests) and let > bdrv_drain_all() return.
No, we need to be very careful about hung requests that are still in flight. They could complete after a long time and modify the disk or guest RAM. The only thing bdrv_drain_all() can return is -ETIME. Beyond that it cannot pretend that requests have been drained since that could lead to data corruption/loss. The caller has to abort whatever it was trying to do since it has no guarantee that requests have drained. Maybe the remaining requests will fail and go away, maybe they will complete. We don't know... > > > > The benefit of the timeout is that both monitor and device emulation > > hangs are tackled. It also doesn't require monitor changes. > > > > I'm not sure who chooses the timeout value and which value makes sense > > (policy vs mechanism separation)... > > Default to 30 seconds like Linux, and make it tunable through command line > options as well as QMP? Yes. Either commands that block can take an optional timeout (seconds) parameter, or we could have a global disk I/O timeout value. I prefer passing an optional per-command value. Libvirt and other sophisticated clients could begin using it in a backwards-compatible way. Existing code wouldn't use it and still suffer from the hang problem (but at least QEMU is backwards-compatible). Stefan
signature.asc
Description: PGP signature