On 04/13/2017 09:39 AM, Stefan Hajnoczi wrote:
> On Thu, Apr 13, 2017 at 01:45:55PM +0800, Paolo Bonzini wrote:
>>
>> On 13/04/2017 09:11, Jeff Cody wrote:
>>>> It didn't make it into 2.9-rc4 because of limited time. :(
>>>>
>>>> Looks like there is no -rc5, we'll have to document this as a known issue.
>>>> Users should "block-job-complete/cancel" as soon as possible to avoid such
>>>> a hang.
>>>
>>> I'd argue for including a fix for 2.9, since this is both a regression, and
>>> a hard lock without possible recovery short of restarting the QEMU process.
>>
>> It is a bit of a corner case (and jobs on I/O thread are relatively rare
>> too), so maybe it's not worth delaying 2.9. It has been delayed already
>> quite a bit. Another reason I think I prefer to wait is to ensure that
>> we have an entry in qemu-iotests to avoid the future regression.
>
> I also think this does not require delaying the release:
>
> 1. It needs to be marked as a known issue in the release notes.
> 2. Let's roll the 2.9.1 stable release within a month of 2.9.0.
>
> If both conditions are met then very few end users will be exposed to
> the problem. I hope libvirt will create IOThreads by default soon but
> for the time being it is not a widely used configuration.

Also, is it something that can be avoided by not doing a system_reset
while a block job is still running? Libvirt can be taught to block
reset while a job has still not been finished, if needs be.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org