On Fri, Nov 6, 2015 at 4:19 PM, Denis V. Lunev <d...@openvz.org> wrote:
> On 11/06/2015 07:05 PM, Eric Blake wrote:
>>
>> On 11/06/2015 08:54 AM, Stefan Hajnoczi wrote:
>>>
>>> On Wed, Nov 04, 2015 at 08:19:31PM +0300, Denis V. Lunev wrote:
>>>>
>>>> with test
>>>> while /bin/true ; do
>>>> virsh snapshot-create rhel7
>>>> sleep 10
>>>> virsh snapshot-delete rhel7 --current
>>>> done
>>>> with enabled iothreads on a running VM leads to a lot of troubles:
>>>> hangs,
>>>> asserts, errors.
>>
>> That is a case of using libvirt to trigger internal snapshots...
>>
>>> The HMP monitor is legacy and also not used by modern libvirt.
>>
>> ...and libvirt is forced to use HMP for internal snapshots, since we
>> _still_ haven't exposed internal snapshots as a QMP command.
>>
>>> I think the affected use cases are restricted to savevm+dataplane and
>>> HMP+dataplane.
>>
>> The fact that the commit message calls out a libvirt method of
>> triggering the bug does mean that it is user-visible, and so it would
>> qualify as a bug fix even during hard freeze. But I also understand
>> that taking a large complex series late in the game is not without risk;
>> and it is not like this is a regression (rather, something that has
>> never worked bulletproof), right?
>>
> yes, this was not working in the past and this is not a regression.
>
> The problem is that it seems that NOBODY uses iothreads in the
> production or even for complex real life production tests. There
> is another recently merged example of this (100% reproducible,
> happens both on migration/snapshot). We have faced this on
> suspend operation.
>
> commit 10a06fd65f667a972848ebbbcac11bdba931b544
> Author: Pavel Butsykin <pbutsy...@virtuozzo.com>
> Date: Mon Oct 26 14:42:57 2015 +0300
>
>     virtio: sync the dataplane vring state to the virtqueue before
>     virtio_save
>
> I have started this initially as a set of small bits in savevm code
> and was asked to move the code from savevm.c to block layer.
> This has been done and yes, series becomes complex after
> that and it was obvious that it will be complex when the task
> was set to move a bunch of code from one place to another.
>
> Anyway, from my point of view the serie is not that complex.
> It is just large and is doing simple things almost near copy/paste
> and there is a month to catch bugs here.
>
> Can we still consider this for merge?

Absolutely, they are still bugs and we can fix them for 2.5.

I just wanted to reflect on the scope of the bugs and it occurred to me
that these code paths haven't been exercised/tested as often.

Stefan