On Mon, 24 Jun 2019 19:54:27 +0200 Laurent Vivier <lviv...@redhat.com> wrote:
> On 01/06/2019 17:49, Greg Kurz wrote:
> > On Fri, 31 May 2019 16:36:33 -0300
> > Eduardo Habkost <ehabk...@redhat.com> wrote:
> >
> >> On Tue, May 28, 2019 at 10:48:09AM +0800, Yongji Xie wrote:
> >>> On Tue, 28 May 2019 at 02:54, Michael S. Tsirkin <m...@redhat.com> wrote:
> >>>
> >>>>
> >>>> On Mon, May 27, 2019 at 12:44:46PM +0200, Greg Kurz wrote:
> >>>>> On Fri, 24 May 2019 19:56:06 +0800
> >>>>> Yongji Xie <elohi...@gmail.com> wrote:
> >>>>>
> >>>>>> On Fri, 24 May 2019 at 18:20, Greg Kurz <gr...@kaod.org> wrote:
> >>>>>>>
> >>>>>>> On Mon, 20 May 2019 19:10:35 -0400
> >>>>>>> "Michael S. Tsirkin" <m...@redhat.com> wrote:
> >>>>>>>
> >>>>>>>> From: Xie Yongji <xieyon...@baidu.com>
> >> [...]
> >>>>>>>> @@ -1770,6 +1796,13 @@ static bool virtio_broken_needed(void *opaque)
> >>>>>>>>      return vdev->broken;
> >>>>>>>>  }
> >>>>>>>>
> >>>>>>>> +static bool virtio_started_needed(void *opaque)
> >>>>>>>> +{
> >>>>>>>> +    VirtIODevice *vdev = opaque;
> >>>>>>>> +
> >>>>>>>> +    return vdev->started;
> >>>>>>>
> >>>>>>> Existing machine types don't know about the "virtio/started" subsection. This
> >>>>>>> breaks migration to older QEMUs if the driver has started the device, ie. most
> >>>>>>> probably always when it comes to live migration.
> >>>>>>>
> >>>>>>> My understanding is that we do try to support backward migration though. It
> >>>>>>> is a regular practice in datacenters to migrate workloads without having to
> >>>>>>> take care of the QEMU version. FWIW I had to fix similar issues downstream
> >>>>>>> many times in the past because customers had filed bugs.
> >>>>>>>
> >>>>>>
> >>>>>> If we do need to support backward migration, for this patch, what I
> >>>>>> can think of is to only migrate the flag in the case that guest kicks
> >>>>>> but not set DRIVER_OK. This could fix backward migration in most case.
> >>>>>>
> >>>>>
> >>>>> You mean something like that ?
> >>>>>
> >>>>> static bool virtio_started_needed(void *opaque)
> >>>>> {
> >>>>>     VirtIODevice *vdev = opaque;
> >>>>>
> >>>>>     return vdev->started && !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK);
> >>>>> }
> >>>>>
> >>>>>> Not sure if there is a more general approach...
> >>>>>>
> >>>>>
> >>>>> Another approach would be to only implement the started flag for
> >>>>> machine version > 4.0. This can be achieved by adding a "use-started"
> >>>>> property to the base virtio device, true by default and set to
> >>>>> false by hw_compat_4_0.
> >>>>
> >>>> I think this is best frankly.
> >>>>
> >>>
> >>> Only implement the started flag for machine version > 4.0 might not be
> >>> good because vhost-user-blk now need to use this flag. How about only
> >>> migrating this flag for machine version > 4.0 instead?
> >>
> >> Was this implemented? Is migration from QEMU 4.1 to QEMU 4.0
> >> currently broken?
> >>
> >
> > Answer is yes for both questions.
> >
>
> Is there a fix?
>

The fix was merged with this PR:

https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg01565.html

Cheers,

-- 
Greg

> The problem is really easy to reproduce: start a guest with virtio-blk
> and migrate once the driver has started.
>
> Thanks,
> Laurent
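
For illustration only, the two conditions discussed in the thread for
sending the "virtio/started" subsection can be compared in a small
standalone C sketch. This is not the merged fix and not QEMU code: the
VirtIODevice struct below is a cut-down stand-in, and the migrate_started
field and both function names are hypothetical, introduced here only to
model "migrate the flag for machine version > 4.0".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Device status bit from the virtio spec: the driver is set up and ready. */
#define VIRTIO_CONFIG_S_DRIVER_OK 4

/* Cut-down stand-in for QEMU's VirtIODevice; only the fields used here. */
typedef struct VirtIODevice {
    uint8_t status;        /* virtio device status register */
    bool started;          /* the driver has started the device */
    bool migrate_started;  /* hypothetical: false for machine types <= 4.0 */
} VirtIODevice;

/*
 * Variant quoted in the thread: send the subsection only when the guest
 * started the device without setting DRIVER_OK.
 */
bool virtio_started_needed_driver_ok(void *opaque)
{
    VirtIODevice *vdev = opaque;

    assert(vdev != NULL);
    return vdev->started && !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK);
}

/*
 * Variant gated on a compat flag (assumption): older machine types never
 * send the subsection, so the stream stays readable by older QEMUs.
 */
bool virtio_started_needed_compat(void *opaque)
{
    VirtIODevice *vdev = opaque;

    assert(vdev != NULL);
    return vdev->migrate_started && vdev->started;
}
```

With the first variant, a device that is started and has DRIVER_OK set
sends no subsection, which covers the common case but not a guest that
kicks before DRIVER_OK. With the second, the decision depends only on the
machine type, so a 4.0 machine type never sends the subsection.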