On Mon, May 11, 2020 at 05:53:37PM +0800, Kirti Wankhede wrote: > > > On 5/5/2020 10:07 AM, Alex Williamson wrote: > > On Tue, 5 May 2020 04:48:14 +0530 > > Kirti Wankhede <kwankh...@nvidia.com> wrote: > > > >> On 3/26/2020 3:33 AM, Alex Williamson wrote: > >>> On Wed, 25 Mar 2020 02:39:07 +0530 > >>> Kirti Wankhede <kwankh...@nvidia.com> wrote: > >>>
<...> > >>>> +static int vfio_save_iterate(QEMUFile *f, void *opaque) > >>>> +{ > >>>> + VFIODevice *vbasedev = opaque; > >>>> + int ret, data_size; > >>>> + > >>>> + qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE); > >>>> + > >>>> + data_size = vfio_save_buffer(f, vbasedev); > >>>> + > >>>> + if (data_size < 0) { > >>>> + error_report("%s: vfio_save_buffer failed %s", vbasedev->name, > >>>> + strerror(errno)); > >>>> + return data_size; > >>>> + } > >>>> + > >>>> + qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE); > >>>> + > >>>> + ret = qemu_file_get_error(f); > >>>> + if (ret) { > >>>> + return ret; > >>>> + } > >>>> + > >>>> + trace_vfio_save_iterate(vbasedev->name, data_size); > >>>> + if (data_size == 0) { > >>>> + /* indicates data finished, goto complete phase */ > >>>> + return 1; > >>> > >>> But it's pending_bytes not data_size that indicates we're done. How do > >>> we get away with ignoring pending_bytes for the save_live_iterate phase? > >>> > >> > >> This is requirement mentioned above qemu_savevm_state_iterate() which > >> calls .save_live_iterate. > >> > >> /* > >> * this function has three return values: > >> * negative: there was one error, and we have -errno. > >> * 0 : We haven't finished, caller have to go again > >> * 1 : We have finished, we can go to complete phase > >> */ > >> int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy) > >> > >> This is to serialize savevm_state.handlers (or in other words devices). > > > > I've lost all context on this question in the interim, but I think this > > highlights my question. We use pending_bytes to know how close we are > > to the end of the stream and data_size to iterate each transaction > > within that stream. So how does data_size == 0 indicate we've > > completed the current phase? It seems like pending_bytes should > > indicate that. Thanks, > > > > Fixing this by adding a read on pending_bytes if its 0 and return > accordingly. > if (migration->pending_bytes == 0) { > ret = vfio_update_pending(vbasedev); > if (ret) { > return ret; > } > > if (migration->pending_bytes == 0) { > /* indicates data finished, goto complete phase */ > return 1; > } > } > just a question. if 1 is only returned when migration->pending_bytes is 0, does that mean .save_live_iterate of vmstates after "vfio-pci" would never be called until migration->pending_bytes is 0 ? as in qemu_savevm_state_iterate(), qemu_savevm_state_iterate { ... QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { ... ret = se->ops->save_live_iterate(f, se->opaque); ... if (ret <= 0) { /* Do not proceed to the next vmstate before this one reported completion of the current stage. This serializes the migration and reduces the probability that a faster changing state is synchronized over and over again. */ break; } } return ret; } in ram's migration code, its pending_bytes(remaining_size) is only updated in ram_save_pending() when it's below threshold, which means in ram_save_iterate() the pending_bytes is possible to be 0, so other vmstates have their chance to be called. Thanks Yan