On Wed, Aug 13, 2025 at 11:25:00AM +0200, Eugenio Perez Martin wrote:
> On Mon, Aug 11, 2025 at 11:56 PM Peter Xu <pet...@redhat.com> wrote:
> >
> > On Mon, Aug 11, 2025 at 05:26:05PM -0400, Jonah Palmer wrote:
> > > This effort was started to reduce the guest visible downtime by
> > > virtio-net/vhost-net/vhost-vDPA during live migration, especially
> > > vhost-vDPA.
> > >
> > > The downtime contributed by vhost-vDPA, for example, comes not from having
> > > to migrate a lot of state but from expensive backend control-plane latency
> > > like CVQ configurations (e.g. MQ queue pairs, RSS, MAC/VLAN filters,
> > > offload settings, MTU, etc.). Applying these requires kernel/HW NIC
> > > operations, which dominate its downtime.
> > >
> > > In other words, by migrating the state of virtio-net early (before the
> > > stop-and-copy phase), we can also start staging backend configurations,
> > > which is the main contributor to downtime when migrating a vhost-vDPA
> > > device.
> > >
> > > I apologize if this series gives the impression that we're migrating a lot
> > > of data here. It's more along the lines of moving control-plane latency
> > > out of the stop-and-copy phase.
> >
> > I see, thanks.
> >
> > Please add these into the cover letter of the next post.  IMHO it's
> > extremely important information to explain the real goal of this work.  I
> > bet it is not what most people expect when reading the current cover
> > letter.
> >
> > Then it could have nothing to do with the iterative phase, am I right?
> >
> > What data does the dest QEMU need to start staging backend
> > configurations to the HW underneath?  Does dest QEMU already have it on
> > its command line?
> >
> > Asking this because I want to know whether it can be done completely
> > without src QEMU at all, e.g. when dest QEMU starts.
> >
> > If src QEMU's data is still needed, please also first consider providing
> > such a facility using an "early VMSD" if it is ever possible: feel free to
> > refer to commit 3b95a71b22827d26178.
> >
> 
> While it works for this series, it does not allow resending the state
> when the src device changes, for example if the number of virtqueues
> is modified.
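
To pin down the distinction being drawn here, a minimal sketch with made-up
names (none of this is QEMU code; dev_cfg, cfg_sender and the generation
counter are assumptions for illustration only).  It contrasts a one-shot
early send with a resend-on-change loop:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the state the destination needs in order to
 * stage backend configuration.  Not a QEMU structure. */
struct dev_cfg {
    uint16_t num_queue_pairs;
    uint64_t generation;      /* bumped on every guest-driven change */
};

/* Hypothetical sender-side bookkeeping. */
struct cfg_sender {
    bool sent_once;
    uint64_t sent_generation;
    unsigned sends;           /* how many times the state was sent */
};

/* The guest changes the number of queue pairs on the source. */
void cfg_set_queue_pairs(struct dev_cfg *cfg, uint16_t n)
{
    if (cfg->num_queue_pairs != n) {
        cfg->num_queue_pairs = n;
        cfg->generation++;
    }
}

/* One-shot model: the state is sent exactly once and never again. */
bool send_early_once(struct cfg_sender *s, const struct dev_cfg *cfg)
{
    if (s->sent_once) {
        return false;
    }
    s->sent_once = true;
    s->sent_generation = cfg->generation;
    s->sends++;
    return true;
}

/* Resend model: send again whenever the source changed since last send. */
bool send_if_changed(struct cfg_sender *s, const struct dev_cfg *cfg)
{
    if (s->sent_once && s->sent_generation == cfg->generation) {
        return false;
    }
    s->sent_once = true;
    s->sent_generation = cfg->generation;
    s->sends++;
    return true;
}

/* True if the copy staged on the destination no longer matches the source. */
bool staged_is_stale(const struct cfg_sender *s, const struct dev_cfg *cfg)
{
    return !s->sent_once || s->sent_generation != cfg->generation;
}
```

In this toy model, once the guest changes the queue-pair count after the
first send, the one-shot copy on the destination is stale at switchover,
while the resend loop catches up at the cost of a second send.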

Some explanation of "how syncing the number of vqueues helps downtime" would
help.  Not "it might preheat things", but exactly why, and how that differs
between the pure-software case and the case where hardware is involved.

If it's only about preheating, could dest QEMU preheat with the max number
of vqueues?  Is the downtime cost the same when growing the number of
queues vs. shrinking it?

For software, is it about memory transaction updates due to the vqueues?
If so, have we investigated a more generic approach on the memory side, likely
some form of continuation of Chuang's work that I previously mentioned?

-- 
Peter Xu

