On Thu, Sep 12, 2024 at 11:09:12AM +0300, Avihai Horon wrote:
> 
> On 09/09/2024 18:11, Peter Xu wrote:
> > External email: Use caution opening links or attachments
> > 
> > 
> > On Mon, Sep 09, 2024 at 03:52:39PM +0300, Avihai Horon wrote:
> > > On 05/09/2024 21:31, Peter Xu wrote:
> > > > External email: Use caution opening links or attachments
> > > > 
> > > > 
> > > > On Thu, Sep 05, 2024 at 07:45:43PM +0300, Avihai Horon wrote:
> > > > > > Does it also mean then that the currently reported stop-size - 
> > > > > > precopy-size
> > > > > > will be very close to the constant non-iterable data size?
> > > > > It's not constant, while the VM is running it can change.
> > > > I wonder how heavy is VFIO_DEVICE_FEATURE_MIG_DATA_SIZE ioctl.
> > > > 
> > > > I just gave it a quick shot with a busy VM migrating and estimate() is
> > > > invoked only every ~100ms.
> > > > 
> > > > VFIO might be different, but I wonder whether we can fetch stop-size in
> > > > estimate() somehow, so it's still a pretty fast estimate() meanwhile we
> > > > avoid the rest of exact() calls (which are destined to be useless 
> > > > without
> > > > VFIO).
> > > > 
> > > > IIUC so far the estimate()/exact() was because ram sync is heavy when
> > > > exact().  When idle it's 80+ms now for 32G VM with current master (which
> > > > has a bug and I'm fixing it up [1]..), even if after the fix it's 3ms (I
> > > > think both numbers contain dirty bitmap sync for both vfio and kvm).  
> > > > So in
> > > > that case maybe we can still try fetching stop-size only for both
> > > > estimate() and exact(), but only sync bitmap in exact().
> > > IIUC, the end goal is to prevent migration thread spinning uselessly in
> > > pre-copy in such scenarios, right?
> > > If eventually we do call get stop-copy-size in estimate(), we will move 
> > > the
> > > spinning from "exact() -> estimate() -> exact() -> estimate() ..." to
> > > "estimate() -> estimate() -> ...".
> > > If so, what benefit would we get from this? We only move the useless work 
> > > to
> > > other place.
> > We can avoid exact() calls invoked for other vmstate handlers, e.g. RAM,
> > which can be much heavier and can require BQL during the slow process,
> > which can further block more vcpu operations during migration.
> > 
> > And as mentioned previously, VFIO is, AFAIK, the only handler that provide
> > different definitions of estimate() and exact(), which can be confusing,
> > and it's against the "estimate() is the fast-path" logic.
> > 
> > But I agree it's not fundamentally changing much..
> > 
> > > Shouldn't we directly go for the non precopy-able vs precopy-able report
> > > that you suggested?
> > Yep, I just thought the previous one would be much easier to achieve.
> 
> Yes, though I prefer not to add the get stop-copy-size ioctl in the
> estimate() flow because: a) it's guaranteed to be called (possibly many
> times) in every migration (either well configured which is the probable case
> or misconfigured which spins), and b) because how "heavy" get stop-copy-size
> is may differ from VFIO device to the other.
> 
> Maybe I am being a bit overcautious here, but let's explore other options
> first :)

Nop; that's totally ok.

> 
> > And
> > as you said, VFIO is still pretty special that the user will need manual
> > involvements anyway to specify e.g. very large downtimes, so this condition
> > shouldn't be a major case to happen.  Said that, if you have a solid idea
> > on this please feel free to go ahead directly with a complete solution.
> 
> I think it's possible to do it with what we currently have (VFIO uAPI-wise),
> I will try to think of one.
> 
> BTW, I checked again and I think we should drop this line from
> vfio_state_pending_exact():
>     *must_precopy += migration->precopy_init_size +
> migration->precopy_dirty_size;
> 
> I can send a patch for that.

Yes please.

I also wonder whether it'll be good to update the uAPI header too in Linux,
for vfio_device_feature_mig_data_size or VFIO_DEVICE_FEATURE_MIG_DATA_SIZE.
Currently it reads a bit ambiguous to me, while it could make a huge
difference iiuc when precopy size is non-trivial.

/*
 * Upon VFIO_DEVICE_FEATURE_GET read back the estimated data length that will
 * be required to complete stop copy.
 *
 * Note: Can be called on each device state.
 */

Maybe that'll also be helpful when other drivers will implement this, then
just to make sure both side (user / kernel) are crystal clear on the
definition.

-- 
Peter Xu


Reply via email to