On Thu, Sep 12, 2024 at 11:09:12AM +0300, Avihai Horon wrote: > > On 09/09/2024 18:11, Peter Xu wrote: > > External email: Use caution opening links or attachments > > > > > > On Mon, Sep 09, 2024 at 03:52:39PM +0300, Avihai Horon wrote: > > > On 05/09/2024 21:31, Peter Xu wrote: > > > > External email: Use caution opening links or attachments > > > > > > > > > > > > On Thu, Sep 05, 2024 at 07:45:43PM +0300, Avihai Horon wrote: > > > > > > Does it also mean then that the currently reported stop-size - > > > > > > precopy-size > > > > > > will be very close to the constant non-iterable data size? > > > > > It's not constant, while the VM is running it can change. > > > > I wonder how heavy is VFIO_DEVICE_FEATURE_MIG_DATA_SIZE ioctl. > > > > > > > > I just gave it a quick shot with a busy VM migrating and estimate() is > > > > invoked only every ~100ms. > > > > > > > > VFIO might be different, but I wonder whether we can fetch stop-size in > > > > estimate() somehow, so it's still a pretty fast estimate() meanwhile we > > > > avoid the rest of exact() calls (which are destined to be useless > > > > without > > > > VFIO). > > > > > > > > IIUC so far the estimate()/exact() was because ram sync is heavy when > > > > exact(). When idle it's 80+ms now for 32G VM with current master (which > > > > has a bug and I'm fixing it up [1]..), even if after the fix it's 3ms (I > > > > think both numbers contain dirty bitmap sync for both vfio and kvm). > > > > So in > > > > that case maybe we can still try fetching stop-size only for both > > > > estimate() and exact(), but only sync bitmap in exact(). > > > IIUC, the end goal is to prevent migration thread spinning uselessly in > > > pre-copy in such scenarios, right? > > > If eventually we do call get stop-copy-size in estimate(), we will move > > > the > > > spinning from "exact() -> estimate() -> exact() -> estimate() ..." to > > > "estimate() -> estimate() -> ...". > > > If so, what benefit would we get from this? We only move the useless work > > > to > > > other place. > > We can avoid exact() calls invoked for other vmstate handlers, e.g. RAM, > > which can be much heavier and can require BQL during the slow process, > > which can further block more vcpu operations during migration. > > > > And as mentioned previously, VFIO is, AFAIK, the only handler that provide > > different definitions of estimate() and exact(), which can be confusing, > > and it's against the "estimate() is the fast-path" logic. > > > > But I agree it's not fundamentally changing much.. > > > > > Shouldn't we directly go for the non precopy-able vs precopy-able report > > > that you suggested? > > Yep, I just thought the previous one would be much easier to achieve. > > Yes, though I prefer not to add the get stop-copy-size ioctl in the > estimate() flow because: a) it's guaranteed to be called (possibly many > times) in every migration (either well configured which is the probable case > or misconfigured which spins), and b) because how "heavy" get stop-copy-size > is may differ from VFIO device to the other. > > Maybe I am being a bit overcautious here, but let's explore other options > first :)
Nop; that's totally ok. > > > And > > as you said, VFIO is still pretty special that the user will need manual > > involvements anyway to specify e.g. very large downtimes, so this condition > > shouldn't be a major case to happen. Said that, if you have a solid idea > > on this please feel free to go ahead directly with a complete solution. > > I think it's possible to do it with what we currently have (VFIO uAPI-wise), > I will try to think of one. > > BTW, I checked again and I think we should drop this line from > vfio_state_pending_exact(): > *must_precopy += migration->precopy_init_size + > migration->precopy_dirty_size; > > I can send a patch for that. Yes please. I also wonder whether it'll be good to update the uAPI header too in Linux, for vfio_device_feature_mig_data_size or VFIO_DEVICE_FEATURE_MIG_DATA_SIZE. Currently it reads a bit ambiguous to me, while it could make a huge difference iiuc when precopy size is non-trivial. /* * Upon VFIO_DEVICE_FEATURE_GET read back the estimated data length that will * be required to complete stop copy. * * Note: Can be called on each device state. */ Maybe that'll also be helpful when other drivers will implement this, then just to make sure both side (user / kernel) are crystal clear on the definition. -- Peter Xu