On Tue, 17 May 2022 13:08:44 -0300 Jason Gunthorpe <j...@nvidia.com> wrote:
> On Tue, May 17, 2022 at 10:00:45AM -0600, Alex Williamson wrote:
> > > This is really intended to be a NOP from where things are now, as if
> > > you use mlx5 live migration without a patch like this then it causes a
> > > botched pre-copy since everything just ends up permanently dirty.
> > >
> > > If it makes more sense we could abort the pre-copy too - in the end
> > > there will be dirty tracking so I don't know if I'd invest in a big
> > > adventure to fully define non-dirty tracking migration.
> >
> > How is pre-copy currently "botched" without a patch like this? If it's
> > simply that the pre-copy doesn't converge and the downtime constraints
> > don't allow the VM to enter stop-and-copy, that's the expected behavior
> > AIUI, and supports backwards compatibility with existing SLAs.
>
> It means it always fails - that certainly isn't working live
> migration. There is no point in trying to converge something that we
> already know will never converge.

If we eliminate the pre-copy phase then it's not so much live migration
anyway. Trying to converge is indeed useless work, but afaik it's that
useless work that generates the data that management tools can use to
determine that SLAs cannot be achieved in a compatible way.

> > I'm assuming that by setting this new skip_precopy flag that we're
> > forcing the VM to move to stop-and-copy, regardless of any other SLA
> > constraints placed on the migration.
>
> That does seem like a defect in this patch, any SLA constraints should
> still all be checked under the assumption all ram is dirty.

The migration iteration function certainly tries to compare remaining
bytes to a threshold based on bandwidth and downtime. The exit path
added here is the same as it would take if we had achieved our
threshold limit. It's not clear to me that we're checking the downtime
limit elsewhere, or that we have the data to do it if we don't transfer
anything to estimate bandwidth.
> > It seems like a better solution would be to expose to management
> > tools that the VM contains a device that does not support the
> > pre-copy phase so that downtime expectations can be adjusted.
>
> I don't expect this to be a real use case though..
>
> Remember, you asked for this patch when you wanted qemu to have good
> behavior when kernel support for legacy dirty tracking is removed
> before we merge v2 support.

Is wanting good behavior a controversial point? Did we define this as
the desired good behavior? Ref?

Thanks,
Alex