Peter Xu <pet...@redhat.com> writes:

> On Tue, Oct 17, 2023 at 04:12:40PM +0200, Markus Armbruster wrote:
>> Peter Xu <pet...@redhat.com> writes:
>>
>> > Migration bandwidth is a very important value to live migration. It's
>> > because it's one of the major factors that we'll make decision on when to
>> > switchover to destination in a precopy process.
>> >
>> > This value is currently estimated by QEMU during the whole live migration
>> > process by monitoring how fast we were sending the data. This can be the
>> > most accurate bandwidth if in the ideal world, where we're always feeding
>> > unlimited data to the migration channel, and then it'll be limited to the
>> > bandwidth that is available.
>> >
>> > However in reality it may be very different, e.g., over a 10Gbps network we
>> > can see query-migrate showing migration bandwidth of only a few tens of
>> > MB/s just because there are plenty of other things the migration thread
>> > might be doing. For example, the migration thread can be busy scanning
>> > zero pages, or it can be fetching dirty bitmap from other external dirty
>> > sources (like vhost or KVM). It means we may not be pushing data as much
>> > as possible to migration channel, so the bandwidth estimated from "how many
>> > data we sent in the channel" can be dramatically inaccurate sometimes.
>>
>> how much data we've sent to the channel
>>
>> >
>> > With that, the decision to switchover will be affected, by assuming that we
>> > may not be able to switchover at all with such a low bandwidth, but in
>> > reality we can.
>> >
>> > The migration may not even converge at all with the downtime specified,
>> > with that wrong estimation of bandwidth, keeping iterations forever with a
>>
>> iterating forever
>>
>> > low estimation of bandwidth.
>> >
>> > The issue is QEMU itself may not be able to avoid those uncertainties on
>> > measuing the real "available migration bandwidth". At least not something
>> > I can think of so far.
>> >
>> > One way to fix this is when the user is fully aware of the available
>> > bandwidth, then we can allow the user to help providing an accurate value.
>> >
>> > For example, if the user has a dedicated channel of 10Gbps for migration
>> > for this specific VM, the user can specify this bandwidth so QEMU can
>> > always do the calculation based on this fact, trusting the user as long as
>> > specified. It may not be the exact bandwidth when switching over (in which
>> > case qemu will push migration data as fast as possible), but much better
>> > than QEMU trying to wildly guess, especially when very wrong.
>> >
>> > A new parameter "avail-switchover-bandwidth" is introduced just for this.
>> > So when the user specified this parameter, instead of trusting the
>> > estimated value from QEMU itself (based on the QEMUFile send speed), it
>> > trusts the user more by using this value to decide when to switchover,
>> > assuming that we'll have such bandwidth available then.
>> >
>> > Note that specifying this value will not throttle the bandwidth for
>> > switchover yet, so QEMU will always use the full bandwidth possible for
>> > sending switchover data, assuming that should always be the most important
>> > way to use the network at that time.
>> >
>> > This can resolve issues like "unconvergence migration" which is caused by
>> > hilarious low "migration bandwidth" detected for whatever reason.
>>
>> "unconvergence" isn't a word :)
>>
>> Suggest "like migration not converging, because the automatically
>> detected migration bandwidth is hilariously low for whatever reason."
>>
>> Appreciate the thorough explanation!
>
> Thanks for reviewing!
>
> The patch is already in today's migration pull, so unfortunately no planned
> repost for now. I'll amend the commit message and collect the ACK if I'll
> need to redo it.

Didn't see the PR, and didn't expect it so soon.

>> > Reported-by: Zhiyi Guo <zh...@redhat.com>
>> > Reviewed-by: Joao Martins <joao.m.mart...@oracle.com>
>> > Signed-off-by: Peter Xu <pet...@redhat.com>
>> > ---
>> > v4:
>> > - Rebase to master, with duplicated documentations
>> > ---
>> >  qapi/migration.json            | 34 +++++++++++++++++++++++++++++++++-
>> >  migration/migration.h          |  2 +-
>> >  migration/options.h            |  1 +
>> >  migration/migration-hmp-cmds.c | 14 ++++++++++++++
>> >  migration/migration.c          | 24 +++++++++++++++++++++---
>> >  migration/options.c            | 28 ++++++++++++++++++++++++++++
>> >  migration/trace-events         |  2 +-
>> >  7 files changed, 99 insertions(+), 6 deletions(-)
>> >
>> > diff --git a/qapi/migration.json b/qapi/migration.json
>> > index 8843e74b59..0c897a99b1 100644
>> > --- a/qapi/migration.json
>> > +++ b/qapi/migration.json
>> > @@ -759,6 +759,16 @@
>> >  # @max-bandwidth: to set maximum speed for migration. maximum speed
>> >  #     in bytes per second. (Since 2.8)
>> >  #
>> > +# @avail-switchover-bandwidth: to set the available bandwidth that
>> > +#     migration can use during switchover phase. NOTE! This does not
>> > +#     limit the bandwidth during switchover, but only for calculations when
>> > +#     making decisions to switchover. By default, this value is zero,
>> > +#     which means QEMU will estimate the bandwidth automatically. This can
>> > +#     be set when the estimated value is not accurate, while the user is
>> > +#     able to guarantee such bandwidth is available when switching over.
>> > +#     When specified correctly, this can make the switchover decision much
>> > +#     more accurate. (Since 8.2)
>>
>> We tend to eschew abbreviations in QAPI schema identifiers.
>> available-switchover-bandwidth is a mouthful, though. What do you
>> think?
>
> The named changed in the past versions, and IIRC avail-switchover-bandwidth
> is something we came up with last.. assuming a trade-off between length and
> sane meanings. I don't have anything better to come up.. :)
>
> Please shoot if you have better suggestions. We still have three weeks to
> 8.2 soft freeze.

I just had a similar conversation with Vladimir for "[PATCH 2/4] qapi:
introduce device-sync-config". Quoting myself:

In the words of Captain Barbossa, it's "more what you'd call
'guidelines' than actual rules."

I didn't come up with the "avoid abbreviations" stylistic guideline. I
inherited it. I do like consistent style. I don't like excessively
long names. Sometimes these likes conflict, and we need to pick.

For what it's worth, there's precedent for "avail" in the schema.

Let's move on.
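
For readers following along, here is a rough sketch of the switchover
decision the commit message describes. It is not the code from the
patch: the function names are hypothetical, and it assumes bandwidth is
given in bytes per second and the downtime limit in milliseconds. The
only behaviour taken from the patch description is that a value of zero
means "let QEMU estimate", and that a non-zero value replaces the
estimate for the calculation only (it does not throttle anything).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not QEMU's actual code: pick the bandwidth (in
 * bytes per second) used for the switchover decision.  A user-supplied
 * value of zero means "not set", so the estimate is used instead.
 */
uint64_t switchover_bandwidth(uint64_t estimated_bw,
                              uint64_t avail_switchover_bw)
{
    return avail_switchover_bw ? avail_switchover_bw : estimated_bw;
}

/*
 * Largest amount of remaining data (in bytes) that can still be sent
 * within the allowed downtime at the given bandwidth.
 */
uint64_t switchover_threshold(uint64_t bw_bytes_per_sec,
                              uint64_t downtime_limit_ms)
{
    assert(downtime_limit_ms > 0);
    /* Convert to bytes per millisecond first to keep the product small. */
    return bw_bytes_per_sec / 1000 * downtime_limit_ms;
}

/* Switch over once the remaining dirty data fits under the threshold. */
int can_switchover(uint64_t pending_bytes, uint64_t threshold)
{
    return pending_bytes <= threshold;
}
```

With illustrative numbers: an estimate of 30 MB/s and a 300 ms downtime
limit gives a threshold of 9 MB, so a guest that keeps about 100 MB dirty
would never be allowed to switch over. Telling QEMU that 10 Gbps
(1,250,000,000 bytes per second) is available raises the threshold to
375 MB, and the same guest can switch over right away.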