On Wed, Aug 9, 2023 at 9:59 PM <gudkov.and...@huawei.com> wrote:

> On Sun, Aug 06, 2023 at 02:31:43PM +0800, Yong Huang wrote:
> > On Sat, Aug 5, 2023 at 2:05 AM Markus Armbruster <arm...@redhat.com> wrote:
> >
> > > Andrei Gudkov <gudkov.and...@huawei.com> writes:
> > >
> > > > Introduces alternative argument calc-time-ms, which is the
> > > > the same as calc-time but accepts millisecond value.
> > > > Millisecond granularity allows to make predictions whether
> > > > migration will succeed or not. To do this, calculate dirty
> > > > rate with calc-time-ms set to max allowed downtime, convert
> > > > measured rate into volume of dirtied memory, and divide by
> > > > network throughput. If the value is lower than max allowed
> > > > downtime, then migration will converge.
> > > >
> > > > Measurement results for single thread randomly writing to
> > > > a 1/4/24GiB memory region:
> > > >
> > > > +--------------+-----------------------------------------------+
> > > > | calc-time-ms |                dirty rate MiB/s               |
> > > > |              +----------------+---------------+--------------+
> > > > |              | theoretical    | page-sampling | dirty-bitmap |
> > > > |              | (at 3M wr/sec) |               |              |
> > > > +--------------+----------------+---------------+--------------+
> > > > |                             1GiB                             |
> > > > +--------------+----------------+---------------+--------------+
> > > > |          100 |           6996 |          7100 |         3192 |
> > > > |          200 |           4606 |          4660 |         2655 |
> > > > |          300 |           3305 |          3280 |         2371 |
> > > > |          400 |           2534 |          2525 |         2154 |
> > > > |          500 |           2041 |          2044 |         1871 |
> > > > |          750 |           1365 |          1341 |         1358 |
> > > > |         1000 |           1024 |          1052 |         1025 |
> > > > |         1500 |            683 |           678 |          684 |
> > > > |         2000 |            512 |           507 |          513 |
> > > > +--------------+----------------+---------------+--------------+
> > > > |                             4GiB                             |
> > > > +--------------+----------------+---------------+--------------+
> > > > |          100 |          10232 |          8880 |         4070 |
> > > > |          200 |           8954 |          8049 |         3195 |
> > > > |          300 |           7889 |          7193 |         2881 |
> > > > |          400 |           6996 |          6530 |         2700 |
> > > > |          500 |           6245 |          5772 |         2312 |
> > > > |          750 |           4829 |          4586 |         2465 |
> > > > |         1000 |           3865 |          3780 |         2178 |
> > > > |         1500 |           2694 |          2633 |         2004 |
> > > > |         2000 |           2041 |          2031 |         1789 |
> > > > +--------------+----------------+---------------+--------------+
> > > > |                            24GiB                             |
> > > > +--------------+----------------+---------------+--------------+
> > > > |          100 |          11495 |          8640 |         5597 |
> > > > |          200 |          11226 |          8616 |         3527 |
> > > > |          300 |          10965 |          8386 |         2355 |
> > > > |          400 |          10713 |          8370 |         2179 |
> > > > |          500 |          10469 |          8196 |         2098 |
> > > > |          750 |           9890 |          7885 |         2556 |
> > > > |         1000 |           9354 |          7506 |         2084 |
> > > > |         1500 |           8397 |          6944 |         2075 |
> > > > |         2000 |           7574 |          6402 |         2062 |
> > > > +--------------+----------------+---------------+--------------+
> > > >
> > > > Theoretical values are computed according to the following formula:
> > > > size * (1 - (1-(4096/size))^(time*wps)) / (time * 2^20),
> > > > where size is in bytes, time is in seconds, and wps is number of
> > > > writes per second.
> > > >
> > > > Signed-off-by: Andrei Gudkov <gudkov.and...@huawei.com>
> > > > ---
> > > >  qapi/migration.json   | 14 ++++++--
> > > >  migration/dirtyrate.h | 12 ++++---
> > > >  migration/dirtyrate.c | 81 +++++++++++++++++++++++++------------------
> > > >  3 files changed, 67 insertions(+), 40 deletions(-)
> > > >
> > > > diff --git a/qapi/migration.json b/qapi/migration.json
> > > > index 8843e74b59..82493d6a57 100644
> > > > --- a/qapi/migration.json
> > > > +++ b/qapi/migration.json
> > > > @@ -1849,7 +1849,11 @@
> > > >  # @start-time: start time in units of second for calculation
> > > >  #
> > > >  # @calc-time: time period for which dirty page rate was measured
> > > > -#     (in seconds)
> > > > +#     (rounded down to seconds).
> > > > +#
> > > > +# @calc-time-ms: actual time period for which dirty page rate was
> > > > +#     measured (in milliseconds).  Value may be larger than requested
> > > > +#     time period due to measurement overhead.
> > > >  #
> > > >  # @sample-pages: number of sampled pages per GiB of guest memory.
> > > >  #     Valid only in page-sampling mode (Since 6.1)
> > > > @@ -1866,6 +1870,7 @@
> > > >             'status': 'DirtyRateStatus',
> > > >             'start-time': 'int64',
> > > >             'calc-time': 'int64',
> > > > +           'calc-time-ms': 'int64',
> > > >             'sample-pages': 'uint64',
> > > >             'mode': 'DirtyRateMeasureMode',
> > > >             '*vcpu-dirty-rate': [ 'DirtyRateVcpu' ] } }
> > > > @@ -1908,6 +1913,10 @@
> > > >  #     dirty during @calc-time period, further writes to this page will
> > > >  #     not increase dirty page rate anymore.
> > > >  #
> > > > +# @calc-time-ms: the same as @calc-time but in milliseconds.  These
> > > > +#     two arguments are mutually exclusive.  Exactly one of them must
> > > > +#     be specified. (Since 8.1)
> > > > +#
> > > >  # @sample-pages: number of sampled pages per each GiB of guest memory.
> > > >  #     Default value is 512.  For 4KiB guest pages this corresponds to
> > > >  #     sampling ratio of 0.2%.  This argument is used only in page
> > > > @@ -1925,7 +1934,8 @@
> > > >  #                                     'sample-pages': 512} }
> > > >  # <- { "return": {} }
> > > >  ##
> > > > -{ 'command': 'calc-dirty-rate', 'data': {'calc-time': 'int64',
> > > > +{ 'command': 'calc-dirty-rate', 'data': {'*calc-time': 'int64',
> > > > +                                         '*calc-time-ms': 'int64',
> > > >                                           '*sample-pages': 'int',
> > > >                                           '*mode': 'DirtyRateMeasureMode'} }
> > >
> > > Having both @calc-time and @calc-time-ms is ugly.
> > >
> > > Can we deprecate @calc-time?
> > >
> > Since the upper app Libvirt has used this field to implement
> > the virDomainStartDirtyRateCalc API unfortunately.
> > Deprecating this requires the extra patch on Libvirt but no
> > functional improvement, IMHO, the field could remain untouched.
> >
> > >
> > > I don't like the name @calc-time-ms.  We don't put units in names
> > > elsewhere.
> > >
> > > Differently ugly: new member containing the fractional part, i.e. time
> > > in seconds = calc-time + fractional-part / 1000.  With a better name, of
> > > course.
> > >
> > > [...]
> > >
> >
> >
> As another alternative I can propose to add an optional field that specifies time unit.
>
> Initiate dirty page rate measurements for 300ms period:
> {"execute": "calc-dirty-rate",
>  "arguments":{"calc-time": 300, "time-unit": "millis"}}
>
> Query dirty rate. Report calc-time in milliseconds:
> {"execute": "query-dirty-rate",
>  "arguments":{"time-unit": "millis"}}
>
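
For reference, here is a minimal Python sketch of how a client might build the command proposed above and then apply the convergence check and the theoretical-rate formula from the quoted commit message. The "time-unit" field and its "millis" value are only the proposal quoted above, not an interface assumed to exist in any QEMU release. The helper names (calc_dirty_rate_cmd, theoretical_rate_mib_s, will_converge) are made up for illustration.

```python
import json

PAGE_SIZE = 4096  # bytes; the 4096 used in the quoted formula


def calc_dirty_rate_cmd(calc_time, time_unit=None):
    """Build a calc-dirty-rate QMP command as a JSON string.

    "time-unit" is the optional field proposed in this thread (an
    assumption, not an existing interface). When it is omitted, the
    command keeps its current form with calc-time in seconds.
    """
    args = {"calc-time": calc_time}
    if time_unit is not None:
        args["time-unit"] = time_unit
    return json.dumps({"execute": "calc-dirty-rate", "arguments": args})


def theoretical_rate_mib_s(size, time_s, wps):
    """Formula quoted from the commit message.

    size is in bytes, time_s in seconds, wps in writes per second.
    Returns the expected dirty rate in MiB/s.
    """
    touched = 1 - (1 - PAGE_SIZE / size) ** (time_s * wps)
    return size * touched / (time_s * 2**20)


def will_converge(dirty_rate_mib_s, downtime_ms, throughput_mib_s):
    """Convergence check as described in the commit message.

    dirty_rate_mib_s must have been measured over a window equal to
    downtime_ms; throughput_mib_s is the network throughput.
    """
    # Convert the measured rate into the volume dirtied in one window.
    dirtied_mib = dirty_rate_mib_s * downtime_ms / 1000.0
    # Time needed to send that volume over the network.
    transfer_ms = dirtied_mib / throughput_mib_s * 1000.0
    return transfer_ms < downtime_ms


if __name__ == "__main__":
    print(calc_dirty_rate_cmd(300, "millis"))
    print(round(theoretical_rate_mib_s(2**30, 1.0, 3e6)))
    # 2500 MiB/s is a hypothetical link throughput.
    print(will_converge(2000, 300, 2500))
```

Arithmetically the check reduces to comparing the dirty rate with the throughput. The millisecond window still matters because the measured rate depends on the window length, as the table in the commit message shows.
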

This sounds good and is compatible with the old API. Thanks!

>
> > Thanks,
> >
> > Yong
> > --
> > Best regards
> yong.

--
Best regards