On 28.06.2016 17:28, Sascha Silbe wrote: > ratelimit_calculate_delay() previously reset the accounting every time > slice, no matter how much data had been processed before. This had (at > least) two consequences: > > 1. The minimum speed is rather large, e.g. 5 MiB/s for commit and stream. > > Not sure if there are real-world use cases where this would be a > problem. Mirroring and backup over a slow link (e.g. DSL) would > come to mind, though. > > 2. Tests for block job operations (e.g. cancel) were rather racy > > All block jobs currently use a time slice of 100ms. That's a > reasonable value to get smooth output during regular > operation. However this also meant that the state of block jobs > changed every 100ms, no matter how low the configured limit was. On > busy hosts, qemu often transferred additional chunks until the test > case had a chance to cancel the job. > > Fix the block job rate limit code to delay for more than one time > slice to address the above issues. To make it easier to handle > oversized chunks we switch the semantics from returning a delay > _before_ the current request to a delay _after_ the current > request. If necessary, this delay consists of multiple time slice > units. > > Since the mirror job sends multiple chunks in one go even if the rate > limit was exceeded in between, we need to keep track of the start of > the current time slice so we can correctly re-compute the delay for > the updated amount of data. > > The minimum bandwidth now is 1 data unit per time slice. The block > jobs are currently passing the amount of data transferred in sectors > and using 100ms time slices, so this translates to 5120 > bytes/second. With chunk sizes usually being O(512KiB), tests have > plenty of time (O(100s)) to operate on block jobs. The chance of a > race condition now is fairly remote, except possibly on insanely > loaded systems. > > Signed-off-by: Sascha Silbe <si...@linux.vnet.ibm.com> > --- > block/commit.c | 13 +++++-------- > block/mirror.c | 4 +++- > block/stream.c | 12 ++++-------- > include/qemu/ratelimit.h | 43 ++++++++++++++++++++++++++++++++++--------- > 4 files changed, 46 insertions(+), 26 deletions(-) >
[...] > diff --git a/block/mirror.c b/block/mirror.c > index a04ed9c..d9d70f0 100644 > --- a/block/mirror.c > +++ b/block/mirror.c > @@ -416,7 +416,9 @@ static uint64_t coroutine_fn > mirror_iteration(MirrorBlockJob *s) > assert(io_sectors); > sector_num += io_sectors; > nb_chunks -= DIV_ROUND_UP(io_sectors, sectors_per_chunk); > - delay_ns += ratelimit_calculate_delay(&s->limit, io_sectors); > + if (s->common.speed) { > + delay_ns = ratelimit_calculate_delay(&s->limit, io_sectors); > + } Hmm... Was it a bug that ratelimit_calculate_delay() was called unconditionally before? > } > return delay_ns; > } [...] > diff --git a/include/qemu/ratelimit.h b/include/qemu/ratelimit.h > index d413a4a..12db769 100644 > --- a/include/qemu/ratelimit.h > +++ b/include/qemu/ratelimit.h > @@ -15,34 +15,59 @@ > #define QEMU_RATELIMIT_H 1 > > typedef struct { > - int64_t next_slice_time; > + int64_t slice_start_time; > + int64_t slice_end_time; > uint64_t slice_quota; > uint64_t slice_ns; > uint64_t dispatched; > } RateLimit; > > +/** Calculate and return delay for next request in ns > + * > + * Record that we sent @p n data units. If we may send more data units > + * in the current time slice, return 0 (i.e. no delay). Otherwise > + * return the amount of time (in ns) until the start of the next time > + * slice that will permit sending the next chunk of data. > + * > + * Recording sent data units even after exceeding the quota is > + * permitted; the time slice will be extended accordingly. > + */ > static inline int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n) > { > int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); > + uint64_t delay_slices; > > - if (limit->next_slice_time < now) { > - limit->next_slice_time = now + limit->slice_ns; > + assert(limit->slice_quota && limit->slice_ns); > + > + if (limit->slice_end_time < now) { > + /* Previous, possibly extended, time slice finished; reset the > + * accounting. */ > + limit->slice_start_time = now; > + limit->slice_end_time = now + limit->slice_ns; > limit->dispatched = 0; > } > - if (limit->dispatched == 0 || limit->dispatched + n <= > limit->slice_quota) { > - limit->dispatched += n; > + > + limit->dispatched += n; > + if (limit->dispatched < limit->slice_quota) { Nitpick: This should probably stay <=. > + /* We may send further data within the current time slice, no > + * need to delay the next request. */ > return 0; > - } else { > - limit->dispatched = n; > - return limit->next_slice_time - now; > } > + > + /* Quota exceeded. Calculate the next time slice we may start > + * sending data again. */ > + delay_slices = (limit->dispatched + limit->slice_quota - 1) / > + limit->slice_quota; > + limit->slice_end_time = limit->slice_start_time + > + delay_slices * limit->slice_ns; I think it would make sense to make this a floating point calculation. If you don't agree, though: Reviewed-by: Max Reitz <mre...@redhat.com> > + return limit->slice_end_time - now; > } > > static inline void ratelimit_set_speed(RateLimit *limit, uint64_t speed, > uint64_t slice_ns) > { > limit->slice_ns = slice_ns; > - limit->slice_quota = ((double)speed * slice_ns)/1000000000ULL; > + limit->slice_quota = MAX(((double)speed * slice_ns) / 1000000000ULL, 1); > } > > #endif >
signature.asc
Description: OpenPGP digital signature