On Thu, Nov 3, 2011 at 4:03 PM, Kevin Wolf <kw...@redhat.com> wrote:
> Am 03.11.2011 04:18, schrieb Zhi Yong Wu:
>> On Wed, Nov 2, 2011 at 7:51 PM, Kevin Wolf <kw...@redhat.com> wrote:
>>> Am 02.11.2011 07:01, schrieb Zhi Yong Wu:
>>>> Signed-off-by: Zhi Yong Wu <wu...@linux.vnet.ibm.com>
>>>> Signed-off-by: Stefan Hajnoczi <stefa...@linux.vnet.ibm.com>
>>>> ---
>>>>  block.c               |  233 +++++++++++++++++++++++++++++++++++++++++++++++--
>>>>  block.h               |    1 +
>>>>  block_int.h           |    1 +
>>>>  qemu-coroutine-lock.c |    8 ++
>>>>  qemu-coroutine.h      |    6 ++
>>>>  5 files changed, 242 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/block.c b/block.c
>>>> index c70f86d..440e889 100644
>>>> --- a/block.c
>>>> +++ b/block.c
>>>> @@ -74,6 +74,13 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
>>>>                                                 bool is_write);
>>>>  static void coroutine_fn bdrv_co_do_rw(void *opaque);
>>>>
>>>> +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors,
>>>> +    bool is_write, double elapsed_time, uint64_t *wait);
>>>> +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write,
>>>> +    double elapsed_time, uint64_t *wait);
>>>> +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors,
>>>> +    bool is_write, int64_t *wait);
>>>> +
>>>>  static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
>>>>      QTAILQ_HEAD_INITIALIZER(bdrv_states);
>>>>
>>>> @@ -107,6 +114,28 @@ int is_windows_drive(const char *filename)
>>>>  #endif
>>>>
>>>>  /* throttling disk I/O limits */
>>>> +void bdrv_io_limits_disable(BlockDriverState *bs)
>>>> +{
>>>> +    bs->io_limits_enabled = false;
>>>> +
>>>> +    if (!qemu_co_queue_empty(&bs->throttled_reqs)) {
>>>
>>> This if is unnecessary, you can just always run the while loop.
>> Good catch. Removed.
>>>
>>>> +        while (qemu_co_queue_next(&bs->throttled_reqs));
>>>> +    }
>>>> +
>>>> +    qemu_co_queue_init(&bs->throttled_reqs);
>>>
>>> Why?
>> Removed.
>>>
>>>> +
>>>> +    if (bs->block_timer) {
>>>> +        qemu_del_timer(bs->block_timer);
>>>> +        qemu_free_timer(bs->block_timer);
>>>> +        bs->block_timer = NULL;
>>>> +    }
>>>> +
>>>> +    bs->slice_start = 0;
>>>> +    bs->slice_end = 0;
>>>> +    bs->slice_time = 0;
>>>> +    memset(&bs->io_disps, 0, sizeof(bs->io_disps));
>>>> +}
>>>> +
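With both of the changes above folded in (no emptiness check, no re-init of the queue), the disable path would reduce to roughly the following. This is an untested sketch built only from the APIs already used in the patch:

void bdrv_io_limits_disable(BlockDriverState *bs)
{
    bs->io_limits_enabled = false;

    /* Restart all queued requests; qemu_co_queue_next() returns false
     * once the queue is empty, so no separate emptiness check is needed. */
    while (qemu_co_queue_next(&bs->throttled_reqs)) {
        ;
    }

    if (bs->block_timer) {
        qemu_del_timer(bs->block_timer);
        qemu_free_timer(bs->block_timer);
        bs->block_timer = NULL;
    }

    bs->slice_start = 0;
    bs->slice_end = 0;
    bs->slice_time = 0;
    memset(&bs->io_disps, 0, sizeof(bs->io_disps));
}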
>>>>  static void bdrv_block_timer(void *opaque)
>>>>  {
>>>>      BlockDriverState *bs = opaque;
>>>>
>>>> @@ -116,14 +145,13 @@ static void bdrv_block_timer(void *opaque)
>>>>
>>>>  void bdrv_io_limits_enable(BlockDriverState *bs)
>>>>  {
>>>> -    bs->io_limits_enabled = true;
>>>>      qemu_co_queue_init(&bs->throttled_reqs);
>>>> -
>>>> -    bs->block_timer = qemu_new_timer_ns(vm_clock, bdrv_block_timer, bs);
>>>> -    bs->slice_time = 5 * BLOCK_IO_SLICE_TIME;
>>>> -    bs->slice_start = qemu_get_clock_ns(vm_clock);
>>>> -    bs->slice_end = bs->slice_start + bs->slice_time;
>>>> +    bs->block_timer = qemu_new_timer_ns(vm_clock, bdrv_block_timer, bs);
>>>> +    bs->slice_time = 5 * BLOCK_IO_SLICE_TIME;
>>>> +    bs->slice_start = qemu_get_clock_ns(vm_clock);
>>>> +    bs->slice_end = bs->slice_start + bs->slice_time;
>>>
>>> You're only changing whitespace here, right? If so, can you please use
>>> the final version in patch 1?
>> OK
>>>
>>>>      memset(&bs->io_disps, 0, sizeof(bs->io_disps));
>>>> +    bs->io_limits_enabled = true;
>>>>  }
>>>>
>>>>  bool bdrv_io_limits_enabled(BlockDriverState *bs)
>>>> @@ -137,6 +165,30 @@ bool bdrv_io_limits_enabled(BlockDriverState *bs)
>>>>          || io_limits->iops[BLOCK_IO_LIMIT_TOTAL];
>>>>  }
>>>>
>>>> +static void bdrv_io_limits_intercept(BlockDriverState *bs,
>>>> +                                     bool is_write, int nb_sectors)
>>>> +{
>>>> +    int64_t wait_time = -1;
>>>> +
>>>> +    if (!qemu_co_queue_empty(&bs->throttled_reqs)) {
>>>> +        qemu_co_queue_wait(&bs->throttled_reqs);
>>>> +        goto resume;
>>>> +    } else if (bdrv_exceed_io_limits(bs, nb_sectors, is_write, &wait_time)) {
>>>> +        qemu_mod_timer(bs->block_timer,
>>>> +                       wait_time + qemu_get_clock_ns(vm_clock));
>>>> +        qemu_co_queue_wait(&bs->throttled_reqs);
>>>> +
>>>> +resume:
>>>
>>> This goto construct isn't very nice. You could have avoided it with an
>>> "else return;"
>> else return? No, I don't think it can simply return there.
>>>
>>> But anyway, why do you even need the else if? Can't you directly use the
>>> while loop? The difference would be that qemu_co_queue_next() is run
>>> even if a request is allowed without going through the queue, but would
>>> that hurt?
>> Great point, thanks.
>>>
>>>
>>>> +        while (bdrv_exceed_io_limits(bs, nb_sectors, is_write, &wait_time)) {
>>>> +            qemu_mod_timer(bs->block_timer,
>>>> +                           wait_time + qemu_get_clock_ns(vm_clock));
>>>> +            qemu_co_queue_wait_insert_head(&bs->throttled_reqs);
>>>
>>> Why do you want to insert at the head? Wouldn't a queue be more
>> In fact, we want to preserve each request's ordering, in FIFO mode. The
>> next throttled request will not be dequeued until the current request is
>> allowed to be serviced. So if the current request still exceeds the
>> limits, it is inserted at the head, and all requests behind it stay in
>> the throttled_reqs queue.
>
> Ah, now this makes a lot more sense. Can you add a comment stating
> exactly this?
Sure.
>
> Of course this means that you still need the separate if, because the
> first time you want to queue the coroutine at the end. But you can still
> get rid of the goto by making the part starting with the while loop
> unconditional.
Right, I have applied this policy in the next version.
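Putting Kevin's suggestions together (keep the initial if so a new request queues at the tail, make the while loop unconditional, and add the FIFO comment), the intercept function might look roughly like the sketch below. The trailing qemu_co_queue_next() call is implied by Kevin's remark about it running even for unthrottled requests; the actual next version may differ:

static void bdrv_io_limits_intercept(BlockDriverState *bs,
                                     bool is_write, int nb_sectors)
{
    int64_t wait_time = -1;

    /* A new request goes to the tail so that FIFO ordering is kept
     * across all throttled requests. */
    if (!qemu_co_queue_empty(&bs->throttled_reqs)) {
        qemu_co_queue_wait(&bs->throttled_reqs);
    }

    /* If the request still exceeds the limits when it is woken up, it
     * is requeued at the head: requests behind it must not overtake it,
     * so they stay in throttled_reqs until this one can be serviced. */
    while (bdrv_exceed_io_limits(bs, nb_sectors, is_write, &wait_time)) {
        qemu_mod_timer(bs->block_timer,
                       wait_time + qemu_get_clock_ns(vm_clock));
        qemu_co_queue_wait_insert_head(&bs->throttled_reqs);
    }

    /* Wake the next queued request; it re-checks the limits itself. */
    qemu_co_queue_next(&bs->throttled_reqs);
}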
>
> Kevin
>

--
Regards,

Zhi Yong Wu