12.08.2019 19:11, Max Reitz wrote: > On 12.08.19 17:47, Vladimir Sementsov-Ogievskiy wrote: >> 12.08.2019 18:10, Max Reitz wrote: >>> On 10.08.19 21:31, Vladimir Sementsov-Ogievskiy wrote: >>>> backup_cow_with_offload can transfer more than one cluster. Let >>>> backup_cow_with_bounce_buffer behave similarly. It reduces the number >>>> of IO requests, since there is no need to copy cluster by cluster. >>>> >>>> Logic around bounce_buffer allocation changed: we can't just allocate >>>> one-cluster-sized buffer to share for all iterations. We can't also >>>> allocate buffer of full-request length it may be too large, so >>>> BACKUP_MAX_BOUNCE_BUFFER is introduced. And finally, allocation logic >>>> is to allocate a buffer sufficient to handle all remaining iterations >>>> at the point where we need the buffer for the first time. >>>> >>>> Bonus: get rid of pointer-to-pointer. >>>> >>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> >>>> --- >>>> block/backup.c | 65 +++++++++++++++++++++++++++++++------------------- >>>> 1 file changed, 41 insertions(+), 24 deletions(-) >>>> >>>> diff --git a/block/backup.c b/block/backup.c >>>> index d482d93458..65f7212c85 100644 >>>> --- a/block/backup.c >>>> +++ b/block/backup.c >>>> @@ -27,6 +27,7 @@ >>>> #include "qemu/error-report.h" >>>> >>>> #define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16) >>>> +#define BACKUP_MAX_BOUNCE_BUFFER (64 * 1024 * 1024) >>>> >>>> typedef struct CowRequest { >>>> int64_t start_byte; >>>> @@ -98,44 +99,55 @@ static void cow_request_end(CowRequest *req) >>>> qemu_co_queue_restart_all(&req->wait_queue); >>>> } >>>> >>>> -/* Copy range to target with a bounce buffer and return the bytes copied. >>>> If >>>> - * error occurred, return a negative error number */ >>>> +/* >>>> + * Copy range to target with a bounce buffer and return the bytes copied. >>>> If >>>> + * error occurred, return a negative error number >>>> + * >>>> + * @bounce_buffer is assumed to enough to store >>> >>> s/to/to be/ >>> >>>> + * MIN(BACKUP_MAX_BOUNCE_BUFFER, @end - @start) bytes >>>> + */ >>>> static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob >>>> *job, >>>> int64_t start, >>>> int64_t end, >>>> bool >>>> is_write_notifier, >>>> bool >>>> *error_is_read, >>>> - void >>>> **bounce_buffer) >>>> + void *bounce_buffer) >>>> { >>>> int ret; >>>> BlockBackend *blk = job->common.blk; >>>> - int nbytes; >>>> + int nbytes, remaining_bytes; >>>> int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0; >>>> >>>> assert(QEMU_IS_ALIGNED(start, job->cluster_size)); >>>> - bdrv_reset_dirty_bitmap(job->copy_bitmap, start, job->cluster_size); >>>> - nbytes = MIN(job->cluster_size, job->len - start); >>>> - if (!*bounce_buffer) { >>>> - *bounce_buffer = blk_blockalign(blk, job->cluster_size); >>>> - } >>>> + bdrv_reset_dirty_bitmap(job->copy_bitmap, start, end - start); >>>> + nbytes = MIN(end - start, job->len - start); >>>> >>>> - ret = blk_co_pread(blk, start, nbytes, *bounce_buffer, read_flags); >>>> - if (ret < 0) { >>>> - trace_backup_do_cow_read_fail(job, start, ret); >>>> - if (error_is_read) { >>>> - *error_is_read = true; >>>> + >>>> + remaining_bytes = nbytes; >>>> + while (remaining_bytes) { >>>> + int chunk = MIN(BACKUP_MAX_BOUNCE_BUFFER, remaining_bytes); >>>> + >>>> + ret = blk_co_pread(blk, start, chunk, bounce_buffer, read_flags); >>>> + if (ret < 0) { >>>> + trace_backup_do_cow_read_fail(job, start, ret); >>>> + if (error_is_read) { >>>> + *error_is_read = true; >>>> + } >>>> + goto fail; >>>> } >>>> - goto fail; >>>> - } >>>> >>>> - ret = blk_co_pwrite(job->target, start, nbytes, *bounce_buffer, >>>> - job->write_flags); >>>> - if (ret < 0) { >>>> - trace_backup_do_cow_write_fail(job, start, ret); >>>> - if (error_is_read) { >>>> - *error_is_read = false; >>>> + ret = blk_co_pwrite(job->target, start, chunk, bounce_buffer, >>>> + job->write_flags); >>>> + if (ret < 0) { >>>> + trace_backup_do_cow_write_fail(job, start, ret); >>>> + if (error_is_read) { >>>> + *error_is_read = false; >>>> + } >>>> + goto fail; >>>> } >>>> - goto fail; >>>> + >>>> + start += chunk; >>>> + remaining_bytes -= chunk; >>>> } >>>> >>>> return nbytes; >>>> @@ -301,9 +313,14 @@ static int coroutine_fn backup_do_cow(BackupBlockJob >>>> *job, >>>> } >>>> } >>>> if (!job->use_copy_range) { >>>> + if (!bounce_buffer) { >>>> + size_t len = MIN(BACKUP_MAX_BOUNCE_BUFFER, >>>> + MAX(dirty_end - start, end - dirty_end)); >>>> + bounce_buffer = blk_try_blockalign(job->common.blk, len); >>>> + } >>> >>> If you use _try_, you should probably also check whether it succeeded. >> >> Oops, you are right, of course. >> >>> >>> Anyway, I wonder whether it’d be better to just allocate this buffer >>> once per job (the first time we get here, probably) to be of size >>> BACKUP_MAX_BOUNCE_BUFFER and put it into BackupBlockJob. (And maybe add >>> a buf-size parameter similar to what the mirror jobs have.) >>> >> >> Once per job will not work, as we may have several guest writes in parallel >> and therefore >> several parallel copy-before-write operations. > > Hm. I’m not quite happy with that because if the guest just issues many > large discards in parallel, this means that qemu will allocate a large > amount of memory. > > It would be nice if there was a simple way to keep track of the total > memory usage and let requests yield if they would exceed it.
Agree, it should be fixed anyway. > >> Or if you mean writing an allocator based >> on once-allocated buffer like in mirror, I really dislike this idea, as we >> already have >> allocator: memalign/malloc/free and it works well, no reason to invent a new >> one and >> hardcode it into block layer (look at my answer to Eric on v2 of this patch >> for more info). > > Well, at least it’d be something we can delay until blockdev-copy > arrives(TM). > > Max > >> Or, if you mean only backup_loop generated copy-requests, yes we may keep >> only one buffer for them, >> but: >> 1. it is not how it works now, so my patch is not a degradation in this case >> 2. I'm going to parallelize backup loop too, like my series "qcow2: async >> handling of fragmented io", >> which will need several allocated buffers anyway. >> >>> >>>> ret = backup_cow_with_bounce_buffer(job, start, dirty_end, >>>> is_write_notifier, >>>> - error_is_read, >>>> &bounce_buffer); >>>> + error_is_read, >>>> bounce_buffer); >>>> } >>>> if (ret < 0) { >>>> break; >>>> >>> >>> >> >> > > -- Best regards, Vladimir