backup: teach backup_cow_with_bounce_buffer to copy more at once

Vladimir Sementsov-Ogievskiy Mon, 12 Aug 2019 09:37:38 -0700

12.08.2019 19:11, Max Reitz wrote:
> On 12.08.19 17:47, Vladimir Sementsov-Ogievskiy wrote:
>> 12.08.2019 18:10, Max Reitz wrote:
>>> On 10.08.19 21:31, Vladimir Sementsov-Ogievskiy wrote:
>>>> backup_cow_with_offload can transfer more than one cluster. Let
>>>> backup_cow_with_bounce_buffer behave similarly. It reduces the number
>>>> of IO requests, since there is no need to copy cluster by cluster.
>>>>
>>>> Logic around bounce_buffer allocation changed: we can't just allocate
>>>> a one-cluster-sized buffer to share across all iterations. We also
>>>> can't allocate a buffer of full-request length, as it may be too large,
>>>> so BACKUP_MAX_BOUNCE_BUFFER is introduced. Finally, the allocation
>>>> logic is to allocate a buffer sufficient to handle all remaining
>>>> iterations at the point where we need the buffer for the first time.
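
To illustrate the chunked copy described in the commit message, here is a
minimal standalone sketch. It is not the block/backup.c code: memcpy stands
in for blk_co_pread/blk_co_pwrite, and DEMO_MAX_BOUNCE_BUFFER, DEMO_MIN and
demo_copy_with_bounce_buffer are hypothetical names with a tiny cap chosen
only for the demo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for BACKUP_MAX_BOUNCE_BUFFER, tiny for the demo. */
#define DEMO_MAX_BOUNCE_BUFFER 8

#define DEMO_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Copy len bytes from src to dst through a bounce buffer that is capped at
 * DEMO_MAX_BOUNCE_BUFFER bytes. Returns the number of read/write pairs
 * issued, or -1 if the buffer could not be allocated.
 */
static int demo_copy_with_bounce_buffer(unsigned char *dst,
                                        const unsigned char *src,
                                        size_t len)
{
    size_t buf_len = DEMO_MIN((size_t)DEMO_MAX_BOUNCE_BUFFER, len);
    unsigned char *bounce;
    size_t offset = 0;
    int requests = 0;

    if (len == 0) {
        return 0;
    }

    bounce = malloc(buf_len);
    if (!bounce) {
        return -1;
    }

    while (offset < len) {
        size_t chunk = DEMO_MIN(buf_len, len - offset);

        /* Assumption: memcpy stands in for the read from the source. */
        memcpy(bounce, src + offset, chunk);
        /* Assumption: memcpy stands in for the write to the target. */
        memcpy(dst + offset, bounce, chunk);

        offset += chunk;
        requests++;
    }

    free(bounce);
    return requests;
}
```

With a cap of 8 bytes, a 20-byte request is served by three read/write
pairs instead of one pair per byte-sized "cluster"; the same idea scales to
cluster-sized units and the 64 MiB cap from the patch.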
>>>>
>>>> Bonus: get rid of pointer-to-pointer.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
>>>> ---
>>>>    block/backup.c | 65 +++++++++++++++++++++++++++++++-------------------
>>>>    1 file changed, 41 insertions(+), 24 deletions(-)
>>>>
>>>> diff --git a/block/backup.c b/block/backup.c
>>>> index d482d93458..65f7212c85 100644
>>>> --- a/block/backup.c
>>>> +++ b/block/backup.c
>>>> @@ -27,6 +27,7 @@
>>>>    #include "qemu/error-report.h"
>>>>    
>>>>    #define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
>>>> +#define BACKUP_MAX_BOUNCE_BUFFER (64 * 1024 * 1024)
>>>>    
>>>>    typedef struct CowRequest {
>>>>        int64_t start_byte;
>>>> @@ -98,44 +99,55 @@ static void cow_request_end(CowRequest *req)
>>>>        qemu_co_queue_restart_all(&req->wait_queue);
>>>>    }
>>>>    
>>>> -/* Copy range to target with a bounce buffer and return the bytes copied. If
>>>> - * error occurred, return a negative error number */
>>>> +/*
>>>> + * Copy range to target with a bounce buffer and return the bytes copied. If
>>>> + * error occurred, return a negative error number
>>>> + *
>>>> + * @bounce_buffer is assumed to enough to store
>>>
>>> s/to/to be/
>>>
>>>> + * MIN(BACKUP_MAX_BOUNCE_BUFFER, @end - @start) bytes
>>>> + */
>>>>    static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
>>>>                                                          int64_t start,
>>>>                                                          int64_t end,
>>>>                                                          bool is_write_notifier,
>>>>                                                          bool *error_is_read,
>>>> -                                                      void **bounce_buffer)
>>>> +                                                      void *bounce_buffer)
>>>>    {
>>>>        int ret;
>>>>        BlockBackend *blk = job->common.blk;
>>>> -    int nbytes;
>>>> +    int nbytes, remaining_bytes;
>>>>        int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
>>>>    
>>>>        assert(QEMU_IS_ALIGNED(start, job->cluster_size));
>>>> -    bdrv_reset_dirty_bitmap(job->copy_bitmap, start, job->cluster_size);
>>>> -    nbytes = MIN(job->cluster_size, job->len - start);
>>>> -    if (!*bounce_buffer) {
>>>> -        *bounce_buffer = blk_blockalign(blk, job->cluster_size);
>>>> -    }
>>>> +    bdrv_reset_dirty_bitmap(job->copy_bitmap, start, end - start);
>>>> +    nbytes = MIN(end - start, job->len - start);
>>>>    
>>>> -    ret = blk_co_pread(blk, start, nbytes, *bounce_buffer, read_flags);
>>>> -    if (ret < 0) {
>>>> -        trace_backup_do_cow_read_fail(job, start, ret);
>>>> -        if (error_is_read) {
>>>> -            *error_is_read = true;
>>>> +
>>>> +    remaining_bytes = nbytes;
>>>> +    while (remaining_bytes) {
>>>> +        int chunk = MIN(BACKUP_MAX_BOUNCE_BUFFER, remaining_bytes);
>>>> +
>>>> +        ret = blk_co_pread(blk, start, chunk, bounce_buffer, read_flags);
>>>> +        if (ret < 0) {
>>>> +            trace_backup_do_cow_read_fail(job, start, ret);
>>>> +            if (error_is_read) {
>>>> +                *error_is_read = true;
>>>> +            }
>>>> +            goto fail;
>>>>            }
>>>> -        goto fail;
>>>> -    }
>>>>    
>>>> -    ret = blk_co_pwrite(job->target, start, nbytes, *bounce_buffer,
>>>> -                        job->write_flags);
>>>> -    if (ret < 0) {
>>>> -        trace_backup_do_cow_write_fail(job, start, ret);
>>>> -        if (error_is_read) {
>>>> -            *error_is_read = false;
>>>> +        ret = blk_co_pwrite(job->target, start, chunk, bounce_buffer,
>>>> +                            job->write_flags);
>>>> +        if (ret < 0) {
>>>> +            trace_backup_do_cow_write_fail(job, start, ret);
>>>> +            if (error_is_read) {
>>>> +                *error_is_read = false;
>>>> +            }
>>>> +            goto fail;
>>>>            }
>>>> -        goto fail;
>>>> +
>>>> +        start += chunk;
>>>> +        remaining_bytes -= chunk;
>>>>        }
>>>>    
>>>>        return nbytes;
>>>> @@ -301,9 +313,14 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
>>>>                }
>>>>            }
>>>>            if (!job->use_copy_range) {
>>>> +            if (!bounce_buffer) {
>>>> +                size_t len = MIN(BACKUP_MAX_BOUNCE_BUFFER,
>>>> +                                 MAX(dirty_end - start, end - dirty_end));
>>>> +                bounce_buffer = blk_try_blockalign(job->common.blk, len);
>>>> +            }
>>>
>>> If you use _try_, you should probably also check whether it succeeded.
>>
>> Oops, you are right, of course.
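
A minimal sketch of the kind of check meant here, outside of QEMU. The
names demo_try_alloc and demo_do_cow are hypothetical, malloc stands in for
blk_try_blockalign, and the artificial limit exists only so the failure
path can be exercised; returning -ENOMEM is an assumption about how the
caller would want to report it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical "try" allocator: returns NULL on failure instead of
 * aborting. The limit argument simulates an allocation that cannot be
 * satisfied.
 */
static void *demo_try_alloc(size_t len, size_t limit)
{
    if (len == 0 || len > limit) {
        return NULL;
    }
    return malloc(len);
}

/*
 * Hypothetical caller: allocate the bounce buffer and bail out with
 * -ENOMEM if the allocation failed, rather than passing NULL on.
 */
static int demo_do_cow(size_t len, size_t limit)
{
    void *bounce_buffer = demo_try_alloc(len, limit);

    if (!bounce_buffer) {
        return -ENOMEM;
    }

    /* ... the copy through bounce_buffer would happen here ... */

    free(bounce_buffer);
    return 0;
}
```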
>>
>>>
>>> Anyway, I wonder whether it’d be better to just allocate this buffer
>>> once per job (the first time we get here, probably) to be of size
>>> BACKUP_MAX_BOUNCE_BUFFER and put it into BackupBlockJob.  (And maybe add
>>> a buf-size parameter similar to what the mirror jobs have.)
>>>
>>
>> Once per job will not work, as we may have several guest writes in
>> parallel and therefore several parallel copy-before-write operations.
> 
> Hm.  I’m not quite happy with that because if the guest just issues many
> large discards in parallel, this means that qemu will allocate a large
> amount of memory.
> 
> It would be nice if there was a simple way to keep track of the total
> memory usage and let requests yield if they would exceed it.


Agreed, it should be fixed anyway.
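
One possible shape for tracking total memory usage, as a rough sketch only.
DemoMemBudget and its helpers are hypothetical names, nothing like this
exists in block/backup.c as of this patch, and a real version would put the
coroutine on a wait queue and yield where this sketch simply returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared budget for all in-flight bounce buffers. */
typedef struct DemoMemBudget {
    size_t limit;   /* maximum bytes allowed in flight */
    size_t in_use;  /* bytes currently reserved */
} DemoMemBudget;

/*
 * Try to reserve len bytes. Returns false if the reservation would exceed
 * the limit; a coroutine-based caller would yield and retry at that point.
 */
static bool demo_budget_try_reserve(DemoMemBudget *b, size_t len)
{
    /* in_use never exceeds limit, so the subtraction cannot wrap. */
    if (len > b->limit - b->in_use) {
        return false;
    }
    b->in_use += len;
    return true;
}

/* Give back a previous reservation so that waiting requests can proceed. */
static void demo_budget_release(DemoMemBudget *b, size_t len)
{
    assert(len <= b->in_use);
    b->in_use -= len;
}
```

Each copy-before-write request would reserve its buffer size before
allocating and release it after freeing the buffer, so many parallel large
requests cannot push the total above the limit.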

> 
>> Or if you mean writing an allocator based on a once-allocated buffer,
>> like in mirror, I really dislike this idea: we already have an
>> allocator (memalign/malloc/free) and it works well, so there is no
>> reason to invent a new one and hardcode it into the block layer (look
>> at my answer to Eric on v2 of this patch for more info).
> 
> Well, at least it’d be something we can delay until blockdev-copy
> arrives(TM).
> 
> Max
> 
>> Or, if you mean only backup_loop generated copy-requests, yes we may
>> keep only one buffer for them, but:
>> 1. it is not how it works now, so my patch is not a degradation in
>>    this case
>> 2. I'm going to parallelize backup loop too, like my series "qcow2:
>>    async handling of fragmented io", which will need several allocated
>>    buffers anyway.
>>
>>>
>>>>                ret = backup_cow_with_bounce_buffer(job, start, dirty_end,
>>>>                                                    is_write_notifier,
>>>> -                                                error_is_read, &bounce_buffer);
>>>> +                                                error_is_read, bounce_buffer);
>>>>            }
>>>>            if (ret < 0) {
>>>>                break;
>>>>
>>>
>>>
>>
>>
> 
> 


-- 
Best regards,
Vladimir

Re: [Qemu-devel] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once