On 10.07.20 19:41, Andrey Shinkevich wrote:
> On 10.07.2020 18:24, Max Reitz wrote:
>> On 09.07.20 16:52, Andrey Shinkevich wrote:
>>> On 25.06.2020 18:21, Max Reitz wrote:
>>>> Because of the (not so recent anymore) changes that make the stream job
>>>> independent of the base node and instead track the node above it, we
>>>> have to split that "bottom" node into two cases: The bottom COW node,
>>>> and the node directly above the base node (which may be an R/W filter
>>>> or the bottom COW node).
>>>>
>>>> Signed-off-by: Max Reitz <mre...@redhat.com>
>>>> ---
>>>>   qapi/block-core.json |  4 +++
>>>>   block/stream.c       | 63 ++++++++++++++++++++++++++++++++------------
>>>>   blockdev.c           |  4 ++-
>>>>   3 files changed, 53 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>>> index b20332e592..df87855429 100644
>>>> --- a/qapi/block-core.json
>>>> +++ b/qapi/block-core.json
>>>> @@ -2486,6 +2486,10 @@
>>>>  # On successful completion the image file is updated to drop the backing file
>>>>  # and the BLOCK_JOB_COMPLETED event is emitted.
>>>>  #
>>>> +# In case @device is a filter node, block-stream modifies the first non-filter
>>>> +# overlay node below it to point to base's backing node (or NULL if @base was
>>>> +# not specified) instead of modifying @device itself.
>>>> +#
>>>>  # @job-id: identifier for the newly-created block job. If
>>>>  #          omitted, the device name will be used. (Since 2.7)
>>>>  #
>>>> diff --git a/block/stream.c b/block/stream.c
>>>> index aa2e7af98e..b9c1141656 100644
>>>> --- a/block/stream.c
>>>> +++ b/block/stream.c
>>>> @@ -31,7 +31,8 @@ enum {
>>>>  typedef struct StreamBlockJob {
>>>>      BlockJob common;
>>>> -    BlockDriverState *bottom;
>>>> +    BlockDriverState *base_overlay; /* COW overlay (stream from this) */
>>>> +    BlockDriverState *above_base;   /* Node directly above the base */
>>>
>>> Keeping the base_overlay is enough to complete the stream job.
>>
>> Depends on the definition.
>> If we decide it isn't enough, then it isn't enough.
>>
>>> The above_base may disappear during the job and we can't rely on it.
>>
>> In this version of this series, it may not, because the chain is
>> frozen.  So the above_base cannot disappear.
>
> Once we insert a filter above the top bs of the stream job, the
> parallel jobs in the iotests #030 will fail with a 'frozen link' error.
> That is because of the independent parallel stream or commit jobs that
> insert/remove their filters asynchronously.
I'm not sure whether that's a problem with this series specifically.

>> We can discuss whether we should allow it to disappear, but I think
>> not.
>>
>> The problem is, we need something to set as the backing file after
>> streaming.  How do we figure out what that should be?  My proposal is
>> we keep above_base and use its immediate child.
>
> We can do the same with the base_overlay.
>
> If the backing node turns out to be a filter, the proper backing child
> will be set after the filter is removed.  So, we shouldn't care.

And what if the user manually added some filter above the base (i.e.
below base_overlay) that they want to keep after the job?

>> If we don't keep above_base, then we're basically left guessing as to
>> what should be the backing file after the stream job.
>>
>>>>      BlockdevOnError on_error;
>>>>      char *backing_file_str;
>>>>      bool bs_read_only;
>>>> @@ -53,7 +54,7 @@ static void stream_abort(Job *job)
>>>>      if (s->chain_frozen) {
>>>>          BlockJob *bjob = &s->common;
>>>> -        bdrv_unfreeze_backing_chain(blk_bs(bjob->blk), s->bottom);
>>>> +        bdrv_unfreeze_backing_chain(blk_bs(bjob->blk), s->above_base);
>>>>      }
>>>>  }
>>>> @@ -62,14 +63,15 @@ static int stream_prepare(Job *job)
>>>>      StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
>>>>      BlockJob *bjob = &s->common;
>>>>      BlockDriverState *bs = blk_bs(bjob->blk);
>>>> -    BlockDriverState *base = backing_bs(s->bottom);
>>>> +    BlockDriverState *unfiltered_bs = bdrv_skip_filters(bs);
>>>> +    BlockDriverState *base = bdrv_filter_or_cow_bs(s->above_base);
>>>
>>> The initial base node may be a top node for a concurrent commit job
>>> and may disappear.
>>
>> Then it would just be replaced by another node, though, so above_base
>> keeps a child.  The @base here is not necessarily the initial @base,
>> and that's intentional.
>
> Not really.
> In my example, above_base becomes a dangling pointer because after the
> commit job finishes, its filter, which belongs to the commit job's
> frozen chain, will be deleted.  If we freeze the link to the above_base
> for this job, the iotests #030 will not pass.

So it doesn't become a dangling pointer, because it's frozen.

030 passes after this series, so I'm not sure whether I can consider
that problem part of this series.  I think if adding a filter node
becomes a problem, we have to consider relaxing the restrictions when we
do that, not now.

>>> base = bdrv_filter_or_cow_bs(s->base_overlay) is more reliable.
>>
>> But also wrong.  The point of keeping above_base around is to get its
>> child here to use that child as the new backing child of the top node.
>>
>>>>      Error *local_err = NULL;
>>>>      int ret = 0;
>>>> -    bdrv_unfreeze_backing_chain(bs, s->bottom);
>>>> +    bdrv_unfreeze_backing_chain(bs, s->above_base);
>>>>      s->chain_frozen = false;
>>>> -    if (bs->backing) {
>>>> +    if (bdrv_cow_child(unfiltered_bs)) {
>>>>          const char *base_id = NULL, *base_fmt = NULL;
>>>>          if (base) {
>>>>              base_id = s->backing_file_str;
>>>> @@ -77,8 +79,8 @@ static int stream_prepare(Job *job)
>>>>                  base_fmt = base->drv->format_name;
>>>>              }
>>>>          }
>>>> -        bdrv_set_backing_hd(bs, base, &local_err);
>>>> -        ret = bdrv_change_backing_file(bs, base_id, base_fmt);
>>>> +        bdrv_set_backing_hd(unfiltered_bs, base, &local_err);
>>>> +        ret = bdrv_change_backing_file(unfiltered_bs, base_id, base_fmt);
>>>>          if (local_err) {
>>>>              error_report_err(local_err);
>>>>              return -EPERM;
>>>> @@ -109,14 +111,15 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
>>>>      StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
>>>>      BlockBackend *blk = s->common.blk;
>>>>      BlockDriverState *bs = blk_bs(blk);
>>>> -    bool enable_cor = !backing_bs(s->bottom);
>>>> +    BlockDriverState *unfiltered_bs = bdrv_skip_filters(bs);
>>>> +    bool enable_cor = !bdrv_cow_child(s->base_overlay);
>>>>      int64_t len;
>>>>      int64_t offset = 0;
>>>>      uint64_t delay_ns = 0;
>>>>      int error = 0;
>>>>      int64_t n = 0; /* bytes */
>>>> -    if (bs == s->bottom) {
>>>> +    if (unfiltered_bs == s->base_overlay) {
>>>>          /* Nothing to stream */
>>>>          return 0;
>>>>      }
>>>> @@ -150,13 +153,14 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
>>>>          copy = false;
>>>> -        ret = bdrv_is_allocated(bs, offset, STREAM_CHUNK, &n);
>>>> +        ret = bdrv_is_allocated(unfiltered_bs, offset, STREAM_CHUNK, &n);
>>>>          if (ret == 1) {
>>>>              /* Allocated in the top, no need to copy. */
>>>>          } else if (ret >= 0) {
>>>>              /* Copy if allocated in the intermediate images. Limit to the
>>>>               * known-unallocated area [offset, offset+n*BDRV_SECTOR_SIZE). */
>>>> -            ret = bdrv_is_allocated_above(backing_bs(bs), s->bottom, true,
>>>> +            ret = bdrv_is_allocated_above(bdrv_cow_bs(unfiltered_bs),
>>>> +                                          s->base_overlay, true,
>>>>                                            offset, n, &n);
>>>>              /* Finish early if end of backing file has been reached */
>>>>              if (ret == 0 && n == 0) {
>>>> @@ -223,9 +227,29 @@ void stream_start(const char *job_id, BlockDriverState *bs,
>>>>      BlockDriverState *iter;
>>>>      bool bs_read_only;
>>>>      int basic_flags = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED;
>>>> -    BlockDriverState *bottom = bdrv_find_overlay(bs, base);
>>>> +    BlockDriverState *base_overlay = bdrv_find_overlay(bs, base);
>>>> +    BlockDriverState *above_base;
>>>> -    if (bdrv_freeze_backing_chain(bs, bottom, errp) < 0) {
>>>> +    if (!base_overlay) {
>>>> +        error_setg(errp, "'%s' is not in the backing chain of '%s'",
>>>> +                   base->node_name, bs->node_name);
>>>
>>> Sorry, I am not clear with the error message.
>>>
>>> In this case, there is no intermediate COW node, but the base, if not
>>> NULL, is in the backing chain of bs, isn't it?
>>>
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    /*
>>>> +     * Find the node directly above @base.  @base_overlay is a COW overlay, so
>>>> +     * it must have a bdrv_cow_child(), but it is the immediate overlay of
>>>> +     * @base, so between the two there can only be filters.
>>>> +     */
>>>> +    above_base = base_overlay;
>>>> +    if (bdrv_cow_bs(above_base) != base) {
>>>> +        above_base = bdrv_cow_bs(above_base);
>>>> +        while (bdrv_filter_bs(above_base) != base) {
>>>> +            above_base = bdrv_filter_bs(above_base);
>>>> +        }
>>>> +    }
>>>> +
>>>> +    if (bdrv_freeze_backing_chain(bs, above_base, errp) < 0) {
>>>
>>> When a concurrent stream job tries to freeze or remove the above_base
>>> node, we will encounter the frozen node error.  The above_base node is
>>> a part of the concurrent job frozen chain.
>>
>> Correct.
>>
>>>>          return;
>>>>      }
>>>> @@ -255,14 +279,19 @@ void stream_start(const char *job_id, BlockDriverState *bs,
>>>>       * and resizes. Reassign the base node pointer because the backing BS of the
>>>> -     * bottom node might change after the call to bdrv_reopen_set_read_only()
>>>> -     * due to parallel block jobs running.
>>>> +     * above_base node might change after the call to
>>>
>>> Yes, if not frozen.
>>>
>>>> +     * bdrv_reopen_set_read_only() due to parallel block jobs running.
>>>>       */
>>>> -    base = backing_bs(bottom);
>>>> -    for (iter = backing_bs(bs); iter && iter != base; iter = backing_bs(iter)) {
>>>> +    base = bdrv_filter_or_cow_bs(above_base);
>>>> +    for (iter = bdrv_filter_or_cow_bs(bs); iter != base;
>>>> +         iter = bdrv_filter_or_cow_bs(iter))
>>>> +    {
>>>>          block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
>>>>                             basic_flags, &error_abort);
>>>>      }
>>>> -    s->bottom = bottom;
>>>> +    s->base_overlay = base_overlay;
>>>> +    s->above_base = above_base;
>>>
>>> Generally, being the filter for a concurrent job, the above_base node
>>> may be deleted at any time and we will keep the dangling pointer.  It
>>> may happen even earlier if above_base is not frozen.
>>>
>>> If it is, as it is here, we may get the frozen link error then.
>> I'm not sure what you mean here.  Freezing it was absolutely
>> intentional.  A dangling pointer would be a problem, but that's why
>> it's frozen, so it stays around and can't be deleted at any time.
>>
>> Max
>
> The nodes we freeze should be in one context of the relevant job:
>
> filter->top_node->intermediate_node(s)
>
> We would not include the base or any filter above it in the frozen
> chain because they are of a different job context.

They aren't really, because we need to know the backing node of @device
after the job.

> Once 'this' job is completed, we set the current backing child of the
> base_overlay and may not care about its nature.  If that is another
> job filter, it will be replaced with the proper node afterwards.

But what if there are filters above the base that the user wants to keep
after the job?

Max