On 02.07.20 11:19, Vladimir Sementsov-Ogievskiy wrote: > 02.07.2020 11:09, Max Reitz wrote: >> On 01.07.20 16:34, Vladimir Sementsov-Ogievskiy wrote: >>> 30.06.2020 11:45, Max Reitz wrote: >>>> This migration parameter allows mapping block node names and bitmap >>>> names to aliases for the purpose of block dirty bitmap migration. >>>> >>>> This way, management tools can use different node and bitmap names on >>>> the source and destination and pass the mapping of how bitmaps are >>>> to be >>>> transferred to qemu (on the source, the destination, or even both with >>>> arbitrary aliases in the migration stream). >>>> >>>> Suggested-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> >>>> Signed-off-by: Max Reitz <mre...@redhat.com> >>>> --- >>>> qapi/migration.json | 83 +++++++- >>>> migration/migration.h | 3 + >>>> migration/block-dirty-bitmap.c | 372 >>>> ++++++++++++++++++++++++++++----- >>>> migration/migration.c | 29 +++ >>>> 4 files changed, 432 insertions(+), 55 deletions(-) >>>> >>>> diff --git a/qapi/migration.json b/qapi/migration.json >>>> index d5000558c6..5aeae9bea8 100644 >>>> --- a/qapi/migration.json >>>> +++ b/qapi/migration.json >>>> @@ -507,6 +507,44 @@ >>>> 'data': [ 'none', 'zlib', >>>> { 'name': 'zstd', 'if': 'defined(CONFIG_ZSTD)' } ] } >>>> +## >>>> +# @BitmapMigrationBitmapAlias: >>>> +# >>>> +# @name: The name of the bitmap. >>>> +# >>>> +# @alias: An alias name for migration (for example the bitmap name on >>>> +# the opposite site). >>>> +# >>>> +# Since: 5.1 >>>> +## >>>> +{ 'struct': 'BitmapMigrationBitmapAlias', >>>> + 'data': { >>>> + 'name': 'str', >>>> + 'alias': 'str' >>>> + } } >>>> + >>>> +## >>>> +# @BitmapMigrationNodeAlias: >>>> +# >>>> +# Maps a block node name and the bitmaps it has to aliases for dirty >>>> +# bitmap migration. >>>> +# >>>> +# @node-name: A block node name. >>>> +# >>>> +# @alias: An alias block node name for migration (for example the >>>> +# node name on the opposite site). >>>> +# >>>> +# @bitmaps: Mappings for the bitmaps on this node. >>>> +# >>>> +# Since: 5.1 >>>> +## >>>> +{ 'struct': 'BitmapMigrationNodeAlias', >>>> + 'data': { >>>> + 'node-name': 'str', >>>> + 'alias': 'str', >>>> + 'bitmaps': [ 'BitmapMigrationBitmapAlias' ] >>> >>> So, we still can't migrate bitmaps from one node to different nodes, >>> but we >>> also don't know a usecase for it, so it seems OK. But with such >>> scheme we >>> can select and rename bitmaps in-flight, and Peter said about >>> corresponding >>> use-case. >>> >>> I'm OK with this, still, just an idea in my mind: >>> >>> we could instead just have a list of >>> >>> BitmapMigrationAlias: { >>> node-name >>> bitmap-name >>> node-alias >>> bitmap-alias >>> } >>> >>> so, mapping is set for each bitmap in separate. >> >> Well, OK, but why? > > But why not :) Just thinking out loud. May be someone will imaging good > reasons for it.
The reason for “Why not” is that this code now exists. ;) >>>> + } >>>> + } >>>> + >>>> + g_hash_table_insert(bitmaps_map, >>>> + g_strdup(bmap_map_from), >>>> g_strdup(bmap_map_to)); >>>> + } >>>> + >>>> + amin = g_new(AliasMapInnerNode, 1); >>>> + *amin = (AliasMapInnerNode){ >>> >>> style: space before '{' >> >> Is that required? > > If checkpatch doesn't complain, than not.. O:) >>>> + .string = g_strdup(node_map_to), >>>> + .subtree = bitmaps_map, >>>> + }; >>>> + >>>> + g_hash_table_insert(alias_map, g_strdup(node_map_from), amin); >>>> + } >>>> + >>>> + return alias_map; >>>> + >>>> +fail: >>>> + g_hash_table_destroy(alias_map); >>>> + return NULL; >>>> +} >>>> + >>>> +/** >>>> + * Run construct_alias_map() in both directions to check whether @bbm >>>> + * is valid. >>>> + * (This function is to be used by migration/migration.c to validate >>>> + * the user-specified block-bitmap-mapping migration parameter.) >>>> + * >>>> + * Returns true if and only if the mapping is valid. >>>> + */ >>>> +bool check_dirty_bitmap_mig_alias_map(const >>>> BitmapMigrationNodeAliasList *bbm, >>>> + Error **errp) >>>> +{ >>>> + GHashTable *alias_map; >>>> + >>>> + alias_map = construct_alias_map(bbm, true, errp); >>>> + if (!alias_map) { >>>> + return false; >>>> + } >>>> + g_hash_table_destroy(alias_map); >>>> + >>>> + alias_map = construct_alias_map(bbm, false, errp); >>>> + if (!alias_map) { >>>> + return false; >>>> + } >>>> + g_hash_table_destroy(alias_map); >>>> + >>>> + return true; >>>> +} >>>> + >>>> void init_dirty_bitmap_incoming_migration(void) >>>> { >>>> qemu_mutex_init(&finish_lock); >>>> @@ -191,11 +351,11 @@ static void send_bitmap_header(QEMUFile *f, >>>> DirtyBitmapMigBitmapState *dbms, >>>> qemu_put_bitmap_flags(f, flags); >>>> if (flags & DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME) { >>>> - qemu_put_counted_string(f, dbms->node_name); >>>> + qemu_put_counted_string(f, dbms->node_alias); >>>> } >>>> if (flags & DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME) { >>>> - qemu_put_counted_string(f, bdrv_dirty_bitmap_name(bitmap)); >>>> + qemu_put_counted_string(f, dbms->bitmap_alias); >>>> } >>>> } >>>> @@ -263,15 +423,20 @@ static void dirty_bitmap_mig_cleanup(void) >>>> QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, >>>> entry); >>>> bdrv_dirty_bitmap_set_busy(dbms->bitmap, false); >>>> bdrv_unref(dbms->bs); >>>> + g_free(dbms->node_alias); >>>> + g_free(dbms->bitmap_alias); >>>> g_free(dbms); >>>> } >>>> } >>>> /* Called with iothread lock taken. */ >>>> -static int add_bitmaps_to_list(BlockDriverState *bs, const char >>>> *bs_name) >>>> +static int add_bitmaps_to_list(BlockDriverState *bs, const char >>>> *bs_name, >>>> + GHashTable *alias_map) >>>> { >>>> BdrvDirtyBitmap *bitmap; >>>> DirtyBitmapMigBitmapState *dbms; >>>> + GHashTable *bitmap_aliases; >>> >>> can bitmap_aliases be const ptr too? >> >> Unfortunately no because g_hash_table_lookup() expects the hash table to >> not be const, for whatever reason. >> >>>> + const char *node_alias, *bitmap_name, *bitmap_alias; >>>> Error *local_err = NULL; >>>> bitmap = bdrv_dirty_bitmap_first(bs); >>>> @@ -279,21 +444,40 @@ static int add_bitmaps_to_list(BlockDriverState >>>> *bs, const char *bs_name) >>>> return 0; >>>> } >>>> + bitmap_name = bdrv_dirty_bitmap_name(bitmap); >>> >>> Note, that bitmap is wrong here: it may be internal unnamed bitmap which >>> we should ignore. >>> I have sent a patch for this: "[PATCH] migration/block-dirty-bitmap: fix >>> add_bitmaps_to_list" >>> >>>> + >>>> if (!bs_name || strcmp(bs_name, "") == 0) { >>>> error_report("Bitmap '%s' in unnamed node can't be >>>> migrated", >>>> - bdrv_dirty_bitmap_name(bitmap)); >>>> + bitmap_name); >>>> return -1; >>>> } >>>> - if (bs_name[0] == '#') { >>>> + if (alias_map) { >>>> + const AliasMapInnerNode *amin = >>>> g_hash_table_lookup(alias_map, bs_name); >>>> + >>>> + if (!amin) { >>>> + error_report("Bitmap '%s' on node '%s' with no alias >>>> cannot be " >>>> + "migrated", bitmap_name, bs_name); >>> >>> As I've said before, it may be reasonable to ignore bitmaps not >>> referenced in the hash-table. >> >> No problem with that. We just decided on this behavior when we >> discussed the RFC. > > Sorry for that. The reason for my changed opinion is a recent bug from > customers about bitmap migration. No problem. (My original proposal was different still, where non-specified mappings would default to the identity mapping.) >>> And it seems to be good default behavior. We are really tired from >>> problems with bitmaps which >>> can't migrate for different reasons, when bitmaps are actually >>> non-critical data, and choosing >>> from customer problems like: >>> >>> 1. - Hey, after update migration is broken! It says that some bitmaps >>> can't be migrated, what is that? >>> 2. - Hmm, it seems, that in some cases, incremental backup >>> doesn't work >>> after migration and full backup >>> is automatically done instead.. Why? >>> >>> I now prefer the [2]. >>> >>>> + return -1; >>>> + } >>>> + >>>> + node_alias = amin->string; >>>> + bitmap_aliases = amin->subtree; >>>> + } else { >>>> + node_alias = bs_name; >>>> + bitmap_aliases = NULL; >>>> + } >>> >>> Hmm, actually bs_name argument is incompatiblewith alias_map, let's >>> assert that they are not non-null simultaneously. >>> >>> Ah stop, I see, you use bs_name as node-name later and before.. Hmm, >>> seems all this a bit confused. >>> >>> Prior the patch, why do we have bs_name: because it may be node-name or >>> blk-name, to be use in migration protocol as (actually) an alias, so >>> bs_name is more like an alias.. >>> >>> So, we have bs, which already have bs->node_name, we have alias_map, >>> where we have relation node_name -> alias, and we have bs_name, which is >>> something like an alias_name. >>> >>> I think the most clean thing is to allow only one of bs_name and >>> alias_map to be non-null, use bs_name only for old scenario, and for new >>> scenario use bdrv_get_node_name() for error-reporting. And a comment >>> about function semantics won't hurt. >>> >>> upd: aha, I see that in case of new semantics, bs_name is exactly >>> bdrv_get_node_name(). It's a bit redundant but OK too.. Let's at least >>> add an assertion. >> >> Now I’m confused. What assertion? I don’t think I want to add an >> assertion that exactly one of bs_name or alias_map is NULL, because that >> would complicate the code. The caller would have to pass NULL for >> bs_name, and then add_bitmaps_to_list() would need to fetch the node >> name again. That seems redundant to me. > > I mean something like > > assert(!alias_map || !strcmp(alias_map, bdrv_get_node_name(bs)); Ah, OK, sure. (with s/alias_map/bs_name/ in the strcmp(), I presume) > I am afraid of interfaces with redundant parameters, it seems strange > and unsafe to pass node_name together with bs, which has bs->node_name > which is (and must be, in case of new semantics with alias_map) the same. > > Still, I don't mind keeping it as is, I can think of some refactoring > (if any) later. > >> >>>> + >>>> + if (node_alias[0] == '#') { >>>> error_report("Bitmap '%s' in a node with auto-generated " >>>> "name '%s' can't be migrated", >>>> - bdrv_dirty_bitmap_name(bitmap), bs_name); >>>> + bitmap_name, node_alias); >>> >>> OK, this should not relate to mapped case, as aliases are well-formed. >>> >>>> return -1; >>>> } >>>> FOR_EACH_DIRTY_BITMAP(bs, bitmap) { >>>> - if (!bdrv_dirty_bitmap_name(bitmap)) { >>>> + bitmap_name = bdrv_dirty_bitmap_name(bitmap); >>>> + if (!bitmap_name) { >>>> continue; >>>> } >>>> @@ -302,12 +486,24 @@ static int >>>> add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name) >>>> return -1; >>>> } >>>> + if (bitmap_aliases) { >>>> + bitmap_alias = g_hash_table_lookup(bitmap_aliases, >>>> bitmap_name); >>>> + if (!bitmap_alias) { >>>> + error_report("Bitmap '%s' with no alias on node '%s' >>>> cannot be " >>>> + "migrated", bitmap_name, bs_name); >>>> + return -1; >>>> + } >>>> + } else { >>>> + bitmap_alias = bitmap_name; >>>> + } >>>> + >>> >>> Hmm, we don't error-out if we didn't find a bitmap, specified in >>> alias-map. But it seems to be an error from the user point-of-view (the >>> required action can't be done). >>> >>> So, probably, we want loop through the alias-map (and in this case we >>> don't need a map, but can work with QAPI list), find corresponding >>> bitmaps and migrate them, and fail if some specified in the alias-map >>> bitmap is not found. >> >> Do we? >> >> I don’t consider setting an alias an action request like “Migrate this >> bitmap”. I just consider it establishing a mapping. If some elements >> are not used, so be it. > > OK. This non-strict behavior is in good relation with ignoring > not-mapped bitmaps which I've proposed. I think so, too. > We can add any kind of restrictions as an option later. Oh, yes, that, too. Like you proposed in the RFC, we could add e.g. a block-bitmap-mapping-strictness migration parameter at a later point if there’s a use for something like it. >> I don’t know if doing it differently would actually be beneficial for >> anyone, but OTOH naively it seems like a more invasive code change. >> > > I don't see real benefits, we can go either way, so, not worth rewriting > the patch. > > === > > I feel like a stupid reviewer :) Huh? If anything, a stupid review on a design-changing patch would be a plain “R-b” without having actually considered the impact. You do consider the impact and question it in all places. I don’t think I need to mention this, but that’s a very good and important thing to do, because it forces me to reason why we’d want this or that design. Without being questioned, I wouldn’t have to reason about that. (Which may be a problem in our patch workflow – authors don’t need to reason unless questioned.[1]) Sorry if I gave the impression of dismissing your comments. It should be my burden to reason why I took certain design decisions. Max [1] I suppose I should address this for my own patches by meticulously writing down everything where I had to question myself and then always include my reasoning in the patch notes somewhere.
signature.asc
Description: OpenPGP digital signature