02.10.2019 14:11, Kevin Wolf wrote:
> On 02.10.2019 at 12:46, Peter Krempa wrote:
>> On Tue, Oct 01, 2019 at 12:07:54 -0400, John Snow wrote:
>>>
>>>
>>> On 10/1/19 11:57 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>> 01.10.2019 17:10, John Snow wrote:
>>>>>
>>>>>
>>>>> On 10/1/19 10:00 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>> Otherwise: I have a lot of cloudy ideas on how to solve this, but
>>>>>>> ultimately what we want is to be able to find the "addressable" name for
>>>>>>> the node the bitmap is attached to, which would be the name of the first
>>>>>>> ancestor node that isn't a filter. (OR, the name of the block-backend
>>>>>>> above that node.)
>>>>>> Not the name of ancestor node, it will break mapping: it must be name of the
>>>>>> node itself or name of parent (may be through several filters) block-backend
>>>>>>
>>>>>
>>>>> Ah, you are right of course -- because block-backends are the only
>>>>> "nodes" for which we actually descend the graph and add the bitmap to
>>>>> its child.
>>>>>
>>>>> So the real back-resolution mechanism is:
>>>>>
>>>
>>> Amendment:
>>> - If our local node-name N is well-formed, use this.
>>
>> I'd like to re-iterate that the necessity to keep node names same on
>> both sides of migration is unexpected, undocumented and in some cases
>> impossible.
>
> I think the (implicitly made) requirement is not that all node-names are
> kept the same, but only the node-names of those nodes for which
> migration transfers some state.
>
> It seems to me that bitmap migration is the first case of putting
> something in the migration stream that isn't related to a frontend, but
> to the backend, so the usual device hierarchy to address information
> doesn't work here. And it seems the implications of this weren't really
> considered sufficiently, resulting in the design problem we're
> discussing now.
>
> What we need to transfer is dirty bitmaps, which can be attached to any
> node in the block graph.
> If we accept that the way to transfer this is
> the migration stream, we need a way to tell which bitmap belongs to
> which node. Matching node-name is the obvious answer, just like a
> matching device tree hierarchy is used for frontends.
>
> If we don't want to use the migration stream for backends, we would need
> to find another way to transfer the bitmaps. I would welcome removing
> backend data from the migration stream, but if this includes
> non-persistent bitmaps, I don't see what the alternative could be.
But how would we migrate persistent bitmaps if storage is not shared? And
even with only persistent bitmaps and shared storage: bitmap data may be
large, and storing and loading it during migration downtime will lengthen
that downtime.

>
>> If you want to mandate that they must be kept the same please document
>> it and also note the following:
>>
>> - during migrations the storage layout may change e.g. a backing chain
>>   may become flattened, thus keeping node names stable beyond the top
>>   layer is impossible
>
> You don't want to transfer bitmaps of nodes that you're going to drop.
> I'm not an expert for these bitmaps, but I think this just means you
> would have to disable any bitmaps on the backing files to be dropped on
> the source host before you migrate.

You mean remove them. But yes, anyway it's not a problem: if the
corresponding node doesn't exist on the target, we don't need any bitmaps
for it.

>
>> - in some cases (readonly image in a cdrom not present on destination,
>>   thus not relevant here probably) it may even become impossible to
>>   create any node thus keeping the top node may be impossible
>
> Same thing, you don't want to transfer a bitmap for a node that
> disappears.
>
>> - it should be documented when and why this happens and how management
>>   tools are supposed to do it
>>
>> - please let me know what's actually expected, since libvirt
>>   didn't enable blockdev yet we can fix any unexpected expectations
>>
>> - Document it so that the expectations don't change after this.
>
> Yes, we need a good and ideally future-proof rule of which node-names
> need to stay the same. Currently it's only bitmaps, but might we get
> another feature later where we want to transfer more backend data?
>
>> - Ideally node names will not be bound to anything and freely
>>   changeable. If necessary we can provide a map to qemu during migration
>>   which is probably less painful and more straightforward than keeping
>>   them in sync somehow ...
>
> A map feels painful for the average user (and for the QEMU
> implementation), even if it looks convenient for libvirt. If anything,
> I'd make it optional and default to 1:1 mappings for anything that isn't
> explicitly mapped.
>

Hmm, I don't think that an optional map is painful. What about the following:

1. If a map is provided:
   - migrate only bitmaps in nodes specified by the map
   - bitmaps are migrated only according to the map; block device names
     are not involved at all

2. If no map is provided:
   - For nodes bound to a named block backend, either directly or through
     several filters, use the name of this block backend.
   - For other nodes, use the node-name.

===

And I think [2.] should be done now to fix the current bug, and [1.] may be
postponed until we really need it.

--
Best regards,
Vladimir
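
For illustration only, the two cases proposed above could be sketched roughly as follows. This is a toy Python model, not QEMU code: the `Node` class, its fields (`is_filter`, `backend_name`, `parent`) and the `migration_alias` function are hypothetical names invented for this sketch, and each node is assumed to have at most one parent.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """Toy stand-in for a block graph node (hypothetical, not a QEMU type)."""
    node_name: str
    is_filter: bool = False
    # Name of a block backend attached directly to this node, if any.
    backend_name: Optional[str] = None
    # Single parent node; a simplifying assumption for this sketch.
    parent: Optional["Node"] = None


def migration_alias(node: Node,
                    node_map: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Return the name under which bitmaps of `node` would be migrated.

    Returns None when the node's bitmaps should not be migrated at all.
    """
    if node_map is not None:
        # Case 1: a map is provided. Only mapped nodes are migrated and
        # block backend names play no role.
        return node_map.get(node.node_name)

    # Case 2: no map. Look for a named block backend on the node itself or
    # above it, passing only through filter nodes.
    cur = node
    while True:
        if cur.backend_name:
            return cur.backend_name
        if cur.parent is None or not cur.parent.is_filter:
            break
        cur = cur.parent

    # No named backend reachable through filters: fall back to node-name.
    return node.node_name
```

A node attached directly to a backend named `drive0` resolves to `drive0`; a node sitting below a filter that carries the backend resolves to that backend's name; a backing node whose parent is not a filter keeps its own node-name. With a map given, anything not listed in the map is skipped.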