On Wed, Jan 11, 2023 at 05:58:36PM +0100, David Hildenbrand wrote: > On 11.01.23 17:35, Peter Xu wrote: > > On Wed, Jan 11, 2023 at 02:48:09PM +0100, David Hildenbrand wrote: > > > On 10.01.23 21:03, Peter Xu wrote: > > > > On Tue, Jan 10, 2023 at 12:52:32PM +0100, David Hildenbrand wrote: > > > > > The following seems to work, > > > > > > > > That looks much better at least from the diffstat pov (comparing to the > > > > existing patch 1+5 and the framework changes), thanks. > > > > > > > > > but makes analyze-migration.py angry: > > > > > > > > > > $ ./scripts/analyze-migration.py -f STATEFILE > > > > > Traceback (most recent call last): > > > > > File "/home/dhildenb/git/qemu/./scripts/analyze-migration.py", > > > > > line 605, in <module> > > > > > dump.read(dump_memory = args.memory) > > > > > File "/home/dhildenb/git/qemu/./scripts/analyze-migration.py", > > > > > line 539, in read > > > > > classdesc = self.section_classes[section_key] > > > > > ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^ > > > > > KeyError: ('0000:00:03.0/virtio-mem-early', 0) > > > > > > > > > > > > > > > We need the vmdesc to create info for the device. > > > > > > > > Migration may ignore the save entry if save_state() not provided in the > > > > "devices" section: > > > > > > > > if ((!se->ops || !se->ops->save_state) && !se->vmsd) { > > > > continue; > > > > } > > > > > > > > Could you try providing a shim save_state() for the new virtio-mem save > > > > entry? > > > > > > > > /* > > > > * Shim function to make sure the save entry will be dumped into > > > > "devices" > > > > * section, to make analyze-migration.py happy. > > > > */ > > > > static void virtio_mem_save_state_early(QEMUFile *file, void *opaque) > > > > { > > > > } > > > > > > > > Then: > > > > > > > > static const SaveVMHandlers vmstate_virtio_mem_device_early_ops = { > > > > .save_setup = virtio_mem_save_setup_early, > > > > .save_state = virtio_mem_save_state_early, > > > > .load_state = virtio_mem_load_state_early, > > > > }; > > > > > > > > I'm not 100% sure it'll work yet, but maybe worth trying. > > > > > > It doesn't. virtio_mem_load_state_early() will get called twice (once with > > > state saved during save_setup() and once with effectively nothing during > > > save_state(), which breaks the whole migration). > > > > Oh hold on.. so load_state() to pair save_setup()? Maybe you should pair it > > with load_setup(), which I just noticed.. Then the load_state() needs to > > be another shim like save_state(). > > Let me see if that improves the situation. E.g., ram.c writes state in > save_setup() but doesn't load any state in load_setup() -- it's all done in > load_state(). > > ... no, still not working. It will read garbage during load_setup() now. > > qemu-system-x86_64: Property 'memaddr' changed from 0x100000002037261 to > 0x140000000 > qemu-system-x86_64: Failed to load virtio-mem-device-early:tmp > qemu-system-x86_64: Load state of device 0000:00:03.0/virtio-mem-early > failed > qemu-system-x86_64: load of migration failed: Invalid argument > > > I don't think that load_setup() is supposed to consume anything useful from > the migration stream. I suspects it should actually not even consume a QEMU > file right now, because they way it's used is just for initializing stuff. > > qemu_loadvm_state_main() does the actual loading part, parsing sections etc. > qemu_loadvm_state_setup() doesn't do any of that. > > AFAIKS, at least qemu_loadvm_state_setup() would have to parse sections and > the save_setup() users would have to be converted into using load_setup() as > well. Not sure if more is missing.
Ouch, yeah, load_setup() is unfortunate. :( I think load_setup() should be named load_init() without having the qemufile at all. But let's keep that aside for now, and see what other option we have.. > > Even with that, I doubt that it would make analyze-migration.py work, > because what we store is something different than what we record in the > vmdesc. > > > > > > > > > vmdesc handling is also wrong, because analyze-migration.py gets confused > > > because it receives data stored during save_setup() but vmdesc created > > > during save_state() was told that there would be "nothing" to interpret > > > ... > > > > > > $ ./scripts/analyze-migration.py -f STATEFILE > > > { > > > "ram (2)": { > > > "section sizes": { > > > "0000:00:03.0/mem0": "0x0000000f00000000", > > > "pc.ram": "0x0000000100000000", > > > "/rom@etc/acpi/tables": "0x0000000000020000", > > > "pc.bios": "0x0000000000040000", > > > "0000:00:02.0/e1000.rom": "0x0000000000040000", > > > "pc.rom": "0x0000000000020000", > > > "/rom@etc/table-loader": "0x0000000000001000", > > > "/rom@etc/acpi/rsdp": "0x0000000000001000" > > > } > > > }, > > > "0000:00:03.0/virtio-mem-early (51)": { > > > "data": "" > > > } > > > } > > > > Yeah this is expected, because the whole data stream within the setup phase > > is a black box and not described by anything. "ram" can display correctly > > only because it's hard coded in the python script, I think. The trick > > above can only make the script not break but not teaching the script to > > also check for the early vmsd. > > Note that the issue here is that the scripts stops the output after the > virtio-mem device. So I'm not complaining about the "data": "", but about > the vmstate according to the vmdesc not having any other devices :) > > > > > But that's one step forward and IMHO it's fine for special devices. For > > example, even with your initial patch, I think the static analyzer (aka, > > qemu -dump-vmstate) will also ignore your new vmsd anyway because it's not > > directly bound to the DeviceState* but registered separately. So no ideal > > solution so far, afaict, without more work done on this aspect. > > > > > > > > > > > Not sure if the whole thing becomes nicer when manually looking up the > > > vmdesc ... because filling it will also requires manually storing the > > > se->idstr and the se->instance_id, whereby both are migration-internal and > > > not available inside save_setup(). > > > > > > > > > Hm, I really prefer something like the simplified version that let's > > > migration core deal with vmstate to be migrated during save_setup() phase. > > > We could avoid the vmstate->immutable flag and simply use a separate > > > function for registering the vmstate, like: > > > > > > vmstate_register_immutable() > > > vmstate_register_early() > > > ... > > > > I agree, this looks useful. It's just that the devices that need this will > > be rare, and unfortunately some of them already implemented the stream in > > other ways so maybe non-trivial to convert them. > > Right, I agree about the "rare" part and that conversion might be hard, > because they are not using a vmstate descriptor. > > The only way to avoid that is moving away from using a vmstate descriptor > and storing everything manually in virtio-mem code. Not particularly happy > about that, but it would be the only feasible option without touching > migration code that I can see. > > > > > Not against it if you want to keep exploring, but if so please avoid using > > the priority field, also I'd hope the early vmsd will be saved within > > qemu_savevm_state_setup() so maybe it can be another alternative to > > save_setup(). > > > > Maybe it's still easier we keep going with the save_setup() and the shim > > functions above, if it works for you. > > I'll happy to go the path you suggested if we can make it work. I'd be happy > to have something "reasonable" for the virtio-mem device in the > analyze-migration.py output. But I could live with just nothing useful for > the device itself -- as long as at least the other devices still show up. So, let's see whether we can go back to the load_state() approach.. static const SaveVMHandlers vmstate_virtio_mem_device_early_ops = { .save_setup = virtio_mem_save_setup_early, .save_state = virtio_mem_save_state_early, .load_state = virtio_mem_load_state_early, }; vfio handled it using a single header flag for either save_setup() or save_state(), then load_state() identifies it: data = qemu_get_be64(f); ... switch (data) { case VFIO_MIG_FLAG_DEV_CONFIG_STATE: ... case VFIO_MIG_FLAG_DEV_SETUP_STATE: ... Maybe play the same trick here? The flag needs to be hard coded but at least not the vmsd. Based on previous experience, I could have missed something else, though. :) Or if you like go the other way I'm fine too. Thanks, -- Peter Xu