On 9/14/21 3:58 PM, Daniel P. Berrangé wrote:
> On Tue, Sep 14, 2021 at 02:30:06PM +0100, Dr. David Alan Gilbert wrote:
>> * David Hildenbrand (da...@redhat.com) wrote:
>>> On 14.09.21 15:17, Dr. David Alan Gilbert (git) wrote:
>>>> From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
>>>>
>>>> The subsection name for page-poison was typo'd as:
>>>>
>>>>    vitio-balloon-device/page-poison
>>>>
>>>> Note the missing 'r' in virtio.
>>>>
>>>> When we have a machine type that enables page poison, and the guest
>>>> enables it (which needs a new kernel), things fail rather unpredictably.
>>>>
>>>> The fallout from this is that most of the other subsections fail to
>>>> load, including things like the feature bits in the device, one
>>>> possible fallout is that the physical addresses of the queues
>>>> then get aligned differently and we fail with an error about
>>>> last_avail_idx being wrong.
>>>> It's not obvious to me why this doesn't produce a more obvious failure,
>>>> but virtio's vmstate loading is a bit open-coded.
>>>>
>>>> Fixes: 7483cbbaf82 ("virtio-balloon: Implement support for page poison
>>>> reporting feature")
>>>> bz: https://bugzilla.redhat.com/show_bug.cgi?id=1984401
>>>> Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
>>>> ---
>>>>  hw/virtio/virtio-balloon.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>>>> index 5a69dce35d..c6962fcbfe 100644
>>>> --- a/hw/virtio/virtio-balloon.c
>>>> +++ b/hw/virtio/virtio-balloon.c
>>>> @@ -852,7 +852,7 @@ static const VMStateDescription
>>>> vmstate_virtio_balloon_free_page_hint = {
>>>>  };
>>>>  static const VMStateDescription vmstate_virtio_balloon_page_poison = {
>>>> -    .name = "vitio-balloon-device/page-poison",
>>>> +    .name = "virtio-balloon-device/page-poison",
>>>>      .version_id = 1,
>>>>      .minimum_version_id = 1,
>>>>      .needed = virtio_balloon_page_poison_support,
>>>>
>>>
>>> Oh, that's very subtle. I wasn't even aware that the prefix really has to
>>> match the actual device ... I thought the whole idea of the prefix here was
>>> just to make the string unique ...
>>
>> Subsection naming is *very* critical; the logic is something like:
>>    'we're loading the X device'
>>    a subsection arrives for 'N/P'
>>       if 'X==N' then it looks in X for subsection P.
>>       If 'X!=N' then it assumes we've finished loading X
>>         and P is really for an outer device that X is part of.
>> This is horrible.
>
> Is there value in making this more explicit via a code convention
> for .name field initializers. eg instead of
>
>    .name = "virtio-balloon-device/page-poison",
>
> Prefer
>
>    .name = TYPE_VIRTIO_BALLOON "/page-poison"
>
> ?

IIUC, so far only user-creatable devices are required to have a stable
name in the TYPE definition (because the CLI must stay stable), which is
why TYPE names are not recommended for migration section names.
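
For anyone following along, the matching rule Dave describes above can be
sketched as a small standalone C function. This is only an illustration of
the rule as stated in this thread, not the actual vmstate loader:
classify_subsection(), the sub_result enum and the known[] list are all
hypothetical names made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome codes for this sketch; these are not QEMU's. */
enum sub_result {
    SUB_FOUND,    /* N == X and P is a known subsection of X */
    SUB_UNKNOWN,  /* N == X but X has no subsection named P */
    SUB_NOT_OURS  /* N != X: assume X is done, P is for an outer device */
};

/*
 * device:   name of the device currently being loaded ("X").
 * known:    NULL-terminated list of full subsection names ("X/P") it has.
 * incoming: subsection name read from the stream ("N/P").
 */
static enum sub_result classify_subsection(const char *device,
                                           const char *const *known,
                                           const char *incoming)
{
    size_t n;

    assert(device != NULL && known != NULL && incoming != NULL);
    n = strlen(device);

    /* The incoming name must look like "<device>/<something>". */
    if (strncmp(incoming, device, n) != 0 || incoming[n] != '/') {
        return SUB_NOT_OURS;
    }

    /* Prefix matches, so look for the full name among X's subsections. */
    for (; *known != NULL; known++) {
        if (strcmp(*known, incoming) == 0) {
            return SUB_FOUND;
        }
    }
    return SUB_UNKNOWN;
}
```

With device "virtio-balloon-device", an incoming
"vitio-balloon-device/page-poison" fails the prefix comparison and is
classified as SUB_NOT_OURS, which is the "assumes we've finished loading X"
case from Dave's description.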