On Thu, 25 Aug 2022 17:18:42 +0100 Shameer Kolothum <shameerali.kolothum.th...@huawei.com> wrote:
> Hi > > On arm/virt platform, Chen Xiang reported a Guest crash while > attempting the below steps, > > 1. Launch the Guest with nvdimm=on > 2. Hot-add a NVDIMM dev > 3. Reboot > 4. Guest boots fine. > 5. Reboot again. > 6. Guest boot fails. > > QEMU_EFI reports the below error: > ProcessCmdAddPointer: invalid pointer value in "etc/acpi/tables" > OnRootBridgesConnected: InstallAcpiTables: Protocol Error > > Debugging shows that on first reboot(after hot-adding NVDIMM), > Qemu updates the etc/table-loader len, > > qemu_ram_resize() > fw_cfg_modify_file() > fw_cfg_modify_bytes_read() > > And in fw_cfg_modify_bytes_read() we set the "callback_opaque" for > the "key" entry to NULL. Because of this, on the second reboot, > virt_acpi_build_update() is called with a NULL "build_state" and > returns without updating the ACPI tables. This seems to be > upsetting the firmware. > > To fix this, don't change the callback_opaque in fw_cfg_modify_bytes_read(). Fixes: bdbb5b1706d165 ("fw_cfg: add fw_cfg_machine_reset function") Acked-by: Igor Mammedov <imamm...@redhat.com> CCing Gerd to have a second set of eyes on it > Reported-by: chenxiang <chenxian...@hisilicon.com> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.th...@huawei.com> > --- > I am still not very convinced this is the root cause of the issue. > Though it looks like setting callback_opaque to NULL while updating > the file size is wrong, what puzzles me is that on the second reboot > we don't have any ACPI table size changes and ideally firmware should > see the updated tables from the first reboot itself. > > Please take a look and let me know. > > Thanks, > Shameer > > --- > hw/nvram/fw_cfg.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c > index d605f3f45a..dfe8404c01 100644 > --- a/hw/nvram/fw_cfg.c > +++ b/hw/nvram/fw_cfg.c > @@ -728,7 +728,6 @@ static void *fw_cfg_modify_bytes_read(FWCfgState *s, > uint16_t key, > ptr = s->entries[arch][key].data; > s->entries[arch][key].data = data; > s->entries[arch][key].len = len; > - s->entries[arch][key].callback_opaque = NULL; > s->entries[arch][key].allow_write = false; As Christian have mentioned, this also looks bogus. perhaps another patch to fix that as well. > return ptr;