Hi Laszlo, On 2/28/19 1:27 PM, Laszlo Ersek wrote: > On 02/28/19 11:12, Auger Eric wrote: >> Hi Laszlo, >> >> On 2/27/19 9:14 PM, Laszlo Ersek wrote: >>> On 02/27/19 13:55, Shameerali Kolothum Thodi wrote: >>>> Hi Laszlo, >>>> >>>>> -----Original Message----- >>>>> From: Shameerali Kolothum Thodi >>>>> Sent: 25 February 2019 09:54 >>>>> To: 'Laszlo Ersek' <ler...@redhat.com>; Auger Eric >>>>> <eric.au...@redhat.com>; >>>>> shannon.zha...@gmail.com; peter.mayd...@linaro.org; >>>>> imamm...@redhat.com; qemu-devel@nongnu.org; qemu-...@nongnu.org >>>>> Cc: xuwei (O) <xuw...@huawei.com>; Linuxarm <linux...@huawei.com>; Ard >>>>> Biesheuvel <ard.biesheu...@linaro.org>; Leif Lindholm (Linaro address) >>>>> <leif.lindh...@linaro.org> >>>>> Subject: RE: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support >>>> >>>> [...] >>>> >>>>>>>> The root cause seems to be EDK2 ACPI table checksum failure >>>>>>>> as NFIT table is getting updated on hot-add. This needs further >>>>>>>> investigation. >>>>>>> + Ard, Leif, Laszlo if they have any idea of what is missing/wrong. >>>>>> >>>>>> Huh, very interesting; I usually don't expect my sanity checks to fire >>>>>> in practice. :) >>>>>> >>>>>> The message >>>>>> >>>>>> ProcessCmdAddChecksum: invalid checksum range in "etc/acpi/tables" >>>>>> >>>>>> is logged by OVMF's and ArmVirtQemu's ACPI Platform DXE Driver when it >>>>>> finds an invalid COMMAND_ADD_CHECKSUM command in QEMU's ACPI >>>>>> linker/loader script. >>>>>> >>>>>> Please see the command definition in QEMU's >>>>>> "hw/acpi/bios-linker-loader.c". In particular, please refer to the >>>>>> function bios_linker_loader_add_checksum(), which builds the command >>>>>> structure, and documents the fields. >>>>>> >>>>>> (You may also refer to QEMU_LOADER_ADD_CHECKSUM in file >>>>>> "OvmfPkg/AcpiPlatformDxe/QemuLoader.h" in the edk2 source tree, for the >>>>>> same information.) >>>>>> >>>>>> The error message is logged if: >>>>>> - the offset at which the checksum should be stored falls outside of the >>>>>> size of the fw_cfg blob, or >>>>>> - the range over which the checksum should be calculated falls outside >>>>>> (at least in part) of the fw_cfg blob. >>>>>> >>>>>> To me this suggests that QEMU generates an invalid >>>>>> COMMAND_ADD_CHECKSUM >>>>>> command for the firmware. >>>>>> >>>>>> ... I've tried to skim the patches briefly. I think there must be an >>>>>> error in the DSDT building logic that is only active on reboot if an >>>>>> nvdimm module was hot-added before the reboot. >>>>> >>>>> Thanks for taking a look and the pointers. I will debug this further >>>>> and get back. >>>> >>>> The root cause of the issue seems to be UEFI not seeing the updated acpi >>>> table blob size on reboot once a new NFIT table is added(nvdimm hot added). >>>> >>>> Please see the debug logs below, >>>> >>>> Initial Guest boot >>>> --------------------------- >>>> >>>> Debug logs from Qemu: >>>> >>>> build_header: acpi sig DSDT len 0x5127 >>>> build_header: acpi sig FACP len 0x10c >>>> build_header: acpi sig APIC len 0xa8 >>>> build_header: acpi sig GTDT len 0x60 >>>> build_header: acpi sig MCFG len 0x3c >>>> build_header: acpi sig SPCR len 0x50 >>>> build_header: acpi sig SRAT len 0x92 >>>> build_header: acpi sig SSDT len 0x38f >>>> build_header: acpi sig XSDT len 0x5c >>>> virt_acpi_build: acpi table_blob len 0x5844 >>>> >>>> Debug logs from UEFI: >>>> >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x9 Start=0x0 >>>> Length=0x5127 Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x5130 >>>> Start=0x5127 Length=0x10C Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x523C >>>> Start=0x5233 Length=0xA8 Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x52E4 >>>> Start=0x52DB Length=0x60 Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x5344 >>>> Start=0x533B Length=0x3C Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x5380 >>>> Start=0x5377 Length=0x50 Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x53D0 >>>> Start=0x53C7 Length=0x92 Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x5462 >>>> Start=0x5459 Length=0x38F Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x57F1 >>>> Start=0x57E8 Length=0x5C Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/rsdp" ResultOffset=0x8 Start=0x0 >>>> Length=0x14 Blob->Size=0x24 >>>> ProcessCmdAddChecksum: File="etc/acpi/rsdp" ResultOffset=0x20 Start=0x0 >>>> Length=0x24 Blob->Size=0x24 >>>> InstallQemuFwCfgTables: installed 8 tables >>>> >>>> Guest Reboot after ndimm hot added >>>> ------------------------------------ >>>> >>>> Debug logs from Qemu: >>>> >>>> build_header: acpi sig DSDT len 0x5127 >>>> build_header: acpi sig FACP len 0x10c >>>> build_header: acpi sig APIC len 0xa8 >>>> build_header: acpi sig GTDT len 0x60 >>>> build_header: acpi sig MCFG len 0x3c >>>> build_header: acpi sig SPCR len 0x50 >>>> build_header: acpi sig SRAT len 0x92 >>>> build_header: acpi sig SSDT len 0x38f >>>> build_header: acpi sig NFIT len 0xe0 -->New >>>> build_header: acpi sig XSDT len 0x64 >>>> virt_acpi_build: acpi table_blob len 0x592c -->blob len updated >>>> >>>> Debug logs from UEFI: >>>> >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x9 Start=0x0 >>>> Length=0x5127 Blob->Size=0x5844 -->Wrong blob size. >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x5130 >>>> Start=0x5127 Length=0x10C Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x523C >>>> Start=0x5233 Length=0xA8 Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x52E4 >>>> Start=0x52DB Length=0x60 Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x5344 >>>> Start=0x533B Length=0x3C Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x5380 >>>> Start=0x5377 Length=0x50 Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x53D0 >>>> Start=0x53C7 Length=0x92 Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x5462 >>>> Start=0x5459 Length=0x38F Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x57F1 >>>> Start=0x57E8 Length=0xE0 Blob->Size=0x5844 >>>> ProcessCmdAddChecksum: invalid checksum range in "etc/acpi/tables" >>>> OnRootBridgesConnected: InstallAcpiTables: Protocol Error >>>> >>>> >>>> To me it seems on ARM vit acpi path, the blob len is calculated based >>>> on actual tables and is updated only in virt_acpi_setup() --> >>>> acpi_add_rom_blob() >>>> path. I had a look at the x86 code and it looks like, there, the blob len >>>> gets updated >>>> with an additional buffer to take care of table resizing[1]. >>>> >>>> As a hack i added the same to ARM virt and it seems to resolve the issue. >>>> I am not sure this is the best approach to fix this though. >>>> >>>> Please let me know your thoughts. >>>> >>>> Thanks, >>>> Shameer >>>> >>>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c >>>> index 132414c..4291553 100644 >>>> --- a/hw/arm/virt-acpi-build.c >>>> +++ b/hw/arm/virt-acpi-build.c >>>> @@ -50,6 +50,8 @@ >>>> #define ARM_SPI_BASE 32 >>>> #define ACPI_POWER_BUTTON_DEVICE "PWRB" >>>> >>>> +#define ACPI_BUILD_TABLE_SIZE 0x20000 >>>> + >>>> static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus) >>>> { >>>> uint16_t i; >>>> @@ -886,6 +888,10 @@ void virt_acpi_build(VirtMachineState *vms, >>>> AcpiBuildTables *tables) >>>> build_rsdp(tables->rsdp, tables->linker, &rsdp_data); >>>> } >>>> >>>> + /* Make sure we have a buffer in case we need to resize the tables. */ >>>> + g_array_set_size(tables_blob, ROUND_UP(acpi_data_len(tables_blob), >>>> + ACPI_BUILD_TABLE_SIZE)); >>>> + >>>> /* Cleanup memory that's no longer used. */ >>>> g_array_free(table_offsets, true); >>>> } >>>> >>>> [1] https://github.com/qemu/qemu/blob/master/hw/i386/acpi-build.c#L2792 >>> >>> Nice analysis, thanks. >>> >>> I think the line that you reference, i.e. >>> >>> acpi_align_size(tables_blob, ACPI_BUILD_TABLE_SIZE); >>> >>> in acpi_build() [hw/i386/acpi-build.c] masks this issue for x86 only as >>> a side effect. To my understanding, the alignment / padding exists there >>> for migration compatibility. It doesn't exist for updating the size of >>> the ACPI blobs in fw_cfg across reboots. The issue is masked because the >>> alignment is large enough (un-changed) to contain the regenerated blobs >>> as well.> >>> Given that the "virt" machine type is versioned, I think migration >>> compat is a valid concern there too. This in itself would justify a >>> similar padding. >> I don't understand the migration compat issue. Please could you elaborate? > > git-blame explains it to some extent -- please see commit 07fb61760cde > ("pc: hack for migration compatibility from QEMU 2.0", 2014-07-28). > > I don't remember any details at this point that the commit does not > state. (I see that I reviewed the patch back then, so perhaps the > mailing list archive has some discussion.) > > Interestingly, the commit message refers to "memory hotplug work" too. > > ... Ahh, wait, I do remember the main issue now. Here's the thing. The > ACPI payload that QEMU generates for the firmware is considered a part > of the firmware itself. Therefore, it is not versioned -- because the > firmware itself is not versioned. (In other words, if you migrate a VM > from one host to another host, and that other host has different > firmware that the VM will pick up after re-launch (from cold boot), then > the firmware will change in the VM.) > > By considering ACPI a part of the firmware, QEMU never versioned the > ACPI payload, just like the actual firmware was never versioned. In > other words, if you have machine type Foo on qemu release Bar, and > machine type Foo on qemu release Baz, compat properties and such will > ensure that the virtual hardware looks the same to the guest, but QEMU > will *not* ensure that the ACPI payload generated at QEMU startup (more > precisely, at "machine done") will be identical. Despite the fact that > both QEMU instances use machine type Foo. > > Now, combine this with the feature that fw_cfg has been backed by RAM > Blocks, for a quite long time now (this wasn't always the case, but it > has been for multiple years now). The end result is that the RAM > block(s) holding the initial ACPI payload may differ between releases > Bar and Baz, within the same machine type Foo. This means that migration > between them will fail, due to RAMBlock size difference. > > Hence the padding -- it tries to cancel out small variances in ACPI > payload size. > >>> >>> I don't know if we want to specifically care about size-changing >>> ACPI-regen across reboot. I believe measures for that specific use case >>> don't exist in x86 machine types either. >> The NFIT redimensioning should exit on x86 too? > > That's not my point. My point was that the padding, which was originally > supposed to mask variances in ACPI payload size across *QEMU releases*, > for migration compat, ended up masking a variance of different origin: > namely ACPI regeneration at reboot (with different contents). In other > words, we never implemented any specific measures for this > resize-on-reboot issue, instead we allowed the migration compat code > (the padding) to take care of it as well. > > In virt, there is no such ACPI padding code (for migration compat) -- > for whatever reason --, and so it *also* cannot take care of the > resize-on-reboot problem.
That's clearer now. Thank you for the explanations. Thanks Eric > > [...] > > Thanks > Laszlo >