On Mon, 15 May 2023, Roger Pau Monné wrote:
> On Fri, May 12, 2023 at 06:17:20PM -0700, Stefano Stabellini wrote:
> > From: Stefano Stabellini <stefano.stabell...@amd.com>
> >
> > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of
> > the tables in the guest. Instead, copy the tables to Dom0.
> >
> > This is a workaround.
> >
> > Signed-off-by: Stefano Stabellini <stefano.stabell...@amd.com>
> > ---
> > As mentioned in the cover letter, this is a RFC workaround as I don't
> > know the cause of the underlying problem. I do know that this patch
> > solves what would be otherwise a hang at boot when Dom0 PVH attempts to
> > parse ACPI tables.
>
> I'm unsure how safe this is for native systems, as it's possible for
> firmware to modify the data in the tables, so copying them would
> break that functionality.
>
> I think we need to get to the root cause that triggers this behavior
> on QEMU. Is it the table checksum that fail, or something else? Is
> there an error from Linux you could reference?

I agree with you, but so far I haven't managed to get to the root of
the issue. Here is what I know. These are the logs of a successful boot
using this patch:

[   10.437488] ACPI: Early table checksum verification disabled
[   10.439345] ACPI: RSDP 0x000000004005F955 000024 (v02 BOCHS )
[   10.441033] ACPI: RSDT 0x000000004005F979 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001)
[   10.444045] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001)
[   10.445984] ACPI: FACP 0x000000004005FA65 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001)
[   10.447170] ACPI BIOS Warning (bug): Incorrect checksum in table [FACP] - 0x67, should be 0x30 (20220331/tbprint-174)
[   10.449522] ACPI: DSDT 0x000000004005FB19 00145D (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001)
[   10.451258] ACPI: FACS 0x000000004005FAD9 000040
[   10.452245] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff]
[   10.452389] ACPI: Reserving FACP table memory at [mem 0x4005fa65-0x4005fad8]
[   10.452497] ACPI: Reserving DSDT table memory at [mem 0x4005fb19-0x40060f75]
[   10.452602] ACPI: Reserving FACS table memory at [mem 0x4005fad9-0x4005fb18]

And these are the logs of the same boot (unsuccessful) without this
patch:

[   10.516015] ACPI: Early table checksum verification disabled
[   10.517732] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS )
[   10.519535] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001)
[   10.522523] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001)
[   10.527453] ACPI: ���� 0x000000007FFE149D FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF)
[   10.528362] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff]
[   10.528491] ACPI: Reserving ���� table memory at [mem 0x7ffe149d-0x17ffe149b]

It is clearly a memory corruption around FACS, but I couldn't find the
reason for it. The mapping code looks correct. I hope you can suggest a
way to narrow down the problem.

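On the checksum question: an ACPI table with a standard header is
considered valid when all of its bytes, including the checksum byte, sum
to 0 modulo 256. That is the check behind the "Incorrect checksum in
table [FACP]" warning in the successful boot log. Below is a minimal
sketch of that check for a table that has already been dumped to a
file. The function name acpi_table_sum is hypothetical, and the sketch
assumes the file contains exactly one complete table.

```shell
#!/bin/sh
# Hypothetical helper: print the sum of every byte in a dumped ACPI
# table, modulo 256. A result of 0 means the checksum is consistent.
acpi_table_sum() {
    # od prints each byte as an unsigned decimal; awk adds them up.
    od -An -v -tu1 "$1" |
        awk '{ for (i = 1; i <= NF; i++) s += $i } END { print s % 256 }'
}
```

On a system that does boot, this could be pointed at a table exposed by
Linux, for example "acpi_table_sum /sys/firmware/acpi/tables/FACP" (run
as root). Any non-zero result indicates the table contents and the
checksum byte disagree.
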
If I could, I would suggest applying this patch just for the QEMU PVH
tests, but we don't have the infrastructure for that in gitlab-ci as
there is a single Xen build for all tests.

If it helps to repro on your side, you can just do the following,
assuming your Xen repo is in /local/repos/xen:

cd /local/repos/xen
mkdir binaries
cd binaries
mkdir -p dist/install/

docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/alpine:3.12
cp /initrd* /local/repos/xen/binaries
exit

docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/kernel:6.1.19
cp /bzImage /local/repos/xen/binaries
exit

That's it. Now you have enough pre-built binaries to repro the issue.
Next you can edit automation/scripts/qemu-alpine-x86_64.sh to add

  dom0=pvh dom0_mem=1G dom0-iommu=none

on the Xen command line. I also removed "timeout" and the pipe to "tee"
at the end for my own convenience:

 # Run the test
-rm -f smoke.serial
-set +e
-timeout -k 1 720 \
 qemu-system-x86_64 \
     -cpu qemu64,+svm \
     -m 2G -smp 2 \
     -monitor none -serial stdio \
     -nographic \
     -device virtio-net-pci,netdev=n0 \
-    -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0 |& tee smoke.serial
+    -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0

Make sure to build the Xen hypervisor binary and place it under
/local/repos/xen/binaries/.

You can finally run the test with the below:

cd ..
docker run -it -v /local/repos/xen:/local/repos/xen registry.gitlab.com/xen-project/xen/debian:unstable
cd /local/repos/xen
bash automation/scripts/qemu-alpine-x86_64.sh

Without this patch, it usually gets stuck halfway through the boot.