On Mon, 15 May 2023, Roger Pau Monné wrote:
> On Fri, May 12, 2023 at 06:17:20PM -0700, Stefano Stabellini wrote:
> > From: Stefano Stabellini <stefano.stabell...@amd.com>
> > 
> > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of
> > the tables in the guest. Instead, copy the tables to Dom0.
> > 
> > This is a workaround.
> > 
> > Signed-off-by: Stefano Stabellini <stefano.stabell...@amd.com>
> > ---
> > As mentioned in the cover letter, this is a RFC workaround as I don't
> > know the cause of the underlying problem. I do know that this patch
> > solves what would be otherwise a hang at boot when Dom0 PVH attempts to
> > parse ACPI tables.
> 
> I'm unsure how safe this is for native systems, as it's possible for
> firmware to modify the data in the tables, so copying them would
> break that functionality.
> 
> I think we need to get to the root cause that triggers this behavior
> on QEMU.  Is it the table checksum that fail, or something else?  Is
> there an error from Linux you could reference?

I agree with you but so far I haven't managed to find a way to the root
of the issue. Here is what I know. These are the logs of a successful
boot using this patch:

[   10.437488] ACPI: Early table checksum verification disabled
[   10.439345] ACPI: RSDP 0x000000004005F955 000024 (v02 BOCHS )
[   10.441033] ACPI: RSDT 0x000000004005F979 000034 (v01 BOCHS  BXPCRSDT 00000001 BXPC 00000001)
[   10.444045] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
[   10.445984] ACPI: FACP 0x000000004005FA65 000074 (v01 BOCHS  BXPCFACP 00000001 BXPC 00000001)
[   10.447170] ACPI BIOS Warning (bug): Incorrect checksum in table [FACP] - 0x67, should be 0x30 (20220331/tbprint-174)
[   10.449522] ACPI: DSDT 0x000000004005FB19 00145D (v01 BOCHS  BXPCDSDT 00000001 BXPC 00000001)
[   10.451258] ACPI: FACS 0x000000004005FAD9 000040
[   10.452245] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff]
[   10.452389] ACPI: Reserving FACP table memory at [mem 0x4005fa65-0x4005fad8]
[   10.452497] ACPI: Reserving DSDT table memory at [mem 0x4005fb19-0x40060f75]
[   10.452602] ACPI: Reserving FACS table memory at [mem 0x4005fad9-0x4005fb18]
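For reference on the FACP checksum warning above: per the ACPI spec, the bytes of a table with a standard header must sum to zero modulo 256. The helper below is only a sketch for checking a dumped table by hand; the function name is made up for illustration, and the assumption is that you have the raw table in a file (for example copied from /sys/firmware/acpi/tables/ in a guest that boots).

```shell
# Hypothetical helper: print the byte sum of a raw ACPI table dump modulo 256.
# A table with a valid checksum prints 0.
acpi_table_checksum() {
    # od prints every byte as an unsigned decimal; awk adds them up.
    od -An -v -tu1 "$1" |
        awk '{ for (i = 1; i <= NF; i++) sum += $i } END { print sum % 256 }'
}
```

A non-zero result means the table contents changed after the checksum was computed, or the dump is not the table you think it is.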


And these are the logs of the same boot (unsuccessful) without this
patch:

[   10.516015] ACPI: Early table checksum verification disabled
[   10.517732] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS )
[   10.519535] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS  BXPCRSDT 00000001 BXPC 00000001)
[   10.522523] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
[   10.527453] ACPI: ���� 0x000000007FFE149D FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF)
[   10.528362] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff]
[   10.528491] ACPI: Reserving ���� table memory at [mem 0x7ffe149d-0x17ffe149b]

It is clearly a memory corruption around FACS, but I couldn't find the
reason for it. The mapping code looks correct. I hope you can suggest a
way to narrow down the problem. If I could, I would suggest applying
this patch just for the QEMU PVH tests, but we don't have the
infrastructure for that in gitlab-ci, as there is a single Xen build for
all tests.
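
To spot this kind of corruption in a captured serial log without reading it by eye, something like the following could work. It is a heuristic sketch, not part of the patch: the function name is made up, and it assumes the Linux table listing format seen in the logs above (a 4-character signature, an address, then a 6-digit hex length).

```shell
# Hypothetical helper: print ACPI table-listing lines whose signature or
# length looks wrong. Heuristic only; based on the log format shown above.
find_bad_acpi_lines() {
    LC_ALL=C awk '
    {
        # Locate the "ACPI:" token; the timestamp prefix varies in width.
        for (i = 1; i <= NF; i++)
            if ($i == "ACPI:")
                break
        sig = $(i + 1); addr = $(i + 2); len = $(i + 3)
        # Only table-listing lines have an address right after the signature.
        if (i > NF || addr !~ /^0x[0-9A-Fa-f]+$/)
            next
        # A sane entry has a 4-character ASCII signature and a 6-digit length.
        if (sig !~ /^[A-Z0-9_][A-Z0-9_][A-Z0-9_][A-Z0-9_]$/ || length(len) != 6)
            print
    }' "$1"
}
```

Running it as "find_bad_acpi_lines smoke.serial" assumes the serial output was saved to that file; on the failing log above it would flag only the entry at 0x000000007FFE149D.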

If it helps to repro on your side, you can just do the following,
assuming your Xen repo is in /local/repos/xen:


cd /local/repos/xen
mkdir binaries
cd binaries
mkdir -p dist/install/

docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/alpine:3.12
cp /initrd* /local/repos/xen/binaries
exit

docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/kernel:6.1.19
cp /bzImage /local/repos/xen/binaries
exit

That's it. Now you have enough pre-built binaries to repro the issue.
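
If you want to double-check that both copies landed before going further, a small sanity check along these lines may help. It is just a sketch: the function name is made up, and the file names are taken from the cp commands above (the exact initrd file name depends on the container image).

```shell
# Hypothetical helper: verify the kernel and initrd were copied into the
# binaries directory. Returns non-zero if either is missing.
check_repro_binaries() {
    dir=${1:-/local/repos/xen/binaries}
    missing=0
    if [ ! -f "$dir/bzImage" ]; then
        echo "missing: $dir/bzImage" >&2
        missing=1
    fi
    # The glob stays unexpanded when nothing matches, so test the first result.
    set -- "$dir"/initrd*
    if [ ! -e "$1" ]; then
        echo "missing: $dir/initrd*" >&2
        missing=1
    fi
    return "$missing"
}
```

Calling "check_repro_binaries" with no argument checks /local/repos/xen/binaries; it does not check for the Xen binary itself.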
Next you can edit automation/scripts/qemu-alpine-x86_64.sh to add

  dom0=pvh dom0_mem=1G dom0-iommu=none

on the Xen command line. I also removed "timeout" and the pipe to "tee"
at the end for my own convenience:

 # Run the test
-rm -f smoke.serial
-set +e
-timeout -k 1 720 \
 qemu-system-x86_64 \
     -cpu qemu64,+svm \
     -m 2G -smp 2 \
     -monitor none -serial stdio \
     -nographic \
     -device virtio-net-pci,netdev=n0 \
-    -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0 |& tee smoke.serial
+    -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0
 

Make sure to build the Xen hypervisor binary and place it under
/local/repos/xen/binaries/

Finally, you can run the test as follows:

cd ..
docker run -it -v /local/repos/xen:/local/repos/xen registry.gitlab.com/xen-project/xen/debian:unstable
cd /local/repos/xen
bash automation/scripts/qemu-alpine-x86_64.sh

It usually gets stuck halfway through the boot without this patch.
