Quoting Ard Biesheuvel (2023-01-19 12:11:34)
> (cc Marc)
>
> Context:
> - on my TX2 (with the S1PTW r/o memslot fix applied), the new version
> of ArmVirtQemu that uses an initial ID map in emulated NOR flash works
> fine.
> - in Oliver's case (which is a slightly different flavor of TX2), it
> crashes extremely early, presumably at the point where this ID map is
> activated.
>
> More details at the end.
>
> On Thu, 19 Jan 2023 at 12:03, Oliver Steffen <ostef...@redhat.com> wrote:
> >
> > Quoting Ard Biesheuvel (2023-01-18 10:22:12)
> > > On Wed, 18 Jan 2023 at 09:48, Ard Biesheuvel <a...@kernel.org> wrote:
> > > >
> > > > On Wed, 18 Jan 2023 at 09:28, Oliver Steffen <ostef...@redhat.com>
> > > > wrote:
> > > > >
> > > > > Quoting Ard Biesheuvel (2023-01-18 08:34:32)
> > > > > > On Wed, 18 Jan 2023 at 07:37, Oliver Steffen <ostef...@redhat.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > On Tue, Jan 17, 2023 at 3:57 PM Ard Biesheuvel <a...@kernel.org>
> > > > > > > wrote:
> > > > > > >>
> > > > > > >> On Tue, 17 Jan 2023 at 13:48, Oliver Steffen
> > > > > > >> <ostef...@redhat.com> wrote:
> > > > > > >> >
> > > > > > >> > Hi Ard, Hi everyone,
> > > > > > >> >
> > > > > > >> > Thanks for the work!
> > > > > > >> >
> > > > > > >> > But somehow this patch (as it was merged into master branch)
> > > > > > >> > does not
> > > > > > >> > work for me on the ThunderX box we have.
> > > > > > >> >
> > > > > > >> > Any idea what could be wrong?
> > > > > > >>
> > > > > > >> I'm not sure I understand the question. The patch targets
> > > > > > >> ThunderX,
> > > > > > >> and you are using a ThunderX2.
> > > > > > >>
> > > > > > >> What were you expecting to happen, and what is happening instead?
> > > > > > >
> > > > > > >
> > > > > > > Firmware does not start at all when using KVM.
> > > > > > >
> > > > > > > Please excuse my limited knowledge of Arm processor variants.
> > > > > > > I assumed that ThunderX and ThunderX2 are very similar and hoped
> > > > > > > the fix would also work for this case.
> > > > > > >
> > > > > > > The issue was introduced by the same commit that Dann
> > > > > > > reported (07be1d34d95460a238fcd0f6693efb747c28b329):
> > > > > > > "ArmVirtPkg/ArmVirtQemu: enable initial ID map at early boot".
> > > > > > >
> > > > > >
> > > > > > Can you share the QEMU command line that you are using? I use a
> > > > > > ThunderX2 basically 24/7 to do all my Linux and EDK2 development, so
> > > > > > this change was developed on ThunderX2 and so I'm surprised you are
> > > > > > seeing this issue.
> > > > > >
> > > > > > Did you try the DEBUG build as well?
> > > > > Yes, debug is on.
> > > > >
> > > > > Here is what I have, trying with the master branch from just now
> > > > > (998ebe5ca0ae5c449e83ede533bee872f97d63af):
> > > > >
> > > > > # make -C BaseTools && \
> > > > > . ./edksetup.sh && \
> > > > > build -t GCC5 -a AARCH64 \
> > > > > -p ArmVirtPkg/ArmVirtQemu.dsc \
> > > > > -DCAVIUM_ERRATUM_27456 \
> > > > > -b DEBUG
> > > > >
> > > > > # /usr/libexec/qemu-kvm \
> > > > > -machine accel=kvm -m 1G -boot menu=on \
> > > > > -blockdev node-name=code,driver=file,filename="${FW_CODE_RESIZED}",read-only=on \
> > > > > -blockdev node-name=vars,driver=file,filename="${FW_VARS}" \
> > > > > -machine pflash0=code \
> > > > > -machine pflash1=vars \
> > > > > -cpu max \
> > > > > -net none \
> > > > > -serial stdio
> > > > >
> > > >
> > > > My distro does not have qemu-kvm, and using the command line above
> > > > results in the following if i try it with qemu-system-aarch64
> > > >
> > > > """
> > > > qemu-system-aarch64: No machine specified, and there is no default
> > > > Use -machine help to list supported machines
> > > > """
> > > >
> > > > unless i change it to
> > > >
> > > > qemu-system-aarch64 -machine virt,accel=kvm -m 1G -boot menu=on \
> > > > -blockdev node-name=code,driver=file,filename=$HOME/bin/flash0.img,read-only=on \
> > > > -blockdev node-name=vars,driver=file,filename=$HOME/bin/flash1.img \
> > > > -machine pflash0=code \
> > > > -machine pflash1=vars \
> > > > -cpu max \
> > > > -net none \
> > > > -nographic
> > > >
> > > > and that works fine with my firmware build.
> > > >
> > > >
> > > > > # /usr/libexec/qemu-kvm --version
> > > > > QEMU emulator version 7.2.0 (qemu-kvm-7.2.0-3.el9)
> > > > >
> > > > > # uname -r
> > > > > 5.14.0-234.el9.aarch64
> > > > >
> > > >
> > > > Yeah, that is quite old. One potential issue that comes to mind here
> > > > is the one address by the patch below
> > > >
> > > > >
> > > > > > >
> > > > > > > Since you have the same CPU... Might this be a bug in KVM?
> > > > > > >
> > > > > >
> > > > > > Indeed. Could you try applying this patch?
> > > >
> > > > commit 406504c7b0405d74d74c15a667cd4c4620c3e7a9
> > > > Author: Marc Zyngier <m...@kernel.org>
> > > > Date: Tue Dec 20 14:03:52 2022 +0000
> > > >
> > > > KVM: arm64: Fix S1PTW handling on RO memslots
> > > >
> > > > Or check whether this is generally reproducible with newer kernels?
> > >
> > > Another thing you might try:
> > >
> > > - build the firmware with the following hunk applied
> > >
> > > """
> > > diff --git a/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S b/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
> > > index 5ac7c732f6ec..f4e1285beefc 100644
> > > --- a/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
> > > +++ b/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
> > > @@ -40,6 +40,12 @@
> > > .set sctlrval, SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_SA | SCTLR_EL1_ITD | SCTLR_EL1_SED
> > > .set sctlrval, sctlrval | SCTLR_ELx_I | SCTLR_EL1_SPAN | SCTLR_EL1_RES1
> > >
> > > + .align 11
> > > +.Lvectors:
> > > + .rept 16
> > > + .align 7
> > > + b .
> > > + .endr
> > >
> > > ASM_FUNC(ArmPlatformPeiBootAction)
> > > #ifdef CAVIUM_ERRATUM_27456
> > > @@ -90,6 +96,8 @@ ASM_FUNC(ArmPlatformPeiBootAction)
> > > msr mair_el1, x0 // set up the 1:1 mapping
> > > msr tcr_el1, x1
> > > msr ttbr0_el1, x2
> > > + adr x0, .Lvectors
> > > + msr vbar_el1, x0
> > > isb
> > >
> > > tlbi vmalle1 // invalidate any cached translations
> > > """
> > >
> > > - run qemu with the -s option and let it crash
> > >
> > > - connect with gdb and dump the exception context
> > >
> > > target remote:1234
> > > set radix 16
> > > p $FAR_EL1
> > > p $ESR_EL1
> > > p $ELR_EL1
> > >
> > > That should at least tell us why the crash is occurring.
> > >
> >
> > I tried the most recent Qemu master (v7.2.50) and also v7.0.0,
> > on the 5.14 (RHEL) kernel and on 6.1.6-200.fc37.aarch64 (from Fedora).
> > No luck.
> >
>
> Does that include a backport of commit
> 406504c7b0405d74d74c15a667cd4c4620c3e7a9?
>
> > I applied the patch and attached gdb, as described (Qemu 7.2.50):
> >
> > p $ELR_EL1
> > (gdb) p $FAR_EL1
> > $1 = 0x6200
> > (gdb) p $ESR_EL1
> > $2 = 0x86000010
> > (gdb) p $ELR_EL1
> > $3 = 0x6200
> >
> > There is no sign of any crash. It seems like it does not even start
> > running.
> >
>
> So 0x6200 is the sync exception vector, which is both the code
> location of the crash and the faulting address. This means fetching
> the instructions to handle the original exception failed, and so the
> original exception reason (ESR) is lost. However, the synchronous
> external abort (https://esr.arm64.dev/?#0x86000010) that you are
> seeing might point to an issue similar (or the same) that Marc
> recently fixed in KVM.
>
> It is quite odd that this does not reproduce *at all* on my TX2.
> Fedora kernels don't use 64k pages right?
>

Kernel config says: CONFIG_ARM64_4K_PAGES=y
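
For anyone who wants to double-check the register values from the gdb session offline, here is a rough Python sketch that splits ESR_EL1 into its EC/IL/ISS fields and maps an address inside the vector table to its slot. The function names (decode_esr, decode_vector) and the lookup tables are made up for illustration and deliberately incomplete; they only cover the encodings relevant here. It also assumes the vector table is 2 KiB aligned (as the ".align 11" in the debug hunk arranges), so the slot is simply the low 11 bits of the address.

```python
# Partial lookup tables (illustrative only, not a full ESR decoder).
EXCEPTION_CLASSES = {
    0x20: "Instruction Abort from a lower Exception level",
    0x21: "Instruction Abort taken without a change in Exception level",
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}

FAULT_STATUS_CODES = {
    0x04: "Translation fault, level 0",
    0x05: "Translation fault, level 1",
    0x06: "Translation fault, level 2",
    0x07: "Translation fault, level 3",
    0x10: "Synchronous External abort, not on translation table walk",
    0x14: "Synchronous External abort on translation table walk, level 0",
    0x15: "Synchronous External abort on translation table walk, level 1",
    0x16: "Synchronous External abort on translation table walk, level 2",
    0x17: "Synchronous External abort on translation table walk, level 3",
}

# Four groups of four 0x80-byte slots each, in table order.
VECTOR_GROUPS = [
    "Current EL with SP_EL0",
    "Current EL with SP_ELx",
    "Lower EL using AArch64",
    "Lower EL using AArch32",
]
VECTOR_KINDS = ["Synchronous", "IRQ", "FIQ", "SError"]


def decode_esr(esr):
    """Split an ESR_ELx value into EC, IL and ISS, and name what we can."""
    ec = (esr >> 26) & 0x3F   # exception class, bits [31:26]
    il = (esr >> 25) & 0x1    # instruction length bit, bit [25]
    iss = esr & 0x1FFFFFF     # instruction specific syndrome, bits [24:0]
    fsc = iss & 0x3F          # fault status code (meaningful for aborts only)
    return {
        "ec": ec,
        "il": il,
        "iss": iss,
        "fsc": fsc,
        "ec_name": EXCEPTION_CLASSES.get(ec, "not in this partial table"),
        "fsc_name": FAULT_STATUS_CODES.get(fsc, "not in this partial table"),
    }


def decode_vector(addr):
    """Map an address to (offset, group, kind), assuming a 2 KiB aligned table."""
    offset = addr & 0x7FF     # position inside the 2 KiB vector table
    slot = offset // 0x80     # each vector slot is 0x80 bytes long
    return offset, VECTOR_GROUPS[slot // 4], VECTOR_KINDS[slot % 4]


if __name__ == "__main__":
    info = decode_esr(0x86000010)
    print(hex(info["ec"]), info["ec_name"])
    print(hex(info["fsc"]), info["fsc_name"])
    print(decode_vector(0x6200))
```

With the values from the session above, this agrees with Ard's reading: 0x6200 lands in the synchronous slot for the current EL using SP_ELx, and 0x86000010 is an instruction abort taken at the same exception level with a synchronous external abort fault status.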