Quoting Ard Biesheuvel (2023-01-19 12:11:34)
>  (cc Marc)
>
> Context:
> - on my TX2 (with the S1PTW r/o memslot fix applied), the new version
> of ArmVirtQemu that uses an initial ID map in emulated NOR flash works
> fine.
> - in Oliver's case (which is a slightly different flavor of TX2), it
> crashes extremely early, presumably at the point where this ID map is
> activated.
>
> More details at the end.
>
> On Thu, 19 Jan 2023 at 12:03, Oliver Steffen <ostef...@redhat.com> wrote:
> >
> > Quoting Ard Biesheuvel (2023-01-18 10:22:12)
> > > On Wed, 18 Jan 2023 at 09:48, Ard Biesheuvel <a...@kernel.org> wrote:
> > > >
> > > > > On Wed, 18 Jan 2023 at 09:28, Oliver Steffen <ostef...@redhat.com> wrote:
> > > > >
> > > > > Quoting Ard Biesheuvel (2023-01-18 08:34:32)
> > > > > > > On Wed, 18 Jan 2023 at 07:37, Oliver Steffen <ostef...@redhat.com> wrote:
> > > > > > >
> > > > > > > > On Tue, Jan 17, 2023 at 3:57 PM Ard Biesheuvel <a...@kernel.org> wrote:
> > > > > > >>
> > > > > > > >> On Tue, 17 Jan 2023 at 13:48, Oliver Steffen <ostef...@redhat.com> wrote:
> > > > > > >> >
> > > > > > >> > Hi Ard, Hi everyone,
> > > > > > >> >
> > > > > > >> > Thanks for the work!
> > > > > > >> >
> > > > > > > >> > But somehow this patch (as it was merged into master branch)
> > > > > > > >> > does not work for me on the ThunderX box we have.
> > > > > > >> >
> > > > > > >> > Any idea what could be wrong?
> > > > > > >>
> > > > > > > >> I'm not sure I understand the question. The patch targets ThunderX,
> > > > > > > >> and you are using a ThunderX2.
> > > > > > >>
> > > > > > >> What were you expecting to happen, and what is happening instead?
> > > > > > >
> > > > > > >
> > > > > > > Firmware does not start at all when using KVM.
> > > > > > >
> > > > > > > Please excuse my limited knowledge of Arm processor variants.
> > > > > > > I assumed that ThunderX and ThunderX2 are very similar and hoped
> > > > > > > the fix would also work for this case.
> > > > > > >
> > > > > > > The issue was introduced by the same commit that Dann
> > > > > > > reported (07be1d34d95460a238fcd0f6693efb747c28b329):
> > > > > > > "ArmVirtPkg/ArmVirtQemu: enable initial ID map at early boot".
> > > > > > >
> > > > > >
> > > > > > Can you share the QEMU command line that you are using? I use a
> > > > > > > ThunderX2 basically 24/7 to do all my Linux and EDK2 development, and
> > > > > > > this change was developed on ThunderX2, so I'm surprised you are
> > > > > > > seeing this issue.
> > > > > >
> > > > > > Did you try the DEBUG build as well?
> > > > > Yes, debug is on.
> > > > >
> > > > > Here is what I have, trying with the master branch from just now
> > > > > (998ebe5ca0ae5c449e83ede533bee872f97d63af):
> > > > >
> > > > > # make -C BaseTools && \
> > > > >   . ./edksetup.sh && \
> > > > >   build -t GCC5 -a AARCH64 \
> > > > >     -p ArmVirtPkg/ArmVirtQemu.dsc \
> > > > >     -DCAVIUM_ERRATUM_27456 \
> > > > >     -b DEBUG
> > > > >
> > > > > # /usr/libexec/qemu-kvm \
> > > > >     -machine accel=kvm -m 1G -boot menu=on \
> > > > > >     -blockdev node-name=code,driver=file,filename="${FW_CODE_RESIZED}",read-only=on \
> > > > >     -blockdev node-name=vars,driver=file,filename="${FW_VARS}" \
> > > > >     -machine pflash0=code \
> > > > >     -machine pflash1=vars \
> > > > >     -cpu max \
> > > > >     -net none \
> > > > >     -serial stdio
> > > > >
> > > >
> > > > My distro does not have qemu-kvm, and using the command line above
> > > > results in the following if I try it with qemu-system-aarch64:
> > > >
> > > > """
> > > > qemu-system-aarch64: No machine specified, and there is no default
> > > > Use -machine help to list supported machines
> > > > """
> > > >
> > > > unless I change it to
> > > >
> > > > qemu-system-aarch64 -machine virt,accel=kvm -m 1G -boot menu=on \
> > > >     -blockdev node-name=code,driver=file,filename=$HOME/bin/flash0.img,read-only=on \
> > > >     -blockdev node-name=vars,driver=file,filename=$HOME/bin/flash1.img \
> > > >     -machine pflash0=code \
> > > >     -machine pflash1=vars \
> > > >     -cpu max \
> > > >     -net none \
> > > >     -nographic
> > > >
> > > > and that works fine with my firmware build.
> > > >
> > > >
> > > > > # /usr/libexec/qemu-kvm --version
> > > > > QEMU emulator version 7.2.0 (qemu-kvm-7.2.0-3.el9)
> > > > >
> > > > > # uname -r
> > > > > 5.14.0-234.el9.aarch64
> > > > >
> > > >
> > > > Yeah, that is quite old. One potential issue that comes to mind here
> > > > is the one addressed by the patch below
> > > >
> > > >
> > > > >
> > > > >
> > > > > Since you have the same CPU... Might this be a bug in KVM?
> > > > >
> > > >
> > > > Indeed. Could you try applying this patch?
> > > >
> > > > commit 406504c7b0405d74d74c15a667cd4c4620c3e7a9
> > > > Author: Marc Zyngier <m...@kernel.org>
> > > > Date:   Tue Dec 20 14:03:52 2022 +0000
> > > >
> > > >     KVM: arm64: Fix S1PTW handling on RO memslots
> > > >
> > > > Or check whether this is generally reproducible with newer kernels?
> > >
> > > Another thing you might try:
> > >
> > > - build the firmware with the following hunk applied
> > >
> > > """
> > > diff --git a/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S b/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
> > > index 5ac7c732f6ec..f4e1285beefc 100644
> > > --- a/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
> > > +++ b/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
> > > @@ -40,6 +40,12 @@
> > >   .set    sctlrval, SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_SA | SCTLR_EL1_ITD | SCTLR_EL1_SED
> > >   .set    sctlrval, sctlrval | SCTLR_ELx_I | SCTLR_EL1_SPAN | SCTLR_EL1_RES1
> > >
> > > +  .align  11
> > > +.Lvectors:
> > > +  .rept   16
> > > +  .align  7
> > > +  b       .
> > > +  .endr
> > >
> > >  ASM_FUNC(ArmPlatformPeiBootAction)
> > >  #ifdef CAVIUM_ERRATUM_27456
> > > @@ -90,6 +96,8 @@ ASM_FUNC(ArmPlatformPeiBootAction)
> > >    msr    mair_el1, x0            // set up the 1:1 mapping
> > >    msr    tcr_el1, x1
> > >    msr    ttbr0_el1, x2
> > > +  adr    x0, .Lvectors
> > > +  msr    vbar_el1, x0
> > >    isb
> > >
> > >    tlbi   vmalle1                 // invalidate any cached translations
> > > """
> > >
> > > - run qemu with the -s option and let it crash
> > >
> > > - connect with gdb and dump the exception context
> > >
> > > target remote:1234
> > > set radix 16
> > > p $FAR_EL1
> > > p $ESR_EL1
> > > p $ELR_EL1
> > >
> > > That should at least tell us why the crash is occurring.
> > >
> >
> > I tried the most recent Qemu master (v7.2.50) and also v7.0.0,
> > on the 5.14 (RHEL) kernel and on 6.1.6-200.fc37.aarch64 (from Fedora).
> > No luck.
> >
>
> Does that include a backport of commit 406504c7b0405d74d74c15a667cd4c4620c3e7a9?
>
> > I applied the patch and attached gdb, as described (Qemu 7.2.50):
> >
> >   p $ELR_EL1
> >   (gdb) p $FAR_EL1
> >   $1 = 0x6200
> >   (gdb) p $ESR_EL1
> >   $2 = 0x86000010
> >   (gdb) p $ELR_EL1
> >   $3 = 0x6200
> >
> > There is no sign of any crash. It seems like it does not even start
> > running.
> >
>
> So 0x6200 is the sync exception vector, which is both the code
> location of the crash and the faulting address. This means fetching
> the instructions to handle the original exception failed, and so the
> original exception reason (ESR) is lost. However, the synchronous
> external abort (https://esr.arm64.dev/?#0x86000010) that you are
> seeing might point to an issue similar to (or the same as) the one that
> Marc recently fixed in KVM.
>
> It is quite odd that this does not reproduce *at all* on my TX2.
> Fedora kernels don't use 64k pages, right?
>

Kernel config says:
CONFIG_ARM64_4K_PAGES=y
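
For reference, the two values from the gdb session above can also be
decoded by hand. The following is only a rough Python sketch, not part of
any tool: the helper names (decode_esr, vector_slot) are made up for
illustration, the lookup tables are deliberately partial, and it assumes
that the vector table is 2 KiB aligned (as with the .align 11 in the hunk)
and that the address passed in falls inside that table.

```python
# Sketch only: field positions follow the architectural ESR_ELx layout
# (EC = bits [31:26], IL = bit 25, ISS = bits [24:0]). The name tables
# below are partial and cover only the cases relevant to this thread.
EC_NAMES = {
    0x20: "Instruction Abort from a lower Exception level",
    0x21: "Instruction Abort taken without a change in Exception level",
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}
FSC_NAMES = {
    0x10: "Synchronous External abort, not on translation table walk",
}

# AArch64 vector table: 16 entries of 0x80 bytes, in four groups of four.
VECTOR_GROUPS = [
    "Current EL with SP0",
    "Current EL with SPx",
    "Lower EL using AArch64",
    "Lower EL using AArch32",
]
VECTOR_TYPES = ["Synchronous", "IRQ", "FIQ", "SError"]


def decode_esr(esr):
    """Split an ESR_ELx value into its top-level fields."""
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    iss = esr & 0x1FFFFFF
    fsc = iss & 0x3F  # fault status code; meaningful for abort ECs only
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "not in this partial table"),
        "il": il,
        "iss": iss,
        "fsc": fsc,
        "fsc_name": FSC_NAMES.get(fsc, "not in this partial table"),
    }


def vector_slot(addr):
    """Map an address inside a 2 KiB aligned vector table to its slot."""
    offset = addr & 0x7FF      # offset from the 2 KiB aligned base
    index = offset // 0x80     # each vector entry is 0x80 bytes
    return VECTOR_GROUPS[index // 4], VECTOR_TYPES[index % 4]


if __name__ == "__main__":
    print(decode_esr(0x86000010))
    print(vector_slot(0x6200))
```

With the values above, this reports an instruction abort taken without a
change in exception level, with a synchronous external abort fault status,
and places 0x6200 in the "Current EL with SPx" synchronous slot, which
matches the reading given in the quoted reply.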


