Update: The issue does not follow the toolchain, which is good. Rather, I
found that upstream v5.13-rc1 is stable (with the patch in comment #15
reverted) while upstream v5.13 is not. I bisected v5.13-rc1..v5.13,
reverting the comment #15 patch at each step. A "bad" kernel wouldn't
always fail the same way, but would always fail before completing boot.
The bisect hit this commit:

commit 0c6c2d3615efb7c292573f2e6c886929a2b2da6c (HEAD, refs/bisect/bad)
Author: Mark Brown <broo...@kernel.org>
Date:   Wed Apr 28 13:12:31 2021 +0100

    arm64: Generate cpucaps.h
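
For reference, the search that the bisect performs can be sketched in a
few lines of Python. This is only an illustration under simplifying
assumptions: the history is treated as a linear, oldest-to-newest list,
the first entry is known good and the last known bad, and `boots_ok` is
a hypothetical callback standing in for "check out the commit, revert
the comment #15 patch, build, and see whether the machine completes
boot". Neither name comes from git or from this bug.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], boots_ok: Callable[[str], bool]) -> str:
    """Return the first commit for which boots_ok() is False.

    Assumes commits[0] is good, commits[-1] is bad, and that every
    commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots_ok(commits[mid]):
            lo = mid  # first bad commit lies after mid
        else:
            hi = mid  # mid is bad; first bad commit is mid or earlier
    return commits[hi]
```

Because a "bad" kernel here always failed before completing boot, a
single boot attempt per step is enough for the callback to be reliable.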

While this looks innocuous, it is messing with the code that chooses
which "features" a CPU has, which includes errata that may need kernel
workarounds. So I went back and compared the CPU features messages
between a "good" kernel and a "bad" one. Notably missing from the "bad"
one was this message:

[    0.000000] CPU features: kernel page table isolation forced OFF by ARM64_WORKAROUND_CAVIUM_27456
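
A comparison like the one above can be scripted. The following is a
minimal sketch, assuming the two boot logs have been saved as text and
that each message sits on one line with the usual "[ seconds ]"
timestamp prefix. The function names are made up for this example.

```python
import re

# Matches a leading dmesg timestamp such as "[    0.000000] ".
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
_PREFIX = "CPU features:"


def cpu_feature_lines(dmesg_text):
    """Return the set of "CPU features:" messages, without timestamps."""
    found = set()
    for line in dmesg_text.splitlines():
        msg = _TIMESTAMP.sub("", line)
        if msg.startswith(_PREFIX):
            found.add(msg[len(_PREFIX):].strip())
    return found


def missing_in_bad(good_log, bad_log):
    """Messages present in the good boot log but absent from the bad one."""
    return sorted(cpu_feature_lines(good_log) - cpu_feature_lines(bad_log))
```

Timestamps are stripped before comparing so that the same message
logged at different times in the two boots is not reported as a
difference.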

I went back and tested Ubuntu's 5.13.0-16 with kpti=off, and it booted
fine, as does upstream v5.15-rc2, where previously I was also seeing
stack overflows/corruption. So it seems the above change is likely the
problem; the next step is to figure out why.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1942633

Title:
  Can not boot impish in Cavium ThunderX
