Update: The issue does not follow the toolchain, which is good. Instead, I found that upstream v5.13-rc1 is stable, with the patch in comment #15 reverted, while upstream v5.13 is not. I bisected v5.13-rc1..v5.13, reverting the comment #15 patch at each step before testing. A "bad" kernel wouldn't always fail the same way, but it would always fail before completing boot. The bisect landed on this commit:

commit 0c6c2d3615efb7c292573f2e6c886929a2b2da6c (HEAD, refs/bisect/bad)
Author: Mark Brown <broo...@kernel.org>
Date:   Wed Apr 28 13:12:31 2021 +0100

    arm64: Generate cpucaps.h

While this looks innocuous, it touches the code that determines which "features" a CPU has, and those include errata that may need kernel workarounds. So I went back and compared the CPU features messages between a "good" kernel and a "bad" one. Noticeably missing from the "bad" one was this message:

[    0.000000] CPU features: kernel page table isolation forced OFF by ARM64_WORKAROUND_CAVIUM_27456

I then tested Ubuntu's 5.13.0-16 with kpti=off, and it booted fine. So does upstream v5.15-rc2, where I was previously also seeing stack overflows/corruption.

So it seems the above change is likely the problem; the next step is to figure out why.

--
https://bugs.launchpad.net/bugs/1942633
Title: Can not boot impish in Cavium ThunderX
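
For anyone wanting to repeat the "CPU features" comparison between a "good" and a "bad" boot log described in the update above, here is a minimal sketch in Python. It assumes both boot logs were captured as plain text (for example from a serial console). The function names and the command-line usage are hypothetical and were not part of the original investigation; only the "CPU features:" prefix and the timestamp format come from the message quoted above.

```python
import re
import sys

# Optional dmesg-style timestamp at the start of a line, e.g. "[    0.000000] ".
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
_PREFIX = "CPU features:"


def cpu_feature_messages(log_text):
    """Return the set of 'CPU features:' messages in a boot log.

    Timestamps and the prefix are stripped so that logs from different
    boots can be compared directly.
    """
    messages = set()
    for line in log_text.splitlines():
        body = _TIMESTAMP.sub("", line.strip())
        if body.startswith(_PREFIX):
            messages.add(body[len(_PREFIX):].strip())
    return messages


def missing_from_bad(good_log, bad_log):
    """Messages printed by the good kernel but absent from the bad one."""
    return sorted(cpu_feature_messages(good_log) - cpu_feature_messages(bad_log))


# Hypothetical usage: python3 compare_features.py good.log bad.log
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as good, open(sys.argv[2]) as bad:
        for message in missing_from_bad(good.read(), bad.read()):
            print("missing from bad kernel:", message)
```

A set difference is used rather than a line-by-line diff because message order and timestamps can vary between boots. A "bad" kernel that dies early may also have a truncated log, so a message missing from it is only a hint, not proof, that the corresponding feature or workaround was never detected.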