On Wed, Jan 24, 2024 at 02:27:10PM +0100, Björn Töpel wrote:
> Conor Dooley <co...@kernel.org> writes:
>
> > On Wed, Jan 24, 2024 at 01:49:51PM +0100, Björn Töpel wrote:
> >> Hi!
> >>
> >> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
> >> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
> >> ("target/riscv/cpu.c: restrict 'marchid' value")
> >>
> >> Reverting that commit, or the hack below solves the boot issue:
> >>
> >> --8<--
> >> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> >> index 8cbfc7e781ad..e18596c8a55a 100644
> >> --- a/target/riscv/cpu.c
> >> +++ b/target/riscv/cpu.c
> >> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
> >>      cpu->cfg.ext_xtheadsync = true;
> >>
> >>      cpu->cfg.mvendorid = THEAD_VENDOR_ID;
> >> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
> >> +                        (QEMU_VERSION_MINOR << 8) |
> >> +                        (QEMU_VERSION_MICRO));
> >>  #ifndef CONFIG_USER_ONLY
> >>      set_satp_mode_max_supported(cpu, VM_1_10_SV39);
> >>  #endif
> >> --8<--
> >>
> >> I'm unsure what the correct qemu way of adding a default value is,
> >> or if c906 should have a proper marchid.
> >
> > The "correct" marchid/mimpid values for the c906 are zero.
>
> Ok! Thanks for clearing that up for me.
>
> > I haven't looked into the code at all, so I am "assuming" that it is
> > being zero initialised at present. Linux applies the errata fixups for
> > the c906 when archid and impid are both zero - so your patch will avoid
> > these fixups being applied.
>
> I'm also assuming 0, -- will double-check. Hmm, that means that the
> *previous* marchid was incorrect (pre d6a427e2c0b2).
>
> > Do you think that perhaps the emulation in QEMU does not support what
> > the kernel uses once the errata fixups are enabled?
>
> Did a quick look at the c906 "in_asm,int" logs:
>
> | 0x80201040:  12000073  sfence.vma zero,zero
> | 0x80201044:  18051073  csrrw zero,satp,a0
> |
> | riscv_cpu_do_interrupt: hart:0, async:0, cause:000000000000000c, epc:0x0000000080201048, tval:0x0000000080201048, desc=exec_page_fault
> | riscv_cpu_do_interrupt: hart:0, async:0, cause:000000000000000c, epc:0xffffffff80001048, tval:0xffffffff80001048, desc=exec_page_fault
> | ...cont forever
>
> So it looks like we're tripping over the page tables, when we're turning
> on paging.
>
> Hmm, maybe it's not qemu, but the c906 that has been broken for a while?

I didn't know what you meant by "not qemu, but the c906", so I went and
boot tested my d1 nezha. On today's next (6.8.0-rc1-next-20240124) it
booted into my initramfs with no problems. Obviously my config is
unlikely to match yours, but that seems like a core thing that should
be hit regardless of config.

So perhaps this is a c906-in-QEMU problem? Lacking emulation for
something the kernel uses perhaps? I know nothing about the capabilities
of its emulation in QEMU, so I am of no help.

Cheers,
Conor.

> I'll disable it temporarily from CI anyhow, and will continue digging.
>
>
> Thanks for the pointers/clarifications, Conor!
> Björn
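
For anyone following along, the arithmetic behind the hack and the match condition described above can be sketched in C. This is an illustration only, under stated assumptions: the helpers `qemu_style_marchid` and `c906_errata_would_match` are hypothetical names that exist in neither QEMU nor Linux, the vendor ID constant is an assumed value, and the real kernel probe may well check more than this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the T-Head vendor ID; for illustration only. */
#define SKETCH_THEAD_VENDOR_ID 0x5b7UL

/*
 * Hypothetical helper: packs a version triple the same way the hack in
 * the thread packs QEMU_VERSION_MAJOR/MINOR/MICRO into marchid.
 */
static uint64_t qemu_style_marchid(unsigned major, unsigned minor,
                                   unsigned micro)
{
    return ((uint64_t)major << 16) | ((uint64_t)minor << 8) |
           (uint64_t)micro;
}

/*
 * Hypothetical helper mirroring the condition described in the thread:
 * the c906 errata fixups apply only when archid and impid are both zero.
 */
static bool c906_errata_would_match(uint64_t vendorid, uint64_t archid,
                                    uint64_t impid)
{
    return vendorid == SKETCH_THEAD_VENDOR_ID && archid == 0 && impid == 0;
}
```

With QEMU 8.2.0, `qemu_style_marchid(8, 2, 0)` gives 0x080200, which is nonzero, so `c906_errata_would_match()` returns false for it and true when archid and impid are both zero. That is consistent with the hack making the boot work by skipping the errata fixups, rather than by fixing whatever goes wrong once they are applied.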