On 21/10/2024 10:53 am, Andrew Cooper wrote: > On 21/10/2024 1:45 am, Daniel P. Smith wrote: >> To start transitioning consider_modules() over to struct boot_module, begin >> with taking the array of struct boot_modules but use the temporary struct >> element mod. >> >> No functional change intended. >> >> Signed-off-by: Daniel P. Smith <dpsm...@apertussolutions.com> >> Reviewed-by: Jason Andryuk <jason.andr...@amd.com> > I've taken patch 1 (currently in testing to push). > > However, this change breaks PV Shim (same test case that I debugged the > OSSTest failure with on Saturday). > > (XEN) PVH start info: (pa 0000ffc0) > (XEN) version: 1 > (XEN) flags: 0 > (XEN) nr_modules: 2 > (XEN) modlist_pa: 000000000000ff60 > (XEN) cmdline_pa: 000000000000ffa0 > (XEN) cmdline: 'pv-shim console=xen,pv' > (XEN) rsdp_pa: 00000000fc008000 > (XEN) mod[0].pa: 0000000001042000 > (XEN) mod[0].size: 0000000006907440 > (XEN) mod[0].cmdline_pa: 0000000000000000 > (XEN) mod[1].pa: 00000000016d9000 > (XEN) mod[1].size: 0000000019870751 > (XEN) mod[1].cmdline_pa: 0000000000000000 > (XEN) Bootloader: PVH Directboot > (XEN) Command line: pv-shim console=xen,pv > (XEN) Xen image load base address: 0 > ... > (XEN) ----[ Xen-4.20-unstable x86_64 debug=y ubsan=y Not tainted ]---- > (XEN) CPU: 0 > (XEN) RIP: e008:[<ffff82d040a51d38>] > arch/x86/pv/dom0_build.c#dom0_construct+0x2020/0x24eb > ... > (XEN) Xen call trace: > (XEN) [<ffff82d040a51d38>] R > arch/x86/pv/dom0_build.c#dom0_construct+0x2020/0x24eb > (XEN) [<ffff82d040a52222>] F dom0_construct_pv+0x1f/0xcd > (XEN) [<ffff82d040a7bb27>] F construct_dom0+0xad/0xc0 > (XEN) [<ffff82d040a6eb3f>] F __start_xen+0x50f2/0x53aa > (XEN) [<ffff82d040206184>] F __high_start+0x94/0xa0 > ... > (XEN) Pagetable walk from ffff8287fffffff8: > (XEN) L4[0x105] = 000000001fa17063 ffffffffffffffff > (XEN) L3[0x01f] = 0000000000000000 ffffffffffffffff > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 0: > (XEN) FATAL PAGE FAULT > (XEN) [error_code=0000] > (XEN) Faulting linear address: ffff8287fffffff8 > (XEN) **************************************** > > We've got a wild access, but it's not clear exactly what's going on here. > > addr2line says it's line 904 which is in the middle of the > page_list_for_each() loop setting up the physmap. > > My gut feeling is that the pointer -> array conversion in > consider_modules() isn't correct, because that's literally the only > non-trivial change, but I can't spot anything concretely wrong...
To follow up, this isn't really the fault of this patch. It's a hindenbug left by prior patches. Fix at https://lore.kernel.org/xen-devel/20241022124114.84498-1-andrew.coop...@citrix.com/T/#u ~Andrew