On 19/03/2025 2:29 pm, Jan Beulich wrote: > On 19.03.2025 15:20, Andrew Cooper wrote: >> The current CI failures happen to be a latent bug triggered by a narrow set >> of >> properties of the initrd, which CI encountered by chance. > Plus properties of the host memory map. > >> One step during boot involves constructing directmap mappings for modules. >> With some probing at the point of creation, it is observed that there's a 4k >> mapping missing towards the end of the initrd. >> >> (XEN) === Mapped Mod1 [0000000394001000, 00000003be1ff6dc] to Directmap >> (XEN) Probing paddr 394001000, va ffff830394001000 >> (XEN) Probing paddr 3be1ff6db, va ffff8303be1ff6db >> (XEN) Probing paddr 3bdffffff, va ffff8303bdffffff >> (XEN) Probing paddr 3be001000, va ffff8303be001000 >> (XEN) Probing paddr 3be000000, va ffff8303be000000 >> (XEN) Early fatal page fault at e008:ffff82d04032014c >> (cr2=ffff8303be000000, ec=0000) >> >> The conditions for this bug appear to be map_pages_to_xen() call with a >> non-2M >> aligned start address, some number of full 2M pages, then a tail needing 4k >> pages. >> >> Anyway, the condition for spotting superpage boundaries in map_pages_to_xen() >> is wrong. The IS_ALIGNED() macro expects a power of two for the alignment >> argument, and subtracts 1 itself. >> >> Fixing this causes the failing case to now boot. >> >> Fixes: 97fb6fcf26e8 ("x86/mm: introduce helpers to detect super page >> alignment") >> Debugged-by: Marek Marczykowski-Górecki <marma...@invisiblethingslab.com> >> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com> >> Tested-by: Marek Marczykowski-Górecki <marma...@invisiblethingslab.com> > Reviewed-by: Jan Beulich <jbeul...@suse.com> > >> Judging by how IS_ALIGNED() is wrong, I think the pre-condition might be >> exactly 4k past a 2M boundary, not just simply a non-2M boundary. > That's the understanding I have gained, yes. > >> --- a/xen/arch/x86/mm.c >> +++ b/xen/arch/x86/mm.c >> @@ -5505,7 +5505,7 @@ int map_pages_to_xen( >> \ >> ASSERT(!mfn_eq(m_, INVALID_MFN)); \ >> IS_ALIGNED(PFN_DOWN(v) | mfn_x(m_), \ >> - (1UL << (PAGETABLE_ORDER * ((n) - 1))) - 1); \ >> + (1UL << (PAGETABLE_ORDER * ((n) - 1)))); \ > Can we also get rid of the now unnecessary outermost pair of parentheses > of that operand, please? Imo any reduction in parentheses in constructs > like this helps readability.
Ok. I'll adjust all of these, and get the fix in. Thanks. ~Andrew