On 19/03/2025 2:29 pm, Jan Beulich wrote:
> On 19.03.2025 15:20, Andrew Cooper wrote:
>> The current CI failures happen to be a latent bug triggered by a narrow set 
>> of
>> properties of the initrd, which CI encountered by chance.
> Plus properties of the host memory map.
>
>> One step during boot involves constructing directmap mappings for modules.
>> With some probing at the point of creation, it is observed that there's a 4k
>> mapping missing towards the end of the initrd.
>>
>>   (XEN) === Mapped Mod1 [0000000394001000, 00000003be1ff6dc] to Directmap
>>   (XEN) Probing paddr 394001000, va ffff830394001000
>>   (XEN) Probing paddr 3be1ff6db, va ffff8303be1ff6db
>>   (XEN) Probing paddr 3bdffffff, va ffff8303bdffffff
>>   (XEN) Probing paddr 3be001000, va ffff8303be001000
>>   (XEN) Probing paddr 3be000000, va ffff8303be000000
>>   (XEN) Early fatal page fault at e008:ffff82d04032014c 
>> (cr2=ffff8303be000000, ec=0000)
>>
>> The conditions for this bug appear to be map_pages_to_xen() call with a 
>> non-2M
>> aligned start address, some number of full 2M pages, then a tail needing 4k
>> pages.
>>
>> Anyway, the condition for spotting superpage boundaries in map_pages_to_xen()
>> is wrong.  The IS_ALIGNED() macro expects a power of two for the alignment
>> argument, and subtracts 1 itself.
>>
>> Fixing this causes the failing case to now boot.
>>
>> Fixes: 97fb6fcf26e8 ("x86/mm: introduce helpers to detect super page 
>> alignment")
>> Debugged-by: Marek Marczykowski-Górecki <marma...@invisiblethingslab.com>
>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
>> Tested-by: Marek Marczykowski-Górecki <marma...@invisiblethingslab.com>
> Reviewed-by: Jan Beulich <jbeul...@suse.com>
>
>> Judging by how IS_ALIGNED() is wrong, I think the pre-condition might be
>> exactly 4k past a 2M boundary, not just simply a non-2M boundary.
> That's the understanding I have gained, yes.
>
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -5505,7 +5505,7 @@ int map_pages_to_xen(
>>                                                                  \
>>      ASSERT(!mfn_eq(m_, INVALID_MFN));                           \
>>      IS_ALIGNED(PFN_DOWN(v) | mfn_x(m_),                         \
>> -               (1UL << (PAGETABLE_ORDER * ((n) - 1))) - 1);     \
>> +               (1UL << (PAGETABLE_ORDER * ((n) - 1))));         \
> Can we also get rid of the now unnecessary outermost pair of parentheses
> of that operand, please? Imo any reduction in parentheses in constructs
> like this helps readability.

Ok. I'll adjust all of these, and get the fix in.

Thanks.

~Andrew

Reply via email to