On 23/01/2026 3:31 pm, Julian Vetter wrote:
> On 1/22/26 15:11, Jan Beulich wrote:
>> On 22.01.2026 14:57, Andrew Cooper wrote:
>>> On 22/01/2026 1:48 pm, Julian Vetter wrote:
>>>> (XEN) Early fatal page fault at e008:ffff82d0403b38e0 (cr2=0000000001100202, ec=0009)
>>>> (XEN) ----[ Xen-4.22-unstable  x86_64  debug=y  Not tainted ]----
>>>> (XEN) CPU:    0
>>>> (XEN) RIP:    e008:[<ffff82d0403b38e0>] memcmp+0x20/0x46
>>>> (XEN) RFLAGS: 0000000000010002   CONTEXT: hypervisor
>>>> (XEN) rax: 0000000000000000   rbx: 0000000001100000   rcx: 0000000000000000
>>>> (XEN) rdx: 0000000000000004   rsi: ffff82d0404a0d23   rdi: 0000000001100202
>>>> (XEN) rbp: ffff82d040497d88   rsp: ffff82d040497d78   r8:  0000000000000016
>>>> (XEN) r9:  ffff82d04061a180   r10: ffff82d04061a188   r11: 0000000000000010
>>>> (XEN) r12: 0000000001100000   r13: 0000000000000001   r14: ffff82d0404d2b80
>>>> (XEN) r15: ffff82d040462750   cr0: 0000000080050033   cr4: 00000000000000a0
>>>> (XEN) cr3: 00000000b5d0e000   cr2: 0000000001100202
>>>> (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
>>>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>>>> (XEN) Xen code around <ffff82d0403b38e0> (memcmp+0x20/0x46):
>>>> (XEN)  0f 1f 84 00 00 00 00 00 <0f> b6 04 0f 44 0f b6 04 0e 44 29 c0 75 13 48 83
>>>> (XEN) Xen stack trace from rsp=ffff82d040497d78:
>>>> (XEN)    ffff82d040483f79 0000000000696630 ffff82d040497db0 ffff82d040483fd2
>>>> (XEN)    0000000000696630 ffff82d040200000 0000000000000001 ffff82d040497ef8
>>>> (XEN)    ffff82d04047c4ac 0000000000000000 0000000000000000 0000000000000000
>>>> (XEN)    ffff82d04062c6d8 0000000000000000 0000000000000000 0000000000000000
>>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>> (XEN)    0000000000000000 0000000000140000 0000000000000000 0000000000000001
>>>> (XEN)    0000000000000000 0000000000000000 ffff82d040497f08 ffff82d0404d2b80
>>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>> (XEN)    0000000000000000 0000000800000000 000000010000006e 0000000000000003
>>>> (XEN)    00000000000002f8 0000000000000000 0000000000000000 0000000000000000
>>>> (XEN)    0000000099f30ba0 0000000099feeda7 0000000000000000 ffff82d040497fff
>>>> (XEN)    00000000b9cf3920 ffff82d0402043e8 0000000000000000 0000000000000000
>>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>> (XEN)    0000000000000000 0000e01000000000 0000000000000000 0000000000000000
>>>> (XEN)    00000000000000a0 0000000000000000 0000000000000000 0000000000000000
>>>> (XEN) Xen call trace:
>>>> (XEN)    [<ffff82d0403b38e0>] R memcmp+0x20/0x46
>>>> (XEN)    [<ffff82d040483f79>] S arch/x86/bzimage.c#bzimage_check+0x2e/0x73
>>>> (XEN)    [<ffff82d040483fd2>] F bzimage_headroom+0x14/0xa5
>>>> (XEN)    [<ffff82d04047c4ac>] F __start_xen+0x908/0x2452
>>>> (XEN)    [<ffff82d0402043e8>] F __high_start+0xb8/0xc0
>>>> (XEN)
>>>> (XEN) Pagetable walk from 0000000001100202:
>>>> (XEN)  L4[0x000] = 00000000b5c9d063 ffffffffffffffff
>>>> (XEN)
>>>> (XEN) ****************************************
>>>> (XEN) Panic on CPU 0:
>>>> (XEN) FATAL TRAP: vec 14, #PF[0009] IN INTERRUPT CONTEXT
>>>> (XEN) ****************************************
>>> Huh, that means we have a bug in the pagewalk rendering.  It shouldn't
>>> give up like that.
>> Is it perhaps too early for mfn_valid() to return "true" for the page table
>> page in question?
> Yes, this is indeed the problem. Thank you, Jan. mfn_valid() doesn't
> work yet at this point, because max_page is only set later in
> __start_xen. Here is the actual translation:
>
> (XEN) Xen call trace:
> (XEN)    [<ffff82d0403b3820>] R memcmp+0x20/0x46
> (XEN)    [<ffff82d040483f79>] S arch/x86/bzimage.c#bzimage_check+0x2e/0x73
> (XEN)    [<ffff82d040483fd2>] F bzimage_headroom+0x14/0xa5
> (XEN)    [<ffff82d04047c4ac>] F __start_xen+0x908/0x2452
> (XEN)    [<ffff82d0402043e8>] F __high_start+0xb8/0xc0
> (XEN)
> (XEN) Pagetable walk from 0000000001100202:
> (XEN) Using simple walk without mfn_valid
> (XEN) Early pagetable walk from 0000000001100202 (cr3=00000000b5d0e000):
> (XEN)  L4[0x000] = 00000000b5c9d063
> (XEN)  L3[0x000] = 00000000b5c99063
> (XEN)  L2[0x008] = 80000000000001e3 (2MB)
>
> I also found the underlying issue, i.e. why the code faults in the
> first place. Somewhere before early_init_{intel,amd}, there is a call to
> bzimage_headroom(bootstrap_map_bm(&bi->mods[0]), bi->mods[0].size), and
> bootstrap_map_bm() maps the new page with __PAGE_HYPERVISOR_RO, which
> has PAGE_NX set. I'm not sure how to work around this.

I'm working on a cleanup series to untangle the mess.

~Andrew
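
For readers following the numbers in the trace, here is a minimal sketch of how they can be decoded by hand. It is not Xen code: the macro and function names below are made up for illustration, and only the bit positions (x86 page-fault error code, 4-level paging indices, bit 63 of a page-table entry) are architectural. With ec=0009 the fault reports "present" plus "reserved bit set", and the L2 entry 80000000000001e3 has bit 63 set. That is consistent with the diagnosis above if NX is not yet enabled at this point in boot, which is an assumption here rather than something the log shows directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-fault error code bits (names are illustrative). */
#define EC_PRESENT  (1u << 0)   /* fault on a present translation */
#define EC_WRITE    (1u << 1)   /* access was a write */
#define EC_USER     (1u << 2)   /* access came from user mode */
#define EC_RSVD     (1u << 3)   /* a reserved bit was set in a paging entry */
#define EC_INSN     (1u << 4)   /* fault on an instruction fetch */

/* Architectural page-table entry bits used below. */
#define ENTRY_PSE   (1ULL << 7)   /* large page at L2 (2MB) or L3 (1GB) */
#define ENTRY_NX    (1ULL << 63)  /* no-execute; reserved unless NX is enabled */

/* Index into the page table at 'level' (1..4) for a 4-level walk. */
static inline unsigned int pt_index(uint64_t va, unsigned int level)
{
    return (unsigned int)((va >> (12 + 9 * (level - 1))) & 0x1ff);
}

static inline bool ec_has(uint32_t ec, uint32_t bit)
{
    return (ec & bit) != 0;
}

static inline bool entry_has(uint64_t entry, uint64_t bit)
{
    return (entry & bit) != 0;
}

/* Re-derive the values printed in the trace above. */
static void check_trace_values(void)
{
    const uint64_t cr2 = 0x0000000001100202ULL;
    const uint64_t l2e = 0x80000000000001e3ULL;
    const uint32_t ec  = 0x0009;

    /* The walk printed L4[0x000], L3[0x000], L2[0x008]. */
    assert(pt_index(cr2, 4) == 0x000);
    assert(pt_index(cr2, 3) == 0x000);
    assert(pt_index(cr2, 2) == 0x008);

    /* ec=0009: present + reserved bit, on a read from supervisor mode. */
    assert(ec_has(ec, EC_PRESENT) && ec_has(ec, EC_RSVD));
    assert(!ec_has(ec, EC_WRITE) && !ec_has(ec, EC_USER));

    /* The L2 entry is a 2MB mapping with bit 63 set. */
    assert(entry_has(l2e, ENTRY_PSE) && entry_has(l2e, ENTRY_NX));
}
```

Calling check_trace_values() from a small test program passes all assertions, i.e. the indices and flags agree with what the early pagetable walk printed.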
