On 09/12/2020 13:59, Manuel Bouyer wrote:
> On Wed, Dec 09, 2020 at 01:28:54PM +0000, Andrew Cooper wrote:
>> Pagefaults on IRET come either from stack accesses for operands (not the
>> case here as Xen is otherwise working fine), or from segment selector
>> loads for %cs and %ss.
>>
>> In this example, %ss is in the LDT, which specifically does use
>> pagefaults to promote the frame to PGT_segdesc.
>>
>> I suspect that what is happening is that handle_ldt_mapping_fault() is
>> failing to promote the page (for some reason), and we're taking the "In
>> hypervisor mode? Leave it to the #PF handler to fix up." path due to the
>> confusion in context, and Xen's #PF handler is concluding "nothing else
>> to do".
>>
>> The older behaviour of escalating to the failsafe callback would have
>> broken this cycle by rewriting %ss and re-entering the kernel.
>>
>>
>> Please try the attached debugging patch, which is an extension of what I
>> gave you yesterday.  First, it ought to print %cr2, which I expect will
>> point to Xen's virtual mapping of the vcpu's LDT.  The logic ought to
>> loop a few times so we can inspect the hypervisor codepaths which are
>> effectively livelocked in this state, and I've also instrumented
>> check_descriptor() failures because I've got a gut feeling that is the
>> root cause of the problem.
> here's the output:
> (XEN) IRET fault: #PF[0000]
> (XEN) %cr2 ffff820000010040
> (XEN) IRET fault: #PF[0000]
> (XEN) %cr2 ffff820000010040
> (XEN) IRET fault: #PF[0000]
> (XEN) %cr2 ffff820000010040
> (XEN) IRET fault: #PF[0000]
> (XEN) %cr2 ffff820000010040
> (XEN) domain_crash called from extable.c:216
> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
> (XEN) ----[ Xen-4.15-unstable  x86_64  debug=y   Tainted:   C   ]----
> (XEN) CPU:    0
> (XEN) RIP:    0047:[<00007f7ff60007d0>]
> (XEN) RFLAGS: 0000000000000202   EM: 0   CONTEXT: pv guest (d0v0)
> (XEN) rax: ffff82d04038c309   rbx: 0000000000000000   rcx: 000000000000e008
> (XEN) rdx: 0000000000010086   rsi: ffff83007fcb7f78   rdi: 000000000000e010
> (XEN) rbp: 0000000000000000   rsp: 00007f7fff4876c0   r8:  0000000e00000000
> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
> (XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
> (XEN) r15: 0000000000000000   cr0: 0000000080050033   cr4: 0000000000002660
> (XEN) cr3: 0000000079cdb000   cr2: ffffa1000000a040
> (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: ffffffff80cf2dc0
> (XEN) ds: 0023   es: 0023   fs: 0000   gs: 0000   ss: 003f   cs: 0047
> (XEN) Guest stack trace from rsp=00007f7fff4876c0:
> (XEN)    0000000000000001 00007f7fff487bd8 0000000000000000 0000000000000000
> (XEN)    0000000000000003 00000000aee00040 0000000000000004 0000000000000038
> (XEN)    0000000000000005 0000000000000008 0000000000000006 0000000000001000
> (XEN)    0000000000000007 00007f7ff6000000 0000000000000008 0000000000000000
> (XEN)    0000000000000009 00000000aee01cd0 00000000000007d0 0000000000000000
> (XEN)    00000000000007d1 0000000000000000 00000000000007d2 0000000000000000
> (XEN)    00000000000007d3 0000000000000000 000000000000000d 00007f7fff488000
> (XEN)    00000000000007de 00007f7fff4877c0 0000000000000000 0000000000000000
> (XEN)    6e692f6e6962732f 0000000000007469 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
Huh, so it is the LDT, but we're not getting as far as inspecting the
target frame.

I wonder if the LDT is set up correctly.  How about this incremental delta?

~Andrew

diff --git a/xen/arch/x86/extable.c b/xen/arch/x86/extable.c
index 88b05bef38..be59a3e216 100644
--- a/xen/arch/x86/extable.c
+++ b/xen/arch/x86/extable.c
@@ -203,13 +203,16 @@ search_pre_exception_table(struct cpu_user_regs *regs)
         __start___pre_ex_table, __stop___pre_ex_table-1, addr);
     if ( fixup )
     {
+        struct vcpu *curr = current;
         static int count;
 
         printk(XENLOG_ERR "IRET fault: %s[%04x]\n",
                vec_name(regs->entry_vector), regs->error_code);
 
         if ( regs->entry_vector == X86_EXC_PF )
-            printk(XENLOG_ERR "%%cr2 %016lx\n", read_cr2());
+            printk(XENLOG_ERR "%%cr2 %016lx, LDT base %016lx, limit %04x\n",
+                   read_cr2(), curr->arch.pv.ldt_base,
+                   (curr->arch.pv.ldt_ents << 3) | 7);
 
         if ( count++ > 2 )
         {
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 1059f3ce66..3ac07a84c3 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1233,6 +1233,8 @@ static int handle_ldt_mapping_fault(unsigned int offset,
     }
     else
     {
+        printk(XENLOG_ERR "*** pv_map_ldt_shadow_page(%#x) failed\n", offset);
+
         /* In hypervisor mode? Leave it to the #PF handler to fix up. */
         if ( !guest_mode(regs) )
             return 0;

