On 11/05/16 08:35, Jan Beulich wrote:
>>>> On 11.05.16 at 07:49, <jgr...@suse.com> wrote:
>> On 10/05/16 18:35, Boris Ostrovsky wrote:
>>> On 05/10/2016 11:43 AM, Juergen Gross wrote:
>>>> On 10/05/16 17:35, Jan Beulich wrote:
>>>>>>>> On 10.05.16 at 17:19, <jgr...@suse.com> wrote:
>>>>>> On 10/05/16 15:57, Jan Beulich wrote:
>>>>>>>>>> On 10.05.16 at 15:39, <boris.ostrov...@oracle.com> wrote:
>>>>>>>> I didn't finish unwrapping the stack yesterday. Here it is:
>>>>>>>>
>>>>>>>> setup_arch -> dmi_scan_machine -> dmi_walk_early -> early_ioremap
>>>>>>> Ah, that makes sense. Yet why would early_ioremap() involve an
>>>>>>> M2P lookup? As said, MMIO addresses shouldn't be subject to such
>>>>>>> lookups.
>>>>>> early_ioremap()->
>>>>>> __early_ioremap()->
>>>>>> __early_set_fixmap()->
>>>>>> set_pte()->
>>>>>> xen_set_pte_init()->
>>>>>> mask_rw_pte()->
>>>>>> pte_pfn()->
>>>>>> pte_val()->
>>>>>> xen_pte_val()->
>>>>>> pte_mfn_to_pfn()
>>>>> Well, I understand (also from Boris' first reply) that's how it is,
>>>>> but not why it is so. I.e. the call flow above doesn't answer my
>>>>> question.
>>>> On x86 early_ioremap() and early_memremap() share a common sub-function
>>>> __early_ioremap(). This together with pvops requires a common set_pte()
>>>> implementation leading to the mfn validation in the end.
>>>
>>> Do we make any assumptions about where DMI data lives?
>>
>> I don't think so.
>>
>> So the basic problem is the page fault due to the sparse m2p map before
>> the #PF handler is registered.
>>
>> What do you think about registering a minimal #PF handler in
>> xen_arch_setup() being capable to handle this problem? This should be
>> doable without major problems. I can do a patch.
>
> To me that would feel like working around the issue instead of
> admitting that the removal of _PAGE_IOMAP was a mistake.

Hmm, I don't think so. Having a Xen-specific pte flag seems to be much
more intrusive than having an early boot page fault handler consisting
of just one line that mimics the default handler in just one aspect
(see attached patch - only compile tested).

Adding David as he removed _PAGE_IOMAP in kernel 3.18.

Juergen

commit 272793dcb989fc1ff2caaa9519f8f1ea5434b578
Author: Juergen Gross <jgr...@suse.com>
Date:   Wed May 11 07:53:54 2016 +0200

    xen: register early page fault handler

    In early boot of dom0 accesses to the sparse m2p list of the hypervisor
    can result in unhandled page faults as the #PF handler handling this
    case via exception table isn't yet registered.

    Install a primitive early page fault handler for this case.

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 858b555..a20ea98 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -911,6 +911,7 @@ idtentry stack_segment do_stack_segment has_error_code=1
 idtentry xen_debug do_debug has_error_code=0
 idtentry xen_int3 do_int3 has_error_code=0
 idtentry xen_stack_segment do_stack_segment has_error_code=1
+idtentry xen_page_fault xen_do_page_fault has_error_code=1
 #endif
 
 idtentry general_protection do_general_protection has_error_code=1
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index c3496619..f91cb3f 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -16,6 +16,7 @@ asmlinkage void int3(void);
 asmlinkage void xen_debug(void);
 asmlinkage void xen_int3(void);
 asmlinkage void xen_stack_segment(void);
+asmlinkage void xen_page_fault(void);
 asmlinkage void overflow(void);
 asmlinkage void bounds(void);
 asmlinkage void invalid_op(void);
@@ -54,6 +55,7 @@ asmlinkage void trace_page_fault(void);
 #define trace_alignment_check alignment_check
 #define trace_simd_coprocessor_error simd_coprocessor_error
 #define trace_async_page_fault async_page_fault
+#define trace_xen_page_fault xen_page_fault
 #endif
 
 dotraplinkage void do_divide_error(struct pt_regs *, long);
@@ -74,6 +76,7 @@ asmlinkage struct pt_regs *sync_regs(struct pt_regs *);
 #endif
 dotraplinkage void do_general_protection(struct pt_regs *, long);
 dotraplinkage void do_page_fault(struct pt_regs *, unsigned long);
+dotraplinkage void xen_do_page_fault(struct pt_regs *, unsigned long);
 #ifdef CONFIG_TRACING
 dotraplinkage void trace_do_page_fault(struct pt_regs *, unsigned long);
 #else
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 7ab2951..eaee9d3 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -17,7 +17,10 @@
 #include <asm/e820.h>
 #include <asm/setup.h>
 #include <asm/acpi.h>
+#include <asm/desc.h>
 #include <asm/numa.h>
+#include <asm/traps.h>
+#include <asm/uaccess.h>
 #include <asm/xen/hypervisor.h>
 #include <asm/xen/hypercall.h>
 
@@ -1067,4 +1070,19 @@ void __init xen_arch_setup(void)
 #ifdef CONFIG_NUMA
 	numa_off = 1;
 #endif
+
+	sort_main_extable();
+	set_intr_gate(X86_TRAP_PF, xen_page_fault);
+}
+
+/*
+ * Early page fault handler being capable to handle page faults resulting
+ * from accesses via xen_safe_read_ulong().
+ * This page fault handler will be active in early boot only. It is being
+ * replaced by the default page fault handler later.
+ */
+dotraplinkage void notrace
+xen_do_page_fault(struct pt_regs *regs, unsigned long error_code)
+{
+	fixup_exception(regs, X86_TRAP_PF);
 }
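
For readers less familiar with the exception-table mechanism the patch relies on, here is a rough user-space sketch of the idea. It is not kernel code and not part of the patch. All `model_*` names are hypothetical, and the table layout is simplified to absolute addresses, so it should not be read as the kernel's real format. It only illustrates why the table has to be sorted before the handler is installed, and how a fault at a known instruction address is turned into a jump to recovery code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified table entry: absolute addresses only. */
struct model_extable_entry {
	unsigned long insn;	/* address of the instruction that may fault */
	unsigned long fixup;	/* address to resume at if it does fault */
};

/* Hypothetical stand-in for the saved register state. */
struct model_regs {
	unsigned long ip;	/* saved instruction pointer */
};

static int model_cmp(const void *a, const void *b)
{
	const struct model_extable_entry *x = a, *y = b;

	if (x->insn < y->insn)
		return -1;
	return x->insn > y->insn;
}

/* Plays the role of sort_main_extable(): the lookup needs sorted input. */
static void model_sort_extable(struct model_extable_entry *tbl, size_t n)
{
	qsort(tbl, n, sizeof(*tbl), model_cmp);
}

/* Plays the role of fixup_exception(): returns 1 if the fault was fixed up. */
static int model_fixup_exception(struct model_regs *regs,
				 const struct model_extable_entry *tbl,
				 size_t n)
{
	struct model_extable_entry key = { .insn = regs->ip, .fixup = 0 };
	const struct model_extable_entry *e;

	e = bsearch(&key, tbl, n, sizeof(*tbl), model_cmp);
	if (!e)
		return 0;	/* no entry: not recoverable this way */
	regs->ip = e->fixup;	/* resume at the recovery code */
	return 1;
}

/* Small self-check exercising both the hit and the miss path. */
static int model_demo(void)
{
	struct model_extable_entry tbl[] = {
		{ 0x3000, 0x3100 }, { 0x1000, 0x1100 }, { 0x2000, 0x2100 },
	};
	struct model_regs regs = { .ip = 0x2000 };

	model_sort_extable(tbl, 3);
	assert(model_fixup_exception(&regs, tbl, 3) == 1);
	assert(regs.ip == 0x2100);

	regs.ip = 0x4000;	/* faulting address with no table entry */
	assert(model_fixup_exception(&regs, tbl, 3) == 0);
	assert(regs.ip == 0x4000);
	return 0;
}
```

Calling `model_demo()` from a test harness runs both cases. The binary search is also the reason the patch calls sort_main_extable() before set_intr_gate(): a lookup on an unsorted table could miss an entry that is actually present.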