Re: [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG

2024-11-25 Thread David Woodhouse
On Mon, 2024-11-25 at 10:21 +0100, Ingo Molnar wrote: > > * David Woodhouse wrote: > > > From: David Woodhouse > > > > Signed-off-by: David Woodhouse > > --- > >  arch/x86/kernel/relocate_kernel_64.S | 4 > >  1 file changed, 4 insertions(+) > > > > diff --git a/arch/x86/kernel/relocate_

Re: [EXTERNAL] [RFC PATCH v3 01/20] x86/kexec: Ensure control_code_page is mapped in kexec page tables

2024-11-25 Thread David Woodhouse
On Mon, 2024-11-25 at 09:54 +0000, David Woodhouse wrote: > From: David Woodhouse > > The control_code_page should be explicitly mapped into the identity > mapped page tables for the relocate_kernel environment. This only seems > to have worked by luck before, because it tended to be within the s

[RFC PATCH v3 11/20] x86/kexec: Drop page_list argument from relocate_kernel()

2024-11-25 Thread David Woodhouse
From: David Woodhouse The kernel's virtual mapping of the relocate_kernel page currently needs to be RWX because it is written to before the %cr3 switch. Now that the relocate_kernel page has its own .data section and local variables, it can also have *global* variables. So eliminate the separat

[RFC PATCH v3 03/20] x86/kexec: Clean up and document register use in relocate_kernel_64.S

2024-11-25 Thread David Woodhouse
From: David Woodhouse Add more comments explaining what each register contains, and save the preserve_context flag to a non-clobbered register sooner, to keep things simpler. Signed-off-by: David Woodhouse Acked-by: Kai Huang --- arch/x86/kernel/relocate_kernel_64.S | 18 ++ 1

Re: [RFC PATCH] x86/mm: Disable PTI for kernel_ident_mapping_init()

2024-11-25 Thread Dave Hansen
On 11/25/24 09:05, David Woodhouse wrote: > Not sure I like this very much, but it works, and mirrors what > arch/x86/boot/compressed/ident_map_64.c already does. I don't like it much, either. arch/x86/boot/compressed/ is already on the road to sharing no code with the core kernel and it's full o

Re: [RFC PATCH] x86/mm: Disable PTI for kernel_ident_mapping_init()

2024-11-25 Thread Dave Hansen
On 11/25/24 10:53, David Woodhouse wrote: >> I think we have a lot of software-available space in the page table >> pointer entries. What would folks think if we set a special bit in those >> p4d entries that said: >> >> "I don't need to be propagated to >> the user portion of the page ta

Re: [RFC PATCH] x86/mm: Disable PTI for kernel_ident_mapping_init()

2024-11-25 Thread David Woodhouse
On 25 November 2024 19:13:02 GMT, Dave Hansen wrote: >On 11/25/24 10:53, David Woodhouse wrote: >>> I think we have a lot of software-available space in the page table >>> pointer entries. What would folks think if we set a special bit in those >>> p4d entries that said: >>> >>> "I don't need

[Invitation] Linux MM Alignment Session on Kexec Hand Over (KHO) on Wednesday

2024-11-25 Thread David Rientjes
Hi everybody, We host a biweekly series, the Linux MM Alignment Session, on Wednesdays. We'd like to invite MM developers to attend and will announce the topic for the next instance on the Monday prior to the next meeting. Our next Linux MM Alignment Session is scheduled for Wednesday. The detai

Re: [Invitation] Linux MM Alignment Session on Kexec Hand Over (KHO) on Wednesday

2024-11-25 Thread David Rientjes
Since the timezone changed recently, here is the world schedule for reference:
  PST (UTC-8)             9:00am
  MST (UTC-7)            10:00am
  CST (UTC-6)            11:00am
  EST (UTC-5)            12:00pm
  Rio de Janeiro (UTC-3)  2:00pm
  London (GMT)            5:00pm
  Berlin (UTC+1)          6:00p

[RFC PATCH] x86/mm: Disable PTI for kernel_ident_mapping_init()

2024-11-25 Thread David Woodhouse
From: David Woodhouse With PTI enabled, set_p4d() and set_pgd() will scribble over the end of the 4KiB page allocated by the ->alloc_pgt_page() callback, expecting it to have been an 8KiB allocation with the userspace version immediately after the kernel's version. So build *just* this code with
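
The overrun described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel's code: it assumes the PTI layout in which the userspace copy of a top-level table sits exactly one page above the kernel copy within an 8KiB-aligned pair, so the user entry's address is the kernel entry's address with bit 12 set. The helpers `user_copy_of()` and `inside_page()` are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uintptr_t)1 << PAGE_SHIFT)

/*
 * Address of the userspace copy of a top-level entry, ASSUMING the
 * kernel and user tables form one 8KiB-aligned pair with the kernel
 * half first. Hypothetical helper, not a kernel function.
 */
uintptr_t user_copy_of(uintptr_t kernel_entry)
{
    /* Only meaningful for an entry in the kernel half (bit 12 clear). */
    assert((kernel_entry & PAGE_SIZE) == 0);
    return kernel_entry | PAGE_SIZE;
}

/* Nonzero if addr lies inside the single 4KiB page at page_base. */
int inside_page(uintptr_t page_base, uintptr_t addr)
{
    return addr >= page_base && addr - page_base < PAGE_SIZE;
}
```

If the table was really a lone 4KiB page at, say, 0x10000, the mirrored write for the entry at 0x10008 would land at 0x11008, which is outside that page. That is the scribbling the patch description refers to.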

Re: [RFC PATCH] x86/mm: Disable PTI for kernel_ident_mapping_init()

2024-11-25 Thread David Woodhouse
On Mon, 2024-11-25 at 10:31 -0800, Dave Hansen wrote: > On 11/25/24 09:05, David Woodhouse wrote: > > Not sure I like this very much, but it works, and mirrors what > > arch/x86/boot/compressed/ident_map_64.c already does. > > I don't like it much, either. > > arch/x86/boot/compressed/ is already

Re: [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG

2024-11-25 Thread Ingo Molnar
* David Woodhouse wrote: > > Just curious: did you write this code to debug the series, or was > > there some original hair-tearing regression that motivated you? Is > > there an upstream fix to marvel at and be horrified about in > > equal measure? > > https://lore.kernel.org/all/2ab14f6

Re: [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG

2024-11-25 Thread David Woodhouse
On Mon, 2024-11-25 at 21:34 +0100, Ingo Molnar wrote: >   > > The realisation that we never even explicitly mapped the control code > > page and always just got lucky because it happened to be in the same > > 2MiB or 1GiB superpage as something else that we did map... was just > > a bonus :) >

Re: [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG

2024-11-25 Thread Ingo Molnar
* David Woodhouse wrote: > From: David Woodhouse > > Signed-off-by: David Woodhouse > --- > arch/x86/kernel/relocate_kernel_64.S | 4 > 1 file changed, 4 insertions(+) > > diff --git a/arch/x86/kernel/relocate_kernel_64.S > b/arch/x86/kernel/relocate_kernel_64.S > index 67f6853c7abe.

[RFC PATCH v3 08/20] x86/kexec: Invoke copy of relocate_kernel() instead of the original

2024-11-25 Thread David Woodhouse
From: David Woodhouse This currently calls set_memory_x() from machine_kexec_prepare() just like the 32-bit version does. That's actually a bit earlier than I'd like, as it leaves the page RWX all the time the image is even *loaded*. Subsequent commits will eliminate all the writes to the page b

[RFC PATCH v3 05/20] x86/kexec: Only swap pages for preserve_context mode

2024-11-25 Thread David Woodhouse
From: David Woodhouse There's no need to swap pages (which involves three memcopies for each page) in the plain kexec case. Just do a single copy from source to destination page. Signed-off-by: David Woodhouse --- arch/x86/kernel/relocate_kernel_64.S | 4 1 file changed, 4 insertions(+)
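
The cost difference is easy to see in a small C sketch. This is an illustration only: the real logic is assembly in relocate_kernel_64.S, and the names `swap_page()` and `copy_page_once()` are made up here. Swapping two pages needs a scratch page and three page-sized copies, whereas a plain kexec never needs the old destination contents again and can get by with one.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE 4096

/*
 * preserve_context style: exchange two pages through a scratch page.
 * This costs three page-sized copies.
 */
void swap_page(unsigned char *dst, unsigned char *src, unsigned char *scratch)
{
    assert(dst != src && dst != scratch && src != scratch);
    memcpy(scratch, src, PAGE_SIZE);  /* save the source      */
    memcpy(src, dst, PAGE_SIZE);      /* old dst -> source    */
    memcpy(dst, scratch, PAGE_SIZE);  /* saved source -> dst  */
}

/*
 * Plain kexec: the old contents of dst are never needed again,
 * so a single copy is enough.
 */
void copy_page_once(unsigned char *dst, const unsigned char *src)
{
    assert(dst != src);
    memcpy(dst, src, PAGE_SIZE);
}
```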

[RFC PATCH v3 06/20] x86/kexec: Allocate PGD for x86_64 transition page tables separately

2024-11-25 Thread David Woodhouse
From: David Woodhouse There's no good reason for this to be part of the control_code_page; just allocate it separately on x86_64 like i386 does. Signed-off-by: David Woodhouse --- arch/x86/include/asm/kexec.h | 18 --- arch/x86/kernel/machine_kexec_64.c | 49 -

[RFC PATCH v3 09/20] x86/kexec: Move relocate_kernel to kernel .data section

2024-11-25 Thread David Woodhouse
From: David Woodhouse Now that the copy is executed instead of the original, the relocate_kernel page can live in the kernel's .data section. This will allow subsequent commits to actually add real data to it and clean up the code somewhat as well as making the control page ROX. Signed-off-by: D

[RFC PATCH v3 13/20] x86/kexec: Clean up register usage in relocate_kernel()

2024-11-25 Thread David Woodhouse
From: David Woodhouse The memory encryption flag is passed in %r8 because that's where the calling convention puts it. Instead of moving it to %r12 and then using %r8 for other things, just leave it in %r8 and use other registers instead. Signed-off-by: David Woodhouse --- arch/x86/kernel/relo

[RFC PATCH v3 01/20] x86/kexec: Ensure control_code_page is mapped in kexec page tables

2024-11-25 Thread David Woodhouse
From: David Woodhouse The control_code_page should be explicitly mapped into the identity mapped page tables for the relocate_kernel environment. This only seems to have worked by luck before, because it tended to be within the same 2MiB or 1GiB large page already mapped for another reason. A su
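
Why this could work "by luck" can be illustrated with a short C check. It is a sketch under assumptions, not anything from the patch: `same_large_page()` is a hypothetical helper that tests whether two physical addresses fall in the same naturally aligned 2MiB or 1GiB block. If the control page shares such a block with something that was mapped with a large page, it is covered without ever being mapped explicitly.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M (UINT64_C(1) << 21)
#define SZ_1G (UINT64_C(1) << 30)

/*
 * Nonzero if a and b fall within the same naturally aligned block of
 * the given power-of-two size. Hypothetical helper for illustration.
 */
int same_large_page(uint64_t a, uint64_t b, uint64_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (a & ~(size - 1)) == (b & ~(size - 1));
}
```

Two addresses on either side of a 2MiB boundary, such as 0x1ff000 and 0x200000, are not covered by the same 2MiB mapping but are still covered by the same 1GiB one, so whether the accident holds depends on the page size used for the neighbouring mapping.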

[RFC PATCH v3 15/20] x86/kexec: Add CONFIG_KEXEC_DEBUG option

2024-11-25 Thread David Woodhouse
From: David Woodhouse This does nothing yet. Signed-off-by: David Woodhouse --- arch/x86/Kconfig.debug | 8 1 file changed, 8 insertions(+) diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug index 74777a97e394..9dde32865a9b 100644 --- a/arch/x86/Kconfig.debug +++ b/arch/x86

[RFC PATCH v3 16/20] x86/kexec: Debugging support: load a GDT

2024-11-25 Thread David Woodhouse
From: David Woodhouse There are some failure modes which lead to triple-faults in the relocate_kernel function, which is fairly much undebuggable for normal mortals. Adding a GDT in the relocate_kernel environment is step 1 towards being able to catch faults and do something more useful. Signed

[RFC PATCH v3 14/20] x86/kexec: Mark relocate_kernel page as ROX instead of RWX

2024-11-25 Thread David Woodhouse
From: David Woodhouse All writes to the page now happen before it gets marked as executable (or after it's already switched to the identmap page tables where it's OK to be RWX). Signed-off-by: David Woodhouse --- arch/x86/kernel/machine_kexec_64.c | 3 ++- 1 file changed, 2 insertions(+), 1 de

[RFC PATCH v3 02/20] x86/kexec: Restore GDT on return from preserve_context kexec

2024-11-25 Thread David Woodhouse
From: David Woodhouse The restore_processor_state() function explicitly states that "the asm code that gets us here will have restored a usable GDT". That wasn't true in the case of returning from a preserve_context kexec. Make it so. Without this, the kernel was depending on the called function

[RFC PATCH v3 19/20] x86/kexec: Add 8250 serial port output

2024-11-25 Thread David Woodhouse
From: David Woodhouse If a serial port was configured for early_printk, use it for debug output from the relocate_kernel exception handler too. Signed-off-by: David Woodhouse --- arch/x86/include/asm/kexec.h | 1 + arch/x86/kernel/early_printk.c | 6 + arch/x86/kernel/reloc

[RFC PATCH v3 17/20] x86/kexec: Debugging support: Load an IDT and basic exception entry points

2024-11-25 Thread David Woodhouse
From: David Woodhouse Signed-off-by: David Woodhouse --- arch/x86/include/asm/kexec.h | 5 ++ arch/x86/kernel/machine_kexec_64.c | 23 arch/x86/kernel/relocate_kernel_64.S | 82 3 files changed, 110 insertions(+) diff --git a/arch/x86/include/as

[RFC PATCH v3 12/20] x86/kexec: Eliminate writes through kernel mapping of relocate_kernel page

2024-11-25 Thread David Woodhouse
From: David Woodhouse All writes to the relocate_kernel control page are now done *after* the %cr3 switch via simple %rip-relative addressing, which means the DATA() macro with its pointer arithmetic can also now be removed. Signed-off-by: David Woodhouse --- arch/x86/kernel/relocate_kernel_64

[RFC PATCH v3 20/20] [DO NOT MERGE] x86/kexec: Add int3 in kexec path for testing

2024-11-25 Thread David Woodhouse
From: David Woodhouse Signed-off-by: David Woodhouse --- arch/x86/kernel/relocate_kernel_64.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S index 01a31e4a0664..ff8a813a9f9b 100644 --- a/arch/x86/k

[RFC PATCH v3 10/20] x86/kexec: Add data section to relocate_kernel

2024-11-25 Thread David Woodhouse
From: David Woodhouse Now that the relocate_kernel page is handled sanely by a linker script we can have actual data, and just use %rip-relative addressing to access it. Signed-off-by: David Woodhouse --- arch/x86/kernel/machine_kexec_64.c | 8 +++- arch/x86/kernel/relocate_kernel_64.S | 62

[RFC PATCH v3 04/20] x86/kexec: Use named labels in swap_pages in relocate_kernel_64.S

2024-11-25 Thread David Woodhouse
From: David Woodhouse Make the code a little more readable. Signed-off-by: David Woodhouse Acked-by: Kai Huang --- arch/x86/kernel/relocate_kernel_64.S | 30 ++-- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/arch/x86/kernel/relocate_kernel_64.S b/ar

[RFC PATCH v3 18/20] x86/kexec: Debugging support: Dump registers on exception

2024-11-25 Thread David Woodhouse
From: David Woodhouse The actual serial output function is a no-op for now. Signed-off-by: David Woodhouse --- arch/x86/kernel/relocate_kernel_64.S | 104 --- 1 file changed, 96 insertions(+), 8 deletions(-) diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86

[RFC PATCH v3 07/20] x86/kexec: Copy control page into place in machine_kexec_prepare()

2024-11-25 Thread David Woodhouse
From: David Woodhouse There's no need for this to wait until the actual machine_kexec() invocation; future changes will need to make the control page read-only and executable, so all writes should be completed before machine_kexec_prepare() returns. Signed-off-by: David Woodhouse --- arch/x86/

[RFC PATCH v3 00/20] x86/kexec: Add exception handling for relocate_kernel and further yak-shaving

2024-11-25 Thread David Woodhouse
Debugging kexec failures is painful, as anything going wrong in execution of the critical relocate_kernel() function tends to just lead to a triple fault. Thus leading to *weeks* of my life that I won't get back. Having hacked something up for my own use, I figured I should share it... Add a CONFI

Re: [PATCH v1 03/11] fs/proc/vmcore: disallow vmcore modifications after the vmcore was opened

2024-11-25 Thread Baoquan He
On 11/22/24 at 10:30am, David Hildenbrand wrote: > On 22.11.24 10:16, Baoquan He wrote: > > On 10/25/24 at 05:11pm, David Hildenbrand wrote: > > ..snip... > > > @@ -1482,6 +1470,10 @@ int vmcore_add_device_dump(struct vmcoredd_data > > > *data) > > > return -EINVAL; > > >

Re: [EXTERNAL] [RFC PATCH v3 01/20] x86/kexec: Ensure control_code_page is mapped in kexec page tables

2024-11-25 Thread David Woodhouse
On Mon, 2024-11-25 at 10:29 +0000, David Woodhouse wrote: > On Mon, 2024-11-25 at 09:54 +0000, David Woodhouse wrote: > > From: David Woodhouse > > > > The control_code_page should be explicitly mapped into the identity > > mapped page tables for the relocate_kernel environment. This only seems >