On 29/03/26 06:48, Ritesh Harjani (IBM) wrote:
Sourabh Jain <[email protected]> writes:
The kexec sequence invokes enter_vmx_ops() and exit_vmx_ops() with the
MMU disabled. In this context, code must not rely on normal virtual
address translations or trigger page faults.
With KASAN enabled, these functions get instrumented and may access
shadow memory using regular address translation. When executed with
the MMU off, this can lead to page faults (bad_page_fault) from which
the kernel cannot recover in the kexec path, resulting in a hang.
Right, so with the MMU off, the kernel can't access KASAN shadow memory.
So let me trace the path down a bit... you skipped an important detail,
i.e. preempt_count() is always inline, and we play a few tricks in the
kexec path to tell enter_vmx_ops() that we are in HARDIRQ context.
default_machine_kexec(image)
  current_thread_info()->preempt_count = HARDIRQ_OFFSET
  kexec_sequence(..., copy_with_mmu_off = 1)
    if (copy_with_mmu_off) bl real_mode
    bl kexec_copy_flush(image)
      memcpy(ranges, image->segment, ...)
      copy_segments()
        copy_page(dest, addr)
          bl enter_vmx_ops()
            if (in_interrupt())  // true: preempt_count == HARDIRQ_OFFSET
              return 0
          beq .Lnonvmx_copy
Yes, since preempt_count for the current thread is set to HARDIRQ_OFFSET,
we return early from copy_page() -> copypage_power7() -> enter_vmx_ops(),
and the call to exit_vmx_ops() is skipped.
Mark enter_vmx_ops() and exit_vmx_ops() with __no_sanitize_address to
avoid KASAN instrumentation and ensure kexec boots fine with KASAN
enabled.
IIUC, preempt_count() is always inline, and since you are disabling KASAN
instrumentation on enter_vmx_ops(), it just works for that reason. But you
missed adding that detail here.
Yeah, it is worth adding that to the commit message. I will add it in v2.
enter_vmx_ops()
  if (in_interrupt())   // preempt_count() & ... | HARDIRQ_OFFSET -> return 0

// preempt_count() is __always_inline:
static __always_inline int preempt_count(void)
{
	return READ_ONCE(current_thread_info()->preempt_count);
}
Cc: Aditya Gupta <[email protected]>
Cc: Daniel Axtens <[email protected]>
Cc: Hari Bathini <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mahesh Salgaonkar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Ritesh Harjani (IBM) <[email protected]>
Cc: Shivang Upadhyay <[email protected]>
Cc: Venkat Rao Bagalkote <[email protected]>
Reported-by: Aboorva Devarajan <[email protected]>
Signed-off-by: Sourabh Jain <[email protected]>
---
arch/powerpc/lib/vmx-helper.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/lib/vmx-helper.c b/arch/powerpc/lib/vmx-helper.c
index 554b248002b4..c01b2d856650 100644
--- a/arch/powerpc/lib/vmx-helper.c
+++ b/arch/powerpc/lib/vmx-helper.c
@@ -52,7 +52,7 @@ int exit_vmx_usercopy(void)
}
EXPORT_SYMBOL(exit_vmx_usercopy);
-int enter_vmx_ops(void)
In that case, should we add a comment here saying:
/*
* Can be called from kexec copy_page() path with MMU off. The kexec
* code sets preempt_count to HARDIRQ_OFFSET so we return early here.
* Since in_interrupt() is always inline, __no_sanitize_address on this
* function is sufficient to avoid KASAN shadow memory accesses in real
* mode.
*/
Thanks for the write-up, I will add it in v2.
+int __no_sanitize_address enter_vmx_ops(void)
{
if (in_interrupt())
return 0;
@@ -69,7 +69,7 @@ int enter_vmx_ops(void)
* passed a pointer to the destination which we return as required by a
* memcpy implementation.
*/
-void *exit_vmx_ops(void *dest)
+void __no_sanitize_address *exit_vmx_ops(void *dest)
I am assuming that since we never enter VMX in the kexec path, kexec must
not be calling exit_vmx_ops() anyway? So do we need __no_sanitize_address
here?
Agreed, in copypage_power7() we jump to the .Lnonvmx_copy label and do
not call exit_vmx_ops(). I will remove __no_sanitize_address from
exit_vmx_ops().
Thanks for the detailed review Ritesh.
- Sourabh Jain