From: Kees Cook
To: Nathan Chancellor
CC: John Rowley ; johan...@sipsolutions.net;
linux-wirel...@vger.kernel.org; sta...@vger.kernel.org; gustavo...@kernel.org;
linux-hardening@vger.kernel.org
Date: 30 Dec 2024 18:35:34
Subject: Re: UBSAN array-index-o
On Tue, Nov 19, 2024 at 07:47:18AM +0530, R Sundar wrote:
> Rearrange misplaced functions in sorted order.
>
> Suggested-by: Andy Shevchenko
> Signed-off-by: R Sundar
Reviewed-by: Larysa Zaremba
This is a much-needed patch.
R Sundar,
Can I pick it into my series in case it will not be accept
On Mon, Jan 13, 2025 at 05:46:44PM +0100, Thomas Weißschuh wrote:
> Hi everybody,
>
> as you know, leaking raw kernel pointers to the user is problematic as
> they can be used to break KASLR.
> Therefore back in 2011 the %pK format specifier was added [0], printing
> certain pointers zeroed out or
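As a quick refresher on the specifiers being discussed (not from the quoted mail; 'obj' is just a stand-in for some kernel pointer):

#include <linux/printk.h>

static void show_pointer_specifiers(void *obj)
{
	pr_info("obj at %p\n", obj);	/* hashed by default, does not leak the raw address */
	pr_info("obj at %pK\n", obj);	/* honours kptr_restrict: zeroed/hashed for unprivileged readers */
	pr_info("obj at %px\n", obj);	/* raw address: defeats KASLR, avoid in user-visible output */
}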
On 14/01/25 18:03, Larysa Zaremba wrote:
On Tue, Nov 19, 2024 at 07:47:18AM +0530, R Sundar wrote:
Rearrange misplaced functions in sorted order.
Suggested-by: Andy Shevchenko
Signed-off-by: R Sundar
Reviewed-by: Larysa Zaremba
This is a much-needed patch.
R Sundar,
Can I pick it into my
Context
=======
We've observed within Red Hat that isolated, NOHZ_FULL CPUs running a
pure-userspace application get regularly interrupted by IPIs sent from
housekeeping CPUs. Those IPIs are caused by activity on the housekeeping CPUs
leading to various on_each_cpu() calls, e.g.:
64359.05220959
call_dest_name() does not get passed the file pointer of validate_call(),
which means its invocation of insn_reloc() will always return NULL. Make it
take a file pointer.
While at it, make sure call_dest_name() uses arch_dest_reloc_offset(),
otherwise it gets the pv_ops[] offset wrong.
Fabricatin
A later commit will reduce the size of the RCU watching counter to free up
some bits for another purpose. Paul suggested adding a config option to
test the extreme case where the counter is reduced to its minimum usable
width for rcutorture to poke at, so do that.
Make it only configurable under R
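A minimal sketch of the idea with made-up EX_* names (only CONFIG_RCU_DYNTICKS_TORTURE comes from the surrounding snippets): shrinking the counter just means reserving fewer bits of the context-tracking state word for it, and the torture option pins it at the minimum usable width so wraparound gets exercised.

#ifdef CONFIG_RCU_DYNTICKS_TORTURE
# define EX_RCU_WATCHING_BITS	2	/* minimum usable width, torture-testing only */
#else
# define EX_RCU_WATCHING_BITS	30	/* plenty of headroom in normal builds */
#endif

#define EX_RCU_WATCHING_MASK	((1U << EX_RCU_WATCHING_BITS) - 1)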
I had to look into objtool itself to understand what this warning was
about; make it more explicit.
Signed-off-by: Valentin Schneider
Acked-by: Josh Poimboeuf
---
tools/objtool/check.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/objtool/check.c b/tools/objtool/chec
From: Josh Poimboeuf
Deferring a code patching IPI is unsafe if the patched code is in a
noinstr region. In that case the text poke code must trigger an
immediate IPI to all CPUs, which can rudely interrupt an isolated NO_HZ
CPU running in userspace.
Some noinstr static branches may really need
We now have an RCU_EXPERT config for testing a small-sized RCU dynticks
counter: CONFIG_RCU_DYNTICKS_TORTURE.
Modify scenario TREE04 to use this config in order to test a
ridiculously small counter (2 bits).
Link:
http://lore.kernel.org/r/4c2cb573-168f-4806-b1d9-164e8276e66a@paulmck-l
From: Josh Poimboeuf
Deferring a code patching IPI is unsafe if the patched code is in a
noinstr region. In that case the text poke code must trigger an
immediate IPI to all CPUs, which can rudely interrupt an isolated NO_HZ
CPU running in userspace.
If a noinstr static call only needs to be pa
Later commits will cause objtool to warn about static calls being used in
.noinstr sections in order to safely defer instruction patching IPIs
targeted at NOHZ_FULL CPUs.
pv_sched_clock is updated in:
o __init vmware_paravirt_ops_setup()
o __init xen_init_time_common()
o kvm_sched_clock_init() <-
Later commits will cause objtool to warn about static calls being used in
.noinstr sections in order to safely defer instruction patching IPIs
targeted at NOHZ_FULL CPUs.
perf_lopwr_cb is used in .noinstr code, but is only ever updated in __init
amd_brs_lopwr_init(), and can thus be marked as __ro
The static call is only ever updated in
__init xen_time_setup_guest()
so mark it appropriately as __ro_after_init.
Signed-off-by: Valentin Schneider
---
arch/arm/kernel/paravirt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/kernel/paravirt.c b/arch/arm/kernel
Later commits will cause objtool to warn about static keys being used in
.noinstr sections in order to safely defer instruction patching IPIs
targeted at NOHZ_FULL CPUs.
__sched_clock_stable is used in .noinstr code, and can be modified at
runtime (e.g. time_cpufreq_notifier()). Suppressing the te
CT_STATE_KERNEL being zero means it can be (and is) omitted in a handful of
places. A later patch will change CT_STATE_KERNEL into a non-zero value,
prepare for that by using it where it should be:
o In the initial CT state
o At kernel entry / exit
No change in functionality intended.
Signed-off-by:
From: Josh Poimboeuf
Warn about static branches/calls in noinstr regions, unless the
corresponding key is RO-after-init or has been manually whitelisted with
DEFINE_STATIC_KEY_*_NOINSTR().
Signed-off-by: Josh Poimboeuf
[Added NULL check for insn_call_dest() return value]
Signed-off-by: Valenti
Later commits will cause objtool to warn about static calls being used in
.noinstr sections in order to safely defer instruction patching IPIs
targeted at NOHZ_FULL CPUs.
x86_idle is updated in:
o xen_set_default_idle() <- __init xen_arch_setup()
o __init select_idle_routine()
IOW purely init con
The static call is only ever updated in
__init pv_time_init()
__init xen_init_time_common()
__init vmware_paravirt_ops_setup()
__init xen_time_setup_guest()
so mark it appropriately as __ro_after_init.
Signed-off-by: Valentin Schneider
---
arch/x86/kernel/paravirt.c | 2 +-
1 file chang
The static call is only ever updated in:
__init pv_time_init()
__init xen_time_setup_guest()
so mark it appropriately as __ro_after_init.
Signed-off-by: Valentin Schneider
---
arch/riscv/kernel/paravirt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/kernel
The static call is only ever updated in
__init pv_time_init()
__init xen_time_setup_guest()
so mark it appropriately as __ro_after_init.
Signed-off-by: Valentin Schneider
---
arch/loongarch/kernel/paravirt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/loongarch
The static call is only ever updated in
__init pv_time_init()
__init xen_time_setup_guest()
so mark it appropriately as __ro_after_init.
Signed-off-by: Valentin Schneider
---
arch/arm64/kernel/paravirt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/
sched_clock_running is only ever enabled in the __init functions
sched_clock_init() and sched_clock_init_late(), and is never disabled. Mark
it __ro_after_init.
Signed-off-by: Valentin Schneider
---
kernel/sched/clock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/s
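For readers skimming the __ro_after_init patches above and below, a generic, self-contained sketch of the pattern (example names made up; the individual patches may use different annotations for their specific keys and calls). A key that is only written from __init code can use the _RO static key variant, which lands it in .data..ro_after_init so any post-boot write faults:

#include <linux/init.h>
#include <linux/jump_label.h>

static DEFINE_STATIC_KEY_FALSE_RO(example_feature_key);

static int __init example_feature_setup(void)
{
	/* Last legal write: .data..ro_after_init is still writable during init. */
	static_branch_enable(&example_feature_key);
	return 0;
}
early_initcall(example_feature_setup);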
Later commits will cause objtool to warn about static keys being used in
.noinstr sections in order to safely defer instruction patching IPIs
targeted at NOHZ_FULL CPUs.
mds_idle_clear is used in .noinstr code, and can be modified at
runtime (SMT hotplug). Suppressing the text_poke_sync() IPI has
Later commits will cause objtool to warn about static keys being used in
.noinstr sections in order to safely defer instruction patching IPIs
targeted at NOHZ_FULL CPUs.
stack_erasing_bypass is used in .noinstr code, and can be modified at runtime
(proc/sys/kernel/stack_erasing write). However it
Later commits will cause objtool to warn about static keys being used in
.noinstr sections in order to safely defer instruction patching IPIs
targeted at NOHZ_FULL CPUs.
These keys are used in .noinstr code, and can be modified at runtime
(/proc/kernel/vmx* write). However it is not expected that
ct_nmi_{enter, exit}() only touches the RCU watching counter and doesn't
modify the actual CT state part of context_tracking.state. This means that
upon receiving an IRQ when idle, the CT_STATE_IDLE->CT_STATE_KERNEL
transition only happens in ct_idle_exit().
One can note that ct_nmi_enter() can only
smp_call_function() & friends have the unfortunate habit of sending IPIs to
isolated, NOHZ_FULL, in-userspace CPUs, as they blindly target all online
CPUs.
Some callsites can be bent into doing the right thing, such as done by commit:
cc9e303c91f5 ("x86/cpu: Disable frequency requests via aperfmperf
A later patch will need to easily tell whether a genuine ct->state read
returned CT_STATE_KERNEL, which requires that value being non-zero
and exclusive with the other CT_STATE_* values.
This increases the size of the CT_STATE region of the ct->state variable by
two bits.
Signed-off-b
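Illustrative sketch only, with EX_-prefixed names so as not to claim these are the series' definitions: giving each state its own bit makes CT_STATE_KERNEL non-zero and mutually exclusive with the others, and grows the state field from 2 bits to 4, i.e. the two extra bits mentioned above.

enum ex_ct_state {
	EX_CT_STATE_KERNEL	= 1,
	EX_CT_STATE_IDLE	= 2,
	EX_CT_STATE_USER	= 4,
	EX_CT_STATE_GUEST	= 8,
};

#define EX_CT_STATE_MASK	(EX_CT_STATE_KERNEL | EX_CT_STATE_IDLE | \
				 EX_CT_STATE_USER   | EX_CT_STATE_GUEST)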
text_poke_bp_batch() sends IPIs to all online CPUs to synchronize
them vs the newly patched instruction. CPUs that are executing in userspace
do not need this synchronization to happen immediately, and this is
actually harmful interference for NOHZ_FULL CPUs.
As the synchronization IPIs are sent u
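A very rough sketch of the deferral idea with hypothetical ex_* names (this is not the series' actual API): rather than IPI'ing a CPU known to be running in userspace, record the required work in a per-CPU mask and flush it on that CPU's next kernel entry.

#include <linux/bitops.h>
#include <linux/percpu.h>
#include <asm/sync_core.h>

enum { EX_WORK_SYNC = 0 };	/* bit number: CPU must sync_core() before running patched text */

static DEFINE_PER_CPU(unsigned long, ex_deferred_work);

static void ex_defer_sync(int cpu)
{
	/* Remote side: no IPI, just mark the work as pending for @cpu. */
	set_bit(EX_WORK_SYNC, per_cpu_ptr(&ex_deferred_work, cpu));
}

static void ex_flush_deferred_work(void)
{
	/* Local side: runs on kernel entry from userspace/idle. */
	if (test_and_clear_bit(EX_WORK_SYNC, this_cpu_ptr(&ex_deferred_work)))
		sync_core();
}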
From: Peter Zijlstra
Later patches will require issuing a __flush_tlb_all() from noinstr code.
This requires making both __flush_tlb_local() and __flush_tlb_global()
noinstr-compliant.
For __flush_tlb_global(), both native_flush_tlb_global() and xen_flush_tlb()
need to be made noinstr.
Forgo us
Later patches will require issuing a __flush_tlb_all() from noinstr code.
Both __flush_tlb_local() and __flush_tlb_global() are now
noinstr-compliant, so __flush_tlb_all() can be made noinstr itself.
Signed-off-by: Valentin Schneider
---
arch/x86/include/asm/tlbflush.h | 2 +-
arch/x86/mm/tlb.c
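A hedged sketch of the end state described above (simplified; the ex_ name is made up and the in-tree __flush_tlb_all() differs in detail): once both helpers are noinstr-compliant, the dispatcher itself can carry the noinstr attribute.

#include <asm/cpufeature.h>
#include <asm/tlbflush.h>

static noinstr void ex_flush_tlb_all(void)
{
	if (cpu_feature_enabled(X86_FEATURE_PGE))
		__flush_tlb_global();	/* CR4.PGE toggle, flushes global entries too */
	else
		__flush_tlb_local();	/* CR3 reload, non-global entries only */
}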
vunmap()'s issued from housekeeping CPUs are a relatively common source of
interference for isolated NOHZ_FULL CPUs, as they are hit by the
flush_tlb_kernel_range() IPIs.
Given that CPUs executing in userspace do not access data in the vmalloc
range, these IPIs could be deferred until their next k
Later patches will require issuing a __flush_tlb_all() from noinstr code.
This requires making both __flush_tlb_local() and __flush_tlb_global()
noinstr-compliant.
For __flush_tlb_local(), xen_flush_tlb() has already been made noinstr, so
it's just native_flush_tlb_local(), and simply __always_in
With NO_HZ_IDLE, we get CONTEXT_TRACKING_IDLE, so we get these
transitions:
  ct_idle_enter()
    ct_kernel_exit()
      ct_state_inc_clear_work()

  ct_idle_exit()
    ct_kernel_enter()
      ct_work_flush()
With just CONTEXT_TRACKING_IDLE, ct_state_inc_clear_work() is just
ct_state_inc() and ct
On Tue, Jan 14, 2025 at 06:51:23PM +0100, Valentin Schneider wrote:
> The static call is only ever updated in:
>
> __init pv_time_init()
> __init xen_time_setup_guest()
>
> so mark it appropriately as __ro_after_init.
>
> Signed-off-by: Valentin Schneider
> ---
> arch/riscv/kernel/paravirt
On Tue, Jan 14, 2025 at 6:51 PM Valentin Schneider wrote:
> vunmap()'s issued from housekeeping CPUs are a relatively common source of
> interference for isolated NOHZ_FULL CPUs, as they are hit by the
> flush_tlb_kernel_range() IPIs.
>
> Given that CPUs executing in userspace do not access data i
On Tue, Jan 14, 2025, Valentin Schneider wrote:
> text_poke_bp_batch() sends IPIs to all online CPUs to synchronize
> them vs the newly patched instruction. CPUs that are executing in userspace
> do not need this synchronization to happen immediately, and this is
> actually harmful interference for
Please use "KVM: VMX:" for the scope.
On Tue, Jan 14, 2025, Valentin Schneider wrote:
> Later commits will cause objtool to warn about static keys being used in
> .noinstr sections in order to safely defer instruction patching IPIs
> targeted at NOHZ_FULL CPUs.
>
> These keys are used in .noinstr
On 1/14/25 09:51, Valentin Schneider wrote:
> +	cr4 = this_cpu_read(cpu_tlbstate.cr4);
> +	asm volatile("mov %0,%%cr4": : "r" (cr4 ^ X86_CR4_PGE) : "memory");
> +	asm volatile("mov %0,%%cr4": : "r" (cr4) : "memory");
> +	/*
> +	 * In lieu of not having the pinning crap, hard fai
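For context, a commented restatement of the hunk quoted above (simplified, without the CR4 pinning checks the review is pointing at; the ex_ name is made up): flipping PGE off in CR4 flushes every TLB entry, including global ones, and the second write restores the original value.

static __always_inline void ex_toggle_pge_flush(void)
{
	unsigned long cr4 = this_cpu_read(cpu_tlbstate.cr4);

	/* Clearing PGE flushes all TLB entries, global ones included. */
	asm volatile("mov %0, %%cr4" : : "r" (cr4 ^ X86_CR4_PGE) : "memory");
	/* Restore CR4, re-enabling global pages. */
	asm volatile("mov %0, %%cr4" : : "r" (cr4) : "memory");
}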
On Tue, Jan 14, 2025, Sean Christopherson wrote:
> On Tue, Jan 14, 2025, Valentin Schneider wrote:
> > +/**
> > + * is_kernel_noinstr_text - checks if the pointer address is located in the
> > + *                          .noinstr section
> > + *
> > + * @addr: address to check
> > + *
> > + * Returns: t
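A hedged sketch of what such a helper can look like, assuming the generic linker script symbols that bracket .noinstr.text (the series' actual implementation is not reproduced here):

extern char __noinstr_text_start[], __noinstr_text_end[];

static inline bool ex_is_kernel_noinstr_text(unsigned long addr)
{
	return addr >= (unsigned long)__noinstr_text_start &&
	       addr <  (unsigned long)__noinstr_text_end;
}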
Add the __counted_by compiler attribute to the flexible array member
attrs to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and
CONFIG_FORTIFY_SOURCE.
Increment num before adding a new param_attribute to the attrs array and
adjust the array index accordingly. Increment num immediately aft
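A generic illustration of the annotation and the ordering point (not the actual struct from kernel/params.c; names are made up, and the helper assumes storage for the new slot has already been allocated):

struct ex_attr_set {
	unsigned int num;
	struct attribute *attrs[] __counted_by(num);
};

static void ex_add_attr(struct ex_attr_set *set, struct attribute *attr)
{
	set->num++;				/* grow the checked bounds first... */
	set->attrs[set->num - 1] = attr;	/* ...so this store is within bounds */
}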
On Tue, Jan 14, 2025, Valentin Schneider wrote:
> Later patches will require issuing a __flush_tlb_all() from noinstr code.
> This requires making both __flush_tlb_local() and __flush_tlb_global()
> noinstr-compliant.
>
> For __flush_tlb_local(), xen_flush_tlb() has already been made noinstr, so
>
On Tue, Jan 14, 2025, Valentin Schneider wrote:
> static __always_inline void arch_context_tracking_work(enum ct_work work)
> {
> 	switch (work) {
> -	case CT_WORK_n:
> -		// Do work...
> +	case CT_WORK_SYNC:
> +		sync_core();
Not your bug, but serialize() need