On 07/04/25 2:12 pm, Viktor Malik wrote:
On 3/31/25 15:19, Shung-Hsi Yu wrote:
Hi all,
On ppc64le (v6.14, kernel config attached), I've observed that fentry
BPF programs stop being invoked after the target kernel function is live
patched. This occurs regardless of whether the BPF program was attached
before or after the live patch. I believe fentry/fprobe support on
ppc64le was added by [1].
Steps to reproduce on ppc64le:
- Use bpftrace (v0.10.0+) to attach a BPF program to cmdline_proc_show
with fentry (kfunc is the older name bpftrace used for fentry; it is
used here for maximum compatibility)
bpftrace -e 'kfunc:cmdline_proc_show { printf("%lld: cmdline_proc_show() called by %s\n", nsecs(), comm) }'
- Run `cat /proc/cmdline` and observe bpftrace output
- Load samples/livepatch/livepatch-sample.ko
- Run `cat /proc/cmdline` again. Observe "this has been live patched" in
output, but no new bpftrace output.
Note: once the live patching module is disabled through the sysfs
interface, the BPF program invocation is restored.
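For convenience, the steps above can be written down as a small dry-run helper. This is only a sketch: it prints the commands in order rather than running them. The use of insmod and the sysfs path /sys/kernel/livepatch/livepatch_sample/enabled are assumptions based on how the sample module is normally loaded and disabled; the report itself only says "Load" and "the sysfs interface". The function name repro_steps is hypothetical.

```python
import shlex

# bpftrace program from the report; the raw string keeps "\n" literal for bpftrace.
BPFTRACE_PROG = (
    r'kfunc:cmdline_proc_show { printf("%lld: cmdline_proc_show() called '
    r'by %s\n", nsecs(), comm) }'
)


def repro_steps(module_path="samples/livepatch/livepatch-sample.ko",
                patch_name="livepatch_sample"):
    """Return (description, argv) pairs in the order given in the report.

    Assumptions: the module is loaded with insmod and disabled by writing 0
    to /sys/kernel/livepatch/<patch_name>/enabled.
    """
    sysfs_enabled = f"/sys/kernel/livepatch/{patch_name}/enabled"
    return [
        ("attach fentry program (keep it running in another terminal)",
         ["bpftrace", "-e", BPFTRACE_PROG]),
        ("call the function before patching", ["cat", "/proc/cmdline"]),
        ("load the live patch", ["insmod", module_path]),
        ("call the function after patching", ["cat", "/proc/cmdline"]),
        ("disable the live patch", ["sh", "-c", f"echo 0 > {sysfs_enabled}"]),
        ("call the function once more", ["cat", "/proc/cmdline"]),
    ]


if __name__ == "__main__":
    # Dry run only: print each command so it can be pasted into a root shell.
    for description, argv in repro_steps():
        print(f"# {description}")
        print(shlex.join(argv))
```

On an affected ppc64le kernel, the second `cat` would be expected to produce no bpftrace output and the third one would produce output again.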
Is this the expected interaction between fentry BPF and live patching?
On x86_64 it does _not_ happen, so I'd guess the behavior on ppc64le is
unintended. Any insights appreciated.
I'm not sure if this is related but I found out that when a kernel is
compiled with KASAN=y (full config attached), the above steps without
the bpftrace part lead to a kernel panic upon running the second `cat
/proc/cmdline` command (the livepatched one).
Here's the relevant part of the kdump:
[ 457.405298] BUG: Unable to handle kernel data access on write at 0xc0000000000f9078
[ 457.405320] Faulting instruction address: 0xc0000000018ff958
[ 457.405328] Oops: Kernel access of bad area, sig: 11 [#1]
[ 457.405336] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=8192 NUMA pSeries
[ 457.405347] Modules linked in: livepatch_sample(K) bonding tls rfkill vmx_crypto ibmveth pseries_rng sg fuse loop nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vsock xfs sd_mod ibmvscsi scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
[ 457.405410] CPU: 6 UID: 0 PID: 5141 Comm: cat Kdump: loaded Tainted: G K 6.14.0+ #9 VOLUNTARY
[ 457.405424] Tainted: [K]=LIVEPATCH
[ 457.405430] Hardware name: IBM,9009-22A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW910.00 (VL910_062) hv:phyp pSeries
[ 457.405440] NIP: c0000000018ff958 LR: c0000000018ff930 CTR: c0000000009c0790
[ 457.405449] REGS: c00000005f2e7790 TRAP: 0300 Tainted: G K (6.14.0+)
[ 457.405459] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 2822880b XER: 20040000
[ 457.405484] CFAR: c0000000008addc0 DAR: c0000000000f9078 DSISR: 0a000000 IRQMASK: 1
GPR00: c0000000018f2584 c00000005f2e7a30 c00000000280a900 c000000017ffa488
GPR04: 0000000000000008 0000000000000000 c0000000018f24fc 000000000000000d
GPR08: fffffffffffe0000 000000000000000d 0000000000000000 0000000000008000
GPR12: c0000000009c0790 c000000017ffa480 c00000005f2e7c78 c0000000000f9070
GPR16: c00000005f2e7c90 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 c00000005f3efa80 c00000005f2e7c60 c00000005f2e7c88
GPR24: c00000005f2e7c60 0000000000000001 c0000000000f9078 0000000000000000
GPR28: 00007fff97960000 c000000017ffa480 0000000000000000 c0000000000f9078
[ 457.405605] NIP [c0000000018ff958] _raw_spin_lock_irqsave+0x68/0x110
[ 457.405619] LR [c0000000018ff930] _raw_spin_lock_irqsave+0x40/0x110
[ 457.405630] Call Trace:
[ 457.405635] [c00000005f2e7a30] [c000000000941804] check_heap_object+0x34/0x390 (unreliable)
[ 457.405651] [c00000005f2e7a70] [c0000000018f2584] __mutex_unlock_slowpath.isra.0+0xe4/0x230
[ 457.405665] [c00000005f2e7af0] [c0000000009c2f50] seq_read_iter+0x430/0xa90
[ 457.405679] [c00000005f2e7c00] [c000000000aade04] proc_reg_read_iter+0xa4/0x200
[ 457.405692] [c00000005f2e7c40] [c00000000095345c] vfs_read+0x41c/0x510
[ 457.405705] [c00000005f2e7d30] [c0000000009545d4] ksys_read+0xa4/0x190
[ 457.405716] [c00000005f2e7d90] [c00000000003a3f0] system_call_exception+0x1d0/0x440
[ 457.405729] [c00000005f2e7e50] [c00000000000cedc] system_call_vectored_common+0x15c/0x2ec
[ 457.405744] --- interrupt: 3000 at 0x7fff97e75044
[ 457.405755] NIP: 00007fff97e75044 LR: 00007fff97e75044 CTR: 0000000000000000
[ 457.405764] REGS: c00000005f2e7e80 TRAP: 3000 Tainted: G K (6.14.0+)
[ 457.405773] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 48222804 XER: 00000000
[ 457.405805] IRQMASK: 0
GPR00: 0000000000000003 00007fffc1908930 00007fff97f87100 0000000000000003
GPR04: 00007fff97960000 0000000000040000 0000000000000000 00007fff97f80248
GPR08: 0000000000000002 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000000000 00007fff9805a5a0 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000040000 00007fffc19091c8 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 00007fff9804f470
GPR24: 0000000000000000 0000000000040000 00007fffc190f1c5 000000007ff00000
GPR28: 0000000000000003 00007fff97960000 0000000000040000 0000000000000003
[ 457.405916] NIP [00007fff97e75044] 0x7fff97e75044
[ 457.405924] LR [00007fff97e75044] 0x7fff97e75044
[ 457.405932] --- interrupt: 3000
[ 457.405938] Code: 386d0008 4afae43d 60000000 a13d0008 3d00fffe 5529083c 61290001
7d40f829 7d474079 40c20018 7d474038 7ce74b78 <7ce0f92d> 40c2ffe8 7c2004ac
794a03e1
[ 457.405981] ---[ end trace 0000000000000000 ]---
[ 457.419259] pstore: backend (nvram) writing error (-1)
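As a reading aid for the dump above, the instruction word marked with <...> in the Code: line can be decoded by hand. The sketch below assumes the standard Power ISA X-form field layout and knows only two mnemonics (lwarx and stwcx.); the helper names are hypothetical. It checks that the store address computed from GPR31 matches the reported DAR. It does not explain why that address was not writable.

```python
import re

# Tiny subset of primary-opcode-31 X-form instructions, keyed by extended opcode.
X_FORM_MNEMONICS = {20: "lwarx", 150: "stwcx."}


def faulting_word(code_line):
    """Extract the instruction word the kernel marked with <...>."""
    match = re.search(r"<([0-9a-f]{8})>", code_line)
    if match is None:
        raise ValueError("no <...> marked instruction in Code: line")
    return int(match.group(1), 16)


def decode_x_form(word):
    """Split a 32-bit word into X-form fields (assumed layout)."""
    return {
        "opcode": word >> 26,          # primary opcode
        "rs": (word >> 21) & 0x1F,     # source (or target) register
        "ra": (word >> 16) & 0x1F,     # base register, 0 means literal zero
        "rb": (word >> 11) & 0x1F,     # index register
        "xo": (word >> 1) & 0x3FF,     # extended opcode
        "low_bit": word & 1,           # Rc for stwcx., EH for lwarx
    }


def effective_address(fields, gprs):
    """Compute (RA|0) + RB from a dict of register number -> value."""
    base = 0 if fields["ra"] == 0 else gprs[fields["ra"]]
    return (base + gprs[fields["rb"]]) & 0xFFFFFFFFFFFFFFFF


if __name__ == "__main__":
    code = ("Code: 386d0008 4afae43d 60000000 a13d0008 3d00fffe 5529083c "
            "61290001 7d40f829 7d474079 40c20018 7d474038 7ce74b78 "
            "<7ce0f92d> 40c2ffe8 7c2004ac 794a03e1")
    gprs = {31: 0xC0000000000F9078}    # GPR31 from the first register dump
    dar = 0xC0000000000F9078           # DAR from the oops header

    fields = decode_x_form(faulting_word(code))
    name = X_FORM_MNEMONICS.get(fields["xo"], "unknown")
    print(f'{name} r{fields["rs"]},{fields["ra"]},r{fields["rb"]}')
    print("address matches DAR:", effective_address(fields, gprs) == dar)
```

Under these assumptions the marked word decodes to a store-conditional through r31, which would be consistent with the fault being reported inside _raw_spin_lock_irqsave on a write to the DAR address.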
Interestingly, the panic doesn't occur when the bpftrace process is
running. Then, running `cat /proc/cmdline` works (even prints the
expected livepatched message) but doesn't appear in bpftrace output, as
Shung-Hsi observed.
On a kernel with KASAN=n, no panic happens.
This panic doesn't seem to be related to BPF (as it happens when no BPF
programs are involved) but it involves livepatch and occurs for the same
sequence of commands, so the two cases may be related. In this case, I
suspect that the issue is caused by an incorrect interaction of
livepatch and the ftrace changes introduced for BPF trampolines [1].
FWIW, there is a patch, cfec8463d9a1 ("powerpc/ftrace: Fix ftrace bug
with KASAN=y"), which fixes a bug in [1] that appears on KASAN=y kernels,
but I'm not sure whether it's related to this issue.
Thanks for reporting this, Viktor.
There was a bug in how a clobbered register was restored in the livepatch
path, which led to this failure. I have posted the fix upstream [1].
FWIW, the problem Shung-Hsi observed still exists. I will try to get that
working.
- Hari
[1] https://lore.kernel.org/linuxppc-dev/20250416191227.201146-1-hbath...@linux.ibm.com/
Viktor
[1] https://lore.kernel.org/all/20241030070850.1361304-1-hbath...@linux.ibm.com/
Thanks,
Shung-Hsi Yu
1: https://lore.kernel.org/all/20241030070850.1361304-2-hbath...@linux.ibm.com/