Dear maintainers and linux-trace-kernel,

I stumbled upon a bug in the histogram trigger implementation: a
named histogram trigger with an invalid onmax variable does not get
unregistered properly in the error path, and a subsequent access to
the same trigger file leads to a kernel panic.

The issue reproduces on 6.14.0-rc4 with these commands (works with any
trace event):

$ cd /sys/kernel/tracing/events/rcu/rcu_callback
$ echo 'hist:name=bad:keys=common_pid:onmax(bogus).save(common_pid)' > trigger
bash: echo: write error: Invalid argument
$ echo 'hist:name=bad:keys=common_pid' > trigger

which leads to the panic:

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 UID: 0 PID: 2187 Comm: hist_panic_repr Kdump: loaded Not tainted 6.14.0-rc4 #10
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41 04/01/2014
RIP: 0010:strcmp+0x10/0x30
...
Call Trace:
 <TASK>
 ? __die+0x24/0x70
 ? page_fault_oops+0x75/0x170
 ? exc_page_fault+0x70/0x160
 ? asm_exc_page_fault+0x26/0x30
 ? strcmp+0x10/0x30
 find_named_trigger+0x4a/0x70
 hist_register_trigger+0x3e/0x320
 event_hist_trigger_parse+0x520/0xa80
 trigger_process_regex+0xbc/0x110
 event_trigger_write+0x79/0xe0
 vfs_write+0xf7/0x420
 ? do_syscall_64+0x89/0x160
 ? syscall_exit_to_user_mode_prepare+0x154/0x190
 ksys_write+0x66/0xe0
 do_syscall_64+0x7d/0x160
 ? syscall_exit_to_user_mode_prepare+0x154/0x190
 ? syscall_exit_to_user_mode+0x32/0x1b0
 ? filp_flush+0x72/0x80
 ? filp_close+0x1f/0x30
 ? do_dup2+0xae/0x150
 ? ksys_dup3+0x65/0xf0
 ? syscall_exit_to_user_mode_prepare+0x154/0x190
 ? syscall_exit_to_user_mode+0x32/0x1b0
 ? clear_bhb_loop+0x25/0x80
 ? clear_bhb_loop+0x25/0x80
 ? clear_bhb_loop+0x25/0x80
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Further investigation revealed that hist_unregister_trigger, called in
the out_unreg path of event_hist_trigger_parse, does not find the
trigger and thus neither frees it nor removes it from the
named_triggers list. (By the way, that call accidentally passes glob+1
instead of glob; this is harmless only because the argument is unused.)

A subsequent call to find_named_trigger then finds the freed
hist_trigger_data, tries to compare its name, and crashes the kernel.
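
The suspected sequence can be modelled in userspace. This is only a
sketch and not kernel code: struct model_trigger and all model_*
functions are hypothetical names made up for illustration, and the
sketch assumes that the error path releases the data holding the name
while the entry stays on the named list. Whether that is what really
happens in the kernel is exactly the open question here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a named trigger; NOT the kernel's structures. */
struct model_trigger {
	char *name;                  /* models the name kept in the private data */
	struct model_trigger *next;  /* models linkage on the named list */
};

static struct model_trigger *model_named_list;

/* Allocate a trigger, copy its name, and push it on the named list. */
static struct model_trigger *model_register(const char *name)
{
	struct model_trigger *t = malloc(sizeof(*t));
	size_t len = strlen(name) + 1;

	if (!t)
		return NULL;
	t->name = malloc(len);
	if (!t->name) {
		free(t);
		return NULL;
	}
	memcpy(t->name, name, len);
	t->next = model_named_list;
	model_named_list = t;
	return t;
}

/* Unlink t from the named list; returns 1 if it was on the list. */
static int model_unlink(struct model_trigger *t)
{
	struct model_trigger **pp;

	for (pp = &model_named_list; *pp; pp = &(*pp)->next) {
		if (*pp == t) {
			*pp = t->next;
			return 1;
		}
	}
	return 0;
}

/*
 * Modelled error path (assumption): the name is always released, but the
 * entry is unlinked and freed only when unregister_finds_it is nonzero.
 * With 0, the entry stays on the list with its name gone (and is leaked
 * in this model).
 */
static void model_error_path(struct model_trigger *t, int unregister_finds_it)
{
	assert(t != NULL);
	free(t->name);
	t->name = NULL;  /* NULL marks released data so the model stays defined */
	if (unregister_finds_it && model_unlink(t))
		free(t);
}

/*
 * Lookup by name. Returns 1 if found, 0 if not found, and -1 if the walk
 * reaches an entry whose name is gone, which is the point where the real
 * lookup would hand a bad pointer to strcmp.
 */
static int model_find(const char *name)
{
	struct model_trigger *t;

	for (t = model_named_list; t; t = t->next) {
		if (!t->name)
			return -1;
		if (strcmp(t->name, name) == 0)
			return 1;
	}
	return 0;
}
```

In this model, model_register("bad") followed by model_error_path(t, 0)
makes the next model_find("bad") return -1, which corresponds to the
second echo in the reproducer. With model_error_path(t, 1) the same
lookup returns 0, which is the behaviour one would expect from a
working error path.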

I'm not familiar with the trigger implementation. Do you have any
idea why hist_unregister_trigger fails, or how to fix it?

Thank you.

Tomas

