On 2/27/26 11:19, Steven Rostedt wrote:
> On Thu, 26 Feb 2026 21:03:03 -0800
> Chaitanya Kulkarni <[email protected]> wrote:
>
>> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
>> index 3b7c102a6eb3..488552036583 100644
>> --- a/kernel/trace/blktrace.c
>> +++ b/kernel/trace/blktrace.c
>> @@ -383,7 +383,9 @@ static void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes,
>> cpu = raw_smp_processor_id();
>>
>> if (blk_tracer) {
>> + preempt_disable_notrace();
>> tracing_record_cmdline(current);
>> + preempt_enable_notrace();
>>
>> buffer = blk_tr->array_buffer.buffer;
>> trace_ctx = tracing_gen_ctx_flags(0);
> Do you know when this started? rcu_read_lock() doesn't disable preemption
> in PREEMPT environments, and hasn't for a very long time. I'm surprised it
> took this long to detect this? Perhaps this was a bug from day one?
This started with the latest pull, which I did on Wednesday.
The same test last passed on 2/11/26, when I ran blktests before posting
the following patch at 12:47 Pacific:
[PATCH V4] blktrace: log dropped REQ_OP_ZONE_XXX events ver1
Shinichiro, CC'd here, also reported the same bug.
I think the Fixes tag would be:
Fixes: 7ffbd48d5cab ("tracing: Cache comms only after an event occurred")
since that commit added the __this_cpu_read(trace_cmdline_save) and
__this_cpu_write(trace_cmdline_save) calls:
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index b90a827a4641..88111b08b2c1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -77,6 +77,13 @@ static int dummy_set_flag(u32 old_flags, u32 bit, int set)
return 0;
}
+/*
+ * To prevent the comm cache from being overwritten when no
+ * tracing is active, only save the comm when a trace event
+ * occurred.
+ */
+static DEFINE_PER_CPU(bool, trace_cmdline_save);
+
/*
* Kill all tracing for good (never come back).
* It is initialized to 1 but will turn to zero if the initialization
@@ -1135,6 +1142,11 @@ void tracing_record_cmdline(struct task_struct *tsk)
!tracing_is_on())
return;
+ if (!__this_cpu_read(trace_cmdline_save))
+ return;
+
+ __this_cpu_write(trace_cmdline_save, false);
+
trace_save_cmdline(tsk);
}
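To make the dependency concrete, here is a minimal userspace sketch of the flag logic in the hunk above. It is not kernel code: the per-CPU variable is modeled as a plain array indexed by an explicit cpu argument, and every model_* name, including the helper that arms the flag, is hypothetical. The point it illustrates is that the read and the write have to hit the same CPU's slot, which is only guaranteed if the task cannot migrate between them.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 2	/* hypothetical: two simulated CPUs */

/* Stand-in for DEFINE_PER_CPU(bool, trace_cmdline_save). */
static bool model_cmdline_save[MODEL_NR_CPUS];
static int model_comms_saved;	/* how many times the save path ran */

/* Hypothetical helper: a trace event on @cpu arms that CPU's flag. */
static void model_event_occurred(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_cmdline_save[cpu] = true;
}

/*
 * Mirrors the check added to tracing_record_cmdline(): read this CPU's
 * flag, bail out if it is clear, otherwise clear it and save the comm.
 * Returns true when the comm would have been saved.
 */
static bool model_record_cmdline(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (!model_cmdline_save[cpu])
		return false;
	model_cmdline_save[cpu] = false;
	model_comms_saved++;
	return true;
}
```

In this model a call on CPU 1 does not see an event that armed CPU 0's flag, and a second call on CPU 0 is a no-op until another event occurs.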
Full disclosure: I have no idea why it started showing up this month.
I worked on blktrace extensively and tested my code a lot during
2020-2021, after the above commit, but never saw this, even with my patches.
> Anyway, the tracing_record_cmdline() is to update the COMM cache so that
> the trace has way to show the task->comm based on the saved PID in the
> trace. It sets a flag to record the COMM from the sched_switch event if a
> trace event happened. It's not needed if no trace event occurred. That
> means, instead of adding preempt_disable() here, just move it after the
> ring buffer event is reserved, as that means preemption is disabled until
> the event is committed.
>
> i.e.
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index e6988929ead2..3735cbc1f99f 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -383,8 +383,6 @@ static void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes,
> cpu = raw_smp_processor_id();
>
> if (blk_tracer) {
> - tracing_record_cmdline(current);
> -
> buffer = blk_tr->array_buffer.buffer;
> trace_ctx = tracing_gen_ctx_flags(0);
> switch (bt->version) {
> @@ -419,6 +417,8 @@ static void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes,
> if (!event)
> return;
>
> + tracing_record_cmdline(current);
> +
> switch (bt->version) {
> case 1:
> record_blktrace_event(ring_buffer_event_data(event),
>
> -- Steve
The above does fix the problem and makes the test cases pass:
blktests (master) # ./check blktrace
blktrace/001 (blktrace zone management command tracing) [passed]
runtime 3.650s ... 3.647s
blktrace/002 (blktrace ftrace corruption with sysfs trace) [passed]
runtime 0.411s ... 0.384s
blktests (master) #
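For reference, the ordering change can be modeled in userspace as below. This is only a sketch under stated assumptions: a plain counter stands in for the preempt count, reserving an event is assumed to keep preemption disabled until commit (as Steve describes), and all model_* names are hypothetical rather than kernel APIs.

```c
#include <assert.h>

static int model_preempt_count;		/* > 0 means preemption is disabled */
static int model_unsafe_accesses;	/* per-CPU accesses made while preemptible */

/* Stand-in for reserving a ring buffer event: preemption stays off until commit. */
static void model_reserve_event(void)
{
	model_preempt_count++;
}

/* Stand-in for committing the event: preemption is re-enabled. */
static void model_commit_event(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* Stand-in for tracing_record_cmdline(): counts calls made while preemptible. */
static void model_record_cmdline_checked(void)
{
	if (model_preempt_count == 0)
		model_unsafe_accesses++;
}

/* Old order: record the cmdline first, then reserve and commit the event. */
static int model_old_order(void)
{
	model_unsafe_accesses = 0;
	model_record_cmdline_checked();
	model_reserve_event();
	model_commit_event();
	return model_unsafe_accesses;
}

/* New order: reserve the event, record the cmdline, then commit. */
static int model_new_order(void)
{
	model_unsafe_accesses = 0;
	model_reserve_event();
	model_record_cmdline_checked();
	model_commit_event();
	return model_unsafe_accesses;
}
```

In the model, the old order makes one per-CPU access while preemptible and the new order makes none, without adding an extra preempt_disable()/preempt_enable() pair.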
I'll send a V2.
-ck