Re: [PATCH] blktrace: fix __this_cpu_read/write in preemptible context

Chaitanya Kulkarni Fri, 27 Feb 2026 15:08:55 -0800

On 2/27/26 11:19, Steven Rostedt wrote:
> On Thu, 26 Feb 2026 21:03:03 -0800
> Chaitanya Kulkarni <[email protected]> wrote:
>
>> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
>> index 3b7c102a6eb3..488552036583 100644
>> --- a/kernel/trace/blktrace.c
>> +++ b/kernel/trace/blktrace.c
>> @@ -383,7 +383,9 @@ static void __blk_add_trace(struct blk_trace *bt, 
>> sector_t sector, int bytes,
>>      cpu = raw_smp_processor_id();
>>   
>>      if (blk_tracer) {
>> +            preempt_disable_notrace();
>>              tracing_record_cmdline(current);
>> +            preempt_enable_notrace();
>>   
>>              buffer = blk_tr->array_buffer.buffer;
>>              trace_ctx = tracing_gen_ctx_flags(0);
> Do you know when this started? rcu_read_lock() doesn't disable preemption
> in PREEMPT environments, and hasn't for a very long time. I'm surprised it
> took this long to detect this? Perhaps this was a bug from day one?


This started with latest pull which I did on Wed.

Last time same test passed on 2/11/26 since I ran blktests and posted
following patch on 2/11/26 12:47 pacific :-

[PATCH V4] blktrace: log dropped REQ_OP_ZONE_XXX events ver1

Shinichiro CC'd here also reported same bug.
I think Fixes tag would be :-

Fixes: 7ffbd48d5cab ("tracing: Cache comms only after an event occurred")

Since above commit added __this_cpu_read(trace_cmdline_save) and
__this_cpu_write(trace_cmdline_save) :-

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index b90a827a4641..88111b08b2c1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -77,6 +77,13 @@ static int dummy_set_flag(u32 old_flags, u32 bit, int set)
         return 0;
  }
  
+/*
+ * To prevent the comm cache from being overwritten when no
+ * tracing is active, only save the comm when a trace event
+ * occurred.
+ */
+static DEFINE_PER_CPU(bool, trace_cmdline_save);
+
  /*
   * Kill all tracing for good (never come back).
   * It is initialized to 1 but will turn to zero if the initialization
@@ -1135,6 +1142,11 @@ void tracing_record_cmdline(struct task_struct *tsk)
             !tracing_is_on())
                 return;
  
+       if (!__this_cpu_read(trace_cmdline_save))
+               return;
+
+       __this_cpu_write(trace_cmdline_save, false);
+
         trace_save_cmdline(tsk);
  }
  
Full disclosure I've no idea why it has started showing up this month.
I've worked on blktrace extensively and tested my code a lot for form
2020-2021 after above commit but never has seen this even with my patches.

> Anyway, the tracing_record_cmdline() is to update the COMM cache so that
> the trace has way to show the task->comm based on the saved PID in the
> trace. It sets a flag to record the COMM from the sched_switch event if a
> trace event happened. It's not needed if no trace event occurred. That
> means, instead of adding preempt_disable() here, just move it after the
> ring buffer event is reserved, as that means preemption is disabled until
> the event is committed.
>
> i.e.
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index e6988929ead2..3735cbc1f99f 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -383,8 +383,6 @@ static void __blk_add_trace(struct blk_trace *bt, 
> sector_t sector, int bytes,
>       cpu = raw_smp_processor_id();
>   
>       if (blk_tracer) {
> -             tracing_record_cmdline(current);
> -
>               buffer = blk_tr->array_buffer.buffer;
>               trace_ctx = tracing_gen_ctx_flags(0);
>               switch (bt->version) {
> @@ -419,6 +417,8 @@ static void __blk_add_trace(struct blk_trace *bt, 
> sector_t sector, int bytes,
>               if (!event)
>                       return;
>   
> +             tracing_record_cmdline(current);
> +
>               switch (bt->version) {
>               case 1:
>                       record_blktrace_event(ring_buffer_event_data(event),
>
> -- Steve

Above does fix the problem and make testcase pass :-

blktests (master) # ./check blktrace
blktrace/001 (blktrace zone management command tracing)      [passed]
     runtime  3.650s  ...  3.647s
blktrace/002 (blktrace ftrace corruption with sysfs trace)   [passed]
     runtime  0.411s  ...  0.384s
blktests (master) #

I'll send a V2.

-ck

Re: [PATCH] blktrace: fix __this_cpu_read/write in preemptible context

Reply via email to