On Tue, 12 Dec 2023 07:18:37 -0500 Steven Rostedt <rost...@goodmis.org> wrote:
> From: "Steven Rostedt (Google)" <rost...@goodmis.org> > > On 32bit machines, the 64 bit timestamps are broken up into 32 bit words > to keep from using local64_cmpxchg(), as that is very expensive on 32 bit > architectures. > > On 32 bit architectures, reading these timestamps can happen in a middle > of an update. In this case, the read returns "false", telling the caller > that the timestamp is in the middle of an update, and it needs to assume > it is corrupted. The code then accommodates this. > > When first reserving space on the ring buffer, a "before_stamp" and > "write_stamp" are read. If they do not match, or if either is in the > process of being updated (false was returned from the read), an absolute > timestamp is added and the delta is not used, as that requires reading > theses timestamps without being corrupted. > > The one case that this does not matter is if the event is the first event > on the sub-buffer, in which case, the event uses the sub-buffer's > timestamp and doesn't need the other stamps for calculating them. > > After some work to consolidate the code, if the before or write stamps are > in the process of updating, an absolute timestamp will be added regardless > if the event is the first event on the sub-buffer. This is wrong as it > should not care about the success of these reads if it is the first event > on the sub-buffer. > > Fix up the parenthesis so that even if the timestamps are corrupted, if > the event is the first event on the sub-buffer (w == 0) it still does not > force an absolute timestamp. > > It's actually likely that w is not zero, but move it out of the unlikeyl() > and test it first. It should be in hot cache anyway, and there's no reason > to do the rest of the test for the first event on the sub-buffer. And this > prevents having to test all the 'or' statements in that case. > > Cc: sta...@vger.kernel.org > Fixes: 58fbc3c63275c ("ring-buffer: Consolidate add_timestamp to remove some > branches") > Signed-off-by: Steven Rostedt (Google) <rost...@goodmis.org> > --- > Changes since v2: > https://lore.kernel.org/linux-trace-kernel/20231211115949.4692e...@gandalf.local.home > > - Move the test to 'w' out of the unlikely and do it first. > It's already in hot cache, and the rest of test shouldn't be done > if 'w' is zero. > > kernel/trace/ring_buffer.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c > index b416bdf6c44a..095b86081ea8 100644 > --- a/kernel/trace/ring_buffer.c > +++ b/kernel/trace/ring_buffer.c > @@ -3581,7 +3581,7 @@ __rb_reserve_next(struct ring_buffer_per_cpu > *cpu_buffer, > * absolute timestamp. > * Don't bother if this is the start of a new page (w == 0). > */ > - if (unlikely(!a_ok || !b_ok || (info->before != info->after && > w))) { > + if (w && unlikely(!a_ok || !b_ok || info->before != > info->after)) { > info->add_timestamp |= RB_ADD_STAMP_FORCE | > RB_ADD_STAMP_EXTEND; > info->length += RB_LEN_TIME_EXTEND; > } else { After this else, } else { info->delta = info->ts - info->after; The code is using info_after, but it is not sure 'a_ok'. Does this mean if 'w == 0 && !a_ok' this doesn't work correctly? What will be the expected behavior when w == 0 here? Thank you, > -- > 2.42.0 > -- Masami Hiramatsu (Google) <mhira...@kernel.org>