Hi Thomas,

Sorry for the late reply. I was trying out some ways to do it the way you
suggested. I tried to talk to Carsten regarding the hrtimer latency patch but
was unable to.
On 8 September 2016 at 13:36, Thomas Gleixner <t...@linutronix.de> wrote:
> On Wed, 7 Sep 2016, Binoy Jayan wrote:
>> This captures only the latencies introduced by disabled irqs and
>> preemption. Additional per process data has to be captured to calculate
>> the effective latencies introduced for individual processes.
>
> And what is the additional per process data and how is it captured and
> used?

This is the part of the patch that would touch the scheduler code, which I
did not want to do. I was trying to achieve the same using kprobes but did
not get around to making it work with the histograms.

>> +static inline void latency_preempt_timing_start(enum latency_type ltype)
>> +{
>> +	this_cpu_write(lat_ts[ltype], (cycle_t) trace_clock_local());
>
> What is this silly type cast for? Why can't you just use u64 ?
>
>> +static inline void latency_preempt_timing_stop(enum latency_type type)
>> +{
>> +	trace_latency_preempt(type,
>> +		(cycle_t) trace_clock_local() - this_cpu_read(lat_ts[type]));
>
> And then of course you use a completely different data type in the trace
> itself.

This has been changed.

>> +DECLARE_EVENT_CLASS(latency_template,
>> +	TP_PROTO(int ltype, cycles_t latency),
>
> Are you sure, that you know what you are doing here? If yes, then please
> explain it in form of comments so mere mortals can understand it as well.

I have added comments explaining this.

>> /* start and stop critical timings used to for stoppage (in idle) */
>> void start_critical_timings(void)
>> {
>> +	if (unlikely(trace_latency_preempt_enabled()))
>> +		latency_preempt_timing_start(LT_CRITTIME);
>
> I doubt, that this conditional is less expensive than a simple
> unconditional store to a per cpu variable.

This is changed as well.
>> @@ -431,6 +451,9 @@ void stop_critical_timings(void)
>> {
>>	if (preempt_trace() || irq_trace())
>>		stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
>> +
>> +	if (unlikely(trace_latency_preempt_enabled()))
>> +		latency_preempt_timing_stop(LT_CRITTIME);
>
> And this is silly as well. You can put the whole evaluation into the trace
> event assignement so the tracer core will handle that conditional.

I hope this can be done for "time_hardirqs_off" and "trace_preempt_off" as
well.

> Aside of that it is silly to evaluate trace_clock_local() for the actual
> tracepoint simply because that time is already stored in the tracepoint
> itself. The flow here is:
>
>	event = trace_event_buffer_lock_reserve();
>	entry = ring_buffer_event_data(event);
>	{ <assign>; }	<-- Here we assign the entries by the __field and
>			    __array macros.
>
> So you should talk to Steven about having a way to retrieve that time from
> entry itself in a generic way.

Steven mentioned that the timestamp in the ring buffer is in offset form and
also that it may not be equivalent to trace_clock_local() time. I also tried
using the timestamp from the per-cpu trace data, but it did not seem to
provide a correct value for the timestamp:

	struct trace_array_cpu *data =
		per_cpu_ptr(irqsoff_trace->trace_buffer.data, cpu);
	data->preempt_timestamp

Thanks,
Binoy