Re: [RFC PATCH 2/6] Driver Tracing Interface (DTI) code

Mathieu Desnoyers Thu, 05 Jul 2007 09:15:27 -0700

* Tom Zanussi ([EMAIL PROTECTED]) wrote:
> On Tue, 2007-07-03 at 23:13 -0400, Mathieu Desnoyers wrote:
> > > +void *__dti_reserve(struct dti_info *dti, int prio, size_t len)
> > > +{
> > > + struct dti_event *event;
> > > + int event_len;
> > > +
> > > + if (!dti)
> > > +         return NULL;
> > > +
> > > + if (prio > dti->level)
> > > +         return NULL;
> > > +
> > > + event_len = len + sizeof(*event);
> > > +
> > > + event = relay_reserve(dti->trace->rchan, event_len);
> > > + if (!event)
> > > +         return NULL;
> > > +
> > > + event->time = sched_clock();
> > 
> > sched_clock() is dangerous.. you have to make sure you provide correct
> > clock errors in the trace parsing.. not exactly something interesting
> > when you are looking at the "one" case over 10000 that was faulty. See
> > the i386 sched-clock.c:
> > 
> >  * Scheduler clock - returns current time in nanosec units.
> >  * All data is local to the CPU.
> >  * The values are approximately[1] monotonic local to a CPU, but not
> >  * between CPUs.   There might be also an occasionally random error,
> >  * but not too bad. Between CPUs the values can be non monotonic.
> >  *
> >  * [1] no attempt to stop CPU instruction reordering, which can hit
> >  * in a 100 instruction window or so.
> > 
> > How do you plan to use these timestamps to reorder events across CPUs ?
> > 
> 
> lttng_timestamp()? ;-)
>


Yeah, this problem is not easy.. I have a set of per architecture
timestamping implementations that detects the limitations of the
hardware and overcomes them (32 bits TSC -> 64 bits extension that
supports read from NMI context, detect TSCs not in sync (AMDs and some
intels) that falls back to what is more or less a logical clock then,
printing warnings. I also keep a recipe cookbook about what people
should disable on these architectures to get precise timestamps.
> 
> > > +
> > > +static int vprintk_normal(struct dti_info *dti, int prio, const char* 
> > > fmt,
> > > +                   va_list args)
> > > +{
> > > + struct dti_event *event;
> > > + void *buf;
> > > + int len, event_len, rc = -1;
> > > + unsigned long flags;
> > > +
> > > + if (!dti)
> > > +         return -1;
> > > +
> > > + if (prio > dti->level)
> > > +         return -1;
> > > +
> > > + local_irq_save(flags);
> > > + buf = dti_printf_tmpbuf[smp_processor_id()];
> > > + len = vsnprintf(buf, DTI_PRINTF_TMPBUF_SIZE, fmt, args);
> > 
> > As I said, why not put the result of the vsnprintf directly in the
> > buffer after the relay_reserve ?
> > 
> 
> Because I don't know how much to reserve until I've done the
> vsnprintf().  I think that to have vsnprintf() print directly into the
> relay buffer, I'd need a relay_unreserve() to remove the unused portion.
> 

This is why I use a 2 pass approach in LTTng: the first one, given a
NULL buffer, calculates the reserve size, and then the second pass does
the copy.

> > > +
> > > +/**
> > > + *       dti_relog_initbuf - re-log static initbuf into real relay 
> > > channel
> > > + *       @work: work struct that contains the the dti handle
> > > + *
> > > + *       The global initbuf may contain events from multiple cpus.  These
> > > + *       events must be put into their respective cpu buffers once the
> > > + *       per-cpu channel is available.
> > > + *
> > > + *       Internal - exported for setup macros.
> > > + */
> > > +int dti_relog_initbuf(struct dti_handle *handle)
> > > +{
> > 
> > What do you do with the data recorded by the system while you do this
> > buffer copy? Is the timestamp ordering kept? Why do you copy the static
> > buffers into other buffers at all anyway ? They could be exported
> > through relay (if it supported exporting statically mapped buffers).
> > 
> 
> You're right, we need to do something about that.  As mentioned before,
> we copy the events from the static buffer into the relay channel (the
> timestamp ordering is kept) after it becomes available because that
> seems like the simplest thing to do.  We could add support to relay to
> export another set of (static) buffers, or add the relogging code to
> relay if other tracers want to use it.
> 
> 

The problem with relogging is that it will be a huge timestamping mess.
I doubt that we can reliably continue tracing while we do this copy.
Therefore, we will end up losing events or having timestamps jumping
backward.

A solution that would export the static buffers separately to user-space
and (maybe) allows them to be (somehow?) de-allocated once they have
been read by user-space seems cleaner to me. The switch between the
static buffers and the normal relay buffers could then be done
atomically in a simple pointer overwrite.

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 2/6] Driver Tracing Interface (DTI) code

Reply via email to