* Daniel Walker ([EMAIL PROTECTED]) wrote:
> On Mon, 2007-02-26 at 17:14 -0500, Mathieu Desnoyers wrote:
> >
> > For kernel and user space tracing, those small jumps are very annoying :
> > it can show, in a trace, that a fork() appears on a CPU after the first
> > schedule() of the thread on the other CPU : scheduling causality
> > relationship can become very hard to follow. This is only a sample case.
> > Inaccuracy and periodical modification of the clock time (non monotonic)
> > can cause important inaccuracy in performance tests, even on UP systems.
> > A monotonic clock, accessible from anywhere in kernel space (including
> > NMI handler) and from user space is very useful for performance analysis
> > and, more generally, for timestamping data in per cpu buffers so it can
> > be later reordered correctly.
>
> What about adding a layer below do_gettimeofday() which just scheds the
> adjustment process? That might be reasonable .. The NMI, and userspace
> cases aren't very compelling right now, at least I'm not convinced a
> whole new timing interface is needed ..
>
> The latency tracing system in the -rt branch modifies the gettimeofday
> facilities , I'm not sure of the correctness of it but it gets called
> from anyplace in the kernel including NMI's .
>
> Here's the function,
>
> cycle_t notrace get_monotonic_cycles(void)
> {
>         cycle_t cycle_now, cycle_delta;
>
>         /* read clocksource: */
>         cycle_now = clocksource_read(clock);
>
>         /* calculate the delta since the last update_wall_time: */
>         cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
>
>         return clock->cycle_last + cycle_delta;
> }
>
> That looks safe. When converting this to nanoseconds you would still get
> the time adjustments but it would be all at once instead of in little
> increments ..
>
ouch... if the clocksource used is the PIT on x86 :

static cycle_t pit_read(void)
{
        unsigned long flags;
        int count;
        u32 jifs;
        static int old_count;
        static u32 old_jifs;

        spin_lock_irqsave(&i8253_lock, flags);

If an NMI nests over the spinlock, we have a deadlock:
spin_lock_irqsave() masks ordinary interrupts but not NMIs, so an NMI
handler that calls pit_read() on the CPU already holding i8253_lock
spins forever.

In addition, clock->cycle_last is a cycle_t, defined as 64 bits on x86.
On a 32-bit kernel it is therefore not updated atomically by
change_clocksource, timekeeping_init, timekeeping_resume and
update_wall_time. If an NMI fires right on top of the update, especially
around the 32-bit wrap-around, it can read a half-updated value and the
time will be really fuzzy.

Mathieu

> Daniel
>

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
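
To make the 32-bit wrap-around case concrete, here is a minimal user-space
sketch in C of a torn read. It is not kernel code: struct split64, the helper
names and the sample values are all hypothetical, chosen only for
illustration. It models a 64-bit cycle_last as the two 32-bit words a 32-bit
CPU would store separately, and returns what a reader would observe if it ran
between the two stores. The low-word-first store order is also an assumption;
a compiler may pick either order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit counter on a 32-bit CPU: two words. */
struct split64 {
    uint32_t lo;
    uint32_t hi;
};

/* Reassemble the 64-bit value from whatever the two words hold right now. */
static uint64_t split64_read(const struct split64 *v)
{
    return ((uint64_t)v->hi << 32) | v->lo;
}

/* First half of a non-atomic 64-bit store: only the low word changes. */
static void split64_store_lo(struct split64 *v, uint64_t val)
{
    v->lo = (uint32_t)val;
}

/* Second half of the store: the high word catches up. */
static void split64_store_hi(struct split64 *v, uint64_t val)
{
    v->hi = (uint32_t)(val >> 32);
}

/*
 * Return the value a nested reader (standing in for the NMI) would see
 * if it ran after the low word of new_val was stored but before the
 * high word was.
 */
static uint64_t torn_read_demo(uint64_t old_val, uint64_t new_val)
{
    struct split64 v;
    uint64_t seen;

    split64_store_lo(&v, old_val);
    split64_store_hi(&v, old_val);

    split64_store_lo(&v, new_val);  /* update starts */
    seen = split64_read(&v);        /* "NMI" reads here */
    split64_store_hi(&v, new_val);  /* update completes */

    return seen;
}

/* Sample case: an update that crosses the 32-bit boundary. */
void torn_read_example(void)
{
    uint64_t seen = torn_read_demo(0xFFFFFFF0ULL, 0x100000010ULL);

    /* Neither the old nor the new value: the high word is stale. */
    assert(seen == 0x10ULL);
}
```

With these sample values the reader sees 0x10 although the counter only ever
held 0xFFFFFFF0 and 0x100000010, that is, a value almost 2^32 cycles in the
past. A sequence-counter retry loop is the usual cure for readers on other
CPUs, but a reader nested over the writer on the same CPU, as an NMI is,
would retry forever, so that alone does not help here.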