Hi, I was debugging my application and noticed that a timerfd event was being triggered *before* the timer expires. I'd like to know if this behavior is expected. More details below.
I reduced the scope of the program to test a single timerfd and measure the difference in the result of clock_gettime() between two reads. Helper functions are elsewhere, but the loop is very simple: loop_time_fd = setup_timerfd(interval, 0); iter = 0; old = curr = now_usec(); do { if (iter > iterations) break; r = read(loop_time_fd, &events, sizeof(events)); if (r < 0) { if (errno == EINTR) continue; return log_error_errno(errno, "Error reading timerfd: %m"); } curr = now_usec(); /* mark as USEC_INFINITY those events we missed, the real wait * time is nonetheless in the next position */ for (; events > 1; events--) elapsed[iter++] = USEC_INFINITY; elapsed[iter++] = curr - old; old = curr; } while (1); Here we have: - now_usec() simply calls clock_gettime() with CLOCK_MONOTONIC as clockid and converts to usec. - setup_timerfd() creates the timerfd with CLOCK_MONOTONIC and let it in blocking mode (there's another version of the test with poll() and non-block fd, but it shows the same behavior) First I thought it could be because of the timerslack. I disabled it with prctl(). I'm also running the process with RT priority so AFAIK there shouldn't be a problem with timerslack. In order to reduce the overhead for the process I also tried to isolate it alone on a single cpu by booting with isolcpus and then setting the affinity accordingly. For whatever interval I configure and 10000 iterations, what I'm seeing in the elapsed vector are values like interval +- 70usec Due to the wakeup overhead, I understand the +70usec, but I'm not getting why there are elapsed values shorter than the configured interval. Tracing for wakeup events with ftrace[1] I also get the same results... the time difference between 2 wakeup events for ~30% of events are shorter the configured interval. I'm currently running with kernel version 3.18.6. Is there anything I'm missing? thanks -- Lucas De Marchi [1] trace-cmd record -F -e sched:sched_wakeup ./test-timer 1000 10000 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/