On Tue, 27 Nov 2007 15:03:47 -0500 Jeff Dike <[EMAIL PROTECTED]> wrote:
> [ This one needs to get into 2.6.24 ] > > With NO_HZ disabled, the UML idle loop effectively becomes a busy > loop, as it will sleep for no time. > > The cause was forgetting to restart the tick after waking up from > sleep. It was disabled before sleeping, and the remaining time used > as the interval to sleep. So, the tick needs to be restarted when > nanosleep finishes. > > This is done by introducing after_sleep_interval, which is empty in > the NO_HZ case, but which sets the tick starting in the !NO_HZ case. > > Signed-off-by: Jeff Dike <[EMAIL PROTECTED]> > --- > arch/um/os-Linux/time.c | 54 > +++++++++++++++++++++++++++++++++++++++++++++--- > 1 file changed, 51 insertions(+), 3 deletions(-) > > Index: linux-2.6.22/arch/um/os-Linux/time.c > =================================================================== > --- linux-2.6.22.orig/arch/um/os-Linux/time.c 2007-11-14 10:33:29.000000000 > -0500 > +++ linux-2.6.22/arch/um/os-Linux/time.c 2007-11-26 15:50:46.000000000 > -0500 > @@ -59,7 +59,7 @@ long long disable_timer(void) > { > struct itimerval time = ((struct itimerval) { { 0, 0 }, { 0, 0 } }); > > - if(setitimer(ITIMER_VIRTUAL, &time, &time) < 0) > + if (setitimer(ITIMER_VIRTUAL, &time, &time) < 0) > printk(UM_KERN_ERR "disable_timer - setitimer failed, " > "errno = %d\n", errno); > > @@ -74,13 +74,61 @@ long long os_nsecs(void) > return timeval_to_ns(&tv); > } > > +#ifdef UML_CONFIG_NO_HZ Nothing ever defines this? > +static int after_sleep_interval(struct timespec *ts) > +{ > +} > +#else > +static inline long long timespec_to_us(const struct timespec *ts) > +{ > + return ((long long) ts->tv_sec * UM_USEC_PER_SEC) + > + ts->tv_nsec / UM_NSEC_PER_USEC; > +} > + > +static int after_sleep_interval(struct timespec *ts) > +{ > + int usec = UM_USEC_PER_SEC / UM_HZ; > + long long start_usecs = timespec_to_us(ts); > + struct timeval tv; > + struct itimerval interval; > + > + /* > + * It seems that rounding can increase the value returned from > + * setitimer to larger than the one passed in. Over time, > + * this will cause the remaining time to be greater than the > + * tick interval. If this happens, then just reduce the first > + * tick to the interval value. > + */ > + if (start_usecs > usec) > + start_usecs = usec; > + tv = ((struct timeval) { .tv_sec = start_usecs / UM_USEC_PER_SEC, > + .tv_usec = start_usecs % UM_USEC_PER_SEC }); > + interval = ((struct itimerval) { { 0, usec }, tv }); > + > + if (setitimer(ITIMER_VIRTUAL, &interval, NULL) == -1) > + return -errno; > + > + return 0; > +} > +#endif > + > extern void alarm_handler(int sig, struct sigcontext *sc); > > void idle_sleep(unsigned long long nsecs) > { > - struct timespec ts = { .tv_sec = nsecs / UM_NSEC_PER_SEC, > - .tv_nsec = nsecs % UM_NSEC_PER_SEC }; > + struct timespec ts; > + > + /* > + * nsecs can come in as zero, in which case, this starts a > + * busy loop. To prevent this, reset nsecs to the tick > + * interval if it is zero. > + */ > + if (nsecs == 0) > + nsecs = UM_NSEC_PER_SEC / UM_HZ; > + ts = ((struct timespec) { .tv_sec = nsecs / UM_NSEC_PER_SEC, > + .tv_nsec = nsecs % UM_NSEC_PER_SEC }); > > if (nanosleep(&ts, &ts) == 0) > alarm_handler(SIGVTALRM, NULL); > + after_sleep_interval(&ts); > } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/