Subject: [patch] sys_time() speedup From: Ingo Molnar <[EMAIL PROTECTED]>
improve performance of sys_time(). sys_time() returns time in seconds, but it does so by calling do_gettimeofday() and then returning the tv_sec portion of the GTOD time. But the data structure "xtime", which is updated by every timer/scheduler tick, already offers HZ granularity time. the patch improves the sysbench OLTP macrobenchmark significantly: 2.6.22-rc6: #threads 1: transactions: 3733 (373.21 per sec.) 2: transactions: 6676 (667.46 per sec.) 3: transactions: 6957 (695.50 per sec.) 4: transactions: 7055 (705.48 per sec.) 5: transactions: 6596 (659.33 per sec.) 2.6.22-rc6 + sys_time.patch: 1: transactions: 4005 (400.47 per sec.) 2: transactions: 7379 (737.77 per sec.) 3: transactions: 7347 (734.49 per sec.) 4: transactions: 7468 (746.65 per sec.) 5: transactions: 7428 (742.47 per sec.) mixed API uses of gettimeofday() and time() are guaranteed to be coherent via the use of a at-most-once-per-second slowpath that updates xtime. Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]> --- kernel/time.c | 27 ++++++++++++++++++++++----- 1 file changed, 22 insertions(+), 5 deletions(-) Index: linux/kernel/time.c =================================================================== --- linux.orig/kernel/time.c +++ linux/kernel/time.c @@ -57,14 +57,17 @@ EXPORT_SYMBOL(sys_tz); */ asmlinkage long sys_time(time_t __user * tloc) { - time_t i; - struct timeval tv; + /* + * We read xtime.tv_sec atomically - it's updated + * atomically by update_wall_time(), so no need to + * even read-lock the xtime seqlock: + */ + time_t i = xtime.tv_sec; - do_gettimeofday(&tv); - i = tv.tv_sec; + smp_rmb(); /* sys_time() results are coherent */ if (tloc) { - if (put_user(i,tloc)) + if (put_user(i, tloc)) i = -EFAULT; } return i; @@ -373,6 +376,20 @@ void do_gettimeofday (struct timeval *tv tv->tv_sec = sec; tv->tv_usec = usec; + + /* + * Make sure xtime.tv_sec [returned by sys_time()] always + * follows the gettimeofday() result precisely. This + * condition is extremely unlikely, it can hit at most + * once per second: + */ + if (unlikely(xtime.tv_sec != tv->tv_sec)) { + unsigned long flags; + + write_seqlock_irqsave(&xtime_lock); + update_wall_time(); + write_seqlock_irqrestore(&xtime_lock); + } } EXPORT_SYMBOL(do_gettimeofday); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/