On Tue, Sep 30, 2014 at 01:40:11PM -0400, Rik van Riel wrote:
> On Tue, 30 Sep 2014 13:56:37 +0200
> Arnd Bergmann <a...@arndb.de> wrote:
> 
> > A recent change to update the stime/utime members of task_struct
> > using atomic cmpxchg broke configurations on 32-bit machines with
> > CONFIG_VIRT_CPU_ACCOUNTING_GEN set, because that uses 64-bit
> > nanoseconds, leading to a link-time error:
> > 
> > kernel/built-in.o: In function `cputime_adjust':
> > :(.text+0x25234): undefined reference to `__bad_cmpxchg'
> 
> Arnd, this should fix your problem, while still ensuring that
> the cpu time counters only ever go forward.
> 
> I do not have cross compiling toolchains set up here, but I assume
> this fixes your bug.
> 
> Ingo & Peter, if this patch fixes the bug for Arnd, could you please
> merge it into -tip?
> 
> Linus, the changeset causing the problem is only in -tip right now,
> and this patch will not apply to your tree.
> 
> ---8<---
> 
> Subject: sched,time: fix build error with 64 bit cputime_t on 32 bit systems
> 
> On 32 bit systems cmpxchg cannot handle 64 bit values, and
> cmpxchg64 needs to be used when full dynticks CPU accounting
> is enabled, since that turns cputime_t into a u64.
> 
> With jiffies based CPU accounting, cputime_t is an unsigned
> long. On 64 bit systems, cputime_t is always the size of a
> long.
> 
> Luckily the compiler can figure out whether we need to call
> cmpxchg or cmpxchg64.
> 
> Signed-off-by: Rik van Riel <r...@redhat.com>
> Reported-by: Arnd Bergmann <a...@arndb.de>
> ---
>  kernel/sched/cputime.c | 35 +++++++++++++++++++++++++----------
>  1 file changed, 25 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 64492df..db239c9 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -555,6 +555,29 @@ static cputime_t scale_stime(u64 stime, u64 rtime, u64 
> total)
>  }
>  
>  /*
> + * Atomically advance counter to the new value. Interrupts, vcpu
> + * scheduling, and scaling inaccuracies can cause cputime_advance
> + * to be occasionally called with a new value smaller than counter.
> + * Let's enforce atomicity.
> + *
> + * Normally a caller will only go through this loop once, or not
> + * at all in case a previous caller updated counter the same jiffy.
> + */
> +static void cputime_advance(cputime_t *counter, cputime_t new)
> +{
> +     cputime_t old;
> +
> +     while (new > (old = ACCESS_ONCE(*counter))) {
> +             /* The compiler will optimize away this branch.  */
> +             if (sizeof(cputime_t) == sizeof(long))
> +                     cmpxchg(counter, old, new);
> +             else
> +                     /* 64 bit cputime_t on a 32 bit system... */
> +                     cmpxchg64(counter, old, new);

Maybe cmpxchg() should itself be a wrapper that does this build time choice
between cmpxchg32() and cmpxchg64().

And if it looks too dangerous to convert all users, this could be cmpxchg_t().

> +     }
> +}
> +
> +/*
>   * Adjust tick based cputime random precision against scheduler
>   * runtime accounting.
>   */
> @@ -599,16 +622,8 @@ static void cputime_adjust(struct task_cputime *curr,
>               utime = rtime - stime;
>       }
>  
> -     /*
> -      * If the tick based count grows faster than the scheduler one,
> -      * the result of the scaling may go backward.
> -      * Let's enforce monotonicity.
> -      * Atomic exchange protects against concurrent cputime_adjust().
> -      */
> -     while (stime > (rtime = ACCESS_ONCE(prev->stime)))
> -             cmpxchg(&prev->stime, rtime, stime);
> -     while (utime > (rtime = ACCESS_ONCE(prev->utime)))
> -             cmpxchg(&prev->utime, rtime, utime);
> +     cputime_advance(&prev->stime, stime);
> +     cputime_advance(&prev->utime, utime);
>  
>  out:
>       *ut = prev->utime;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to