On Fri, Dec 09, 2016 at 09:30:11AM +0100, Peter Zijlstra wrote:
> +static inline u64 mul_u32_u32(u32 a, u32 b)
> +{
> +     u64 ret;
> +
> +     asm ("mull %[b]" : "=A" (ret) : [a] "a" (a), [b] "g" (b) );
> +
> +     return ret;
> +}

ARGH, that's broken on x86_64, it needs to be:

        u32 high, low;

        asm ("mull %[b]" : "=a" (low), "=d" (high)
                         : [a] "a" (a), [b] "rm" (b) );

        return low | ((u64)high) << 32;

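The 'A' constraint doesn't work right there: on x86_64 it lets the
compiler pick either rax or rdx for a 64-bit value, instead of the
edx:eax pair that mull actually writes.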
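
For anyone who wants to poke at this outside the kernel, here is a rough
userspace sketch of the split-register form. It is not the kernel code:
the stdint types, the mul_u32_u32_generic() fallback and the
mul_u32_u32_selfcheck() helper are made up for illustration, and it uses
"rm" for the operand on the assumption that an immediate must be ruled
out, since mull does not take one.

```c
#include <assert.h>
#include <stdint.h>

/* Portable reference: widen one operand so the multiply is done in 64 bits. */
static inline uint64_t mul_u32_u32_generic(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}

static inline uint64_t mul_u32_u32(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
	uint32_t high, low;

	/*
	 * mull multiplies eax by the operand and leaves the 64-bit result
	 * in edx:eax, so name both halves explicitly rather than using 'A'.
	 */
	__asm__ ("mull %[b]" : "=a" (low), "=d" (high)
			     : [a] "a" (a), [b] "rm" (b));

	return low | ((uint64_t)high << 32);
#else
	/* Other architectures: let the compiler pick the instruction. */
	return mul_u32_u32_generic(a, b);
#endif
}

/* Hypothetical self-check: the asm path must agree with plain C. */
static inline void mul_u32_u32_selfcheck(void)
{
	assert(mul_u32_u32(0xffffffffu, 0xffffffffu) == 0xfffffffe00000001ull);
	assert(mul_u32_u32(0x10000u, 0x10000u) ==
	       mul_u32_u32_generic(0x10000u, 0x10000u));
}
```

Calling mul_u32_u32_selfcheck() from a test main() built with -m64, -m32
and -mx32 should be enough to catch the kind of breakage the 'A' version
had, since the upper half of the product is nonzero in both checks.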

And with that all the benchmark results are borken too.

root@ivb-ep:~/spinlocks# for i in -m64 -m32 -mx32 ; do echo $i; gcc -O3 $i -o mult mult.c -lm; ./mult; done

cond: avg: 7.474872 +- 0.008302
uncond: avg: 9.116401 +- 0.008468
128: avg: 0.826584 +- 0.005514

cond: avg: 16.604030 +- 0.009808
uncond: avg: 13.115470 +- 0.004452

cond: avg: 6.168156 +- 0.006650
uncond: avg: 7.202092 +- 0.006813
128: avg: 0.081809 +- 0.008440
