On Thu, 22 Apr 2010 17:29:53 +0200 tlaro...@polynum.com  wrote:
> Data:
> Under NetBSD/gcc, I have the following values:
> 
>       before: x1:=5440, x2:=-5843, x3:=78909
>       after: x1:=5440, x2:=-201, x3:=18166, r:=6827 t:=30232
> 
> Under Plan9/gcc, I have the following values:
> 
>       before: x1:=5440, x2:=-5843, x3:=78909
>       after: x1:=5440, x2:=2147483447, x3:=1073759990, r:=6827 t:=-1073711592
> 
> Uhm... seems to have a `slight' divergence...
> 
> In fact, all wrong values depend upon x2, that has the "correct"
> value... with 2^31 complement. A positive when it should be negative,
> since the offending code is the following:
> 
>       x2 = half ( x1 + x2 + xicorr ) ; 
> 
>         that is : 
>               x2 = (5440 - 5843 + 1) / 2;
> 
> Not exactly pushing things to the limit! And yes, the expected result is
> indeed -201.

You would get 2147483447 if x1 and x2 were treated as
unsigned numbers but -201 if treated as signed. Try this:

cat > x.c <<EOF
#include <stdio.h>
#include <stdlib.h>
NUM f(NUM x, NUM y) { return (x + y + 1) / 2; }
int main(int c, char**v) { printf("%d\n", (int)f(atoi(v[1]), atoi(v[2]))); return 0; }
EOF
cc -DNUM=signed   x.c && ./a.out 5440 -5843
cc -DNUM=unsigned x.c && ./a.out 5440 -5843

What are the types of x1 and x2? Can you show an actual C code
fragment?  Don't worry about it being complete. Just the half()
function (or macro), the header of the function where it is
called, the declarations for x1 and x2, and a couple of lines
around the call to half. I am still wondering if this is due to
a different interpretation of language semantics by the two
compilers.

> Since the problem arises in this context, but not if you just add
> this isolated in a test program, and call it with these very 3
> values (5440, -5843, 1), it is clear that's the way the computation
> is handled with huge number of parameters and auto variables
> that wreaks havoc.

You *suspect* this but you need to prove it.  An isolated
test case that doesn't trigger this problem simply means you
have not created the right condition for the bug.  Creating a
simple test can be tricky and may be more work than debugging
your program.

> If I declare all the auto volatile, this does nothing: same result.
> 
> If I do the addition, and afterwards take the half, that works:
> 
> x2 += x1 + xicorr;
> x2 = half(x2);        /* works! */

I wouldn't bother changing anything. You already have a
smoking gun (at least you know in which neighbourhood it has
gone off). You can try a binary search to narrow down the
area but in the end you will have to look at the assembly
output of the relevant code fragment.
