[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-29 Thread tg at mirbsd dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 Thorsten Glaser changed: CC: added tg at mirbsd dot org --- Comment #22 fr

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #21 from Dominique d'Humieres 2010-10-26 21:06:48 UTC --- > I guess you mean LLVM instead of clang, Yes, if you prefer. I was referring to the command I used. > F (6, a * a * a * a * a + 2 * a * a * a + 5 * a) you probably mean F

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #20 from Jakub Jelinek 2010-10-26 21:00:11 UTC --- If I translate the assembly back to C, it seems it is performing part of the arithmetic in TImode: unsigned long f (unsigned long a, unsigned long b) { if (a >= b) return 0;
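The snippet above is cut off, so the following is not a reconstruction of Jakub's translation. It is only a minimal sketch of why a wider type helps, assuming the loop under discussion sums i for a <= i < b into a 64-bit unsigned accumulator. TImode is GCC's 128-bit integer mode; the sketch uses the GCC/Clang extension type unsigned __int128, and the name sum_wide is hypothetical.

```c
#include <stdint.h>

/* Sketch: evaluate b*(b-1) - a*(a-1) exactly in 128 bits, halve it,
   then truncate to 64 bits. Doing the products in 64 bits would lose
   the top bit that the division by 2 shifts down into the result. */
uint64_t sum_wide(uint64_t a, uint64_t b)
{
    if (a >= b)
        return 0;
    unsigned __int128 wa = a, wb = b;           /* widen before multiplying */
    unsigned __int128 twice = wb * (wb - 1) - wa * (wa - 1);
    return (uint64_t)(twice / 2);               /* exact: twice is even */
}
```

Since b > a, the difference is non-negative and always even, so the division is exact and the truncation yields the same value the wrapping loop would produce.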

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread joseph at codesourcery dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #19 from joseph at codesourcery dot com 2010-10-26 20:29:56 UTC --- On Tue, 26 Oct 2010, dominiq at lps dot ens.fr wrote: > --- Comment #13 from Dominique d'Humieres > 2010-10-26 16:36:05 UTC --- > > This multiplication transformati

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #18 from Jakub Jelinek 2010-10-26 19:11:59 UTC --- I guess you mean LLVM instead of clang, I'm pretty sure the FE doesn't perform this optimization. Anyway, given: #define F(n, exp) \ unsigned long

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #17 from Dominique d'Humieres 2010-10-26 18:53:49 UTC --- Note that clang seems to know the general result: \sum_{i=a}^b p(i)=P(b), where p(i) is a given polynomial of degree n and P(x) a polynomial of degree n+1 such that P(x)=P(x-1)

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #16 from Jakub Jelinek 2010-10-26 18:43:40 UTC --- chrec_apply is called with {a_4(D), +, {a_4(D) + 1, +, 1}_1}_1 chrec and ~a_4(D) + b_5(D) in x. I wonder if this can be fixed just by recognizing such special cases in chrec_apply (af

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #15 from Dominique d'Humieres 2010-10-26 17:15:31 UTC --- > For sum += 2 or sum += b sccp handles this, so I wonder whether it couldn't > handle even the sum += a case. 2 and b are constants while a is not. For constants you have to

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 Jakub Jelinek changed: Status: changed from UNCONFIRMED to NEW; Last reconfirmed|

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #13 from Dominique d'Humieres 2010-10-26 16:36:05 UTC --- > This multiplication transformation is incorrect if the loop wraps > (unsigned always wraps; never overflows). I think this is wrong: wrapping is nothing but a modulo 2^n o

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread pinskia at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #12 from pinskia at gmail dot com 2010-10-26 15:56:20 UTC --- On Oct 26, 2010, at 7:30 AM, "j...@jak-linux.org" wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 > > --- Comment #1 from Julian Andres Klode > 2010-10-26 1

Re: [Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread Andrew Pinski
On Oct 26, 2010, at 7:30 AM, "j...@jak-linux.org" wrote: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #1 from Julian Andres Klode 2010-10-26 14:30:24 UTC --- Created attachment 22162 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22162 Clang's assembler This multi

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #11 from Paolo Carlini 2010-10-26 15:42:58 UTC --- Can we please stop talking about nano and giga numbers like kids? If an optimization like complete loop unrolling is involved of course very small or large numbers can be involved, do

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 Jakub Jelinek changed: CC: added jakub at gcc dot gnu.org --- Comment #10

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #9 from Jonathan Wakely 2010-10-26 15:28:51 UTC --- (In reply to comment #8) > > Since the optimization seems to be mostly there in -O3, it's just a matter of > enabling it in -O2. Or if you want all optimisations, it's just a matt

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread j...@jak-linux.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #8 from Julian Andres Klode 2010-10-26 15:25:56 UTC --- (In reply to comment #6) > You get this kind of speedup if the compiler knows that the result of the loop > is > > sum=(b*(b-1)-a*(a-1))/2 > > In which case the timing is meani

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread j...@jak-linux.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #7 from Julian Andres Klode 2010-10-26 15:00:37 UTC --- (In reply to comment #5) > (In reply to comment #4) > > GCC's output is significantly faster at -O3 or without the noinline > > attribute > > I just tested and at -O3, gcc-4.4

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #6 from Dominique d'Humieres 2010-10-26 14:59:18 UTC --- You get this kind of speedup if the compiler knows that the result of the loop is sum=(b*(b-1)-a*(a-1))/2 In which case the timing is meaningless (it is 0.000s on my laptop),
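To make the closed form concrete, here is a minimal sketch. It assumes the loop in the report has the shape "for (i = a; i < b; i++) sum += i" on 64-bit unsigned values; the report's actual source is not shown in this listing. The function names are hypothetical. The closed form is rewritten as n*a + n*(n-1)/2 with n = b - a, which equals (b*(b-1) - a*(a-1))/2 but stays correct under wrap-around.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the loop itself. Unsigned arithmetic wraps modulo 2^64. */
uint64_t sum_loop(uint64_t a, uint64_t b)
{
    uint64_t sum = 0;
    for (uint64_t i = a; i < b; i++)
        sum += i;
    return sum;
}

/* Closed form with n = b - a terms: sum = n*a + n*(n-1)/2.
   Halve whichever of n and n-1 is even before multiplying, so the
   division is exact and the result stays correct modulo 2^64. */
uint64_t sum_closed(uint64_t a, uint64_t b)
{
    if (a >= b)
        return 0;
    uint64_t n = b - a;
    uint64_t tri = (n % 2 == 0) ? (n / 2) * (n - 1) : n * ((n - 1) / 2);
    return n * a + tri;
}

/* Self-check for small ranges: both versions must agree. */
void check_sums_agree(uint64_t a, uint64_t b)
{
    assert(sum_loop(a, b) == sum_closed(a, b));
}
```

The closed form runs in constant time regardless of b - a, which is why timing it against the loop says little about code quality in general.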

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread j...@jak-linux.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #5 from Julian Andres Klode 2010-10-26 14:53:24 UTC --- (In reply to comment #4) > GCC's output is significantly faster at -O3 or without the noinline attribute I just tested and at -O3, gcc-4.4 creates slow code and gcc-4.5 fast cod

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #4 from Jonathan Wakely 2010-10-26 14:47:12 UTC --- GCC's output is significantly faster at -O3 or without the noinline attribute

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread j...@jak-linux.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #3 from Julian Andres Klode 2010-10-26 14:32:27 UTC --- System information: Using built-in specs. Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-5' --with-bugurl=file:///usr/share/doc/gc

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 Paolo Carlini changed: CC: added paolo.carlini at oracle dot

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread j...@jak-linux.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #1 from Julian Andres Klode 2010-10-26 14:30:24 UTC --- Created attachment 22162 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22162 Clang's assembler Attaching the assembler output from clang, it should help understand which op