http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
Thorsten Glaser changed:
What|Removed |Added
CC||tg at mirbsd dot org
--- Comment #22 fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #21 from Dominique d'Humieres
2010-10-26 21:06:48 UTC ---
> I guess you mean LLVM instead of clang,
Yes, if you prefer. I was referring to the command I used.
> F (6, a * a * a * a * a + 2 * a * a * a + 5 * a)
you probably mean
F
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #20 from Jakub Jelinek 2010-10-26
21:00:11 UTC ---
If I translate the assembly back to C, it seems it is performing part of the
arithmetic in TImode:
unsigned long f (unsigned long a, unsigned long b)
{
if (a >= b)
return 0;
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #19 from joseph at codesourcery dot com 2010-10-26 20:29:56 UTC ---
On Tue, 26 Oct 2010, dominiq at lps dot ens.fr wrote:
> --- Comment #13 from Dominique d'Humieres
> 2010-10-26 16:36:05 UTC ---
> > This multiplication transformati
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #18 from Jakub Jelinek 2010-10-26
19:11:59 UTC ---
I guess you mean LLVM instead of clang, I'm pretty sure the FE doesn't perform
this optimization.
Anyway, given:
#define F(n, exp) \
unsigned long
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #17 from Dominique d'Humieres
2010-10-26 18:53:49 UTC ---
Note that clang seems to know the general result: \sum_{i=a}^b p(i)=P(b), where
p(i) is a given polynomial of degree n and P(x) a polynomial of degree n+1 such
that P(x)=P(x-1)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #16 from Jakub Jelinek 2010-10-26
18:43:40 UTC ---
chrec_apply is called with
{a_4(D), +, {a_4(D) + 1, +, 1}_1}_1
chrec and ~a_4(D) + b_5(D) in x.
I wonder if this can be fixed just by recognizing such special cases in
chrec_apply (af
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #15 from Dominique d'Humieres
2010-10-26 17:15:31 UTC ---
> For sum += 2 or sum += b sccp handles this, so I wonder whether it couldn't
> handle even the sum += a case.
2 and b are constants while a is not. For constants you have to
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
Jakub Jelinek changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed|
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #13 from Dominique d'Humieres
2010-10-26 16:36:05 UTC ---
> This multiplication transformation is incorrect if the loop wraps
> (unsigned always wraps; never overflows).
I think this is wrong: wrapping is nothing but a modulo 2^n o
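The modulo 2^n point can be illustrated with a small sketch in C. It assumes the loop under discussion has the "sum += a" shape mentioned elsewhere in this thread; the names wrap_loop, wrap_naive, wrap_exact and tri are hypothetical and appear nowhere in the bug report. The sketch only shows the arithmetic, not what any compiler actually emits: a closed form stays correct under wraparound as long as the division by 2 is applied to an even factor before the multiplication, whereas dividing the already wrapped product loses the top bit.

```c
#include <assert.h>
#include <limits.h>

/* Reference: the loop itself; unsigned arithmetic wraps modulo 2^N. */
unsigned long wrap_loop(unsigned long a, unsigned long b)
{
    unsigned long sum = 0;
    for (; a < b; a++)
        sum += a;
    return sum;
}

/* Naive closed form: both products wrap first, so the division by 2
   is applied to a value that has already lost its high bits. */
unsigned long wrap_naive(unsigned long a, unsigned long b)
{
    return (b * (b - 1) - a * (a - 1)) / 2;
}

/* n*(n-1)/2 modulo 2^N: halve whichever factor is even, then multiply. */
static unsigned long tri(unsigned long n)
{
    return (n % 2 == 0) ? (n / 2) * (n - 1) : n * ((n - 1) / 2);
}

/* Closed form that agrees with the loop modulo 2^N. */
unsigned long wrap_exact(unsigned long a, unsigned long b)
{
    return (a >= b) ? 0 : tri(b) - tri(a);
}
```

For a = ULONG_MAX - 5 and b = ULONG_MAX the loop adds five values that are congruent to -6 through -2, so it returns 0UL - 20; wrap_exact gives the same value and wrap_naive does not.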
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #12 from pinskia at gmail dot com
2010-10-26 15:56:20 UTC ---
On Oct 26, 2010, at 7:30 AM, "j...@jak-linux.org" wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
>
> --- Comment #1 from Julian Andres Klode
> 2010-10-26 1
This multi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #11 from Paolo Carlini 2010-10-26
15:42:58 UTC ---
Can we please stop talking about nano and giga numbers like kids? If an
optimization like complete loop unrolling is involved, of course very small or
large numbers can be involved, do
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
Jakub Jelinek changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org
--- Comment #10
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #9 from Jonathan Wakely 2010-10-26
15:28:51 UTC ---
(In reply to comment #8)
>
> Since the optimization seems to be mostly there in -O3, it's just a matter of
> enabling it in -O2.
Or if you want all optimisations, it's just a matt
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #8 from Julian Andres Klode 2010-10-26
15:25:56 UTC ---
(In reply to comment #6)
> You get this kind of speedup if the compiler knows that the result of the loop
> is
>
> sum=(b*(b-1)-a*(a-1))/2
>
> In which case the timing is meani
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #7 from Julian Andres Klode 2010-10-26
15:00:37 UTC ---
(In reply to comment #5)
> (In reply to comment #4)
> > GCC's output is significantly faster at -O3 or without the noinline
> > attribute
>
> I just tested and at -O3, gcc-4.4
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #6 from Dominique d'Humieres 2010-10-26
14:59:18 UTC ---
You get this kind of speedup if the compiler knows that the result of the loop
is
sum=(b*(b-1)-a*(a-1))/2
In which case the timing is meaningless (it is 0.000s on my laptop),
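The closed form quoted in this comment can be checked against the loop it replaces. Below is a minimal sketch in C, assuming the loop adds a to sum while a < b (a shape reconstructed from the snippets in this thread, not copied from the test case); sum_loop, sum_closed and half_product are hypothetical names. The halving is done on the even factor before multiplying so that the result also holds when the products wrap.

```c
#include <assert.h>

/* Assumed shape of the loop under discussion: add every i in [a, b). */
unsigned long sum_loop(unsigned long a, unsigned long b)
{
    unsigned long sum = 0;
    for (; a < b; a++)
        sum += a;
    return sum;
}

/* n*(n-1)/2, computed by halving the even factor before multiplying. */
static unsigned long half_product(unsigned long n)
{
    return (n % 2 == 0) ? (n / 2) * (n - 1) : n * ((n - 1) / 2);
}

/* sum = (b*(b-1) - a*(a-1)) / 2, with an empty range giving 0. */
unsigned long sum_closed(unsigned long a, unsigned long b)
{
    if (a >= b)
        return 0;
    return half_product(b) - half_product(a);
}
```

For a = 3 and b = 10 both functions return 42 (3 + 4 + ... + 9), and the closed form does so in constant time, which is why timing the loop stops being informative once the compiler finds it.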
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #5 from Julian Andres Klode 2010-10-26
14:53:24 UTC ---
(In reply to comment #4)
> GCC's output is significantly faster at -O3 or without the noinline attribute
I just tested and at -O3, gcc-4.4 creates slow code and gcc-4.5 fast cod
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #4 from Jonathan Wakely 2010-10-26
14:47:12 UTC ---
GCC's output is significantly faster at -O3 or without the noinline attribute
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #3 from Julian Andres Klode 2010-10-26
14:32:27 UTC ---
System information:
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-5'
--with-bugurl=file:///usr/share/doc/gc
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
Paolo Carlini changed:
What|Removed |Added
CC||paolo.carlini at oracle dot
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #1 from Julian Andres Klode 2010-10-26
14:30:24 UTC ---
Created attachment 22162
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22162
Clang's assembler
Attaching the assembler output from clang, it should help understand which
op