--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-03-05 19:03 ---
Subject: Re: [4.0/4.1 Regression] threefold
performance loss, not inlining as much
steven at gcc dot gnu dot org wrote:
> --- Additional Comments From steven at gcc dot gnu
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-04-07 12:50 ---
Subject: Re: Aliasing says stores to local
memory do alias
On 7 Apr 2005, dberlin at dberlin dot org wrote:
>
> --- Additional Comments From dberlin at gcc dot gnu dot org
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-09-17 14:17 ---
Please add -fno-strict-aliasing to the build CFLAGS. I bugged the
Debian people to do this once, and it fixed all such issues.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id
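For context, here is a minimal sketch of the kind of code that usually prompts a request for -fno-strict-aliasing: reading an object through a pointer to an unrelated type. The helper name float_bits is hypothetical and not taken from the bug report, and the sketch assumes a 32-bit IEEE 754 float. The memcpy form shown is the usual portable alternative and should behave the same with or without the flag.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not from the bug report.
   The aliasing-unsafe form would be
       return *(const uint32_t *) &f;
   which reads a float object through an unrelated pointer type.
   Copying the bytes instead is well-defined regardless of the
   aliasing flags in CFLAGS.  */
uint32_t float_bits(float f)
{
  uint32_t u;
  assert(sizeof u == sizeof f);   /* assumes a 32-bit float */
  memcpy(&u, &f, sizeof u);
  return u;
}
```

Code written this way does not need -fno-strict-aliasing; the flag is mainly a workaround for code bases that still use the pointer-cast form.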
memory
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rguenth at tat dot physik dot
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-09-17 17:07 ---
ipa-eh patch from
http://gcc.gnu.org/ml/gcc-patches/2005-09/msg00881.html
(with fix) does not really help.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23928
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-09-17 18:00 ---
eh-complexity patch from
http://gcc.gnu.org/ml/gcc-patches/2005-06/msg01052.html
slightly edited to apply (and approved by GeoffK in June) helps:
peak memory usage is down to 1.2GB
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-09-17 18:43 ---
Extra ggc_collect after each optimize_inline_calls does not help reduce it
further.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23928
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-09-17 19:31 ---
Please instead fix the caller that is not folding the condition in the
first place.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23049
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-08-11 17:43 ---
I'll do that. Though
+ /* If we don't have , then we cannot
+optimize this case. */
+ if ((cond_code == NE_EXPR || cond_code
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-08-12 13:02 ---
Subject: Re: Aliasing can not tell array members
apart
On 12 Aug 2005, giovannibajo at libero dot it wrote:
> Can you document what's the compile-time effect of raising sa
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-08-13 14:17 ---
The problem is, we end up with
void g(A*) (a)
{
struct A D.1608;
:
D.1608 = *a;
f (D.1608) [tail call];
return;
}
after the tree optimizers. f (*a) would not be gimple, so
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-08-13 18:11 ---
With the copy ctor we end up with
void g(A*) (a)
{
struct A D.1603;
:
__comp_ctor (&D.1603, a);
f (&D.1603);
return;
}
which confuses me a bit, because here the prot
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-08-13 18:16 ---
Indeed - adding a destructor (or anything else that makes it a non-POD) "fixes"
the problem, too.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23372
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-06-01 08:16 ---
Subject: Re: [4.0/4.1 regression] __builtin_constant_p(&"Hello"[0])?1:-1
not compile-time constant
On 1 Jun 2005, pinskia at gcc dot gnu dot org wrote:
>
>
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2004-12-16 17:08 ---
With the attached patch, for -O3 -funroll-loops -ffast-math we produce in .vars
float foobar() ()
{
:
return a.array[3] * b.array[3] + a.array[2] * b.array[2] + a.array[0] *
b.array[0
The testcase
int foo(int bar)
{
int i, res = 0;
for (i=0; i
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19131
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2004-12-22 18:16 ---
Subject: Re: alloca returning unnecessarily aligned pointer
and uses too much memory
pinskia at gcc dot gnu dot org wrote:
> --- Additional Comments From pinskia at gcc dot
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2004-12-23 22:23 ---
Subject: Re: alloca returning unnecessarily aligned pointer
and uses too much memory
pinskia at gcc dot gnu dot org wrote:
> --- Additional Comments From pinskia at gcc dot
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-12 09:35 ---
I guess we won't ever fix this for 3.3 and new-ra is dead, so this is "fixed".
--
What|Removed
--
Bug 13246 depends on bug 10469, which changed state.
Bug 10469 Summary: constant V4SF loads get moved inside loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10469
What|Old Value |New Value
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-12 09:44 ---
What is the status on this issue? I.e. +,-,*,/ on vector types for C++? Note
that trying to work around this missing feature with operator overloading like
v4sf operator+(const v4sf
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-12 11:05 ---
I can re-confirm that the patch moves 3.4 to the state of 3.3 - i.e. with an
extra imull compared to 2.95 and 4.0. The patch has bootstrapped with checking
enabled and -funroll-loops on
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rguenth at tat dot physik dot uni-tuebingen dot de
CC: gcc-bugs at gcc dot gnu dot org
http://gcc.g
--
What                    |Removed |Added
OtherBugsDependingOnThis|        |11706
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19401
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-12 16:17 ---
Current status is that with -O2 on mainline we generate the same
(better) code for ::pow(x, 2) and std::pow(x, 2.0) than for
std::pow(x, 2) which loses because of the lack of unrolling
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-12 16:19 ---
In 3.4 one was able to do this by specifying -fpeel-loops and got complete loop
peeling enabled. In 4.0 this is also the case, but only for the RTL unroller -
the tree unroller is not
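As an illustration of what complete peeling or unrolling means here, the following sketch shows a loop with a small constant trip count next to the straight-line code a compiler could conceptually turn it into. The function names and the trip count N are assumptions made for this example; they do not come from the bug report, and the sketch says nothing about which GCC pass performs the transformation.

```c
#include <assert.h>

#define N 4   /* small constant trip count, chosen for illustration */

/* Rolled form: what the programmer writes.  */
int dot_rolled(const int *a, const int *b)
{
  int i, res = 0;
  for (i = 0; i < N; i++)
    res += a[i] * b[i];
  return res;
}

/* What complete unrolling would conceptually produce: no induction
   variable, no exit test, one expression per iteration.  */
int dot_unrolled(const int *a, const int *b)
{
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}
```

Both functions compute the same value; the second form is what later scalar optimizations get to work on once the loop is gone.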
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-12 16:24 ---
Or stuff often found in C++ libraries:
template
struct Vector
{
Vector(float init)
{
for (int i=0; i
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19401
--
What|Removed |Added
CC  |        |rguenth at tat dot physik dot uni-tuebingen dot de
http
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rguenth at tat dot physik dot uni-tuebingen dot de
CC: gcc-bugs at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19507
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-18 16:39 ---
A C testcase with the missing jump threading(?):
void bar(void);
void foo(const _Bool *flag)
{
if (*flag)
bar();
if (*flag)
bar();
}
a
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-18 20:10 ---
Subject: Re: missed tree-optimization
pinskia at gcc dot gnu dot org wrote:
> --- Additional Comments From pinskia at gcc dot gnu dot org 2005-01-18
> 20:06 ---
>
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rguenth at tat dot physik dot uni-tuebingen dot de
CC: gcc-bugs at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19516
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-18 22:29 ---
Done. PR19516.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19507
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-20 10:57 ---
This is also somewhat related to PR19401 as we do not unroll loops completely
with just -O2 at the moment, which is important for the second testcase.
--
http://gcc.gnu.org/bugzilla
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-20 14:57 ---
Subject: Re: unrolling happens too late/SRA
does not happen late enough
> Note PR 18755 blocks this if we go the SRA after loop optimization which
> seems like a better idea.
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-20 15:15 ---
Subject: Re: unrolling happens too late/SRA
does not happen late enough
On 20 Jan 2005, dberlin at dberlin dot org wrote:
> Wait, why are we running SRA twice again at all?
> I
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-21 16:07 ---
Experimenting with SRA inside loop together with cleanup passes after
cunroll/sra didn't reveal anything good - even with loop cfg_cleanup patched in.
See thread starting at
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-23 11:13 ---
How come that, if I change _Bool to int, after tree-optimizations we get
foo (flag)
{
int D.1121;
:
D.1121_2 = *flag_1;
if (D.1121_2 != 0) goto ; else goto ;
:;
bar ();
D
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-24 09:43 ---
Another one - matrix multiplication:
/* A [NxM], B [MxP] */
#define DOLOOP(N, M, P) \
void matmul ## N ## M ## P(double *res, const double *A, const double *B) \
{ \
int i,j,k
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rguenth at tat dot physik dot uni-tuebingen dot de
CC: gcc-bugs at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19624
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-25 14:45 ---
Created an attachment (id=8060)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8060&action=view)
testcase
The testcase is reduced from a complex POOMA program.
--
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-25 14:52 ---
Oh, in principle this should compile to roughly the same as
void c_test(double *a, double *b, int ei, int ej, int stridea, int strideb)
{
for (int j=0; j
http://gcc.gnu.org/bugzilla
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-25 15:27 ---
I guess making PRE and ivopts play nicely together perfectly is next to
impossible - but any improvement in the 4.0 timeframe is welcome!
--
http://gcc.gnu.org/bugzilla
riable.
--
Summary: Aliasing says stores to local memory do alias
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-25 16:57 ---
Created an attachment (id=8062)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8062&action=view)
testcase
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19626
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-26 08:47 ---
Subject: Re: Aliasing says stores to local
memory do alias
> D.2540 = (struct Loc<1> *) &dX.D.2210.D.2166.domain_m.buffer;
> That confuses the aliasing mechanism
>
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-26 10:24 ---
Subject: Re: [4.0 Regression] Many C++ compile-time
regressions for MICO's ORB code
> Bah, I hate profiles for "cc1plus -O2 ir.ii" without peaks:
>
> CPU
Summary: Missed constant propagation with placement new
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rgu
Summary: Funny (horrible) code for empty destructor
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rguenth at tat dot physik dot uni-t
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-26 14:10 ---
We can also not fold &i[0] == &i[1] to false in
int foo(void)
{
int i[2];
if (&i[0] == &i[1])
return 1;
return 0;
}
or i+0
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-26 14:35 ---
Could we, in general, fold &a[i] TRUTHOP &a[j] to i TRUTHOP j? I guess the
only special case would be for sizeof(a[i]) == 0 -- but that is not allowed
by the standard? I'
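The proposed fold can be sanity-checked with a small sketch: for two elements of the same array, comparing their addresses should give the same answer as comparing their indices. The function name below is hypothetical, and the sketch assumes a non-zero element size, which is the special case the comment above raises. It checks ==, < and <= only, over indices 0..n, where n is the one-past-the-end position.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if, for every pair of indices 0..n into the same array
   (n itself is the one-past-the-end position), comparing the element
   addresses gives the same answer as comparing the indices.  */
int addr_cmp_matches_index_cmp(const int *a, size_t n)
{
  size_t i, j;
  for (i = 0; i <= n; i++)
    for (j = 0; j <= n; j++)
      {
        if ((&a[i] == &a[j]) != (i == j))
          return 0;
        if ((&a[i] < &a[j]) != (i < j))
          return 0;
        if ((&a[i] <= &a[j]) != (i <= j))
          return 0;
      }
  return 1;
}
```

This only demonstrates the equivalence at run time for one array; it is not a statement about how fold implements or should implement the transformation.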
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-26 14:54 ---
Subject: Re: fold misses that two ADDR_EXPR of
an arrary obvious not equal
On 26 Jan 2005, pinskia at gcc dot gnu dot org wrote:
> (In reply to comment #5)
> > Could we, i
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-26 15:30 ---
Ok - I guess it's ARRAY_REFs that are not folded ;) So the summary could be
"fold misses that two ARRAY_REFs with different offset of the same array are
obviously not equal"
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-26 16:16 ---
Umm, no. We fold the ARRAY_REF comparison to
PLUS_EXPR(ADDR_EXPR, INTEGER_CST) == PLUS_EXPR(ADDR_EXPR, INTEGER_CST)
oh well ;) So I guess transforming &a + i truth_op &a
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-26 17:24 ---
Hmm, it seems it causes
stage1/xgcc -Bstage1/ -B/usr/local/i686-pc-linux-gnu/bin/ -c -O2 -g
-fomit-frame-pointer -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes
-Wmissing
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-26 18:03 ---
Fails without the patch, too, with the same error.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15791
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-27 14:53 ---
Bootstrapping and testing completed successfully, but for the testcase
int g(void)
{
struct { int b[2]; } x;
return &x.b[0] == &x.b[1];
}
we have lowered the compa
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-28 14:21 ---
Folding &x.foo[2] == &x.foo to false does not help the testcase, as fold
never sees this comparison. Instead the initial code the C++ frontend
creates for ctor and dtor o
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-28 14:26 ---
One patch for empty-loop removal was posted here by Zdenek
http://gcc.gnu.org/ml/gcc-patches/2004-07/msg01679.html
--
What|Removed |Added
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-28 15:29 ---
Looking into it.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19402
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-29 21:14 ---
Or we could simply unroll the loop completely, but while SCEV finds
the IV as
(set_scalar_evolution
(scalar = this_6)
(scalar_evolution = {(struct Foo * const) &x.foo[2]
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-01-30 18:54 ---
Subject: Re: Funny (horrible) code for empty
destructor
pinskia at gcc dot gnu dot org wrote:
> --- Additional Comments From pinskia at gcc dot gnu dot org 2005-01-29
>
ReportedBy: rguenth at tat dot physik dot uni-tuebingen dot de
CC: gcc-bugs at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19775
--
What|Removed |Added
Severity|normal |critical
Keywords||wrong-code
Known to fail|
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-02-03 17:32 ---
Subject: Re: [4.0 Regression] threefold performance
loss, not inlining as much
bonzini at gcc dot gnu dot org wrote:
> To the reporter: in this case you probably want __attribut
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2005-02-07 13:25 ---
Fixed.
--
What|Removed |Added
Status|NEW
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de
2004-10-16 15:42 ---
Subject: Re: alignof and sizeof (and other expressions) in
attributes does not compile inside template classes
giovannibajo at libero dot it wrote:
> --- Additional Comments F
dle struct initializer
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rguenth at tat dot physik dot uni-tuebingen dot
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de
2004-10-17 21:51 ---
Created an attachment (id=7369)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=7369&action=view)
testcase
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18042
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de
2004-10-25 13:02 ---
Subject: Re: [4.0 Regression] [tree-ssa] Many
C++ compile-time regression in 4.0-tree-ssa 040120
And
http://gcc.gnu.org/ml/gcc/2004-10/msg00955.html
--
http://gcc.gnu.org
Status: UNCONFIRMED
Severity: enhancement
Priority: P2
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rguenth at tat dot physik dot uni-tuebingen dot de
CC: gcc-bugs at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18296
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de
2004-11-04 14:30 ---
Subject: Re: Misleading diagnostic for recursive template
instantiation
On 4 Nov 2004, pinskia at gcc dot gnu dot org wrote:
> Confirmed, I think PR 15538 would fix the prob
gcc
Version: 4.0.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rguenth at tat dot physik dot uni-tuebingen dot de
CC: g
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2004-11-29 11:04 ---
Looking at the 3.4 branch, the defaults for the relevant inlining parameters are
the same. So the difference in performance has to be attributed to different
tree-node counting (or to
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2004-11-29 12:10 ---
Documentation patches for 3.4 and mainline are here:
http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02457.html
http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02551.html
--
http
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2004-12-06 09:53 ---
Subject: Re: [4.0 Regression] Inlining limits
cause 340% performance regression
On 6 Dec 2004, pinskia at gcc dot gnu dot org wrote:
> No reason to keep this one open, there is
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2004-12-06 12:33 ---
Subject: Re: [4.0 Regression] Inlining limits
cause 340% performance regression
On 6 Dec 2004, pinskia at gcc dot gnu dot org wrote:
> No reason to keep this one open, there is
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2004-12-06 13:18 ---
Subject: Re: [4.0 Regression] Inlining limits
cause 340% performance regression
On 6 Dec 2004, hubicka at ucw dot cz wrote:
> The cfg inliner per se is not too interesting. W
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2004-12-06 14:31 ---
Subject: Re: [4.0 Regression] Inlining limits
cause 340% performance regression
On 6 Dec 2004, hubicka at ucw dot cz wrote:
> > > the order of inlining decisions affecting
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2004-12-07 14:35 ---
Subject: Re: [4.0 Regression] Inlining limits
cause 340% performance regression
On 6 Dec 2004, hubicka at ucw dot cz wrote:
> Looks like I get 4fold speedup on tree profiling w
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2004-12-07 15:09 ---
Subject: Re: [4.0 Regression] Inlining limits
cause 340% performance regression
On 7 Dec 2004, hubicka at ucw dot cz wrote:
> > Yes, it seems so. Really nice improvement.
--- Additional Comments From rguenth at tat dot physik dot uni-tuebingen
dot de 2004-12-07 15:35 ---
Subject: Re: [4.0 Regression] Inlining limits
cause 340% performance regression
On Tue, 7 Dec 2004, Richard Guenther wrote:
> static inline void foo() {}
> void bar()