https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94026
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #5 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91322
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91322
--- Comment #10 from Wilco ---
(In reply to Christophe Lyon from comment #6)
> Created attachment 48184 [details]
> GCC passes dumps
So according to that, in 105t.vrp1 it removes the branch and unconditionally
calls abort:
Folding statement: _4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94442
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93565
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #6 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66462
--- Comment #12 from Wilco ---
(In reply to Segher Boessenkool from comment #11)
> I currently have
>
> ===
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index ad5135c..bc3d318 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -9050,6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91690
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #6 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91753
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91766
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #4 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #27 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84071
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81443
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #21 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84071
--- Comment #8 from Wilco ---
(In reply to Eric Botcazou from comment #6)
> > They are always written but have an undefined value. Adding 2 8-bit values
> > results in a 9-bit value with WORD_REGISTER_OPERATIONS.
>
> If they have an undefined va
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84114
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90838
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91144
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82853
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #23 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84114
--- Comment #3 from Wilco ---
(In reply to Richard Biener from comment #1)
> This is probably related to targetm.sched.reassociation_width where reassoc
> will widen a PLUS chain so several instructions will be executable in
> parallel
> without
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85669
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #21 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85669
--- Comment #24 from Wilco ---
(In reply to Douglas Mencken from comment #22)
> (In reply to Wilco from comment #21)
>
> > That's odd. The stack pointer is definitely 16-byte aligned in all cases
> > right?
>
> As I know, PowerPC has no specia
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85669
--- Comment #26 from Wilco ---
(In reply to Douglas Mencken from comment #25)
> (In reply to Wilco from comment #24)
>
> > Yes the stage1 compiler would be fine or alternatively use
> > --disable-bootstrap to get an installed compiler.
>
> I’m
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85669
--- Comment #32 from Wilco ---
(In reply to Segher Boessenkool from comment #29)
> It aligns the stack to 16:
>
> # r3 is size, at entry
> addi r3,r3,18
> ...
> rlwinm r3,r3,0,0,27
> ...
> neg r3,r3
> ..
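The visible part of the quoted sequence can be modelled in C as a rough sketch. It covers only the `addi` and `rlwinm` steps shown above; the elided instructions are not modelled, and the function name is hypothetical.

```c
#include <stdint.h>

/* Models only the two visible steps of the quoted PowerPC sequence:
 *   addi   r3,r3,18        -> size + 18
 *   rlwinm r3,r3,0,0,27    -> rotate by 0, keep bits 0..27 (big-endian
 *                             bit numbering), i.e. clear the low 4 bits
 * The name rounded_alloca_size is an assumption for illustration. */
uint32_t rounded_alloca_size(uint32_t size)
{
    return (size + 18u) & 0xFFFFFFF0u;
}
```

Whatever the input, the result is a multiple of 16, which is what "aligns the stack to 16" refers to.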
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85669
--- Comment #33 from Wilco ---
(In reply to Iain Sandoe from comment #30)
> From "Mac_OS_X_ABI_Function_Calls.pdf"
>
> m32 calling convention
>
> Prologs and Epilogs
> The called function is responsible for allocating its own stack frame,
> ma
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85669
--- Comment #38 from Wilco ---
(In reply to Douglas Mencken from comment #37)
> And some more in my wish list. May GCC don’t generate these
>
> .align 2
>
> in text section? Any, each and every powerpc instruction is 32bit-wide, no
> and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85669
--- Comment #41 from Wilco ---
(In reply to Douglas Mencken from comment #40)
> To build it, I patched its sources with fix_gcc8_build.patch reversion
> together with changes from comment #16
So what is the disassembly now? The 2nd diff still s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85669
--- Comment #43 from Wilco ---
(In reply to Douglas Mencken from comment #42)
> (In reply to Wilco from comment #41)
>
> > So what is the disassembly now?
>
> $ /Developer/GCC/8.2p/PowerPC/32bit/bin/gcc -O2 -fno-inline pr78468.c
> -save-temps
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69336
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #13 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
--- Comment #5 from Wilco ---
This still fails on AArch64 in exactly the same way with latest trunk - can
someone reopen this? I don't seem to have the right permissions...
(In reply to Richard Biener from comment #4)
> So - can you please bisect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
--- Comment #6 from Wilco ---
This still fails on AArch64 in exactly the same way with latest trunk - can
someone reopen this? I don't seem to have the right permissions...
(In reply to Richard Biener from comment #4)
> So - can you please bisect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
--- Comment #8 from Wilco ---
In a few functions GCC decides that the assignments in loops are redundant. The
loops still execute but have their loads and stores removed. Eg. the first DO
loop in MP2NRG should be:
.L1027:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
--- Comment #9 from Wilco ---
The loops get optimized away in dom2. The info this phase emits is hard to
figure out, so it's not obvious why it thinks the array assignments are
redundant (the array is used all over the place so clearly cannot be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69619
--- Comment #2 from Wilco ---
Changing to c = 3 generates code after a short time. The issue is recursive
calls to expand_ccmp_expr during the 2 possible options tried to determine
costs. That makes the algorithm exponential.
A fix would be to e
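To see why trying two options recursively at every level becomes exponential, here is a toy model. It is not GCC code: `expand_both` and `count_calls` are hypothetical names, and the model only assumes that each level fully re-expands its subexpression once per option.

```c
/* Toy model, not GCC code: counts expander invocations for a chain of
 * `depth` nested comparisons when both options are fully expanded at
 * every level just to compare their costs. */
static unsigned long calls;

static void expand_both(int depth)
{
    calls++;
    if (depth == 0)
        return;
    expand_both(depth - 1);   /* option 1, expanded to measure its cost */
    expand_both(depth - 1);   /* option 2, expanded to measure its cost */
}

/* Returns the total number of invocations for a chain of the given depth. */
unsigned long count_calls(int depth)
{
    calls = 0;
    expand_both(depth);
    return calls;
}
```

In this model a chain of depth n costs 2^(n+1) - 1 invocations, so each extra comparison roughly doubles the work.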
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69619
--- Comment #3 from Wilco ---
A simple workaround is to calculate cost1 early and only try the 2nd option if
the cost is low (ie. it's not a huge expression that may evaluate into lots of
ccmps). A slightly more advanced way would be to walk prep
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
Since a recent C++ header change abs() no longer gets inlined if we include an
unrelated header before it.
#include
#include
int
wrap_abs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69619
--- Comment #5 from Wilco ---
Proposed patch: https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00206.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69657
--- Comment #5 from Wilco ---
(In reply to Andrew Pinski from comment #4)
> (In reply to Jonathan Wakely from comment #3)
> > Recategorising as component=c++, and removing the regression marker (because
> > the change in libstdc++ that reveals t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
--- Comment #29 from Wilco ---
(In reply to rguent...@suse.de from comment #28)
> On Fri, 5 Feb 2016, alalaw01 at gcc dot gnu.org wrote:
> > Should I raise a new bug for this, as both this and 53068 are CLOSED?
>
> I think this has been discuss
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
--- Comment #41 from Wilco ---
(In reply to Jerry DeLisle from comment #40)
> Do you have a reduced test case of the Fortran code we can look at?
See comment 13/14, the same common array is declared with different sizes in
various places.
> I a
: target
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
The following example generates very inefficient code on AArch64:
int f1(int i) { int p[1000]; p[i] = 1; return p[i + 10] + p[i + 20]; }
f1:
sub sp, sp, #4000
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #1 from Wilco ---
The regression seems to have appeared on trunk around Feb 3-9.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #5 from Wilco ---
(In reply to amker from comment #4)
> (In reply to ktkachov from comment #3)
> > Started with r233136.
>
> That's why I forced base+offset out of memory reference and kept register
> scaling in in the first place.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70055
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70055
--- Comment #5 from Wilco ---
(In reply to Jakub Jelinek from comment #3)
> If some arch in glibc implements memcpy.S and does not implement mempcpy.S,
> then obviously the right fix is to add mempcpy.S for that arch, usually it
> is just a matte
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70055
--- Comment #6 from Wilco ---
(In reply to Jakub Jelinek from comment #4)
> Note the choice of this in a header file is obviously wrong, if you at some
> point fix this up, then apps will still call memcpy rather than mempcpy,
> even when the lat
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70055
--- Comment #9 from Wilco ---
(In reply to H.J. Lu from comment #8)
> Inlining mempcpy uses a callee-saved register:
>
...
>
> Not inlining mempcpy is preferred.
If codesize is the only thing that matters... The cost is not at the caller
side
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #12 from Wilco ---
(In reply to Jiong Wang from comment #11)
> (In reply to Richard Henderson from comment #10)
> > Created attachment 37890 [details]
> > second patch
> >
> > Still going through full testing, but I wanted to post th
-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
The expansion of __builtin_mempcpy is inefficient on many targets (eg. AArch64,
ARM, PPC). The issue is due to not using the same expansion options that memcpy
uses in
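For reference, mempcpy differs from memcpy only in its return value, which is why the two can share an expansion. A minimal sketch, with a hypothetical helper name:

```c
#include <stddef.h>
#include <string.h>

/* Illustrative helper (name is an assumption): mempcpy semantics
 * expressed through memcpy. memcpy returns dst; mempcpy returns the
 * address one past the last byte written, i.e. dst + n. */
void *mempcpy_via_memcpy(void *dst, const void *src, size_t n)
{
    return (char *)memcpy(dst, src, n) + n;
}
```

Any inline expansion that is valid for memcpy is therefore valid here too, followed by a single pointer addition.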
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #15 from Wilco ---
(In reply to Richard Biener from comment #14)
> The regression in the original description looks severe enough to warrant
> some fixing even if regressing some other cases.
Agreed, I think the improvement from Rich
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #17 from Wilco ---
(In reply to Jiong Wang from comment #16)
> * for the second patch at #c10, if we always do the following no matter
> op0 is virtual & eliminable or not
>
> "op1 = force_operand (op1, NULL_RTX);"
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #20 from Wilco ---
(In reply to Richard Henderson from comment #19)
> I wish that message had been a bit more complete with the description
> of the performance issue. I must guess from this...
>
> > ldr dst1, [reg_base1, reg_ind
: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
GCC emits the same code for caller-saves in all cases, even if the caller-save
is an immediate which can be trivially rematerialized. The caller-save code
should
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
The following code in ira-costs.c tries to improve the memory cost for
rematerializeable loads. There are several issues with this though:
1. The memory cost can
-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
GCC uses a very basic check to determine whether to use a switch table. A
simple example from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11823 still
generates a huge table with
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70861
--- Comment #3 from Wilco ---
(In reply to Andrew Pinski from comment #2)
> Note I think if we had gotos instead of assignment here we should do the
> similar thing for the switch table itself.
Absolutely, that was my point.
> Note also the ass
: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
IVOpt chooses between using indexing for induction variables or incrementing
pointers. Due to the way loop unrolling works, a decision that is optimal if
unrolling is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70946
--- Comment #1 from Wilco ---
PR36712 seems related to this
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
When deciding which register to use regrename.c calls the target function
preferred_rename_class. However in pass 2 in find_rename_reg it then just
ignores this preference
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70961
--- Comment #3 from Wilco ---
(In reply to Eric Botcazou from comment #2)
> Pass #2 ignores it since the preference simply couldn't be honored.
In which case it should not rename that chain rather than just ignore the
preference (and a preferen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70961
--- Comment #5 from Wilco ---
As for a simple example, Proc_4 in Dhrystone is a good one. With -O2 and
-fno-rename-registers I get the following on Thumb-2:
00c8 :
c8: b430 push {r4, r5}
ca: f240 0300 movw r3,
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
When assigning the same immediate value to different registers, GCC will always
CSE the immediate and emit a register move for subsequent uses. This creates
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
With -Ofast GCC doesn't reassociate constant multiplies or negates away from
divisors to allow for more reciprocal division optimizations. It is also
possible to avoid division
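As a hand-written illustration of the kind of reassociation meant here (these functions are assumptions, not the examples from the report), the constant factor or the negate can be moved from the divisor to the dividend, leaving a plain division by y:

```c
/* Hypothetical illustration only. With -Ofast style semantics the
 * compiler is allowed to treat each pair as equivalent. */

/* Constant multiply inside the divisor. */
double div_original(double x, double y) { return x / (y * 4.0); }

/* Constant moved to the dividend: the divisor is now just y, so its
 * reciprocal could be shared with other divisions by y. */
double div_reassoc(double x, double y) { return (x * 0.25) / y; }

/* Negate inside the divisor. */
double neg_original(double x, double y) { return x / -y; }

/* Negate moved to the dividend. */
double neg_reassoc(double x, double y) { return -x / y; }
```

A power-of-two constant is used so the two forms agree for ordinary inputs; with other constants the results can differ in the last bits, which is why this needs fast-math style flags.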
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71022
--- Comment #2 from Wilco ---
(In reply to Richard Biener from comment #1)
> IRA might choose to do this as part of life-range splitting/shortening. Note
> that reg-reg moves may be cheaper code-size wise (like on CISC archs with
> non-fixed ins
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
There are 2 new failures in the tail-call-2.c test on recent trunk builds:
FAIL: gcc.dg/plugin/must-tail-call-2.c -fplugin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71951
--- Comment #11 from Wilco ---
(In reply to Icenowy Zheng from comment #10)
> In my environment (glibc 2.25, and both the building scripts of glibc and
> gcc have -fomit-frame-pointer automatically enabled), this bug is not fully
> resolved yet.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #38 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468
--- Comment #40 from Wilco ---
(In reply to Eric Botcazou from comment #39)
> > The existing alloca code relies on STACK_BOUNDARY being set correctly. Has
> > the value been fixed already for the OS variants mentioned? If stack
> > alignment can'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77484
--- Comment #29 from Wilco ---
(In reply to Jan Hubicka from comment #28)
> > On SPEC2000 the latest changes look good, compared to the old predictor gap
> > improved by 10% and INT/FP by 0.8%/0.6%. I'll run SPEC2006 tonight.
>
> It is rather su
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77484
--- Comment #31 from Wilco ---
(In reply to Jan Hubicka from comment #30)
> >
> > When I looked at gap at the time, the main change was the reordering of a
> > few
> > if statements in several hot functions. Incorrect block frequencies also
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71951
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #8 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81357
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #7 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82439
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82479
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #6 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #4 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
--- Comment #8 from Wilco ---
> /home/qinzhao/Install/latest/bin/gcc -O2 t_p_1.c t_p.c
> non-inlined version
> 20.84user 0.00system 0:20.83elapsed 100%CPU (0avgtext+0avgdata
> 360maxresident)k
> 0inputs+0outputs (0major+135minor)pagefaults 0swap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
--- Comment #9 from Wilco ---
(In reply to Qing Zhao from comment #7)
str(n)cmp with a constant string can be changed into memcmp if the string has a
known alignment or is an array of known size. We should check the common cases
are implemented.
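A small sketch of the array-of-known-size case, using a hypothetical 8-byte buffer: because the array is at least as large as the constant string plus its terminator, reading 4 bytes is always in bounds and memcmp gives the same ordering as strcmp.

```c
#include <string.h>

/* Hypothetical buffer: an array of known size (8 bytes), so its first
 * 4 bytes can always be read safely. */
char buf[8];

/* Original form. */
int cmp_with_strcmp(void) { return strcmp(buf, "abc"); }

/* Equivalent form: compare strlen("abc") + 1 = 4 bytes, terminator
 * included. The first differing byte is the same one strcmp would stop
 * at, so the sign of the result matches. */
int cmp_with_memcmp(void) { return memcmp(buf, "abc", 4); }

/* Reduces a comparison result to -1, 0 or 1. */
int sign_of(int v) { return (v > 0) - (v < 0); }
```

Only the sign of the result is guaranteed to match, which is all that strcmp callers may rely on.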
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
--- Comment #16 from Wilco ---
(In reply to Qing Zhao from comment #15)
> (In reply to Wilco from comment 14)
> > The only reason we have to do a character by character comparison is
> > because we
> > cannot read beyond the end of a string. How
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
--- Comment #18 from Wilco ---
(In reply to Qing Zhao from comment #17)
> (In reply to Wilco from comment #16)
>
> >> const char s[8] = "abcd\0abc"; // null byte in the middle of the string
> >> int f2(void) { return __builtin_strcmp(s, "abc")
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
The __builtin_eh_return implementation on AArch64 generates incorrect code for
many cases due to using an incorrect offset/pointer when writing the new return
address to the stack
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77455
Wilco changed:
What|Removed |Added
Target||AArch64
Known to fail|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66946
Wilco changed:
What|Removed |Added
Status|WAITING |RESOLVED
Resolution|---
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
Changes in the static branch predictor (around August last year) caused
regressions on SPEC2000. The PRED_CALL predictor causes GAP to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65068
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #3 from
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
The recently introduced code hoisting aggressively moves common subexpressions
that might otherwise be mergeable with other
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77568
--- Comment #3 from Wilco ---
(In reply to Andrew Pinski from comment #1)
> I think this is just a pass ordering issue. We create fmas after PRE.
> Maybe we should do it both before and after ...
> Or enhance the pass which produces FMA to walk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77568
--- Comment #5 from Wilco ---
(In reply to Andrew Pinski from comment #2)
> Note there are two different issues here.
Well they are 3 examples of the same underlying issue - don't do a CSE when
it's not profitable. How they are resolved might be
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
A commonly used benchmark contains a hot loop which calls one of 2 virtual
functions via a static variable which is set just before. A reduced example is:
int f1(int x) { return x + 1; }
int f2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32650
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78041
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78041
--- Comment #4 from Wilco ---
(In reply to Bernd Edlinger from comment #3)
> (In reply to Wilco from comment #2)
> > (In reply to Bernd Edlinger from comment #1)
> > > some background about this bug can be found here:
> > >
> > > https://gcc.gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78041
--- Comment #8 from Wilco ---
(In reply to Bernd Edlinger from comment #7)
> (In reply to Richard Earnshaw from comment #6)
> > (In reply to Bernd Edlinger from comment #5)
> > > (In reply to Wilco from comment #4)
> > > > However dealing with pa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
--- Comment #12 from Wilco ---
It looks like we need a different approach: I've seen the extra SETs use up
more registers in some cases, and in other cases being optimized away early
on...
Doing shift expansion at the same time as all other DI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78041
--- Comment #11 from Wilco ---
(In reply to ktkachov from comment #10)
> Confirmed then. Wilco, if you're working on this can you please assign it to
> yourself?
Unfortunately the form doesn't allow me to do anything with the headers...
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
--- Comment #14 from Wilco ---
(In reply to Bernd Edlinger from comment #13)
> I am still trying to understand why thumb1 seems to outperform thumb2.
>
> Obviously thumb1 does not have the shiftdi3 pattern,
> but even if I remove these from thum
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
--- Comment #32 from Wilco ---
(In reply to Bernd Edlinger from comment #31)
> Sure, combine can't help, especially because it runs before split1.
>
> But I wondered why this peephole2 is not enabled:
>
> (define_peephole2 ; ldrd
> [(set (matc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862
--- Comment #4 from Wilco ---
(In reply to Vladimir Makarov from comment #3)
> But I can not just revert the patch making ALL_REGS available to make
> coloring heuristic more fortunate for your particular case, as it reopens
> the old PR for which
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862
--- Comment #13 from Wilco ---
(In reply to Vladimir Makarov from comment #9)
> Created attachment 35503 [details]
> ira-hook.patch
>
> Here is the patch. Could you try it and give me your opinion about it.
> Thanks.
I tried it out and when f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304
--- Comment #33 from Wilco ---
(In reply to Evandro from comment #32)
> (In reply to Ramana Radhakrishnan from comment #31)
> > (In reply to Evandro from comment #30)
> > > The performance impact of always referring to constants as if they were
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63503
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #6 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63503
--- Comment #10 from Wilco ---
The loops shown are not the correct inner loops for those options - with
-ffast-math they are vectorized. LLVM unrolls 2x but GCC doesn't. So the
question is why GCC doesn't unroll vectorized loops like LLVM?
GCC: