I stumbled across the following and was interested in others' thoughts
before I proceed any further.
The algorithm described in the comments of bb-reorder.c (and the paper
cited) talks about two parameters for controlling which blocks will be added
to traces. "Branch threshold" refers to the b
Steven Bosscher <[EMAIL PROTECTED]> wrote on 04/13/2005 09:39:55 AM:
> On Wednesday 13 April 2005 00:18, Pat Haugen wrote:
> > When we have a test block gating whether a loop should be
> > entered, the new block frequency check causes the code to pick the
> > non-loop
>
Steven Bosscher <[EMAIL PROTECTED]> wrote on 04/13/2005 02:10:07 PM:
>
> The problem with your original proposal is that computing
> post-dominance information really is expensive. Depending
> on how often this 50/50 case happens, in a real profile, it
> may or may not be worth the cost do as
Sorry if there is an obvious answer for this, but I'm not too familiar with
the tree-ssa phase.
My question is, why doesn't tree-ssa-dse.c do anything to the following
code?
int i,j,k,l;
void p1() {
i = 1; /* Dead store */
j = 2;
i = k; /* Dead store after copy prop */
if (i == 1)
cc1: warnings being treated as errors
/home/pthaugen/work/src/mainline/gcc/gcc/config/rs6000/rs6000.c:12538:
warning: ‘rs6000_invalid_within_doloop’ defined but not used
-Pat
[EMAIL PROTECTED] wrote on 06/09/2005 02:43:37 PM:
> cc1: warnings being treated as errors
> /home/pthaugen/work/src/mainline/gcc/gcc/config/rs6000/rs6000.c:12538:
> warning: ‘rs6000_invalid_within_doloop’ defined but not used
ChangeLog looks odd on this, Adrian changed the name of prot
The following patch fixes the code to match the commentary of the algorithm
such that block frequency is used instead of edge frequency. The change is
pretty much neutral on SPEC, max differences of +/- 1% with most showing no
difference.
Bootstrapped and regtested on powerpc64-unknown-linux-
It appears to me that both loop invariant motion passes (tree/rtl) don't
look at basic block frequencies and will gladly hoist invariant code
from a cold block within a loop. This can impact performance by
executing (possibly costly) code that would otherwise not be executed,
adds another regis
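To make the cost concrete, here is a minimal C sketch (not GCC code) that performs by hand the kind of hoisting described above. The names `expensive`, `loop_original`, `loop_hoisted`, and the counter `expensive_calls` are all hypothetical and exist only for illustration; the call to `expensive` stands in for any pure but costly invariant expression.

```c
#include <stddef.h>

/* Hypothetical counter: how many times the invariant computation ran. */
static unsigned long expensive_calls;

/* Stand-in for a costly loop-invariant computation (hypothetical). */
static long expensive(long x)
{
    expensive_calls++;
    return x * x + 7;
}

/* Original form: the invariant is evaluated only inside the cold block. */
long loop_original(const int *flags, size_t n, long x)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (flags[i])                /* cold path: assumed rarely taken */
            sum += expensive(x);
        else
            sum += 1;
    }
    return sum;
}

/* Hoisted form: the invariant is evaluated once before the loop,
   even when the cold block never executes, and its result stays
   live across the whole loop. */
long loop_hoisted(const int *flags, size_t n, long x)
{
    long inv = expensive(x);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += flags[i] ? inv : 1;
    return sum;
}
```

Both functions return the same value, but when no element of `flags` is set, `loop_original` never evaluates the invariant while `loop_hoisted` still pays for it once and keeps `inv` live for the duration of the loop.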
On 11/06/2014 01:00 PM, Richard Biener wrote:
Shouldn't we never hoist anything from a bb with lower execution frequency
to a bb with higher one? It seems LIM simply assumes that inside a loop
is always higher frequency than outside of it.
So - why artificially have that factor of 0.1 instead o
I'm seeing some odd behavior in ira for PowerPC, starting with the big ira merge
best I can tell (r171649).
void foo(float *f1, float*f2) {
*f1 = *f2;
}
If I compile with gcc -S -m64 -O3 -mcpu=power7 and look at the ira dump, I see
that the pseudo used to copy the data, r120, is spilled. Rel
On 05/16/2011 04:19 PM, Georg-Johann Lay wrote:
Pat Haugen schrieb:
I'm seeing some odd behavior in ira for PowerPC, starting with the big ira
merge best I can tell (r171649).
void foo(float *f1, float*f2) {
*f1 = *f2;
}
If I compile with gcc -S -m64 -O3 -mcpu=power7 and look at th
On 05/17/2011 11:07 AM, Vladimir Makarov wrote:
Thanks for pointing this out, Pat. Your patch could fix this particular problem
but using GENERAL_REGS only is wrong. The final allocno class should be
NON_SPECIAL_REGS. I will search for a better solution. Unfortunately, such
changes in the cod
I'm looking into a case where TER is forward propagating a series of
additions across a call.
extern void foo(void);
int bar(int a, int b, int c, int d, int e, int f, int g, int h) {
int ret;
ret = a + b + c + d + e + f + g + h;
foo();
return ret; /* 'ret' use replaced by rhs above */
On 10/18/2010 10:33 AM, Jeff Law wrote:
On 10/18/10 09:22, David Edelsohn wrote:
On Mon, Oct 18, 2010 at 8:27 AM, Nathan
Froyd wrote:
On Mon, Oct 18, 2010 at 02:49:21PM +0800, Jie Zhang wrote:
3. The aforementioned rs6000 hack rs6000_issue_rate was added by
2003-03-03 David Edelsohn
On 10/20/2010 7:48 PM, Jie Zhang wrote:
Running CPU2006, with the hack removed I see about a 1% improvement in
specint (10% in 456.hmmer, a couple others in the 3% range, -3%
401.bzip2) and a 1% degradation in specfp (mainly due to a 13%
degradation in 435.gromacs). But 454.calculix also fails fo
I was looking into a degradation for perlbmk on PowerPC and tracked it
down to a mispredicted branch within a loop (if (...) return 0; within
the loop). GCC is statically predicting the loop exit as not taken "bne-",
but it is obviously being taken the greatest share of the time because when
Jan Hubicka <[EMAIL PROTECTED]> wrote on 08/08/2006 01:04:33 AM:
> > are predicted. Should the 10% probability be applied without dividing by
> > the number of exits (i.e. each exit has a 10% probability of being taken,
> > independent of other loop exits)? The way things are now, once we get mo
Pat Haugen <[EMAIL PROTECTED]> wrote on 08/08/2006 11:07:58 AM:
> Jan Hubicka <[EMAIL PROTECTED]> wrote on 08/08/2006 01:04:33 AM:
>
> > The code there is basically avoiding loops with many exits to be
> > predicted to not loop at all (i.e. if you have 10 exits,
Two part question:
1) Does the control flow graph exist at the time we're emitting assembler
instructions?
2) If so, how do I go about getting at the basic block info, specifically
successor info, if the only thing I have is a rtx for a conditional jump
insn?
Okay, maybe a 3-part question. Give
>
> Two part question:
>
> 1) Does the control flow graph exist at the time we're emitting assembler
> instructions?
>
> 2) If so, how do I go about getting at the basic block info, specifically
> successor info, if the only thing I have is a rtx for a conditional jump
> insn?
>
>
> Okay, maybe a 3-p
Jan Hubicka <[EMAIL PROTECTED]> wrote on 08/19/2006 07:51:42 PM:
> >
> > Hi,
> > the patch limiting minimal probability to 2% seems to make sense to me,
> > so please submit it for review. It would be nice to have the code to
> > compute maximal number of exits from loop too, but if it is really 9
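The arithmetic discussed in this thread can be sketched as follows. This is a standalone illustration under stated assumptions, not the code in GCC: the function name `exit_probability_percent` and the whole-percent integer representation are hypothetical, while the 10% figure, the division by the number of exits, and the proposed 2% minimum are taken from the messages quoted above.

```c
/* Per-exit probability, in whole percent, when a 10% exit probability
   is divided evenly among n_exits loop exits and then clamped to a
   2% minimum.  Name and integer representation are hypothetical. */
int exit_probability_percent(int n_exits)
{
    const int total_pct = 10;  /* 10% figure from the thread */
    const int min_pct = 2;     /* proposed lower limit from the thread */

    if (n_exits <= 0)
        return 0;              /* no exits: nothing to predict */

    int pct = total_pct / n_exits;  /* integer division truncates */
    return pct < min_pct ? min_pct : pct;
}
```

With the clamp, a loop with ten exits gets 2% per exit rather than 1%, so each individual exit is never predicted as less likely than the floor, at the price of the combined exit probability exceeding 10% once there are more than five exits.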
Pat Haugen <[EMAIL PROTECTED]> wrote on 08/21/2006 01:22:25 PM:
> Jan Hubicka <[EMAIL PROTECTED]> wrote on 08/19/2006 07:51:42 PM:
>
> > Hi,
> > this patch at least hides the ugly details within some abstraction so we
> > can eventually go for propagating r
Is there a reason REG_POINTER isn't propagated to the target register for
rtl insns of the form "reg_x = regP_y + reg_z", where regP_y is a reg
marked as REG_POINTER? It seems the attribute is only propagated when we
have "reg_x = regP_y + CONST", at least in the couple instances I saw
(regcla
Alexander Monakov <[EMAIL PROTECTED]> wrote on 09/29/2008 01:34:12 PM:
> I'm seeing a miscompilation on sel-sched branch that at first sight looks
> related to IRA merge.
>
> alias.c::anti_dependence disambiguates references to
> (mem/c:DI (reg:DI 122 r122 [121]) [64 ivtmp.743+0 S8 A64])
> and
> (
I'm looking into a few cases where we're still getting the base/index
operand ordering wrong on PowerPC for an indexed load/store instruction,
even after the PTR_PLUS merge and fix for PR28690. One of the cases I
observed was caused by reload picking r0 to use for the base reg opnd as a
result of
Ian Lance Taylor <[EMAIL PROTECTED]> wrote on 08/10/2007 07:17:21 PM:
>
> I'm not entirely clear: how do you propose changing the code?
>
I was thinking of reordering the if tests such that we check if op0 is
already ok_for_base or op1 is ok_for_index before we check the inverse
conditions (which