Calculating instruction costs

2013-07-09 Thread David Given
I'm working on a gcc backend for an architecture. The architecture has
instructions for indexed array access; so, ld r0, (r1, r2) is equivalent
to r0 = r1[r2] where r1 is an int32_t*.

I'm representing this in the .md file with the following pattern:

(define_insn "*si_load_indexed"
  [
    (set
      (match_operand:SI 0 "register_operand" "=r")
      (mem:SI
        (plus:SI
          (mult:SI
            (match_operand:SI 1 "register_operand" "%r")
            (const_int 4))
          (match_operand:SI 2 "register_operand" "r"))))
  ]
  ""
  "ld %0, (%2, %1)"
  [(set_attr "length" "4")]
)

However, the instruction is never actually being emitted. Looking at the
debug output from the instruction combining stage, I see this:

Trying 8, 9 -> 10:
Successfully matched this instruction:
(set (reg:SI 47 [ *_5 ])
    (mem:SI (plus:SI (mult:SI (reg/v:SI 43 [ b ])
                (const_int 4 [0x4]))
            (reg:SI 0 r0 [ a ])) [2 *_5+0 S4 A32]))
rejecting combination of insns 8, 9 and 10
original costs 8 + 4 + 4 = 16
replacement cost 32

Instructions 8, 9 and 10 are:

(insn 8 5 9 2 (set (reg:SI 45)
        (ashift:SI (reg/v:SI 43 [ b ])
            (const_int 2 [0x2]))) test.c:5 15 {ashlsi3}
     (expr_list:REG_DEAD (reg/v:SI 43 [ b ])
        (nil)))
(insn 9 8 10 2 (set (reg/f:SI 46)
        (plus:SI (reg/v/f:SI 42 [ a ])
            (reg:SI 45))) test.c:5 13 {addsi3}
     (expr_list:REG_DEAD (reg:SI 45)
        (expr_list:REG_DEAD (reg/v/f:SI 42 [ a ])
            (nil))))
(insn 10 9 15 2 (set (reg:SI 47 [ *_5 ])
        (mem:SI (reg/f:SI 46) [2 *_5+0 S4 A32])) test.c:5 6 {*si_load}
     (expr_list:REG_DEAD (reg/f:SI 46)
        (nil)))

If I've read this correctly, it indicates that the instruction pattern
has been matched, but the instruction has been rejected due to being
more expensive than the original instructions.

So, how is it calculating the cost of my instruction? Where's it getting
that 32 from (which seems weirdly high)?

Right now all the cost macros are left as the default, which is probably
the root of the problem; but I'm having a lot of trouble getting my head
around them. In the interest of actually getting something to work, are
there any ways of using a simplified cost model where the cost of each
instruction is specified manually in the instruction pattern alongside
the length? (Or even just *using* the length as the cost...)

-- 
┌─── dg@cowlark.com ─ http://www.cowlark.com ─
│ "USER'S MANUAL VERSION 1.0:  The information presented in this
│ publication has been carefully for reliability." --- anonymous
│ computer hardware manual





Re: Calculating instruction costs

2013-07-09 Thread Jay Foad
On 9 July 2013 11:02, David Given  wrote:
> Right now all the cost macros are left as the default, which is probably
> the root of the problem; but I'm having a lot of trouble getting my head
> around them. In the interest of actually getting something to work, are
> there any ways of using a simplified cost model where the cost of each
> instruction is specified manually in the instruction pattern alongside
> the length? (Or even just *using* the length as the cost...)

Hi David!

Does -Os achieve this?

Jay.

P.S. Sorry if you got two copies of this.


Re: Calculating instruction costs

2013-07-09 Thread Chung-Ju Wu
2013/7/9, David Given :
> I'm working on a gcc backend for an architecture. The architecture has
> instructions for indexed array access; so, ld r0, (r1, r2) is equivalent
> to r0 = r1[r2] where r1 is a int32_t*.
>
> I'm representing this in the .md file with the following pattern:
>
> (define_insn "*si_load_indexed"
>   [
>     (set
>       (match_operand:SI 0 "register_operand" "=r")
>       (mem:SI
>         (plus:SI
>           (mult:SI
>             (match_operand:SI 1 "register_operand" "%r")
>             (const_int 4))
>           (match_operand:SI 2 "register_operand" "r"))))
>   ]
>   ""
>   "ld %0, (%2, %1)"
>   [(set_attr "length" "4")]
> )
>
> However, the instruction is never actually being emitted. Looking at the
> debug output from the instruction combining stage, I see this:
>
> Trying 8, 9 -> 10:
> Successfully matched this instruction:
> (set (reg:SI 47 [ *_5 ])
> (mem:SI (plus:SI (mult:SI (reg/v:SI 43 [ b ])
> (const_int 4 [0x4]))
> (reg:SI 0 r0 [ a ])) [2 *_5+0 S4 A32]))
> rejecting combination of insns 8, 9 and 10
> original costs 8 + 4 + 4 = 16
> replacement cost 32
>
> Instructions 8, 9 and 10 are:
>
> (insn 8 5 9 2 (set (reg:SI 45)
>         (ashift:SI (reg/v:SI 43 [ b ])
>             (const_int 2 [0x2]))) test.c:5 15 {ashlsi3}
>      (expr_list:REG_DEAD (reg/v:SI 43 [ b ])
>         (nil)))
> (insn 9 8 10 2 (set (reg/f:SI 46)
>         (plus:SI (reg/v/f:SI 42 [ a ])
>             (reg:SI 45))) test.c:5 13 {addsi3}
>      (expr_list:REG_DEAD (reg:SI 45)
>         (expr_list:REG_DEAD (reg/v/f:SI 42 [ a ])
>             (nil))))
> (insn 10 9 15 2 (set (reg:SI 47 [ *_5 ])
>         (mem:SI (reg/f:SI 46) [2 *_5+0 S4 A32])) test.c:5 6 {*si_load}
>      (expr_list:REG_DEAD (reg/f:SI 46)
>         (nil)))
>
> If I've read this correctly, it indicates that the instruction pattern
> has been matched, but the instruction has been rejected due to being
> more expensive than the original instructions.
>
> So, how is it calculating the cost of my instruction? Where's it getting
> that 32 from (which seems weirdly high)?
>
> Right now all the cost macros are left as the default, which is probably
> the root of the problem; but I'm having a lot of trouble getting my head
> around them. In the interest of actually getting something to work, are
> there any ways of using a simplified cost model where the cost of each
> instruction is specified manually in the instruction pattern alongside
> the length? (Or even just *using* the length as the cost...)
>

Hi, David,

I suggest setting a breakpoint on rtx_costs and
seeing how it calculates the overall cost when combining
insns 8, 9, and 10.

Best regards,
jasonwucj


Should -Wmaybe-uninitialized be included in -Wall?

2013-07-09 Thread Andreas Arnez
When building gdb with newer gcc versions I frequently stumble across
maybe-uninitialized false positives, like the ones documented in bug
57237.  Various bugs address similar issues, and in bug 56526 Jakub
Jelinek wrote:

> Maybe-uninitialized warnings have tons of known false positives, while
> the predicated analysis can handle the simplest cases, it can't handle
> anything more complicated.

Even the description of this option states:

> These warnings are made optional because GCC is not smart enough to
> see all the reasons why the code might be correct in spite of
> appearing to have an error.

With this situation at hand, I wonder whether it's a good idea to keep
maybe-uninitialized included in -Wall.  Projects which have been using
"-Wall -Werror" successfully for many years are now forced to
investigate non-existing bugs in their code.



Re: Should -Wmaybe-uninitialized be included in -Wall?

2013-07-09 Thread Andrew Haley
On 07/09/2013 12:59 PM, Andreas Arnez wrote:
> With this situation at hand, I wonder whether it's a good idea to keep
> maybe-uninitialized included in -Wall.  Projects which have been using
> "-Wall -Werror" successfully for many years are now forced to
> investigate non-existing bugs in their code.

But maybe-uninitialized is very useful, and it's not really inappropriate
for -Wall.  I would question the appropriateness of using -Wall -Werror
in production code.

Andrew.



Re: Should -Wmaybe-uninitialized be included in -Wall?

2013-07-09 Thread Jonathan Wakely
On 9 July 2013 13:04, Andrew Haley wrote:
> On 07/09/2013 12:59 PM, Andreas Arnez wrote:
>> With this situation at hand, I wonder whether it's a good idea to keep
>> maybe-uninitialized included in -Wall.  Projects which have been using
>> "-Wall -Werror" successfully for many years are now forced to
>> investigate non-existing bugs in their code.
>
> But maybe-uninitialized is very useful, and it's not really inappropriate
> for -Wall.  I would question the appropriateness of using -Wall -Werror
> in production code.

Me too, but for those who prefer it there is
-Wno-error=maybe-uninitialized so they can keep everything else as an
error.


Re: Inline virtual call puzzling behaviour

2013-07-09 Thread Michael Matz
Hi,

On Sun, 7 Jul 2013, Thierry Lavoie wrote:

> int main(int argc, char** argv)
> {
> A* ptr = 0;
> if(argc == 1)
> ptr = new B();
> else
> ptr = new A();
> 
> ptr->blah();
> 
> B().blah();
> C().blah();
> }
> 
> The puzzling part is to find that the call for C().blah() is indeed 
> inlined and ptr->blah() uses a vtable resolution, but the call to 
> B().blah() uses neither: the static address is resolved but the code is 
> not inlined! (The same behaviour occurs with a statically-typed pointer 
> to an object of class B.) I understand the compiler propagates the 
> types properly, but even after determining the correct type for the 
> object of type B, it only resolves the vtable reference (hence no call 
> *(%..x)), but does not perform the inlining.

It can, but decides not to.  main is special: the compiler knows it is 
called only once, hence it is itself cold, and GCC thinks inlining 
B::blah increases code size (of main).  Rename main to foo, or put the 
whole body of main into a loop with unknown bounds, and you will see 
B::blah inlined (as well as some other functions).  That C::blah is 
inlined in your example is a side-effect of that function being 
non-virtual, so some other transformations remove the 'this' argument, 
ultimately making gcc think inlining C::blah decreases the code size of 
main (and hence it inlines even into that cold function).

> From a practical point of view, I understand this example does not 
> justify by itself the absolute need for inlining. However, I do have a 
> time-critical application that would get 25-30% increase in speed if I 
> could solve this issue.

Then the above testcase doesn't reflect the real issue you have.


Ciao,
Michael.


Re: Calculating instruction costs

2013-07-09 Thread Michael Matz
Hi,

On Tue, 9 Jul 2013, David Given wrote:

> Trying 8, 9 -> 10:
> Successfully matched this instruction:
> (set (reg:SI 47 [ *_5 ])
> (mem:SI (plus:SI (mult:SI (reg/v:SI 43 [ b ])
> (const_int 4 [0x4]))
> (reg:SI 0 r0 [ a ])) [2 *_5+0 S4 A32]))
> rejecting combination of insns 8, 9 and 10
> original costs 8 + 4 + 4 = 16
> replacement cost 32
> 
> So, how is it calculating the cost of my instruction? Where's it getting
> that 32 from (which seems weirdly high)?

As you didn't adjust any costs, I would guess the high value comes from 
the default implementation of address_cost, which simply uses the 
arithmetic cost, and the MULT in there is quite expensive by default.

See TARGET_ADDRESS_COST in several ports.


Ciao,
Michael.


Re: Should -Wmaybe-uninitialized be included in -Wall?

2013-07-09 Thread Andrew Haley
On 07/09/2013 02:56 PM, Andreas Arnez wrote:
> What matters is whether *some* stages of production code development use
> this combination of options.  It could certainly be argued whether it
> should also be a project's "configure" default, like currently the case
> for gdb.

It's not a problem for GDB because they track GCC development.

Andrew.



Re: Should -Wmaybe-uninitialized be included in -Wall?

2013-07-09 Thread Andreas Arnez
Andrew Haley  writes:

> On 07/09/2013 12:59 PM, Andreas Arnez wrote:
>> With this situation at hand, I wonder whether it's a good idea to keep
>> maybe-uninitialized included in -Wall.  Projects which have been using
>> "-Wall -Werror" successfully for many years are now forced to
>> investigate non-existing bugs in their code.
>
> But maybe-uninitialized is very useful, and it's not really inappropriate
> for -Wall.

The warning would be extremely useful, if false positives didn't obscure
real problems.  In its current state I consider its usefulness limited.

The reason I consider the warning (currently) inappropriate for -Wall is
that it does not seem to behave as reliably and predictably as the other
-Wall options.  The description of -Wall says:

"This enables all the warnings about constructions that some users
consider questionable, and that are easy to avoid (or modify to prevent
the warning) [...]"

Note the "easy to avoid".  If everybody agrees that -Wall should include
maybe-uninitialized, then I suggest to change the wording here.

> I would question the appropriateness of using -Wall -Werror in
> production code.

What matters is whether *some* stages of production code development use
this combination of options.  It could certainly be argued whether it
should also be a project's "configure" default, like currently the case
for gdb.



Pointer arithmetic

2013-07-09 Thread Hendrik Greving
On a machine with ABI ILP32LL64:

(insn 123 122 124 (nil) (set (reg:SI 392)
        (mem:SI (plus:SI (reg/v:SI 386)
                (reg/v:SI 349)) [0 sec 0 space 0, cmsmode 0 S4 A32])) -1 (nil)
    (nil))

If we support legitimate memory addresses like [r1+r2] (e.g. indexed
addresses), can the above RTL match such a load? I am asking because of
overflows: I am not sure how that part is defined, or where the spec is.
What do I need to check in the backend for such a definition? Is this
POINTER_SIZE? E.g. if the machine supports more than 32 bits, who is
responsible for making sure there is no overflow beyond 32 bits in this
case? The compiler? The assembler? Or even the user?

Thanks,
Hendrik


Re: Should -Wmaybe-uninitialized be included in -Wall?

2013-07-09 Thread Tom Tromey
Andrew> I would question the appropriateness of using -Wall -Werror in
Andrew> production code.

Andreas> What matters is whether *some* stages of production code
Andreas> development use this combination of options.  It could
Andreas> certainly be argued whether it should also be a project's
Andreas> "configure" default, like currently the case for gdb.

gdb only enables it for the development branch, not for releases.  If
you're building from CVS you're expected to know how to either fix these
problems or disable -Werror.  Typically the fix is trivial; if you look
through the archives you'll see fixes along these lines.

Tom


Resolving an issue of bootstrap failure from vectorization.

2013-07-09 Thread Cong Hou
Hi

My name is Cong Hou, and I am a Noogler working in the compiler
optimization team at Google.

When we were trying to move vectorization from -O3 to -O2 in GCC 4.9,
we hit a bootstrap failure from a comparison mismatch between stages 2
and 3.  This failure is caused by a potential bug in GCC, described below.

In the file tree-vect-data-refs.c, there is a qsort() function call
which sorts a group of data references using a comparison function
called "dr_group_sort_cmp()". In this function, the iterative hash
values of tree nodes are used for comparisons. For a declaration tree
node, its UID participates in the calculation of the hash value.
However, a specific declaration may have different UIDs depending on
whether debug information is switched on or off.  In consequence, the
results of comparisons may differ between stages 2 and 3 during
bootstrapping.

As a solution, I think we need a function to compare two tree nodes
that does not rely on UIDs.  An obvious approach is to compare the
tree codes first and, if they are the same, compare the subnodes
recursively.  To resolve the issue at hand, I composed a very basic
comparison function which only compares tree codes, and this patch
makes the bootstrap pass.  I have attached the patch here.

I am wondering if this is a valid solution, and would appreciate any
feedback.


Thank you!


Cong


patch.diff
Description: Binary data


Re: Resolving an issue of bootstrap failure from vectorization.

2013-07-09 Thread Xinliang David Li
Please repost the patch to gcc-patches@ mailing list.

David

On Tue, Jul 9, 2013 at 1:10 PM, Cong Hou  wrote:
> Hi
>
> My name is Cong Hou, and I am a Noogler working in the compiler
> optimization team at Google.
>
> When we were trying to move vectorization from -O3 to -O2 in GCC 4.9,
> we hit a bootstrap failure from a comparison mismatch between stages 2
> and 3.  This failure is caused by a potential bug in GCC, described below.
>
> In the file tree-vect-data-refs.c, there is a qsort() function call
> which sorts a group of data references using a comparison function
> called "dr_group_sort_cmp()". In this function, the iterative hash
> values of tree nodes are used for comparisons. For a declaration tree
> node, its UID participates in the calculation of the hash value.
> However, a specific declaration may have different UIDs depending on
> whether debug information is switched on or off.  In consequence, the
> results of comparisons may differ between stages 2 and 3 during
> bootstrapping.
>
> As a solution, I think we need a function to compare two tree nodes
> that does not rely on UIDs.  An obvious approach is to compare the
> tree codes first and, if they are the same, compare the subnodes
> recursively.  To resolve the issue at hand, I composed a very basic
> comparison function which only compares tree codes, and this patch
> makes the bootstrap pass.  I have attached the patch here.
>
> I am wondering if this is a valid solution, and would appreciate any
> feedback.
>
>
> Thank you!
>
>
> Cong


HAVE_ATTR_enabled mishandling?

2013-07-09 Thread David Given
I think I have found a bug. This is in stock gcc 4.8.1...

My backend does not use the 'enabled' attribute; therefore the following
code in insn-attr.h kicks in:

  #ifndef HAVE_ATTR_enabled
  #define HAVE_ATTR_enabled 0
  #endif

Therefore the following code in gcc/lra-constraints.c is enabled:

  #ifdef HAVE_ATTR_enabled
  if (curr_id->alternative_enabled_p != NULL
  && ! curr_id->alternative_enabled_p[nalt])
continue;
  #endif

->alternative_enabled_p is bogus; therefore segfault.

Elsewhere I see structures of the form:

  #if HAVE_ATTR_enabled
  ...
  #endif

So I think that #ifdef above is a straight typo. Certainly, changing it
to a #if makes the crash go away...

-- 
┌─── dg@cowlark.com ─ http://www.cowlark.com ─
│ "Every planet is weird. I spent six weeks on a moon where the
│ principal form of recreation was juggling geese. Baby geese. Goslings.
│ They were juggled." --- Firefly, _Our Mrs. Reynolds_





Re: Should -Wmaybe-uninitialized be included in -Wall?

2013-07-09 Thread Jeff Law

On 07/09/2013 07:56 AM, Andreas Arnez wrote:
> Andrew Haley  writes:
>> On 07/09/2013 12:59 PM, Andreas Arnez wrote:
>>> With this situation at hand, I wonder whether it's a good idea to keep
>>> maybe-uninitialized included in -Wall.  Projects which have been using
>>> "-Wall -Werror" successfully for many years are now forced to
>>> investigate non-existing bugs in their code.
>> But maybe-uninitialized is very useful, and it's not really inappropriate
>> for -Wall.
> The warning would be extremely useful, if false positives didn't obscure
> real problems.  In its current state I consider its usefulness limited.

Depends on your point of view.  This topic has been hashed through
repeatedly on this list.


I personally like -Wall -Werror.  While we do run into false positives,
and the set of false positives does change from release to release as a
result of optimizations, I believe there's been an overall improvement
in the quality of the codebases that use -Wall -Werror.  I certainly see
fewer bugs these days blamed on the compiler which are in fact
uninitialized variables.

I'd like to revamp how we do path isolation to eliminate more of the
false positives, but this warning is by its nature going to generate
false positives.

What I would ask is that folks continue to file bugs for false
positives.  While I can't guarantee we'll fix them, they do provide a
good base of tests for path-sensitive analysis.

I'd also suggest that when an initialization is added merely to avoid
the warning, a comment be added to the code so that folks know it's
merely to avoid the warning.

Jeff